In summary, Ulla Kruhse-Lehtonen, CEO and Co-founder of DAIN Studios, opened and chaired the meetup. She started with a survey of where participants were joining from, what roles and areas of interest they had as data professionals, and whether they would like live streams or webinars to continue once the event returns to its physical format. Unsurprisingly, Data Scientists made up almost 50% of the data professionals and students attending the webinar. Other data professionals included Data Engineers, Data Strategists, Business Leaders, and Data Analysts. 86% of respondents responded positively to the idea of future online webinars in conjunction with the physical meetups.
The key takeaway from the presentation was that anonymization is hard for any high-dimensional data asset, but that AI-generated synthetic data offers a novel solution for big data anonymization. It makes it possible to automatically generate completely new, statistically representative data subjects that are “as close as possible” to actual subjects without being “too close” to any of them.
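To make the “as close as possible, but not too close” idea concrete, here is a minimal toy sketch in Python. It is not MOSTLY AI's method: it assumes a simple per-column Gaussian model in place of a generative deep neural network, and it uses a plain distance check as a stand-in for a real privacy safeguard. The function names, the `min_distance` threshold, and the sample records are all hypothetical.

```python
import math
import random
import statistics


def fit_columns(rows):
    """Estimate (mean, standard deviation) for each numeric column."""
    return [(statistics.mean(col), statistics.stdev(col)) for col in zip(*rows)]


def generate_synthetic(rows, n, min_distance, seed=0, max_attempts=10_000):
    """Sample up to n new records that follow the column statistics of `rows`
    but stay at least `min_distance` away from every real record.

    Toy assumptions: columns are modelled independently, and Euclidean
    distance is used as a crude proxy for "too close". May return fewer
    than n records if max_attempts is exhausted.
    """
    rng = random.Random(seed)
    params = fit_columns(rows)
    synthetic = []
    for _ in range(max_attempts):
        if len(synthetic) == n:
            break
        # "As close as possible": draw from the fitted distribution.
        candidate = tuple(rng.gauss(mu, sigma) for mu, sigma in params)
        # "Not too close": reject candidates that nearly copy a real record.
        closest = min(math.dist(candidate, row) for row in rows)
        if closest >= min_distance:
            synthetic.append(candidate)
    return synthetic


# Hypothetical records: (age, monthly visits)
real_rows = [(34, 12), (45, 7), (29, 20), (52, 5), (41, 9), (38, 15)]
synthetic = generate_synthetic(real_rows, n=5, min_distance=1.0, seed=42)
for record in synthetic:
    print(tuple(round(value, 1) for value in record))
```

A real synthetic data generator would also capture correlations between columns and would rely on far stronger privacy checks than a single distance threshold; the sketch only illustrates the trade-off between statistical fidelity and distance from actual subjects.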
MOSTLY AI is an Austria-based deep-tech startup that specializes in synthetic data. Their solutions enable organizations to instantly and safely collaborate on big data assets while keeping the privacy of their customers fully protected. This breakthrough in data protection is made possible by generative deep neural networks that extract patterns, structures, and variations from existing data to generate highly realistic and highly accurate synthetic customers. Watch Michael’s presentation on YouTube here.
Taneli Mielikäinen, Distinguished Engineer and Senior Director at Verizon Media/Yahoo, delivered a presentation on Data and Machine Learning at Yahoo: Past, Present, Future. You can view Taneli’s presentation on YouTube here.
Verizon Media is a division of Verizon Communications that focuses on media and online business. They host brands like Yahoo, HuffPost and TechCrunch. Verizon Media transforms how people stay informed and entertained, communicate, and transact.
Ville Tuulos, Manager, Machine Learning Infrastructure at Netflix, presented More Data Science, Less Engineering with Metaflow. Metaflow was originally developed at Netflix to address the needs of its data scientists who work on demanding real-life data science projects. Netflix open-sourced Metaflow in 2019.
The presentation covered how models are only a small part of an end-to-end data science project. Production-grade projects rely on a thick stack of infrastructure. At a minimum, projects need data and a way to perform computation on it. In a business environment like Netflix’s, a typical data science project touches all layers of the stack, from data warehouse to architecture to model development. To learn more about Metaflow, tutorials are available online at the following link. You can watch Ville’s presentation on YouTube here.
Netflix is an American media-services provider and production company headquartered in Los Gatos, California, founded in 1997 by Reed Hastings and Marc Randolph in Scotts Valley, California. The company’s primary business is its subscription-based streaming service which offers online streaming of a library of films and television programs, including those produced in-house.