Kasey Uhlenhuth is a product manager on the machine learning team at Databricks. Before Databricks, she worked on the Visual Studio and C# team at Microsoft building developer productivity tools. Kasey holds an MBA from the Stanford Graduate School of Business and a BA in Computer Science from Harvard University.
The pursuit of AI is one of the biggest priorities in data today. The Thursday morning keynote will be led by Databricks Cofounder and CEO Ali Ghodsi and cover advances in data science, machine learning, MLOps and more in both open source and the Databricks Lakehouse Platform.
We’ll also be joined by data leaders from McDonald's and Microsoft, as well as the legendary Bill Nye, a scientist, engineer, comedian and author.
In this session, the Databricks product team provides a deeper dive into the machine learning announcements. Join us for a detailed demo that gives you insights into the latest innovations that simplify the ML lifecycle, from preparing data and discovering features to training and managing models in production.
In this session, you will have the opportunity to meet the Databricks product team and ask them any question, whether about trends in the market, emerging problem areas, product direction, customer use cases or anything else related to machine learning.
November 18, 2020 04:00 PM PT
Matei Zaharia
Assistant Professor of Computer Science Original Creator of Apache Spark & MLflow, Databricks
Deploying and operating machine learning applications is challenging because they are highly dependent on input data and can fail in complex ways. Problems such as training/inference differences in data format, data skew, and misconfigured software environments can easily sneak into a production application and impact its quality. To address these types of problems, organizations are adopting ML Platform software and MLOps practices specifically for managing machine learning applications.
In this talk, I’ll present some of the latest functionality added for productionizing machine learning in MLflow, the popular open source machine learning platform started by Databricks in 2018. This includes built-in support for model management and review using the Model Registry, APIs for automatic Continuous Integration and Delivery (CI/CD), model schemas to catch differences in a model’s expected data format, and integration with model explainability tools. I’ll also talk about other work happening in the open source MLflow community, including deep integration with PyTorch and its growing ecosystem of model productionization tools.
Kasey Uhlenhuth
Sr Product Manager, Machine Learning, Databricks
Lin Qiao
Engineering Director, PyTorch, Facebook
Lin Qiao, engineering director on the Facebook AI team, talks about bringing machine learning to production at scale, including the PyTorch integration with MLflow. She talks about the guiding principles for PyTorch and the goals set back in 2016 during initial development through the present day, with a focus on ecosystem compatibility.
Lin reviews the PyTorch production ecosystem and discusses how MLflow and PyTorch are integrated for tracking, models and model serving.
Clemens Mewald
Director of Product Management, Data Science and Machine Learning, Databricks
It is no longer a secret that data-driven insights and decision making are essential in any company’s strategy to keep up with today’s rapid pace of change and remain relevant. Although we take this realization for granted, we are still in the very early stages of enabling data teams to deliver on their promise. One of the reasons is that we haven’t equipped this profession with the modern toolkit it deserves.
Existing solutions leave data teams with impossible trade-offs. Giving Data Scientists the freedom to use any open source tools on their laptops doesn’t provide a clear path to production and governance. Simply hosting those same tools in the cloud may solve some of the data privacy and security issues, but doesn’t improve productivity or collaboration. On the other hand, most robust and scalable production environments hinder innovation and experimentation by slowing Data Scientists down.
In this talk we will give an update on the next generation Data Science Workspace on Databricks, originally unveiled at Spark + AI Summit 2020. Specifically, we will cover new capabilities added to Databricks Notebooks as well as Git-based Databricks Projects. Until now, the industry has assumed that collaborative notebooks are for experimentation only, and not for production. Our approach solves these challenges and, for the first time, provides a single platform for data teams to rapidly and confidently move from experimentation to production.
In this talk, we will unveil the next generation of the Databricks Data Science Workspace: an open and unified experience for modern data teams specifically designed to address these hard trade-offs. We will introduce new features that leverage the open source tools you are familiar with to give you a laptop-like experience that provides the flexibility to experiment and the robustness to create reliable and reproducible production solutions.
Stephan Schwarz
Production Planning: Manager Smart Data Processing (Mercedes Operations), Daimler
Sebastian Findeisen
Data Scientist, Daimler
When we think about luxury cars, what first comes to mind is often the end product: the sleek design, how fast it goes, and so on. But we often overlook the enormous amount of effort it takes before that car rolls off the assembly line. In this talk, Daimler will give us a peek into how data and ML are playing a critical role in driving car production automation, with MLOps and tools like MLflow being leveraged to automate a number of complex processes and provide insights that create production efficiencies.
Rohan Kumar
Corporate Vice President, Azure Data, Microsoft
Responsible ML is the most talked-about field in AI at the moment. With the growing importance of ML, it is even more important for us to exercise ethical AI practices and ensure that the models we create live up to the highest standards of inclusiveness and transparency. Join Rohan Kumar as he talks about how Microsoft brings cutting-edge research into the hands of customers to make them more accountable for their models and responsible in their use of AI. For the AI community, this is an open invitation to collaborate and help shape the future of Responsible ML. This keynote is brought to you as an encore presentation from the global Summit.
Sarah Bird
Principal Program Manager, Microsoft Azure AI
Keynote from Mae Jemison
First woman of color in the world to go into space, former NASA astronaut
An exploration of the opportunities and obstacles encountered, the clarity of purpose needed to achieve an extraordinary future (such as human interstellar travel or a sustainable human existence on planet Earth), and the roles that big data and advancing IT can play.