DATA + AI SUMMIT • JUNE 27 — 30, 2022 • FORMERLY SPARK+AI SUMMIT
Data + AI Summit, the world’s largest conference for the open modern data stack community, will bring together tens of thousands of data practitioners in person and online from June 27-30, 2022. All presenters will be expected to speak live from San Francisco, and your talks will be available online and promoted on our channels.
We invite you to share your expertise and stories with fellow data scientists, data engineers, data analysts and data leaders. The modern data stack is defined by open technologies that help deliver advanced data analytics, build data pipelines, and develop AI applications and machine learning models. Your experience building solutions on it will be extremely valuable to your peers, whether you’re using Apache Spark™, Delta Lake and the lakehouse pattern, MLflow, TensorFlow, PyTorch, scikit-learn, BI and SQL analytics, or other deep learning and machine learning frameworks.
Our global Summit community of over 50,000 would love to hear from you. Write up your proposal for a 15-minute lightning talk, a 40-minute session or a 90-minute technical deep dive on the how and the why. We’d love to put your ideas, case studies, production use cases, best practices and technical knowledge in front of the largest gathering of Data + AI professionals. Submit your talk today and share your experience building and working on an open, modern data stack.
Themes and topics
Data scientists, data engineers, analysts, developers, researchers and ML practitioners all attend Summit to learn from the world’s leading experts on topics such as:
Share your experience building robust data pipelines with both batch and real-time streaming data architectures. From data ingestion to cleaning to processing for analytics and ML, we know you have a tough job. Share your insight on the architectures, challenges and best practices you’ve learned along the way.
Technologies/topic ideas: Delta Lake, CDC, medallion architecture, DLT, dbt, data munging, ETL/ELT, lakehouses, data lakes, Apache Spark internals, Spark performance optimizations and more
Recommendations and decisions in businesses and software are increasingly informed by data science, machine learning and deep learning. If you have real-world experience in these areas, help others learn through the data science analyses, machine learning and deep learning models you’ve built. Share your tips and tricks, triumphs and challenges.
We’re also looking for great sessions on productionizing machine learning projects and pipelines.
Technologies/topic ideas: MLflow, PyTorch, TensorFlow, Keras, XGBoost, fastai, scikit-learn, Python and R ecosystems, MLlib and other Apache Spark libraries for ML pipelines, trustworthy AI, explainable AI (xAI), model monitoring, model and concept drift and more
Data without analysis is wasted. Often that analysis comes in the form of reports and visualizations, which companies need to make decisions. If you have experience building analysis pipelines, integrations, tooling or infrastructure for data analytics, SQL, BI and visualization, the Summit audience would love to learn from you.
Technologies/topic ideas: SQL, Redash, Tableau, Power BI, visualization techniques, Spark SQL and DataFrames, data integration
How do we protect data from improper access by external and internal actors, safely share data with others, understand data lineage, and satisfy compliance needs? If you deal with challenges in data security and governance, we’d love to hear how you overcome those challenges through technology and business processes.
Technologies/topic ideas: Encryption, identity federation, data sharing, compliance controls, monitoring and auditing
Data analytics, machine learning and AI are having a profound impact on how organizations across industries are solving their toughest data challenges. In this track, we’ll explore how open source technologies, data analytics and AI are being applied to solve business challenges in the hottest industries, including topics like personalized healthcare, cyber threat protection, supply chain forecasting and fraud prevention.
If you have an interesting application of data analytics or AI in your business and want to share your journey of delivering data-driven innovation, then this thematic category is for you.
Recommendation: This talk type works best when co-presented by a speaker from the technical side (e.g., a data scientist) and someone from the line of business.
Technologies we are looking for:
A maximum of 2 speakers will be accepted per presentation. You’ll need to include the following information for each proposal:
Help us understand why your presentation is the right one for Summit. Please keep in mind that this event is by and for professionals. All presentations and supporting materials must be respectful and inclusive. Here is some advice on how to write a good conference proposal.