Scaling Cross-Cloud Analytics with Azure Databricks for AI-Driven Insights

AI & Data Innovation

Talk

Session Code

Sess-154

Day 3

11:35 - 12:05 EST

About the Session

As organizations adopt AI and machine learning at scale, robust, real-time data infrastructure becomes critical. This session presents a real-world engineering case study of a cross-cloud campaign analytics platform built to support massive data volumes and real-time decision-making, laying the foundation for AI-driven insights. Using Azure Databricks for distributed processing, Azure Data Factory for orchestration, and Amazon S3 as the ingestion layer, the platform delivers actionable intelligence through Power BI dashboards and integrations with downstream ML workflows. The system handles both historical and streaming data, enabling responsive marketing optimization and dynamic attribution modeling.

Attendees will explore the full architectural blueprint, from Spark-based data transformation and performance tuning to orchestrating over 500 daily jobs with dependency control and auto-scaling compute clusters. Emphasis will be placed on designing metadata-driven pipelines, optimizing query performance through Z-order clustering and file pruning, and ensuring reliability across cloud environments.

This session will be especially valuable for data professionals, architects, and AI/ML practitioners building platforms to power enterprise intelligence. You’ll learn how this system achieved high availability, efficient resource utilization, and scalable automation, empowering teams to make smarter, faster decisions. Whether you’re enabling advanced analytics, preparing data for AI pipelines, or modernizing legacy reporting systems, you’ll leave with practical strategies to architect large-scale, cloud-native data platforms ready for AI innovation.
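To make the idea of metadata-driven pipelines with dependency control more concrete, here is a minimal Python sketch. It is not the platform presented in the session: the job names, source paths, and the `execution_order` and `run_all` helpers are hypothetical, and Python's standard-library `graphlib` stands in for the dependency handling that Azure Data Factory would provide in the architecture described above.

```python
from graphlib import TopologicalSorter

# Hypothetical job metadata: each job lists a source and its upstream jobs.
# In a metadata-driven design this would typically live in a config table
# rather than in code.
JOBS = {
    "ingest_s3_events": {"source": "s3://example-bucket/events/", "depends_on": []},
    "clean_events": {"source": "bronze.events", "depends_on": ["ingest_s3_events"]},
    "build_attribution": {"source": "silver.events", "depends_on": ["clean_events"]},
    "refresh_dashboard": {"source": "gold.attribution", "depends_on": ["build_attribution"]},
}


def execution_order(jobs):
    """Return job names ordered so that every job follows its dependencies."""
    graph = {name: set(meta["depends_on"]) for name, meta in jobs.items()}
    # static_order() raises graphlib.CycleError if the metadata contains a cycle.
    return list(TopologicalSorter(graph).static_order())


def run_all(jobs, runner):
    """Call runner(name, metadata) for each job in dependency order."""
    return [runner(name, jobs[name]) for name in execution_order(jobs)]


if __name__ == "__main__":
    # The runner here only reports what it would do; a real one might
    # trigger a notebook or a cluster job instead.
    for line in run_all(JOBS, lambda name, meta: f"run {name} from {meta['source']}"):
        print(line)
```

Adding a job in this sketch means adding one metadata entry rather than writing new orchestration code, which is the main appeal of the metadata-driven approach when the job count reaches the hundreds.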

Speaker

Sruthi Erra Hareram

Independent Researcher, Canada