A transformative year for the Lakehouse Platform
2025 was a landmark year for Databricks: a wave of enhancements strengthened its Lakehouse vision across performance, governance, interoperability, and AI. The platform didn't just add features; it matured in its ability to solve real enterprise challenges: data silos, operational complexity, governance overhead, and the growing demand for trustworthy AI.
Each innovation was tied directly to business outcomes: faster insights, greater scalability, reduced costs, stronger compliance, and smoother collaboration. Databricks pushed its Data Intelligence Platform forward on every front, helping organizations consolidate tooling, accelerate time-to-value, and make data-driven decisions with more confidence.
The sections below highlight the most meaningful updates from 2025 — from engine acceleration and unified governance to AI-native capabilities and new ecosystem extensions — and how these advances support productivity, agility, and secure collaboration.
We end with a forward-looking wishlist for 2026, reflecting the next wave of needs for data-driven enterprises.
Performance and scale: The 2025 breakthroughs
In 2025, Databricks delivered major gains in speed, scalability, and efficiency. In particular:
- Photon got faster and cheaper. Predictive Query Execution and Vectorized Shuffle continued the trend of accelerating queries while cutting costs — in some cases up to 50% for heavy workloads. Teams got faster dashboards with zero manual tuning.
- Spark 4.0 & Runtime 17. Better SQL, new functions, VARIANT for JSON, expanded APIs, and streaming improvements — all translating to cleaner pipelines, fewer errors, and more unified batch/streaming handling.
- Lakehouse Federation GA. One of the biggest painkillers of the year: query BigQuery, Oracle, Teradata, and more in place, without copying data. One governance layer, no siloed ETL jobs, and faster time to insight.
- Serverless for CPU and GPU. BI teams got instant SQL performance with SQL Serverless; AI teams got GPU Serverless for on-demand model training and inference. No cluster management, no idle cost.
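To make the VARIANT item above concrete, here is a minimal PySpark sketch, assuming a Spark 4.0 / recent Databricks runtime; the data and column names are purely illustrative:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# parse_json turns raw JSON strings into the VARIANT type -- no rigid schema upfront
events = spark.sql("""
    SELECT parse_json(raw) AS payload
    FROM VALUES
        ('{"user": "ada",  "action": "click",  "ms": 12}'),
        ('{"user": "alan", "action": "scroll", "ms": 40}') AS t(raw)
""")
events.createOrReplaceTempView("events")

# Fields come out with the ':' path syntax and '::' casts (Databricks SQL)
spark.sql("""
    SELECT payload:user::string AS user,
           payload:ms::int      AS latency_ms
    FROM events
""").show()
```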
Governance and compliance: Unity Catalog became the control plane
Unity Catalog evolved into the centralized governance backbone of the Lakehouse. Specifically:
- Tag-based governance and automated classification. ABAC and auto-masking replaced thousands of manual rules, giving large organizations scalable, consistent security (a tagging-and-masking sketch follows this list).
- Open format interoperability (Delta + Iceberg). Unity Catalog now governs Iceberg as well, ensuring open formats and freedom from vendor lock-in.
- One semantic layer for business metrics. Unity Catalog Metrics ended the many-versions-of-one-KPI nightmare: define a metric once, reuse it everywhere.
- Curated discovery and request-access. Domain catalogs, certifications, quality indicators, and streamlined access requests boosted adoption across business teams.
- Governance for AI. Model lineage, prompt lineage, auditability, and centralized access control brought compliance and oversight to the LLM era.
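Here is a hedged sketch of the tag-and-mask pattern mentioned above, issued as Unity Catalog SQL from PySpark. The catalog, table, group, and function names are hypothetical; this assumes a UC-enabled workspace:

```python
from pyspark.sql import SparkSession

# In a Databricks notebook, `spark` is already provided by the runtime
spark = SparkSession.builder.getOrCreate()

# Classify a column with a governance tag (hypothetical catalog/schema/table)
spark.sql("""
    ALTER TABLE main.hr.employees
    ALTER COLUMN email SET TAGS ('classification' = 'pii')
""")

# A masking function that reveals the value only to a privileged group
spark.sql("""
    CREATE OR REPLACE FUNCTION main.hr.mask_email(email STRING)
    RETURNS STRING
    RETURN CASE WHEN is_account_group_member('hr_admins') THEN email
                ELSE '***redacted***' END
""")

# Attach the mask so it applies to every query against the column
spark.sql("""
    ALTER TABLE main.hr.employees
    ALTER COLUMN email SET MASK main.hr.mask_email
""")
```

With ABAC, a policy keyed on the `classification` tag can apply such masks across every tagged column at once, instead of rule-by-rule.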
AI and ML: Making the Lakehouse truly AI-native
Databricks moved to embed governance, evaluation, and scalability directly into the Lakehouse for enterprise-ready generative AI.
- MLflow 3.0 (LLMOps). Evaluation with “LLM judges,” trace capture, and feedback loops — finally giving enterprises a real operational framework for generative AI quality and reliability.
- Agent Bricks. A governed framework for building AI agents that perform tasks and self-evaluate against company policies. Automation with guardrails built in.
- Genie + Databricks Assistant. Natural language analytics for the masses and AI-assisted SQL for technical users. Insight delivery sped up across the board.
- Vector search and unstructured data. Native embeddings + similarity search = enterprise RAG, semantic search, and AI apps without needing a separate vector DB (see the sketch after this list).
- Foundation models + GPU Serverless. Train, fine-tune, and serve LLMs securely inside the Lakehouse.
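For the vector search item above, a minimal query with the Databricks Vector Search Python SDK might look like the following; the endpoint, index, and column names are assumptions for illustration:

```python
from databricks.vector_search.client import VectorSearchClient

client = VectorSearchClient()  # picks up workspace credentials from the environment

# Hypothetical index built over a governed Delta table of document chunks
index = client.get_index(
    endpoint_name="doc-search-endpoint",
    index_name="main.knowledge.docs_index",
)

# Top-5 semantically similar chunks to feed into a RAG prompt
hits = index.similarity_search(
    query_text="How do I rotate service credentials?",
    columns=["doc_id", "chunk_text"],
    num_results=5,
)
print(hits["result"]["data_array"])
```

Because the index lives in Unity Catalog, the same lineage and access controls that govern the source table govern the retrieval path.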
Data engineering and productivity: Less plumbing, more delivery
New abstractions and automation significantly reduced engineering friction, enabling teams to build, deploy, and operate data pipelines reliably and at speed.
- Lakeflow (GA). Unified ingestion + declarative ETL + orchestration, plus a no-code pipeline designer that turns business intent into production pipelines. Fewer tools, faster builds (see the pipeline sketch after this list).
- Asset Bundles + native CI/CD. Version-controlled data projects, reproducible deployments, and software-grade practices applied to analytics.
- Databricks One. A simplified portal for business consumers — governed dashboards, metrics, and apps in one unified space.
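As a taste of the declarative style behind Lakeflow, here is a minimal pipeline sketch using the `dlt` Python module; the table names, storage path, and quality rule are hypothetical:

```python
import dlt
from pyspark.sql import functions as F

# `spark` is provided by the pipeline runtime in Databricks

@dlt.table(comment="Raw orders ingested incrementally from cloud storage")
def orders_raw():
    # Auto Loader picks up new files as they land (hypothetical volume path)
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/main/sales/orders_landing"))

@dlt.table(comment="Cleaned orders ready for analytics")
@dlt.expect_or_drop("valid_amount", "amount > 0")  # declarative data quality rule
def orders_clean():
    return dlt.read_stream("orders_raw").withColumn(
        "ingested_at", F.current_timestamp())
```

You declare the tables and constraints; the platform handles orchestration, retries, and incremental processing.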
Ecosystem expansion and strategic moves
Strategic platform expansions reinforced Databricks’ ambition to be the unified foundation for data, analytics, AI, and applications.
- Lakebase. A Postgres-compatible transactional database built on the Lakehouse, ideal for AI-native applications where operational and analytical data must live side by side (connection sketch after this list).
- Databricks Apps. Deploy internal or external apps directly inside Databricks with enterprise auth, permissions, and audit logging out of the box.
- Free Tier + training investments. A larger talent pool and easier experimentation for organizations.
- A continued commitment to open standards. Iceberg, Delta Sharing, MLflow, Spark — openness remains core to Databricks’ strategy.
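Because Lakebase speaks the Postgres wire protocol, any standard Postgres client should connect as usual. A hedged sketch with `psycopg2`; the host, database, table, and credentials below are placeholders, not real endpoints:

```python
import psycopg2

# Placeholder connection details -- substitute your Lakebase instance's values
conn = psycopg2.connect(
    host="<your-lakebase-host>",
    dbname="app_db",
    user="app_user",
    password="<token-or-password>",
    sslmode="require",
)

with conn, conn.cursor() as cur:
    # Ordinary transactional SQL against the operational store
    cur.execute(
        "INSERT INTO orders (customer_id, amount) VALUES (%s, %s)",
        (42, 19.99),
    )
print("row committed")
```

The point of the design: the same rows are available to Lakehouse analytics without a separate ETL hop between the operational and analytical stores.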
Looking ahead: The 2026 wishlist
Looking ahead to 2026, here’s our list of 12 things we’ll be looking out for:


