Lakebase: OLTP, Analytics, and AI on One Platform
With Databricks Lakebase, transactional workloads are no longer a separate concern from analytics and AI. Lakebase brings a managed, serverless PostgreSQL engine directly into the Databricks Lakehouse, making OLTP a first-class citizen alongside Delta Lake, MLflow, and Databricks Apps.
At Qubika, we explored the fundamentals of this shift in a previous post, Bringing OLTP into your Lakehouse: Why Databricks Lakebase is a game changer, which dives deeper into what Lakebase is, why Databricks introduced it at the Data + AI Summit 2025, and how it reshapes transactional workloads on the Lakehouse.
This post builds on that foundation and focuses on one of the most compelling real-world fits of Lakebase in practice: enabling stateful AI agents without adding infrastructure, pipelines, or operational complexity.
Context: The Business Case Behind the Accelerator
Before diving into how Lakebase enables agent memory and why it became such a strong fit for our architecture, it’s important to set the context of the business problem we were solving.
At Qubika, we built the Databricks Setup Accelerator to help enterprises configure, migrate, and optimize Databricks environments at scale. Setting up Databricks correctly is not a trivial task: it involves governance decisions, Unity Catalog configuration, permission models, naming conventions, and CI/CD alignment — all of which are often manual, error-prone, and inconsistent across environments.
Our accelerator automates this process end to end by combining:
- ERD and schema parsing
- Governance and Unity Catalog policy enforcement
- SQL generation for catalogs, schemas, tables, and permissions
- Deployment through Databricks Apps and CI/CD pipelines
To make this work in real enterprise scenarios, the accelerator relies on an AI agent capable of handling long-running, multi-step workflows — not a stateless chatbot, but a system that needs to remember, reason, and resume.
AI Agents Need Persistent Memory — and Lakebase Is a Perfect Fit
For many years, building stateful AI applications meant juggling two separate infrastructure stacks. AI agents and LLMs lived in the analytics world of notebooks and pipelines, while the conversation history and user state they needed lived in external PostgreSQL databases, Redis caches, or document stores.
This split resulted in complex authentication flows, duplicated infrastructure, cross-network latency, and significant DevOps overhead.
Lakebase is a new category of operational database that bridges the long-standing division between transactional and analytical systems. By bringing online transaction processing directly into the Databricks Lakehouse, it lets teams build interactive apps, dashboards, and AI agents on the same foundation as their analytics and models.
The Challenge: AI Agents Without Memory Are Stateless
When we started building the Databricks Setup Accelerator, we faced a fundamental challenge: AI agents need persistent memory.
Our accelerator uses LangGraph for intelligent workflow orchestration, and it needs to:
- Remember conversation context across user sessions
- Track workflow state for multi-step configuration tasks
- Manage user isolation for multi-tenant enterprise deployments
- Associate uploaded files with specific conversations
Without a proper OLTP backend, this would have required provisioning external PostgreSQL clusters, managing credentials, handling backups, and building custom integrations — all while maintaining millisecond-level response times for real-time interaction.
Lakebase removed that entire layer of complexity.
Why Lakebase Was the Right Choice
Lakebase fit our needs with zero architectural compromise:
- PostgreSQL compatibility: Drop‑in replacement using SQLAlchemy and standard drivers (see the connection sketch below)
- Serverless by default: No sizing, patching, or scaling decisions
- Same‑workspace deployment: OLTP lives next to notebooks, models, and apps
- Unity Catalog governance: Conversation data follows the same policies as analytics
- Native OAuth authentication: No database passwords, no secrets to rotate
From an engineering perspective, this is the key shift: AI memory becomes just another governed asset in the Lakehouse.
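To make the "drop-in replacement" claim concrete, here is a minimal connection sketch. The host, database name, and user are hypothetical placeholders, and we assume a short-lived OAuth token has already been minted (see the security section below); because Lakebase speaks the standard PostgreSQL wire protocol, any Postgres driver should work.

```python
from urllib.parse import quote_plus

from sqlalchemy import create_engine, text

# Hypothetical connection details -- substitute your Lakebase instance's
# endpoint, database name, and identity.
HOST = "my-instance.database.cloud.databricks.com"
DATABASE = "databricks_postgres"
USER = "app-user@example.com"
TOKEN = "<short-lived OAuth token>"  # minted at runtime, never stored

# Lakebase is wire-compatible with PostgreSQL, so a standard SQLAlchemy
# engine with the psycopg2 driver is all that is needed.
engine = create_engine(
    f"postgresql+psycopg2://{quote_plus(USER)}:{quote_plus(TOKEN)}"
    f"@{HOST}:5432/{DATABASE}",
    connect_args={"sslmode": "require"},
)

with engine.connect() as conn:
    print(conn.execute(text("SELECT version()")).scalar())
```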
Deep Dive: Agent Memory Architecture on Lakebase

In the Databricks Setup Accelerator, Lakebase is responsible for three critical layers of memory:
1. User Management
Each Databricks user is mapped to an internal user record. Lakebase enables fast lookups and strict isolation in multi‑tenant setups.
- User identity tied to Databricks authentication
- Last activity tracking
- No external IAM system required
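As an illustration of this layer, a minimal sketch reusing the engine from the connection example above. The `app_users` table and its columns are hypothetical, not the accelerator's actual schema; the upsert pattern is what keeps last-activity tracking to a single round trip.

```python
from sqlalchemy import text

# Illustrative schema only -- names are hypothetical.
DDL = """
CREATE TABLE IF NOT EXISTS app_users (
    id            BIGSERIAL PRIMARY KEY,
    databricks_id TEXT UNIQUE NOT NULL,  -- identity from Databricks auth
    last_seen_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);
"""

# Upsert on login: create the record on first contact, otherwise
# just refresh the last-activity timestamp.
UPSERT = """
INSERT INTO app_users (databricks_id) VALUES (:databricks_id)
ON CONFLICT (databricks_id) DO UPDATE SET last_seen_at = now()
RETURNING id;
"""

with engine.begin() as conn:
    conn.execute(text(DDL))
    user_id = conn.execute(
        text(UPSERT), {"databricks_id": "user@example.com"}
    ).scalar()
```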
2. Conversation & Workflow State
LangGraph checkpoints are stored as JSON in Lakebase tables. This allows users to pause and resume complex workflows without losing context; a minimal checkpointer sketch follows the list below.
- Deterministic recovery of agent state
- Support for long‑running, multi‑step tasks
- No in‑memory hacks or fragile caches
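Because Lakebase exposes a standard Postgres connection, LangGraph's off-the-shelf Postgres checkpointer can write its checkpoints directly to Lakebase. A minimal sketch, assuming the `langgraph-checkpoint-postgres` package and the same hypothetical endpoint as the earlier connection example:

```python
from langgraph.checkpoint.postgres import PostgresSaver

# Same hypothetical Lakebase endpoint as the earlier connection sketch.
DB_URI = (
    "postgresql://app-user%40example.com:<oauth-token>"
    "@my-instance.database.cloud.databricks.com:5432/databricks_postgres"
    "?sslmode=require"
)

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # creates LangGraph's checkpoint tables on first use

    # The checkpointer is passed to the graph at compile time; every
    # thread_id then identifies one resumable conversation or workflow:
    # graph = builder.compile(checkpointer=checkpointer)
    # config = {"configurable": {"thread_id": "conversation-123"}}
    # graph.invoke(initial_state, config)
```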
3. Interaction History
Full conversation history is persisted with ordering and roles (user, assistant, system), enabling context‑aware responses and auditability.
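An illustrative shape for that history table, again with hypothetical names and reusing the engine from the first sketch. The composite `(conversation_id, seq)` key is what preserves message order per conversation:

```python
from sqlalchemy import text

# Illustrative names only; the composite primary key makes replaying
# a conversation in order a single indexed scan.
MESSAGES_DDL = """
CREATE TABLE IF NOT EXISTS messages (
    conversation_id TEXT        NOT NULL,
    seq             INTEGER     NOT NULL,
    role            TEXT        NOT NULL
                    CHECK (role IN ('user', 'assistant', 'system')),
    content         TEXT        NOT NULL,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (conversation_id, seq)
);
"""

HISTORY = """
SELECT role, content FROM messages
WHERE conversation_id = :cid ORDER BY seq;
"""

with engine.begin() as conn:
    conn.execute(text(MESSAGES_DDL))
    history = conn.execute(text(HISTORY), {"cid": "conversation-123"}).all()
```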
This design gives us durable, queryable, governed AI memory — something that is extremely difficult to achieve with external databases.
Security & OAuth: A Quiet Game Changer
One of the most impactful (and underrated) features is Lakebase’s native OAuth integration.
Our Databricks App authenticates to Lakebase using the workspace’s OAuth token. No credentials are stored, no secrets are injected, and no passwords ever exist.
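A hedged sketch of what that token flow can look like with the Databricks Python SDK. The instance name is hypothetical, and the exact call should be verified against your SDK version; `generate_database_credential` is the Database API method documented for Lakebase at the time of writing.

```python
import uuid

from databricks.sdk import WorkspaceClient

# Inside a Databricks App, WorkspaceClient() resolves the app's OAuth
# identity from the environment -- no password is ever configured.
w = WorkspaceClient()

# Mint a short-lived database credential for the Lakebase instance.
# (Instance name is hypothetical; verify the call against your SDK version.)
cred = w.database.generate_database_credential(
    request_id=str(uuid.uuid4()),
    instance_names=["my-lakebase-instance"],
)

TOKEN = cred.token  # used as the Postgres "password" in the sketches above
```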
This single feature:
- Removes an entire attack surface
- Simplifies deployment pipelines
- Aligns perfectly with enterprise security requirements
| Before Lakebase | After Lakebase |
|---|---|
| External PostgreSQL cluster | Same-workspace Lakebase instance |
| Password rotation workflows | Native OAuth authentication |
| Separate monitoring stack | Unified Databricks observability |
| Manual backup configuration | Serverless, automatic durability |
| Cross-cloud networking | No network hops required |
Final Thoughts: Lakebase as the Missing Layer for AI Agents
Lakebase is not just “PostgreSQL inside Databricks”. For AI workloads, it is the missing memory layer that finally makes production‑grade agents viable on the Lakehouse.
For Qubika’s Databricks Setup Accelerator, Lakebase enabled something fundamental: AI agents that remember, reason, and evolve — without adding infrastructure complexity.
If you’re building AI agents, interactive applications, or real‑time systems on Databricks, Lakebase is the foundation that lets you do it right.
“Lakebase was the missing piece for building production AI agents on Databricks. Our accelerator needs persistent memory across sessions and workflows. With Lakebase, we achieved native PostgreSQL OLTP with zero operational overhead.”
— Facundo Sentena, Sr. AI Engineer, Qubika