This past June, the Databricks Meetup in Montevideo brought together industry leaders and data professionals for a deep dive into the world of Databricks and AI-driven analytics. Hosted at Qubika's offices in Buceo, the event featured an insightful session led by BASF Uruguay's Data & Analytics team. As winter began in Uruguay, our team prepared the Buceo headquarters with hot chocolate, a lit fireplace, and our chef's signature arepas, creating a comfortable and cozy environment for all attendees.
The speakers at the event included:
- Martín Curbelo – Automation & Data Analytics Manager, BASF
- Lucas Passera – Data Analytics Specialist, BASF
- Diego Correa – Senior Data Engineer, BASF
- Sebastián Díaz – Head of Data & AI, Qubika
They shared their experience with Databricks as their central platform for building data pipelines and driving AI innovation within the company.
The meetup also included a special presentation by Qubika with an overview of key announcements from the Data+AI Summit 2025 in San Francisco. With a focus on cutting-edge technology and data management practices, the session gave attendees valuable insight into how Databricks is reshaping data engineering and AI capabilities across industries.
Databricks Journey at BASF: Transforming Data Pipelines and Analytics
At the meetup, the Data & Analytics team at BASF Uruguay shared the story of adopting Databricks as the central platform for building data pipelines. They provided a deep dive into their transition from traditional data management systems to scalable, AI-driven analytics. This transition has not only improved their ability to manage vast amounts of data but also enabled smarter, faster decision-making at all levels.
The session explored how BASF has implemented Databricks to overcome their data management challenges, achieve cost optimization, and enhance collaboration across teams. Here’s an extended look into their journey:
1. Building a Robust Data Pipeline with Databricks
Before adopting Databricks, BASF's data pipeline was complex and fragmented, relying on disparate systems for data collection, transformation, and storage. With Databricks, they consolidated these processes into a unified, scalable platform that handles vast amounts of data efficiently.
Key steps in BASF’s Databricks adoption include:
- Enterprise Data Lake (EDL) Setup: BASF began by setting up their Enterprise Data Lake within Databricks. The platform enabled them to centralize data storage, making it easier to manage and analyze large datasets from different sources, including manufacturing, sales, and finance data.
- AI-Assisted Development: Leveraging AI within Databricks, BASF began experimenting with machine learning models to optimize everything from supply chain management to customer insights. For instance, predictive analytics and anomaly detection models were integrated into their pipeline to automate and enhance operational decision-making.
- Delta Lake Technology: The team makes extensive use of Delta Lake for optimized data storage. Delta Lake lets them manage data transactions efficiently, ensuring high performance and consistency across all their pipelines, and its transaction log tracks every data change, so they can quickly recover from discrepancies, "travel back in time," and restore previous versions of data, as the sketch after this list illustrates.
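To make that time-travel capability concrete, here is a minimal PySpark sketch. The table name (edl.sensor_readings) and version numbers are hypothetical, not BASF's actual schema:

```python
from pyspark.sql import SparkSession

# On Databricks the `spark` session already exists; creating one here just
# keeps the sketch self-contained.
spark = SparkSession.builder.getOrCreate()

# Read the current state of a (hypothetical) Delta table.
current = spark.read.table("edl.sensor_readings")

# Query the same table as of an earlier version number...
v5 = (
    spark.read.format("delta")
    .option("versionAsOf", 5)
    .table("edl.sensor_readings")
)

# ...or as of a timestamp, which helps when diagnosing a bad write.
snapshot = (
    spark.read.format("delta")
    .option("timestampAsOf", "2025-06-01")
    .table("edl.sensor_readings")
)

# Rolling the table back to a prior version is a single SQL command.
spark.sql("RESTORE TABLE edl.sensor_readings TO VERSION AS OF 5")
```

This is the "travel back in time" idea in action: a bad load stops being an incident and becomes a one-line rollback.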
2. Data Governance and Unified Metadata Management with Unity Catalog
One of the standout features of Databricks that BASF adopted was Unity Catalog. Initially, managing metadata and data access across various teams was complex and cumbersome, particularly in a global organization like BASF, where different business units require different levels of access to data.
Unity Catalog became a game-changer by providing a centralized layer for managing all metadata, tables, and data governance rules. It allowed BASF to:
- Simplify Data Sharing: With Unity Catalog, BASF can easily share data between teams and workspaces, eliminating the need to duplicate data and saving both time and storage costs.
- Improve Data Security: Unity Catalog also strengthened data security by ensuring that only authorized users can access specific datasets, based on role-based access controls (see the SQL sketch after this list). This was especially important for a large enterprise like BASF, where different departments often work with sensitive data.
- Track Data Lineage: BASF can now trace the entire data lifecycle, from ingestion through transformation to final use, making it easier to monitor and audit data workflows.
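As a hedged illustration of those role-based access controls, the snippet below uses Unity Catalog's standard SQL GRANT statements, run from Python. The catalog, schema, table, and group names are invented for the example:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Unity Catalog privileges are plain SQL; all object and group names here
# (edl, finance, analysts) are hypothetical.

# Let an analyst group browse the schema and read one table, nothing more.
spark.sql("GRANT USE CATALOG ON CATALOG edl TO `analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA edl.finance TO `analysts`")
spark.sql("GRANT SELECT ON TABLE edl.finance.invoices TO `analysts`")

# Access is just as easy to revoke when a team's needs change.
spark.sql("REVOKE SELECT ON TABLE edl.finance.invoices FROM `analysts`")
```

Because these grants live in one central catalog rather than in per-workspace ACLs, the same rule applies everywhere the data is consumed, which is what makes the cross-team sharing described above safe.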
3. Medallion Architecture for Efficient Data Organization
Another critical concept that BASF implemented with Databricks is the Medallion Architecture, which organizes data into three layers: Bronze, Silver, and Gold.
- Bronze Layer: This layer stores raw data exactly as it is received from external sources. For BASF, this meant they could securely land data from manufacturing systems, customer interactions, and finance without transforming it immediately.
- Silver Layer: Once the data is stored in its raw form, the next step is to clean, transform, and standardize it. BASF uses Databricks to run ETL processes that turn this raw data into a more structured form ready for analysis, making it far more reliable and easier to query.
- Gold Layer: Finally, the Gold Layer holds the cleaned, highly refined data, ready for reporting and business intelligence. This data feeds the dashboards and reports used by BASF's executives and business units.
This structure allowed BASF to organize their data more effectively, reduce duplication, and enhance the overall efficiency of their data processing pipeline.
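Here is a minimal sketch of what a Bronze-to-Silver-to-Gold flow can look like in PySpark with Delta tables. The paths, table names, and columns are illustrative only, not BASF's pipeline:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: land the raw feed as-is, preserving every record for auditability.
raw = spark.read.json("/mnt/landing/sales/")
raw.write.format("delta").mode("append").saveAsTable("edl.bronze_sales")

# Silver: clean and standardize, dropping duplicates and malformed rows.
silver = (
    spark.table("edl.bronze_sales")
    .dropDuplicates(["order_id"])
    .filter(F.col("amount").isNotNull())
    .withColumn("order_date", F.to_date("order_ts"))
)
silver.write.format("delta").mode("overwrite").saveAsTable("edl.silver_sales")

# Gold: aggregate to the business-ready shape that dashboards consume.
gold = (
    spark.table("edl.silver_sales")
    .groupBy("region", "order_date")
    .agg(F.sum("amount").alias("daily_revenue"))
)
gold.write.format("delta").mode("overwrite").saveAsTable("edl.gold_daily_revenue")
```

Keeping each layer as its own Delta table is what makes the deduplication and reliability benefits practical: downstream consumers read from Gold, while Bronze preserves the untouched source of truth.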
4. Scalability and Cost Optimization with Databricks
As BASF scaled their data operations, one of the most significant challenges was ensuring the scalability of their infrastructure to handle increasing volumes of data. Databricks helped BASF address this challenge by providing a cloud-native solution capable of scaling on demand.
- Performance at Scale: As BASF's data grew, the ability to process large datasets without compromising speed was critical. With Databricks, they can scale their clusters dynamically, optimizing for both performance and cost (a configuration sketch follows this list). This lets BASF handle peak workloads, such as end-of-quarter or end-of-year reporting, without bottlenecks or performance degradation.
- Cost Management: BASF also leverages Databricks' cost-monitoring capabilities to identify which processes consume the most resources, making it easier to optimize their data pipeline and avoid unnecessary expenditure.
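For illustration, this is roughly what an autoscaling cluster definition looks like with the Databricks Python SDK (databricks-sdk). The runtime version, node type, and worker counts are placeholders to tune for your own workload and cloud:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute

# Reads host and token from the environment or ~/.databrickscfg.
w = WorkspaceClient()

cluster = w.clusters.create_and_wait(
    cluster_name="reporting-autoscale",
    spark_version="15.4.x-scala2.12",  # an LTS runtime; pick your own
    node_type_id="i3.xlarge",          # example AWS node type
    # The cluster grows and shrinks with load instead of being sized for peak.
    autoscale=compute.AutoScale(min_workers=2, max_workers=8),
    # Idle clusters shut themselves down, so they stop billing.
    autotermination_minutes=30,
)
print(f"Cluster {cluster.cluster_id} is up")
```

Autotermination is often the quickest cost win: clusters that spin down when idle stop accruing charges without anyone having to remember to turn them off.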
5. Real-Time Data Processing and Automation
Another capability BASF was eager to adopt is real-time data processing. The team began using Databricks' Structured Streaming to process data as it arrives, enabling real-time analytics that were not possible with their legacy systems.
Additionally, BASF automated much of their reporting pipeline with Databricks. What once took hours of manual processing now occurs automatically, saving significant time and reducing the risk of errors.
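A minimal sketch of this streaming pattern, using Auto Loader to pick up files as they land and append them to a Delta table; the paths and table name are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Auto Loader ("cloudFiles") incrementally discovers new files as they arrive.
events = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/events_schema")
    .load("/mnt/landing/events/")
)

query = (
    events.writeStream.format("delta")
    # The checkpoint gives exactly-once bookkeeping across restarts.
    .option("checkpointLocation", "/mnt/checkpoints/events")
    # Drain whatever has landed, then stop; drop this for continuous mode.
    .trigger(availableNow=True)
    .toTable("edl.bronze_events")
)
query.awaitTermination()
```

The same job can run continuously or on a schedule; with `availableNow`, it processes the backlog and shuts down, which pairs well with the automated reporting pipeline described above.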
BASF’s Transformation: The Road Ahead
BASF’s journey with Databricks has been a significant transformation, with numerous gains in terms of data efficiency, AI adoption, and overall productivity. As they continue to explore the full potential of Databricks, they’re looking to further integrate AI into their workflows and expand the use of real-time data processing across different departments.
The lessons learned during their Databricks journey can serve as a blueprint for other organizations seeking to modernize their data pipelines and leverage the power of AI and Big Data in their operations.
BASF Uruguay’s experience with Databricks highlights the power of modern data platforms in transforming business operations. By adopting Databricks, they’ve not only streamlined their data management but also unlocked new capabilities in AI-driven analytics, data governance, and real-time decision-making. For businesses looking to scale their data operations, Databricks offers a proven platform that brings together data engineering, AI, and business intelligence into one seamless ecosystem.
Summit Highlights with Sebastián Díaz: Key Insights from the Data+AI Summit
At the meetup, Sebastián Díaz, Head of Data & AI at Qubika, shared key highlights from the Data+AI Summit 2025 in San Francisco. A major topic of discussion was how Databricks is evolving to better support both technical and business users.
Sebastián highlighted several key developments from the Summit:
AI Integration and Chatbots:
As AI tools like chatbots become more integral to business operations, Databricks is offering solutions to simplify building them. One notable tool introduced was Databricks Apps, which lets users create data-driven applications or chatbots that interact with Databricks data. This is particularly useful for business users who need to work with data but lack deep technical expertise.
Improving User Experience for Business Users:
Sebastián emphasized a common issue faced by many organizations: providing technical tools to business users who aren’t familiar with the complex interfaces of platforms like Databricks. Databricks has responded by making its tools more accessible, offering solutions like Databricks One, which simplifies the user experience by enabling easy data access without needing to interact with the full Databricks workspace.
Customization and Flexibility for AI Models:
While Databricks excels at providing customizable AI models, Sebastián acknowledged that some challenges remain, particularly around fine-tuning these models for specific business needs. The platform is progressing, however, allowing users to run custom models and compare them against foundation models from providers such as OpenAI and Anthropic to find the best-performing solution for their business.
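Because Databricks model serving exposes an OpenAI-compatible API, comparing endpoints side by side can be as simple as the sketch below; the endpoint names, token, and workspace URL are all placeholders:

```python
from openai import OpenAI

# Databricks serving endpoints speak the OpenAI protocol, so the standard
# client works; point it at your workspace and authenticate with a token.
client = OpenAI(
    api_key="<databricks-token>",
    base_url="https://<workspace-host>/serving-endpoints",
)

prompt = "Summarize last quarter's sales anomalies in two sentences."

# Hypothetical endpoint names: one custom model, one hosted foundation model.
for endpoint in ["my-custom-model", "my-foundation-model"]:
    resp = client.chat.completions.create(
        model=endpoint,
        messages=[{"role": "user", "content": prompt}],
    )
    print(endpoint, "->", resp.choices[0].message.content)
```

Since the interface is shared, swapping a custom fine-tuned model for a hosted foundation model is a one-line change, which makes this kind of side-by-side evaluation straightforward.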
Data Governance and Security:
With the integration of AI and large-scale data operations, maintaining strong data governance and security remains a priority. Unity Catalog continues to play a central role in managing access to data, ensuring that only authorized users can access sensitive information.
Conclusion
BASF’s journey with Databricks showcases the transformative power of the platform in modernizing data pipelines and driving AI innovation. By integrating Databricks into their workflows, BASF has not only improved efficiency but also unlocked new capabilities for data analysis and decision-making.
The June Databricks Meetup in Montevideo provided valuable insights from industry leaders like BASF and Qubika, highlighting the ongoing advancements in Databricks technology and the future of data-driven business solutions. As the platform evolves, Databricks continues to prove itself as a powerful tool for businesses looking to scale, innovate, and stay ahead in the competitive data landscape.