Databricksters
How Liquid Clustering Improves Streaming Merges and P99 Latency
The Trio Behind Simpler Streaming Merges: Deletion Vectors, Row-Level Concurrency, and Liquid Clustering
Sep 16 • Canadian Data Guy • 8:59
Agents are like onions (they have layers)
Using custom scorers to investigate spans within a trace in MLflow 3+.
Sep 3 • Veena
August 2025
Doctors HATE this one dependency trick!
A quick guide to dependency management for machine learning using MLflow 3+.
Aug 19 • Veena
Bayes’d and Redpilled
Markov Chain Monte Carlo Sampling using PyMC on Databricks
Aug 12 • Austin
Beyond the Pipeline: The Blueprint for Enterprise AI Platforms using Databricks
Moving past dependency hell requires more than code—it demands a shift to a govern-first architecture.
Aug 5 • Debu Sinha
July 2025
Securing Gen‑AI Agents on Databricks: How I Keep Prompt‑Injection and Data‑Leak Nightmares at Bay
Practical, field‑tested tactics for neutralizing prompt‑injection, blocking data leaks, and shipping secure Gen‑AI workloads on the Databricks…
Jul 29 • Debu Sinha
Understanding Embedding Model Pricing on Databricks – An End‑to‑End Guide
How much will it cost to embed your documents? As machine‑learning and generative‑AI teams build retrieval‑augmented generation systems or semantic…
Jul 22 • Debu Sinha
It’s beaver time! Don’t get logged down with mlflow logging.
A simple workaround for when you are training thousands of models and log_model() becomes your worst enemy.
Jul 22 • Veena
Write Anywhere, Read Everywhere: Achieving True Data Interoperability Between Databricks and Snowflake
This blog will show you how to eliminate data silos between Databricks and Snowflake using Federation, enabling you to write from anywhere and read from…
Jul 15 • Nikhil Mishra
A Deep Dive into Spark Stream Static Joins: Live Demo, Caveats and Tips
We explore Spark Stream static joins with a live demo and discuss common mistakes when taking jobs to production
Jul 9 • Canadian Data Guy • 12:58
June 2025
💬 Chat with Your Data in Slack Using Databricks Genie – Part I
In today’s fast-paced work environments, Slack has become the go-to for team communication.
Jun 23 • Ambarish
May 2025
The Hidden Price of Streaming: Cutting S3 API Calls for Massive Cloud Savings
A practical approach to cutting cloud expenses through smarter S3 API usage
May 20 • Geethu