
KaniniPro

  • Databricks

    Databricks Identity Sync from Microsoft Entra ID

    Published by

    Arulraj Gopal

    on

    April 6, 2026

    Identity management is essential for any application to ensure that the right people have the right level of access with the appropriate permissions. When using Azure as the cloud provider with Databricks, Microsoft provides built-in integrations that simplify identity and access management. In Databricks, identities such as users, groups, and…

    Continue reading →: Databricks Identity Sync from Microsoft Entra ID
  • Databricks

    Secrets Management in Azure Databricks

    Published by

    Arulraj Gopal

    on

    March 22, 2026

    Managing secrets is a core part of any application. Hardcoding secrets directly in notebooks or code leaves them highly exposed. Therefore, systems provide secure ways to store secrets and use them when and where required, without exposing them directly in the code. Databricks provides a feature called Secret Scope, where we…

    Continue reading →: Secrets Management in Azure Databricks
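Pending the full post, the principle can be sketched with only the standard library. Note the assumptions: in Azure Databricks the lookup would go through a Secret Scope via `dbutils.secrets.get(scope, key)` rather than an environment variable, and the scope, key, and value names below are placeholders for this demo.

```python
import os

# Anti-pattern: a secret embedded in source ends up in version control.
# password = "SuperSecret123"  # never do this

# Stand-in for a secure store; in Azure Databricks you would instead call
# dbutils.secrets.get(scope="my-scope", key="db-password") against a
# Secret Scope (scope/key names here are illustrative).
os.environ.setdefault("DB_PASSWORD", "demo-only-value")

def get_secret(key: str) -> str:
    """Fetch a secret at runtime so it never appears in code."""
    value = os.environ.get(key)
    if value is None:
        raise KeyError(f"secret {key!r} is not provisioned")
    return value

print(get_secret("DB_PASSWORD"))
```

Either way, the code only ever names the secret; the value stays in the store.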
  • Databricks

    Databricks SQL Introduction

    Published by

    Arulraj Gopal

    on

    March 8, 2026

    If you are already using Databricks and thinking about moving to another platform just to get data warehouse capabilities, it might be worth reconsidering. Databricks SQL provides powerful data warehousing capabilities directly on top of your existing data lake. It is a collection of services designed to bring data warehouse…

    Continue reading →: Databricks SQL Introduction
  • Databricks

    Databricks Serverless Compute

    Published by

    Arulraj Gopal

    on

    February 21, 2026

    Databricks Serverless Compute is a fully managed compute option where Databricks automatically provisions, scales, and manages the infrastructure — you don’t create or manage clusters at all. Before setting up serverless compute, let’s understand where it fits within the Databricks architecture. Let’s look at the Databricks high-level architecture. The diagram…

    Continue reading →: Databricks Serverless Compute
  • delta-lake, duckdb

    Processing ADLS delta-table using DuckDB

    Published by

    Arulraj Gopal

    on

    February 9, 2026

    Modern data teams prioritize fast insights with minimal operational overhead. When your data already lives in Azure Data Lake Storage (ADLS) as Delta tables, spinning up Spark just to do light processing often feels like overkill. That’s where DuckDB shines. In this article, we’ll walk through processing Delta tables stored…

    Continue reading →: Processing ADLS delta-table using DuckDB
  • delta-lake

    DeltaLake change tracking with CDF & Row Tracking

    Published by

    Arulraj Gopal

    on

    February 1, 2026

    As we know, Delta Lake tables are designed for the lakehouse architecture, combining the flexibility of a data lake with data-warehouse capabilities such as ACID transactions. Delta Lake also provides strong data-governance features, especially for tracking data changes. Two of them are Change Data Feed and Row Tracking, which we…

    Continue reading →: DeltaLake change tracking with CDF & Row Tracking
  • Databricks

    Introducing Lakeflow Spark Declarative Pipelines

    Published by

    Arulraj Gopal

    on

    January 25, 2026

    Before defining Lakeflow Spark Declarative Pipelines, let’s first understand the declarative approach, Spark declarative pipelines, and finally Lakeflow Spark declarative pipelines. Procedural vs declarative In computer science, an approach that describes how a task should be performed is considered procedural programming, whereas defining what needs to be achieved—leaving…

    Continue reading →: Introducing Lakeflow Spark Declarative Pipelines
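To ground the procedural-vs-declarative distinction, a tiny Python contrast (illustrative only, not Lakeflow code):

```python
# Procedural: spell out HOW -- explicit loop, accumulator, and branch.
def total_even_procedural(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n
    return total

# Declarative: state WHAT you want; the runtime decides how to iterate.
def total_even_declarative(numbers):
    return sum(n for n in numbers if n % 2 == 0)

data = [1, 2, 3, 4, 5, 6]
print(total_even_procedural(data), total_even_declarative(data))  # both print 12
```

Declarative pipelines apply the same idea at pipeline scale: you declare the target datasets and the engine plans the execution.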
  • Databricks, sql

    SQL Queries that make the code simple

    Published by

    Arulraj Gopal

    on

    January 18, 2026

    SQL is the most widely used language across data processing applications. For a qualified data engineer, writing efficient queries is a vital skill—but equally important is the ability to write simple, clean, and readable SQL that is easy to maintain over time. During my exploration of various SQL problems, I…

    Continue reading →: SQL Queries that make the code simple
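As a taste of the kind of simplification meant here, one common pattern is "latest row per key" written once with a window function instead of a correlated subquery or self-join. The sketch runs against SQLite from Python; the table and data are invented for the demo, not taken from the post.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (customer TEXT, placed_at TEXT, amount INTEGER);
INSERT INTO orders VALUES
  ('alice', '2026-01-01', 10),
  ('alice', '2026-01-05', 30),
  ('bob',   '2026-01-03', 20);
""")

# Latest order per customer: rank rows within each customer by recency,
# then keep rank 1. One readable query, no self-join.
rows = con.execute("""
SELECT customer, amount FROM (
  SELECT customer, amount,
         ROW_NUMBER() OVER (PARTITION BY customer ORDER BY placed_at DESC) AS rn
  FROM orders
)
WHERE rn = 1
ORDER BY customer
""").fetchall()
print(rows)
```

(Window functions need SQLite 3.25+, which current Python builds bundle.)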
  • Databricks

    Databricks data quality with declarative pipeline

    Published by

    Arulraj Gopal

    on

    January 11, 2026

    Databricks Spark Declarative Pipelines go beyond simplifying pipeline maintenance—they also address data quality, which is paramount for any data application. Using expectations, you can define data quality checks that are applied to every record flowing through the pipeline. These checks are typically standard conditions, similar to what you would write…

    Continue reading →: Databricks data quality with declarative pipeline
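The expectation mechanism can be pictured in plain Python; this is only an analogy for the concept, not the Databricks API — in a real pipeline the checks are declared with `@dlt.expect...` decorators or SQL `CONSTRAINT ... EXPECT` clauses.

```python
# Plain-Python analogy: each expectation is a named predicate applied to
# every record; failing records are dropped, mirroring the pipeline's
# "drop row on violation" behavior.
expectations = {
    "valid_id": lambda r: r.get("id") is not None,
    "positive_amount": lambda r: r.get("amount", 0) > 0,
}

records = [
    {"id": 1, "amount": 50},
    {"id": None, "amount": 10},   # fails valid_id
    {"id": 2, "amount": -5},      # fails positive_amount
]

clean = [r for r in records if all(check(r) for check in expectations.values())]
print(clean)  # only the first record survives
```

In the managed feature, the pipeline additionally records per-expectation pass/fail metrics rather than silently filtering.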
  • Databricks, spark

    Schema Drift Made Easy with Spark Declarative Pipelines

    Published by

    Arulraj Gopal

    on

    January 5, 2026

    Spark Declarative Pipelines are designed to simplify the way data processing applications are built by letting engineers work declaratively—you focus on what needs to be produced, and the platform takes care of how it gets executed. This approach also extends naturally to handling schema evolution. Whether you need to add…

    Continue reading →: Schema Drift Made Easy with Spark Declarative Pipelines

Let’s connect

  • LinkedIn
  • Mail

