KaniniPro

  • Databricks

    Databricks Serverless Compute

    Published by Arulraj Gopal on February 21, 2026

    Databricks Serverless Compute is a fully managed compute option in which Databricks automatically provisions, scales, and manages the infrastructure, so you don’t create or manage clusters at all. Before setting up serverless compute, let’s look at the Databricks high-level architecture to see where it fits. The diagram…

  • delta-lake, duckdb

    Processing ADLS delta-table using DuckDB

    Published by Arulraj Gopal on February 9, 2026

    Modern data teams prioritize fast insights with minimal operational overhead. When your data already lives in Azure Data Lake Storage (ADLS) as Delta tables, spinning up Spark just to do light processing often feels like overkill. That’s where DuckDB shines. In this article, we’ll walk through processing Delta tables stored…

  • delta-lake

    DeltaLake change tracking with CDF & Row Tracking

    Published by Arulraj Gopal on February 1, 2026

    As we know, Delta Lake tables are designed for the lakehouse architecture, combining the flexibility of a data lake with data-warehouse capabilities such as ACID transactions. Delta Lake also provides strong data-governance features, especially for tracking data changes. Two of them are Change Data Feed and Row Tracking, which we…

  • Databricks

    Introducing Lakeflow Spark Declarative Pipelines

    Published by Arulraj Gopal on January 25, 2026

    Before defining Lakeflow Spark Declarative Pipelines, let’s first understand the declarative approach, then Spark Declarative Pipelines, and finally Lakeflow Spark Declarative Pipelines. Procedural vs declarative: an approach that spells out how a task should be performed is procedural, whereas defining what needs to be achieved, leaving…

  • Databricks, sql

    SQL Queries that make the code simple

    Published by Arulraj Gopal on January 18, 2026

    SQL is the most widely used language across data processing applications. For a qualified data engineer, writing efficient queries is a vital skill—but equally important is the ability to write simple, clean, and readable SQL that is easy to maintain over time. During my exploration of various SQL problems, I…

  • Databricks

    Databricks data quality with declarative pipeline

    Published by Arulraj Gopal on January 11, 2026

    Databricks Spark Declarative Pipelines go beyond simplifying pipeline maintenance—they also address data quality, which is paramount for any data application. Using expectations, you can define data quality checks that are applied to every record flowing through the pipeline. These checks are typically standard conditions, similar to what you would write…

  • Databricks, spark

    Schema Drift Made Easy with Spark Declarative Pipelines

    Published by Arulraj Gopal on January 5, 2026

    Spark Declarative Pipelines are designed to simplify the way data processing applications are built by letting engineers work declaratively—you focus on what needs to be produced, and the platform takes care of how it gets executed. This approach also extends naturally to handling schema evolution. Whether you need to add…

  • Databricks

    Incremental load (SCD 1 & 2) with Spark declarative pipelines

    Published by Arulraj Gopal on December 28, 2025

    Incremental load is an efficient approach for moving data into downstream systems by ensuring that only the changes between the previous run and the current run are processed. However, setting this up is not trivial. There are multiple proven strategies—such as batch-based processing using watermarks to track progress, or streaming…

  • Databricks

    Getting started with Databricks SDP

    Published by Arulraj Gopal on December 22, 2025

    Spark Declarative Pipelines are one of the flagship capabilities of Databricks, enabling data engineers to focus purely on business logic while abstracting away infrastructure concerns such as cluster provisioning and management. In this article, we will explore how to get started with Spark Declarative Pipelines using Databricks. Prerequisite –…

  • Databricks

    Tracking Table and Column Lineage in Databricks Unity Catalog

    Published by Arulraj Gopal on December 14, 2025

    Data governance is one of the most integral parts of any data project, and data lineage plays a key role in understanding and tracking the true source of data. What is data lineage? Data lineage provides end-to-end visibility of how data moves across systems—from its origin, through every transformation, to…



