DBT Sources, Seeds & Analyses | Data Build Tool Demo with Real World Examples
############ seeds\salestargets.csv ############
StoreID,SalesTarget
1876,260000
1396,290000
1146,305000
1107,345000
1384,350000
1155,300000
1913,300000
1882,345000
1289,300000
1005,375000
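A seed is just a version-controlled CSV that dbt loads into the warehouse as a table when you run `dbt seed`; afterwards it can be referenced like any model via `{{ ref('salestargets') }}`. Conceptually, the load amounts to the toy sketch below — an in-memory SQLite database stands in for the warehouse, and two sample rows are inlined so the sketch is self-contained:

```python
import csv
import io
import sqlite3

# Two sample rows from seeds\salestargets.csv, inlined for the sketch.
seed_csv = "StoreID,SalesTarget\n1876,260000\n1396,290000\n"

conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
conn.execute("CREATE TABLE salestargets (StoreID INTEGER, SalesTarget INTEGER)")

# What `dbt seed` does, conceptually: parse the CSV and insert each row.
rows = [(int(r["StoreID"]), int(r["SalesTarget"]))
        for r in csv.DictReader(io.StringIO(seed_csv))]
conn.executemany("INSERT INTO salestargets VALUES (?, ?)", rows)

print(conn.execute("SELECT COUNT(*) FROM salestargets").fetchone()[0])  # → 2
```

In a real project you never write this loading code yourself — `dbt seed` handles the column typing and materialization based on your profile's target warehouse.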
############ models\storeperformance.sql ############
SELECT
    OS.StoreID,
    SUM(OFACT.Revenue) AS ActualSales,
    -- salestargets has one row per store; taking the value directly (and
    -- grouping by it) avoids inflating the target once per joined order row.
    ST.SalesTarget AS TargetSales
FROM
    {{ ref('orders_stg') }} OS
JOIN
    {{ ref('orders_fact') }} OFACT ON OS.OrderID = OFACT.OrderID
JOIN
    {{ ref('salestargets') }} ST ON ST.StoreID = OS.StoreID
GROUP BY
    OS.StoreID,
    ST.SalesTarget
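One subtlety in this model: joining a per-store target onto per-order rows repeats the target value once per order, so aggregating it with `SUM` would inflate it (classic join fan-out). A toy sqlite3 illustration of the effect, with hypothetical data (store 1876, three orders, a single 260000 target):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (OrderID INTEGER, StoreID INTEGER, Revenue INTEGER);
CREATE TABLE salestargets (StoreID INTEGER, SalesTarget INTEGER);
-- Store 1876 has three orders but a single target of 260000.
INSERT INTO orders VALUES (1, 1876, 100), (2, 1876, 200), (3, 1876, 300);
INSERT INTO salestargets VALUES (1876, 260000);
""")

row = conn.execute("""
    SELECT o.StoreID,
           SUM(o.Revenue)      AS ActualSales,
           SUM(st.SalesTarget) AS InflatedTarget,  -- repeats once per order
           st.SalesTarget      AS CorrectTarget
    FROM orders o
    JOIN salestargets st ON st.StoreID = o.StoreID
    GROUP BY o.StoreID, st.SalesTarget
""").fetchone()

print(row)  # → (1876, 600, 780000, 260000): SUM triples the 260000 target
```

The same fan-out hazard applies in the warehouse SQL dbt compiles, which is why the target column should come through the GROUP BY rather than an aggregate.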
############ analyses\StoreRevenue.sql ############
SELECT
    OS.StoreID,
    SUM(OFACT.Revenue) AS TotalRevenue
FROM
    {{ ref('orders_stg') }} OS
JOIN
    {{ ref('orders_fact') }} OFACT ON OS.OrderID = OFACT.OrderID
GROUP BY 1
############ models\src_oms.yml ############
version: 2

sources:
  - name: landing
    description: The primary source system for order management system (OMS) data analytics.
    database: SLEEKMART_OMS
    schema: L1_LANDING
    freshness:
      warn_after: {count: 1, period: day}
      error_after: {count: 3, period: day}
    loaded_at_field: Updated_at
    tables:
      - name: customers
        description: The customers table from the order management system (OMS).
        identifier: customers
      - name: orders
        description: The orders table from the order management system (OMS).
        identifier: orders
      - name: orderitems
        description: The orderitems table from the order management system (OMS).
        identifier: orderitems
      - name: employees
        description: The employees table from the order management system (OMS).
        identifier: employees
        columns:
          - name: address
            description: Full address of employees, used for transport arrangements.
            tests:
              - not_null
              - string_not_empty
      - name: sales_us
      - name: sales_europe
      - name: sales_asia
      - name: stores
      - name: suppliers
      - name: products
  - name: training
    database: SLEEKMART_OMS
    schema: TRAINING
    tables:
      - name: city_temperature
      - name: sales_us
      - name: sales_uk
      - name: sales_india
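With the `freshness` block above in place, `dbt source freshness` compares the newest value of the `loaded_at_field` (`Updated_at`) against the `warn_after` and `error_after` windows. The decision logic amounts to something like this sketch (function name and hard-coded 1-day / 3-day thresholds are illustrative, matching the config above):

```python
from datetime import datetime, timedelta

def freshness_status(max_loaded_at: datetime, now: datetime,
                     warn_after: timedelta = timedelta(days=1),
                     error_after: timedelta = timedelta(days=3)) -> str:
    """Mimic dbt's source-freshness decision: pass, warn, or error."""
    age = now - max_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"

now = datetime(2024, 1, 10)
print(freshness_status(datetime(2024, 1, 10), now))  # pass  (0 days old)
print(freshness_status(datetime(2024, 1, 8), now))   # warn  (2 days old)
print(freshness_status(datetime(2024, 1, 5), now))   # error (5 days old)
```

In practice dbt runs a `MAX(Updated_at)` query against each source table and reports the pass/warn/error status per table in the freshness output.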
############ customers_stg.sql ############
{{ config(materialized='table') }}

SELECT
    CustomerID,
    FirstName,
    LastName,
    Email,
    Phone,
    Address,
    City,
    State,
    ZipCode,
    Updated_at,
    CONCAT(FirstName, ' ', LastName) AS CustomerName
FROM
    {{ source('landing', 'customers') }}
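At compile time, dbt's Jinja engine renders `{{ source('landing', 'customers') }}` into a fully qualified relation name built from the source's `database`, `schema`, and table `identifier` in `src_oms.yml`. A simplified conceptual sketch of that resolution (ignoring adapter-specific quoting; the `SOURCES` dict is a stand-in for dbt's parsed manifest):

```python
# Minimal stand-in for the parsed src_oms.yml above.
SOURCES = {
    ("landing", "customers"): {
        "database": "SLEEKMART_OMS",
        "schema": "L1_LANDING",
        "identifier": "customers",
    },
}

def source(source_name: str, table_name: str) -> str:
    """Resolve a source reference to a fully qualified relation name."""
    s = SOURCES[(source_name, table_name)]
    return f'{s["database"]}.{s["schema"]}.{s["identifier"]}'

print(source("landing", "customers"))  # → SLEEKMART_OMS.L1_LANDING.customers
```

You can see the real rendered SQL for any model by running `dbt compile` and inspecting the files under the `target/compiled` directory.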