Snowflake Storage Layer Key Concepts | Micropartition | Columnar Storage | Partition pruning
Hello Data Pros, and welcome back to another exciting episode of our Snowflake learning series!
In our previous videos, we explored Snowflake's storage and processing layers, covering key concepts along the way!
Now, let's shift our focus to the Cloud Services Layer, the mastermind behind everything that happens in Snowflake!
-- All commands and the sample CSV file are in the video description --
create database SLEEKDATA;
create schema OMS;

-- Create an internal named stage to hold the uploaded file
create stage MY_STAGE;

-- Target table matching the columns of the sample CSV
create table MY_TABLE (
  ID int,
  Name varchar(100),
  Age int,
  Region varchar(10)
);
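If you want explicit control over how the file is parsed, you can also define a named file format and reference it during the load. A minimal sketch (MY_CSV_FORMAT is a hypothetical name; these are standard CSV options):
-- Named file format spelling out the CSV parsing rules
create file format MY_CSV_FORMAT
  type = 'CSV'
  field_delimiter = ','
  skip_header = 0;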
-- Upload the local file to the stage with PUT (run from the SnowSQL CLI; PUT isn't supported in the web UI)
-- put file://D:/mydata.csv @MY_STAGE;
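PUT compresses files with gzip by default, which is why the staged file ends in .csv.gz. You can make that explicit and confirm the upload with LIST (a quick sketch):
-- put file://D:/mydata.csv @MY_STAGE auto_compress=true;  -- auto_compress defaults to true
list @MY_STAGE;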
-- Load the staged (gzip-compressed) file into the table
copy into MY_TABLE from @MY_STAGE/mydata.csv.gz;
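If you created the named file format above, the COPY can reference it instead of relying on defaults; shown commented out here to avoid loading the data twice (a sketch, assuming MY_CSV_FORMAT exists):
-- copy into MY_TABLE from @MY_STAGE/mydata.csv.gz
--   file_format = (format_name = 'MY_CSV_FORMAT');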
-- Verify the load
select * from MY_TABLE;
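To tie this back to the concepts in the title: when a query filters on a column, the cloud services layer consults the min/max metadata it keeps for each micro-partition and skips (prunes) any micro-partition that cannot contain matching rows. A toy illustration (our ten rows fit in a single micro-partition, so pruning only pays off at real scale):
-- Snowflake prunes micro-partitions whose Region min/max rules out 'US'
select * from MY_TABLE where Region = 'US';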
mydata.csv
1,Alice,25,US
2,Bob,30,EU
3,Charlie,28,AP
4,David,35,US
5,Emma,40,EU
6,Frank,32,AP
7,Grace,27,US
8,Henry,45,EU
9,Ivan,50,AP
10,Jasmine,23,US