Posts

Showing posts from October, 2023

Airflow Tutorial - Sensors | What are Airflow Sensors | How Sensors Work | Examples | S3KeySensor

Hello Data Pros,

In our previous blog, we explored the power of Airflow Variables and Connections and how to use them effectively in your DAGs. Today, we're going to take a deep dive into the world of Airflow sensors!

As we mentioned already, Airflow sensors are a special type of operator designed to wait for a specific condition to be met. At regular intervals, they check whether the condition is satisfied. Once it is, the corresponding task is marked successful, allowing its downstream tasks to execute. Sensors make your DAGs more event-driven, covering use cases such as a task that needs to wait for a file to be created, a database table to be updated, or an external API to become available.

Here is a simple DAG consisting of two tasks. The first task uses an S3KeySensor, which waits for a file to become available in an AWS S3 bucket. Once the file is ready, the next task loads the file into a Snowflake table.

Let's see how it works in the Airflow UI. Though it…
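The excerpt cuts off before the code, but a minimal sketch of the two-task DAG described above might look like the following; the bucket name, object key, connection IDs, and the COPY statement are placeholder assumptions rather than the exact values from the post.

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="s3_to_snowflake_demo",           # hypothetical DAG id
    start_date=datetime(2023, 10, 1),
    schedule=None,
    catchup=False,
) as dag:

    # Task 1: poke the bucket every 60 seconds until the file appears,
    # giving up after one hour.
    wait_for_file = S3KeySensor(
        task_id="wait_for_file",
        bucket_name="my-landing-bucket",      # placeholder bucket
        bucket_key="incoming/orders.csv",     # placeholder key
        aws_conn_id="aws_default",
        poke_interval=60,
        timeout=60 * 60,
    )

    # Task 2: load the file into Snowflake once the sensor succeeds.
    load_to_snowflake = SnowflakeOperator(
        task_id="load_to_snowflake",
        snowflake_conn_id="snowflake_default",
        sql="COPY INTO orders FROM @my_stage/incoming/orders.csv;",  # placeholder SQL
    )

    wait_for_file >> load_to_snowflake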

How to Install dbt, Set Up a Project, and Create Your First dbt Model

###### Snowflake DDL - at the end ######

Hello Data Folks,

Today we'll see how to install dbt, set up your first project, and create a dbt model. So, without further ado, let's dive right in and get started!

Python is a prerequisite for using dbt, so make sure to download it from python.org and install it on your system. During installation, select the checkbox that adds Python to PATH; this automatically adds the Python installation directory to your system's Path variable. To verify that Python is working correctly, open the command prompt and run the "python --version" command. Looking good!

Let's now download and install Visual Studio Code, one of the most powerful and widely used IDEs in the industry. After installing VS Code, please proceed to install the Python and dbt extensions one after another. As a best practice, please choose extensions with high download counts and ratings.

Please open the terminal, and cd to the path where you want to set up your first pr…
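The excerpt ends before the commands, but a hedged sketch of the setup steps it describes might look like this in the Windows command prompt; the project path, project name, and the choice of the Snowflake adapter are assumptions, not the exact values from the post.

REM Verify Python is on the PATH
python --version

REM Install dbt core plus the Snowflake adapter (assumed warehouse for this series)
pip install dbt-core dbt-snowflake

REM Confirm the installation
dbt --version

REM Move to the folder where the project should live and scaffold it
cd C:\projects
dbt init my_first_dbt_project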

Airflow Tutorial - Variables and Connections

Hello Data Pros,

In our last blog, we covered the fundamental concepts of Apache Airflow, including DAGs, tasks, and operators. In addition, we demonstrated the importance of the Airflow configuration file and the provider packages.

Today, we'll learn about the power of Airflow Variables and Connections and how to use them effectively within your DAGs.

Let's start with Apache Airflow Variables! Variables are like little storage containers for values that you can reuse across your DAGs. Instead of hardcoding values, you can store them as variables and reference them by name whenever needed.

Technically, each variable is a key-value pair. You can think of the key as the variable name and the value as the data it holds.

There are two types of Airflow variables: one, regular variables, where the value can be any string; two, JSON variables, where the value is a JSON string.

Let's say I have a DAG that includes a hard-coded support email address, an…
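A minimal sketch of how both variable types are read inside a DAG file, assuming hypothetical variable names such as "support_email" and "etl_config":

from airflow.models import Variable

# Regular variable: the stored value is a plain string.
support_email = Variable.get("support_email")

# JSON variable: deserialize_json=True parses the JSON string into a dict.
etl_config = Variable.get("etl_config", deserialize_json=True)
bucket_name = etl_config["bucket_name"]     # hypothetical key inside the JSON value

# The same values can also be resolved at task runtime with Jinja templating:
#   "{{ var.value.support_email }}"          for regular variables
#   "{{ var.json.etl_config.bucket_name }}"  for JSON variables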

Airflow DAGs, Operators, Tasks & Providers

Hello Data Pros,

In our last blog, we demonstrated the step-by-step installation of Apache Airflow on a Windows PC and successfully executed our very first Airflow DAG! Now, it's time to dive deeper! In this video, we'll learn about the Airflow configuration file, explore each section inside a DAG, understand various operator types, and experience the power of provider packages! Let's begin right away!

As we already know, Airflow DAGs are coded in the Python language.

Every Airflow setup has a 'dags folder'. You can set this folder path in the Airflow configuration file, named airflow.cfg.

In addition to the dags folder, this configuration file has many other settings that you can customize to meet your needs.

For example, to enable my Airflow instance to send email notifications, I added another Docker container to my Docker Compose file. This new container locally hosts a simple SMTP server. I then updated the Airflow configuration file to use the corresponding Docker se…
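The excerpt stops mid-sentence, but the settings it refers to live in airflow.cfg; a minimal sketch of the relevant entries, with the host name and folder path as assumptions rather than the post's exact values, might look like this:

[core]
# Folder the scheduler scans for DAG files
dags_folder = /opt/airflow/dags

[smtp]
# Point Airflow at the SMTP container added via Docker Compose
# (the host name is a placeholder for the Docker service name)
smtp_host = smtp-server
smtp_port = 25
smtp_starttls = False
smtp_ssl = False
smtp_mail_from = airflow@example.com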