Posts

Showing posts with the label apache airflow

Manage flow of tasks - Airflow Tutorial Trigger Rules, Conditional Branching, Setup Teardown, Latest Only, Depends On Past

  Hello Data Pros, and welcome back to another exciting episode of our Apache Airflow series! (The code is at the end.) Today, we'll explore how to manage the flow of tasks in Airflow, a critical step in orchestrating efficient data pipelines. With the default Airflow settings, a task runs only when all of its upstream tasks have completed successfully. However, in real-world projects, customizing this default behaviour becomes essential for many use cases. For example, you might need to dynamically pick and run a specific branch depending on the outcome of a preceding task, while skipping the remaining branches. The BranchPythonOperator supports this by letting you select a branch through a user-defined Python function. Within this function, you implement the logic that determines the appropriate branch, and you must ensure that it returns the task ID of the downstream task to be executed next. All the code lin...
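To make the branching idea concrete, here is a minimal sketch rather than the post's own code. It assumes Airflow 2.4 or later (for the `schedule` argument and these import paths). The DAG id, the task IDs, and the `row_count` argument are hypothetical names chosen for illustration. The branch function is plain Python and simply returns the task ID of the branch that should run.

```python
from datetime import datetime


def choose_branch(row_count: int) -> str:
    """Return the task_id of the downstream task that should run next."""
    # Hypothetical rule: only process when the upstream step found rows.
    return "process_data" if row_count > 0 else "skip_processing"


def build_dag():
    """Wire the branch function into a DAG (imports Airflow lazily)."""
    # Assumed import paths for Airflow 2.x; they differ in other versions.
    from airflow import DAG
    from airflow.operators.empty import EmptyOperator
    from airflow.operators.python import BranchPythonOperator

    with DAG(
        dag_id="branching_sketch",  # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
    ) as dag:
        pick_branch = BranchPythonOperator(
            task_id="pick_branch",
            python_callable=choose_branch,
            op_kwargs={"row_count": 0},  # static value, for illustration only
        )
        process_data = EmptyOperator(task_id="process_data")
        skip_processing = EmptyOperator(task_id="skip_processing")

        # Only the task whose ID is returned runs; the other is skipped.
        pick_branch >> [process_data, skip_processing]
    return dag
```

In a real DAG file you would call `dag = build_dag()` at module level so the scheduler can discover it, and you would likely derive `row_count` from the preceding task instead of hard-coding it.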

Airflow Tutorial - Hooks | Hooks vs Operators | airflow hooks example | When and How to use

  Hello Data Pros, in our last blog, we uncovered the need for Airflow XComs and demonstrated how to leverage them effectively in your DAGs. Today, we're shifting our focus to Airflow hooks. We're going to cover what hooks are, how they differ from Airflow operators, and lastly, when and how to use hooks in your DAGs. Let's dive right in! Technically, hooks are pre-built Python classes. They simplify our interactions with external systems and services. For instance, the popular S3Hook, which is part of the AWS provider package, offers various methods to interact with S3 storage. For example, the create_bucket method creates an Amazon S3 bucket, the load_string method loads a string value as a file in S3, and the delete_objects method can be used to delete S3 files. Now, let's dive into the source code of this hook. As you can see, it's low-level Python code. And if the AWS provider package did not include this hook, you might find yourself having to write all this complex...
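As a rough illustration of the three S3Hook methods mentioned above, here is a small sketch, not the post's own code. It assumes the `apache-airflow-providers-amazon` package is installed and that an Airflow connection named `aws_default` exists. The helper names `demo_s3_methods` and `make_hook`, and the bucket and key in the usage note, are hypothetical.

```python
def demo_s3_methods(hook, bucket: str, key: str, text: str) -> None:
    """Call the three S3Hook methods discussed above, in order."""
    # Create the bucket.
    hook.create_bucket(bucket_name=bucket)
    # Store a string value as an object (a "file") in the bucket.
    hook.load_string(string_data=text, key=key, bucket_name=bucket, replace=True)
    # Delete the object again; delete_objects accepts a list of keys.
    hook.delete_objects(bucket=bucket, keys=[key])


def make_hook():
    """Build a real S3Hook (imports the provider lazily)."""
    # Assumes the Amazon provider package and an "aws_default" connection.
    from airflow.providers.amazon.aws.hooks.s3 import S3Hook

    return S3Hook(aws_conn_id="aws_default")
```

Inside a task you might call `demo_s3_methods(make_hook(), "my-demo-bucket", "hello.txt", "Hello Data Pros")`. Passing the hook in as an argument keeps the helper easy to try out with a stand-in object before pointing it at a real AWS account.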