Apache Airflow Taskflow API | Airflow Decorators
Hello Data Pros, welcome back to another episode of our Apache Airflow series!
Today, we're taking a deep dive into the world of Task flow API!
The Task flow API was introduced to Airflow, starting from version 2.0, aiming to simplify the dag authoring experience by significantly reducing the number of code lines needed by developers! this also makes the code more modular, understandable and easily maintainable!
The Taskflow API was built based on the concept of Python decorators! So to understand Taskflow API, you first need to know the basics of decorators.
Think of a decorator as a special function that takes another function as its input and returns a modified version of that original function.
It's like wrapping a gift: the decorator adds extra features to the original function.
Decorator is applied using the @ symbol, above a function definition.
In our case, ‘decorator example’ decorates the ‘say hello’ function.
Let's execute and observe the results. When I call the original function, you can see that the decorator's feature is also in effect.
For added clarity, let's remove the decoration from the function definition. Now, the original function behaves as usual.
Decorator is useful when applying common features to multiple functions, without code repetition.
This also keeps the code clean and organized by separating extra features and complex logics from core function.
Airflow has adopted this concept, and introduced various pre-built decorators.
Among them, one of the most popular and widely used decorators is @ task, which serves as an efficient alternative for traditional Python operators.
Let's see this in action!
Here's a dag coded in the traditional approach!
And here is the same dag coded with the @task decorator.
The first noticeable difference is the significant reduction in the number of code lines.
In the traditional approach, explicit coding of x-com logics for pushing and pulling data between tasks, along with the need to explicitly define dependencies between tasks, is required.
However, in the Taskflow API, all these boilerplate code lines are not necessary. They are cleverly hidden, or wrapped inside the @ task decorator by Airflow.
Its time to validate this in the Airflow UI.
As you can see, both these dags has produced exact same results!
Just like @ task, you can experiment with other prebuilt decorators based on your project's use case.
Before we wrapup the video, I'd like to break a common misconception, it's not always this or that! you can in fact create a hybrid dag that involves a combination of Taskflow API and traditional operators!
As always, all the code lines mentioned in this video are available in the link provided in the description.
In summary, the Taskflow API emerges as an invaluable feature in Airflow, promising significant reductions in development and maintenance efforts for your project!
That's all for today! Please stay tuned for our next video where we’ll explore more advanced airflow features!
1) Task Flow API:
from airflow import DAG
2) Traditional:
3) Hybrid:
Comments
Post a Comment