Airflow Prometheus (0.4.2)
We extended the project, improved the code and added new features to enable better monitoring of your Airflow workloads. :rocket:
To install this package, run:
$ python3 -m pip install "airflow-prometheus==0.4.2"
Or if you are using Poetry to run Apache Airflow:
$ poetry add apache-airflow@latest
$ poetry add "airflow-prometheus@0.4.2"
What does this package provide?
- Support for exporting Prometheus metrics
- Support for exporting additional data into Grafana
Metrics are exported on the
| Metric | Labels | Description |
|---|---|---|
| dag_bag_stats | property | Statistics for the dag bag |
| airflow_dag_status | dag_id, owner, status | Shows the number of dag starts with this status |
| airflow_dag_run_duration | dag_id | Duration of successful dag_runs in seconds |
| airflow_dag_scheduler_delay | dag_id | Airflow DAG scheduling delay |
| airflow_task_status | dag_id, task_id, operator_name, owner, state | Shows the number of task instances with particular status |
| airflow_task_duration | aggregation, operator_name, task_id, dag_id | Durations of tasks in seconds by operator |
| airflow_task_max_tries | operator_name, task_id, dag_id | Max tries for tasks |
| airflow_last_dag_run | status, task_id, dag_id | Tasks status for latest dag run |
| airflow_successful_task_duration | task_id, dag_id, execution_date | Duration of successful tasks in seconds |
| airflow_task_fail_count | dag_id, task_id | Count of failed tasks |
| airflow_xcom_parameter | dag_id, task_id | Airflow Xcom Parameter |
| airflow_task_scheduler_delay | queue | Airflow Task scheduling delay |
| airflow_num_queued_tasks | - | Airflow Number of Queued Tasks |
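To illustrate how the metric names and labels in the table can be consumed, here is a minimal sketch that parses Prometheus text exposition output with the Python standard library and collects the failed-run counts per DAG. The sample payload, its label values, and the helper names `parse_samples` and `failed_dag_runs` are hypothetical and not part of this package. The parser ignores optional timestamps and other less common syntax.

```python
import re

# Hypothetical sample of exporter output; the label values are invented for illustration.
SAMPLE = """\
# HELP airflow_dag_status Shows the number of dag starts with this status
airflow_dag_status{dag_id="etl_daily",owner="airflow",status="success"} 12.0
airflow_dag_status{dag_id="etl_daily",owner="airflow",status="failed"} 1.0
airflow_num_queued_tasks 3.0
"""

# metric_name{label="value",...} value   (the label block is optional)
_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _LINE.match(line)
        if match is None:
            continue  # skip lines this simplified parser does not understand
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def failed_dag_runs(samples):
    """Map dag_id to the airflow_dag_status value where status is 'failed'."""
    return {
        labels["dag_id"]: value
        for name, labels, value in samples
        if name == "airflow_dag_status" and labels.get("status") == "failed"
    }
```

In practice you would feed `parse_samples` the body fetched from your own metrics endpoint instead of `SAMPLE`. For anything beyond a quick check, a maintained Prometheus client parser is likely a better choice than this regular expression.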
You can use the SimpleJson datasource to display the states of DAGs. Install the plugin with the following command or via grafana.com:
$ sudo grafana-cli plugins install grafana-simple-json-datasource
Now let’s create a JSON datasource and point it to /metrics/json/. The trailing slash is important, and you may need to enable the skip TLS verify option for it to work.
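Because a missing trailing slash is easy to overlook, the datasource URL can be built programmatically. The sketch below assumes that /metrics/json/ is served under your Airflow webserver's base URL. That assumption, the helper name `json_datasource_url`, and the example hosts are hypothetical and not part of this package.

```python
from urllib.parse import urlsplit, urlunsplit


def json_datasource_url(airflow_base_url):
    """Append /metrics/json/ to a base URL, always ending with a trailing slash."""
    parts = urlsplit(airflow_base_url)
    # Drop any trailing slash from the base path so the join never yields "//".
    base_path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, base_path + "/metrics/json/", "", ""))


# Hypothetical hosts, for illustration only.
print(json_datasource_url("https://airflow.example.com"))
print(json_datasource_url("http://localhost:8080/airflow/"))
```

Paste the resulting value into the URL field of the datasource settings.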
Next, add an ad-hoc variable to the dashboard.
The ad-hoc filter now appears at the top of the dashboard, and you can use it to select DAGs. Now we need to add some visualizations.
Add a new panel and select the newly created JSON datasource. Select dags as the metric and Node Graph as the visualization type.
The Node Graph panel shows the dependencies between tasks and their status for the latest instance of the DAG. DAGs can be selected with the ad-hoc variable you created. You can remove that ad-hoc filter to show all DAGs, but this is not recommended because the Node Graph panel is fairly bad at zooming or panning the diagram.
The example dashboard is available here: example/dashboard.json