One morning at 8 am, I woke up to a call from a Cabinet Minister of India. He said, “Prukalpa, the number on this dashboard doesn’t seem right.”
Frantic, I opened my laptop and loaded the dashboard, only to realize the number was clearly off. And yet, at that moment, there was nothing I could do to explain it. I could feel myself losing the credibility and hard-earned trust that had taken months to build.
I called my Project Manager, who was fantastic at stakeholder management but couldn’t understand the nitty-gritty of data. She called our Data Analyst, who looked at the dashboard and said, “Seems like something broke down in the pipeline.” Our Analyst then called our only Data Engineer, who pulled logs from Apache Airflow. But he couldn’t troubleshoot the issue because he didn’t know what the variables meant and didn’t have the data context.
It took us 8 hours and 4 people to figure out what went wrong. We lost time that day.
But more importantly, we lost trust. Trust with our customer. Trust in our team.
Trust is often not about things breaking. In years of working with data, I’ve learned that data will always be chaos. But when things break and you find out too late, or you can’t explain why something broke, that’s what breaks trust.
Imagine if, at that moment when the cabinet minister called me, I could quickly open a dashboard and say, “Yes, seems like the pipeline didn’t run on time today. We’ve received an alert and it has already been escalated to data engineering.” Or even better, imagine if the dashboard had an alert on it, signaling to the minister that something was wrong and he shouldn’t use it.
Today we are excited to announce that Atlan natively integrates with Apache Airflow. For data teams everywhere, this means more transparency and trust, and less time spent debugging pipelines after a broken dashboard or mismatched metrics.
Atlan + Airflow: Building an ecosystem of trust and transparency
With this integration, data teams can create better data engineering experiences centered on building knowledge and trust in their data.
First, Atlan’s integration with Airflow brings much-needed pipeline context to data assets.
Now you can share any type of metadata from Airflow pipelines to Atlan data asset profiles, where data analysts, scientists, and business users have access to it. This opens up pipeline context and makes it fully transparent so that data teams and consumers can always know the status of the data pipeline associated with each data asset.
Here are some great context fields that we’ve seen people bring from Airflow to Atlan:
- Freshness: When was my table last updated?
- Run schedule: Did the pipeline run as expected?
- Pipeline status: Was the last pipeline run successful?
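To make these three fields concrete, here is a minimal sketch of how a payload like this could be assembled from an Airflow run's metadata. This is an illustrative example, not Atlan's actual SDK: the function name, field names, and the 24-hour staleness threshold are all assumptions for the sake of the sketch.

```python
from datetime import datetime, timezone


def build_asset_context(
    dag_id: str,
    last_run_end: datetime,
    last_run_state: str,
    schedule: str,
) -> dict:
    """Assemble freshness, run schedule, and pipeline status for a data asset.

    Hypothetical helper for illustration only — not part of Atlan's or
    Airflow's real APIs. Field names and the 24h SLA are assumptions.
    """
    age_hours = (datetime.now(timezone.utc) - last_run_end).total_seconds() / 3600
    return {
        "dag_id": dag_id,
        "freshness": f"last updated {age_hours:.1f}h ago",  # When was my table last updated?
        "run_schedule": schedule,                           # Did the pipeline run as expected?
        "pipeline_status": last_run_state,                  # Was the last pipeline run successful?
        "is_stale": age_hours > 24,                         # assumed 24h freshness SLA
    }
```

In practice, a payload like this could be emitted from an Airflow DAG-level callback (such as `on_success_callback` or `on_failure_callback`) and pushed to the catalog, so the asset profile always reflects the latest run.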
Atlan already connects to data warehouses (e.g. Snowflake, Redshift) and BI tools (e.g. Tableau, Looker). Bringing Airflow into this ecosystem also means that data teams can now map relationships across all of their data. Whether you’re loading in new data, revising a pipeline, or setting up a dashboard, you can now construct and visualize data lineage from end to end.
Less time debugging, more time building
Getting an urgent call about broken data is one of the worst experiences for a data team. Instead of calling everyone who has ever touched the data, you can now diagnose the problem in seconds.
All it takes is opening a data asset profile and checking the pipeline status and metrics. No more hours of scrambling or broken trust: the Atlan and Airflow integration lets you see all of your data and its context in one place.
Ready to get started with this integration? Check out a demo of Atlan.
Here are two resources to help you get started with bringing Airflow and Atlan together: