At Atlan, we started as a data team ourselves.
We were inspired by the idea of using data science to tackle the big problems facing humanity. At one point, we were processing data for 500 million Indian citizens, and billions of pixels of satellite imagery.
These were dream projects for data practitioners. But in reality, every day was chaos.
As soon as we started hitting scale, our Slack messages started looking like this:
Who can explain what this data means?
There are two tables with the same name?! Which one should I use?
Why are our MRR numbers different in the finance and sales reports?
Our “2x spike in applications” disaster
One day, a Cabinet Minister of India called at 8 am and said, “The number on this dashboard doesn’t seem right.”
Frantic, I opened up my laptop and loaded the dashboard. There was a sinking feeling in my heart as I realized the number of applications to the program had doubled… in a day. That wasn’t possible.
Something was clearly wrong. And yet, in that moment, there was nothing I could do to explain it. I could feel myself losing the credibility and hard-earned trust that had taken months to build up. I said, “Sir, I’ll call you back.”
I called my Project Manager, who was fantastic at stakeholder management but couldn’t understand the nitty-gritty of data. She called our Data Analyst, who looked at the dashboard and said, “Seems like something broke down in the pipeline.” Our Analyst then called our only Data Engineer.
When we couldn’t reach him, we were left helpless for a few hours. Our Data Engineer was supporting 5 other projects at the time, and he had just gone to sleep after pulling an all-nighter for another project. He finally woke up, pulled out the audit logs, and called us back.
Nothing had visibly changed in the pipeline, and nothing had broken. “It must be a data issue,” he said. But he didn’t know what the variables meant and how they should work together, so he couldn’t troubleshoot the issue.
We painstakingly started an RCA (Root Cause Analysis) exercise. Our Data Analyst, Data Engineer, and Project Manager sat together and checked the input and output for every node in our data pipeline in Airflow.
After 8 hours of 3 people’s time, we found the root cause. The API that sent data to our dashboard typically sent incremental data, i.e. the number of applications registered on a single day, such as May 1st. That day, something changed in the API, and it sent us cumulative data in what was normally a daily field. This led to a 2x spike in one day.
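For readers who want to picture the failure, here is a minimal Python sketch of the kind of check that might have flagged it sooner. It is not the tooling we built; the function name `looks_cumulative`, the `tolerance` and `min_history` parameters, and the sample numbers are all hypothetical, chosen only for illustration. The idea is simply that a "daily" value that lands close to the running total of all previous days is more likely a cumulative figure than a real daily count.

```python
from statistics import median
from typing import Sequence


def looks_cumulative(
    history: Sequence[int],
    new_value: int,
    tolerance: float = 0.1,
    min_history: int = 3,
) -> bool:
    """Return True if new_value looks like a running total rather than a daily count.

    history     -- previously received daily counts (hypothetical input shape)
    new_value   -- the value that just arrived in the "daily" field
    tolerance   -- allowed relative distance from the expected running total
    min_history -- with fewer points than this, the check declines to guess
    """
    if len(history) < min_history:
        return False

    # If the source switched to cumulative reporting, the new value should be
    # roughly "everything so far plus one more typical day".
    typical_daily = median(history)
    expected_cumulative = sum(history) + typical_daily

    return abs(new_value - expected_cumulative) <= tolerance * expected_cumulative


# Made-up daily application counts, for illustration only.
daily_counts = [100, 110, 90, 105]

print(looks_cumulative(daily_counts, 102))  # an ordinary day
print(looks_cumulative(daily_counts, 510))  # close to the running total
```

A heuristic like this is deliberately crude. It assumes daily counts are fairly stable, and it would need tuning (or a different rule entirely) for data with real spikes or seasonality.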
Chaos as the norm, not the exception
This story might seem like an ordinary occurrence. Laughable even.
But at one point, we were spending 50-60% of our time dealing with issues like this.
Trust issues were breaking our team apart. The Analyst Teams blamed our Data Engineering Team. Our Data Engineering Team had far more work than they could handle and was frustrated by constant firefighting. They blamed our Consulting Teams for setting unreasonable expectations.
We had doubled our team size, but new members didn’t know enough about our data or projects to become productive. Older team members were too overburdened with work to share their knowledge.
One night, I received an email from our oldest analyst, saying he quit. We hit rock bottom that night.
Rebuilding bit by bit with the Assembly Line Project
We assembled our now shrunken team into a room for a marathon session, where we created a list of every single thing that went wrong during our data projects. That checklist was the start of our “Assembly Line” project.
We assembled a team of engineers and started an internal project to make our lives better. Our goal was to make our data team more agile and efficient. And over two years, we built tools that powered our human stack.
Bit by bit, things got better. The SOS calls stopped. The number of iterations reduced. Our team started being able to do more in less time. The culture of the team improved drastically.
Even though our team was relatively small, we partnered with the United Nations in 2017 to create global data platforms for the SDGs. These were a unified, data-driven way to track global progress toward Agenda 2030.
In 2018, our team built India’s entire national data platform. It was launched by Prime Minister Modi himself. Today, it is used by 100,000 government officials, MPs, and MLAs as a backbone for data-driven decision making.
Our work was impacting the lives of over a billion people. Pretty cool, right? But even that wasn’t the cool part. This platform was built by an 8-member team, 4 of whom had never pushed a line of code to production before. The average age: 25 years old.
Benchmarking our progress
After completing that project, we felt like it had gone well. But, as data nerds, we wanted to quantify it. We wanted real proof that all our hard work on the Assembly Line Project had been worth it.
We benchmarked the time it took us to build the national data platform, compared to a similar project we executed (albeit at a slightly smaller scale) two years earlier for the Chief Minister of a large state in India.
We were amazed to find that our team had become 6 times more agile.
In the time it took us to do 1 project in 2016, we could now do 6 projects in 2018. Here’s how.
Our work quality doubled. Our better data and dashboards meant that we had to spend less time going back and forth on revisions with our stakeholders. In fact, the number of iterations we needed to get to a finished product had been reduced by half. Our end stakeholders were happier, and our team was too 🙂
We only needed one-third of the resources. The amount of time and people we needed on each product had reduced drastically. We achieved this mainly through tooling that reduced team members’ dependencies on one another. Through automated data quality profiling, lineage, and data observability tools, our data engineering team could spend more time building and less time as a bottleneck.
At the same time, our analysts were empowered to do more. New data discovery and documentation tools helped them eliminate 25% of their effort when starting new projects or using new data sets, followed by more time saved at each subsequent step. This meant that a single analyst could do almost twice the work they could previously do.
Our cycle time was cut in half. Saving time, every step of the way, meant we could go significantly faster. This allowed us to promise and deliver on timelines to customers that were at least two, if not three, times faster than market standards.
We called all of these new tools “the tech stack that powered our human stack”.
In data teams, the sum is greater than the parts
When diverse people — engineers, analysts, economists, consultants, and scientists — can come together and collaborate effectively, amazing things can happen. Amazing things that wouldn’t happen without the inherent diversity in these teams.
Our team went on to crack solutions to problems that had seemed impossible. We built an Affluence Index that measured affluence — not for a zip code, but for an individual building. We built a disease prediction model for 22 diseases for every sub-district in India. We could measure economic growth in every village, every month.
That’s when we started thinking… Could our tech stack help not just our team, but all the humans of data around the world?
Our purest goal: building software that can impact everyone
At a very personal level, our founding team at Atlan has always been driven by this idea of “impact” — what can you do in your lifetime, and what impact can it leave on the world?
Software, when built for the right reasons and with the right scale, promises a scale of impact that very few other endeavors can ever have.
This single image from the control centre, taken when NASA sent the InSight spacecraft to Mars in 2018, exemplifies that notion. At that moment, humankind was taking a huge leap forward. And at the same time, the team behind it was using Slack to collaborate.
Today, NASA and 600,000 other teams around the world can attribute some of their culture and success to Slack. Heck, even our team uses Slack to collaborate, and I can’t imagine what our culture would have been without it!
As data becomes a function, one that’s more important than software in many organizations, could Atlan be the collaborative tissue that helps these teams do more?
The next decade is the decade of data-driven teams
In our office, we have a ritual called the Dream Wall, where our team jots down their dreams for our work. Many of them are about the amazing teams that we think Atlan can empower and enable.
We believe that data-driven teams will be behind the most amazing human achievements in the next decade, from curing cancer to developing self-driving cars to putting people on Mars.
However, they will only be successful if these diverse individuals find a way to collaborate effectively — when the “humans of data” become a real team.
We think that collaboration is the key to making the sum greater than the parts. If we can help the diverse people in today’s data teams work together effectively, we can finally invert the depressing statistic that only 27% of data projects are successful. It’s long past time that 73% of these teams saw success.
At Atlan, our dream is to be the place where these amazing teams live every day. Where they work together, better.
We’re always looking for dreamers who believe in the power of data-driven teams. If this sounds like you, check out our open positions.