Enabling Discovery and Self-service to Increase the ROI of Data with Atlan
The Active Metadata Pioneers series features Atlan customers who have recently completed a thorough evaluation of the Active Metadata Management market. Paying forward what you’ve learned to the next data leader is the true spirit of the Atlan community! In that spirit, they’re here to share their hard-earned perspective on an evolving market, insights into what makes up their modern data stack, innovative use cases for metadata, and more.
In this installment of the series, we meet Arūnas Vaitkus, Chief Data Officer, and Vidmantė Čižienė, Data Engineer, at carVertical, a vehicle history data company serving 27 markets. With 900+ data sources, 22.3 million vehicle identification numbers checked, and 1.5 million users per month, data is crucial to carVertical, and Vidmantė and Arūnas share why Atlan fits into their modern data stack, as well as how the platform will help their colleagues find new, valuable ways to utilize data.
This interview has been edited for brevity and clarity.
Could you tell us a bit about yourself, your background, and what drew you to Data & Analytics?
I currently serve as the Chief Data Officer (CDO). My professional journey began in knowledge engineering and data engineering. I initially worked in a legal firm, where I led a global data architecture team. Later, I transitioned to the financial sector, gaining experience at a major investment bank and in various consulting roles. Eventually, I joined carVertical from its inception and was part of the team that built the entire carVertical data ecosystem from the ground up.
I have spent 10 years in communications, and all my professional roles have revolved around information, communication, and, over the last three years, data. The data cataloging project aligns with my areas of expertise: distributing information across the company, fostering inter-team communication, and organizing data comprehensively, all of which I’ve honed throughout my career.
Would you mind describing your data team?
carVertical has over 120 people right now, and around 20 of those work in the data department. We’ve set up four different teams in the department: Data Acquisition, Data Engineering, Machine Learning Engineering, and Data Analytics.
What does your data stack look like?
From the start, carVertical was not a typical data company. Most of the data was stored “as-is” in NoSQL databases. But as time went on, we started migrating to more typical relational databases and data warehouses. As of now, we have BigQuery as our data warehouse, and Postgres and MongoDB for operational purposes.
Most of our unstructured and semi-structured data is stored on Amazon S3 in our data lake. Orchestration happens with Airflow, which runs ELT processes with the help of dbt. We have chosen Tableau as the data visualization tool for our analytical needs.
As for operational data feeds, we are heavily invested in AWS services. The operational data transformations are off-loaded to AWS Lambda Functions. We tend to write just a few APIs, as we have a dedicated back-end team for that. It is also worth mentioning that the final data outputs of said Lambda Functions are validated against well-documented JSON schemas.
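To illustrate the kind of output validation described above, here is a minimal sketch in Python. It is not carVertical's code: the schema, the field names (`vin`, `mileage_km`, `country`), and the `validate` helper are all hypothetical, and the checker covers only a small subset of JSON Schema (`required` and per-property `type`). In practice a full validator such as the `jsonschema` package would be the more likely choice.

```python
# Hypothetical schema for one record emitted by a transformation function.
# Field names are illustrative assumptions, not taken from the interview.
VEHICLE_SCHEMA = {
    "type": "object",
    "required": ["vin", "mileage_km"],
    "properties": {
        "vin": {"type": "string"},
        "mileage_km": {"type": "integer"},
        "country": {"type": "string"},
    },
}

# Mapping from JSON Schema type names to Python types (subset only).
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def _has_type(value, type_name):
    # bool is a subclass of int in Python, so reject it for numeric types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, _JSON_TYPES[type_name])


def validate(record, schema):
    """Return a list of error messages; an empty list means the record passed."""
    if not isinstance(record, dict):
        return ["record must be a JSON object"]
    errors = []
    # Check that every required field is present.
    for key in schema.get("required", []):
        if key not in record:
            errors.append(f"missing required field: {key}")
    # Check the type of every declared field that is present.
    for key, rule in schema.get("properties", {}).items():
        if key in record and not _has_type(record[key], rule["type"]):
            errors.append(f"field {key} should be of type {rule['type']}")
    return errors
```

A function's final step could call `validate(output, VEHICLE_SCHEMA)` and refuse to publish the record if the returned list is not empty.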
Why search for an Active Metadata Management solution? What was missing?
Life happened. Our company is currently in a phase of growth, which means there’s an increasing demand for data, along with the need to onboard new team members. When I first joined the company, I was just the fifth person in our data domain.
Since then, our data domain has evolved significantly, with a diverse range of new roles emerging. Our product team is also growing, and as we continue to develop more products, the need for data keeps growing.
As the company’s overall demands continue to rise, and with more users and team members relying on data, it has become crucial for us to consolidate the knowledge that’s currently spread across different platforms, tools, and conversations. Our goal is to effectively bring all this scattered knowledge together to support our ongoing growth in the best possible way.
Why was Atlan a good fit? Did anything stand out during your evaluation process?
We did our research, checked the rankings, and came up with a number of vendors to check out. After interviews, we narrowed our options down to two vendors, and in the final stage, we chose Atlan. The evaluation process turned out to be a longer journey than we ever expected, and we began to wonder when we had signed up for this marathon. Nevertheless, it was surprisingly educational!
We wanted a tool that could be easily integrated into our systems, featuring a modern user interface so our colleagues could easily grasp the data’s context. Some of the larger vendors lacked an intuitive user interface, which could have negatively impacted adoption, a crucial concern for us.
Automation capabilities were also important. We need them to help us adopt the necessary processes on our path to ISO 27001. This includes support for policies such as documenting who has access to data, and making that easily visible.
We had a list of functionalities we were aiming for, primarily focused on discoverability, the capability to upload and manage our glossaries, and the assignment of ownership, classifications, and data lineage. Atlan met all these criteria.
I would also like to mention that we experienced a highly effective procurement process, led by Michael, the Account Executive, and Kevin, the Engineer.
What do you intend to create with Atlan? Do you have an idea of what use cases you’ll build, and the value you’ll drive?
We want to have most of the data that runs our products in Atlan. This way, product owners can see whether something is missing from their view, or whether there’s an opportunity to create a new product. Empowering them to find new uses for existing data reduces its cost, because the same data is reused multiple times.
The other big part is compliance. As a European-Union-based company, we hold a strong commitment to upholding privacy and security standards. Atlan is set to be a substantial addition to our ongoing efforts in this area.
With lineage, we are interested in impact analysis and seeing what effects changing a column upstream may have later on. Showing column-level lineage helps business owners see where their data is coming from and how it’s connected through different products. With that better view, we can either reduce data’s cost or increase its value.
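The impact analysis described above amounts to walking a lineage graph downstream from the column being changed. The following Python sketch shows the idea under stated assumptions: the `LINEAGE` mapping and its column names are invented for illustration, and the traversal is a plain breadth-first search, not Atlan's API or carVertical's implementation.

```python
from collections import deque

# Hypothetical column-level lineage: each column maps to the columns
# derived directly from it. All names are illustrative assumptions.
LINEAGE = {
    "raw.vehicles.vin": ["staging.vehicles.vin"],
    "staging.vehicles.vin": ["marts.reports.vin", "marts.pricing.vin"],
    "marts.reports.vin": ["tableau.history_dashboard.vin"],
    "marts.pricing.vin": [],
}


def downstream_impact(column, lineage):
    """Return every column reachable downstream of `column`, nearest first."""
    seen = {column}          # guards against cycles and duplicate visits
    queue = deque([column])
    impacted = []
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted
```

Calling `downstream_impact("raw.vehicles.vin", LINEAGE)` lists the staging column, both mart columns, and the dashboard field that a change to the raw column could affect.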
I see Atlan as a meeting point where different people come together, or as a data “e-shop”. The ultimate goal is to create a positive shopping experience for our internal users. We’re practical, and we know that the platform itself isn’t a magic fix for everything; its effectiveness depends directly on how well it meets users’ needs.
Did we miss anything?
Working with Atlan has given us a really nice story to tell to our seniors. I was impressed with the development team: how knowledgeable they are, how they involved us when building a MongoDB connector and heard our ideas for what we would need. Both sides were committed to building something better and it was cool. So, thanks for that.
From my side, as I come from a communications background, Atlan’s work in that field really caught my eye.