By Brendan Siebecker, Director of Alliances – Atlan
By Gaurav Malhotra, Solutions Architect – AWS
More people are using data today than ever before, including data engineers and analysts, product managers, marketers, and researchers. Yet it’s getting harder for all of them to collaborate on the same data.
Atlan is an AWS Partner with the Amazon Redshift Ready designation. It has pioneered a collaborative workspace helping modern data teams work together better. Atlan is a collaboration and orchestration layer—the glue that brings together your team, the tools you love, and the data you need.
With deep integrations across the modern data stack, Atlan helps teams create a single source of truth for all of their data assets. Atlan is extending its suite of integrations through extensive collaboration with Amazon Web Services (AWS), and you can now find Atlan on AWS Marketplace.
In this post, we will share how companies use Atlan and AWS to democratize their data, collaborate more effectively, and unify all of their knowledge and context in one place. We’ll also show how to integrate Atlan and Amazon Redshift with a step-by-step walkthrough.
How Companies Use Atlan and AWS
Customers in the AWS ecosystem can benefit from Atlan’s seamless integration with a suite of AWS services—including analytics tooling such as Amazon Redshift, Amazon Athena, and AWS Glue—as well as popular tools in the modern data stack such as Tableau, Apache Airflow, and dbt.
For example, Postman (an API platform used by more than 500,000 companies worldwide) uses AWS and Atlan to open up their data, build trust, and become more data-driven. This is important because Postman’s leaders staunchly believe that everyone in the company should be able to access data and gain insights from it. However, before Atlan, their data was often a mystery and context lived in the heads of early team members.
Prudhvi Vasa, Analytics Leader at Postman, explained the value of democratizing and documenting data with AWS and Atlan: “We’ve been able to catalog and document all of our data, which acts as a single source of truth for our data. The result? Everyone is able to find the right data for their use case, and the data is consistent across the board for all accessing it.”
“Having a reliable data foundation, where people can find and understand all our data opens the possibility of having everyone participate in analyzing data. This enables our entire company to become more data-aware and data-driven, which is the goal for any major company today,” Vasa adds.
Atlan, AWS, and the Modern Data Stack
Atlan acts as a virtualized layer across a variety of tools in the modern data stack. Its push- and pull-based metadata crawlers bring metadata from different tools in the data platform to build a unified collaboration platform.
First, Atlan creates a powerful search and discovery layer. It acts as a Google-like search engine for all of your data, where you can quickly discover and access any data asset along with all of its associated context and documentation. This search supports intelligent keyword recognition, powerful search filters, sorting by relevance or popularity, and even a Cmd+K shortcut.
This search doesn’t just surface data tables—it surfaces everything about an organization’s data. Today, data assets are not just tables. That’s why Atlan lets people search across every type of data asset—business intelligence (BI) dashboards, pipelines, code, models, queries, metrics, directed acyclic graphs (DAGs), and more.
Second, Atlan unifies context from all the different tools in your data stack in one place. Where does this data come from? Who uses it? Can I trust it? The “asset profile” in Atlan answers questions like these with information like a data asset’s description, certification (verified, WIP, or deprecated), column previews, sample data, and Readme. This puts everything needed to understand a data asset, such as its lineage, documentation, and ownership, in a single view.
Embedded Collaboration Integrations
Atlan is built on the premise of embedded collaboration, borrowing principles from GitHub, Figma, Superhuman, and other modern future-of-work tools.
Embedded collaboration is about work happening where you are, with the least amount of friction. What if you could request access to a data asset when you get a link, and the owner could get the request on Slack and approve or reject it right there?
What if, when you’re inspecting a data asset and need to report an issue, you could immediately trigger a support request that’s perfectly integrated with your engineering team’s Jira workflow?
Embedded collaboration unifies these micro-workflows that waste time, cause frustration, and lead to tool fatigue, turning time-consuming tasks across multiple tools into a few clicks in whichever tool you’re already using.
Getting Started with Atlan and AWS
Atlan is built on top of AWS’s powerful services. By directly integrating with AWS services like Amazon Redshift, Atlan helps data teams accomplish more by making collaboration a seamless part of their process.
In the following step-by-step guide, we’ll show you how to quickly integrate Atlan with Redshift to open up a new world of collaboration, clarity, and trust for modern data teams.
Follow the steps below to establish a connection and integrate Atlan with a Redshift database.
Step 1: Select the Source
- Log into your Atlan workspace.
- Click on the Workflow button in the left sidebar.
- You’ll see the Marketplace page with the list of sources available on your workspace. Click on New Workflow at the top right.
- Select Redshift from the list of options in the integrations tab, and click Setup Workflow.
Step 2: Provide Credentials
- To set up a new connection, fill in your Redshift credentials on the Credential page. Below is an example:
  - Hostname: examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com
  - Port: 5439
  - Username: myusername
  - Password: xxxxxx
  - Default Database: dev
- Select the correct authentication method (basic authentication, or an IAM user or IAM role).
- Once you have filled in the details, click on Test Authentication and then Next.
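Before entering values on the Credential page, it can help to sanity-check them locally. The sketch below is only an illustration and is not part of Atlan: `RedshiftCredentials`, `parse_endpoint`, and `validate` are hypothetical names, and the code assumes the hostname follows the same `<cluster>.<id>.<region>.redshift.amazonaws.com` shape as the example above. It makes no network connection.

```python
from dataclasses import dataclass

# Assumption: the endpoint has the same shape as the example hostname in this post.
ENDPOINT_SUFFIX = ".redshift.amazonaws.com"


@dataclass(frozen=True)
class RedshiftCredentials:
    """Hypothetical container mirroring the fields on the Credential page."""
    hostname: str
    port: int
    username: str
    password: str
    default_database: str


def parse_endpoint(hostname: str) -> dict:
    """Split a cluster endpoint into cluster name, unique id, and region."""
    if not hostname.endswith(ENDPOINT_SUFFIX):
        raise ValueError(f"not a Redshift cluster endpoint: {hostname!r}")
    parts = hostname[: -len(ENDPOINT_SUFFIX)].split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected endpoint shape: {hostname!r}")
    cluster, unique_id, region = parts
    return {"cluster": cluster, "unique_id": unique_id, "region": region}


def validate(creds: RedshiftCredentials) -> list:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    try:
        parse_endpoint(creds.hostname)
    except ValueError as exc:
        problems.append(str(exc))
    if not 1 <= creds.port <= 65535:
        problems.append(f"port out of range: {creds.port}")
    for field_name in ("username", "password", "default_database"):
        if not getattr(creds, field_name).strip():
            problems.append(f"{field_name} is empty")
    return problems


# The example values from the credential list above.
example = RedshiftCredentials(
    hostname="examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com",
    port=5439,
    username="myusername",
    password="xxxxxx",
    default_database="dev",
)
print(parse_endpoint(example.hostname))
print(validate(example))
```

A check like this only catches typos and malformed values. The Test Authentication button remains the real test of whether the credentials work.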
Step 3: Set Up Your Configuration
- On the Connection page, name your connection and select the users or groups who should be able to access it.
- On the Metadata page, specify any metadata you want to include or exclude from crawling.
- Click Run to run the crawler once, or click on Schedule & Run to schedule it for a daily, weekly, or monthly run.
- Once you click Run or set a schedule, the workflow will start running.
Atlan will crawl your Amazon Redshift instance and ingest all of the metadata into Atlan. How long this takes depends on the size of the warehouse being crawled, but it often takes less than 30 minutes.
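As a rough mental model of the include and exclude choices on the Metadata page, the sketch below filters a list of `database.schema` names with glob patterns. It is an illustration only: the function name, the glob syntax, the schema names, and the rule that exclusions win over inclusions are all assumptions made for this example, not Atlan's documented behavior.

```python
from fnmatch import fnmatchcase


def select_schemas(schemas, include=None, exclude=None):
    """Keep names matching any include pattern (all names if none are given),
    then drop names matching any exclude pattern."""
    include = include or ["*"]
    exclude = exclude or []
    selected = []
    for name in schemas:
        included = any(fnmatchcase(name, pattern) for pattern in include)
        excluded = any(fnmatchcase(name, pattern) for pattern in exclude)
        if included and not excluded:
            selected.append(name)
    return selected


# Hypothetical schema names in the "dev" database.
schemas = ["dev.public", "dev.sales", "dev.sales_staging", "dev.tmp_scratch"]

# Crawl only the sales schemas, but skip anything marked as staging.
print(select_schemas(schemas, include=["dev.sales*"], exclude=["*_staging"]))
```

Listing the schemas that would be selected before running the crawler is a cheap way to confirm that scratch or staging areas stay out of the catalog.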
Step 4: Discover Your Assets
Now that you’ve successfully connected Atlan to Amazon Redshift and the Atlan crawler has ingested the metadata, you can start discovering your assets inside Atlan.
In the example below, a company has 362 Redshift assets inside Atlan. These are visible on the Discovery page, filtered by the Amazon Redshift integration.
The data team can click through to any asset to see relevant context, such as the column names, glossary terms, classifications, status, related queries, and Readme.
They can also see the data asset’s lineage, which is auto-generated at the column level for every Redshift asset. This helps data teams see where each asset comes from and which dashboards use it.
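Column-level lineage can be pictured as a directed acyclic graph whose nodes are columns and dashboards and whose edges point from a source to whatever is derived from it. The sketch below is a toy model with made-up asset names, not Atlan's data model or API. It shows how the two questions above (where an asset comes from, and which dashboards use it) reduce to walking the graph upstream or downstream.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Toy lineage graph: an edge (source, target) means target is derived from source."""

    def __init__(self, edges):
        self.downstream_of = defaultdict(set)
        self.upstream_of = defaultdict(set)
        for source, target in edges:
            self.downstream_of[source].add(target)
            self.upstream_of[target].add(source)

    def _walk(self, start, neighbours):
        """Breadth-first traversal returning every node reachable from start."""
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def upstream(self, node):
        """Everything the given node is directly or indirectly derived from."""
        return self._walk(node, self.upstream_of)

    def downstream(self, node):
        """Everything that directly or indirectly depends on the given node."""
        return self._walk(node, self.downstream_of)


# Hypothetical columns and dashboard, for illustration only.
graph = LineageGraph([
    ("raw.orders.amount", "analytics.daily_revenue.total"),
    ("raw.orders.order_date", "analytics.daily_revenue.day"),
    ("analytics.daily_revenue.total", "dashboard:Revenue Overview"),
    ("analytics.daily_revenue.day", "dashboard:Revenue Overview"),
])

print(sorted(graph.upstream("dashboard:Revenue Overview")))
print(sorted(graph.downstream("raw.orders.amount")))
```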
Setting up Atlan is easy for analysts and business users alike. From connecting your tables and dashboards within Amazon Redshift to enriching imported data assets, the process doesn’t require major data engineering resources or time. Crawling happens seamlessly and stays in sync with your AWS instances.
This article was originally published on the AWS Partner Network (APN) blog.