If you’ve been on the internet recently, you’ve probably seen how OpenAI’s ChatGPT is taking the world by storm. From writing books and designing rooms to debugging code and explaining data concepts, it seems that automation and AI can now answer any question you ask.
In comparison, metadata can feel like living in the Stone Age. Traditionally, it’s a tiresome, manual process that needs a lot of human intervention from data leaders, stewards, engineers, and analysts alike. In our latest webinar, a poll showed that 60% of attendees spend over 3 hours per week on manual data tasks.
At Atlan, one of our goals is to automate wherever possible. How can we minimize the mundane and maximize the impact data teams can have on the business?
In last week’s Atlan Activate, our quarterly product webinar, we launched new automation features to superpower your data. Here are the top five features that will help you map context across your data estate and limit the repetitive manual work that slows data teams down.
TL;DR: 5 new features in Atlan you should know about
- Metadata Playbooks for rule-based actions: Like Zapier for data, this is the first low-code/no-code metadata automation for data teams.
- Atlan + AWS EventBridge event-based actions: Create production-grade, event-driven automations for the world of metadata, such as alerts when ownership changes or auto-tagged classifications.
- Profiling and Popularity Insights: Use new column-level profiling, popularity, and usage metrics to assess data’s quality, find the most widely used queries, identify top users, and more.
- Atlan to GitHub integration: Bring metadata right to GitHub to minimize risk and increase transparency before any changes are made to your data.
- Trident AI powered by GPT-3: Say goodbye to manual documentation with increasingly intelligent automated descriptions, business terms, READMEs, and more.
Metadata Playbooks: Introducing Zapier for your data estate
Create rule-based bulk automations at scale to automatically deprecate unused data assets, assign ownership, report failures or announcements, and more.
One of the common questions we get from data teams is, “How can we automate our metadata?” While other teams like marketing or sales can do action-based automation at scale with tools like Zapier or Salesforce, data teams don’t have the bandwidth to code custom automations for each diverse use case. Why can’t there be a Zapier for metadata?
That is why we have developed the first low-code/no-code metadata automation for data teams. With Atlan’s Playbooks, users can now create rule-based automations at scale. These can drive endless use cases across your organization. Here are a few examples:
- Deprecate assets: Mark any assets that haven’t been queried in the last 30 days as deprecated.
- Add/change ownership: For Salesforce assets that are missing an owner, add RevOps as the owner.
- Report failed assets: Post a list on Slack of every table and dashboard with a “failed” Airflow status.
- Protect sensitive data: Attach GDPR custom metadata on all assets tagged as “PII”.
- Flag upstream alerts: Notify downstream owners when an upstream asset is tagged with a warning or announcement.
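To make the idea of a rule-based automation concrete, here is a minimal sketch in Python of how two of the examples above could be modeled as "filter plus action" rules. It is an illustration only, not how Playbooks are implemented: the `Asset`, `Rule`, `run_playbook`, and `build_rules` names and fields are all hypothetical, and real Playbooks are configured in the Atlan UI rather than in code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class Asset:
    # Hypothetical asset record; not Atlan's actual data model.
    name: str
    source: str
    last_queried: datetime
    owner: Optional[str] = None
    status: str = "active"


@dataclass
class Rule:
    # A rule pairs a filter ("which assets?") with an action ("do what?").
    description: str
    matches: Callable[[Asset], bool]
    apply: Callable[[Asset], None]


def run_playbook(assets: List[Asset], rules: List[Rule]) -> List[str]:
    """Apply each rule to every matching asset and return an audit log."""
    log = []
    for rule in rules:
        for asset in assets:
            if rule.matches(asset):
                rule.apply(asset)
                log.append(f"{rule.description}: {asset.name}")
    return log


def build_rules(now: datetime) -> List[Rule]:
    """Build two example rules: deprecate stale assets, fill in missing owners."""
    cutoff = now - timedelta(days=30)

    def deprecate(asset: Asset) -> None:
        asset.status = "deprecated"

    def assign_revops(asset: Asset) -> None:
        asset.owner = "RevOps"

    return [
        Rule("Deprecate unused asset",
             lambda a: a.last_queried < cutoff, deprecate),
        Rule("Assign RevOps owner",
             lambda a: a.source == "salesforce" and a.owner is None, assign_revops),
    ]
```

Keeping the filter and the action separate is what makes this style of automation easy to expose in a no-code interface: each rule is just a pair of choices, and new use cases come from recombining them.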
Atlan + AWS EventBridge: Build custom, event-based metadata automations
Create production-grade, event-driven automations for the world of metadata, such as alerts when ownership changes or auto-tagged classifications.
Low-code/no-code, rule-based automations like Atlan’s Playbooks are great, but some data teams don’t want to be limited by rules. They want to build their own automations for repetitive actions.
Other teams, from data observability to incident management, are building automations that can trigger actions based on events. For example, many access management teams use Okta to monitor system log events for suspicious activity and automate actions to mitigate risks. Customer engagement teams use Salesforce to create events that enrich support cases with order data from customers. But what about metadata management? There’s nothing available to create these types of event-driven use cases for metadata… until now.
We’re excited to announce that data teams can now use Atlan to create production-grade, event-driven automations for the world of metadata. This builds on our integration between Atlan and AWS EventBridge, an AWS service that receives events and lets users consume them and build use cases on top of them. With the integration, Atlan sends metadata events to your EventBridge account.
Here are some examples of how this might work in Atlan:
- Ownership alerts: Get notifications in Slack when there is a change in the ownership of an asset.
- Propagation and classification: If someone marks a field as PII in an upstream data source, automatically create a masking policy to change all related fields downstream.
- And countless more use cases, such as notifications around schema changes on assets in Atlan for data engineering, login/logout events for security teams, or triggering a Fivetran enrichment event to kick off an Atlan workflow.
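As a rough sketch of the ownership-alert example, the Python below shows what an AWS Lambda-style handler triggered by an EventBridge rule might look like. The `detail-type` and `detail` keys are part of the standard EventBridge event envelope, but the `ASSET_OWNER_UPDATED` value and every field inside `detail` are assumptions made for illustration; the events Atlan actually emits may be named and shaped differently. Posting to Slack is deliberately left out, so the handler only builds the message.

```python
from typing import Any, Dict, Optional

# Hypothetical detail-type; the real event names Atlan emits may differ.
OWNERSHIP_CHANGED = "ASSET_OWNER_UPDATED"


def build_ownership_alert(event: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Turn an ownership-change event into a Slack-style message payload.

    Returns None for any other event type so the caller can ignore it.
    """
    if event.get("detail-type") != OWNERSHIP_CHANGED:
        return None
    detail = event.get("detail", {})
    # The field names below are assumptions, not a documented schema.
    asset = detail.get("asset_name", "unknown asset")
    old = detail.get("previous_owner") or "nobody"
    new = detail.get("new_owner") or "nobody"
    return {"text": f"Owner of {asset} changed from {old} to {new}."}


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Lambda-style entry point; sending the message is out of scope here."""
    message = build_ownership_alert(event)
    return {"handled": message is not None, "message": message}
```

In a real setup, an EventBridge rule would filter for the relevant events and route them to a function like this one, which would then forward the message to Slack or another channel.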
Leverage Profiling and Popularity Insights to build context and trust
Use new column-level profiling, popularity, and usage metrics to assess data’s quality, find the most widely used queries, identify top users, and more.
Being able to profile your data is powerful. It gives people the context and insights they need to understand, use, and trust data, whether they’re a data analyst or a business user.
That’s why a new Profiling feature is now available within Atlan. With a range of new metrics, users everywhere can build trust in the data they are consuming.
- Business metrics: Average (mean), maximum, minimum, row count, sum, and more.
- Advanced metrics: Duplicates, frequencies, missing data, uniqueness, standard deviation, and more.
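For readers who want a feel for what these metrics mean in practice, here is a small sketch that computes several of them for a single numeric column using only Python’s standard library. It illustrates the metrics themselves, not how Atlan calculates them; the `profile_column` function and its output keys are hypothetical.

```python
import statistics
from collections import Counter
from typing import Any, Dict, List, Optional


def profile_column(values: List[Optional[float]]) -> Dict[str, Any]:
    """Compute basic profile metrics for one numeric column.

    None represents a missing value.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)  # value -> frequency
    return {
        "row_count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        # Rows that repeat a value already seen elsewhere in the column.
        "duplicates": len(present) - len(counts),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "sum": sum(present),
        "mean": statistics.fmean(present) if present else None,
        # Sample standard deviation needs at least two values.
        "stdev": statistics.stdev(present) if len(present) > 1 else None,
        "most_common": counts.most_common(1)[0] if counts else None,
    }
```

Calling `profile_column([10, 20, 20, None, 30])` reports five rows, one missing value, three distinct values, and one duplicate, which is the kind of quick signal that helps someone decide whether a column is trustworthy.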
We’ve also added new Usage and Popularity features to help data teams manage their assets. Companies can use these new metrics to save costs on cloud data warehouse spending, identify the most used assets, find people with the most context on a data asset, and more.
- Asset query logs: The number of times an asset has been queried (asset popularity), when it was last updated, etc.
- User query logs: Top users, most recent users, who last queried an asset, etc.
- Cost and performance optimization logs: Which queries are slowest, which queries are most expensive, etc.
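The sketch below shows, under simplified assumptions, how popularity and usage figures like these can be derived from a query log. The `QueryLogEntry` shape and the three helper functions are hypothetical and exist only to illustrate the idea; real warehouse query logs carry far more detail.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QueryLogEntry:
    # Hypothetical, simplified log record.
    user: str
    asset: str
    runtime_seconds: float


def asset_popularity(log: List[QueryLogEntry]) -> List[Tuple[str, int]]:
    """Rank assets by how many times they were queried."""
    return Counter(entry.asset for entry in log).most_common()


def top_users(log: List[QueryLogEntry], asset: str, n: int = 3) -> List[Tuple[str, int]]:
    """Find the users who query a given asset most often."""
    return Counter(e.user for e in log if e.asset == asset).most_common(n)


def slowest_queries(log: List[QueryLogEntry], n: int = 3) -> List[QueryLogEntry]:
    """Return the n longest-running queries, slowest first."""
    return sorted(log, key=lambda e: e.runtime_seconds, reverse=True)[:n]
```

The top users of an asset are often the people with the most context on it, which is why the same log can answer both "what is popular?" and "who should I ask?".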
Atlan + GitHub: Enable data contracts by bringing metadata into your data creation process
Bring active metadata right to GitHub to minimize risk and increase transparency before any changes are made to your data.
A big discussion nowadays is about moving data “to the left”: that is, moving important data processes and checks closer to when assets are created, rather than when they are distributed. This idea is part of the data contracts debate, which has highlighted how important it is to improve reliability and usability between data producers and consumers.
Data contracts can play a critical role in data pipeline execution, validation of data types, versioning, and more. For instance, if a data engineer makes changes to a dbt model, they might unknowingly be affecting tens or hundreds of people, tables, or dashboards. How can we minimize this risk while also providing transparency for the data engineer and data consumer?
By bringing the power of active metadata from Atlan to GitHub, data engineers can now access all the context they need to minimize the risk for data consumers.
Here’s an example of what this looks like: say that you’re a data engineer, and you’ve created a pull request. When the GitHub action runs, it will automatically create a list of all downstream assets that will be affected by this request — before you make the change. From there, you can reach out to users in advance, or research and test the assets to see how they might be affected. As the saying goes, “prevention is better than cure”.
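Conceptually, listing the downstream assets for a pull request comes down to walking a lineage graph from the changed model. Here is a minimal Python sketch of that idea using a breadth-first traversal. The lineage dictionary, asset names, and comment format are made up for illustration; this is not the code the GitHub action runs.

```python
from collections import deque
from typing import Dict, List


def downstream_assets(lineage: Dict[str, List[str]], changed: str) -> List[str]:
    """Return every asset reachable downstream of `changed`, nearest first.

    `lineage` maps each asset to the assets that read from it directly.
    """
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            # Skip assets already recorded, and guard against cycles
            # that lead back to the changed asset itself.
            if child not in seen and child != changed:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def format_pr_comment(changed: str, impacted: List[str]) -> str:
    """Render the impact list as a plain-text pull request comment."""
    if not impacted:
        return f"No downstream assets depend on {changed}."
    lines = [f"Changing {changed} affects {len(impacted)} downstream asset(s):"]
    lines += [f"- {name}" for name in impacted]
    return "\n".join(lines)
```

Because the traversal follows indirect dependencies too, a dashboard two or three hops away from the changed model still shows up in the list, which is exactly the kind of impact that is easy to miss by hand.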
Automate metadata with Trident AI powered by GPT-3
Say goodbye to manual documentation with increasingly intelligent automated descriptions, business terms, READMEs, and more.
An existing feature in Atlan, Trident makes metadata enrichment fun and easy by providing suggestions for new descriptions, owners, terms, and classifications. While this has been extremely successful — in fact, one-third of all description updates on Atlan so far were made using Trident — our customers have been asking for more. More AI and ML, that is.
Introducing Trident AI, powered by GPT-3 — the powers of Trident combined with the intelligence of GPT-3. Currently under development, this feature will support use cases like:
- Creating descriptions for column names
- Creating descriptions for common business terms
- Writing a README for a business term
Before, Trident would provide a suggestion for these items, which you could then apply if you liked it. Now, with Trident AI, if you don’t like the suggestion, simply ask Trident AI for a change. Similar to ChatGPT, Trident AI will provide a more robust recommendation based on the power of AI and GPT-3.
💡 If you want to learn more, check out the full recording of Atlan Activate: Supercharged Automations to Map Your Entire Data Estate. You can also subscribe to our Product Updates newsletter for all the latest news.
💡 Ready to start using these new features? Reach out to our Sales team or your Customer Success Manager to find out how.