A curated list of blogs, books, newsletters, podcasts, and communities for all things modern data stack

Recently, one of our new hires at Atlan asked me, “What are the resources you recommend for me to stay on top of what’s happening in the modern data stack?”

The modern data stack is messy and complicated, and it’s changing every day. There’s a ton of news about it, and it’s hard to separate the hype and noise from reality.

Here’s how our team keeps in touch with the latest news and trends.

Modern Data Stack 101

The Building Blocks of a Modern Data Platform

My blog post is a beginner’s guide to what a modern data platform is, its key building blocks, and the top tools and companies at every stage of the stack.

Emerging Architectures for Modern Data Infrastructure

A great, in-depth read from a16z about which technologies are winning in the modern data stack, based on interviews with 20+ practitioners.

Modern Data Stack Conference 2020

Resources from Fivetran’s first Modern Data Stack Conference on the latest innovations, tools, and best practices.

The Modern Data Stack: Past, Present, and Future

This blog post from Tristan Handy is a good primer on the fundamental innovations that created the modern data stack, where we are right now, and the key spaces to watch for future innovation.

Building & Running Data Teams

Creating a Data-Driven Organization

This is one of my favorite beginner books on what it takes to build and run a data-driven organization. It features practical advice from the trenches by Carl Anderson, who wrote the book while he was the Director of Data Science at Warby Parker and is currently the VP of Data at WW (Weight Watchers).

Analyzing the Analyzers

My biggest challenge with data teams is stereotyping people to fit traditional job descriptions (JDs). In my experience, I have never found a typical “data scientist” or “data engineer” or “data analyst”.

I’ve found Sandy, who is an economist by training and amazing at understanding business problems, identifying solutions, and prototyping data science methods to solve them. Or Pam, who is a computer engineer by training and is great at productionalizing and scaling models, but also enjoys running POCs. Or Mark, who’s also a computer engineer by training but is more of a generalist who excels in a role that’s 50% data scientist and 50% engineer.

This book by Sean Murphy, Marck Vaisman, and Harlan Harris breaks the norms we have about the different types of people in data teams. Instead of trying to stereotype the humans of data, it builds heat maps of different skill sets and how they relate to different roles in a data team.

Two great books from O’Reilly about building and running data teams

Slack Communities

Locally Optimistic

This is one of my favorite vendor-neutral communities of data leaders and practitioners. It’s full of very thoughtful discussions, especially regarding data teams and structures.

dbt

dbt’s Slack community is one of the liveliest groups of data practitioners, or “analytics engineers” as they call them.

Great Expectations

This is an emerging community, mostly filled with data engineers hanging out and talking about one of my favorite topics: trust!

Newsletters

The Data Science Roundup

Every week, Tristan Handy curates a great set of data links along with his narrative and thoughts, which are always interesting to read. The posts are far-ranging, covering everything from data culture and data teams to new tools and layers of the stack.

Data Engineering Weekly

Ananth Packkildurai curates the top reads every week, mostly focused on data engineering.

Data Council

Their newsletter includes top reads, upcoming events, and more, often focused on open source projects.

Modern Data Stack

Curated by Andrew Ermogenous, this newsletter shares blogs, guides, and podcasts on the modern data stack and data culture.

Snippets from the Data Science Roundup (left) and Modern Data Stack (right) newsletters

Blogs & Podcasts

Towards Data Science

This is probably the most popular blog covering all things data. With a wide range of external contributors and excellent content editing guidelines, it has become the go-to destination for data practitioners to share their articles.

The Data Engineering Podcast

Hosted by Tobias Macey, this podcast mostly covers in-depth conversations about different data tools. The modern data stack can get messy very quickly, so this is a great resource to unpack the hype and marketing messages. His conversations always get down to the information we need to know about a new data tool or project: how it works, how it’s deployed, how it compares with other tools, etc.

Locally Optimistic

This blog focuses on practical tips, tricks, and learnings from a diverse range of data leaders about building and running data teams.


Enjoyed this? Subscribe to the Humans of Data Substack to receive the next post from Prukalpa in your inbox.

This article was originally published in Towards Data Science.
