A curated list of blogs, books, newsletters, podcasts, and communities for all things modern data stack
Recently, one of our new hires at Atlan asked me, “What are the resources you recommend for me to stay on top of what’s happening in the modern data stack?”
The modern data stack is messy and complicated, and it’s changing every day. There’s a ton of news about it, and it’s hard to separate the hype and noise from reality.
Here’s how our team keeps in touch with the latest news and trends.
Modern Data Stack 101
The Building Blocks of a Modern Data Platform
My blog post is a beginner’s guide to what a modern data platform is, its key building blocks, and the top tools and companies at every stage of the stack.
Emerging Architectures for Modern Data Infrastructure
A great, in-depth read from a16z about which technologies are winning in the modern data stack, based on interviews with 20+ practitioners.
Modern Data Stack Conference 2020
Resources from Fivetran’s first Modern Data Stack Conference on the latest innovations, tools, and best practices.
The Modern Data Stack: Past, Present, and Future
This blog post from Tristan Handy is a good primer on the fundamental innovations that created the modern data stack, where we are right now, and the key spaces to watch for future innovation.
Building & Running Data Teams
Creating a Data-Driven Organization
This is one of my favorite beginner books on what it takes to build and run a data-driven organization. It features practical advice from the trenches by Carl Anderson, who wrote the book while he was the Director of Data Science at Warby Parker and is currently the VP of Data at WW (Weight Watchers).
Analyzing the Analyzers
My biggest challenge with data teams is the habit of boxing people into traditional job descriptions. In my experience, I have never found a typical “data scientist”, “data engineer”, or “data analyst”.
I’ve found Sandy, who is an economist by training and amazing at understanding business problems, identifying solutions, and prototyping data science methods to solve them. Or Pam, who is a computer engineer by training and is great at productionalizing and scaling models, but also enjoys running POCs. Or Mark, who’s also a computer engineer by training but is more of a generalist who excels in a role that’s 50% data scientist and 50% engineer.
This book by Sean Murphy, Marck Vaisman, and Harlan Harris breaks the norms we have about the different types of people in data teams. Instead of trying to stereotype the humans of data, it builds heat maps of different skill sets and how they relate to different roles in a data team.
Slack Communities
Locally Optimistic
This is one of my favorite vendor-neutral communities of data leaders and practitioners. It’s full of very thoughtful discussions, especially regarding data teams and structures.
dbt
dbt’s Slack community is one of the most lively groups of data practitioners, or “analytics engineers” as they call them.
Great Expectations
This is an emerging community, mostly filled with data engineers hanging out and talking about one of my favorite topics: trust!
Newsletters
The Data Science Roundup
Every week, Tristan Handy curates a great set of data links along with his narrative and thoughts, which are always interesting to read. The posts are far-ranging, covering everything from data culture and data teams to new tools and layers of the stack.
Data Engineering Weekly
Ananth Packkildurai curates the top reads every week, mostly focused on data engineering.
Data Council
Their newsletter includes top reads, upcoming events, and more, often focused on open source projects.
Modern Data Stack
Curated by Andrew Ermogenous, this newsletter shares blogs, guides, and podcasts on the modern data stack and data culture.
Blogs & Podcasts
Towards Data Science
This is probably the most popular blog covering all things data. With a wide range of external contributors and excellent content editing guidelines, it has become the go-to destination for data practitioners to share their articles.
The Data Engineering Podcast
Hosted by Tobias Macey, this podcast mostly covers in-depth conversations about different data tools. The modern data stack can get messy very quickly, so this is a great resource to unpack the hype and marketing messages. His conversations always get down to the information we need to know about a new data tool or project: how it works, how it’s deployed, how it compares with other tools, etc.
Locally Optimistic
This blog focuses on practical tips, tricks, and learnings from a diverse range of data leaders about building and running data teams.
Enjoyed this? Subscribe to the Humans of Data Substack to receive the next post from Prukalpa in your inbox.
This article was originally published in Towards Data Science.