A curated list of the year’s best articles from the data world

Just like that, we’re at the end of 2022! And what a rollercoaster ride it has been, with major changes and uncertainty across every industry. (Especially for bird app users…)

A lot happened in the world of the modern data stack this year. We talked about job titles, thought about saying goodbye to data science, debated centralized vs. embedded data teams and bundling vs. unbundling, kickstarted important discussions like the technical pay gap, and so much more.

Whether you’re deep in this community or just getting started with data, it can be hard to keep up with everything. So, continuing our tradition from last year, we’re sharing the top blogs from 2022 along with some follow-up reading to keep you thinking. Happy reading!


P.S. Special shoutout to everyone who shared their data experiences, learnings, views, and observations this year! Now’s the time to have more open conversations about what we want for the future of data, and we’re so thankful for all the data practitioners who give their time to share insights, spark debate, and keep our industry moving forward.


On data as a product

Data product in changing environments: rethinking and updating investments by Eric Weber

The last few years have been full of ‘here’s what we need to do next’ or ‘once we have this team, we can do this’. We plan how we’d support more personas and areas of the business with more investment, but we don’t think about what we’d do if we had to cut support. I get it. That doesn’t feel very comfortable. But just like succession planning for people, we need to have a plan for what we’d do in hard situations. In some cases, you might drop support for particular personas on a product. In others, you might drop support for a product altogether. It isn’t easy to say what the ‘right answer’ is. But spending time thinking about your answer is important.

More follow-up reading:

On working with data

Should we be grateful for the modern data stack? by Benn Stancil

That’s the paradox we need to solve. Why has data technology advanced so much further than the value a data team provides? Does all of this new tooling actually hurt, by causing us to lose focus on the most important problems (e.g., the data in Salesforce) in favor of the shiny new things that don’t actually matter (e.g., the data in our twenty-fifth SaaS app)? Has the industry’s talent not caught up with the capacity of its tools, and we just need to be patient? Is the problem more fundamental? I’m not sure. But if our 2032 selves want to be as grateful for the 2020s as we should be for the 2010s, those are the next questions we need to answer.

More follow-up reading:

On data contracts

The rise of data contracts by Chad Sanderson

Data Contracts are API-like agreements between Software Engineers who own services and Data Consumers that understand how the business works in order to generate well-modeled, high-quality, trusted, real-time data.

Instead of data teams passively accepting dumps of data from production systems that were never designed for the purpose of analytics or Machine Learning, Data Consumers can design contracts that reflect the semantic nature of the world composed of Entities, events, attributes, and the relationships between each object.

This abstraction allows Software Engineers to decouple their databases/services from analytical and ML-based requirements. Engineers no longer have to worry about causing production-breaking incidents when modifying their databases, and data teams can focus on describing the data they need instead of attempting to stitch the world together retroactively through SQL.
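To make the idea a little more concrete, here is a minimal sketch of what a contract like this could look like in Python. It is our own illustration, not code from Chad's article: the `OrderPlaced` event, its fields, the `ContractField` class, and the `validate` helper are all hypothetical names, and a real setup would likely rely on a schema registry or a dedicated validation library rather than hand-rolled checks.

```python
from dataclasses import dataclass


# Hypothetical building block: one field that the producer promises to emit.
@dataclass(frozen=True)
class ContractField:
    name: str
    type: type
    required: bool = True


# Hypothetical contract for an "OrderPlaced" event, written by data consumers
# and agreed to by the engineers who own the ordering service.
ORDER_PLACED_CONTRACT = (
    ContractField("order_id", str),
    ContractField("customer_id", str),
    ContractField("amount_cents", int),
    ContractField("coupon_code", str, required=False),
)


def validate(event: dict, contract=ORDER_PLACED_CONTRACT) -> list[str]:
    """Return a list of contract violations (an empty list means the event conforms)."""
    errors = []
    for spec in contract:
        if spec.name not in event:
            # Optional fields may be absent; required ones may not.
            if spec.required:
                errors.append(f"missing required field: {spec.name}")
            continue
        value = event[spec.name]
        if not isinstance(value, spec.type):
            errors.append(
                f"{spec.name}: expected {spec.type.__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


good = {"order_id": "o-1", "customer_id": "c-9", "amount_cents": 1250}
bad = {"order_id": "o-2", "amount_cents": "12.50"}

print(validate(good))  # []
print(validate(bad))   # two violations: missing customer_id, wrong type for amount_cents
```

In this sketch the producing service could run `validate` before publishing an event, so a breaking change shows up as a failed check on the engineering side instead of a broken dashboard downstream.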

More follow-up reading:

On building and leading a data team

Growing data teams from reactive to influential by Emily Thompson

Data teams tend to be a fairly scrappy bunch, and often default to rolling up their sleeves and building what they need in order to get unblocked. But there is an opportunity here to start influencing roadmaps on other teams. Rather than filling in the technology gaps themselves with messy workarounds, my team’s charter also prescribed that they make technical recommendations to the teams we depended on.

Because the data team was now required to proactively drive the conversation, they made the time to work with partners and propose cross-functional solutions. Foundational work was considered part of the backlog of ‘impact-driving’ work, which led to specific quarterly goals, and progress was tracked just like every other initiative owned by the data team.

More follow-up reading:

BONUS: We talked with four amazing data leaders about what it takes to succeed in your first 365 days as a data leader: Stephen Bailey (Data Engineer at Whatnot), Erica Louie (Head of Data at dbt Labs), Taylor Murphy (Head of Data at Meltano), and Gordon Wong (Founder of Wong Decision Intelligence; formerly Senior Leader of Business Intelligence at HubSpot). Download the Secrets of a Modern Data Leader ebook here.

On metrics, data catalogs, active metadata, and more

People-first data stacks by Ilan Man

The problem is your stakeholders, while giving you the thumbs up the whole time and claiming they’d love an easier way to discover data, are no longer using the tools you’ve painstakingly researched and implemented. They fall into their old habits and inevitably you see an incorrectly defined metric on a PowerPoint slide somewhere.

We need to ensure stakeholders adopt data tools in the ways they should. Reading documentation and taking a training is not enough. We need to reinforce good data-tooling hygiene. I’ve seen many instances of folks starting out in a BI tool, and a few months later they’re back in Excel, pivoting a CSV and pasting it into a presentation. There should always be room for creative solutions and serendipity, but the Data team needs to keep an eye on how stakeholders use the tools they implement. Data models and BI tools need to adapt to business changes.

More follow-up reading:

Bonus picks

Still want more? Here are a few more articles to keep you reading and thinking through the new year:


This article was also published on Medium.

Header image: Aaron Burden on Unsplash
