
The best data books to read in 2024

Thursday, Dec 14, 2023 · 4 min read
Quyen Nguyen

Are you a data enthusiast searching for fresh sources of knowledge and insights? 

The data world evolved rapidly in 2023, and it’s important to keep up with new developments in the field. In our quest to offer you the best resources, we asked the team at Validio and members of the Heroes of Data network for their favorite data books.

We’re excited to share with you a curated list of 9 books to inspire your reading list for 2024. The books range from data engineering and data science, to data mesh and data quality, and more. Most of the books are for people who want to learn more about the technical concepts of data, but you will find one or two business reads as well. 

Enjoy your new year with new knowledge and inspiration! 

1. Fundamentals of Data Engineering 

Authors: Joe Reis & Matt Housley

Recommended by: Matthew Weingarten, Senior Data Engineer at Disney Streaming Services.

“For anyone looking to get started in data engineering, this is the book to check out. Without getting too deep into specifics (which is a constant limitation of Data Engineers), this does a great job of hitting all the major points to consider if you want to do data engineering the right way.”

2. Designing Data-Intensive Applications

Author: Martin Kleppmann 

Recommended by: Mathias Kindberg, Software Engineer at Validio.

“The one stop shop on theory and practice powering everything data, from streaming to databases to batch processing.”

Also recommended by: Matthew Weingarten, Senior Data Engineer at Disney Streaming Services.

“For those looking into how to scale a system that deals with big data, this is a great book to cover key points that will almost assuredly arise at some point. A lot of great content here.”

3. Statistical Rethinking: A Bayesian Course with Examples in R and Stan

Author: Richard McElreath

Recommended by: Erik Thorsén, Senior Data Scientist at Validio.

“A great introduction to statistics and a more technical in depth introduction to causal inference. All the Bayesians in the world will love you after reading this. It's a great book, well written and has a lot of really good content.”

4. Hands-On Machine Learning with Scikit-Learn and TensorFlow

Author: Aurélien Géron

Recommended by: Sara Landfors, Head of Marketing at Validio.

“This is the book I used to learn to do machine learning in Python. It has all of the theory and tons of resources to code examples and exercises you can do. In short, it contains everything you need to learn data science.” 

5. Data Mesh: Delivering Data-Driven Value at Scale

Author: Zhamak Dehghani

Recommended by: Oscar Ek, Strategic Projects Manager at Validio.

“The data mesh bible directly from the thought leader Zhamak Dehghani. Covers all relevant concepts and goes into details with examples. I personally used the book a lot when implementing data mesh operating models for global enterprises.”

Also recommended by: Matthew Weingarten, Senior Data Engineer at Disney Streaming Services.

“Learning the fundamentals of data mesh is key if you work with data in a bigger organization. This is a great model to follow for all teams that are ready to take on a challenge.”

6. Data Quality: Empowering Businesses with Analytics and AI

Author: Prashanth Southekal, PhD

Recommended by: Amit Sharma, Data Product Manager at wefox.

“In this book, Prashanth has shared 10 Best Practices to achieve high data quality. He has also defined a Data Quality framework, DARS (Define-Analyse-Realise-Sustain), which can be used to achieve high data quality.”

7. Driving Data Quality with Data Contracts: A comprehensive guide to building reliable, trusted, and effective data platforms

Author: Andrew Jones 

Recommended by: Amit Sharma, Data Product Manager at wefox.

“The book will guide you through the problems with current data architectures before introducing data contracts, a step change in building a new type of data architecture. It looks at how data contracts drive a change in your data culture and finishes with practical advice on implementing a successful data platform built around data contracts.”

8. The Book of Why: The New Science of Cause and Effect

Authors: Judea Pearl & Dana Mackenzie

Recommended by: Erik Thorsén, Senior Data Scientist at Validio.

“An introduction to causal inference from one of its creators. It's not a technical book so perfect for a longer weekend where you want to learn a thing or two!”

9. Unmasking AI: My Mission to Protect What Is Human in a World of Machines

Author: Joy Buolamwini

Recommended by: Dhiana Deva, Staff Machine Learning Engineer at EQT Group.

“As the founder of the Algorithmic Justice League, celebrated "Poet of Code", a prominent figure in Netflix's "Coded Bias", and a researcher at MIT Media Lab, Joy Buolamwini has made her literary debut with her first book. Titled "Unmasking AI", this compelling work delves deeply into the intricacies of algorithmic discrimination, providing profound insights and proposing pathways towards the development of more equitable and accountable artificial intelligence technologies. Having met her many years ago and followed her journey, I highly recommend fellow AI practitioners to consider reducing one hour of LinkedIn hype scrolling a day and start reading this book - now!”

Have thoughts on the list and want to share your favorite data books as well? Let us know.

Want to learn more about data observability? Visit Validio’s website for more blog posts and case studies on the topic.