Heroes of Data

Revolutionizing Analytics in Insurance using Fivetran, dbt and Vertex AI—an inside look with Hedvig

August 24, 2022

Sara Landfors

“People didn’t wake up one day and decide that insurance is a boring financial commodity. It’s the insurance industry’s fault.” This is how Hedvig, a Next Generation Insurance company, started their presentation at our last Heroes of Data Meetup. They want to “provide reliable insurance that enables freedom: the possibility to live a more colorful and exciting life without worry and fear.” In this article, we’ll summarize Hedvig’s analytics journey and how it has enabled a customer experience that earns a Net Promoter Score 10x that of its competition!

Heroes of Data is an initiative by the data community for the data community. We share the stories of everyday data practitioners and showcase the opportunities and challenges that arise daily. Sign up for the Heroes of Data newsletter on Substack and follow Heroes of Data on LinkedIn. This article is summarized by Sara Landfors based on a presentation by Hedvig at the Data Engineering Meetup by Heroes of Data in June 2022.

First, we’ll introduce Hedvig in a little more detail. Then we’ll briefly introduce Hannes, Filip and Jacinta, who are some of the masterminds behind Hedvig’s data stack and members of the Heroes of Data community. Third, we’ll dive into the bulk of this article, which is Hedvig’s analytics journey: what tools they have used in their data stack and for what purposes, how that has changed over time, and their vision for the future. Let’s dive in!

A little bit more about Hedvig

Hedvig was founded in 2017 and is a challenger in the insurance industry, whose leading companies are 150 years old(!). Hedvig saw an opportunity to disrupt a space which, despite its large size, had experienced relatively little digital innovation and change compared to other traditional industries. As part of realizing this disruption, they have designed a completely different, smooth and fast claims experience. All in all, this has led to tremendous growth, especially via referrals, and especially in their home country, Sweden. They also market themselves very differently from traditional insurers. For example, they were the first insurance company to cover beauty products, resulting in a Vogue feature (see below).

So how does this relate to data? Well, this success would not be possible without a decision-making process backed by data and analytics, as we shall see.

The team behind Hedvig’s data & analytics machinery

Hedvig is quite a large startup, with many people working with data & analytics. The three Heroes of Data who presented this particular work at our Meetup are Hannes, Filip and Jacinta:

Hannes Kindbom has worked with Hedvig in different forms since the company’s inception. Since last summer, he has worked as an Analytics & ML Engineer, coming straight from studies in machine learning and applied mathematics at the Royal Institute of Technology in Stockholm, where he graduated with two Master’s degrees.

Jacinta Waak has been with Hedvig for around half a year and heads up the Analytics team. She has a long background leading various teams in market research, analytics product development and insights delivery.

Filip Allard is a Senior Pricing Data Analyst at Hedvig. He is a theoretical particle physicist who ended up trying to change the insurance industry after having awful insurance experiences during his own cancer treatment, which fired up his passion to make a change.

Unlike many of his colleagues at Hedvig, Filip had previously worked at a more traditional insurance company, where he saw first-hand the difficulties that could come with handling insurance data. Examples include the trouble of moving third-party data into an analytics environment, and the mess that could result from performing too many transformations with too little governance. In addition, he experienced the pain of asking a central data team to add data features and then waiting two months for implementation, which limited the speed of the business. Last but not least, he also saw great potential for improvement in using more and better machine learning models for pricing insurance, something his old employer didn’t see in the same positive light.

Filip, Jacinta and Hannes were determined to change all of this at Hedvig, and couldn’t wait to get started!

The first phase: ad-hoc queries saved in Google Docs

Back in 2020, before Filip, Jacinta and Hannes took the reins, the original data & analytics team at Hedvig consisted of only two people, who ran ad-hoc queries directly against the production database in Postgres. They then saved the queries in Google Docs, which worked but was less than ideal.

From the start, the backend was hosted in AWS, but the team had some hesitations about the analytics capabilities of AWS, which we’ll get back to shortly.

The second phase: dbt, Postgres in AWS and Tableau

With the crude Google Docs setup soon becoming too painful, the early analytics team at Hedvig set out to create the company’s first real data stack. This first “data stack car” consisted of dbt and Tableau, and was still powered by Postgres running on AWS.

However, this setup was still quite ad hoc: any type of analysis required its own manual query and Tableau dashboard. Tableau also turned out not to scale well, since it was run locally and relied on SQL skills to create useful visualizations. Last but not least, the setup was slow. The dbt models took around an hour to run in Postgres, clearly not meeting the expectations or the potential of the business.

The team needed to look at other tools to reach their vision for the insurance industry.

The third phase: Fivetran, Looker, Google BigQuery, and Vertex AI

This is when Hedvig upgraded to a polished sports car data stack! The new data stack featured:

Fivetran to transfer data to Google Cloud Platform (GCP). This was a shift away from the analytics software in AWS that they knew they wanted to leave behind
BigQuery (GCP) as the engine and data warehouse of the analytics setup
dbt to transform the data into analytics-ready data sets
Looker for visualizations
Vertex AI to build infrastructure for machine learning and pricing models
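
To make the division of labor in this list concrete, here is a minimal sketch of the load-then-transform pattern it describes. It is not Hedvig’s code: Python’s built-in sqlite3 stands in for BigQuery, the `load_raw` function stands in for what an ingestion tool such as Fivetran does, and the SQL in `build_model` plays the role of a dbt model. The table names, columns and rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw rows, standing in for data an ingestion tool would land
# in the warehouse. Columns and values are invented for illustration.
RAW_CLAIMS = [
    (1, "home", 1200.0, "paid"),
    (2, "home", 300.0, "rejected"),
    (3, "travel", 450.0, "paid"),
    (4, "travel", 150.0, "paid"),
]


def load_raw(conn, rows):
    """Load step: copy source rows into a raw table without reshaping them."""
    conn.execute(
        "CREATE TABLE raw_claims "
        "(claim_id INTEGER, product TEXT, amount REAL, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_claims VALUES (?, ?, ?, ?)", rows)


def build_model(conn):
    """Transform step: materialize an analytics-ready table from the raw one."""
    conn.execute(
        """
        CREATE TABLE claims_by_product AS
        SELECT product,
               COUNT(*) AS paid_claims,
               SUM(amount) AS paid_amount
        FROM raw_claims
        WHERE status = 'paid'
        GROUP BY product
        """
    )


def run_pipeline(rows):
    """Run load and transform in an in-memory database and return the result."""
    conn = sqlite3.connect(":memory:")
    load_raw(conn, rows)
    build_model(conn)
    result = conn.execute(
        "SELECT product, paid_claims, paid_amount "
        "FROM claims_by_product ORDER BY product"
    ).fetchall()
    conn.close()
    return result
```

Calling `run_pipeline(RAW_CLAIMS)` returns one row per product with the count and total of paid claims. The point of the pattern is that raw data lands untouched and every reshaping step lives in versioned SQL, which is what makes transformations easier to govern than ad-hoc queries.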

This sports car setup brought with it tremendous benefits: the daily materialization of the dbt models went down to 10 minutes (from an hour!) despite serving more models in production. The maintenance burden for the engineering team went down significantly, primarily due to the plug-and-play interface for ingesting data with Fivetran. Hannes made sure to mention that this has enabled the data team to focus more on enabling the business than they could have if they had built custom functionality from scratch. Finally, he mentioned that the data stack sports car is “easy to drive.” In fact, it is so easy to use that many business stakeholders can “drive” it themselves, making data very accessible throughout the organization, largely thanks to Looker’s easy-to-use exploration mode.

More specifically, this data stack allows the pricing team to manage all the pricing data themselves, from curating model training datasets to performing transformations. If there’s some data point they’re lacking, they can go into the raw data and add it themselves, quickly and easily. Once the training datasets are prepared, the team uses Vertex AI to ingest the data and run various machine learning model training pipelines. Once the models are ready, Vertex AI enables the pricing team to deploy prices live to customers in containers via API endpoints. When Filip talks about this, you can see his eyes light up!
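
The train-then-serve flow described above can be sketched in a few lines of plain Python. This is only an illustration of the shape of the workflow, not Hedvig’s pricing model and not the Vertex AI API: the model is a one-feature least-squares line, the “endpoint” is a local function that takes and returns JSON, and the feature name `living_area_sqm`, the response field `monthly_price` and the training numbers are all invented.

```python
import json


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def make_handler(intercept, slope):
    """Wrap fitted parameters in a JSON-in, JSON-out request handler."""

    def handle(request_body):
        payload = json.loads(request_body)
        price = intercept + slope * payload["living_area_sqm"]
        return json.dumps({"monthly_price": round(price, 2)})

    return handle


# Hypothetical curated training set: living area (sqm) and monthly price.
TRAINING_AREAS = [30.0, 50.0, 70.0, 90.0]
TRAINING_PRICES = [100.0, 140.0, 180.0, 220.0]

# "Training pipeline": fit the model on the curated data.
intercept, slope = fit_line(TRAINING_AREAS, TRAINING_PRICES)

# "Deployment": expose the fitted model behind a request handler.
price_endpoint = make_handler(intercept, slope)
```

A caller would send a request such as `price_endpoint('{"living_area_sqm": 60}')` and get back a JSON string containing the quoted price. In a managed setup, the handler would run inside a container behind an API endpoint, so the pricing team can retrain and redeploy without changing how callers request a price.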

All in all, this is quite the step change versus the old insurance industry with regard to data, tools, models, deployment and the customer experience enabled by all of this.

Four different Hedvig machine learning models trained in Vertex AI.

The future: Perhaps Airbyte and a lakehouse architecture?

For the future, the team believes that Airbyte might become a more viable alternative to Fivetran, as it might do the same job in a more cost-effective way. In addition, they might look into a lakehouse architecture. Instead of copying the backend Postgres data into BigQuery, this would mean that they load their raw event data directly into GCP. However, this remains to be seen as the team investigates the pros and cons of each setup.

Closing thoughts

We’re so impressed with Hedvig’s data & analytics journey, and can’t wait to hear more about what they’ll accomplish. It’s exciting to see how their data stack enables their pricing models and their customer experience. Stay tuned for more!

If you’re intrigued, feel free to check out Hedvig’s career page here.