So how does this relate to data? Well, this success would not have been possible without a decision-making process backed by data and analytics, as we shall see.
The team behind Hedvig’s data & analytics machinery
Hedvig is a fairly large startup with many people working in data & analytics. The three Heroes of Data who presented this work at our Meetup are Hannes, Filip and Jacinta:
Hannes Kindbom has worked with Hedvig in various roles since the company’s inception. Since last summer, he has worked as an Analytics & ML Engineer, coming straight from studies in machine learning and applied mathematics at the Royal Institute of Technology in Stockholm, where he graduated with two Master’s degrees.
Jacinta Waak has been with Hedvig for around half a year and heads up the Analytics team. She has a long background leading various teams in market research, analytics product development and insights delivery.
Filip Allard is a Senior Pricing Data Analyst at Hedvig. He is a theoretical particle physicist who ended up trying to change the insurance industry after having awful insurance experiences during his own cancer treatment, which fired up his passion to make a change.
Unlike many of his colleagues at Hedvig, Filip had previously worked at a more traditional insurance company, where he saw first-hand the difficulties that can come with handling insurance data. For example, it was hard to move third-party data into an analytics environment, and performing too many transformations with too little governance left a mess. He also experienced the pain of asking a central data team to add data features and then waiting two months for the implementation, which limited the speed of the business. Last but not least, he saw great potential in using more and better machine learning models for pricing insurance, something his old employer did not view as positively.
Filip, Jacinta and Hannes were determined to change all of this at Hedvig, and couldn’t wait to get started!
The first phase: ad-hoc queries saved in Google Docs
Back in 2020, before Filip, Jacinta and Hannes took the reins, the original data & analytics team at Hedvig consisted of only two people, who ran ad-hoc queries directly against the production database in Postgres. They then saved the queries in Google Docs, which worked but was less than ideal.
From the start, the backend was hosted in AWS, but the team had some reservations about the analytics capabilities of AWS, which we’ll get back to shortly.
The second phase: dbt, Postgres in AWS and Tableau
With the crude Google Docs setup soon becoming too painful, the early analytics team at Hedvig set out to create the company’s first real data stack. This first “data stack car” consisted of dbt and Tableau, and was still powered by Postgres running on AWS.
However, this setup was still quite ad hoc: any type of analysis required its own manual query and Tableau dashboard. Tableau also turned out not to scale well, since it was run locally and required SQL skills to create useful visualizations. Last but not least, the setup was slow. The dbt models took around an hour to run in Postgres, clearly not meeting the expectations or the potential of the business.
The team needed to look at other tools to reach their vision for the insurance industry.
The third phase: Fivetran, Looker, Google BigQuery, and Vertex AI
This is when Hedvig upgraded to a polished sports car data stack! The new data stack featured:
- Fivetran to transfer data to Google Cloud Platform (GCP). This marked a shift away from the analytics tooling in AWS, which the team already knew they wanted to leave behind.
- BigQuery (GCP) as the engine and data warehouse of the analytics setup
- dbt to transform the data into analytics-ready data sets
- Looker for visualizations
- Vertex AI to build infrastructure for machine learning and pricing models
This sports car setup brought tremendous benefits: the daily materialization of the dbt models went down to 10 minutes (from an hour!) despite serving more models in production. The maintenance burden for the engineering team went down significantly, primarily thanks to Fivetran’s plug-and-play interface for ingesting data. Hannes made sure to mention that this has let the data team focus more on enabling the business than they could have if they had built custom functionality from scratch. Finally, he mentioned that the data stack sports car is “easy to drive.” In fact, it is so easy to use that many business stakeholders can “drive” it themselves, making data very accessible throughout the organization, largely thanks to Looker’s easy-to-use exploration mode.
More specifically, this data stack allows the pricing team to manage all the pricing data themselves, everything from curating model training datasets to performing transformations. If they lack some data point, they can go into the raw data and add it themselves, quickly and easily. Once the training datasets are prepared, the team uses Vertex AI to ingest the data and run various machine learning model training pipelines. Once the models are ready, Vertex AI enables the pricing team to deploy prices live to customers in containers via API endpoints. When Filip talks about this, you can see his eyes light up!
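To make the flow above a bit more concrete, here is a minimal sketch of the three steps (curate a training set, train a model, answer a price request) in plain Python. This is not Hedvig’s code and it does not use Vertex AI: the field names, the one-feature linear model, the 20% loading and all function names are hypothetical stand-ins chosen only to illustrate the shape of such a pipeline.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Policy:
    living_area_sqm: float  # hypothetical rating factor
    claims_cost: float      # hypothetical observed yearly claims cost


def curate_training_set(raw_rows):
    """Keep only raw rows that have a positive area and a known claims cost."""
    curated = []
    for row in raw_rows:
        area = row.get("living_area_sqm")
        cost = row.get("claims_cost")
        if area is None or cost is None or area <= 0:
            continue  # skip incomplete or invalid records
        curated.append(Policy(float(area), float(cost)))
    return curated


def fit_linear(policies):
    """Ordinary least squares for claims_cost = intercept + slope * area.

    Assumes at least two policies with different living areas.
    """
    xs = [p.living_area_sqm for p in policies]
    ys = [p.claims_cost for p in policies]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def price_endpoint(model, request, loading=0.2):
    """Stand-in for an API endpoint: expected cost plus a hypothetical loading."""
    intercept, slope = model
    expected_cost = intercept + slope * request["living_area_sqm"]
    return {"yearly_price": round(expected_cost * (1 + loading), 2)}


raw_rows = [
    {"living_area_sqm": 50, "claims_cost": 1000},
    {"living_area_sqm": 100, "claims_cost": 2000},
    {"living_area_sqm": None, "claims_cost": 500},  # dropped during curation
    {"living_area_sqm": 150, "claims_cost": 3000},
]
model = fit_linear(curate_training_set(raw_rows))
quote = price_endpoint(model, {"living_area_sqm": 80})
```

In a setup like the one described, the curation step would more likely live in dbt models on BigQuery, and training and serving would run on Vertex AI rather than as in-process function calls. The sketch only shows how the pieces hand off to one another.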
All in all, this is quite a step change from the old insurance industry with regard to data, tools, models, deployment and the customer experience that all of this enables.