Volkswagen Data Leak: Location Tracking and Privacy Issues

Jan 03, 2025

I've always been fascinated by analyzing real-world technical incidents. These cases offer valuable learning opportunities because they reveal actual problems that companies face.

They're rarely simple to understand, as most incidents stem from multiple interconnected issues. While technical failures often lead to data breaches or outages, the root causes frequently lie in engineering team decisions, lack of impact awareness, or insufficient guidelines and processes in larger companies. Even with robust safeguards, mistakes happen, and how companies respond to these incidents is particularly revealing.

Unfortunately, many details remain hidden since internal information is often restricted to company employees, with only a handful of people knowing the complete story. This lack of transparency hinders learning opportunities in our industry. Other sectors, like aviation, do better at sharing information publicly, allowing everyone to learn from incidents.

For this reason, I'll include a clearly marked speculative section in these reports to address gaps where internal information isn't available.

Recently, the Volkswagen group experienced a major data leak exposing user data and, more critically, vehicle tracking information. While most news articles provided only surface-level coverage, an excellent video from CCC (Chaos Computer Club) thoroughly analyzes the breach and details their disclosure to Volkswagen. Here's the video that provided much of my information: https://media.ccc.de/v/38c3-wir-wissen-wo-dein-auto-steht-volksdaten-von-volkswagen.

What happened?

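Let's examine exactly what happened from a technical perspective. Some articles stated that an AWS S3 bucket was openly accessible from the outside. That isn't accurate: the bucket itself was not public, and it was reached with credentials that leaked through a different route. Let me explain.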

First, we need to understand Volkswagen's structure. The Volkswagen Group owns multiple car brands, with VW, Seat, Škoda, and Audi being the most well-known. These brands share one electronic platform when producing combustion and electric vehicles. This approach is efficient—the company develops one core concept that each brand can customize. For example, while the VW ID.4 and Škoda Enyaq share the same base platform, they look and drive differently. More importantly, they have distinct user interfaces in the car and their smartphone apps.

This brand-specific customization makes sense for maintaining distinct brand identities. However, it appears there's one common backend system for all cars, as they require the same core systems. While I don't have access to the system architecture diagrams, this is supported by information from https://cariad.technology/de/en/solutions/unified-software.html. Cariad, Volkswagen's software division, handles all software development for the Volkswagen group's cars. They're developing a unified platform, vehicle OS, and vehicle cloud for the entire group.

This unified system explains why the breach affected multiple car brands.

The problem

An interested security researcher examined the domain where requests were sent and ran scripts to discover potential vulnerabilities. It's important to note that this was legal, since domain names are publicly accessible. This should be obvious, but it bears repeating: security through domain name obscurity doesn't exist. Any domain name that resolves through public DNS can be discovered.

Tools like subfinder can discover subdomains by checking multiple passive sources, including SecurityTrails, certificate transparency logs, and public DNS databases. If your subdomain has a publicly trusted TLS certificate, for example from Let's Encrypt, that certificate is recorded in certificate transparency logs, and tools like subfinder can find it there.
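To make the certificate transparency point concrete, here is a minimal Python sketch. It is not how subfinder is implemented. It assumes records shaped like the JSON that the public crt.sh search service returns, where each entry has a name_value field holding newline-separated hostnames. The sample records and the example.com domain are invented for illustration.

```python
def subdomains_from_ct(records, domain):
    """Collect unique hostnames under `domain` from CT-style records.

    Assumption: each record is a dict with a "name_value" string that
    contains one or more hostnames separated by newlines.
    """
    domain = domain.lower()
    suffix = "." + domain
    found = set()
    for record in records:
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            # Wildcard certificates still reveal the parent name.
            if name.startswith("*."):
                name = name[2:]
            if name == domain or name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Invented sample data, standing in for a real log query result.
sample_records = [
    {"name_value": "api.example.com\nwww.example.com"},
    {"name_value": "*.internal.example.com"},
    {"name_value": "unrelated.example.org"},
]

print(subdomains_from_ct(sample_records, "example.com"))
```

The takeaway is that nobody has to guess a hostname: issuing a certificate for it is enough to make it appear in a public record.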

After finding interesting domains, the researcher used a brute-force tool to check for accessible directories on the website.

They discovered an unsecured Spring Boot actuator endpoint. For context, Spring Boot is a widely used Java framework, adopted by major companies like Netflix. It ships with built-in actuator endpoints that let you monitor and interact with a running application. These can report application health and produce thread dumps, but more critically, when misconfigured, they can expose log files and heap dumps. In Volkswagen's case, the actuator endpoints were public and provided access to heap dumps.
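For teams that want to check their own services for this class of mistake, a small self-audit might look like the following Python sketch. The paths are the default actuator paths in Spring Boot 2 and later; the helper names and the stubbed status function are hypothetical. Note that a real GET on the heap dump path makes the server produce a dump, so a check like this belongs on a test instance you own.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Default paths of actuator endpoints that should never answer anonymously.
SENSITIVE_PATHS = [
    "/actuator/heapdump",
    "/actuator/threaddump",
    "/actuator/env",
    "/actuator/logfile",
]


def http_status(url, timeout=5.0):
    """Return the HTTP status of a GET without reading the body, or None."""
    try:
        with urlopen(Request(url, method="GET"), timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 401, 403, 404 and similar arrive as exceptions
    except OSError:
        return None  # DNS failure, refused connection, timeout


def unauthenticated_endpoints(base_url, fetch_status=http_status):
    """List sensitive paths that answer 200 without any credentials."""
    base = base_url.rstrip("/")
    return [path for path in SENSITIVE_PATHS if fetch_status(base + path) == 200]


# Stubbed example: pretend only the heap dump endpoint is left open.
def fake_status(url):
    return 200 if url.endswith("/actuator/heapdump") else 401


print(unauthenticated_endpoints("https://service.invalid/", fake_status))
```

Passing the status function as a parameter keeps the check easy to test without touching the network.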

A heap dump is essentially a snapshot of an application's memory at a single moment. It contains every live object, and therefore any configuration values, keys, and secrets the application holds at that time. The researcher found AWS keys in the heap dump using simple tools.
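The talk does not need sophisticated tooling to be believable here. As an illustration of how little it takes, the Python sketch below scans raw bytes for strings that look like AWS access key IDs. It relies on the commonly used heuristic that such IDs are 20 uppercase alphanumeric characters starting with AKIA or ASIA, and it assumes the strings are stored as plain ASCII, which is not guaranteed for every JVM version. The dump content is fake, and the key is the example value from AWS documentation.

```python
import re

# Heuristic: 4-character prefix plus 16 uppercase letters or digits,
# not embedded in a longer alphanumeric run.
ACCESS_KEY_ID = re.compile(
    rb"(?<![A-Z0-9])(?:AKIA|ASIA)[A-Z0-9]{16}(?![A-Z0-9])"
)


def find_access_key_ids(dump_bytes):
    """Return unique candidate access key IDs found in a binary blob."""
    matches = {m.group().decode("ascii") for m in ACCESS_KEY_ID.finditer(dump_bytes)}
    return sorted(matches)


# Fake dump fragment; a real heap dump would be read from a file in chunks.
fake_dump = (
    b"\x00\x01java.lang.String\x00"
    b"aws.accessKeyId=AKIAIOSFODNN7EXAMPLE\x00\xff"
)

print(find_access_key_ids(fake_dump))
```

The same idea works defensively: running a secret scanner over your own dumps and logs shows what an outsider would find in them.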

This gave access to an S3 bucket containing enrollment data and sensitive location information. They also discovered the backend's Identity Provider (IDP) client ID and secret in the heap dump's binary streams. The IDP handles authentication and authorization, so these credentials allowed the researchers to impersonate users based on the IDs found in the S3 bucket. While they couldn't control the cars, combining both sources gave them access to complete user profiles and location data. The breach revealed:

Enrollment data for both electric and non-electric vehicles, including details like VIN, model, year, and user ID
User data, including name, email, phone, and in some cases, physical addresses and preferred dealerships
EV data: Mileage, battery temperature, battery status, charging status, and even warning light data
Tracking data, only for electric cars: GPS coordinates of the vehicle's location, recorded every time the vehicle was switched off

The data was pushed to the S3 bucket every 24 hours, so it wasn't real-time. The reason tracking data was only available for electric cars in the S3 bucket remains unclear, as VW's fleet management system for combustion cars has tracking capabilities. Perhaps this data is collected but stored elsewhere.

Tracking data

Let's recap. We now have user profiles showing which cars people drive and tracking data that sometimes spans years. This data collection is covered in the terms and conditions under product improvement analysis, and Volkswagen says it collects the data to better understand battery lifecycles. Still, the need for location data remains unclear.

The terms and conditions state that GPS data is truncated, which significantly reduces tracking capabilities once accuracy drops to around 10 kilometers. Audi and Škoda implemented this correctly: cars from their fleets had location data truncated to approximately 10-kilometer accuracy. The problem arose with VW and Seat vehicles, where location data remained precise down to 10 centimeters.

Why did this happen? My assumption is that while Cariad created the backend and ecosystem, each brand (Škoda, Seat, VW, and Audi) created its own UI/UX and apps, as mentioned earlier. In Audi and Škoda's car apps, developers presumably truncated the GPS coordinates before sending them to the backend, while Seat and VW did not. If that is what happened, the backend should have handled data anonymization rather than relying on each front-end implementation.
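The gap between 10 kilometers and 10 centimeters comes down to how many decimal places of a coordinate are kept. The Python sketch below works through the arithmetic. It uses the common approximation of about 111.32 km per degree of latitude; the function name and the sample latitude are chosen for illustration only.

```python
import math

# Approximate length of one degree of latitude, in meters.
METERS_PER_DEGREE_LAT = 111_320


def resolution_m(decimals, latitude_deg=0.0):
    """Approximate ground distance of one step in the last kept decimal.

    Returns (north_south, east_west) in meters. East-west steps shrink
    with the cosine of the latitude.
    """
    north_south = METERS_PER_DEGREE_LAT / 10**decimals
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for decimals in (1, 6):
    north_south, east_west = resolution_m(decimals, latitude_deg=52.0)
    print(f"{decimals} decimal(s): ~{north_south:.2f} m north-south, "
          f"~{east_west:.2f} m east-west")
```

One decimal place corresponds to roughly 11 km north-south, which matches the "around 10 kilometers" figure, while six decimal places correspond to roughly 11 cm, which matches the precision seen for VW and Seat.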

With this data, the CCC could track individuals, politicians, police cars, military bases and their visitors, people entering and leaving brothels, and more. They even discovered Volkswagen's secret testing location in Sweden. The video contains many fascinating details about these findings, which are presented in an anonymized format.

What cars were affected?

As mentioned, the leak included both combustion and electric vehicles, though the tracking data in this specific S3 bucket was only found for EVs. The CCC presentation included screenshots showing all affected car models. It's important to note that the tracking data was less concerning for Škoda and Audi vehicles than for VW and Seat vehicles (Cupra being a Seat sub-brand).

Screenshot: enrollment data from combustion and electric cars

Screenshot: cars for which tracking data was captured. Audi and Škoda are limited to roughly 10 kilometers, while VW and Seat are precise down to 10 centimeters

How did that happen?

Now, we enter the speculative section. Since there aren't many details available, I'll share my interpretation of what might have happened in a company of this size. The exact way this issue slipped through will likely remain unknown.

Let's examine the organizational structure. Volkswagen has subbrands (Seat, Škoda, VW, and Audi) plus Cariad, which handles the unified platform. Cariad likely developed the Spring Boot backend and provided expertise to the subteams. It's unclear whether these subteams operate within Cariad or the individual brands. As large EU-based companies, Volkswagen and Cariad must follow data protection laws that mandate specific roles and processes at certain company sizes. Presumably, there were checks in place to prevent such issues. Their response to CCC's report suggests that these structures exist: they quickly connected with the right people and fixed the issue without denial, which isn't always the case.

Looking at the Spring Boot Actuator endpoint specifically, before version 1.5, this endpoint was publicly exposed by default without authentication. According to this blog post https://www.wiz.io/blog/spring-boot-actuator-misconfigurations, about 2.3% of Spring Boot Actuator instances in cloud environments still have an exposed heap dump endpoint without authentication.

Since Spring Boot version 1.5 is over 7 years old, an outdated default is unlikely to be the cause. Newer versions do not expose the heap dump endpoint over HTTP by default, so someone most likely enabled it explicitly in production without adding authentication.

Enabling this endpoint requires a simple configuration change in the application.properties or application.yml file. I see two possible scenarios:

The change was meant only for the test system, but somehow, the production configuration files were modified and deployed
The change was intentionally made in production to diagnose performance and memory issues through heap dumps

If it was the second scenario, it likely went through as a ticket. Perhaps it was labeled a simple configuration change, bypassing information security and other security-related processes. But why wasn't the endpoint protected?

I can think of two possibilities:

The developers didn't understand the security implications
They understood the risks but knew that protecting the endpoint required involving another team that was hard to reach, so they chose the quick fix of addressing memory issues first and closing the endpoint later.

Both scenarios seem plausible based on real-world experience. When organizational processes are cumbersome, people often take shortcuts to achieve their goals, sometimes overlooking security implications.

What can we learn from that?

The first key lesson is that no subdomain is truly hidden. Security through obscure subdomain names does not exist.
Engineers, as the first line of defense, must understand the full impact of their changes.
Configuration changes are often the root cause of security issues, and we'll see this pattern repeatedly in this series. Why? Because we tend to view them as "just small changes" since they don't modify any "real" code.
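One way to stop treating configuration as "not real code" is to lint it like code. The Python sketch below flags a risky actuator setting in a properties file. The property name management.endpoints.web.exposure.include is the one Spring Boot 2 and later use to expose endpoints over HTTP; the function name, the sample file, and the policy of flagging "*" and "heapdump" are assumptions for illustration. It only understands simple key=value lines, not YAML, and a hit means "review this", not "this is unauthenticated".

```python
EXPOSURE_KEY = "management.endpoints.web.exposure.include"
RISKY_VALUES = {"*", "heapdump"}  # assumed review policy


def risky_actuator_lines(properties_text):
    """Return (line_number, line) pairs that expose risky endpoints."""
    findings = []
    for number, line in enumerate(properties_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not stripped or stripped.startswith(("#", "!")) or "=" not in stripped:
            continue
        key, value = (part.strip() for part in stripped.split("=", 1))
        if key != EXPOSURE_KEY:
            continue
        exposed = {item.strip().lower() for item in value.split(",")}
        if exposed & RISKY_VALUES:
            findings.append((number, stripped))
    return findings


# Hypothetical production configuration.
sample = """\
# hypothetical production config
server.port=8080
management.endpoints.web.exposure.include=health,info,heapdump
"""

print(risky_actuator_lines(sample))
```

Run as part of a pull request check, something like this turns a "small change" into one that at least gets a second look.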

For CTOs, VPs, and other leaders, the lesson is to create security processes that are both robust and easy to follow. While checking user stories for data privacy issues is important, the process must be straightforward. Otherwise, people will skip it. Ensure all code and configuration changes are appropriately reviewed and analyzed. Implement automated tools to scan your codebase for vulnerabilities. In the future, AI might help prevent such issues by flagging problematic commits.

However, issues will slip through even with processes and learnings in place. Even the world's largest companies, with enormous budgets, face these challenges. Why? Because modern software development is complex and keeps getting more complicated. While frameworks reduce complexity, they also obscure what happens behind the scenes. Many young engineers graduating from university haven't encountered a heap dump before and don't grasp the severity of making one publicly available. To be frank, when I graduated from university, I had no idea what a heap dump was.

This case offers another crucial lesson: always sanitize values before storing them in an S3 bucket—for example, truncate GPS coordinates. Volkswagen does not have a justification for storing precise vehicle coordinates down to 10 centimeters. In fact, storing GPS values at all is questionable. Implement data sanitization and, if possible, data encryption at the backend level to ensure consistency when dealing with multiple frontends.
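A backend-side sanitizer could look roughly like the Python sketch below. It applies one truncation policy to every incoming record, regardless of which brand's app sent it. The record layout, the field names, and the choice of one decimal place are assumptions for illustration, and the coordinates are arbitrary sample values. Truncation on its own reduces precision but is not a complete anonymization scheme.

```python
from decimal import Decimal, ROUND_DOWN

KEPT_DECIMALS = 1  # assumed policy: roughly 10 km resolution


def truncate_coordinate(value, decimals=KEPT_DECIMALS):
    """Cut a coordinate off after `decimals` places (toward zero)."""
    step = Decimal(1).scaleb(-decimals)  # 1 place -> Decimal("0.1")
    # Going through str() avoids binary floating-point surprises.
    return float(Decimal(str(value)).quantize(step, rounding=ROUND_DOWN))


def sanitize_record(record):
    """Return a copy of the record that is safe to persist."""
    clean = dict(record)
    for field in ("latitude", "longitude"):
        if clean.get(field) is not None:
            clean[field] = truncate_coordinate(clean[field])
    return clean


# Hypothetical record as it might arrive from any front end.
incoming = {
    "vin": "HYPOTHETICAL-VIN",
    "latitude": 52.422650,
    "longitude": 10.786546,
}

print(sanitize_record(incoming))
```

Because the function sits in front of storage, a front end that forgets to truncate can no longer cause precise coordinates to be written.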

One positive takeaway from this incident is Volkswagen's response. According to CCC, they handled the issue professionally and fixed it promptly. While this might seem standard procedure, many companies instead respond with lawsuits or deny issues for years.

Screenshots were taken from the video “Wir wissen wo dein Auto steht” and subfinder GitHub Repository.

Thank you for reading my newsletter!

I hope you enjoyed that one. Have any suggestions or want to connect? Feel free to message me on Bluesky or Threads.
