Personally Identifiable Information (PII) is a no-no when it comes to collecting analytics data. Not only is it against Google’s strict guidelines, but you could also be violating federal or EU legislation drafted to protect the privacy of individuals online. With that said, sometimes a flaw in your data collection can mean that PII is being passed into Google Analytics by accident. If you find PII in your analytics, what should you do, and how can you prevent it from happening again?
What’s Considered PII?
According to Google, Personally Identifiable Information (PII) can include (but is not limited to), “information such as email addresses, personal mobile numbers, and social security numbers.” This list can go on to include credit card numbers, addresses, first names, or anything that can distinguish or trace an individual’s identity. In short: if a piece of data can be used to distinguish the exact person who made that visit or performed that action, it can’t be passed to Google Analytics. While we all try to abide by privacy protections and various laws, collecting PII by accident is more common than one might think.
Recently, we discovered a client was unintentionally collecting email addresses in their event tracking. Due to how their tracking was set up, email addresses were accidentally being collected for months, without anyone noticing. PII can be found anywhere, from custom dimensions to query string parameters and data imports.
Audit Your Analytics Account
If you are the analytics owner, it’s imperative to audit your account on a regular basis. One tool that can help is the PII Viewer for Google Analytics. This Chrome extension maps the user ID stored in Google Analytics to PII, such as name and email address, that is stored locally.
You can also manually go through your account, checking the following for PII:
- Query string parameters
- Event dimensions
- Custom dimensions
- Campaign parameters
- Site search
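A manual check like this can be partly automated once you have exported the relevant reports. The sketch below is one possible approach, not a Google tool: it scans rows for values that look like email addresses, US social security numbers, or phone numbers. The patterns are deliberately loose, and the row shape and field names (`page`, `event_action`) are assumptions made for illustration.

```python
import re

# Illustrative patterns only: they will miss some PII and flag some harmless values.
PII_PATTERNS = {
    # Matches both plain and URL-encoded (%40) email addresses.
    "email": re.compile(r"[A-Za-z0-9._%+-]+(?:@|%40)[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}


def scan_rows(rows):
    """Return (row_index, field, pii_type) for every suspicious value."""
    findings = []
    for index, row in enumerate(rows):
        for field, value in row.items():
            for pii_type, pattern in PII_PATTERNS.items():
                if pattern.search(str(value)):
                    findings.append((index, field, pii_type))
    return findings


# Hypothetical rows, shaped like a report exported from Google Analytics.
sample_rows = [
    {"page": "/thanks?email=jane%40example.com", "event_action": "submit"},
    {"page": "/pricing", "event_action": "jane@example.com"},
    {"page": "/contact", "event_action": "click"},
]

findings = scan_rows(sample_rows)
for row_index, field, pii_type in findings:
    print(f"row {row_index}: possible {pii_type} in '{field}'")
```

Treat any hit as a prompt to look at the underlying report by hand; a clean scan does not prove the account is free of PII.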
What Happens if Google Finds Out?
If Google discovers you’re collecting PII (intentionally or not), they could terminate your Google Analytics account and permanently delete all of your data. Depending on where you live, where your users live, and the type of breach, you may be violating a law that could mean fines or even misdemeanor charges. If you want to be safe, contact your legal team.
Steps to Take When PII is Discovered
If you discover personally identifiable information in GA, don’t panic, but act immediately. This is not something to be taken lightly, and resolving the issue should become your number one priority.
Remove the Source
The first step is to get ahead of Google and make sure you immediately remove the source that’s capturing PII. Stop the data collection in its tracks and make sure nothing is coming through. For example, our client had set up event tracking to pass in an email address as the event action, so our first step was to remove that information from their event tracking.
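Beyond removing the offending field, it can help to add a guard that scrubs values before they are handed to your tracking code. The following is a minimal sketch of that idea, assuming hits pass through a step where you can run Python. The function names, the `BLOCKED_PARAMS` list, and the `REDACTED` placeholder are all hypothetical, and the email pattern is intentionally simple.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical list of query parameters known to carry personal data on a site.
BLOCKED_PARAMS = {"email", "name", "phone"}


def redact_text(value):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("REDACTED", value)


def clean_url(url):
    """Drop blocked query parameters and redact emails in the remaining values."""
    parts = urlsplit(url)
    kept = [
        (key, redact_text(value))
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in BLOCKED_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Scrub an event action and a page URL before they are sent anywhere.
safe_action = redact_text("jane@example.com")
safe_page = clean_url("https://example.com/thanks?email=jane%40example.com&plan=pro")
print(safe_action)
print(safe_page)
```

A guard like this is a safety net, not a substitute for fixing the tracking setup that leaked the data in the first place.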
Create a New View in Google Analytics
Now that the site is no longer collecting PII, create a new view. Because a new view only receives data from the moment it is created, it should be free of PII.
Back Up the Existing View
Backing up is trickier in practice, and unfortunately it doesn’t appear to be a feature of the free version of Google Analytics. Google Analytics 360 users can set up a BigQuery export of all their data: lucky you! For those using the free version of GA, there is no equivalent. No money, no data.
Our client ended up creating Google Data Studio dashboards of their most important data, such as PDF downloads, number of logins, and sessions. After visualizing all the data they considered most important, they were able to export it to Google Sheets.
This method takes a long time and is not a perfect solution, but the client was able to hold on to the information they needed most. If you go this route, make sure you’re not storing the PII you accidentally collected as part of your backup; removing it from GA will put you in compliance with Google’s Terms of Service, but continuing to store the data elsewhere may still leave you afoul of the privacy laws mentioned above.
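One way to keep PII out of an exported backup is to filter the file before you store it. The sketch below assumes the export is available as CSV text and simply drops every row containing an email-like value. The column names and figures are made up for illustration, and the function name `scrub_export` is hypothetical.

```python
import csv
import io
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_export(csv_text):
    """Return CSV text with every row containing an email-like value removed."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for row in reader:
        if any(EMAIL_RE.search(cell) for cell in row):
            continue  # skip the whole row rather than store the PII
        writer.writerow(row)
    return output.getvalue()


# Made-up export in which one event action contains an email address.
raw_export = (
    "Event Action,Total Events\n"
    "pdf_download,120\n"
    "jane@example.com,3\n"
    "login,45\n"
)

clean_export = scrub_export(raw_export)
print(clean_export)
```

Dropping rows means any totals computed from the backup will exclude the affected events, so note that alongside the stored file.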
Delete the Corrupted View
Since you can’t retroactively remove the PII from a Google Analytics account, you’ll ultimately need to delete the view entirely and start over with a new one. Losing your data can be hard, and it might seem unimaginable that you’d voluntarily delete all your historical analytics data. However, this price is worth paying to protect users’ privacy. I’m glad there are laws and guidelines in place to protect our information.
There is no perfect solution when you find PII in your analytics data, but in the end, the purpose of all this hassle is to protect your users’ identities. While you may have started collecting PII inadvertently, there are plenty of sites out there that might collect it for more nefarious purposes, and Google’s terms of service are there to protect users from that sort of abuse. Preventing PII from passing into Google Analytics is the ethical thing to do.