Your Google Home voice assistant records your conversations, sometimes triggered by mistake, and those audio clips – both the ones recorded on purpose and otherwise – are being sent to contractors working on Google Home voice processing.

How it’s supposed to work: Google Home should only activate when someone says the trigger phrases “OK, Google” or “Hey, Google.” But it’s not hard to flip that switch accidentally: if someone nearby says “Google,” or even a word that merely sounds like it, the speaker often starts recording.
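To see why near-misses cause false triggers, here’s a minimal sketch of a naive detector that wakes on anything textually close to the hotword. It’s purely illustrative – Google’s real detector works on audio features, and the threshold below is an invented parameter – but it shows how a loose match can flip the switch:

```python
# Toy hotword detector - purely illustrative, NOT how Google Home works.
# It wakes on any word "close enough" to the hotword, which is why
# near-miss words can start a recording by accident.
from difflib import SequenceMatcher

HOTWORD = "google"
THRESHOLD = 0.75  # invented tolerance; a real detector tunes this on audio data

def is_trigger(heard: str) -> bool:
    """Return True if the heard word is similar enough to the hotword."""
    return SequenceMatcher(None, heard.lower(), HOTWORD).ratio() >= THRESHOLD

for word in ["google", "goggle", "gogol", "giggle", "table"]:
    print(f"{word!r}: starts recording = {is_trigger(word)}")
# "goggle" scores about 0.83 and crosses the threshold - an accidental recording.
```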

The audio clips have included people’s bedroom sound symphonies, their kids’ or grandkids’ voices, payment details from transactions, medical information divulged while looking up their ailments, and far more.

This all comes from a new report from Belgian broadcaster VRT News that relied on input from three Google insiders.

Listening in on the kids

With the help of a whistleblower, VRT listened to some of the clips. Its reporters managed to hear enough to discern the addresses of several Dutch and Belgian Google Home users, even though some of them had never said the trigger phrases. One couple looked surprised and uncomfortable when the news outlet played them recordings of their grandchildren.

The whistleblower who leaked the recordings was working as a subcontractor to Google, transcribing the audio files for subsequent use in improving its speech recognition. They reached out to VRT after reading about how Amazon workers are listening to what you tell Alexa, as Bloomberg reported in April.

They’re listening, but they aren’t necessarily deleting: a few weeks ago, Amazon confirmed – in a letter responding to a lawmaker’s request for information – that it keeps transcripts and recordings picked up by its Alexa devices forever, unless a user explicitly requests that they be deleted.

VRT talked to cybersecurity expert Bavo Van den Heuvel, who pointed to the dangers of humans listening to our voice assistant recordings, given that recordings can be made just about anywhere: in a doctor’s office, in a business meeting, or wherever people handle sensitive files, such as police stations, lawyers’ offices or courts.

It’s not just Dutch and Belgian contractors who are listening to Google Home requests, though those are the only recordings VRT listened to. The whistleblower showed the news outlet a platform with recordings from all over the world, meaning that there are likely thousands of contractors listening in on Assistant recordings. From VRT:

That employee let us look into the system in which the employees have to listen to recordings from the Google Assistant. There must be thousands of employees worldwide; in Flanders and the Netherlands, a dozen employees are likely to hear recordings from Dutch-speaking users.

‘Anonymous’ data?

Google’s well aware that its contractors can listen to these recordings, and of the privacy questions that raises. To keep those contractors from identifying the people they’re listening to, it strips identifying data from the recordings.
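What might that stripping look like? Here’s a minimal sketch, assuming a clip record with a couple of direct identifiers; the field names and hashing step are our own invention, not Google’s pipeline. Note what survives untouched:

```python
# Sketch of identifier-stripping (pseudonymization). Field names and the
# hashing approach are assumptions for illustration, not Google's pipeline.
import hashlib

def strip_identifiers(clip: dict) -> dict:
    """Replace direct identifiers with an opaque token; keep the transcript."""
    token = hashlib.sha256(clip["account_id"].encode()).hexdigest()[:12]
    return {
        "speaker_token": token,            # stable pseudonym, no name or email
        "transcript": clip["transcript"],  # the content itself is untouched
    }

clip = {
    "account_id": "user-42@example.com",
    "device_ip": "203.0.113.7",  # dropped entirely by the function above
    "transcript": "remind me about my cardiology appointment at 14 Elm Street",
}
print(strip_identifiers(clip))
# The transcript still leaks an address and a medical detail - content-level
# identification that no amount of metadata scrubbing can remove.
```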

Of course, it’s common for data-gorging companies to point to the absence of identity details and treat that absence as a privacy shield. But in these days of Big Data, that claim has repeatedly been shown to be flawed. After all, as we’ve noted in the past, data points that are individually innocuous can be enormously powerful and revealing when aggregated. That is, in fact, the essence of Big Data.

Take, for example, the research done by MIT graduate students a few years back to see how easy it might be to re-identify people from three months of credit card data, sourced from an anonymized transaction log.

The upshot: with 10 known transactions – easy enough to rack up if you grab coffee from the same shop every morning, park in the same lot every day and pick up your newspaper from the same newsstand – the researchers had a better than 80% chance of identifying you.
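Here’s a toy version of that aggregation effect, with made-up data: even in a log containing no names at all, a couple of known (place, day) points can single out one record. The dataset and matching rule below are our own illustration, not the MIT team’s methodology:

```python
# Toy re-identification on made-up data: no names anywhere, yet a few
# known (place, day) points single out one record in the "anonymized" log.
anonymized_log = {
    "user-001": {("CoffeeBar", "Mon"), ("LotA", "Mon"), ("Newsstand", "Tue")},
    "user-002": {("CoffeeBar", "Mon"), ("LotB", "Mon"), ("Newsstand", "Tue")},
    "user-003": {("Bakery", "Mon"), ("LotA", "Mon"), ("Newsstand", "Wed")},
}

# What an observer knows about your daily routine:
known_points = {("CoffeeBar", "Mon"), ("LotA", "Mon")}

matches = [uid for uid, txns in anonymized_log.items() if known_points <= txns]
print(matches)  # ['user-001'] - two mundane data points already pin down one record
```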