According to Google, approximately 20% of searches on its app are done through voice, and that number is only set to grow. Voice search is, after all, far more accessible than typing. There is still some awkwardness about using it in public, but for the most part, people are comfortable asking Siri or Alexa to handle any number of tasks, from setting an alarm to buying groceries online. As the Internet of Things continues to grow, virtual assistants will be able to do a whole lot more. That growth, though, opens its own can of worms, especially where privacy is concerned.

Voice assistants work using something called Natural Language Processing (NLP), the field concerned with how machines interpret human language, including the contextual meaning hidden in the phrasing or tone of speech. NLP pairs naturally with machine learning, yielding far more accurate results than hand-written rules, and like most ML software it gets better with time: the model is refined every time it is fed a new example, which in this case is speech. A truly massive amount of data is needed to train an ML model; the more, the better. The problem comes when we think about how all this data is collected, stored and used.
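To make the idea concrete, here is a deliberately tiny sketch of that learn-from-examples loop. It is not how any real assistant works (real systems use large neural models, not word overlap), and every intent name and phrase below is made up for illustration; the point is only that each new labelled utterance becomes part of the model, so coverage improves as data accumulates.

```python
# Toy illustration only: a bag-of-words intent matcher that improves
# as more labelled utterances are added. Real assistants use far more
# sophisticated models; the names and phrases here are invented.

from collections import Counter

class TinyIntentClassifier:
    def __init__(self):
        self.examples = []  # list of (bag_of_words, intent) pairs

    def train(self, phrase, intent):
        """Each new labelled utterance is absorbed into the model."""
        self.examples.append((Counter(phrase.lower().split()), intent))

    def predict(self, phrase):
        """Return the intent whose training example overlaps most."""
        words = Counter(phrase.lower().split())
        best_intent, best_score = None, 0
        for bag, intent in self.examples:
            score = sum((bag & words).values())  # shared word count
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent

clf = TinyIntentClassifier()
clf.train("set an alarm for seven", "set_alarm")
clf.train("buy milk and eggs online", "shop")

print(clf.predict("please set my alarm"))  # set_alarm
```

Notice that the classifier knows nothing it wasn't fed: the only way it improves is by being given more of what users say, which is exactly why these systems are so hungry for recorded speech.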

Virtual Assistants are always ‘listening’, waiting for their wake word to activate. This means they still detect sound between active sessions, though this is generally not recorded, since these devices sit in homes and other private spaces. However, assistants can mistake a stray sound for the wake word, causing them to inadvertently record extremely personal conversations for use in ML training. Assistants can also be connected to IoT devices like Roombas, which map out your house while cleaning it. There is thus a truly vast wealth of information locked inside an Assistant, and it requires two layers of security: the first concerns precisely who is allowed access to it, and the second is protection from hostile outsiders seeking unauthorized access, such as hackers.
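The always-listening loop and the false-trigger problem can be sketched in a few lines. This is a hypothetical simplification, assuming an audio stream that arrives as already-transcribed snippets; real devices match wake words acoustically on-device, but the looseness of any such matcher is precisely why accidental recordings happen.

```python
# Hypothetical sketch of a wake-word loop. The stream of transcribed
# snippets and the matching rule are invented for illustration.

WAKE_WORD = "hey assistant"

def sounds_like_wake_word(snippet):
    """Crude check: any wake-word token appearing in the snippet.
    Real devices use acoustic models, but they are similarly fuzzy,
    which is exactly how false triggers occur."""
    return any(word in snippet.lower() for word in WAKE_WORD.split())

def process(snippets):
    """Record (and potentially upload) only snippets that seem to
    start with the wake word; everything else is discarded."""
    recorded = []
    for snippet in snippets:
        if sounds_like_wake_word(snippet):
            recorded.append(snippet)  # this audio may leave the device
    return recorded

stream = [
    "hey assistant what's the weather",  # intended activation
    "idle chatter about dinner",         # ignored, never recorded
    "hey did you hear about...",         # false trigger: 'hey' matched
]
print(process(stream))
```

The third snippet is private conversation, yet it gets recorded anyway because the matcher fired, which is the failure mode described above.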

The latter is mostly solved: companies follow strict rules for storing data, which is kept encrypted and locked away. These policies are published online, though the actual documents are rather technical and difficult to follow without a solid grounding in computer science. Google's, for example, is publicly available.

The problem is the first, and there are several laws in this area that guarantee consumers and stakeholders a right to privacy. For example, there is the GDPR, a set of regulations that must be followed in the European Union. However, not every country has such comprehensive legislation, and even where it exists, several disquieting possibilities are already surfacing. Almost everyone has at some point mentioned a product or service to a friend and subsequently seen advertisements for that exact item everywhere they go. This sort of targeted advertising is effective yet worrying. It has already happened that social media knew about a person's pregnancy before they did themselves. An acquaintance of mine broke an arm and was subsequently bombarded with advertisements for vitamin D supplements and milk powders. With access to Assistant and connected IoT data, virtually all of a person's life would be stored in some database or other, which is not a pleasant thought.

In 2015, an Amazon Echo was present in the house of James Bates when the dead body of his friend Victor Collins was found there. In an unprecedented move, a search warrant was served demanding the transcripts and audio recordings from the Echo for the 48 hours around the death. Amazon contested it, filing a 90-page Motion to Quash in court and claiming the materials were protected under the First Amendment. In the end, Bates himself allowed the inspection of the data to expedite the investigation. Still, the case raises several questions: does a search warrant grant access to Assistant data? Does a court order? Which crimes are severe enough to justify this breach of privacy? It isn't difficult to imagine a totalitarian regime repressing its subjects by this method, and the safety of minorities and powerless groups could easily be threatened by abuse of this information. The War on Drugs would be nightmarish with this level of access into someone's life, with every single word picked apart, its meaning examined and reexamined until it's twisted beyond all recognition.

While Virtual Assistants definitely make life easier for many of us, it is worth considering how your privacy is affected by them. Otherwise, these digital vocal cords may end up spilling your beans, even if you have none to spill.

Proofread by Mokshit N.
