It seems like there is a never-ending stream of new methods for hackers to steal our passwords. Phishing emails, brute-force password attacks, and a rise in VoIP protocol attacks are only part of the picture. It turns out that, for a long time now, hackers have been able to figure out what you are typing on your keyboard just by listening to the sounds it produces.

In fact, just this year, a study came out diving deep into the world of Acoustic Eavesdropping. More specifically, the study, titled “Don’t Skype & Type! Acoustic Eavesdropping in Voice-Over-IP,” took a close look at how the old concept of acoustic eavesdropping has adapted to become an even bigger threat in modern times.

We were shocked to hear about this new method of attack, and wanted to take a close look at the study’s findings. So let’s dig in.

Acoustic What?

The idea of Acoustic Eavesdropping isn’t incredibly revolutionary. In fact, the study refers to a number of previous looks into the concept. Even back in 1943, Bell Telephone engineers discovered a phenomenon that is scarily similar to what attackers can do today with modern technologies.

But what exactly is Acoustic Eavesdropping? Well, the concept is pretty basic:

Just by recording and analyzing the sound your keyboard makes every time you press a key, someone can work out what you were typing.

The reality is that every keyboard, and every individual key press, makes a distinct sound. Attackers can use these sounds to assemble an alphabet, and then decipher almost exactly what you were typing based on that alphabet.

A New Rising Threat With New Technologies

In the past, this was generally not a big concern — most techniques required too much information, or required attackers to have physical access to their victim’s device. For example, previous attempts at acoustic eavesdropping would require the attacker to place a microphone near the victim’s keyboard.

So while Acoustic Eavesdropping has been a real threat for a long time, it has been fairly uncommon. The barriers to entry, if you will, have been too high for most attackers to bother with. Except the issue is now a compounding one: a lot of our devices have built-in microphones. Just about anything with a built-in microphone, such as laptops, webcams and even tablets, is probably among the devices most at risk. So if your business is enacting a BYOD policy, this study might be of interest to you.

The bigger issue now, however, is the introduction and popularity of VoIP. We’ll touch upon that below. First, some definitions to understand what we’re discussing.

Understanding The Language

Just as the study does, before we jump into the details, it might help to define some key terms. The study, hosted on arXiv, goes as far as explaining attack models in depth, along with multiple profiling techniques. However, to keep it simple, we’re going to focus on just the crucial aspects we need to clear up.

Figure 1: Attack device and target device

According to the study, the different users and aspects of an attack are as follows:

  • The Attacker: Whoever is carrying out the attack, generally with malicious intent to gain information they otherwise do not have access to.
  • The Victim: Self-explanatory, but the victim is the one being attacked and having their typing eavesdropped on.
  • Attack-Device: The attacker’s device. The study specifies devices as being desktop or laptop computers, and excluding smartphones and tablets.
  • Target-Device: This is the device that the attacker is targeting. While the victim will be using this device, it of course doesn’t have to be their personal machine; it could be a work device.
  • Target-Text: The information the victim typed on their device, and therefore the information the attacker is searching for.
  • Internet Connection: As we know, VoIP carries our voice data over the internet. This matters because it is how VoIP works, and because the way audio is transmitted over the internet is what makes Skype and Type attacks possible.

Essentially, the attacker is “a malicious user who aims to learn some private information about the victim.” For the simplicity of this study, it is assumed that the attacker and victim will both be using a genuine, unmodified version of the same VoIP software.

Specifically for this study, they looked at Skype, and extrapolated to Google Hangouts. But this could easily carry over into just about any other popular VoIP app.

So, How’s It Even Work?

In reality, Acoustic Eavesdropping attacks, including the more modern Skype and Type attacks, are really complicated. We already understand the concept: the attacker records the sound the victim’s keyboard makes when they press a key during a VoIP call. But the process is a bit more involved than that.

Collecting The Data

As we said before, the older style of attack required the attacker to have physical access to the target-device. The attacker would simply have to place a microphone next to the victim’s keyboard and record the keystrokes. This, of course, works even when the victim isn’t on a call. Not everyone can get into your office, but let’s say it’s your secretary who wants to find out some crucial information: they have easy access to your device and could simply hide a microphone under your stack of papers.

So let’s assume access is available, and they have your keystrokes recorded. That’s the key factor here: once the attacker gains access, they have free rein to collect data. Except the data they are collecting isn’t legible text. As I said, it’s the sounds your keyboard makes when you press any button.

Amplitude Graph

Once an attacker collects all of this data and information, it’s not as if they will see sound peaks and instantly know what key was pressed. So then what’s next?
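To make the idea of "sound peaks" a little more concrete, here is a minimal sketch of how loud bursts could be picked out of a recording by comparing short-time energy against the background noise level. This is not the study's method. The function names, the window size and the threshold ratio are assumptions chosen for illustration, and the "recording" is synthetic. Notice that the output is only a list of positions: it says where a click happened, not which key it was.

```python
import math
import random


def short_time_energy(samples, window):
    """Mean squared amplitude of each consecutive, non-overlapping window."""
    return [
        sum(s * s for s in samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]


def detect_keystrokes(samples, window=100, threshold_ratio=5.0):
    """Return the start index of each burst that is much louder than the noise floor."""
    energies = short_time_energy(samples, window)
    if not energies:
        return []
    noise_floor = sorted(energies)[len(energies) // 2]  # median window energy
    threshold = noise_floor * threshold_ratio
    events = []
    in_burst = False
    for idx, energy in enumerate(energies):
        if energy > threshold and not in_burst:
            events.append(idx * window)  # first window of a new burst
            in_burst = True
        elif energy <= threshold:
            in_burst = False
    return events


# Synthetic recording (illustrative only): quiet noise plus three decaying "clicks".
rng = random.Random(0)
recording = [rng.uniform(-0.01, 0.01) for _ in range(5000)]
for start in (1000, 2500, 4000):
    for n in range(100):
        recording[start + n] += 0.8 * math.sin(2 * math.pi * n / 10) * math.exp(-n / 30)

print(detect_keystrokes(recording))  # prints [1000, 2500, 4000]
```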

Making Sense of The Data

Well, after the attacker collects the victim’s keystroke data, the attacker then needs to use advanced analytical techniques to make sense of it all. This comes down to science-fiction-sounding tech like “supervised or unsupervised machine learning” and “triangulation.”

Pretty much, AI helps the attacker turn random key click sounds into a usable string of information, for example that password you typed into your Gmail account to check your email while holding a normal conversation. You know, normal things normal people do every day.

Except the AI alone doesn’t understand what a keystroke is, or know how to tell typing apart from what are otherwise generic clicking and clacking sounds. That’s where an extra step, and a layer of profiling, comes in.

Complicated, But Serious Enough To Worry

As if physical access to the target-device and powerful AI weren’t enough, the data an attacker gains would only be usable if compared to an existing database. If an attacker has access to an existing database of popular keyboards, and the sounds those keyboards make, they can match up the data they took from your phone call with the information they already know.

Think of this as if they are cracking a code: your information is a series of ticks and clicks, but each tick and click corresponds directly to a key on a keyboard. If an attacker knows what sound the “A” key makes when pressed on a MacBook Pro keyboard, and the attacker knows their victim is using a MacBook Pro, they can fill in the pieces of the puzzle.
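The code-cracking analogy can be sketched as a nearest-neighbor lookup: each recorded click is reduced to a handful of numbers, and the attacker picks the key in the reference database whose numbers are closest. This is a simplified stand-in for the classifiers the study describes, not a reproduction of them. The `REFERENCE_PROFILE` table, its three made-up feature values per key, and the function names are all hypothetical.

```python
import math

# Hypothetical reference database: key label -> feature vector describing its sound.
# The numbers are invented for illustration; a real profile would be built from
# features extracted from recordings of the same keyboard model.
REFERENCE_PROFILE = {
    "a": [0.90, 0.10, 0.30],
    "s": [0.20, 0.80, 0.40],
    "d": [0.50, 0.50, 0.90],
}


def distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_candidates(features, profile):
    """Return key labels ordered from most to least similar to the observed click."""
    return sorted(profile, key=lambda key: distance(features, profile[key]))


def decode(clicks, profile):
    """Guess one key per observed click by taking the closest reference sound."""
    return "".join(rank_candidates(click, profile)[0] for click in clicks)


# Three observed clicks, each slightly off from a reference entry.
observed = [
    [0.85, 0.15, 0.28],  # closest to "a"
    [0.48, 0.55, 0.85],  # closest to "d"
    [0.25, 0.75, 0.45],  # closest to "s"
]
print(decode(observed, REFERENCE_PROFILE))  # prints ads
```

Ranking every candidate, rather than returning a single guess, reflects how this kind of attack is usually treated: even when the top guess is wrong, a short list of likely keys sharply reduces the number of passwords left to try.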

Attacks Are More Successful Than You Think

This is a bit of a double-edged sword. On one hand, you would think this barrier alone, requiring a reference database, would be enough to stop most attacks. And you would be correct: technically, most attacks are stopped. The study stated that without a database of key sounds to compare to, attackers could only guess keystrokes with 40% accuracy. So the majority is prevented, but let’s be honest, 40% is still a pretty high number.

The really scary part? When attackers DO have a reference database, that accuracy percentage shoots all the way up to 91.7%.

So, when everything matches up, the system proves to be fairly accurate, but historically the wall an attacker had to climb to gain the necessary information was too tall. This whole process is known as “keystroke profiling,” and it does become a bit more complicated than this, but overall the main concern should be to protect your keyboard specifically.

New Style of Attacks – Remote Keyboard Acoustic Eavesdropping

Except the whole premise of this paper is the discovery of an entirely new method of attack, one that is much easier to carry out. It no longer requires physical access to the target-device, a planted microphone, or access to the device’s own microphone.

This new form of attack still falls under the same concept of acoustic eavesdropping, but it’s where the Skype and Type name comes from. Also referred to as Remote Keyboard Acoustic Eavesdropping, this new form of acoustic eavesdropping is even more worrisome for a few reasons:

  • Attackers do not need physical access or control of the target-device.
  • Skype and Type attacks will work with a more limited amount of keystroke data than previous attempts.
  • These attacks turn the exact VoIP software you are using against you, as opposed to an external microphone or sensor. This is not a security flaw in the software; it simply exploits how VoIP works.

That’s the really scary part: even without direct access to your machine, someone with malicious intent can simply use your own VoIP app against you. The biggest examples given are Skype and Google Hangouts, but this could easily translate to other popular solutions. So how does this new form of attack even work, without a planted microphone listening in on your keystrokes?

Well, as I briefly explained above, the attacker simply uses the VoIP software against the victim. Again, according to the study, “the attacker receives no additional acoustic information from the victim, besides what VoIP software transmits to the attack-device.”

Instead of a microphone placed near the victim’s keyboard, the attacker is able to extract these tiny little key click sounds from the data transmitted by the VoIP app.

So in reality, this is only an issue if the headset, IP desk phone or other useful Skype add-on devices you use for VoIP is sensitive enough to hear you pounding away at the keyboard. But, you’d be surprised by both how easy that is, and how much information can be extracted from such a tiny little sound.

Simple Measures Can Go a Long Way

When it comes to internet security, I hold a strong stance that everyone should in some way practice safe browsing. Even if you’re on a Mac, or an iPhone, browsing your work email or the news. It can be much, much easier than you think for an attacker to find their way into your device, and leverage that control over you or your business in some way.

The simplest security practices can really go a long way in securing your connection, or preventing unwanted eyes from prying into your crucial information. Even if your network is hidden by hundreds of hardware firewalls, a properly timed Skype and Type attack will render any defenses moot. In terms of preventing a Skype and Type attack, there are quite a few basic practices anyone could employ, along with some more advanced software protections.

In general, if you are discussing some really crucial and sensitive information, you probably should not be sharing such information in unencrypted Skype calls. That should at least be common sense. However, we are all human and it is easy to forget, or just not understand the severity of the threat.

For starters, the simplest way to protect yourself from Remote Acoustic Eavesdropping would be:

  • Simply, do not type on your physical keyboard while on a Skype or VoIP call. Simple as that. If you don’t type anything, the attacker doesn’t have any information to steal.
  • If you need to be typing, utilize a software keyboard: most machines, Windows and Mac, allow users to display a virtual keyboard on their screen that requires users to click on letters with their mouse. This simply prevents the keyboard sounds.
  • If you need to be typing, but cannot utilize a software keyboard, mute your microphone when typing. This can generally be done through a soft mute button on the VoIP app, or an in-line mute button included on most headsets and desk phones.
  • Prevent the need to type by logging into all of the crucial applications you will need prior to placing or receiving the call.
  • Utilize a password manager that will automatically fill out your login forms. Password managers are a great tool overall for anyone trying to improve their password security, and allow you to log into services during a VoIP call without typing.

These countermeasures would be the least invasive to your VoIP call at the end of the day. Less typing means fewer distractions, so it’s almost a win-win. However, it’s very easy for us to slip into a comfortable routine, simply forget to mute our microphone, or hate having to click on a virtual keyboard. So for the more technical users, there are some advanced techniques that can be leveraged. Beyond securing your network and remote connections with a VPN, those would include:

  • What is known as a “Ducking” technique – software, or the user manually, lowers the microphone volume when a keystroke is detected, and can even overlay your typing with a completely different sound. The problem with this, however, is that it requires a trained user to find, implement and utilize this solution, and it could easily degrade your call quality. What if the software muffles your important presentation instead of your keystrokes?
  • Another complicated technique to hide your keystrokes would be performing “short random transformations” on the sound produced whenever a keystroke is detected. To put this simply, software would detect keystrokes and change the intensity and specific frequencies of the keystroke sounds. Essentially, the software changes the sound of the keystroke in transmission, so what the attacker receives is not the same sound your keyboard actually made.
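Both advanced techniques boil down to rewriting the audio samples around each detected keystroke before they leave your machine. The sketch below is only an illustration under simplifying assumptions: keystroke positions are taken as already known, audio is a plain list of numbers, and the "random transformation" is a random gain combined with a randomly tuned smoothing filter. That is one way to alter intensity and frequency content, not necessarily what the study's authors had in mind. The function names and parameter ranges are hypothetical.

```python
import random


def duck(samples, keystroke_starts, duration, gain=0.1):
    """Ducking: return a copy with the volume lowered for `duration` samples after each keystroke."""
    out = list(samples)
    for start in keystroke_starts:
        for i in range(start, min(start + duration, len(out))):
            out[i] *= gain
    return out


def randomize(samples, keystroke_starts, duration, rng):
    """Short random transformation: distort each keystroke window with random settings."""
    out = list(samples)
    for start in keystroke_starts:
        gain = rng.uniform(0.3, 1.0)   # random change in intensity
        alpha = rng.uniform(0.2, 0.9)  # random smoothing strength (alters frequency content)
        smoothed = 0.0
        for i in range(start, min(start + duration, len(out))):
            # One-pole low-pass filter: blends each sample with the running value.
            smoothed = alpha * out[i] + (1 - alpha) * smoothed
            out[i] = gain * smoothed
    return out


# Toy signal with one "keystroke" covering samples 2, 3 and 4.
signal = [1.0] * 10
print(duck(signal, [2], 3))
print(randomize(signal, [2], 3, random.Random(1)))
```

Because the gain and filter settings are drawn fresh for every keystroke, the same key no longer produces the same sound twice, which is what undermines a reference database. The trade-off mentioned above still applies: anything that misidentifies speech as a keystroke will distort your voice too.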

Don’t Skype and Type!

At the end of the day, the easiest advice to follow to keep yourself and your information safe is simple: don’t Skype and type. Now, chances are that the average person will not have to worry about acoustic eavesdropping. In fact, the average user does not have to be very concerned with overall security.

But it is a fact of life that there are attackers out there trying to make money, or find a competitive edge, in any way possible, even if that means a simple ransomware attack that holds your crucial and sensitive business information hostage and demands a payout of a pretty large sum of cash.

Not that everyone should start to panic and worry that their passwords are being stolen over the phone, but it is worth understanding the different methods of attack that exist and how to keep yourself safe.