Audio Fingerprinting & Speech Voices: How Browsers Track You

Introduction

Most tracking you can picture. A cookie is a little file. A canvas fingerprint is a hidden drawing. Both feel like something a website does to you.

Audio and voice fingerprinting are quieter than that. Your browser never makes a sound you can hear, yet the way it processes a tone is subtly its own. Your browser keeps a list of the voices it could speak with, yet you never ask it to say a word. A website can read both in a few milliseconds, learn a surprising amount about your device, and recognise you again later with no cookie, no login, and no permission prompt.

These are the two quietest members of the browser-uniqueness family. Canvas fingerprinting, along with WebGL and font fingerprinting, probes what your browser draws. TCP/IP and OS fingerprinting read the shape of your network packet, and a WebRTC leak can expose your real IP. Audio and voices sit beside all of them, less famous and just as stubborn. This piece is about how each one works, why neither cares about your VPN, which browsers actually defend against them, and how to see what your own browser is giving away right now.

Two Signals, One Family

It helps to separate the two from the start, because people often blur them together.

Audio fingerprinting is about how your device processes sound. A script asks the browser to build a short tone and run it through some audio math, then measures the exact numbers that come back. The result is a signature of your audio stack, the combination of your CPU, your audio drivers, and your browser build.

Voice fingerprinting is about what your device can say. A script asks the browser for its list of text-to-speech voices and reads the names, languages, and origins. The result is a signature of your operating system and the language packs installed on it.

Neither one plays anything. Neither one records anything from your microphone. Both run silently on the page, and both are what we call Privacy-axis signals: they make you more trackable, more re-identifiable across sites, without saying anything about whether you are trustworthy. That distinction runs through the whole Privacy & Trust Index, and it matters here, because hiding from these signals is never treated as something suspicious.

How Audio Fingerprinting Works

The trick has been studied since a 2016 Princeton measurement of the top million websites found audio fingerprinting already running in the wild. The method has barely changed since, because it works.

Here is the whole thing in one breath. A script creates an OfflineAudioContext, which is an audio engine that renders sound to memory instead of to your speakers. It generates a tone with an oscillator, usually a triangle wave. It runs that tone through a DynamicsCompressorNode, a piece of audio processing whose math magnifies tiny rounding differences. It renders the buffer, reads back the raw floating-point numbers that describe the waveform, and sums or hashes them into a single value.

That value is the fingerprint. Run it on the same browser and device a thousand times and you get the same number. Move to a different device and the number shifts.
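
The steps above can be sketched in a few lines of TypeScript. Treat this as a minimal illustration, not any particular tracker's code: the buffer length, sample rate, oscillator frequency, compressor settings, and the choice to sum only the tail of the buffer are all assumptions picked for the example, and the function names are made up here.

```typescript
// Collapse rendered samples into one number by summing their absolute values.
function sumSamples(samples: ArrayLike<number>, start: number = 0): number {
  let total = 0;
  for (let i = start; i < samples.length; i++) {
    total += Math.abs(samples[i]);
  }
  return total;
}

// Returns null where OfflineAudioContext is unavailable (for example outside a browser).
async function audioFingerprint(): Promise<number | null> {
  const g = globalThis as any;
  const Ctx = g.OfflineAudioContext || g.webkitOfflineAudioContext;
  if (!Ctx) return null;

  // One channel, 5000 frames at 44.1 kHz (assumed values), rendered to memory only.
  const ctx = new Ctx(1, 5000, 44100);

  // Fixed input: a triangle wave at an assumed 1000 Hz.
  const osc = ctx.createOscillator();
  osc.type = "triangle";
  osc.frequency.value = 1000;

  // The compressor stage; these settings are illustrative assumptions.
  const comp = ctx.createDynamicsCompressor();
  comp.threshold.value = -50;
  comp.knee.value = 40;
  comp.ratio.value = 12;
  comp.attack.value = 0;
  comp.release.value = 0.25;

  osc.connect(comp);
  comp.connect(ctx.destination);
  osc.start(0);

  // Render, read back the raw floats, and sum the last 500 frames.
  const buffer = await ctx.startRendering();
  return sumSamples(buffer.getChannelData(0), 4500);
}
```

In a browser console, `audioFingerprint().then(console.log)` would print one number. Nothing is played and no permission prompt appears, because the context never touches a real audio device.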

Why does it shift at all, if everyone is running the same Web Audio specification? Because a specification describes what the answer should be, not the last digit of how the floating-point arithmetic lands. Those last digits depend on the processor, the audio driver, and how the browser was compiled. The compressor stage acts like a magnifying glass held over those differences.
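
A small analogy shows how the last digit can move. This is not the compressor's real math, only a demonstration that floating-point addition depends on the order of operations, which is the kind of detail a specification leaves to the implementation. The function names are invented for the example.

```typescript
// Add the same numbers front to back.
function sumLeftToRight(values: number[]): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) total += values[i];
  return total;
}

// Add the same numbers back to front.
function sumRightToLeft(values: number[]): number {
  let total = 0;
  for (let i = values.length - 1; i >= 0; i--) total += values[i];
  return total;
}

const sampleInputs = [0.1, 0.2, 0.3];
const forward = sumLeftToRight(sampleInputs);
const backward = sumRightToLeft(sampleInputs);

// Same inputs, same arithmetic, different order: the results are not equal.
console.log(forward === backward); // false
```

Two audio stacks that both follow the specification can differ in exactly this way, and the compressor makes the gap large enough to measure.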

Because it grows out of your processor and audio drivers, an audio context fingerprint is really a form of hardware fingerprinting. Like all device fingerprinting, it belongs to the machine rather than to a browser tab or a cookie, which is why clearing your data does nothing to it.

Stage	What it does	Why it leaks
Offline context	Renders a tone to memory, never to the speakers.	Silent and instant, so nothing warns you it ran.
Oscillator	Generates a clean, repeatable waveform.	A fixed input means every difference in the output is the device.
Compressor	Reshapes the signal with heavy floating-point math.	Amplifies the tiny rounding gaps between audio stacks.
Readback	Sums or hashes the rendered samples into one value.	Collapses the whole waveform into a stable, comparable signature.

How identifying is it on its own? Honestly, only moderately. Measurements put the audio signal at only around five bits of entropy by itself, which is nowhere near enough to single you out of a crowd. Its power is not standalone uniqueness, it is stability and stubbornness. The value barely moves across reboots, private windows, and operating-system updates, so it makes an excellent thread to stitch other, noisier signals together over time. A fingerprint is rarely one attribute. It is a dozen weak signals combined, and audio is one of the most reliable threads in the bundle.
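
The arithmetic behind "bits of entropy" is short enough to write down. In this sketch the five-bit figure comes from the text above, while the other per-signal values are hypothetical placeholders, and adding bits together assumes the signals are independent, which real signals rarely are. Read the combined figure as an upper bound.

```typescript
// How many equally sized groups a signal with this many bits can split users into.
function groupsFromBits(bits: number): number {
  return Math.pow(2, bits);
}

// How many bits it takes to single out one user in a population.
function bitsToSingleOut(population: number): number {
  return Math.log(population) / Math.LN2;
}

// Upper bound on combined entropy, assuming the signals are independent.
function combinedBits(signalBits: number[]): number {
  let total = 0;
  for (const bits of signalBits) total += bits;
  return total;
}

console.log(groupsFromBits(5)); // 32

// Hypothetical bundle: audio (5 bits) plus two made-up signals of 8 and 10 bits.
console.log(combinedBits([5, 8, 10])); // 23
```

Five bits sorts people into about 32 groups, which is a long way from picking out one person. The bundle is what does the damage.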

How Voice Fingerprinting Works

The second signal is simpler to read and, in many cases, more identifying.

Every modern browser exposes a text-to-speech feature through the Web Speech API. A single line, speechSynthesis.getVoices(), returns the full list of voices the browser can use. Each entry carries a name, a language, a flag for whether it runs on the device or in the cloud, and an internal identifier called voiceURI. The script never makes the browser speak. It just reads the menu. Reading that list to recognise a device is called speech synthesis fingerprinting, or simply voice fingerprinting.
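
Reading the menu looks roughly like this. The property names (name, lang, localService, voiceURI) are the ones the Web Speech API defines, but the helper names, the way the fields are joined, and the sample voices are assumptions made for the sketch. Note that some browsers return an empty list on the first call and fill it in once a voiceschanged event has fired.

```typescript
// The fields of a SpeechSynthesisVoice that this sketch reads.
interface VoiceInfo {
  name: string;
  lang: string;
  localService: boolean;
  voiceURI: string;
}

// Turn the list into one comparable string, keeping the order the browser returned.
function voiceSignature(voices: VoiceInfo[]): string {
  return voices
    .map((v) => [v.name, v.lang, v.localService ? "local" : "remote", v.voiceURI].join("|"))
    .join("\n");
}

// Returns an empty list where speechSynthesis is unavailable (for example outside a browser).
function readVoices(): VoiceInfo[] {
  const synth = (globalThis as any).speechSynthesis;
  if (!synth) return [];
  const list: VoiceInfo[] = synth.getVoices();
  return list;
}

// Made-up sample data standing in for a real list.
const sampleVoices: VoiceInfo[] = [
  { name: "Samantha", lang: "en-US", localService: true, voiceURI: "Samantha" },
  { name: "Alex", lang: "en-US", localService: true, voiceURI: "Alex" },
];
console.log(voiceSignature(sampleVoices));
```

No speech is ever synthesised here. The page only asks what is on the shelf.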

That menu is revealing because the voices are not shipped by the browser, they are shipped by the operating system and whatever language packs you have added. Each platform has its own cast of characters.

Platform	Characteristic voices	What it tells a tracker
Windows	Microsoft voices such as David and Zira.	Windows, and often the language edition.
macOS and iOS	Apple voices such as Samantha and Alex.	An Apple device, and roughly which generation.
Android	Google text-to-speech voices.	Android, and the installed engine.
Chrome (any OS)	Adds remote Google voices on top of the local set.	A larger, more distinctive list, which raises uniqueness.

The exact set, the count, the order, and especially the languages combine into a high-entropy signal. The languages are the quiet giveaway. If your list carries Japanese, Arabic, and Hindi voices alongside English, a tracker can reasonably guess you work across languages or live in a multilingual region. You revealed that by installing a language pack months ago and forgetting about it. The browser remembers, and hands the list to any page that asks.
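
The language giveaway takes only a few lines to extract. This sketch assumes language tags shaped like "en-US" and, defensively, also accepts an underscore separator. The function name and the sample tags are invented for the example.

```typescript
// Reduce a list of language tags to the distinct primary languages, sorted.
function primaryLanguages(langTags: string[]): string[] {
  const seen: { [key: string]: boolean } = {};
  for (const tag of langTags) {
    const primary = tag.split(/[-_]/)[0].toLowerCase();
    if (primary) seen[primary] = true;
  }
  return Object.keys(seen).sort();
}

// Made-up tags matching the example in the text.
console.log(primaryLanguages(["en-US", "ja-JP", "ar-SA", "hi-IN", "en-GB"]));
// [ 'ar', 'en', 'hi', 'ja' ]
```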

This is why voice fingerprinting pairs so well with the operating-system signal that leaks from your connection, covered in Passive OS Fingerprinting. One reads your OS from the network packet, the other reads it from the voice list. When two independent signals agree, a defender becomes much more confident.

Why They Survive a VPN

Here is the part that surprises people. You can wrap your connection in a VPN, a proxy, or even Tor, and neither of these fingerprints changes.

The reason is simple once you see it. A VPN changes the route your traffic takes and the IP address the destination sees. It does nothing to the audio math your processor performs or to the list of voices your operating system installed. Those live on your device, far above the network layer, untouched by where your packets travel.

So a fresh IP address from a far-off country, paired with the exact same audio fingerprint and voice list as yesterday, tells a tracker something useful: same device, new mask. The disguise that works on your location does not touch the disguise you would need at the device level. This is the recurring lesson of fingerprinting, and it is the same one behind the endless CAPTCHA loop: the safest profile is the most consistent one, not the most heavily disguised one.
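
The "same device, new mask" inference can be sketched as a plain comparison. Everything here is hypothetical: the Visit shape, the verdict labels, and the exact-match rule are assumptions for illustration, and a real tracker would more plausibly use fuzzy matching across many more signals. The IP addresses come from ranges reserved for documentation.

```typescript
// What a tracker might remember about one visit (hypothetical shape).
interface Visit {
  ip: string;
  audioValue: number;
  voiceSignature: string;
}

type Verdict = "same device, same address" | "same device, new mask" | "different device";

// Compare device-level signals first, then look at the network address.
function compareVisits(previous: Visit, current: Visit): Verdict {
  const sameDevice =
    previous.audioValue === current.audioValue &&
    previous.voiceSignature === current.voiceSignature;
  if (!sameDevice) return "different device";
  return previous.ip === current.ip ? "same device, same address" : "same device, new mask";
}

const yesterday: Visit = { ip: "203.0.113.7", audioValue: 124.04, voiceSignature: "Samantha|en-US" };
const today: Visit = { ip: "198.51.100.23", audioValue: 124.04, voiceSignature: "Samantha|en-US" };
console.log(compareVisits(yesterday, today)); // same device, new mask
```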

Who Protects You, and Who Does Not

Browsers differ enormously in browser fingerprinting protection, and the gap has only widened. This is the single most useful thing to understand if you actually want to reduce your exposure.

Browser	Audio	Voices
Chrome and Edge	No protection by default.	No protection. The full voice list is exposed.
Brave	Farbling adds per-session noise to the audio output.	The default leaves the list readable, only the strictest mode empties it.
Firefox	Blocks known fingerprinting scripts by default, and Resist Fingerprinting normalises the value.	The standard build exposes the list, but Resist Fingerprinting returns it empty.
Safari	Safari 26 injects noise by default through its Advanced Fingerprinting Protection.	Safari 26 also shields the voice list by default.
Tor Browser	Built on Firefox RFP, so audio is normalised.	Voices are locked down for everyone alike.

Two different philosophies are at war in that table. Brave randomises: it adds a little fresh noise each session, a technique it calls farbling, so your fingerprint looks different every time. Safari's newer default protection works the same way. Firefox and Tor standardise: they try to make every user look identical, so there is nothing to single out.

Recent research has tipped the argument toward standardising. A 2025 paper showed that statistical analysis can often see through randomisation, averaging the noise away across enough samples to recover the real value underneath. A fixed, shared value has no underlying signal to recover. This is worth knowing before you trust a "randomise everything" tool to protect you.
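
A toy simulation shows why averaging is a threat to randomisation. This is not the method from the paper, and it is not how any particular browser generates its noise. It assumes zero-mean noise added to a fixed underlying value, uses a small seeded linear congruential generator so the run is repeatable, and the underlying value of 124.04 is made up.

```typescript
// Small seeded generator so the example is repeatable. Returns values in [0, 1).
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// One randomised reading: the true value plus noise centred on zero.
function noisyReading(trueValue: number, amplitude: number, random: () => number): number {
  return trueValue + (random() - 0.5) * amplitude;
}

// Average many randomised readings to estimate the value underneath.
function averageReadings(trueValue: number, amplitude: number, count: number, random: () => number): number {
  let total = 0;
  for (let i = 0; i < count; i++) {
    total += noisyReading(trueValue, amplitude, random);
  }
  return total / count;
}

const hiddenValue = 124.04; // made-up underlying fingerprint value
const random = makeRandom(42);
console.log(noisyReading(hiddenValue, 1, random)); // a single reading can be off by up to 0.5
console.log(averageReadings(hiddenValue, 1, 10000, random)); // the average lands very close to 124.04
```

A standardised browser gives the attacker nothing to average, because every user returns the same fixed value and there is no per-device number underneath.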

The two biggest platforms now pull in opposite directions. Apple switched fingerprinting protection on by default in Safari 26, covering audio, voices, and canvas in every window. Chrome, by far the most used browser, went the other way and ships none of this. Google retired its Privacy Sandbox effort in 2025 without delivering a single anti-fingerprinting defence for surfaces like audio, voices, or canvas. Independent analysis through early 2026 confirmed that these APIs remain fully readable in Chrome with nothing standing in the way. If you use Chrome, assume these signals are wide open.

Audio, Voices, and the Bot-Detection Arms Race

These signals are not only about advertising. They are now core to how anti-fraud and bot-detection systems decide whether you are a real person.

The reason is the same stubbornness that makes them good trackers. A script can change its User-Agent in a single line. Faking a believable audio fingerprint and a matching, internally consistent voice list, on demand and at scale, is much harder. Automated browsers and anti-detect browsers often get one signal right and another wrong, and that mismatch is exactly what a detector hunts for. A "Windows" User-Agent paired with a voice list full of Apple voices is not a person, it is a costume with a seam showing.
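
The seam in the costume can be sketched as a cross-check between two guesses. This is a deliberately naive illustration: the rules below are assumptions built from the voice names in the platform table earlier, every function name is invented, and a real detector would need far more care to avoid false alarms.

```typescript
type Platform = "windows" | "apple" | "android" | "unknown";

// Guess the platform the User-Agent string claims.
function platformFromUserAgent(userAgent: string): Platform {
  if (/Windows NT/.test(userAgent)) return "windows";
  if (/Android/.test(userAgent)) return "android";
  if (/Macintosh|iPhone|iPad/.test(userAgent)) return "apple";
  return "unknown";
}

// Guess the platform from voice names (naive rules, assumed for this sketch).
function platformFromVoices(voiceNames: string[]): Platform {
  if (voiceNames.some((n) => /^Microsoft /.test(n))) return "windows";
  if (voiceNames.some((n) => n === "Samantha" || n === "Alex")) return "apple";
  return "unknown";
}

// A seam shows only when both guesses are confident and they disagree.
function hasSeam(userAgent: string, voiceNames: string[]): boolean {
  const claimed = platformFromUserAgent(userAgent);
  const observed = platformFromVoices(voiceNames);
  return claimed !== "unknown" && observed !== "unknown" && claimed !== observed;
}

const windowsUa =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";
console.log(hasSeam(windowsUa, ["Samantha", "Alex"])); // true
console.log(hasSeam(windowsUa, ["Microsoft David - English (United States)"])); // false
```

An empty or standardised voice list produces "unknown" and therefore no seam, which matches the idea that hiding a signal is not the same thing as contradicting yourself.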

There is an interesting twist for 2026. As privacy-focused browsers standardise these surfaces, the signals get weaker for honest users and the whole detection industry has to adapt. Some anti-bot vendors now openly describe the slow erosion of classic fingerprinting and a shift toward behaviour and server-side signals instead. The fingerprint is not dead, but it is no longer the whole story.

What the packet.guru Privacy & Trust Index Reads

Most coverage of these techniques is written for the people running them. The Privacy & Trust Index turns the lens around and shows what your own browser gives away, in plain language.

You will find two cards for this topic on the dashboard, sitting beside Canvas in the browser-uniqueness group.

The Audio Fingerprint card reports whether your browser exposes a stable, readable audio signature or whether it is being protected. It only ever affects your Privacy score, never your Trust score. Hiding here is a win, not a mark against you.
The Speech Voices card reports whether your voice list is broadly readable. Same rule: it is a Privacy signal about how trackable you are, and a locked-down or normalised list reads as protection.

Two principles guide how the Index reads them, and both are worth stating plainly. First, these signals are about trackability, so masking them is always treated as a privacy win and never as something suspicious. Second, the Index describes what it sees and what it means for you, not the internal recipe it uses to judge it. The goal is to help you understand your exposure, not to hand a checklist to someone building a better disguise. General tools like the EFF's Cover Your Tracks and AmIUnique measure your overall browser uniqueness, while the Index works as a focused browser privacy and tracking test for the audio and voice signals in particular. To see the operating system your browser openly admits to in its headers, the HTTP Headers Checker shows the raw User-Agent alongside all of this.

How to Reduce Your Exposure

There is no single switch, but there are honest choices in your browser privacy settings, each with a trade-off.

The strongest protection comes from a browser that standardises these surfaces. Tor Browser is the high end, since every user is shaped to look the same, at the cost of speed and some broken sites. Firefox with Resist Fingerprinting gets close on the desktop, though it disables features like text-to-speech and can make some pages misbehave. Safari resists these signals by default from version 26 on, so recent Apple devices are well covered with no setup. Brave is the most usable option, with randomisation that defeats casual fingerprinting even if a determined tracker can sometimes see through it.

What does not help is worth saying too. A VPN does not touch these signals. Private or incognito mode does not either, because the audio math and voice list are the same whether or not history is being saved. Most fingerprinting browser extensions reach the easy surfaces and miss these deeper ones.

And the honest framing, the one packet.guru is built around, is that the aim is consistency, not disguise. A real Mac that looks like a real Mac, behind a VPN if you like, raises no contradictions. A machine wearing a costume that its own audio stack and voice list quietly contradict raises several, and contradictions are what get you flagged. The same logic governs the geography checks in Regional Integrity: one mismatch is a normal human story, several at once are a pattern.

FAQ

Q: Can a website hear me or use my microphone for this?

No. Audio fingerprinting never touches your microphone and never plays a sound. It renders a tone to memory, does math on the numbers, and reads the result. It needs no permission because it never accesses a real audio device. Voice fingerprinting only reads a list, it does not make the browser speak.

Q: Does a VPN or Tor change my audio fingerprint?

A VPN does not, because it only changes your network route, not your device's audio processing or voice list. Tor Browser does protect you, but not because of the network. It uses Firefox's anti-fingerprinting mode to normalise the audio output and lock down the voice list, so the protection comes from the browser, not the relays.

Q: Does incognito or private mode help?

No. Private mode stops your history and cookies from being saved, but the audio math your device performs and the voices your operating system installed are exactly the same. Both signals read identically in a private window.

Q: Which browser hides these signals best?

For maximum protection, Tor Browser, then Firefox with Resist Fingerprinting enabled, both of which standardise users to look alike. Safari has defended these surfaces by default since version 26, and Brave is a practical daily option that randomises them. Chrome and Edge offer nothing here by default.

Q: How do I stop audio and voice fingerprinting?

There is no single setting that turns them off, because both come from your browser and operating system rather than from a script you can simply block. The realistic option is a browser that resists fingerprinting: Tor Browser or Firefox with Resist Fingerprinting for the strongest effect, or Brave for a more usable balance. A VPN, private mode, and most extensions do not stop either signal.

Q: How do I see my own audio and voice fingerprint?

Run the Privacy & Trust Index and look at the Audio Fingerprint and Speech Voices cards. They show whether each signal is exposed or protected, and what that means for how trackable you are.

The Bottom Line

You will never hear your browser's audio fingerprint, and you will never ask it to read its voice list aloud. Both speak about you anyway, quietly, every time a page wants to listen, and both shrug off the VPN you trusted to hide you.

That is not a reason to panic. It is a reason to choose a browser that defends these surfaces if tracking worries you, and otherwise to aim for a setup that tells one consistent story rather than a heavily disguised one. The quietest signals are often the most honest, and honesty, oddly enough, is the thing that keeps you out of trouble online.
