Discussion (252 Comments) | Read Original on HackerNews
I was expecting an ad for their product somewhere towards the end, but it wasn't there!
I do wonder though: why would this company report this vulnerability to Mozilla if their product is fingerprinting?
Isn't it better for the business (albeit unethical) to keep the vulnerability private, to differentiate from the competitors? For example, I don't see many threat actors burning their zero days through responsible disclosure!
No software wants to be fingerprinted. If it did, it would offer an API with a stable identifier. All fingerprinting is exploiting unintended behavior of the target software or hardware.
An example that comes to mind that I've seen is an anonymous app that allows for blocking users; you can programmatically block users, query all posts, and diff the sets to identify stable identities. However, the ability to block users is desired by the app developers; they just may not have intended this behavior, but there's no immediate solution to this. This is different than 'user_id' simply being returned in the API for no reason, which is a vulnerability. Then there's maybe a case of the user_id being returned in the API for some reason that MIGHT be important too, but that could be implemented another way more sensibly; this leans more towards vulnerability.
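To make the block-and-diff trick concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the AnonymousApp class, its method names, and the rule that blocking a post hides every post by the same hidden author are assumptions standing in for whatever the real app does.

```python
class AnonymousApp:
    """Toy model of an anonymous forum whose API never returns an author."""

    def __init__(self):
        self._posts = []    # (post_id, hidden_author, text); author stays server-side
        self._blocks = {}   # viewer -> set of hidden authors the viewer has blocked

    def publish(self, author, text):
        post_id = len(self._posts)
        self._posts.append((post_id, author, text))
        return post_id

    def block_author_of(self, viewer, post_id):
        # The viewer blocks "whoever wrote this post" without learning who that is.
        author = self._posts[post_id][1]
        self._blocks.setdefault(viewer, set()).add(author)

    def visible_post_ids(self, viewer):
        blocked = self._blocks.get(viewer, set())
        return {pid for pid, author, _ in self._posts if author not in blocked}


def posts_by_same_author(app, viewer, post_id):
    """Diff the visible set before and after blocking to recover one identity."""
    before = app.visible_post_ids(viewer)
    app.block_author_of(viewer, post_id)
    after = app.visible_post_ids(viewer)
    return before - after
```

No identifier is ever returned, yet the set difference groups posts by author, which is the sense in which an intended feature ends up acting as a stable identifier.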
Ultimately most fingerprinting technologies use features that are intended behavior; Canvas/font rendering is useful for some web features (and the web target means you have to support a LOT of use cases), IP address/cookies/useragent obviously are useful, etc (though there's some case to be made about Google's pushing for these features as an advertising company!).
Unintended identification is less than ideal but frankly is just the nature of doing business and any number of niceties are lost by aggressively avoiding fingerprinting.
In software intentionally optimized to avoid any fingerprinting however it is a vulnerability.
The distinction being that fingerprinting in general is a less than ideal side effect that gives you a minor loss in privacy but in something like Tor Browser that fingerprinting can be life or death for a whistleblower, etc. It's the distinction between an annoyance and an execution.
Maybe because it is not as serious as they, and their title, made it out to be? Did you read it in full?
The identifier described is not process lifetime stable, not machine stable, or profile stable, or installation stable. The article itself says it resets on a full browser restart...
So this is not a magic forever ID and not some hardware tied supercookie. Now what should we do with that title, and the authors of it?
Don't get your opsec advice from HN. Check whonix, qubes, grapheneos, kicksecure forums/wikis. Nihilist opsec, Privacyguides.
Make sure to exit Tor Browser at the end of a session. Make sure not to mix two uses in one session.
Whether they care is entirely separate.
Why don't browsers make it like phones where the server (app) has to be granted permission to access stuff?
A user agent that says the browser's version? Reasonable enough.
Being able to ask for fonts, if the system has them? Difficult to have font support without that.
Getting the user's timezone, language and keyboard layout? Reasonable.
The size of the screen, and the size of the browser window? Difficult to lay things out without that.
Of course a video or audio player needs to know which video formats your browser supports - how else to provide the right video?
Obviously javascript can get the time, and it's trivial to figure out the system's clock error by comparing that to the time on a server.
Before you know it, almost every browser is uniquely identifiable.
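A rough way to see how individually reasonable signals add up is to sum their information content in bits. The sketch below is Python, and the share values are invented for illustration (they are not measurements); it also assumes the signals are independent, which real ones are not, so treat the result as an upper bound on what these numbers would imply.

```python
import math

# Hypothetical fraction of visitors who share the same value for each signal.
observed_share = {
    "user_agent": 1 / 20,
    "timezone": 1 / 10,
    "language": 1 / 5,
    "window_size": 1 / 50,
    "font_list": 1 / 200,
    "video_codecs": 1 / 4,
}


def surprisal_bits(share):
    """Bits of identifying information carried by one observed value."""
    return -math.log2(share)


def combined_bits(shares):
    # Adding bits is only valid for independent signals (an optimistic assumption).
    return sum(surprisal_bits(s) for s in shares.values())


def expected_matches(population, shares):
    """Expected number of visitors in the population with the same combination."""
    return population / 2 ** combined_bits(shares)
```

With these made-up shares the combination is worth roughly 25 bits, so about one visitor in 40 million would be expected to match; no single signal in the table is anywhere near that specific on its own.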
User agents as a concept are rather poorly thought out across the board and not all that useful but persist because that's just how technical cruft is.
Fonts should be provided by the website; if not provided the choice should take the form of a spec sent by the website including line height, serifs or not, monospace or not, etc. There's little to no excuse for the current font situation IMO beyond poor design decisions that became heavily entrenched.
Timezone and other obviously private metadata should never be shared without the user explicitly granting permission on a case by case basis. The status quo here is completely inexcusable as is the continued failure to fix the problem.
Size of the physical screen should never be exposed under any circumstances. The current size of the browser window is reasonable on its face but now that fingerprinting is understood to be an issue should always be heavily letterboxed unless the user consents to sharing the exact value.
Video formats should be provided by the website as a list of offerings and the browser should respond with a choice; the user could optionally intervene. There's no reason to expose the full capabilities to a remote service.
Querying the current time should be gated behind an explicit permission. There's almost never a need for it. However from a fingerprinting perspective you also have to worry about correlating the rate of clock skew across clients. That can be solved by gating access to high resolution time counters behind an explicit permission as (once again) the vast majority of services have no legitimate use for such functionality.
Now we have actual criminal organizations and other real bad actors.
I'm sure we can come up with something better than advertise our whole local computing platform on every HTTP request.
No applications. No mail. No need for cookies.
I can use a "regular" browser for more enhanced stuff. But for simple content consumption, we can just have a "dumb" browser that can't do much.
> A user agent that says the browser's version? Reasonable enough.
No user agent. I'm guessing it will need it for JavaScript or HTML features, and dynamically update if using an old browser, but let's just not supply a user agent and let it be the reader's burden to have a reasonably decent browser.
> Being able to ask for fonts, if the system has them? Difficult to have font support without that.
What's the fallback if the system doesn't have them?
> Getting the user's timezone, language and keyboard layout? Reasonable.
Keyboard layout is irrelevant for viewing content. For timezone and language: Yeah, I can see the use cases, but these are in a small minority. Let there be a popup when requested, and the user can specify the timezone/language as requested.
> The size of the screen, and the size of the browser window? Difficult to lay things out without that.
Let's let this new browser return only from a (small) discrete set of sizes. It will pick the size closest to the actual browser window size and send that.
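The "small discrete set of sizes" idea can be sketched in a few lines of Python. The list of allowed sizes is hypothetical, and picking the nearest entry follows the comment literally; a real browser would more likely round down so that the page always fits inside the actual window.

```python
# Hypothetical set of sizes the browser is willing to report.
ALLOWED_SIZES = [(800, 600), (1000, 700), (1200, 800), (1400, 900), (1600, 1000)]


def reported_size(actual_width, actual_height, allowed=ALLOWED_SIZES):
    """Return the allowed size closest to the real window (squared distance)."""
    return min(
        allowed,
        key=lambda size: (size[0] - actual_width) ** 2 + (size[1] - actual_height) ** 2,
    )
```

A 1213x777 window and a 1187x812 window both report (1200, 800), so the site can still lay out the page but can no longer tell those two users apart by window size.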
> Of course a video or audio player needs to know which video formats your browser supports - how else to provide the right video?
Same answer as user agent. Either let the user pick from a selection of video formats, or just hard code a reasonable one and put the onus on the user to have a browser that supports it.
> Obviously javascript can get the time, and it's trivial to figure out the system's clock error by comparing that to the time on a server.
This hypothetical browser could just not send the time :-) For 99% of content consumption, this function is not needed.
What I'm describing should be part of "Private mode". Or browsers should have an "Ultra-private" mode that is the above. If it's too complex/risky maintaining it all in one codebase ... fine. Just have a separate browser.
Right now, if I built such a browser, I'm sure a lot of sites meant for content would break. But in my fantasy world, using "Ultra-private" would be the default, and people who make sites will target them first.
I think much of the complexity in making a web browser is all the "other" stuff. Being able to run apps, cookie/privacy management, etc.
No support for forms. The browser is meant for content consumption. Not for interaction/creation.
One could argue that any JS capabilities to do network requests (including dynamically rendering content) would be disallowed.
Yes, I know, this is going pre-Web 2.0.
Yes, of course, most current sites won't work in that model. But I'll also say: Most current content sites don't need these capabilities. They have them because they know the browser supports them.
Again - a fantasy. I know only a few people will use it. I know that won't be enough to change web behavior. It would be nice, though, if sites carried a badge to indicate they conform to all of the above.
What you want exists, have at it
thankfully i think traditional web surfing is probably going to die out in the next 10 years, and progressively decline much sooner than that as people start to interact with AI rather than browsers (or any software for that matter).
my feed of hackernews is going to be my AI agent giving it to me in plain text very soon, and soon after that i will probably never visit the internet again because it will be impossible to know what's real and fake
as a millennial it will be interesting to experience the full cycle of being born when nothing was online, to everything being online, to then again being entirely offline by the time i'm older
When I use Resist Fingerprinting, my main issue is the timezone being set to UTC. Most of the other stuff it does never causes issues. I guess sometimes sites need to read the canvas, but there's a permission box that allows that when needed. I wish there was a similar permission box for timezone.
The only other drawback to the "resist fingerprinting" option is that you will encounter Cloudflare's CAPTCHA checkbox everywhere, all of the time :(
All these things should be opt-in and like blocked by GDPR.
Apps have access to inconceivable amounts of identifiers and device characteristics, even on the well protected systems without Google Play services.
Like Android phones perhaps? Unfortunately, Apple gives very little granular control.
But most ROMs don't allow controls for WiFi, Cell data, Phone ID, Phone number, User ID, local storage, etc...
And since browsers rival OSes for complexity (they are basically OSes in their own right already), any part of the system can be inadvertently exposed and exploited.
How does this "identifier" work with JavaScript disabled?
> In Firefox Private Browsing mode, the identifier can also persist after all private windows are closed, as long as the Firefox process remains running. In Tor Browser, the stable identifier persists even through the "New Identity" feature, which is designed to be a full reset that clears cookies and browser history and uses new Tor circuits.
1. Website fingerprints the browser, stores a cookie with an ID and a fingerprint.
2. During the next session, it fingerprints again and compares with the cookie. If fingerprint changed, notify server about old and new fingerprint.
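The two steps above can be sketched as a toy tracker in Python. The Tracker class, the cookie layout, and the attribute names are all hypothetical; the fingerprint here is just a hash over whatever attributes the site managed to collect.

```python
import hashlib
import itertools


def fingerprint(attributes):
    """Hash the collected attributes in a canonical (sorted-key) order."""
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


class Tracker:
    def __init__(self):
        self._next_id = itertools.count(1)
        self.history = {}  # visitor id -> every fingerprint seen for that visitor

    def visit(self, cookie, attributes):
        """cookie is None or {'id': ..., 'fp': ...}; returns the cookie to store."""
        fp = fingerprint(attributes)
        if cookie is None:
            # Step 1: first visit, issue an ID and remember the fingerprint.
            visitor_id = next(self._next_id)
            self.history[visitor_id] = [fp]
        else:
            # Step 2: returning visitor, link old and new fingerprint if it changed.
            visitor_id = cookie["id"]
            if cookie["fp"] != fp:
                self.history.setdefault(visitor_id, [cookie["fp"]]).append(fp)
        return {"id": visitor_id, "fp": fp}
```

The link only survives while the cookie does. That is why an identifier that outlives cookie clearing, like the one quoted above, matters more than an ordinary changing fingerprint.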
Assume the same.
>The idea is to amass as much information as possible
Reminded, from 2012: https://www.wired.com/2012/03/ff-nsadatacenter/
https://metrics.torproject.org/rs.html
Also, does anyone know of any researchers in the academic world focusing on this issue? We are aware that EFF has a project that used to be named after a pedophile on this subject, but we are more looking for professors at universities or pure research labs ala MSR or PARC than activists working for NGOs, however pure their praxis :-)
As privacy geeks, we have become fascinated with the topic -- it seems that while we can achieve security through extensions like noscript or ublock origin or firefox containers (our personal "holy trinity"), anonymity slips through our fingers due to fingerprinting issues. (Especially if we lump stylometry in the big bucket of "fingerprinting".)
[1] https://web.archive.org/web/20260422190706/https://fingerpri...
You bring this up like it's a well known incident, but my googling can find no evidence of it? The only reason not to say the name of the project would be if it's common knowledge, but it's not?
ChatGPT research reckons you're making it up, and I'd be curious if you have evidence to the contrary?
So what happened here is basically... AI told you that something you found suspicious, because you have zero subject matter expertise, is suspect?
I'm not really sure how to react to someone who has a robot affirm their anxieties other than to stand by my previous statements and give a polite pointer at some terms to look up on Wikipedia rather than feed into a clanker.
You said it was “named after a pedophile”, that is wrong
>>The word panopticon derives from the Greek word for "all seeing" – panoptes.
The concept was invented by Jeremy Bentham, who died before Foucault was born.
Interesting that you named your HN account after a famous homophobe.
i also like anonbib as a central repo for interesting work.
https://www.freehaven.net/anonbib/topic.html
Hmm, I'm a little confused, since in 2021 Mozilla released experimental one-process-per-site:
> This fundamental redesign of Firefox’s Security architecture extends current security mechanisms by creating operating system process-level boundaries for all sites loaded in Firefox for Desktop
https://blog.mozilla.org/security/2021/05/18/introducing-sit...
Perhaps that is not fully released?
Or perhaps it is, but IndexedDB happens to live outside of that isolation?
If so, cool!
JS also dramatically improves security. TBB is stuck in a 90s mindset about privacy, as if Firefox exploits were not a dime a dozen. Especially with AI making FF exploits more available, we can expect many Tor sites to be actively attacking their visitors.
Tor endpoints are pretty easy to identify, there are plenty of handy databases for that, using it to begin with increases your uniqueness. If noscript was set to strictly disallow javascript by default, that decreases the degree to which it increases your signature relative to the baseline of using tor.
Then we have to account for the simple fact that many, many fingerprinting techniques rely on javascript, so taking them out of the picture reduces the unique identity that can be gleaned.
Are we absolutely, positively sure that the tradeoff is worth it? Without a strict repeatable measurement, I think I'm highly skeptical about whether or not a default of "allow" is a net boon to hiding your identity. I remember the rationale about the switch mostly being directed towards "most of the web is broken otherwise and that's bad."
If TBB changed to js off by default that signal would be less evident, and also, fingerprinting would be harder.
How so?
Tor Browser also doesn't spoof navigator.platform at all for some reason, so sites can still see when you use Linux, even if the User-Agent is spoofing Windows.
I've heard a handful of people say this, but are there examples of what I imagine would have to be server-side fingerprinting, and of its granularity? Most fingerprinting I'm aware of is client-side, running via JS. I'd expect server-side checks to be limited to things like which resources haven't been loaded by a particular user, plus anything else normally available via server logs. That could narrow the pool, but I wonder how effective it is at tracking uniqueness across sites.
https://fingerprint.com/blog/disabling-javascript-wont-stop-...
There is also a method of fingerprinting using the favicon: https://github.com/jonasstrehle/supercookie
We're talking about users of the Tor browser, and I'd be very surprised if this was the case (that a majority keep JS turned on)
Basically every Tor guide (heh) tells you to turn it off because it's a huge vector for all types of attacks. Most onion sites have captcha systems that work without JS too which would indicate that they expect a majority to have it disabled.
That's why expansion of web standards is wrong. Browsers should provide minimal APIs for interacting with the device, and features like IndexedDB can be implemented as a WebAssembly library, leaking no valuable data.
For example, if canvas provided only access to picture buffer, and no drawing routines calling into platform-specific libraries, it would become useless for fingerprinting.
Or just open dev tools
Why is this global keyed only by the database name string in the first place?
The post mentions a generated UUID, why not use that instead, and have a per-origin mapping of database names to UUID somewhere? Or even just have separate hash-tables for each origin? Seems like a cleaner fix to me compared to sorting (imo, though admittedly, more of a complex fix with architectural changes)
Seems to me that having a global hashtable that shares information from all origins is asking for trouble, though I'm sure there is a good explanation for this (performance, historical reasons, some benefits of this architecture I'm not aware of, etc.).
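To compare the two fixes being discussed, here is a deliberately simplified Python model. It is not Firefox's code: process_salt is a hypothetical stand-in for whatever per-process random state ends up deciding the internal ordering, and the three function names are made up for this sketch.

```python
import hashlib


def leaky_listing(names, process_salt):
    """Order follows hidden per-process state keyed only by the database name."""
    return sorted(names, key=lambda n: hashlib.sha256(process_salt + n.encode()).digest())


def sorted_listing(names, process_salt):
    """The sorting fix: return a canonical order, so the hidden state never shows."""
    return sorted(names)


def per_origin_listing(names, process_salt, origin):
    """The alternative: key the hidden state by (origin, name) instead of name alone."""
    return sorted(
        names,
        key=lambda n: hashlib.sha256(
            process_salt + origin.encode() + b"\0" + n.encode()
        ).digest(),
    )
```

In this model, per-origin keying stops two different sites from comparing notes, but a single site would still see the same order for as long as the process lives. Sorting removes the signal entirely, which may be one reason the simpler fix is attractive.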
namespace mozilla {
namespace dom::indexedDB {
using namespace mozilla::dom::quota;
using namespace mozilla::ipc;
using mozilla::dom::quota::Client;
The IndexedDB UUID is "shared across all origins", so why not use the contents of the database to identify browsers, rather than the ordering?
The key vulnerability here is that, for the lifetime of that Firefox process, any website that makes that set of databases is going to see the exact same output ordering, no matter what the contents of those databases are. That makes this a fingerprint: it's a stable, high-entropy identifier that persists across time, even if the contents of those databases are not preserved. It is shared even across origins (where the contents would not be), and preserved after website data is deleted -- all a website has to do to re-acquire the fingerprint is recreate the databases with the same names and observe their ordering.
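A toy simulation in Python may help show why ordering alone is enough. This is a simplified model, not the real implementation: BrowserProcess, the random per-process salt, and the probe database names are all assumptions made for illustration.

```python
import hashlib
import secrets

PROBE_NAMES = [f"probe-{i}" for i in range(16)]  # hypothetical names a site would create


class BrowserProcess:
    def __init__(self):
        self._salt = secrets.token_bytes(16)  # stand-in for per-process random state
        self._databases = {}                  # origin -> set of database names

    def create(self, origin, name):
        self._databases.setdefault(origin, set()).add(name)

    def clear_site_data(self, origin):
        self._databases.pop(origin, None)     # contents go away, the salt does not

    def databases(self, origin):
        # Listing order depends on process state keyed only by name, not by origin.
        def internal_key(name):
            return hashlib.sha256(self._salt + name.encode()).digest()
        return sorted(self._databases.get(origin, ()), key=internal_key)


def read_identifier(browser, origin):
    """Create the probe databases, read back their order, then clean up."""
    for name in PROBE_NAMES:
        browser.create(origin, name)
    order = tuple(browser.databases(origin))
    browser.clear_site_data(origin)
    return order
```

Two unrelated origins calling read_identifier on the same BrowserProcess get the same tuple even though neither stores anything afterwards, and a new BrowserProcess (a restart) gives a different one. Sixteen names allow 16! possible orderings, about 44 bits, which is where the "high-entropy" part comes from.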
So it persists between anonymous sessions. So you could connect User A, who logged out and reset the identity, to User B, who believed they were using a fresh anonymous session and logged in afterwards.
https://blog.torproject.org/new-release-tor-browser-15010/
Seriously, I am saddened that Chromium dominates the browser market as much as it does, but at this point the herd-immunity of Chromium is necessary to keep users safe.
Because it's an isolated remote browser, you also get a lot of flexibility. You can run BrowserBox itself as an onion hidden service connected to the clearnet, or connect BrowserBox to browse over Tor, or even do both at the same time. Since this Firefox IndexedDB vulnerability relies on persisting state, you can completely avoid it by running BrowserBox (based on Chromium), and doing it ephemerally. There's actually a new GitHub action [0] that makes spinning up a purely ephemeral, disposable session incredibly easy and would be immune to this kind of process-level state tracking.
The action runs BrowserBox on a GitHub Actions runner; you can specify whether you want a Cloudflare tunnel or a Tor tunnel (which comes with torweb access). There's also a convenience script you can run from the command line, which does the setup and then prints your login link.
All you need is a BrowserBox license (not free), but then you can use it.
I would consider this a lightweight Tor-proxied browser, not a replacement for Tor Browser, at this time, as there are likely edges and leaks that the official Tor Browser has long since patched. However, as cases like this IDB bug demonstrate, no security is perfect. If you simply want a way to access Tor and add an extra "ephemeral" hop on a runner, itself over Tor, and you are not trying to do anything especially sensitive or life-threatening, it's probably good.
[0]: https://github.com/marketplace/actions/browserbox
[1]: https://github.com/BrowserBox/BrowserBox
And all browser devs should be required to actively fight against fingerprinting.
There is no legitimate need for fingerprinting in browsers.
It's more than a browser restart, it's a complete system wipe every time.
Tails is made on the premise that exactly this kind of trick will occur, sometimes even persisting between browser restarts. For that reason even the persistent storage is very limited, and it is optional and cautioned against for maximum anonymity.
What would be worrying with tails would be if there was some way for some hardware identifier to be exposed. Like a serial number or MAC address. But this kind of thing is exactly what it's made to protect against.
For those who want an ephemeral setup but prefer the Chromium engine over Firefox, you can achieve a similar "destroy after use" workflow using BrowserBox. It has a tor-run function that connects Chrome to a Tor SOCKS proxy and wraps all auxiliary network calls over torsocks.
You can easily spin up a purely ephemeral session using a GitHub action [0] so that absolutely no state persists once you close it. As a bonus, you can also run the BrowserBox instance itself as an onion hidden service while browsing over Tor.
[0]: https://github.com/marketplace/actions/browserbox
https://www.ndss-symposium.org/wp-content/uploads/ndss2021_1...
Says that Firefox has a bug that prevents favicons from being loaded from cache, which inadvertently protects against this technique. They filed a bug report on it in 2020 but nothing has happened with it yet: https://bugzilla.mozilla.org/show_bug.cgi?id=1618257
Dump the rendered window pixels out to a simple viewer. Mouse movement is still a pain to deal with, but I would default to spoofing it as moving between clicks, with some image parsing logic to identify menu traversal.
Then it should reboot the browser process regularly.
I've been waiting for someone to make a packaged 'VPC in a box' incorporating networking and linked VMs.
connects Chrome to a Tor SOCKS proxy and wraps all other browsing-related network calls over torsocks. It prevents local fingerprinting leaks (like this IndexedDB ordering bug) because the browser isn't running locally at all. You can host the BrowserBox instance as an onion hidden service, use it to browse over Tor, or both.
If you want to try an ephemeral "VPC in a box" style setup where the environment is destroyed after you're done, you can easily spin it up using this new GitHub action: https://github.com/marketplace/actions/browserbox (but you need a license key, obtainable at https://browserbox.io)
This is my attempt to make it easy to spin up bbx on ephemeral infrastructure that's mostly free (GitHub Actions runners are perfect).
Just use a network namespace; individual pieces of software are way too easy to misconfigure.
This is dangerously incomplete and bad advice.
Qubes OS does not work the way you seem to think it does.
Creating a new identity in the Tor Browser inside a disposable VM does not automatically stop that VM and start a new disposable VM. That initial disposable VM launches the new identity from the existing process and therefore remains vulnerable, the same as any bare metal computer running Tor Browser would.
Virtualization is not magic.
A Qubes OS user needs to spin up a new disposable Whonix VM to sidestep this attack. Creating a new identity alone is ineffective in this threat model.
If you care about these projects as much as you say you do, please stop giving harmful advice. You do it in various places on the Internet and in every thread which gives you half a chance to do so, and these projects would be better off if you either took any of the extensive well-reasoned correction many people offer you, or opted to stop making such claims. The former would be ideal, the latter still vastly preferable to the existing state of affairs.
A Qubes OS user needs to start a new disposable Whonix workstation VM to sidestep this attack, NOT create a new identity in the same disposable VM's browser, which is exactly what this attack targets.
This is technically incorrect information and could get people in trouble if followed literally.
On Qubes OS, if a user creates a new identity inside a Whonix workstation disposable VM via the browser's new identity functionality, the new identity spawns within the same disposable VM. I just tested this on Qubes OS 4.3.
That, I assume, would expose one to OP's vulnerability, as it's still running in the same VM. I would be glad to learn that I'm incorrect in my unverified assumption.
Even Qubes OS users still need to be mindful to launch a new disposable VM when keeping identities separate to sidestep this attack.
Joanna Rutkowska's understandable preference for older kernels had its advantages, but the current team is much more likely to ship somewhat newer kernels and I've been surprised by what hardware 4.3 has worked well on.
Beyond that, I'm currently running a kernel from late Feb/early Mar (6.19.5).
Driver support can still be an issue, and a Wi-Fi card that doesn't play nice with Linux in general is going to be no different on Qubes OS.
> For security and product stakeholders, the key point is simple: even an API that appears harmless can become a cross-site tracking vector if it leaks stable process-level state.
This reads almost LLM-ish. The article on the whole does not appear so, but parts of it do.
Did you even read the article at all? Ah my children did bad in school, time to replace them with new children and a different spouse. This is what you're suggesting essentially. A browser is not just something you simply make out of thin air. There's decades of nuance to browser engines, and I'm only thinking of the HTML nuances, not the CSS or JS nuances.
>Physical isolation is a given safeguard that the digital world lacks
…
>In our digital lives, the situation is quite different: All of our activities typically happen on a single device. This causes us to worry about whether it’s safe to click on a link or install an app, since being hacked imperils our entire digital existence.
>Qubes eliminates this concern by allowing us to divide a device into many compartments, much as we divide a physical building into many rooms. …
Sold
https://doc.qubes-os.org/en/latest/introduction/intro.html
Having said that, fsflover exhibits a poor grasp of how this stuff works and all should be aware that even in Qubes OS, one would need to spawn new disposable VMs for each identity; relying on the Tor Browser's new identity creation within the same disposable VM would be little different from running Tor Browser on a traditional OS.
A user would have to manually start a new disposable VM for each identity.