[Whonix-devel] Exposing AnonVM Users with Dom0 Hardware Fingerprint Leaks

Tue Feb 17 16:16:20 CET 2015

On 2015-02-17 11:28 am, Joanna Rutkowska wrote:
> On 02/17/15 11:55, WhonixQubes wrote:
>> Hi Joanna,
>> 
>> 
>> On 2015-02-16 9:38 am, Joanna Rutkowska wrote:
>>> 
>>> Xen has support for emulating CPUID for HVM guests -- take a look at
>>> the config examples in:
>>> 
>>> xen-4.1.6.1/tools/examples/xmexample.hvm-stubdom
>> 
>> 
>> I looked through the CPUID feature in this example file:
>> 
>> -
>> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob_plain;f=tools/examples/xmexample.hvm-stubdom;hb=stable-4.1
>> 
>> 
>> More general info on CPUID for others:
>> - https://en.wikipedia.org/wiki/CPUID
>> 
>> Some of the very low-level x86 implementation details of it are beyond
>> me currently, but, from what I can glean it looks like it is generally
>> the right type of thing, since it seems to be baked into the Xen Dom0
>> layer beyond the reach of the HVM's OS.
>> 
> 
> The CPUID interception is implemented in the hypervisor via VT-x. Dom0
> has nothing to do with that...
> 

Err, oops, duh, I know better. I often conflate the terminology of Xen 
Hypervisor and Xen Dom0. Yes, rather, I meant Xen Hypervisor. :)

>> I would be looking for AnonVMs to simply not be able to know what CPU
>> the host machine is running on, by any means (barring covert channels
>> or Xen breakouts), even with privilege-escalated malware in the VM.
>> 
>> 
>> 
>>> I haven't played with it, but see no reason it should not work. I can
>>> imagine we introduce a pref for VMs (say "generic_cpuid", settable via
>>> qvm-prefs) that would result in additional config for CPUID emulation
>>> being inserted in the config file for such VMs.
>> 
>> 
>> Sounds good.
>> 
>> 
>> 
>>> We would need to
>>> agree on good-enough-for-everybody CPUID config and stick to it then.
>>> Again, this would be use-able for anon VMs mostly.
>> 
>> 
>> Yes. Sounds like a plan.
>> 
>> I'm guessing that this would *not* limit the speed of the CPU(s) that
>> the HVM is exposed to? Just changes the info/attributes of the AnonVM
>> domain's CPU (including reported MHz?)?
>> 
>> 
> 
> No.
> 

Ok, so, *No*, it will not limit actual CPU operating speed.

But, would it also mask/fake the *reported* CPU speed info to being 
something universal/generic?
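
To make "reported" CPU info concrete: inside a Linux guest, most of it 
shows up in /proc/cpuinfo. Below is a minimal Python sketch that pulls 
out the fields relevant to fingerprinting. The function name and the 
sample text are hypothetical, made up for illustration. My understanding 
(an assumption, not something confirmed in this thread) is that "model 
name" comes from the CPUID brand string, while "cpu MHz" is usually 
measured by the guest kernel itself, so CPUID masking alone may not 
cover it.

```python
WANTED_FIELDS = ("vendor_id", "model name", "microcode", "cpu MHz")


def cpu_fingerprint_fields(cpuinfo_text):
    """Return the fingerprint-relevant fields of the first CPU found in
    /proc/cpuinfo-style text, as a dict of field name -> string value."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            if fields:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in WANTED_FIELDS:
            fields[key] = value.strip()
    return fields


# Hypothetical sample, shaped like one processor block of /proc/cpuinfo.
SAMPLE = """processor   : 0
vendor_id   : GenuineIntel
model name  : Example CPU 1234 @ 2.60GHz
microcode   : 0x1b
cpu MHz     : 2593.992

processor   : 1
vendor_id   : GenuineIntel
"""

# In a real guest one would pass open("/proc/cpuinfo").read() instead.
print(cpu_fingerprint_fields(SAMPLE))
```

Note that the clock speed can appear twice: once inside the "model name" 
string and once as the separately measured "cpu MHz" value.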

>> 
>>> However, this will not work for PV VMs, because CPUID is not a
>>> privileged instruction, so malware in a PV VM can always execute it
>>> without trapping into Xen (even if we hooked the Xen interface that
>>> provides CPUID-like info to the guest).
>> 
>> 
>> Too bad that this excludes paravirtualized VMs.
>> 
> 
> BTW, it should be obvious, but let me point out that any
> compartmentalizing technology for x86 that is *not* based on VT-x/AMD-v
> would be prone to this problem. This is b/c CPUID is an *instruction*
> and its execution cannot otherwise be controlled by the OS, other than
> via VT-x intercept.
> 
>> However, if there is no way to achieve a masked CPU with PVMs, then so
>> be it.
>> 
>> Given the general statistical environment of AnonVM users, I think
>> unique CPU info is too important of a de-anonymization vector to hold
>> onto PVMs for.
>> 
>> 
>> 
>>> AFAIU, there is no personally identifying info returned by CPUID, but
>>> I can see how this could be used as an additional fingerprinting
>>> vector.
>> 
>> 
>> Right.
>> 
>> For example, subdividing the cross-section of privacy/anonymity users
>> by the following attributes would no doubt be a privacy/anonymity
>> killer for individual human identities...
>> 
>> # of unique combined mixtures of the following attributes:
>> - # of Qubes Users
>> - # of Qubes + Tor AnonVM Users
>> - # of Qubes + Whonix AnonVM Users
>> - # of CPU Model Info
>> - # of CPU Microcode Version
>> 
> 
> FWIW, CPU ucode, AFAIK, is not CPU-persistent -- it is applied on each
> boot.
> 
>> ...should be pretty easy to reveal individual people through their
>> usage of Qubes privacy/anonymity this way.
>> 
>> Although, AFAIK, other platforms are not totally immune from this.
>> Some just have a higher # of total users out in the world, but at the
>> technical expense of lacking strong security isolation to protect the
>> integrity of their privacy/anonymity systems.
>> 
>> 
> 
> Other platforms simply do not offer any meaningful separation between
> the primarily targeted apps (e.g. a Web browser used for anon
> browsing) and the hw-specific personally identifying info (NIC MACs,
> IP, available WiFi networks in the neighborhood, etc). In these cases,
> if the attacker (e.g. NSA) exploits your anon Web browser they already
> get you. In the case of Qubes they can start gathering info such as
> CPUID output and mining through a database of Qubes users. Quite a
> different level of threat IMHO.
> 

The former is a huge reason why I use Whonix in VMs, because of this 
fundamental architectural problem with systems like Tails, etc, which 
have access to bare metal and don't isolate the Tor proxy from apps.

With the latter scenario in Qubes, it does take an added step of linking 
2 data points together, in order to identify multiple AnonVMs as being 
owned by the same pseudonymous or real world user.

A big part of the problem here is that so few people are using Qubes + 
Whonix. If 2 AnonVMs got trivially popped (via Firefox, Thunderbird, 
PDF, IMG, etc) and had the same CPU specs, it would almost certainly be 
the same user/person out in the world using that instance of Qubes + 
Whonix, since there are probably many more CPU models than such users at 
this point.

And if there is any personally identifying info/documents/etc inside one 
of the VMs, then it's a true identity game over for all known AnonVM 
activity simply tied back to a CPU model.

With GOV netflow and other vast personal activity history, such as 
technology purchases, software statistics/debug uploads, Qubes HCL 
report contributions, etc, it only gets easier to filter out key 
information and potentially infer identity based on 1 single AnonVM 
compromise.

By making the AnonVM OS technical environment entirely 
universal/generic, people could keep multiple pseudonymous identities 
and/or personally identifiable info inside AnonVMs, and still have some 
meaningful confidence that these couldn't be linked by 1 or 2 intrusions 
of some simple malware.

I personally wouldn't be one bit surprised if such a de-anonymization 
has already happened for a Qubes AnonVM user based on these 
same-or-similar technical fingerprinting methods.
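
To put rough numbers on the "more CPU models than users" argument, here 
is a small Python sketch. It assumes users are spread uniformly and 
independently over CPU models, which is a strong simplification (real 
hardware popularity is skewed), and the user and model counts are purely 
hypothetical. The function names are made up for illustration.

```python
def prob_cpu_model_unique(num_users, num_models):
    """Probability that a given user shares their CPU model with no other
    user, assuming each user's model is drawn uniformly and independently
    from num_models possibilities."""
    if num_users < 1 or num_models < 1:
        raise ValueError("num_users and num_models must be at least 1")
    return (1.0 - 1.0 / num_models) ** (num_users - 1)


def expected_other_users_sharing(num_users, num_models):
    """Expected number of *other* users with the same CPU model as a
    given user, under the same uniform assumption."""
    if num_users < 1 or num_models < 1:
        raise ValueError("num_users and num_models must be at least 1")
    return (num_users - 1) / num_models


# Hypothetical figures: 200 users spread over 2000 distinct CPU models.
users, models = 200, 2000
print(prob_cpu_model_unique(users, models))
print(expected_other_users_sharing(users, models))
```

With those made-up figures, the chance that a given user's CPU model is 
theirs alone comes out at roughly 0.9, so two compromised AnonVMs 
reporting the same model would point strongly at one user.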

>> 
>>> Thus, perhaps we should consider distributing the Whonix workstation
>>> template as an HVM template instead of a PVM one? Fortunately we do
>>> have template support for HVMs, so this should be perfectly possible.
>> 
>> 
>> Assuming there is no feasible way to accomplish this objective with
>> PVMs, then implementing the Whonix-Workstation in a HVM template with
>> "generic_cpuid" sounds like the right move.
>> 
>> Another anonymity upshot of HVMs is their, by default, non-seamless
>> fixed single windowing.
> 
> You can have seamless GUI for HVM VMs.
> 
>> Even though the seamless desktop mode of the new Qubes + Whonix
>> platform is sexy and smooth to use, it does expose another
>> semi-unique host machine attribute to the AnonVMs, which is the
>> host's unique display resolution size and pixel depth (maybe some
>> other related stuff too?).
> 
> Don't quite get it? Like 1600x900 instead of 1920x1080 you mean?
> 

Yes. Host machine screen pixel size and bit/color depth values.

In a Qubes VM/AnonVM one can run:

printenv

and get H=height, W=width, D=depth, which are the values of the host 
machine's actual hardware display.

Or install something like the "hardinfo" package to view them in a GUI.

It's yet another hardware fingerprint value that is semi-unique to the 
user's configuration.

VirtualBox + Whonix, for example, deliberately uses a default 
universal/generic screen size of 1024x768 to keep the environment 
privacy/anonymity optimized.

And the same 1024x768 universal/generic screen size exists in the 
original Qubes + Whonix HVM port.
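
A minimal Python sketch of the check described above, for anyone who 
wants to test their own VM. The variable names W, H and D are taken from 
the printenv description above and should be treated as assumptions; the 
function names are hypothetical.

```python
import os

# The generic screen size mentioned above for VirtualBox + Whonix.
GENERIC_DISPLAY = (1024, 768)


def display_from_env(env):
    """Return (width, height, depth) as ints read from the W, H and D
    variables, using None for any that are missing or non-numeric."""
    def as_int(name):
        try:
            return int(env.get(name, ""))
        except ValueError:
            return None
    return as_int("W"), as_int("H"), as_int("D")


def leaks_host_display(env):
    """True if the environment exposes a width/height pair that differs
    from the generic 1024x768 default."""
    width, height, _depth = display_from_env(env)
    if width is None or height is None:
        return False  # nothing exposed through these variables
    return (width, height) != GENERIC_DISPLAY


# Check the current process environment.
print(display_from_env(os.environ), leaks_host_display(os.environ))
```

This only covers the environment variables; tools such as hardinfo may 
obtain the same values through other interfaces.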

>> Not as bad of an attribute as the host's
>> unique CPU info, but still would be best to make use of the fixed
>> single windowing for AnonVMs so this could be generic. Maybe both
>> seamless and non-seamless windowing options could be offered for
>> Whonix-Workstation HVM template, since some people hate
>> non-seamless.
>> 
> 
>> 
>> 
>>> Let me also point out the already discussed-multiple-times topic of
>>> potential covert channels between cooperative VMs, which might also
>>> be exploited in some scenarios to fingerprint the user environment.
>>> That is more difficult to address on PC architecture though, but some
>>> work at the Xen level is nevertheless very welcome (see #817).
>> 
>> 
>> Yes. I have read through some of your stuff on covert channels in the
>> past, including in the original Qubes architecture spec doc.
>> 
>> Just read through the thread linked in Qubes ticket #817 from 2014.
>> Good stuff.
>> 
>> 
>> 
>> WhonixQubes
>> 
>> 
> 
> joanna.

WhonixQubes