[Whonix-devel] [tor-dev] TBB Memory Allocator choice fingerprint implications

Mon Aug 19 18:42:36 CEST 2019

Hey Tom,

Thank you for your response. You've made some great points. My
response is inline.

On Mon, Aug 19, 2019 at 04:09:36PM +0000, Tom Ritter wrote:
> Okay I'm going to try and clear up a lot of misconceptions and stuff
> here.  I don't own Firefox's memory allocator but I have worked in it,
> recently, and am one of the people who are working on hardening it.
> 
> Firefox's memory allocator is not jemalloc. It's probably better
> referred to as mozjemalloc. We forked jemalloc and have been improving
> it (at least from our perspective.) Any analysis of or comparison to
> jemalloc is - at this point - outdated and should be redone from
> scratch against mozjemalloc on mozilla-central.
> 
> LD_PRELOAD='/path/to/libhardened_malloc.so' /path/to/program will do
> nothing or approximately nothing. mozjemalloc uses mmap and low level
> allocation tools to create chunks of memory to be used by its internal
> memory allocator. To successfully replace Firefox memory allocator you
> should either use LD_PRELOAD _with_ a --disable-jemalloc build OR
> Firefox's replace_malloc functionality:
> https://searchfox.org/mozilla-central/source/memory/build/replace_malloc.h

Completely agreed. Using LD_PRELOAD to hook into the allocator is
improper anyway, since it won't catch early uses of the allocator.
And, as you mention, it wouldn't even work with Firefox given
mozjemalloc. Firefox is not the only application that wants
control over its allocator.

The only way to guarantee catching early allocator use is to switch
the system's allocator (i.e., the one in libc itself) to the new one.
Otherwise, the application ends up with two allocator implementations
in use: the application's custom one and the system's, which is
included and used within libc (and other system libraries, of course).
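
To make the LD_PRELOAD point concrete, here is a minimal sketch of an
allocator that maps its own chunk with mmap and hands out pieces of it.
This is not mozjemalloc's code; toy_alloc, CHUNK_SIZE, and the bump
strategy are hypothetical names and choices made only for illustration.
Because malloc() is never called, a malloc interposed through
LD_PRELOAD would never see these allocations.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical toy allocator for illustration; NOT mozjemalloc. */
#define CHUNK_SIZE ((size_t)1 << 20)   /* 1 MiB chunk, arbitrary size */

static unsigned char *chunk_base = NULL;  /* start of the mapped chunk */
static size_t chunk_used = 0;             /* bytes handed out so far */

/* Bump-allocate from a chunk obtained directly from the kernel.
 * Returns NULL if the mapping fails or the chunk is exhausted. */
void *toy_alloc(size_t size)
{
    if (chunk_base == NULL) {
        void *p = mmap(NULL, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        chunk_base = p;
    }
    if (size > CHUNK_SIZE)
        return NULL;

    /* Round up to a 16-byte boundary. */
    size_t rounded = (size + 15u) & ~(size_t)15u;
    assert(chunk_used <= CHUNK_SIZE);
    if (rounded > CHUNK_SIZE - chunk_used)
        return NULL;

    void *out = chunk_base + chunk_used;
    chunk_used += rounded;
    return out;
}
```

Any libc-internal or early-startup allocation would still go through
the system malloc, which is the "two allocators in one process"
situation described above.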

> 
> Fingerprinting: It is most likely possible to be creative enough to
> fingerprint what memory allocator is used. If we were to choose from
> different allocators at runtime, I don't think that fingerprinting is
> the worst thing open to us - it seems likely that any attacker who
> does such an attack could also fingerprint your CPU speed, RAM, and
> your ASLR base addresses, which depending on OS might not change until
> reboot.

My post was more along the lines of: what system-level components, if
replaced, have a potentially visible effect on current (or future)
fingerprinting techniques?

And: does breaking monocultures affect fingerprinting, and if so, how?
Breaking monocultures is typically done to help secure an environment
through diversity, forcing an attacker to spend more resources in
order to succeed.

> 
> The only reason I can think of to choose between allocators at runtime
> is to introduce randomness into the allocation strategy. An attacker
> relying on a blind overwrite may not be able to position their
> overwrite reliably AND it has to cause the process to crash, otherwise
> they can just try again.
> 
> Allocators can introduce randomness themselves, you don't need to
> choose between allocators to do that.

I'm assuming you're talking about randomness of the address space?
When it comes to browsers, ASLR is dead. Browsers locally execute
remotely-sourced arbitrary code, an attack vector ASLR was never meant
to protect against.

Thus, discussion of whether choice of allocator improves effectiveness
of ASLR when applied to the browser is moot.

> 
> In virtually all browser exploits we have seen recently the attacker
> creates exploitation primitives that allow partial memory read/write
> and then full memory read/write. Randomness introduced is bypassed and
> ineffective. I've seen a general trend away from randomness for this
> purpose. The exception is when the attacker is heavily constrained -
> like exploiting over IPC or in a network protocol. Not when the
> attacker has a full Javascript execution environment available to
> them.
> 
> When exploiting a memory corruption vulnerability, you can target the
> application's memory (meaning, target a DOM object or an ArrayBuffer)
> or you can target the memory allocator's metadata. While allocator
> metadata corruption was popular in the past, I haven't seen it used
> recently.
> 
> 
> 
> 
> Okay all that out of the way, let's talk about allocators.
> 
> I skimmed https://github.com/GrapheneOS/hardened_malloc and it looks
> like it has:
>  - out of line metadata
>  - double free protection
>  - guard regions of some type
>  - zero-filling
>  - MPK support
>  - randomization
>  - support for arenas
> 
> mozjemalloc:
>  - arenas (we call them partitions)
>  - randomization (support for, not enabled by default due to limited
> utility, but improvements coming)
>  - double free protection
>  - zero-filling
> In Progress:
>  - we're actively working on guard regions
> Future Work:
>  - out of line metadata
>  - MPK
> 
> hardened_malloc definitely has more bells and whistles than mozjemalloc.
> But the benefit gained by slapping in an LD_PRELOAD and calling it a
> day is small to zero. Probably negative because you'll not utilize
> partitions by default. You'd need a particularly constrained
> vulnerability to actually prevent exploitation - it's more likely
> you'll just cost the attacker another 2-8 hours of work.

100% agreed with your thoughts on LD_PRELOAD here, with the addition
of my notes above.
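
For readers less familiar with the feature lists above, here is a
minimal sketch of two of the items: out-of-line metadata and
double-free protection (plus zero-filling on free). It is neither
hardened_malloc's nor mozjemalloc's implementation; slab_alloc,
slab_free, and the fixed-size slot layout are hypothetical. A real
allocator would keep the metadata in a separate, protected mapping and
would typically abort on a double free rather than return an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy slab for illustration only. */
#define SLOT_SIZE  64
#define SLOT_COUNT 128

static unsigned char slab[SLOT_COUNT][SLOT_SIZE]; /* user data only */
static bool in_use[SLOT_COUNT];  /* metadata kept outside the slots */

/* Return the first free slot, or NULL if the slab is full. */
void *slab_alloc(void)
{
    for (size_t i = 0; i < SLOT_COUNT; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return slab[i];
        }
    }
    return NULL;
}

/* Return 0 on success, -1 on an invalid pointer or a double free. */
int slab_free(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)&slab[0][0];

    if (addr < base || addr >= base + sizeof slab)
        return -1;                      /* not from this slab */

    size_t off = (size_t)(addr - base);
    if (off % SLOT_SIZE != 0)
        return -1;                      /* not the start of a slot */

    size_t i = off / SLOT_SIZE;
    assert(i < SLOT_COUNT);
    if (!in_use[i])
        return -1;                      /* double free detected */

    memset(slab[i], 0, SLOT_SIZE);      /* zero-fill on free */
    in_use[i] = false;
    return 0;
}
```

Because the in_use flags live outside the slots, a small linear
overflow from one slot lands in a neighbouring slot's data rather than
in allocator state.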

> 
> Out of line metadata is on-the-surface-attractive but... that tends to
> only help when you have an off-by-one/four write and you corrupt
> metadata state because it's the only thing you *can* do. With out of
> line metadata, you can just corrupt a real object and effect a
> different type of corruption. I'm pretty skeptical of the benefit at
> this point, although I could be convinced. We don't see metadata
> corruption attacks anymore - but I'm not sure if it's because we find
> better exploit primitives or better vulnerabilities.
> 
> In particular, if you wanted to pursue hardened_malloc you would need
> to use replace_malloc and wire up the partitions correctly.
> Randomization will almost certainly not help (and will hurt
> performance)*. MPK sounds nice but you have to use it correctly (which
> requires application code changes), you have to ensure there are no
> MPK gadgets, and oh wait no one can use it because it's only available
> in Linux on server CPUs. =(
> 
> * One place randomization will help is on the other side of an IPC
> boundary. e.g. in the parent process. I'm trying to get that enabled
> for mozjemalloc in H2 2019.
> 
> In conclusion, while it's possible hardened_malloc could provide some
> small security increase over mozjemalloc, the gap is much smaller than
> it was when I advocated for allocator improvements 5 years ago, the
> effort is definitely non-trivial, and the gap is closing.

I'm curious about how breaking monocultures affects attacks. I think
supporting hardened_malloc (or <insert arbitrary allocator here>)
would provide at least the framework for academic exercises.

Thanks,

-- 
Shawn Webb
Cofounder / Security Engineer
HardenedBSD

Tor-ified Signal:    +1 443-546-8752
Tor+XMPP+OTR:        lattera at is.a.hacker.sx
GPG Key ID:          0xFF2E67A277F8E1FA
GPG Key Fingerprint: D206 BB45 15E0 9C49 0CF9  3633 C85B 0AF8 AB23 0FB2