A single null byte. No use-after-free in the target. No double-free. Every mitigation the toolchain offers — Full RELRO, stack canary, NX, PIE — all active. That is the starting position.
This post documents a heap exploitation technique we are calling Schrödinger's Chunk. The name captures the core primitive: a chunk that exists simultaneously in two states, allocated from the program's perspective, freed from the allocator's perspective. We manufacture this condition from nothing by chaining five bugs in glibc 2.43's own allocator code. The target program is correct. The bugs are in glibc.
The canonical entry point is a null byte overflow. But Schrödinger's Chunk is not tied to one primitive. Any corruption that creates overlapping chunk views — including uncontrolled out-of-bounds writes, weak single-byte overwrites, or partial size corruptions — can serve as the entry point.
Every claim here is verified against the glibc 2.43.9000 source. All five bugs were reported to the glibc security team prior to this publication. Discovery date: 02 March 2026.
glibc 2.43: What Changed and Why It Matters
Several changes in this release shifted the attack surface in ways that are not immediately obvious.
Fastbins Removed
glibc 2.43 removed fastbins entirely. Everything small now flows through either the tcache (fast path) or the full consolidation machinery (_int_free_merge_chunk). Any bug in the tcache now affects a much larger fraction of all program allocations.
Tcache Expanded to 76 Bins
The tcache grew from 64 size classes to 76: 64 small bins plus 12 large bins. TCACHE_MAX_BINS = 76. The tcache_perthread_struct now holds uint16_t num_slots[76] (152 bytes) and tcache_entry *entries[76] (608 bytes), totaling 760 bytes.
Lazy Tcache Initialization
This is the structurally important change. In glibc 2.35 the struct was allocated during the first malloc(). In glibc 2.43 it is allocated lazily, during the first free(). The heap layout consequence is fully deterministic — given any heap address leak, the address of every subsequent allocation is computable by arithmetic.
if (__glibc_unlikely (tcache_inactive ()))
return tcache_free_init (mem);
Early init (glibc 2.35):
first malloc() → allocate tcache struct → allocate requested chunk
Lazy init (glibc 2.43):
malloc() #1 → allocate chunk directly (no tcache yet)
free() #1 → allocate tcache struct via tcache_free_init → re-free chunk
Resulting layout:
heap_base + 0x000: first malloc chunk (e.g. 0x20 for malloc(1))
heap_base + 0x020: tcache_perthread (0x310)
heap_base + 0x330: second malloc chunk
Relaxed Tcache Free Path
In __libc_free, when a freed chunk is tcache-eligible, glibc 2.43 takes an early return directly to tcache_put after only two checks. The check_inuse_chunk call from prior versions is gone. The size field is accepted as-is as long as it falls within a valid tcache bin range. This is how size-field corruption becomes exploitable here.
The Allocator's Bookkeeping
chunk_ptr
│
▼
┌───────────────────┐ ← chunk header
│ prev_size (8B) │ only valid when prev chunk is free
├───────────────────┤
│ size (8B) │ total chunk size | flags in low 3 bits
├───────────────────┤ ← user pointer (what malloc returns)
│ │
│ user data │
│ │
└───────────────────┘
Bit 0 of size is PREV_INUSE. When it is 0, prev_size is valid and tells the allocator how far back to walk. Safe-linking since glibc 2.32 encodes next pointers via PROTECT_PTR(pos, ptr) = (pos >> 12) ^ ptr. The key requires knowing the storage address — or so the design intends.
Five Bugs in glibc 2.43
Bug 1: no freelist cycle detection. Neither tcache_put_n nor tcache_get_n checks whether the list contains a cycle. A chunk whose next field decodes back to itself creates an infinite list. Every pop returns the same address. This is the foundation of Schrödinger's Chunk.
Bug 2: unbounded slot counter. num_slots is a uint16_t with a logical maximum of 7, but the increment in tcache_get_n has no bounds check. After two pops from a self-referencing loop the value reaches 8, and the capacity guard (num_slots != 0) keeps accepting frees that should be rejected.
Bug 3: key clearing defeats the double-free gate. Double-free detection uses a fast key check (Stage 1) to gate a slow list walk (Stage 2), but tcache_get_n unconditionally clears e->key = 0 on every pop. After our double pop, key is 0 while tcache_key is non-zero, so Stage 1 fails and Stage 2 never runs.
Bug 4: deterministic layout makes safe-linking keys computable. Because the tcache struct is always allocated during the first free(), and the first malloc() always lands at heap base, the layout is fixed. The safe-linking key for any chunk is addr >> 12. Pure arithmetic. No freed-memory read needed.
Bug 5: unprotected bin heads. Chain-link next fields use PROTECT_PTR, but the entries[idx] head pointers in tcache_perthread_struct are stored raw. Any primitive reaching the tcache struct can redirect a bin head without computing a safe-linking key.
Bug 1 without Bug 2: no inflated counter to pass the capacity guard. Bug 1 without Bug 3: Stage 2 list walk finds the cycle and aborts. Bug 3 without Bug 1: no cleared key to exploit. None of these bugs is sufficient alone. The chain is multiplicative. Each is load-bearing.
Entry Points: What Primitive Does This Actually Need
Schrödinger's Chunk requires one thing: an overlapping chunk. One live program-visible handle and one allocator-visible free chunk referencing the same physical memory. Once you have that, the rest runs from glibc's own bugs.
Program view: "M is live, I can read and write through it." Allocator view: "M is free, it is in a bin." Reading through the program handle reads allocator metadata. Writing through it corrupts allocator metadata. Any corruption path that leaves these two views inconsistent is sufficient.
Path A: Null Byte Overflow (Canonical)
The null byte at buf[n] lands on byte 0 of the adjacent chunk's size field, clearing PREV_INUSE. When that adjacent chunk is freed, the allocator walks backward by prev_size bytes, finds a fake free chunk we constructed in our live buffer, passes the self-pointing unlink check, and merges. The merged chunk enters the unsorted bin — but our live handle still covers that memory.
Before:
┌──────────────────┬──────────────────┐
│ [prev] 0x510 │ [victim] 0x500 │
│ in use │ PREV_INUSE=1 │
└──────────────────┴──────────────────┘
After null byte at offset 0x508 (clears victim's PREV_INUSE):
┌──────────────────┬──────────────────┐
│ [prev] 0x510 │ [victim] 0x500 │
│ in use │ PREV_INUSE=0 ← │
└──────────────────┴──────────────────┘
After free(victim) → _int_free_merge_chunk runs:
┌────────────────────────────────────────────────────┐
│ [merged] 0xA00 in unsorted bin │
│ prev handle still valid → OVERLAP ACHIEVED │
└────────────────────────────────────────────────────┘
| Entry Point | Bytes Controlled | Overlap Mechanism | Reliability |
|---|---|---|---|
| Null byte overflow | 0 (always \x00) | PREV_INUSE clear → backward consolidation | Deterministic |
| Uncontrolled OOB write | 0 (data-dependent) | Size field corruption → wrong tcache bin | Retry-based (~44%) |
| Controlled single byte | 1 byte, chosen value | Size field corruption → wrong tcache bin | Deterministic |
| Multi-byte partial overflow | Partial | Size + prev_size corruption | Varies |
| Real UAF | Full write to freed chunk | Direct tcache entry corruption | Depends on heap leak |
The Exploit Chain, Phase by Phase
heap_base + 0x000: [taste] 0x20 ← first malloc() before tcache exists
heap_base + 0x020: [tcache] 0x310 ← allocated by tcache_free_init on first free()
heap_base + 0x330: [prev] 0x510 ← source of null byte overflow (0x508 usable bytes)
heap_base + 0x840: [victim] 0x500 ← consolidation target
heap_base + 0xD40: [barrier] 0x90 ← prevents top-chunk merge when victim is freed
The construction, step by step:
1. Write a fake chunk inside prev's data with fd = bk = prev_user (self-pointing, so it passes the unlink integrity check).
2. Set victim->prev_size = 0x500.
3. Write the null byte at prev_user + 0x508, clearing victim's PREV_INUSE.
4. Free victim. _int_free_merge_chunk walks back 0x500 bytes, finds the fake chunk, passes all size consistency checks, runs the no-op unlink, and merges.
The resulting chunk of 0xA00 enters the unsorted bin at prev_user. The prev handle still covers this memory. Overlap achieved.
The merged chunk is the only unsorted bin entry. Its fd = bk = main_arena + 0x08. These sit at prev_user + 0x10 and prev_user + 0x18, inside our overlap region. Read them through the prev handle. Subtract the known main_arena offset. Libc base acquired with no additional primitive required.
Carve small chunk E from the merged region. Free it to tcache. Bug 4: the layout is deterministic so we know E_user from arithmetic alone. Compute self_loop = (E_user >> 12) ^ E_user. Through the overlap, overwrite E->next with self_loop. REVEAL_PTR(E->next) now decodes back to E_user. The bin head never advances. Every pop returns E. Bug 1: no cycle detection means this goes unchallenged.
A = malloc(0x88) returns E_user. B = malloc(0x88) returns E_user again. Two independent entries in the program's tracking table pointing to the same physical memory. num_slots overflows to 8, exceeding the logical maximum of 7. No bounds check prevents it.
Both pops cleared E->key = 0. Free A. Stage 1 gate: (0 == tcache_key) = false. Stage 2 never runs (Bug 3). Capacity guard: num_slots = 8 != 0 (Bug 2 enabling Bug 3). E is re-freed into tcache. Handle B still points to E_user. Writing through B is a use-after-free into live tcache metadata. The target program has no UAF. We built one from glibc's own bugs.
Through handle B, overwrite E->next with PROTECT_PTR(E_user, tracking_table_addr). The safe-linking key was computed arithmetically via Bug 4 — no freed-memory read. The chain reads: E_user → tracking_table. Two more pops: first returns E_user (discard), second returns tracking_table. We hold an allocation overlapping the program's own bookkeeping. Full arbitrary read and write.
FSOP to Shell
Full RELRO eliminates GOT overwrites, so the target is stdout's _IO_FILE structure. The regular vtable pointer at +0xd8 has been range-validated since glibc 2.24. The wide character vtable, reached through _wide_data->_wide_vtable, has no range check: when the wide character buffer needs initialization, glibc dispatches through its __doallocate slot (at +0x68 in _IO_jump_t), and no validation ever touches that pointer.
printf("prompt")
│
▼ stdout->vtable = _IO_wfile_jumps (legitimate — passes range check)
│
▼ _IO_wfile_xsputn → _IO_wfile_overflow
│ (wide buffer is NULL, needs allocation)
│
▼ _IO_wdoallocbuf
│
▼ fp->_wide_data->_wide_vtable->__doallocate(fp)
│ ▲ NOT range-checked
│
▼ our controlled function pointer
│
▼ shell
| stdout field | offset | value written |
|---|---|---|
| _flags | +0x00 | 0xFBAD2000 — clears _IO_UNBUFFERED, enables wide path |
| _lock | +0x88 | our zeroed fake lock in the overlap region |
| _wide_data | +0xa0 | our fake _IO_wide_data in the overlap region |
| vtable | +0xd8 | _IO_wfile_jumps — legitimate, passes range check, triggers dispatch |
off-by-one null byte
│
▼ fake chunk (fd=bk=self, size=0x501) — all integrity checks satisfied
▼ PREV_INUSE cleared on victim
_int_free_merge_chunk: backward consolidation
merged chunk 0xA00 at prev_user → unsorted bin
│
▼ overlap: prev handle over free merged chunk
read unsorted bin fd → libc base
│
▼ carve E from overlap, free to tcache
Bug 4 (lazy init): E_user computable → safe_key known
write PROTECT_PTR(E,E) to E->next through overlap
Bug 1 (no cycle detection): tcache[0x90] = E→E→E→...
│
▼ double pop: A=E_user, B=E_user
Bug 2 (no count bound): num_slots = 8, exceeds max
│
▼ free(A): re-free E_user
Bug 3 (key clearing bypasses gate): verify skipped
Bug 2 (inflated count): tcache_put accepted
write(B): UAF write to freed tcache metadata
│
▼ tcache poison: E->next → tracking_table
two pops → arbitrary allocation at tracking_table
│
▼ arbitrary R/W via tracking_table overlay
│
▼ FSOP: corrupt stdout (_flags, _lock, _wide_data, vtable)
fake _IO_wide_data → fake _wide_vtable
__doallocate (+0x68) → win()
│
▼ shell
What Makes This Different
Classic null byte poisoning and PREV_INUSE tricks also produce chunk overlap and then apply tcache poisoning from that overlap. But they still need to read freed memory to recover the safe-linking key, and they get one shot against a fixed geometry.
Schrödinger's Chunk instead synthesizes a genuine use-after-free from a program that has none. The manufactured UAF is a persistent read/write handle to freed memory, independent of the overlap region. And safe-linking key recovery is eliminated entirely: Bug 4 makes the key a function of heap base alone.
| # | Location | Defect | Proposed Fix |
|---|---|---|---|
| 1 | tcache_put_n / tcache_get_n | No cycle detection in tcache freelist | Check in tcache_put that new entry address differs from existing entries |
| 2 | tcache_get_n | num_slots++ has no upper bound | Clamp to mp_.tcache_count on every increment |
| 3 | tcache_get_n / __libc_free | e->key = 0 on every pop defeats the double-free gate | Remove the key-based gate; always walk the list, or defer key clearing |
| 4 | tcache_free_init | Lazy init creates a fully deterministic heap layout | Add randomized padding before tcache struct allocation |
| 5 | tcache_put_n | entries[] head pointers stored raw, next fields use PROTECT_PTR | Apply PROTECT_PTR consistently to head pointer assignments |
Schrödinger's Chunk demonstrates that the heap exploitation landscape on modern glibc is not settled. A performance optimization (lazy tcache init) made safe-linking keys computable. Removing fastbins concentrated all small-chunk traffic onto a path with five linked defects. A single null byte starts the chain. A program with no UAF, no double-free, and every compiler mitigation active ends with a shell. Discovery: 02 March 2026.