Expanding the list of "Hardened Packages"

Mon Apr 1 17:54:36 UTC 2013

On 04/01/2013 04:58 AM, Adam Jackson wrote:
> On Fri, 2013-03-29 at 10:48 -0700, John Reiser wrote:
> 
>> -fPIE code is larger and takes longer to execute.  The cost varies from
>> minimal (< 2%) in many cases to 10% or more for "non-dynamic" arrays on i686.
> 
> Citation needed.

ftp://ftp.inf.ethz.ch/doc/tech-reports/7xx/766.pdf  which is cited by
the FESCo ticket  https://fedorahosted.org/fesco/ticket/1104#comment:11

It's also easy to see the mechanism:
$ cat foo.c
extern int a[];

void foo(int j) { a[j]=j; }
$ gcc -m32 -fPIE -O -S foo.c
$ cat foo.s  # edited for brevity
foo:  # 25 bytes; about 15 cycles  (incl. 3*3 cycles data cache fetch latency)
	call	__x86.get_pc_thunk.cx
	addl	$_GLOBAL_OFFSET_TABLE_, %ecx
	movl	4(%esp), %eax
	movl	a@GOT(%ecx), %edx
	movl	%eax, (%edx,%eax,4)
	ret
$ gcc -m32 -O -S foo.c
$ cat foo.s  # edited for brevity
foo:  # 12 bytes; about 6 cycles  (incl. 1*3 cycles data cache fetch latency)
	movl	4(%esp), %eax
	movl	%eax, a(,%eax,4)
	ret
$
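
The extra load in the -fPIE listing can also be modeled in plain C. This is
only a sketch: got_slot_a is a hypothetical stand-in for the GOT entry of a[]
(the real entry is filled in by the dynamic linker), and a[] is defined
locally here, instead of being extern as in foo.c, so that the fragment links
on its own.

```c
/* Sketch only: models the extra memory fetch that -fPIE adds on i686. */

/* Defined here (not extern as in foo.c) so the fragment is self-contained. */
int a[16];

/* Hypothetical stand-in for the GOT slot holding the address of a[].
   volatile keeps the compiler from folding the load away. */
static int *volatile got_slot_a = a;

/* Non-PIE shape: the address of a[] is a constant inside the store itself. */
void foo_direct(int j)
{
    a[j] = j;
}

/* PIE shape: fetch the address of a[] from memory, then store through it. */
void foo_via_got(int j)
{
    int *base = got_slot_a;   /* models: movl a@GOT(%ecx), %edx    */
    base[j] = j;              /* models: movl %eax, (%edx,%eax,4)  */
}
```

Both functions store the same value to the same place; the second one needs
one more data fetch and one more live register to get there.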

-fPIE forces an additional level of run-time indirection which often costs around
13 bytes (CALL + ADD + fetch GOT - d32) and 2 to 5 cycles (fetch @GOT and cache latency).
Some of the cost might be shared with other nearby uses, but scarcity of registers
often inhibits sharing or requires spill code.
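
The 13-byte figure can be checked against the two listings. The tally below
is a sketch that assumes the usual IA-32 encodings for each instruction; the
per-instruction sizes are not stated in the listings themselves, and the
function names are made up for illustration.

```c
/* Byte sizes assumed for the i686 instructions in the two listings. */
enum {
    CALL_REL32    = 5,  /* call __x86.get_pc_thunk.cx            */
    ADD_IMM32     = 6,  /* addl $_GLOBAL_OFFSET_TABLE_, %ecx     */
    LOAD_ARG      = 4,  /* movl 4(%esp), %eax                    */
    LOAD_GOT      = 6,  /* movl a@GOT(%ecx), %edx                */
    STORE_VIA_REG = 3,  /* movl %eax, (%edx,%eax,4)              */
    D32           = 4,  /* 32-bit displacement holding &a        */
    RET           = 1
};

/* Size of foo compiled with -fPIE. */
int pie_bytes(void)
{
    return CALL_REL32 + ADD_IMM32 + LOAD_ARG + LOAD_GOT + STORE_VIA_REG + RET;
}

/* Size of foo compiled without -fPIE: the store carries a d32 for &a. */
int nonpie_bytes(void)
{
    return LOAD_ARG + (STORE_VIA_REG + D32) + RET;
}

/* CALL + ADD + fetch GOT - d32 */
int pie_overhead(void)
{
    return CALL_REL32 + ADD_IMM32 + LOAD_GOT - D32;
}
```

Under those assumptions the totals come to 25 and 12 bytes, matching the
comments in the listings, and the difference is the 13 bytes quoted above.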

> 
>> -fPIE for Thumb mode on ARM is particularly painful.
> 
> Citation needed.

The same code above applies.  Thumb mode has no double indexing,
so an explicit ADD is required.  Registers are still in short supply;
HI registers (>=8) have dedicated usage or restricted access.  Also, the
range of the offset in base_register+offset addressing mode is severely
restricted, which often requires more explicit ADDs.

--