On Wed, Jun 11, 2014 at 2:29 PM, Alexei Starovoitov
> On Wed, Jun 11, 2014 at 1:23 PM, Andy Lutomirski <email@example.com> wrote:
>> On my VM, getpid takes about 70ns. Before this patch, adding a
>> single-instruction always-accept seccomp filter added about 134ns of
>> overhead to getpid. With this patch, the overhead is down to about
> Is this the gain from patch 4 into patch 5 or from patch 0 to patch 5?
> 13ns is still with seccomp enabled, but without filters?
13ns is with the simplest nonempty filter. I hope that empty filters
>> I'm not really thrilled by this patch. It has two main issues:
>> 1. Calling into code in kernel/seccomp.c from assembly feels ugly.
>> 2. The x86 64-bit syscall entry now has four separate code paths:
>> fast, seccomp only, audit only, and slow. This kind of sucks.
>> Would it be worth trying to rewrite the whole thing in C with a
>> two-phase slow path approach like I'm using here for seccomp?
>> Signed-off-by: Andy Lutomirski <firstname.lastname@example.org>
>> arch/x86/kernel/entry_64.S | 45
>> include/linux/seccomp.h | 4 ++--
>> 2 files changed, 47 insertions(+), 2 deletions(-)
>> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
>> index f9e713a..feb32b2 100644
>> --- a/arch/x86/kernel/entry_64.S
>> +++ b/arch/x86/kernel/entry_64.S
>> @@ -683,6 +683,45 @@ sysret_signal:
>> FIXUP_TOP_OF_STACK %r11, -ARGOFFSET
>> jmp int_check_syscall_exit_work
>> +#ifdef CONFIG_SECCOMP
>> + /*
>> + * Fast path for seccomp without any other slow path triggers.
>> + */
>> + /* Build seccomp_data */
>> + pushq %r9 /* args */
>> + pushq %r8 /* args */
>> + pushq %r10 /* args */
>> + pushq %rdx /* args */
>> + pushq %rsi /* args */
>> + pushq %rdi /* args */
>> + pushq RIP-ARGOFFSET+6*8(%rsp) /* rip */
>> + pushq %rax /* nr and junk */
>> + movl $AUDIT_ARCH_X86_64, 4(%rsp) /* arch */
It wouldn't shock me if this pair of instructions were
microarchitecturally bad. Maybe I can save a few more cycles by using
bitwise arithmetic or a pair of movls and explicit rsp manipulation
here. I haven't tried.
>> + movq %rsp, %rdi
>> + call seccomp_phase1
> the assembler code is pretty much repeating what C does in
> populate_seccomp_data(). Assuming the whole gain came from
> patch 5 why asm version is so much faster than C?
> it skips SAVE/RESTORE_REST... what else?
> If the most of the gain is from all patches combined (mainly from
> 2 phase approach) then why bother with asm?
The whole gain should be patch 5, but there are three things going on here.
The biggest win is skipping int_ret_from_sys_call. IRET sucks.
There's some extra win from skipping SAVE/RESTORE_REST, but I haven't
benchmarked that. I would guess it's on the order of 5ns. In theory
a one-pass implementation could skip int_ret_from_sys_call, but that
will be awkward to implement correctly.
The other benefit is generating seccomp_data in assembly. It saves
about 7ns. This is likely due to avoiding all the indirection in
syscall_xyz and to avoiding prodding at flags to figure out the arch
Fiddling with branch prediction made no difference that I could measure.
> Somehow it feels that the gain is due to better branch prediction
> in asm version. If you change few 'unlikely' in C to 'likely', it may
> get to the same performance...
> btw patches #1-3 look good to me. especially #1 is nice.