On Fri, Jun 13, 2014 at 2:22 PM, Alexei Starovoitov <email@example.com> wrote:
> On Tue, Jun 10, 2014 at 8:25 PM, Kees Cook <firstname.lastname@example.org> wrote:
>> This adds the new "seccomp" syscall with both an "operation" and "flags"
>> parameter for future expansion. The third argument is a pointer value,
>> used with the SECCOMP_SET_MODE_FILTER operation. Currently, flags must
>> be 0. This is functionally equivalent to prctl(PR_SET_SECCOMP, ...).
>> Signed-off-by: Kees Cook <email@example.com>
>> Cc: firstname.lastname@example.org
>> arch/x86/syscalls/syscall_32.tbl | 1 +
>> arch/x86/syscalls/syscall_64.tbl | 1 +
>> include/linux/syscalls.h | 2 ++
>> include/uapi/asm-generic/unistd.h | 4 ++-
>> include/uapi/linux/seccomp.h | 4 +++
>> kernel/seccomp.c | 63
>> kernel/sys_ni.c | 3 ++
>> 7 files changed, 69 insertions(+), 9 deletions(-)
>> diff --git a/arch/x86/syscalls/syscall_32.tbl
>> index d6b867921612..7527eac24122 100644
>> --- a/arch/x86/syscalls/syscall_32.tbl
>> +++ b/arch/x86/syscalls/syscall_32.tbl
>> @@ -360,3 +360,4 @@
>> 351 i386 sched_setattr sys_sched_setattr
>> 352 i386 sched_getattr sys_sched_getattr
>> 353 i386 renameat2 sys_renameat2
>> +354 i386 seccomp sys_seccomp
>> diff --git a/arch/x86/syscalls/syscall_64.tbl
>> index ec255a1646d2..16272a6c12b7 100644
>> --- a/arch/x86/syscalls/syscall_64.tbl
>> +++ b/arch/x86/syscalls/syscall_64.tbl
>> @@ -323,6 +323,7 @@
>> 314 common sched_setattr sys_sched_setattr
>> 315 common sched_getattr sys_sched_getattr
>> 316 common renameat2 sys_renameat2
>> +317 common seccomp sys_seccomp
>> # x32-specific system call numbers start at 512 to avoid cache impact
>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>> index b0881a0ed322..1713977ee26f 100644
>> --- a/include/linux/syscalls.h
>> +++ b/include/linux/syscalls.h
>> @@ -866,4 +866,6 @@ asmlinkage long sys_process_vm_writev(pid_t pid,
>> asmlinkage long sys_kcmp(pid_t pid1, pid_t pid2, int type,
>> unsigned long idx1, unsigned long idx2);
>> asmlinkage long sys_finit_module(int fd, const char __user *uargs, int
>> +asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
>> + const char __user *uargs);
> It looks odd to add a 'flags' argument to a syscall when it is not even used.
> I don't think it will be extensible this way.
> 'uargs' is used only in 2nd command as well and it's not 'char __user *'
> but rather 'struct sock_fprog __user *'
> I think it makes more sense to define only first argument as 'int op' and the
> rest as variable length array.
> Something like:
> long sys_seccomp(unsigned int op, struct nlattr *attrs, int len);
> then different commands can interpret 'attrs' differently.
> if op == mode_strict, then attrs == NULL, len == 0
> if op == mode_filter, then attrs->nla_type == seccomp_bpf_filter
> and nla_data(attrs) is 'struct sock_fprog'
Eww. If the operation doesn't imply the type, then I think we've
totally screwed up.
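
To make "the operation implies the type" concrete, here is a minimal userspace sketch, not the kernel code from the patch. All demo_* names are hypothetical stand-ins (demo_fprog plays the role of struct sock_fprog), and the rule that the strict operation must be passed a NULL pointer is an assumption made for illustration. The dispatcher rejects non-zero flags, then hands the pointer to a handler that already knows what it points at:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; names and values are illustrative only. */
enum demo_op {
    DEMO_SET_MODE_STRICT = 0,
    DEMO_SET_MODE_FILTER = 1,
};

struct demo_fprog {          /* plays the role of struct sock_fprog */
    unsigned short len;      /* number of filter instructions */
    const void *filter;      /* instruction array, opaque in this sketch */
};

static long demo_set_mode_strict(void)
{
    return 0;                /* nothing to parse: the op takes no argument */
}

static long demo_set_mode_filter(const struct demo_fprog *prog)
{
    /* The op already fixed the pointer's type; only validate contents. */
    if (prog == NULL || prog->len == 0 || prog->filter == NULL)
        return -EINVAL;
    return 0;
}

long demo_seccomp(unsigned int op, unsigned int flags, const void *uargs)
{
    if (flags != 0)          /* reserved for future expansion */
        return -EINVAL;

    switch (op) {
    case DEMO_SET_MODE_STRICT:
        if (uargs != NULL)   /* assumption: strict mode takes no argument */
            return -EINVAL;
        return demo_set_mode_strict();
    case DEMO_SET_MODE_FILTER:
        return demo_set_mode_filter(uargs);
    default:
        return -EINVAL;      /* unknown operation */
    }
}
```

A new operation would add one case and one handler with its own argument type; existing callers are unaffected because unknown ops already fail with -EINVAL.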
> If we decide to add new types of filters or new commands, the syscall
> won't need to change. New commands can be added preserving backward
> compatibility.
> The basic TLV concept has been around forever in the netlink world. IMO it
> makes sense to use it with new syscalls. Passing 'struct xxx' into syscalls
> is a thing of the past. TLV style is more extensible. Fields of structures
> can become optional in the future, new fields added, etc.
> 'struct nlattr' brings the same benefits to kernel api as protobuf did
> to user land.
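
For reference, the TLV scheme described above can be sketched in userspace as follows. This is an illustration, not kernel code: demo_attr and demo_find_attr are hypothetical names. The layout mirrors what struct nlattr uses, namely a 16-bit length that includes the 4-byte header, a 16-bit type, and attributes padded to 4-byte boundaries. The walker skips attribute types it does not recognise, which is where the claimed extensibility comes from:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header mirroring the nlattr layout. */
struct demo_attr {
    uint16_t len;    /* header plus payload, excluding padding */
    uint16_t type;
};

/* Round up to the next 4-byte boundary. */
static size_t demo_align(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/*
 * Return a pointer to the payload of the first attribute of type `wanted`
 * and store its length in *payload_len. Returns NULL if the attribute is
 * absent or the buffer is malformed.
 */
const void *demo_find_attr(const unsigned char *buf, size_t buflen,
                           uint16_t wanted, size_t *payload_len)
{
    size_t off = 0;

    while (buflen - off >= sizeof(struct demo_attr)) {
        struct demo_attr hdr;
        size_t step;

        /* Copy the header out to avoid unaligned access. */
        memcpy(&hdr, buf + off, sizeof(hdr));

        /* Reject lengths shorter than the header or past the buffer. */
        if (hdr.len < sizeof(hdr) || hdr.len > buflen - off)
            return NULL;

        if (hdr.type == wanted) {
            *payload_len = hdr.len - sizeof(hdr);
            return buf + off + sizeof(hdr);
        }

        /* Unknown type: skip it, including padding. */
        step = demo_align(hdr.len);
        if (step > buflen - off)
            break;
        off += step;
    }
    return NULL;
}
```

The cost shows up in the checks: every consumer has to validate lengths and alignment before it can touch a payload, whereas a fixed struct argument needs none of this.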
I see no reason to bring nl_attr into this.
Admittedly, I've never dealt with nl_attr, but everything
netlink-related I've ever been involved in has involved some sort of