To: Thomas Gleixner <>
Subject: Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering
From: Ingo Molnar <>
Date: Thu, 26 May 2011 11:15:18 +0200
Cc: Peter Zijlstra <>, Will Drewry <>, Steven Rostedt <>, Frederic Weisbecker <>, James Morris <>, Eric Paris <>, "Serge E. Hallyn" <>, Ingo Molnar <>, Andrew Morton <>, Tejun Heo <>, Michal Marek <>, Oleg Nesterov <>, Jiri Slaby <>, David Howells <>, Russell King <>, Michal Simek <>, Ralf Baechle <>, Benjamin Herrenschmidt <>, Paul Mackerras <>, Martin Schwidefsky <>, Heiko Carstens <>, Paul Mundt <>, "David S. Miller" <>, "H. Peter Anvin" <>, linux-arm-kernel <>, Linus Torvalds <>
In-reply-to: <alpine.LFD.2.02.1105251836030.3078@ionos>
References: <> <> <> <> <> <1306254027.18455.47.camel@twins> <> <alpine.LFD.2.02.1105242239230.3078@ionos> <> <alpine.LFD.2.02.1105251836030.3078@ionos>
User-agent: Mutt/1.5.20 (2009-08-17)
* Thomas Gleixner <> wrote:

> > If anything then that should tell you something that events and 
> > seccomp are not just casually related ...
>
> They happen to have the hook at the same point in the source and
> by pure coincidence it works because the problem to solve is
> extremely simplistic. And that's why the diffstat is minimalistic,
> but that does not prove anything.

Here are the diffstats of the various versions of this proposed 
security feature:

       bitmask (2009):  6 files changed,  194 insertions(+), 22 deletions(-)
 filter engine (2010): 18 files changed, 1100 insertions(+), 21 deletions(-)
 event filters (2011):  5 files changed,   82 insertions(+), 16 deletions(-)

The third variant, 'event filters', is actually the most 
sophisticated one of all and it is not simplistic at all.

The main reason why the diffstat is small is because it reuses over 
ten thousand lines of pre-existing kernel code intelligently. Are you 
interpreting that as some sort of failure of the patch? I think it's 
a very good thing.

To demonstrate the non-simplicity of the feature:

 - These security rules/filters can be sophisticated like:

   sys_close() rule protecting against the closing of stdin, stdout
   and stderr:

                  "fd == 0 || fd == 1 || fd == 2"

   sys_ioperm() rule allowing port 0x80 access but nothing else:

                  "from != 128 || num != 1"

   sys_listen() rule limiting the max accept() backlog to 16 entries:

                  "backlog > 16"

   sys_mprotect(), sys_mmap[2](), sys_munmap() and sys_mremap() rule
   protecting the first 1 MB NULL pointer guard range:

                  "addr < 0x00100000"

   sys_sched_setscheduler() rule protecting against the switch to
   non-SCHED_OTHER scheduler policies:

                  "policy != 0"

   Most of these examples are fine-grained access restrictions that
   AFAIK are not possible with any of the LSM based security measures
   that Linux offers today.

 - These security rules/filters can be safely used and installed by 
   unprivileged userspace, allowing arbitrary end user apps to define 
   their own, flexible security policies.

 - These security rules/filters get automatically inherited into child 
   tasks and child tasks cannot mess with them - they cannot even 
   query/observe that these filters *exist*.

 - These security rules/filters nest on each other in basically
   arbitrary depth, giving us a working, implemented, stackable LSM.

 - These security rules/filters can be extended to arbitrarily many
   further object lifetime events in the future, without changing the
   ABI.

 - These security rules/filters, unlike most LSM rules, can execute
   not just within hardirqs but also within deeply atomic contexts
   such as NMI contexts, putting far fewer restrictions on what can
   be security/access checked.

 - Access permission violations can be set up to log an event for
   each violation into a scalable ring-buffer, providing unprivileged
   security-auditing functionality to the managing task(s).

I'd call that anything but 'simplistic'.
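
For illustration only, the example expressions above can be read as
plain C predicates. This is a minimal userspace sketch, not code from
the patch: the function names are hypothetical, and it assumes the
reading used in the examples, where an expression describes the calls
to be refused. The last helper sketches nesting under the further
assumption that a call is refused as soon as any installed layer
matches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: each predicate returns true when the call matches the
 * rule, i.e. when the call would be refused. Names are hypothetical. */

/* "fd == 0 || fd == 1 || fd == 2": refuse closing stdin/stdout/stderr */
static bool close_rule(long fd)
{
	return fd == 0 || fd == 1 || fd == 2;
}

/* "from != 128 || num != 1": refuse everything except port 0x80 alone */
static bool ioperm_rule(unsigned long from, unsigned long num)
{
	return from != 128 || num != 1;
}

/* "backlog > 16": refuse listen() backlogs above 16 entries */
static bool listen_rule(int backlog)
{
	return backlog > 16;
}

/* "addr < 0x00100000": refuse mappings touching the first 1 MB */
static bool addr_rule(unsigned long addr)
{
	return addr < 0x00100000UL;
}

/* "policy != 0": refuse any policy other than SCHED_OTHER (0) */
static bool sched_rule(int policy)
{
	return policy != 0;
}

/* Nesting sketch (assumption): every layer carries its own backlog
 * limit, and the call is refused if any layer's expression matches. */
static bool listen_rule_nested(const int *limits, size_t n, int backlog)
{
	for (size_t i = 0; i < n; i++)
		if (backlog > limits[i])
			return true;
	return false;
}

/* Small self-check exercising the predicates above. */
static void rules_selftest(void)
{
	const int limits[] = { 16, 8 };	/* parent limit, tighter child limit */

	assert(close_rule(1) && !close_rule(3));
	assert(!ioperm_rule(0x80, 1) && ioperm_rule(0x80, 2));
	assert(!listen_rule(16) && listen_rule(17));
	assert(addr_rule(0x1000UL) && !addr_rule(0x00100000UL));
	assert(!sched_rule(0) && sched_rule(1));
	assert(listen_rule_nested(limits, 2, 10));
	assert(!listen_rule_nested(limits, 2, 8));
}
```

Calling rules_selftest() from a small main() is enough to exercise
the sketch. Note how the nested case refuses a backlog of 10 even
though the outer limit of 16 alone would have let it through.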
