

To: Ingo Molnar
Subject: Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering
From: Eric Paris
Date: Fri, 13 May 2011 11:10:49 -0400
Cc: James Morris, Will Drewry, Steven Rostedt, Frederic Weisbecker, Peter Zijlstra, "Serge E. Hallyn", Ingo Molnar, Andrew Morton, Tejun Heo, Michal Marek, Oleg Nesterov, Jiri Slaby, David Howells, Russell King, Michal Simek, Ralf Baechle, Benjamin Herrenschmidt, Paul Mackerras, Martin Schwidefsky, Heiko Carstens, Paul Mundt, "David S. Miller", Thomas Gleixner, "H. Peter Anvin", Peter Zijlstra, Linus Torvalds
[dropping microblaze and roland]

On Fri, 2011-05-13 at 14:10 +0200, Ingo Molnar wrote:
> * James Morris wrote:

> It is a simple and sensible security feature, agreed? It allows most code to 
> run well and link to countless libraries - but no access to other files is 
> allowed.

It's simple enough and sounds reasonable, but you can read all the
discussion about AppArmor to see why many people don't really think
it's the best approach.  Still, I'll agree it's a lot better than nothing.

> But if i had a VFS event at the fs/namei.c::getname() level, i would have 
> access to a central point where the VFS string becomes stable to the kernel 
> and can be checked (and denied if necessary).
> A sidenote, and not surprisingly, the audit subsystem already has an event 
> callback there:
>         audit_getname(result);
> Unfortunately this audit callback cannot be used for my purposes, because the 
> event is single-purpose for auditd and because it allows no feedback (no 
> deny/accept discretion for the security policy).
> But if i had this simple event there:
>       err = event_vfs_getname(result);
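
To make the distinction in the quoted text concrete, here is a minimal userspace sketch, not kernel code. It contrasts a callback that can only observe with one whose return value the caller honours. Every name ending in `_sketch`, the `getname_policy` pointer, and the `/tmp/sandbox/` prefix are hypothetical and exist only for illustration; the real `audit_getname()` takes a kernel object rather than a plain string, and a real policy would need far more than a prefix comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy callback: returns 0 to accept, -errno to deny. */
typedef int (*getname_policy_fn)(const char *name);

/* Hypothetical attachment point; NULL means no policy is attached. */
getname_policy_fn getname_policy;

/* Audit-style hook: it sees the name but returns nothing, so it has
 * no way to feed a deny decision back to the caller. */
void audit_getname_sketch(const char *name)
{
    (void)name; /* a real implementation would log the name */
}

/* Event-style hook: the verdict of the attached policy is returned,
 * which gives the policy deny/accept discretion. */
int event_vfs_getname_sketch(const char *name)
{
    assert(name != NULL);
    return getname_policy ? getname_policy(name) : 0;
}

/* Illustrative policy (an assumption, not from the patch set): accept
 * only names under /tmp/sandbox/. A plain prefix test ignores "..",
 * symlinks and relative paths, so it is not a usable security check. */
int sandbox_policy(const char *name)
{
    static const char prefix[] = "/tmp/sandbox/";
    return strncmp(name, prefix, sizeof(prefix) - 1) == 0 ? 0 : -EACCES;
}

/* Stand-in for the point where the name has become stable. */
int getname_sketch(const char *name)
{
    audit_getname_sketch(name);            /* observe only */
    return event_vfs_getname_sketch(name); /* may veto */
}
```

With `getname_policy` left NULL every name is accepted; once it points at `sandbox_policy`, `getname_sketch("/etc/passwd")` returns `-EACCES` while names under the prefix still return 0.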

Wow, it sounds so easy.  Now let's keep extending your train of thought
until we can actually provide the security provided by SELinux.  What do
we end up with?  We end up with an event hook right next to every LSM
hook.  You know, the LSM hooks were placed where they are for a reason.
Because those were the locations inside the kernel where you actually
have information about the task doing an operation and the objects
(files, sockets, directories, other tasks, etc) they are doing an
operation on.

Honestly, all you are talking about is remaking the LSM with 2 sets of
hooks instead of 1.  Why?  It seems much easier, if you want the
language of the filter engine, to just make a new LSM that uses
the filter engine for its policy language rather than the language
created by SELinux or SMACK or name-your-LSM implementation.

>  - unprivileged:  application-definable, allowing the embedding of security 
>                   policy in *apps* as well, not just the system
>  - flexible:      can be added/removed runtime unprivileged, and cheaply so
>  - transparent:   does not impact executing code that meets the policy
>  - nestable:      it is inherited by child tasks and is fundamentally stackable,
>                   multiple policies will have the combined effect and they
>                   are transparent to each other. So if a child task within a
>                   sandbox adds *more* checks then those add to the already
>                   existing set of checks. We only narrow permissions, never
>                   extend them.
>  - generic:       allowing observation and (safe) control of security relevant
>                   parameters not just at the system call boundary but at other
>                   relevant places of kernel execution as well: which
>                   points/callbacks could also be used for other types of
>                   event extraction such as perf. It could even be shared
>                   with audit ...
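
The "nestable" property in the list above can be sketched in a few lines of userspace C. This is an illustration under stated assumptions, not the filter engine from the patch set: `struct filter_stack`, `filter_stack_inherit()`, `filter_stack_push()`, `filter_stack_allows()` and the two toy predicates are all hypothetical names. The sketch deliberately has no remove operation, so it covers only the "narrow, never extend" rule and not the runtime removal mentioned under "flexible".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILTERS 8

/* Hypothetical filter: returns true if the call number is allowed. */
typedef bool (*filter_fn)(int syscall_nr);

struct filter_stack {
    filter_fn filters[MAX_FILTERS];
    size_t count;
};

/* A child task starts with a copy of its parent's filters. */
struct filter_stack filter_stack_inherit(const struct filter_stack *parent)
{
    assert(parent != NULL);
    return *parent;
}

/* Filters can only be added here, so a stack can only get stricter.
 * Returns false if the stack is already full. */
bool filter_stack_push(struct filter_stack *s, filter_fn f)
{
    assert(s != NULL && f != NULL);
    if (s->count == MAX_FILTERS)
        return false;
    s->filters[s->count++] = f;
    return true;
}

/* Combined effect: a call is allowed only if every filter allows it. */
bool filter_stack_allows(const struct filter_stack *s, int syscall_nr)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (!s->filters[i](syscall_nr))
            return false;
    }
    return true;
}

/* Toy predicates standing in for real policies. */
bool allow_below_10(int nr) { return nr < 10; }
bool allow_even(int nr) { return nr % 2 == 0; }
```

If a parent holds only `allow_below_10` and a child inherits that stack and pushes `allow_even`, call number 3 is allowed for the parent but denied for the child, and nothing the child adds can re-allow number 12.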

I'm not arguing that any of these things are bad things.  What you
describe is a new LSM that uses a discretionary access control model but
with the granularity and flexibility that has traditionally only existed
in the mandatory access control security modules previously implemented
in the kernel.

I won't argue that's a bad idea; there's no reason in my mind that a
process shouldn't be allowed to control its own access decisions in a
more flexible way than rwx bits.  Then again, I certainly don't see a
reason that this syscall hardening patch should be held up while a whole
new concept in computer security is contemplated...

