* Martin Michlmayr <email@example.com> [2006-02-28 11:41]:
> I get the following non-fatal oops on SGI IP22 (2.6.16-rc5) when
> running "md5sum /dev/mem". I know it's not very smart to run this
> command but nevertheless we shouldn't oops. FWIW, i386 reports
> "md5sum: /dev/mem: Bad address".
Right, so we had a fun discussion about this on IRC today... The
bottom line is that the kernel cannot do anything about it and root
should know what they're doing.
12:39 < ladis> tbm: you cannot read from /dev/mem randomly ;-)
12:46 < tbm> ladis: yeah, but my point is that it shouldn't oops/segfault
12:48 < ladis> tbm: Well, you accessed GIO space and MC asserted BERR interrupt.
That's pretty valid behaviour.
12:50 < tbm> ladis: an oops and segfault doesn't seem like valid behaviour from
the perspective of an end user. Why cannot it catch such an access and return
"Bad address" like i386 does
12:51 < p2-mate> tbm: if you do hw access in userland, you're on your own :)
12:53 < tbm> p2-mate: I still maintain that an oops is not an acceptable
behaviour from a user POV
12:56 < Bacchus> tbm: I'm missing some context here - what is oopsing?
12:57 < p2-mate> tbm: rm /dev/mem
12:57 < p2-mate> tbm: problem solved :)
12:57 < tbm> p2-mate: so why does /dev/mem exist in the first place...
12:57 < tbm> Bacchus: print a traceback
12:57 < p2-mate> tbm: otherwise the X server does not work
12:58 < Bacchus> tbm: Doesn't work very well but I've never seen a traceback
12:59 < tbm> well, isn't this thing called an oops? Or what's the right
12:59 < tbm> Data bus error, epc == ffffffff881b0cf0, ra == ffffffff881c9ab4
12:59 < tbm> Oops[#8]:
12:59 < tbm> Cpu 0
12:59 < tbm> $ 0 : 0000000000000000 0000000000000004 ffffffff80090000
12:59 < tbm> $ 4 : 00000000100023a8 ffffffff80090000 0000000000001000
12:59 < tbm> ..
12:59 < tbm> Call Trace:
12:59 < tbm> [<ffffffff880902bc>] vfs_read+0xfc/0x1b8
12:59 < tbm> ..
12:59 < ladis> p2-mate: That's problem of bloody crappy random number generator
12:59 < ladis> tbm: Right, it is Data bus error and I implemented it ;-)
12:59 < tbm> p2-mate: right, but my point is that the file is there and so it
can be expected that some users try to read it
13:00 < Bacchus> Okay - and a DBE when doing what?
13:00 < tbm> so either we shouldn't ship the file, or we should handle reading
from it gracefully
13:00 < p2-mate> tbm: well, perhaps it should not be there ?
13:00 < ladis> tbm: There was long debate about accessing /dev/mem on debian-mips
archive few years ago
13:00 < tbm> Bacchus: doing "md5sum /dev/mem"
13:00 < ladis> tbm: search for xdm
13:00 < p2-mate> BERR is imprecise ?
13:00 < Bacchus> tbm: You're kidding?
13:00 < tbm> Bacchus: which may be a stupid thing, but still shouldn't oops and
13:00 < ladis> tbm: It *will* always segfault on certain machines
13:01 < tbm> so why does i386 manage to produce a nice "Bad address" error?
13:01 < tbm> why is that not possible on mips?
13:01 < Bacchus> tbm: For this operation even formatting your hard disk would
13:01 < geoman> heh, I can confirm the oops on 2.6.16-rc4
13:01 < ladis> aiiie ;-)
13:01 < tbm> Bacchus: well, that's what i disagree with. If /dev/mem is so
dangerous, it shouldn't exist.
13:02 < Bacchus> Welcome to UNIX :)
13:02 < ladis> tbm: No. It is very powerful. And only root can cope with
13:02 < ladis> tbm: Remember userspace drivers
13:03 < geoman> hmm, speaking of display managers and oopses
13:03 < Bacchus> tbm: /dev/mem gives free access to any and all devices in the
system just like the kernel. No safety net.
13:03 < ladis> tbm: And even removing /dev/mem doesn't prevent you from writing
program that does mmap
13:03 < geoman> I seem to recall that wdm causes a non-fatal oops on ip22
13:04 < geoman> perhaps it is related
13:04 < ladis> geoman: sure it is
13:05 < ladis> display managers authors are insane i386 centric idiots thinking
that reading enough /dev/mem gives you enough randomness for security purposes
13:05 < ladis> I never got this point...
13:05 < Bacchus> tbm: And yes, there are many that argue that /dev/mem should
13:11 < geoman> yep, wdm causes an identical oops
13:12 < tbm> ok, at least my O2 boots again
13:13 < tbm> anyway, I do agree with you that /dev/mem is dangerous and that
root should know what they're doing
13:13 < tbm> however, the tiny bit I don't understand is that if the kernel
manages to recognize this wrong access and issue a BERR, why cannot it simply
return an error to the userland program
13:13 < tbm> but anyway, I guess there are more important things to worry about
13:14 < geoman> well, my understanding is that the kernel should never oops
13:15 < Bacchus> tbm: By the time we receive a bus error we're dead in the
water. Game over. Tilt. Insert coin to continue ;-)
13:18 < geoman> heh, doing "md5sum /dev/mem" on my O2 doesn't return any error
13:18 < geoman> I think it is really trying to take the md5sum of all that
random crud ;)
13:19 < Bacchus> A bus error just doesn't have enough knowledge about what went
wrong, aside of a few carefully controlled scenarios like hw probing.
13:37 < Bacchus> geoman: So how does wdm trigger it?
13:38 < ladis> Bacchus: Read above
13:38 < ladis> Bacchus: <ladis> display managers authors are insane i386
centric idiots thinking that reading enough /dev/mem gives you enough
randomness for security
13:39 < Bacchus> ladis: Eh... You meant you were serious?!?
13:39 < ladis> Bacchus: They are indeed doing that.
13:39 < ths> Bacchus: Sure. Wdm dies on my Indy.
13:39 < geoman> Bacchus: by simply trying to start it
13:39 < ladis> ...and the reason is above..
13:48 * Bacchus googles for wdm ...
13:50 < Bacchus> Btw, wtf does wdm work for non-root then?
13:50 < geoman> nope
13:51 < geoman> if you try to run it non-root, it returns a message that "Only
root wants to run wdm"
13:55 < ladis> Bacchus:
13:55 < ladis> Bacchus: That's start of pretty nice discussion. Read on your
own risc ;-)
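
For anyone who wants to see what userland actually gets back, here is a
minimal sketch (not taken from the discussion above) of a sequential reader
that reports the errno of the first failing read() instead of aborting. The
function name scan_device and its parameters are made up for illustration.
On i386, per the report quoted at the top, the failure shows up as EFAULT
("Bad address"); on IP22 the same read apparently ends in the bus error and
oops shown in the log, so the error path below would presumably not be what
you see there. Only point it at /dev/mem on a machine you are prepared to
reboot.

```python
import errno
import os


def scan_device(path, chunk=4096, limit=None):
    """Read `path` sequentially in `chunk`-sized pieces.

    Returns (bytes_read, errno_or_None): errno_or_None is None on a clean
    EOF (or when `limit` bytes were read), otherwise the errno of the
    read() call that failed.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while limit is None or total < limit:
            try:
                data = os.read(fd, chunk)
            except OSError as exc:
                # A failing read() is reported to the caller, not raised.
                return total, exc.errno
            if not data:
                # Empty read means end of file.
                return total, None
            total += len(data)
        return total, None
    finally:
        os.close(fd)


def describe(result):
    """Format a scan_device() result roughly the way md5sum reports errors."""
    total, err = result
    if err is None:
        return "read %d bytes, no error" % total
    name = errno.errorcode.get(err, "?")
    return "read %d bytes, then %s: %s" % (total, name, os.strerror(err))
```

Usage would be something like print(describe(scan_device("/dev/mem"))) as
root. The loop stops at the first error rather than seeking past the bad
region, which matches what md5sum does.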