Re: [PATCH 00/05] robust per_cpu allocation for modules

To: Steven Rostedt
Subject: Re: [PATCH 00/05] robust per_cpu allocation for modules
From: Nick Piggin
Date: Sun, 16 Apr 2006 12:47:09 +1000
Cc: LKML, Andrew Morton, Linus Torvalds, Ingo Molnar, Thomas Gleixner, Andi Kleen, Martin Mares, Chris Zankel, Marc Gauthier, Joe Taylor, David Mosberger-Tang
References: <1145049535.1336.128.camel@localhost.localdomain>
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20051007 Debian/1.7.12-1
Steven Rostedt wrote:
> On Sat, 15 Apr 2006, Nick Piggin wrote:
>
>> Steven Rostedt wrote:
>>
>>> would now create a variable called per_cpu_offset__myint in
>>> the .data.percpu_offset section.  This variable will point to the (if
>>> defined in the kernel) __per_cpu_offset[] array.  If this was a module
>>> variable, it would point to the module per_cpu_offset[] array which is
>>> created when the module is loaded.
>>
>> If I'm following you correctly, this adds another dependent load
>> to a per-CPU data access, and from memory that isn't node-affine.
>>
>> If so, I think people with SMP and NUMA kernels would care more
>> about performance and scalability than the few k of memory this
>
> It's not just about saving memory, but also to make it more robust. But
> that's another story.

But making it slower isn't going to be popular.

Why is your module using so much per-cpu memory, anyway?

> Since both the offset array and the variables are mainly read-only (only
> written at boot), and given that the added variables are in their own
> section, couldn't something be done to help preload this into a local
> cache, or something similar?

It would still add to the dependent loads on the critical path, so
it now prevents the compiler, the programmer, or the out-of-order
execution (OoOE) engine from speculatively loading the __per_cpu_offset.

And it does increase cache footprint of per-cpu accesses, which are
supposed to be really light and substitute for [NR_CPUS] arrays.
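
To make the extra dependent load concrete, here is a minimal user-space
sketch in C. It is not kernel code: every sim_* name is hypothetical, the
sizes are made up, and the per-CPU areas are modelled as slices of one
array. sim_offset_table stands in for __per_cpu_offset[], and
sim_offset_ptr_myint stands in for the proposed per_cpu_offset__myint
pointer. A compiler may fold the extra load away in a toy like this; in
the proposed scheme the pointer lives in memory and must be read first.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_NR_CPUS   4   /* hypothetical CPU count */
#define SIM_AREA_INTS 8   /* hypothetical per-CPU area size, in ints */

/* One contiguous block holding every CPU's copy of the per-CPU area. */
static int sim_area[SIM_NR_CPUS * SIM_AREA_INTS];

/* Stand-in for the global offset table: a fixed, known symbol. */
static size_t sim_offset_table[SIM_NR_CPUS];

/* Stand-in for the proposed per-variable pointer: it has to be read
 * first to learn which offset table applies to this variable. */
static const size_t *sim_offset_ptr_myint = sim_offset_table;

/* Slot of the variable inside CPU 0's area (its "link-time address"). */
static const size_t sim_myint_slot = 3;

/* Fill in the offset of each CPU's area from the start of sim_area. */
static void sim_init(void)
{
    for (size_t cpu = 0; cpu < SIM_NR_CPUS; cpu++)
        sim_offset_table[cpu] = cpu * SIM_AREA_INTS;
}

/* Current scheme: one load (table[cpu]) feeds the address computation. */
static int *sim_per_cpu_direct(size_t cpu)
{
    assert(cpu < SIM_NR_CPUS);
    return &sim_area[sim_myint_slot + sim_offset_table[cpu]];
}

/* Proposed scheme: load the table pointer, then load table[cpu].
 * The second load cannot be issued until the first has completed. */
static int *sim_per_cpu_indirect(size_t cpu)
{
    assert(cpu < SIM_NR_CPUS);
    const size_t *table = sim_offset_ptr_myint; /* extra dependent load */
    return &sim_area[sim_myint_slot + table[cpu]];
}
```

Both accessors resolve to the same address; the only difference is the
number of loads that sit in series before that address is known.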

I don't think it would have been hard for the original author to make
it robust... just not both fast and robust. PERCPU_ENOUGH_ROOM seems
like an ugly hack at first glance, but I'm fairly sure it was a result
of design choices.
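
The trade-off behind a fixed reserve can be sketched as follows. This is
a simplified user-space model in C, not the kernel's allocator: the sim_*
names and the sizes are hypothetical, and a plain bump allocator is
swapped in for whatever bookkeeping the real code uses. The point it
illustrates is that module variables carved out of a pre-sized per-CPU
area can share the single offset table, at the price of allocation
failing once the reserve is used up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reserve size in bytes; stands in for PERCPU_ENOUGH_ROOM. */
#define SIM_ENOUGH_ROOM  64
/* Hypothetical bytes already taken by built-in per-CPU variables. */
#define SIM_STATIC_USED  40
/* Sentinel returned when the reserve cannot satisfy a request. */
#define SIM_ALLOC_FAILED ((size_t)-1)

/* Next unused byte within the (identically laid out) per-CPU areas. */
static size_t sim_next_free = SIM_STATIC_USED;

/* Hand out `size` bytes at the given power-of-two alignment.
 * Returns the offset inside each CPU's area, or SIM_ALLOC_FAILED.
 * Because the offset lies inside the pre-sized area, the same offset
 * table used for built-in variables also reaches this allocation. */
static size_t sim_percpu_modalloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Round the next free byte up to the requested alignment. */
    size_t start = (sim_next_free + align - 1) & ~(align - 1);

    /* Not robust: once the fixed reserve is exhausted, the request fails. */
    if (start > SIM_ENOUGH_ROOM || size > SIM_ENOUGH_ROOM - start)
        return SIM_ALLOC_FAILED;

    sim_next_free = start + size;
    return start;
}
```

With these made-up numbers, 24 bytes remain for modules; requests that
fit get an offset in the shared area, and anything beyond that is
refused rather than placed somewhere that would need a second table.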

SUSE Labs, Novell Inc.