On Fri, May 22, 2009 at 11:19:00AM +1000, Greg Ungerer wrote:
> Atsushi Nemoto wrote:
>> On Wed, 20 May 2009 15:26:04 +0100, Ralf Baechle <email@example.com> wrote:
>>>> Now the vmalloc area starts at 0xc000000000000000 and the kernel code
>>>> and data is all at 0xffffffff80000000 and above. I don't know if the
>>>> start and end are reasonable values, but I can see some logic as to
>>>> where they come from. The code path that leads to this is via
>>>> __vunmap() and __purge_vmap_area_lazy(). So it is not too difficult
>>>> to see how we end up with values like this.
>>> Either start or end address is sensible but not the combination - both
>>> addresses should be in the same segment. Start is in XKSEG, end in CKSEG2
>>> and in between there are vast wastelands of unused address space exabytes
>>> in size.
>>>> But the size calculation above with these types of values will result
>>>> in still a large number. Larger than the 32bit "int" that is "size".
>>>> I see large negative values fall out as size, and so the following
>>>> tlbsize check becomes true, and the code spins inside the loop inside
>>>> that if statement for a _very_ long time trying to flush tlb entries.
>>>> This is of course easily fixed, by making that size "unsigned long".
>>>> The patch below trivially does this.
>>>> But is this analysis correct?
>>> Yes - but I think we have two issues here. The one is the calculation
>>> overflowing int for the arguments you're seeing. The other being that
>>> the arguments simply are looking wrong.
>> The wrong combination comes from lazy vunmapping which was introduced
>> in 2.6.28 cycle. Maybe we can add new API (non-lazy version of
>> vfree()) to vmalloc.c to implement module_free(), but I suppose
>> fallbacking to local_flush_tlb_all() in local_flush_tlb_kernel_range()
>> is enough().
> Is there any performance impact on falling back to that?
> The flushing due to lazy vunmapping didn't seem to happen
> often in the tests I was running.
It would depend on the workload. Some depend heavily on the performance
of vmalloc & co. What I'm wondering now is if we no tend to always flush
the entire TLB instead of just a few entries. The real cost of a TLB
flush is often not the flushing but the eventual reload of the entries.
That's factors that are hard to predict so benchmarking would be