On Thu, 1 May 2008 01:13:46 -0700
Andrew Morton wrote:
> On Wed, 30 Apr 2008 00:03:38 -0700 Arjan van de Ven wrote:
> > > First of all:
> > > I 100% agree with Andrew that our biggest problems are in
> > > reviewing code and resolving bugs, not in finding bugs (we
> > > already have far too many unresolved bugs).
> > I would argue instead that we don't know which bugs to fix first.
> How about "a bug which we just added"? One which is repeatable.
> Repeatable by a tester who is prepared to work with us on resolving
> it. Those bugs.
> Rafael has a list of them. We release kernels when that list still
> has tens of unfixed regressions dating back up to a couple of months.

I know he does. But I will still argue that if that is all we work from,
treating all of those equally, we're doing the wrong thing.

I'm sorry, but I really do not consider "ext4 doesn't compile on m68k"
on that list to be as relevant as an "i915 drm driver crashes" bug which
has been with us for a while and is not on that list, just based on the
total user base for either of those.

Does that mean nobody should fix the m68k bug? Someone who cares about
m68k for sure should work on it, or, if it's easy for an ext4 developer,
sure. But if the ext4 person has to spend 8 hours on it figuring out
cross compilers, I say we're doing something very wrong here. (No offense
to the m68k people, but there are only a few of you; maybe I should have
picked voyager instead.)

Maybe that's a "boggle" for you, but for me that's symptomatic of where
we are: we don't make (effective) prioritization decisions. Such decisions
are hard, because they effectively mean telling people "I'm sorry but your
bug is not yet important enough", which is unpopular, especially if the
reporter is very motivated on lkml. And it will involve a certain amount
of non-quantifiable judgement calls, which also means we won't always be
right.

Another hard thing is that lkml is a very self-selecting audience. A bug
may be reported three times there but never hit otherwise, while another
bug might not be reported at all (or only once) while thousands and
thousands of people are hitting it.

Not that we're doing all that badly; we ARE fixing the bugs (at least the
ones that are frequently hit). So I wouldn't blindly say we're doing a bad
job at prioritizing. I would rather say that if we focus only on what is
left afterwards without doing a [...], we'll *always* have a negative view
of quality, since there will *always* be bugs we don't fix. Linux has well
over ten million users (many more if you count embedded). A lot of them
will have "standard" hardware, and a bunch of them will have [...].
Cosmic rays happen. As do overclocking and bad DIMMs. And some BIOSes are
just weird, etc.

If we do not prioritize effectively we'll be stuck forever chasing ghosts,
or we'll be stuck saying "our quality sucks" forever without making
progress.

Another trap is to look only at what goes wrong, not at what goes right.
We tend to see only what goes wrong on lkml, and it's easy to fall into
doomthinking.

Are we doing worse on quality? My (subjective) opinion is that we are
doing better than last year. We are focused more on quality. We are fixing
the bugs that people hit most. We are fixing most of the regressions (yes,
not all). Subsystems are seeing flat or lower bug counts/bug rates. Take
ACPI: the number of outstanding bugs *halved* over the last year. Of
course you can pick a single bug and say "but this one did not get fixed",
but that just loses the big picture and proves the point :). All of this
with a growing userbase and a rate of development that's a bit faster than
last year as well.

Can we do better? Always. More testing will help, both by detecting things
early and by letting us figure out which bugs are important. Just saying
"more testing is not relevant because we're not even fixing the bugs we
have now" is just incorrect.

More testers help. A wider range of hardware/usages allows us to find
better [...] in the hard-to-track-down bugs. More testers means more people
willing to see if they can diagnose the bugs at least somewhat themselves,
via bisection or otherwise. That's important, because that's the part of
the problem that scales well with a growing