From: Dan Steinicke <dan@...>
Subject: Re: [Chandler-dev] Meeting on Functional tests and Tinderboxes White Board notes
Newsgroups: gmane.org.osaf.devel
Date: 2006-09-27 16:43:58 GMT
I would like to see this discussion continue.  We probably need to have 
another meeting about this.  I am not attached to that, but I would like 
us to follow through and come up with some clear solutions to the 
problems that are causing so much frustration.

Below are my ideas of what the issues are and possible ways we could 
start dealing with them.  I encourage others to voice their opinions on 
what the issues are and what we should do about them.  I especially 
encourage people to correct me if they feel I am misstating something.

1)  We need a new agreement on how to respond when the tinderboxes go 
red or orange for a long period of time.  Bryan has made the following 
proposal:
   After one week of a mostly red or orange tinderbox:
   A) P1 bugs get filed for each failure
   B) The failing tests get disabled on platforms on which they fail 
(with bug numbers in the comments)
   C) People are assigned to fix those bugs and re-enable the tests

2) A number of ideas were given about the functional tests and 
ChandlerTestLib (formerly known as QAUITestAppLib).  We should decide 
what ideas we want to implement, give them priorities and assign people 
to make the changes.  The ideas I heard were: (in no particular order)
   A) Not always try to emulate a user typing and clicking
   B) Have more asserts for self checking of the test library code
   C) Remove dependencies between the tests
   D) Regularly run the tests on their own as well as in the suite
   E) Make the tests more robust.  Remove assumptions and add 
abstraction code.  For example, TestLib should have functions that help 
with setting up tests, moving from view to view, checking where you are, 
etc.
   F) Ensure tests fail when there is an exception (there is already a 
bug on this)
   G) Default to stopping tests on first error (bug 6751)
   H) Document how to run the tests better
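
As a rough illustration of F and G, here is a minimal sketch of a runner 
that treats any uncaught exception as a failure and stops on the first 
error by default.  The name run_suite and the (name, callable) test 
format are made up for this example; this is not how ChandlerTestLib or 
do_tests currently work.

```python
import traceback


def run_suite(tests, stop_on_first_error=True):
    """Run (name, callable) pairs and return (name, passed, traceback) tuples.

    Hypothetical helper: the names and the tuple format are assumptions
    made for illustration only.
    """
    results = []
    for name, test in tests:
        try:
            test()
        except Exception:
            # Any uncaught exception, including AssertionError, fails the
            # test, and the traceback is kept so the report points at it.
            results.append((name, False, traceback.format_exc()))
            if stop_on_first_error:
                break
        else:
            results.append((name, True, ""))
    return results
```

Passing stop_on_first_error=False would give the "keep going" behavior 
for people who want a full report from one run.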

3) A number of improvements to the way tests are logged.  Again we 
should decide what we want to implement, prioritize and assign the tasks.
   A) Provide one log containing all log output (chandler, twisted, 
tests) in chronological order. 
   B) Remove the false links at the top of the online logs
   C) Failing tests should print a line showing how the test was run
   D) Make tests raise on failure so there would always be a traceback 
pointing to the failure
   E) Have the default produce more log output
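
For 3A, one possible approach is to merge the separate logs by timestamp. 
The sketch below assumes each log is already in time order and that every 
line starts with a sortable 19-character timestamp such as 
"2006-09-27 16:43:58".  That format and the name merge_logs are 
assumptions for the example, not a description of the current logs.

```python
import heapq


def merge_logs(*streams):
    """Merge time-sorted log streams into one chronological list of lines.

    Assumes (hypothetically) that every line begins with a timestamp in
    'YYYY-MM-DD HH:MM:SS' form, which sorts correctly as plain text.
    """
    def stamp(line):
        # The first 19 characters hold the assumed timestamp.
        return line[:19]

    # heapq.merge reads the inputs lazily and keeps the output ordered by key.
    return list(heapq.merge(*streams, key=stamp))
```

If the real logs use different timestamp formats, each stream would need 
its own parsing step before the merge.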

4)  At least one suggestion was made about the scripting functions.
   A) Remove the 'magic' of the app_ns object and re-write it so that the 
misleading NoneType errors are reduced or avoided.
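
To show the kind of behavior I mean, here is a small sketch of a lookup 
object that fails loudly at the point of the bad name, instead of handing 
back None and failing later with a NoneType error.  The Namespace class is 
purely hypothetical and is not based on how app_ns is implemented.

```python
class Namespace:
    """Hypothetical explicit lookup object, for illustration only."""

    def __init__(self, **items):
        self._items = dict(items)

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails.
        items = self.__dict__.get("_items", {})
        if name in items:
            return items[name]
        known = ", ".join(sorted(items)) or "<nothing>"
        # Raise right away with the list of valid names, rather than
        # returning None and letting a later call fail confusingly.
        raise AttributeError(
            "no item named %r; known names: %s" % (name, known)
        )
```

With this, a typo such as ns.sidbar reports the bad name and the valid 
ones at the line where the typo is.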

5) Hardware
   A) Improve cycle times by using faster computers to run the tinderboxes

6) do_tests
   A) Re-write do_tests in Python

Dan

Dan Steinicke wrote:
> Below is a copy of what was written on the white board during the 
> meeting.
>
>
>
> Goals of functional tests/ tinderbox
>
>   1.      Knowing when the code is broken
>   2.      Knowing when chandler doesn't run
>   3.      Increase productivity
>   4.      We want constantly working code
>   5.      Points us to the error so we can fix it fast
>
> Test users (Organization)
>
>   1.      QA
>   2.      Devs
>   3.      build/ release
>
> Frustration:
>
>   1.      Too much time spent on running and fixing functional tests
>   2.      Functional tests are hard to fix
>   3.      Functional tests are hard to debug
>   4.      Reports are not clear enough
>   5.      Hard to run tests under debugger
>   6.      Too many different ways to run tests
>   7.      Functional tests are dependent on one another
>   8.      Tests are not 100% deterministic
>   9.      Test framework is not reliable
>   10.    Tests fail for the wrong reasons
>   11.    Too many assumptions in test framework
>   12.    Tests always trying to emulate a user makes the tests slower 
> and  less reliable than necessary
>   13.    Some failures are not caught
>   14.    The scripting assert messages are logged and get in the way 
> of  finding real errors
>   15.    Can't use debuggers with do_tests (can't do_tests.sh --gdb )
>
> Suggestions:
>
>   1.      Re-write do_tests in python (there is a bug on this)
>   2.      Tests should be run both individually and together in suite
>   3.      Document how to run the tests better (all flags, options)
>   4.      Improve script library
>   5.      Make the framework less brittle
>   6.      Faster tinderbox hardware for faster test cycles
>   7.      Improve logs
>   8.      Change defaults for more log output
>
>
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> Open Source Applications Foundation "chandler-dev" mailing list
> http://lists.osafoundation.org/mailman/listinfo/chandler-dev