From: Michel Fortin <michel.fortin <at> michelf.com>
Subject: [ANN] MDTest 1.0
Newsgroups: gmane.text.markdown.general
Date: 2007-07-03 15:19:02 GMT
Today I'm announcing a new testsuite application for Markdown called
MDTest. I've been using it as a replacement for John's Markdown Test
for some time now, and I think it's ready for release.

First, I want to say Markdown Test has been very useful in developing  
PHP Markdown. The testsuite covers many cases, but I still found
the need to add other tests for PHP Markdown, to cover bugs specific
to my implementation, or simply to cover areas for which Markdown
Test had no test case. The tests I added cover practically everything
that was changed or fixed in PHP Markdown in the last couple of  
years; I expect that to be useful to others.

I've also made a new testsuite dedicated to PHP Markdown Extra, with  
tests for all the features. It has been very practical, and worked  
very well with Markdown Test, except for one thing.

HTML normalization is the weak point of Markdown Test. Because  
different outputs can have the same HTML meaning, Markdown Test has  
an option to compare outputs filtered through Tidy. Personally, I  
have to say that I find Tidy to be more trouble than it is worth:
because it fixes too much of the HTML garbage you send to it, it often
fixes the strange markup a bug in Markdown can generate, which hides
the bug.

Some time ago I modified Markdown Test to print a diff against the
expected output whenever a test fails, and I've been using that
instead of Tidy since, because I find it more reliable to check the
differences myself. The only problem with this approach is that,
after a couple of architectural changes to PHP Markdown, especially
in the HTML block parser and in the way it generates hashes, it often
generates whitespace differently around tags. I have also changed the
list processing back so that it no longer produces strangely-indented
list items and sublists (a workaround no longer necessary for PHP
Markdown). Together, these differences force me to check many things
manually every time.

So I decided recently that I'd build a better tool. It's called  
MDTest, and it's written in PHP. The idea is that by using PHP I can  
leverage PHP 5's HTML parser and normalize whitespace directly from  
the DOM tree. This is what it does, and I've found it to be a much
better alternative to Tidy (although not perfect either). MDTest
can also show a short diff for every failed test, which may prove  
useful to double check things with normalization off. And because  
it's written in PHP, it can call the Markdown function by itself  
without the need for an intermediary script. It can still call  
Markdown.pl or any other script, like Markdown Test does.
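
To illustrate the idea of comparing parsed structure rather than raw
text, here is a minimal sketch in Python. It is not MDTest's actual
PHP code: it uses Python's standard `html.parser` instead of PHP 5's
DOM, and the `BLOCK` and `VOID` tag sets and the `normalize` function
are assumptions made up for this example.

```python
import re
from html.parser import HTMLParser

# Assumption: hand-picked tag sets for this sketch, not complete lists.
BLOCK = {"p", "div", "ul", "ol", "li", "blockquote", "pre", "hr", "table",
         "tr", "td", "th", "dl", "dt", "dd", "h1", "h2", "h3", "h4", "h5", "h6"}
VOID = {"br", "hr", "img"}


class _Collector(HTMLParser):
    """Turn HTML into a flat list of tokens."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tokens = []
        self.in_pre = 0  # nesting depth of <pre>, where whitespace matters

    def handle_starttag(self, tag, attrs):
        # Sort attributes by name so their order never causes a mismatch.
        attrs = tuple(sorted(attrs, key=lambda a: a[0]))
        self.tokens.append(("start", tag, attrs))
        if tag == "pre":
            self.in_pre += 1

    def handle_endtag(self, tag):
        if tag in VOID:
            return  # treat <hr /> and <hr> alike
        if tag == "pre" and self.in_pre:
            self.in_pre -= 1
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        pre = bool(self.in_pre)
        last = self.tokens[-1] if self.tokens else None
        if last and last[0] == "text" and last[2] == pre:
            data = self.tokens.pop()[1] + data  # merge adjacent text chunks
        if not pre:
            data = re.sub(r"\s+", " ", data)  # collapse whitespace runs
        self.tokens.append(("text", data, pre))


def normalize(html):
    """Return tokens with whitespace next to block-level tags removed."""
    collector = _Collector()
    collector.feed(html)
    collector.close()
    tokens = collector.tokens

    def is_block_edge(tok):
        # Document edges and block-level tags make adjacent spaces insignificant.
        return tok is None or (tok[0] != "text" and tok[1] in BLOCK)

    result = []
    for i, tok in enumerate(tokens):
        if tok[0] == "text":
            text = tok[1]
            if not tok[2]:  # outside <pre>
                prev = tokens[i - 1] if i > 0 else None
                nxt = tokens[i + 1] if i + 1 < len(tokens) else None
                if is_block_edge(prev):
                    text = text.lstrip()
                if is_block_edge(nxt):
                    text = text.rstrip()
                if not text:
                    continue
            tok = ("text", text)
        result.append(tok)
    return result
```

With this sketch, two outputs that differ only in whitespace around
block-level tags compare equal (`normalize(a) == normalize(b)`), while
a space between inline text and an inline tag, or any whitespace
inside `<pre>`, still counts as a difference.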

The benchmarking info is a lot more detailed too. It shows the time  
it takes to process each test, and compiles a nice summary at the  
end. The total parse time excludes the time MDTest itself spends
reading files, creating a diff or normalizing the output. MDTest also
reports the average, minimum, first-quartile, median, third-quartile,
and maximum time for running tests, and it provides these
times as a difference with the minimum time too. I think this will
allow some cross-implementation benchmarking by mostly eliminating  
overhead time for starting a process when calling a script.

For instance, here are the benchmarks for PHP Markdown on my iBook:

                        Total   Avg.   Min.    Q1.   Med.    Q3.   Max.
     Parse Time (ms):    1356     29      2      4      7     22    554
     Diff. Min. (ms):    1237     26      0      1      4     20    551

And with Markdown.pl 1.0.2b8:

                        Total   Avg.   Min.    Q1.   Med.    Q3.   Max.
     Parse Time (ms):   11143    242    177    182    196    225   1696
     Diff. Min. (ms):    2986     64      0      5     18     48   1519

This benchmark includes the time for processing three testsuites: one is
John Gruber's Markdown testsuite, the one that comes with Markdown Test;
the two others are PHP Markdown's and PHP Markdown Extra's own
testsuites.
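
The two rows of each table above can be sketched as follows. This is
an illustrative Python version, not MDTest's code; the function names
`summarize` and `diff_min` are made up here, and the quartile method
("inclusive") is an assumption, since the exact method MDTest uses is
not stated.

```python
import statistics


def summarize(times_ms):
    """Summarize per-test parse times (in ms) like the "Parse Time" row."""
    # Assumption: "inclusive" quartiles; needs at least two measurements.
    q1, med, q3 = statistics.quantiles(times_ms, n=4, method="inclusive")
    return {
        "total": sum(times_ms),
        "avg": statistics.mean(times_ms),
        "min": min(times_ms),
        "q1": q1,
        "med": med,
        "q3": q3,
        "max": max(times_ms),
    }


def diff_min(times_ms):
    """Same summary after subtracting the fastest test from every test,
    like the "Diff. Min." row."""
    fastest = min(times_ms)
    return summarize([t - fastest for t in times_ms])
```

Subtracting the minimum from every test removes whatever fixed cost
each run shares. In the Markdown.pl figures above, the total drops
from 11143 ms to 2986 ms once the 177 ms minimum is taken out of each
test, which is what you would expect if most of that time is spent
starting a process.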

MDTest works with multiple testsuites (which are simply test folders)
and by default uses all the folders it can find in its directory with
the .mdtest extension. Having a separate testsuite for PHP Markdown
is useful in the sense that I don't have to mix my tests with John's  
and merge the changes whenever he updates his; it's useful also  
because I don't have to wait for an update to John's testsuite to add  
my own tests, and because I can put in some extra tests if needed.  
Eventually, both could be merged I suppose. PHP Markdown Extra's  
testsuite simply covers the features of PHP Markdown Extra.
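
The folder-discovery rule described above could look roughly like
this in Python. It is only a sketch, not MDTest's PHP code: the
function names are hypothetical, and the file extensions used to pair
an input with its expected output (`.text` and `.html`) are an
assumption about how the test files are named.

```python
from pathlib import Path


def find_testsuites(base_dir):
    """Return the sub-folders of base_dir named *.mdtest, sorted by name."""
    return sorted(p for p in Path(base_dir).iterdir()
                  if p.is_dir() and p.suffix == ".mdtest")


def find_cases(suite_dir, input_ext=".text", expected_ext=".html"):
    """Pair each input file with its expected output in one testsuite.

    Assumption: a test case is an input file plus a file of the same
    name carrying the expected-output extension.
    """
    cases = []
    for source in sorted(Path(suite_dir).glob("*" + input_ext)):
        expected = source.with_suffix(expected_ext)
        if expected.is_file():  # skip inputs with no expected output
            cases.append((source, expected))
    return cases
```

Because each testsuite is just a folder, adding one means dropping a
new `.mdtest` folder next to the others, without touching John's
tests.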

There is still a little strangeness about some test cases right now:
PHP Markdown fails some tests from John's base Markdown testsuite
because John's testsuite expects empty title attributes on
images and/or links when no title has been specified (a bug, I'd say).
There's an opposite test in the PHP Markdown testsuite checking that  
the titles are indeed not present when not specified. I think both  
should be harmonised. Also PHP Markdown Extra will fail PHP  
Markdown's extensive Emphasis test because Extra doesn't accept  
underscore-emphasis in the middle of a word while regular Markdown  
does. There's a special test in the Extra testsuite for this emphasis  
deviation.

So today I'm publishing MDTest, bundled with John's Markdown Test
testsuite (unchanged) and the two others I have made for PHP Markdown
and PHP Markdown Extra.

In the MDTest folder, you'll find the mdtest.php script, which you  
can run on the command line. It works pretty much like Markdown Test,  
except it takes one-letter arguments (because of an implementation  
detail). The normalization feature requires PHP 5, so Mac OS X's  
default PHP install won't work for that, unfortunately. Here's the  
help page:

     MDTest Usage
     ============

     ./mdtest.php [-dnvh] [-l library_path [-f function]] [-t test_dir]*
     ./mdtest.php [-dnvh] [-s script_path] [-t test_dir]*

      Options      | Description
      -------      | -----------
      -n           | normalize HTML output before compare
      -d           | show a diff of output vs. expected output
      -l library   | PHP library to load (like markdown.php)
      -f function  | PHP function to call (Markdown)
      -s script    | script to execute (like Markdown.pl)
      -t test_dir  | testsuite directory to use
      -v           | display MDTest version
      -h           | show this help
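
If you want to drive the script from another program, the two usage
forms above can be assembled as in this small Python sketch. The
`build_command` helper is hypothetical and not part of MDTest; it only
uses the options listed in the help page, and the testsuite folder
name in the usage note is made up for illustration.

```python
def build_command(test_dirs, library=None, function=None, script=None,
                  normalize=False, diff=False):
    """Build an argument list for ./mdtest.php from the documented options."""
    if library and script:
        raise ValueError("use either a PHP library (-l) or a script (-s)")
    if function and not library:
        raise ValueError("-f only makes sense together with -l")
    argv = ["./mdtest.php"]
    if normalize:
        argv.append("-n")  # normalize HTML output before comparing
    if diff:
        argv.append("-d")  # show a diff of output vs. expected output
    if library:
        argv += ["-l", library]
        if function:
            argv += ["-f", function]
    if script:
        argv += ["-s", script]
    for test_dir in test_dirs:  # -t may be repeated, once per testsuite
        argv += ["-t", test_dir]
    return argv
```

For example, `build_command(["Markdown.mdtest"], script="Markdown.pl",
diff=True)` gives a list that can be passed to `subprocess.run` to
test Markdown.pl against a single testsuite with diffs shown.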

There's also a handy index.php file which provides a simple web  
interface to the script. What's particularly nice is that it takes  
all the files in the Implementation subfolder and creates a menu to  
select what script, or PHP library, to test.

You can download it here (167 KB):
<http://www.michelf.com/docs/projets/mdtest-1.0.zip>

(MDTest is available under the GNU General Public License.)

Michel Fortin
michel.fortin <at> michelf.com
http://www.michelf.com/