From: Chris Kuklewicz <haskell <at> list.mightyreason.com>
Subject: Announcing Text.Regex.Lazy (0.33)
Newsgroups: gmane.comp.lang.haskell.libraries
Date: Tuesday 21st March 2006 20:48:04 UTC
Announcing : Text.Regex.Lazy (0.33)
Where : http://sourceforge.net/projects/lazy-regex
Who : Chris Kuklewicz 
License : BSD, except for DFAEngine.hs which is LGPL (derived from CTK light)

What: This is an alternative to Text.Regex, along with some enhancements.
GHC's Text.Regex marshals the data back and forth to C arrays to call libc,
and this is far too slow (and strict). This module parses regular expression
Strings with a Parsec parser and creates an internal data structure
(Text.Regex.Lazy.Pattern). This is then transformed either into a Parsec
parser that processes the input String, or into a DFA table for matching
against the input String or FastPackedString. The input string is consumed
lazily, so it may be an arbitrarily long or infinite source.
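
To make the idea of lazy matching concrete, here is a minimal, self-contained
sketch. It is not the library's code: the Pattern type below is a hypothetical
stand-in for Text.Regex.Lazy.Pattern, and the matcher uses Brzozowski
derivatives rather than the Parsec or DFA-table back ends described above. All
names (Pattern, deriv, matchesPrefix, abStarC) are assumptions made for
illustration. What it shows is that a matcher which inspects one character at
a time can decide a match on an infinite String.

```haskell
-- Hypothetical pattern type for illustration only; NOT the library's
-- Text.Regex.Lazy.Pattern.
data Pattern
  = Empty                 -- matches no string at all
  | Eps                   -- matches only the empty string
  | Chr Char              -- matches exactly this character
  | Any                   -- matches any single character
  | Seq Pattern Pattern   -- concatenation
  | Alt Pattern Pattern   -- alternation
  | Star Pattern          -- zero or more repetitions
  deriving (Eq, Show)

-- Does the pattern accept the empty string?
nullable :: Pattern -> Bool
nullable Empty     = False
nullable Eps       = True
nullable (Chr _)   = False
nullable Any       = False
nullable (Seq a b) = nullable a && nullable b
nullable (Alt a b) = nullable a || nullable b
nullable (Star _)  = True

-- Simplifying constructors, so dead branches collapse to Empty.
sq :: Pattern -> Pattern -> Pattern
sq Empty _ = Empty
sq _ Empty = Empty
sq Eps b   = b
sq a Eps   = a
sq a b     = Seq a b

alt :: Pattern -> Pattern -> Pattern
alt Empty b = b
alt a Empty = a
alt a b
  | a == b    = a
  | otherwise = Alt a b

-- Brzozowski derivative: the pattern that remains after consuming c.
deriv :: Char -> Pattern -> Pattern
deriv _ Empty   = Empty
deriv _ Eps     = Empty
deriv c (Chr d) = if c == d then Eps else Empty
deriv _ Any     = Eps
deriv c (Seq a b)
  | nullable a = alt (sq (deriv c a) b) (deriv c b)
  | otherwise  = sq (deriv c a) b
deriv c (Alt a b) = alt (deriv c a) (deriv c b)
deriv c (Star a)  = sq (deriv c a) (Star a)

-- True if some prefix of the input matches. Only as much of the input is
-- forced as is needed to decide, so the input may be infinite.
matchesPrefix :: Pattern -> String -> Bool
matchesPrefix p input
  | nullable p = True
  | p == Empty = False
  | otherwise  = case input of
      []       -> False
      (c : cs) -> matchesPrefix (deriv c p) cs

-- Example pattern, equivalent to the regex "ab*c".
abStarC :: Pattern
abStarC = Seq (Chr 'a') (Seq (Star (Chr 'b')) (Chr 'c'))
```

Loaded into GHCi, matchesPrefix abStarC ("abbbc" ++ repeat 'x') evaluates to
True after reading five characters, and matchesPrefix abStarC ("ax" ++ repeat
'y') evaluates to False after two, even though both inputs are infinite. A
pattern that can neither succeed nor fail on a given infinite input (for
example one that waits for a character that never arrives) will still run
forever.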


The main modules of interest are:

(*) Text.Regex.Lazy.Compat is supposed to be a drop-in replacement for
Text.Regex which uses Parsec and lazy matching.

(*) Text.Regex.Lazy.Full allows for different strategies and for expanded
syntax. This uses Parsec and a choice of lazy or strict matching.

(*) Text.Regex.Lazy.CompatDFA uses a fast lazy DFAEngine for regex matching.

(*) An early version of Text.Regex.Lazy.DFAEngineFPS applies the DFAEngine
to a Data.FastPackedString (untested).


Why might you use this?

(+) You would rather not translate a regular expression into a hard-coded
predicate for matching or filtering a string.

(+) You can parse something via parenthesized subgroup capture of a regular
expression and not have to write all the Parsec code manually.

(+) You need to filter a large input (from stdin, for example) with a regex
and want to do it lazily instead of all at once.

(+) You want to build your own extensions to regex syntax for your project
and would rather not have to rewrite one of the C libraries to do it.
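
As a rough illustration of the lazy-filtering use case, the sketch below
filters lines of input with a predicate using only Prelude functions. The
predicate containsError is a hypothetical stand-in built on Data.List's
isInfixOf, not a function from this package; in practice a regex-based
predicate would take its place. The names filterLines and grepStdin are also
assumptions made for this example.

```haskell
import Data.List (isInfixOf)

-- Keep only the lines accepted by the predicate. lines, filter, and unlines
-- are all lazy, so output can be produced before the input has ended.
filterLines :: (String -> Bool) -> String -> String
filterLines keep = unlines . filter keep . lines

-- Stand-in matcher (an assumption for this sketch): accepts lines that
-- contain the literal text "error".
containsError :: String -> Bool
containsError = isInfixOf "error"

-- Filter stdin to stdout. interact reads its input lazily, so the whole
-- input is never required to be in memory at once.
grepStdin :: IO ()
grepStdin = interact (filterLines containsError)
```

Because nothing here forces the entire input, filterLines also works on an
unbounded stream: take 2 (lines (filterLines containsError (cycle "ok\nerror
here\n"))) yields two copies of "error here" and then stops.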


What might you contribute?

(.) Use it and report rough edges, incompatibilities, and bugs.

(.) Think of clever analyses and optimizations (e.g. ones that take the
Pattern data as input).

(.) Suggest your favorite extended syntax (e.g. giving meaning to various
backslashed letters, etc.).

(.) Contribute a sinister regular expression and input as an evil test case.