From: Ashley Yakeley <ashley <at> semantic.org>
Subject: ANN: unicode-properties 3.2.0.0, unicode-names 3.2.0.0
Newsgroups: gmane.comp.lang.haskell.libraries
Date: Tuesday 2nd September 2008 04:54:38 UTC
unicode-properties 3.2.0.0, unicode-names 3.2.0.0

These two packages are representations in Haskell of various data in the 
Unicode 3.2.0 Character Database. Unicode 3.2.0 was the latest version 
of the Unicode standard at the time I wrote most of the code; later I 
may move the packages to the latest version (currently 5.1.0).

The unicode-properties package contains functions to determine general 
category, case, and a wide range of other properties, as well as to do 
decomposition and case-folding.
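
As a rough illustration of the kind of query described above, here is a minimal sketch that uses only Data.Char from GHC's base library, not the unicode-properties API (whose function names are not given in this announcement). The names CaseKind, caseKind and describe are hypothetical helpers made up for this example. Note also that base reflects whichever Unicode version GHC ships with, not necessarily 3.2.0.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isLower, isUpper)

-- | Hypothetical helper type: a coarse label for a character's case.
data CaseKind = Upper | Lower | Uncased
  deriving (Eq, Show)

-- | Classify a character's case using base's isUpper / isLower.
-- (isUpper also accepts title-case letters.)
caseKind :: Char -> CaseKind
caseKind c
  | isUpper c = Upper
  | isLower c = Lower
  | otherwise = Uncased

-- | Pair a character's general category (from base) with its case label.
describe :: Char -> (GeneralCategory, CaseKind)
describe c = (generalCategory c, caseKind c)
```

Loaded into GHCi, `describe 'A'` evaluates to `(UppercaseLetter,Upper)`.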

The unicode-names package contains just one function, getCharacterName, 
for getting the name of a character. It is a separate package because 
the name data makes up a large proportion of the total data.

Both packages use the type "Char" to represent Unicode characters (more 
pedantically, codepoints). In GHC, Char has the range 
['\x0'..'\x10FFFF'], matching the Unicode standard. The packages won't 
work with compilers that restrict Char to a smaller range.
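
The range requirement can be checked with a small sketch that relies only on base. The names unicodeMax, charCoversUnicode and safeChr are hypothetical and are not part of either package.

```haskell
import Data.Char (chr, ord)

-- | Highest code point in the Unicode code space (U+10FFFF).
unicodeMax :: Int
unicodeMax = 0x10FFFF

-- | True when this compiler's Char spans the whole Unicode code space.
charCoversUnicode :: Bool
charCoversUnicode =
  ord (minBound :: Char) == 0 && ord (maxBound :: Char) >= unicodeMax

-- | Convert an Int to a Char only when it lies inside Char's range,
-- so that chr is never called on an out-of-range value.
safeChr :: Int -> Maybe Char
safeChr n
  | n >= ord (minBound :: Char) && n <= ord (maxBound :: Char) = Just (chr n)
  | otherwise = Nothing
```

Under GHC, charCoversUnicode is True, and safeChr 0x110000 is Nothing because that value lies just past the end of the range.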

Hackage:
<http://hackage.haskell.org/cgi-bin/hackage-scripts/package/unicode-properties>
<http://hackage.haskell.org/cgi-bin/hackage-scripts/package/unicode-names>

Source for both packages: <http://code.haskell.org/unicode-properties/>
Most of the data is auto-generated at build time from files downloadable 
from the Unicode web-site.

I expect Don will have them both in Arch Linux within the hour.

-- 
Ashley Yakeley
 