From: Pascal J. Bourguignon <pjb <at> informatimago.com>
Subject: Re: using pathnames containing wildcard characters
Newsgroups: gmane.lisp.clisp.general
Date: Tuesday 31st January 2012 20:56:31 UTC
Don Cohen writes:

> Is there a way to separate CR from LF or create an encoding with that
> property?  We should be able to get back from (directory ...) one
> pathname containing a CR and another containing a LF.
>
> In the past I've always resorted to binary IO in such cases, but that
> doesn't seem to be an option in the case of (directory ...).
> If I had such an encoding then perhaps I would not need to read files
> as bytes and then translate them to characters via code-char.

Unix considers pathnames to be sequences of bytes.  Yes, binary.
Pathname components cannot contain the bytes 0 (NUL) or 47 ("/"), but
otherwise all the other values from 1 to 255 are valid.
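
To make the rule concrete, here is a minimal sketch in Python (not
CLISP code).  The function name is_valid_component is made up for this
illustration, and it checks only the two forbidden byte values named
above, not length limits or the special names "." and "..".

```python
def is_valid_component(raw: bytes) -> bool:
    """True if raw could be a single Unix pathname component, by the rule above."""
    # Byte 0 (NUL) terminates C strings; byte 47 ("/") separates components.
    return len(raw) > 0 and 0 not in raw and 47 not in raw
```

With this, is_valid_component(b"a\rb") and
is_valid_component(b"\xe9*\xe9") are both true, while
is_valid_component(b"a/b") is false.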

File systems will indeed contain pathnames whose bytes are obtained from
encoding strings using various coding systems.  And a pathname
component that contains the bytes 10, 13, and 13+10 in sequence is
perfectly valid.
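
This can be checked from any language that passes bytes to the OS
untouched.  The following is a rough Python sketch, assuming a typical
Unix file system for the temporary directory; create_and_list is a
hypothetical helper, not an existing API.  It creates three files whose
names differ only in LF, CR, and CR+LF, and lists them back as bytes.

```python
import os
import tempfile

def create_and_list(names):
    """Create empty files with the given byte names in a fresh temp dir; return the listing."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        for name in names:
            # A bytes path is handed to the OS without any decoding step.
            with open(os.path.join(tmp_bytes, name), "wb"):
                pass
        # Given a bytes argument, os.listdir returns the names as bytes too.
        return sorted(os.listdir(tmp_bytes))

names = [b"a\nb", b"a\rb", b"a\r\nb"]
listed = create_and_list(names)
```

On such a file system the listing should contain three distinct names,
which is what (directory ...) would need to be able to return.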


So if you want to design a CL physical pathname that is able to
represent all the unix pathnames, you need either to find a way to
encode/decode vectors of bytes into strings, or merely to define some
data type to represent vectors of bytes as valid pathname components.


valid pathname directory n. a string, a list of strings, nil, :wild,
    :unspecific, or some other object defined by the implementation to
    be a valid directory component.

valid pathname name n. a string, nil, :wild, :unspecific, or some other
   object defined by the implementation to be a valid pathname name.
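
As an illustration of the first option, encoding arbitrary bytes into
strings, here is how Python does it with its "surrogateescape" error
handler.  This is a sketch of one existing approach, not something
CLISP provides; the two function names are made up for the example.

```python
def bytes_to_string(raw: bytes) -> str:
    """Decode as UTF-8, mapping undecodable bytes to lone surrogates U+DC80..U+DCFF."""
    return raw.decode("utf-8", errors="surrogateescape")

def string_to_bytes(text: str) -> bytes:
    """Inverse of bytes_to_string: escaped surrogates turn back into the original bytes."""
    return text.encode("utf-8", errors="surrogateescape")

# Every byte sequence survives the round trip, including CR and LF.
raw = b"\xe9*\xe9"
assert string_to_bytes(bytes_to_string(raw)) == raw
```

The price is that the resulting strings may contain lone surrogates,
which are not valid text and cannot be printed or re-encoded without
the same error handler.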


I wouldn't mind allowing vectors of bytes as physical pathname
components, and returning a vector of bytes whenever the pathname
component contains anything other than bytes encoding printable ASCII
characters.  The application can always use babel to convert between
vectors of bytes and strings, if it can determine an encoding and a
mapping for control codes.
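
A minimal sketch of that policy, in Python for brevity.  The name
as_component is hypothetical, and taking "printable ASCII" to mean the
bytes 32 to 126 (space included) is an assumption of this example.

```python
def as_component(raw: bytes):
    """Return a str for printable-ASCII names, otherwise the raw bytes unchanged."""
    # Assumption: "printable ASCII" means bytes 32..126, space included.
    if all(32 <= b <= 126 for b in raw):
        return raw.decode("ascii")
    return raw
```

So as_component(b"notes.txt") gives the string "notes.txt", while
as_component(b"a\rb") and as_component(b"\xe9") give back the bytes,
leaving the choice of encoding to the application.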

But I guess one may argue for an encoding such as URL encoding, which
could be useful for writing wildcard pathname components as strings.

   "%e9*%e9"
vs.
   #(233 42 233)

But "%e*%e9" would be wrong, since "%e" is not a complete escape.
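
A strict parser for such patterns could look like the following Python
sketch.  The name parse_pattern and the token representation are
assumptions of this example, as is the convention that an unescaped "*"
is the wildcard, so a literal asterisk byte would have to be written
"%2a".

```python
import re

# One token is a complete %XX escape, a bare "*", or any other single character.
_TOKEN = re.compile(r"%([0-9A-Fa-f]{2})|(\*)|([^%*])", re.DOTALL)

def parse_pattern(pattern: str):
    """Split a percent-encoded wildcard pattern into byte values and "*" markers."""
    if not pattern.isascii():
        raise ValueError("pattern must be ASCII; escape other bytes as %XX")
    tokens, pos = [], 0
    while pos < len(pattern):
        m = _TOKEN.match(pattern, pos)
        if m is None:
            # A "%" not followed by two hex digits ends up here.
            raise ValueError(f"malformed escape at offset {pos} in {pattern!r}")
        if m.group(1) is not None:
            tokens.append(int(m.group(1), 16))   # escaped byte value
        elif m.group(2) is not None:
            tokens.append("*")                   # wildcard marker
        else:
            tokens.append(ord(m.group(3)))       # literal ASCII byte
        pos = m.end()
    return tokens
```

Here parse_pattern("%e9*%e9") returns [233, "*", 233], and
parse_pattern("%e*%e9") raises ValueError.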

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.

