From: Don Cohen <don-sourceforge-xxzw <at> isis.cs3-inc.com>
Subject: Re: using pathnames containing wildcard characters
Newsgroups: gmane.lisp.clisp.general
Date: Friday 3rd February 2012 19:05:21 UTC
>I notice that CHARSET:ISO-8859-1 is almost right:
   (with-open-file (f "/tmp/bytes" :direction :output :element-type
                      '(unsigned-byte 8) :if-does-not-exist :create)
     (loop for i below 256 do (write-byte i f)))

  This test may have fooled you.  Line-terminator transformation in
  stream functions is different from usage in the FFI or via
  ext:convert-string-to/from-bytes.

I don't understand what you think might be confusing.
I hope you agree that the code above simply writes all 256 8-bit byte
values to a file.  The code that you did not include:
 (with-open-file (f "/tmp/bytes" :external-format CHARSET:ISO-8859-1)
   ;; Compare each character code with its position in the file.
   (loop for i from 0
         for c = (read-char f nil nil)
         while c
         unless (= i (char-code c)) do (princ (cons i (char-code c)))))
 (13 . 10)
shows that reading with external format CHARSET:ISO-8859-1 recovers
every one of those bytes as the character with the same code, with one
exception: CR (13) is read as LF (10).
If I could create an encoding for which the test above printed
nothing, then I would be happy to use it for reading pathnames and lots
of other things that I now read as bytes.
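
To make the property I am after concrete, here is a minimal sketch that
does the same comparison without going through a stream.  It assumes
CLISP's ext:convert-string-from-bytes and ext:convert-string-to-bytes,
and a Unix build where the default line terminator is LF.  The function
name latin-1-mismatches is made up for this example.

```lisp
;; Sketch only.  Assumes CLISP (for the EXT: and CHARSET: symbols) on a
;; Unix build, where the default line terminator is LF.
(defun latin-1-mismatches ()
  "Return two values: the byte values whose decoded character code
differs from the byte, and whether re-encoding gives back the bytes."
  (let ((bytes (make-array 256 :element-type '(unsigned-byte 8))))
    ;; Fill the vector with every possible byte value, 0 through 255.
    (dotimes (i 256) (setf (aref bytes i) i))
    (let* ((string (ext:convert-string-from-bytes bytes charset:iso-8859-1))
           (back (ext:convert-string-to-bytes string charset:iso-8859-1)))
      (values (loop for i below 256
                    unless (= i (char-code (char string i))) collect i)
              (equalp bytes back)))))
```

The interesting question is whether 13 shows up in the first value.
Going by the remark quoted above about ext:convert-string-to/from-bytes,
it should not, which is the behaviour I would like to get from a stream
as well.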

  However, for pathnames, these days I advise against using Latin-1 on
  the sole merit that it happens to be 1:1.  Modern UNIX environments
  use UTF-8 and we've seen enough of those badly programmed apps that
  output "" when they should not.

I don't know how to interpret this "use UTF-8".  It looks to me like
Unix file names are arbitrary sequences of bytes, not restricted to
sequences that are valid UTF-8.  What we need for reading Unix file
names as character strings seems to be the encoding that I wish I had:
one that maps 1-1 between chars and bytes.

  Round-trip is not trivial.  For instance, an ssh or sshfs from Linux
  to MacOS shows a bug *somewhere* among sshfs, bash, readline and one
  of the two OS when you'll discover that  reveals itself as  + a!
  (I noticed this when using backspace in bash within ssh.)

Again, I don't understand what you're trying to tell me here.
Does this have something to do with Lisp or with reading file names?