|
A simple way is to open the file in Notepad on Windows and click "File" -> "Save As". The "Encoding" drop-down in the save dialog shows the encoding the file currently uses :)
If you want a program to decide whether it is a Unicode file, see
http://blog.csdn.net/fmddlmyy/archive/2005/05/04/372148.aspx for some background on Unicode.
Read the file in binary mode and check its BOM:
UTF-8 BOM: EF BB BF
Unicode Big-Endian BOM: FE FF
Unicode Little-Endian BOM: FF FE
Suppose a.txt is a UTF-8 file:
f = open('a.txt', 'rb')
print repr(f.read(3))
f.close()
Check whether the output is '\xef\xbb\xbf'.
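The BOM check described above can be wrapped in a small helper. This is only a sketch under a few assumptions: the name `detect_bom` and the labels it returns are made up for illustration, it is written for Python 3 (the snippets in this thread are Python 2), and it only knows the three BOMs listed above. A file with no BOM can still be valid UTF-8, so a result of `None` says nothing definite about the encoding.

```python
import codecs

# Hypothetical helper (not from the thread): map the three BOMs
# listed above to a label by inspecting the first bytes of the file.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def detect_bom(path):
    """Return a label for the BOM at the start of the file, or None."""
    with open(path, "rb") as f:
        head = f.read(3)  # the longest BOM checked here is 3 bytes
    for bom, label in _BOMS:
        if head.startswith(bom):
            return label
    # No known BOM. Note that a UTF-32 LE file also starts with FF FE
    # and would be reported as "utf-16-le" by this simple check.
    return None
```

Calling `detect_bom('a.txt')` on a file saved by Notepad as "UTF-8" should give `"utf-8"`, since Notepad writes the EF BB BF prefix for that choice.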
Haha :) My explanation may not be very clear, sorry about that :-)
On 07-8-13, gaohawk <gaohawk <at> gmail.com> wrote:
How can I check a file's encoding with Python?
From: violet kz
Sent: 2007-08-13 11:50:39
Cc:
Subject: Re: [python-chinese] UnicodeDecodeError when reading a file as utf8 with codecs
Heh ;-) Sorry, I had my encoding set wrong just now. Check whether your "a.txt" file is really in "utf8" format :)
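One way to run that check from Python is to read the raw bytes and attempt a strict UTF-8 decode. A minimal sketch, assuming Python 3; `is_utf8` is a hypothetical name chosen for this example, not a function from the codecs module. A successful decode only shows that the bytes are valid UTF-8 (a pure ASCII file passes too); it does not prove the file was saved as UTF-8 on purpose.

```python
def is_utf8(path):
    """Return True if the whole file decodes as UTF-8, else False."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True
```

A file that starts with the FF FE byte order mark fails this check, because the byte FF can never appear in valid UTF-8. That may be what happened with the file in this thread, given that the error points at positions 0-1, but the thread does not confirm it.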
On 07-8-13, gaohawk <gaohawk <at> gmail.com> wrote:
Ugh, it looks like the encoding of your own email is broken.
From: violet kz
Sent: 2007-08-13 11:38:23
Cc:
Subject: Re: [python-chinese] UnicodeDecodeError when reading a file as utf8 with codecs
hi,
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1: invalid data
Check whether the file "C:\a.txt" is really UTF-8 encoded. If it is UTF-8 encoded, there should be no problem. If it is not UTF-8 encoded, [remainder unreadable in the archive]
gaohawk wrote:
> Code:
> import codecs
> f = codecs.open(filename="C:\\a.txt", mode='rb', encoding="utf8",
> errors='strict', buffering=1)
> text = f.read()
> print text
> f.close()
> Error message:
> Traceback (most recent call last):
> File "E:\Python25\Lib\test.py", line 6, in <module>
> text = f.read()
> File "E:\Python25\lib\codecs.py", line 606, in read
> return self.reader.read(size)
> File "E:\Python25\lib\codecs.py", line 418, in read
> newchars, decodedbytes = self.decode(data, self.errors)
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-1:
> invalid data
> I have looked in many places about this UnicodeDecodeError [partly unreadable in the archive]. Could the experts on the list help?
> ------------------------------------------------------------------------
> gaohawk
> 2007-08-13
> ------------------------------------------------------------------------
_______________________________________________
python-chinese
Post: send python-chinese <at> lists.python.cn
Subscribe: send subscribe to python-chinese-request <at> lists.python.cn
Unsubscribe: send unsubscribe to python-chinese-request <at> lists.python.cn
Detail Info: http://python.cn/mailman/listinfo/python-chinese
-- /*-- Zhou Kai --*/ /*-- violet.kz <at> gmail.com --*/
|