Check whether a file contains valid UTF-8. Returns 0 for valid UTF-8; prints an error message to STDOUT and returns 1 for invalid UTF-8. If the file cannot be read, writes the I/O error to STDERR and returns 2.
manics/check_utf8.py
Created Feb 4, 2013
#!/usr/bin/env python
# Check whether a file contains valid UTF-8
# From http://stackoverflow.com/a/3269323

import codecs
import sys
import os


def checkFile(filename):
    try:
        with codecs.open(filename, encoding='utf-8', errors='strict') as f:
            for line in f:
                pass
        return 0
    except IOError as e:
        sys.stderr.write('IO error: %s\n' % e)
        return 2
    except UnicodeDecodeError:
        sys.stdout.write('%s contains invalid UTF-8\n' % filename)
        return 1


if __name__ == '__main__':
    if len(sys.argv) != 2:
        p = sys.argv[0]
        sys.stderr.write('Usage: ' + p[p.rfind('/') + 1:] + '
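The script above is cut off in this copy, so the following is only a minimal sketch of the same check for Python 3, assuming the built-in `open` is an acceptable substitute for `codecs.open`. The name `check_file` is hypothetical; the return codes mirror the ones used in `checkFile`.

```python
import sys


def check_file(filename):
    """Return 0 if filename decodes as UTF-8, 1 if it does not, 2 on I/O error."""
    try:
        # errors='strict' makes the decoder raise on the first invalid byte sequence.
        with open(filename, encoding='utf-8', errors='strict') as f:
            for _ in f:
                pass
        return 0
    except OSError as e:
        # Missing file, permission problem, and similar read failures.
        sys.stderr.write('IO error: %s\n' % e)
        return 2
    except UnicodeDecodeError:
        sys.stdout.write('%s contains invalid UTF-8\n' % filename)
        return 1
```

A caller could pass the result straight to `sys.exit(check_file(path))` so that shell scripts can branch on the exit status.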
Related articles
- 1. How to write a check in python to see if file is valid UTF-8?
Could be simpler by using only one line: codecs.open("path/to/file", encoding="utf-8", errors="st...
- 2. How to write a check in python to see if file is ... - Exchangetuts
As stated in title, I would like to check in given file object (opened as binary stream) is valid...
- 3. How can Python check if a file name is in UTF8?
How can Python check if a file name is in UTF8? I have a PHP script that creates a list of files ...
- 4. How to write a check in python to see if file is valid UTF-8?
- 5. Is there a Linux command to find out if a file is UTF-8? - Super User
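One of the related questions above asks about a file object that is already open as a binary stream. A possible sketch for that case, assuming Python 3 and the standard `codecs` incremental decoder, is shown here. The function name `stream_is_utf8` and the `chunk_size` default are hypothetical choices for illustration, not taken from any of the linked answers.

```python
import codecs
import io


def stream_is_utf8(stream, chunk_size=65536):
    """Return True if the binary stream decodes as strict UTF-8."""
    decoder = codecs.getincrementaldecoder('utf-8')(errors='strict')
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            # The decoder buffers a multi-byte sequence split across chunks.
            decoder.decode(chunk)
        # final=True turns a truncated sequence at end of stream into an error.
        decoder.decode(b'', final=True)
    except UnicodeDecodeError:
        return False
    return True
```

Reading in chunks keeps memory use bounded for large files, for example `stream_is_utf8(open(path, 'rb'))` or `stream_is_utf8(io.BytesIO(data))`.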