Check whether a file contains valid UTF-8. Returns 0 for valid ...
Check whether a file contains valid UTF-8. Returns 0 for valid UTF-8, prints an error message to STDOUT and returns 1 for invalid. - check_utf8.py.
manics/check_utf8.py
Created Feb 4, 2013
check_utf8.py
#!/usr/bin/env python
# Check whether a file contains valid UTF-8
# From http://stackoverflow.com/a/3269323

import codecs
import sys
import os

def checkFile(filename):
    try:
        with codecs.open(filename, encoding='utf-8', errors='strict') as f:
            for line in f:
                pass
        return 0
    except IOError as e:
        sys.stderr.write('IO error: %s\n' % e)
        return 2
    except UnicodeDecodeError:
        sys.stdout.write('%s contains invalid UTF-8\n' % filename)
        return 1

if __name__ == '__main__':
    if len(sys.argv) != 2:
        p = sys.argv[0]
        sys.stderr.write('Usage: ' + p[p.rfind('/') + 1:] + '
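The listing above stops partway through its usage message. The following is a minimal, self-contained sketch of the same check for Python 3, not the author's original: it keeps the return codes shown above (0 for valid UTF-8, 1 for invalid, 2 for an I/O error). The names `check_file` and `main`, the wording of the usage line, and the choice to return 2 on a wrong argument count are assumptions made for illustration.

```python
import os
import sys


def check_file(filename):
    """Return 0 if the file decodes as strict UTF-8, 1 if not, 2 on I/O error."""
    try:
        # errors='strict' makes the decoder raise on the first invalid byte sequence.
        with open(filename, encoding='utf-8', errors='strict') as f:
            for _ in f:
                pass
        return 0
    except OSError as e:
        sys.stderr.write('IO error: %s\n' % e)
        return 2
    except UnicodeDecodeError:
        sys.stdout.write('%s contains invalid UTF-8\n' % filename)
        return 1


def main(argv):
    """Hypothetical entry point: expects exactly one file argument."""
    if len(argv) != 2:
        # The usage wording and the return code 2 here are assumptions.
        sys.stderr.write('Usage: %s FILE\n' % os.path.basename(argv[0]))
        return 2
    return check_file(argv[1])
```

To use this as a script, one would typically call `sys.exit(main(sys.argv))` under the usual `if __name__ == '__main__':` guard, so that the return code becomes the process exit status.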