utf-16le[BOM] to utf-8 file solution - GitHub
Gist: ctjoy/utf16leBOM_to_utf8.py (last active Mar 25, 2018)
Source: http://stackoverflow.com/questions/22459020/python-decode-utf-16-file-with-bom

    import codecs

    # Read in binary mode so the BOM bytes arrive untouched.
    with open('utf16lebom_file', 'rb') as f:
        encoded_text = f.read()

    bom = codecs.BOM_UTF16_LE  # print(dir(codecs)) lists the BOM constants for other encodings
    # Make sure the encoding is what you expect; otherwise you'll get wrong data.
    assert encoded_text.startswith(bom)
    encoded_text = encoded_text[len(bom):]  # strip away the BOM
    decoded_text = encoded_text.decode('utf-16le')

    # Write the text back out as UTF-8 (no BOM).
    with open('utf8_file', 'wb') as f:
        f.write(decoded_text.encode('utf8'))
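The same steps can be wrapped in a reusable function. The following is a minimal sketch, not part of the gist: the function name `utf16le_bom_to_utf8` and its `src`/`dst` parameters are hypothetical, and it raises `ValueError` instead of using `assert`, on the assumption that the check should still run when Python is started with `-O`.

```python
import codecs
from pathlib import Path


def utf16le_bom_to_utf8(src, dst):
    """Convert a UTF-16LE file that starts with a BOM into BOM-free UTF-8.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the decoded text.
    """
    raw = Path(src).read_bytes()      # binary read keeps the BOM bytes intact
    bom = codecs.BOM_UTF16_LE         # the two bytes FF FE
    if not raw.startswith(bom):
        raise ValueError(f"{src} does not start with a UTF-16LE BOM")
    text = raw[len(bom):].decode("utf-16-le")   # decode everything after the BOM
    Path(dst).write_bytes(text.encode("utf-8"))  # plain UTF-8, no BOM written
    return text
```

A call such as `utf16le_bom_to_utf8('utf16lebom_file', 'utf8_file')` would mirror the file names used in the gist.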
Further reading
- 1 Converting UTF with a BOM in Python - 程式人生
I want to convert them (ideally) to UTF-8 without a BOM. It seems codecs.StreamRecoder(stream, encode, decode, Reader, Writer, errors...
- 2 Convert UTF-8 with BOM to UTF-8 with no BOM in Python
- 3 Why Python 3 doesn't write the Unicode BOM - Peter Bloomfield
According to the Python documentation on reading and writing Unicode data: Some encodings, such a...
- 4 Python: About the Unicode BOM - 傑克! 真是太神奇了! - 痞客邦
As for UTF-8 encoding: it converts Unicode-encoded string data into a sequence of 8-bit units (conversion rules in the table below: UTF-8 ... When writing a file, you have to write a BOM yourself first if you need one ( write('\ufeff') ).
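Two of the entries in the list above concern the UTF-8 BOM: removing it, and writing it yourself with `write('\ufeff')`. Both can be sketched with Python's standard codecs. The helper names `strip_utf8_bom` and `add_utf8_bom` are hypothetical and do not come from any of the linked articles.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return UTF-8 bytes without a leading BOM (hypothetical helper).

    The 'utf-8-sig' codec skips one leading BOM when decoding, and
    leaves input that has no BOM unchanged.
    """
    return data.decode("utf-8-sig").encode("utf-8")


def add_utf8_bom(text: str) -> bytes:
    """Encode text as UTF-8 with a leading BOM (hypothetical helper).

    Prepending U+FEFF before encoding has the same effect as calling
    write('\ufeff') on a UTF-8 text file before writing the content.
    """
    return ("\ufeff" + text).encode("utf-8")
```

Encoding with `'utf-8-sig'` is another way to get the BOM written automatically, which may be simpler than prepending `'\ufeff'` by hand.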