Adding BOM (unicode signature) while saving file in python
Asked 11 years, 7 months ago. Modified 7 months ago. Viewed 28k times.

Question (score 34):

How can I add BOM (unicode signature) while saving file in python:

    file_old = open('old.txt', mode='r', encoding='utf-8')
    file_new = open('new.txt', mode='w', encoding='utf-16-le')
    file_new.write(file_old.read())

I need to convert file to utf-16-le + BOM. Now script is working great, except that there is no BOM.

Tags: python

Asked Mar 5, 2011 at 8:31 by Qiao

Comments:
- for line in file_old: file_new.write(line) is vastly more memory efficient. And why can't you use any of the numerous software that already does this? – user395760, Mar 5, 2011 at 8:41
- Text editors have to open file to "save as", and it is pretty big. Other software is shareware, or hard to find. Besides, I'm just learning python. Save by line may be more efficient, but is more complex. – Qiao, Mar 5, 2011 at 8:47
- If the file is pretty big, that may be all the more reason to convert it line by line -- despite its "complexity". – martineau, Mar 5, 2011 at 17:30
- And it also depends on how often script is executed. In my case file is 100mb, converted in <10 seconds once a month. – Qiao, Mar 5, 2011 at 17:51
- (+1) @JohnMachin actually had the correct answer here.
  – Omnifarious, Dec 7, 2012 at 22:10

9 Answers

Answer (score 45):

Write it directly at the beginning of the file:

    file_new.write('\ufeff')

Answered Mar 5, 2011 at 9:11 by Ocaso Protal

Comments:
- (+1) Thank, it is often very hard to figure out such simple thing for first time. – Qiao, Mar 5, 2011 at 9:29
- (+1) @Qiao: It's not that simple. See my answer. – John Machin, Apr 20, 2011 at 12:35
- (+1) @Ocaso: On Python 2.x the lack of u before '\ufeff' might cause a Unicode encoding error, specifically if you are explicitly setting the encoding via codecs.open(). – ccpizza, Mar 13, 2016 at 14:29

Answer (score 39):

It's better to use constants from 'codecs' module.

    import codecs
    f.write(codecs.BOM_UTF16_LE)

Answered Apr 20, 2011 at 3:58 by kriomant

Comments:
- (+7) This is actually the wrong answer for two reasons. First, @JohnMachin has the write answer. Don't use 'utf-16le', just use 'utf-18'. Secondly (and more importantly) given that the OP's code sets an encoding this won't result in the correct behavior at all. Especially on Python 3. You are giving bytes to a thing that wants a str. – Omnifarious, Dec 7, 2012 at 21:59
- Err.. 'right', not 'write'. That's what happens when you play with code in one window and write comments in another. – Omnifarious, Dec 7, 2012 at 22:10
- (+3) @Omnifarious: And utf-16, not utf-18. :p – jamesdlin, Apr 2, 2013 at 18:37
- @jamesdlin: Oh, oops! I need to do better about typos like that. – Omnifarious, Apr 2, 2013 at 20:13
- (+5) In Python 3, using (default) text-mode open, it errors because you toss it bytes, not string, as Omnifarious already hinted. Casting the bytes to a string, as in f.write(str(codecs.BOM_UTF8)), gets you b'\xef\xbb\xbf' at the start of your file.
  – RolfBly, Aug 6, 2014 at 10:08

Answer (score 22):

Why do you think you need to specifically make it UTF16LE? Just use 'utf16' as the encoding, Python will write it in your endianness with the appropriate BOM, and all the consumer needs to be told is that the file is UTF-16 ... that's the whole point of having a BOM.

If the consumer is insisting that the file must be encoded in UTF16LE, then you don't need a BOM.

If the file is written the way that you specify, and the consumer opens it with UTF16LE encoding, they will get a \ufeff at the start of the file, which is a nuisance, and needs to be ignored.

Answered Apr 20, 2011 at 6:22 by John Machin

Comments:
- "Why do you think you need to specifically make it UTF16LE?" winapi can be very particular in what it wants, it only accepts UCS-2 LE with BOM – jrh, Apr 5, 2021 at 20:38

Answer (score 22):

Just choose the encoding with BOM:

    with codecs.open('outputfile.csv', 'w', 'utf-8-sig') as f:
        f.write('a,é')

(In python 3 you can drop the codecs.)

Answered Dec 12, 2017 at 10:49 by Karalga

Comments:
- (+1) or replace codecs module by io, which works on both python 2 and 3 – Danilo Silva, Jan 24, 2019 at 17:24
- (+1) This 2017 answer should be voted above the 2011 answers, – Daniel Chin, Feb 27 at 18:53

Answer (score 6):

I had a similar situation where a 3rd party app did not accept the file I generated unless it had a BOM.
For some reason in Python 2.7 the following does not work for me

    write('\ufeff')

I had to substitute it with

    write('\xff\xfe')

and that is the same as

    write(codecs.BOM_UTF16_LE)

my final output file was written with the following code

    import codecs
    mytext = "Help me"
    with open("c:\\temp\\myFile.txt", 'w') as f:
        f.write(codecs.BOM_UTF16_LE)
        f.write(mytext.encode('utf-16-le'))

This answer may be useless for the original asker but it may help someone like me who stumbles upon this issue

Answered Aug 14, 2013 at 18:14 by cage

Answer (score 5):

For UTF-8 with BOM you can use:

    def addUTF8Bom(filename):
        f = codecs.open(filename, 'r', 'utf-8')
        content = f.read()
        f.close()
        f2 = codecs.open(filename, 'w', 'utf-8')
        f2.write(u'\ufeff')
        f2.write(content)
        f2.close()

Answered Apr 27, 2015 at 15:36 by vitperov

Comments:
- (+1) Solid! Was writing to .csv file in Python which Excel never really opened correctly due to missing BOM, now it's all good. Thanks! – pixelphantom, Jan 6, 2016 at 19:00

Answer (score 0):

vitperov's answer for python 3:

    def add_utf8_bom(filename):
        with codecs.open(filename, 'r', 'utf-8') as f:
            content = f.read()
        with codecs.open(filename, 'w', 'utf-8') as f2:
            f2.write('\ufeff')
            f2.write(content)
        return

Answered Mar 25, 2017 at 13:07 by GuySoft

Answer (score 0):

TRY IT:

    def add_bom(file, bom: bytes):
        with open(file, 'r+b') as f:
            org_contents = f.read()
            f.seek(0)
            f.write(bom + org_contents)

USAGE:

    import codecs
    ...
    file = 'test.txt'
    with open(file, 'w', encoding='utf-8') as f:  # without BOM
        f.write('A')

    add_bom(file, codecs.BOM_UTF16_LE)

    # TEST
    with open(file, 'rb') as f:
        print(f.read())  # b'\xff\xfeA'

Answered Nov 20, 2019 at 6:30 by Carson

Answer (score 0):

My method of adding BOM is by writing ansi characters 'ï»¿' at the beginning of the file, then open file in UTF-8 and write desired data:

    # Create file with ANSI encoding
    file = open("file.txt", "a", encoding="ansi", errors='ignore')
    # Add BOM at the beginning of the file BOM 0xEFBBBF
    file.write("ï»¿")
    # Close file
    file.close()
    # Open file in UTF-8 and write data
    file = open("file.txt", "a", encoding="utf-8", errors='ignore')
    file.write("Write your data here, Enjoy!!")

Answered Feb 21 at 13:11 by Ahmed KHABER
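Putting the top-voted suggestion (write '\ufeff' first) together with the line-by-line advice from the question comments, a minimal sketch of the conversion could look like the following. The function name convert_to_utf16le_bom and the temporary file names are hypothetical, chosen only for illustration, and the sketch assumes Python 3 and an input file that does not itself start with a UTF-8 BOM.

```python
import codecs
import os
import tempfile


def convert_to_utf16le_bom(src_path, dst_path):
    """Copy a UTF-8 text file to UTF-16-LE, writing a BOM first (hypothetical helper)."""
    # newline='' turns off newline translation, so line endings are copied unchanged.
    with open(src_path, 'r', encoding='utf-8', newline='') as src, \
            open(dst_path, 'w', encoding='utf-16-le', newline='') as dst:
        dst.write('\ufeff')  # U+FEFF is encoded as b'\xff\xfe' by the utf-16-le codec
        for line in src:     # stream line by line instead of reading the whole file
            dst.write(line)


# Demo on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'old.txt')
    dst = os.path.join(tmp, 'new.txt')
    with open(src, 'w', encoding='utf-8', newline='') as f:
        f.write('first line\nsecond line\n')
    convert_to_utf16le_bom(src, dst)
    with open(dst, 'rb') as f:
        result = f.read()

print(result.startswith(codecs.BOM_UTF16_LE))  # True
```

If the input might already carry a UTF-8 BOM, opening it with encoding='utf-8-sig' instead would strip that BOM on read, so it is not copied into the output as a second U+FEFF.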
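The disagreement in the answers comes down to which codec writes or consumes the BOM. As a rough illustration (not taken from any of the answers; the sample text and variable names are arbitrary), the standard codecs can be compared directly on bytes:

```python
import codecs

text = 'a,é'

# 'utf-16' prepends a BOM in the platform's native byte order.
native = text.encode('utf-16')
has_bom = native[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

# 'utf-16-le' fixes the byte order and writes no BOM at all.
le = text.encode('utf-16-le')

# 'utf-8-sig' prepends the UTF-8 signature b'\xef\xbb\xbf'.
sig = text.encode('utf-8-sig')

# Decoding: the generic 'utf-16' codec consumes the BOM, while the explicit
# 'utf-16-le' codec hands it back as a leading '\ufeff' character.
with_bom = codecs.BOM_UTF16_LE + le
decoded_generic = with_bom.decode('utf-16')
decoded_explicit = with_bom.decode('utf-16-le')

print(has_bom, le[:2] == codecs.BOM_UTF16_LE, sig[:3] == codecs.BOM_UTF8)
print(decoded_generic == text, decoded_explicit == '\ufeff' + text)
```

This matches John Machin's point: a consumer that opens the file as 'utf-16-le' sees the BOM as an extra character, whereas one that opens it as 'utf-16' does not.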
Further reading
- 1. utf-16le[BOM] to utf-8 file solution - GitHub
http://stackoverflow.com/questions/22459020/python-decode-utf-16-file-with-bom. import codecs. en...
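- 2. [python] Solving the encoding problem when generating a CSV file (with BOM) - JysBlog
When we generate a CSV in UTF-8, no BOM is written at the start of the file, so Excel reads it according to Unicode encoding and garbled characters appear. Implementation. Below is a simple Python program that generates a CSV:.
- 3. In Python, converting UTF with BOM - 程式人生
I want to convert them (ideally) to UTF-8 without BOM. It seems codecs.StreamRecoder(stream, encode, decode, Reader, Writer, errors...
- 4. Python: About the Unicode BOM - 傑克! 真是太神奇了! - 痞客邦
As for UTF-8 encoding: it converts Unicode-encoded string data into sequences of 8-bit units (the conversion rules are in the table below: UTF-8 ... When writing a file, you have to write a BOM yourself first, as needed ( write('\ufeff') ).
- 5. python utf8 bom, converting UTF-8 without BOM in Python to one with ...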
I have a set of files which are usually UTF-8 with BOM. ... that can take any known Python encodi...
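The CSV case mentioned in item 2 and in the comments on the thread can be sketched with the 'utf-8-sig' codec. This assumes Python 3 and that the consuming spreadsheet program uses the UTF-8 BOM to detect the encoding, as the snippets above report; the helper name write_csv_with_bom and the sample rows are made up for illustration.

```python
import codecs
import csv
import os
import tempfile


def write_csv_with_bom(path, rows):
    """Write rows as UTF-8 CSV preceded by a BOM (hypothetical helper)."""
    # 'utf-8-sig' emits b'\xef\xbb\xbf' once, before the first write.
    # newline='' is what the csv module documentation recommends for file objects.
    with open(path, 'w', encoding='utf-8-sig', newline='') as f:
        csv.writer(f).writerows(rows)


rows = [['name', 'city'], ['José', 'Málaga']]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'out.csv')
    write_csv_with_bom(path, rows)
    with open(path, 'rb') as f:
        raw = f.read()
    # Reading back with 'utf-8-sig' strips the BOM again.
    with open(path, 'r', encoding='utf-8-sig', newline='') as f:
        read_back = list(csv.reader(f))

print(raw.startswith(codecs.BOM_UTF8))  # True
```

Reading the file back with plain 'utf-8' instead would leave '\ufeff' glued to the first header cell, which is the same nuisance described in the thread for UTF-16-LE.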