Remove \ufeff from a string in Python | bobbyhadz
Borislav Hadzhiev
Last updated: Aug 14, 2022

Use the str.replace() method to remove the \ufeff BOM character from a string, e.g. result = my_str.replace('\ufeff', ''). The replace() method removes the \ufeff character from the string by replacing it with an empty string.

main.py
```python
# ✅ remove \ufeff from a string
my_str = '\ufefffirst line'

result = my_str.replace('\ufeff', '')
print(repr(result))  # 👉️ 'first line'

# -----------------------------------------

# ✅ remove \ufeff when reading from a file
# 👇️ explicitly set encoding to utf-8-sig
with open('example.txt', 'r', encoding='utf-8-sig') as f:
    lines = f.readlines()
    print(lines)
```

The \ufeff character (U+FEFF) is a byte order mark (BOM). Inside text it is interpreted as a zero-width non-breaking space, so it is invisible when printed. The BOM character causes an issue when we use an incorrect codec to decode bytes that were encoded using a different codec. For example, a file saved as UTF-8 with a BOM and then read with the plain utf-8 codec keeps the BOM as the first character of the decoded string.

If you have a string that contains a BOM character, use the str.replace() method to remove it.

main.py
```python
my_str = '\ufefffirst line'

result = my_str.replace('\ufeff', '')
print(repr(result))  # 👉️ 'first line'
```

The str.replace method returns a copy of the string with all occurrences of a substring replaced by the provided replacement.

The method takes the following parameters:

| Name  | Description                                                  |
|-------|--------------------------------------------------------------|
| old   | The substring we want to replace in the string               |
| new   | The replacement for each occurrence of old                   |
| count | Only the first count occurrences are replaced (optional)     |

The method doesn't change the original string. Strings are immutable in Python.

If you got the error "UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff'" when working with text read from a file, explicitly set the encoding keyword argument to utf-8-sig when opening the file.

main.py
```python
with open('example.txt', 'r', encoding='utf-8-sig') as f:
    lines = f.readlines()
    print(lines)
```

The open() function takes an encoding keyword argument, which can be set to utf-8-sig to treat the byte order mark as metadata instead of as part of the text.

When decoding, the utf-8-sig codec skips the BOM (the three bytes EF BB BF in UTF-8) if it appears at the very start of the file.

When using the utf-8 encoding, the use of the byte order mark (BOM) is discouraged and should be avoided.
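To see the difference between the two codecs without creating a file, here is a minimal sketch that builds the bytes in memory. The names `raw`, `with_bom`, `without_bom` and the helper `strip_leading_bom` are hypothetical and not part of the original article; `codecs.BOM_UTF8` is the standard-library constant holding the UTF-8 BOM bytes.

```python
import codecs

# Bytes as a BOM-writing editor might save them:
# the UTF-8 BOM followed by the encoded text.
raw = codecs.BOM_UTF8 + 'first line'.encode('utf-8')

# Plain utf-8 keeps the BOM as a leading '\ufeff' character.
with_bom = raw.decode('utf-8')
print(repr(with_bom))  # '\ufefffirst line'

# utf-8-sig drops a leading BOM while decoding.
without_bom = raw.decode('utf-8-sig')
print(repr(without_bom))  # 'first line'


def strip_leading_bom(text):
    # Hypothetical helper: remove a single BOM at the start of the string
    # and leave any U+FEFF characters elsewhere untouched.
    if text.startswith('\ufeff'):
        return text[1:]
    return text


print(repr(strip_leading_bom(with_bom)))  # 'first line'
```

Note that str.replace() removes every \ufeff in the string, while a helper like the one above only touches the first character. Which behavior is appropriate depends on whether U+FEFF can legitimately appear later in your data.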