UTF-8 vs UTF-8 with BOM - Super User


Asked 2 years, 4 months ago. Modified 3 months ago. Viewed 10k times.

Question (score: 4)

The latest Notepad.exe has a Save as UTF-8 and UTF-8 with BOM.

Is UTF-8 with BOM the old UTF? What is UTF-8 now?

Tags: windows-10, notepad
asked May 21, 2020 at 2:38 by OldGeezer

Comments:

(+1) Different site but same question answered here: stackoverflow.com/questions/2223882/… – MC10, May 21, 2020 at 2:49
(+1) This answer also answers that. No need for the downvote either; good question for this site as well. – Giacomo1968, May 21, 2020 at 3:28
docs.microsoft.com/en-us/windows/win32/api/winbase/… – Mark, May 21, 2020 at 5:45

2 Answers

Answer (score: 7)

UTF-8 is UTF-8 regardless of whether a BOM exists.

Saving a file with a BOM (byte order mark) is not really needed for UTF-8.

The fact that Notepad allows the saving of files in “UTF-8” or “UTF-8 with BOM” seems to be an option that exists to allow flexibility in cases where a BOM is needed. But in general, just saving the file without a BOM, meaning plain UTF-8, is really the best way to handle text files with UTF-8 content.
As explained on the Wikipedia page for byte order mark:

“BOM use is optional. Its presence interferes with the use of UTF-8 by software that does not expect non-ASCII bytes at the start of a file but that could otherwise handle the text stream.”

And the article delves deeper into it by stating the following:

“The UTF-8 representation of the BOM is the (hexadecimal) byte sequence 0xEF, 0xBB, 0xBF.

The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use. Byte order has no meaning in UTF-8, so its only use in UTF-8 is to signal at the start that the text stream is encoded in UTF-8, or that it was converted to UTF-8 from a stream that contained an optional BOM. The standard also does not recommend removing a BOM when it is there, so that round-tripping between encodings does not lose information, and so that code that relies on it continues to work. The IETF recommends that if a protocol either (a) always uses UTF-8, or (b) has some other way to indicate what encoding is being used, then it "SHOULD forbid use of U+FEFF as a signature."

Not using a BOM allows text to be backwards-compatible with some software that is not Unicode-aware. Examples include programming languages that permit non-ASCII bytes in string literals but not at the start of the file.”

As for why Microsoft cares about saving UTF-8 with a BOM in Notepad, this explains it well. It seems to be a specific requirement of Microsoft programming tools and not of any non-Microsoft tool out there:

“Microsoft compilers and interpreters, and many pieces of software on Microsoft Windows such as Notepad treat the BOM as a required magic number rather than use heuristics. These tools add a BOM when saving text as UTF-8, and cannot interpret UTF-8 unless the BOM is present or the file contains only ASCII. Google Docs also adds a BOM when converting a document to a plain text file for download.”

So unless you explicitly need a UTF-8 file to be saved with a BOM, just don’t worry about that saving option.
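
The three-byte sequence quoted above is easy to check for directly. The following is a minimal sketch in Python, assuming the data fits in memory; `has_utf8_bom` and `strip_utf8_bom` are hypothetical helper names used only for illustration and do not come from any tool mentioned in this thread. `codecs.BOM_UTF8` and the `utf-8-sig` codec are part of Python's standard library.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data

plain = "héllo".encode("utf-8")          # UTF-8 without BOM
with_bom = "héllo".encode("utf-8-sig")   # the utf-8-sig codec prepends the BOM

# The only difference between the two files is the three leading bytes.
assert with_bom == codecs.BOM_UTF8 + plain
assert strip_utf8_bom(with_bom) == plain

# Decoding with utf-8-sig accepts both forms and drops the BOM if present.
assert with_bom.decode("utf-8-sig") == "héllo"
assert plain.decode("utf-8-sig") == "héllo"

# Decoding with plain utf-8 keeps the BOM as U+FEFF at the start of the text.
assert with_bom.decode("utf-8") == "\ufeffhéllo"
```

The last assertion shows the kind of interference the quote describes: software that decodes the bytes as ordinary UTF-8 sees an extra U+FEFF character at the start of the text.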

edited Jul 3 at 22:00, answered May 21, 2020 at 3:54 by Giacomo1968

Comments:

(+2) I wonder why standardizing on file metadata to specify the encoding type is a poorer choice than making everyone add all the extra logic to infer the actual encoding in use. – OldGeezer, May 21, 2020 at 3:58
@OldGeezer Because metadata can be fudged and “lie.” It is better to create a standard that doesn’t require metadata for file content parsing than hope that every application in the world, new and old, can understand that newly introduced metadata. – Giacomo1968, May 21, 2020 at 4:06
(+1) @OldGeezer Metadata doesn't transfer well. Upload your file to a website and all metadata, except for filename, is lost. And BOM isn't perfect either, it's fine unless another encoding happens to interpret it as correct characters and you have to use heuristics anyway. Compatibility with legacy standards is hard. – gronostaj, May 21, 2020 at 8:45
AutoHotkey requires BOM in its configuration file (if you use extended UTF-8 characters). So even though Notepad displays it correctly without BOM, it will not work until you save it with "UTF-8 with BOM" encoding. – AxelBregnsbo, Aug 8 at 7:52

Answer (score: -3)

The other answer is wrong. It is some political thing.

ANSI is the default text format in Windows and has been for 36 years.

In Windows files are assumed to be ANSI. Therefore you always use a BOM. Unix programs that can't handle BOMs are not Unicode compliant.

I write text editors. If the user doesn't specify it is ANSI - ALWAYS.

Assuming you will get BOMless Unicode means you have to call https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-istextunicode to guess the format. Hardly proper programming.

edited May 22, 2020 at 6:23, answered May 21, 2020 at 7:41 by Mark

Comments:

(+5) "In Windows files are assumed to be ANSI [...] If the user doesn't specify it is ANSI - ALWAYS" - either you're referring to some subset of Windows software (and this answer should clarify what subset it is) or this is incorrect. All competent text editors can heuristically detect UTF-8 without BOM, regardless of platform. Even Notepad does (tested with Windows 10 v1909 build 18363.836).
– gronostaj, May 21, 2020 at 7:58
(+4) That's your opinion, not a fact. I've literally created a UTF-8 file with non-ASCII characters in Sublime Text, confirmed in hex view that there is no BOM and some characters are encoded multibyte, and then opened that file in Notepad. It worked just fine. Whether you like it or not, it's just not true that Windows software assumes ANSI unless indicated otherwise by BOM. – gronostaj, May 21, 2020 at 8:21
(+2) Method of creating the file is irrelevant. – gronostaj, May 21, 2020 at 8:39
(+2) Let me repeat: Notepad will correctly open a UTF-8 file without BOM, even on Windows 7 SP1. So it's not assuming ANSI since at least 2011. Your (original) opening sentence is factually incorrect. Notepad on Windows 10 will also by default save as UTF-8 without BOM, so your (new) opening sentence is also incorrect. – gronostaj, May 22, 2020 at 7:32
(+3) No, it won't. UTF-8 without BOM is the default for saving in Windows 10 v1909. I also don't see how the other answer is a "Unix answer". – gronostaj, May 22, 2020 at 8:49
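
The comments above refer to detecting BOM-less UTF-8 heuristically. One common approach can be sketched in Python as follows: honour a BOM if one is present, otherwise attempt a strict UTF-8 decode, and fall back to a legacy code page only if that fails. This is an illustration under stated assumptions, not a description of how Notepad or any other editor actually detects encodings. `guess_decode` is a hypothetical name, and `cp1252` is an assumed stand-in for "ANSI", which on Windows depends on the system locale.

```python
import codecs
from typing import Tuple

def guess_decode(data: bytes, fallback: str = "cp1252") -> Tuple[str, str]:
    """Decode data and return (text, encoding_used).

    Order of checks:
      1. A leading UTF-8 BOM is taken as an explicit signature.
      2. Otherwise a strict UTF-8 decode is attempted.
      3. If that fails, the assumed legacy code page is used.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Bytes that are not valid UTF-8: assume the legacy encoding.
        return data.decode(fallback, errors="replace"), fallback

# The same character stored three different ways.
print(guess_decode("é".encode("utf-8")))
print(guess_decode("é".encode("utf-8-sig")))
print(guess_decode("é".encode("cp1252")))
```

The check works because non-ASCII text in a legacy single-byte encoding rarely happens to form valid UTF-8 multibyte sequences, so a successful strict decode is strong evidence. It is still a guess: pure ASCII data is reported as UTF-8, and a legacy file can occasionally pass the check by coincidence.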


