How to add a UTF-8 BOM in Java? - Stack Overflow


Asked 11 years, 10 months ago. Modified 28 days ago. Viewed 86k times.

Question (score 28)

I have a Java stored procedure which fetches records from a table using a ResultSet object and creates a CSV file.

    BLOB retBLOB = BLOB.createTemporary(conn, true, BLOB.DURATION_SESSION);
    retBLOB.open(BLOB.MODE_READWRITE);
    OutputStream bOut = retBLOB.setBinaryStream(0L);
    ZipOutputStream zipOut = new ZipOutputStream(bOut);
    PrintStream out = new PrintStream(zipOut, false, "UTF-8");
    out.write('\ufeff');
    out.flush();
    zipOut.putNextEntry(new ZipEntry("filename.csv"));
    while (rs.next()) {
        out.print("\"" + rs.getString(i) + "\"");
        out.print(",");
    }
    out.flush();
    zipOut.closeEntry();
    zipOut.close();
    retBLOB.close();
    return retBLOB;

But the generated CSV file doesn't show the correct German characters. The Oracle database also has an NLS_CHARACTERSET value of UTF8.

Please suggest.

Tags: java, character-encoding, oracle10g, byte-order-mark

edited Mar 28, 2020 at 19:41 by informatik01; asked Dec 8, 2010 at 15:10 by Fadd

Comments:

Just in case you haven't come across this before, note that the Unicode standard does not require or recommend using a BOM with UTF-8. It isn't illegal, either, but shouldn't be used indiscriminately. See here for the details, including some guidelines on when and where to use it. If you are trying to view the csv file in Windows, this is probably a valid use of the BOM.
– Marcelo Cantos, Dec 8, 2010 at 15:16

Yes, we are trying to view the csv in Windows, but the generated csv still shows garbled characters for German characters. Is this the right way to set the BOM?
– Fadd, Dec 8, 2010 at 15:20

Yes, that's right. The Unicode standard recommends against using a so-called BOM (it isn't really) with UTF-8.
– tchrist, Dec 8, 2010 at 17:05

@tchrist: it recommends against using a BOM when dealing with software and protocols that expect ASCII-only chars. If the OP knows that the Windows software he's using will use the BOM to detect that the file is actually encoded in UTF-8, then adding one is fine (we don't care about the fact that it ain't a BOM, we care about the fact that it can allow some software to detect that the encoding is UTF-8). Also note that if you add a BOM to UTF-8 and some software fails, then that software is broken, because a BOM at the beginning of a UTF-8 stream is perfectly valid.
– SyntaxT3rr0r, Dec 8, 2010 at 17:20

For the completeness of the BOM discussion: Excel 2003 strictly requires the BOM in UTF-8 encoded CSV files. Otherwise multibyte chars are unreadable.
– Michael-O, Jan 9, 2012 at 14:00

8 Answers

Answer (score 78)

    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(...), StandardCharsets.UTF_8));
    out.write('\ufeff');
    out.write(...);

This correctly writes out 0xEF 0xBB 0xBF to the file, which is the UTF-8 representation of the BOM.
edited Jul 5, 2017 at 12:39 by Julien H. - SonarSource Team; answered Nov 14, 2011 at 11:18 by astro

Comments:

This code is sensitive to the default platform encoding. On Windows, I ended up with 0x3F written to the file. The correct way to get the BufferedWriter is:

    BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(theFile), StandardCharsets.UTF_8))

– Julien H. - SonarSource Team, Jul 5, 2017 at 12:22

Answer (score 17)

Just in case people are using PrintStreams, you need to do it a little differently. While a Writer will do some magic to convert a single character into 3 bytes, a PrintStream's write(int) requires all 3 bytes of the UTF-8 BOM individually:

    // Print UTF-8 BOM; write(int) emits only the low-order byte of each value
    PrintStream out = System.out;
    out.write('\ufeef'); // emits 0xef
    out.write('\ufebb'); // emits 0xbb
    out.write('\ufebf'); // emits 0xbf

Alternatively, you can use the hex values for those directly:

    PrintStream out = System.out;
    out.write(0xef); // emits 0xef
    out.write(0xbb); // emits 0xbb
    out.write(0xbf); // emits 0xbf

answered Mar 30, 2016 at 14:29 by Christopher Schultz

Answer (score 12)

To write a BOM in UTF-8 you need PrintStream.print(), not PrintStream.write().

Also, if you want to have a BOM in your csv file, I guess you need to print the BOM after putNextEntry().

answered Dec 8, 2010 at 15:41 by axtavt

Comments:

Aren't all PrintStreams fundamentally flawed because they discard all errors that may occur on the stream, including I/O errors, full filesystems, network interruptions, and encoding mismatches? If this is not true, could you please tell me how to make them reliable (because I want to use them)? But if it is true, could you please explain when it could ever be appropriate to use an output method that suppresses correctness concerns? This is a serious question, because I don't understand why this was set up to be so dangerous. Thanks for any insights.
– tchrist, Dec 8, 2010 at 17:09

@tchrist - it is true that PrintStreams suppress errors. However... 1) they are not entirely discarded - you can check to see if an error has occurred. 2) There are cases where you don't need to know about errors. An indisputable case is when you are sending characters to a stream that is writing to an in-memory buffer.
– Stephen C, Jan 15, 2013 at 22:46

@tchrist I guess this is all caused by using checked exceptions. Normally, you'd just throw on any error and be happy. You could make an existing PrintStream "safe" by wrapping each call, adding checkError, and conditionally throwing. But the information about the exception is lost. So yes, PrintStream is hopeless crap.
– maaartinus, Jul 16, 2014 at 10:15

Answer (score 11)

PrintStream#print

I think that out.write('\ufeff'); should actually be out.print('\ufeff');, calling the java.io.PrintStream#print method.

According to the javadoc, the write(int) method actually writes a byte ... without any character encoding. So out.write('\ufeff'); writes the byte 0xff. By contrast, the print(char) method encodes the character as one or more bytes using the stream's encoding, and then writes those bytes.

As noted in section 23.8 of the Unicode 9 specification, the BOM for UTF-8 is EF BB BF. That sequence is what you get when using UTF-8 encoding on '\ufeff'. See: Why UTF-8 BOM bytes efbbbf can be replaced by \ufeff?.

edited Jul 28, 2021 at 22:44 by Basil Bourque; answered Dec 8, 2010 at 15:42 by Stephen C

Comments:

Isn't the only safe way to do encoded output in Java to use the rarely-seen OutputStreamWriter(OutputStream out, CharsetEncoder enc) form of the constructor, the only one of the four with an explicit CharsetEncoder argument, and never to use the PrintStream that you've recommended here?
– tchrist, Dec 8, 2010 at 17:13

@tchrist - 1) No. 2) I didn't recommend PrintStream. I simply said how to do what the OP asked to do using the PrintStream he was already using. 3) In this case PrintStream should be safe because it is followed by other actions that will cause writes to the underlying stream (socket) and throw an exception if the previous PrintStream writes had silently failed.
– Stephen C, Jan 15, 2013 at 22:54

Answer (score 7)

Add this at the start of the CSV string:

    String CSV = "";
    byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
    // Decode explicitly as UTF-8; new String(BOM) would use the platform default charset.
    // The resulting string must then also be written out as UTF-8.
    CSV = new String(BOM, StandardCharsets.UTF_8) + CSV;

This works for me.

edited Feb 11, 2021 at 14:20; answered Jul 15, 2020 at 15:48 by Silent

Answer (score 2)

If you just want to modify the same file (without creating a new file and deleting the old one, as I had issues with that):

    private void addBOM(File fileInput) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(fileInput, "rws")) {
            // Read the whole file into memory, then rewrite it with the BOM in front
            byte[] text = new byte[(int) file.length()];
            file.readFully(text);
            file.seek(0);
            byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
            file.write(bom);
            file.write(text);
        }
    }

edited Sep 13 at 11:21; answered Jun 24, 2021 at 14:03 by timguy

Answer (score 0)

In my case it works with the code:

    PrintWriter out = new PrintWriter(new File(filePath), "UTF-8");
    out.write(csvContent);
    out.flush();
    out.close();

answered Dec 19, 2013 at 9:01 by Rocio

Answer (score 0)

Here is a simple way to prepend a BOM header to any file:

    private static void appendBOM(File file) throws Exception {
        File bomFile = new File(file + ".bom");
        try (FileOutputStream output = new FileOutputStream(bomFile, true)) {
            // FileUtils comes from Apache Commons IO
            byte[] bytes = FileUtils.readFileToByteArray(file);
            output.write('\ufeef'); // emits 0xef
            output.write('\ufebb'); // emits 0xbb
            output.write('\ufebf'); // emits 0xbf
            output.write(bytes);
            output.flush();
        }
        file.delete();
        bomFile.renameTo(file);
    }

answered Dec 22, 2020 at 15:24 by David
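To make the write() versus print() distinction from the answers above concrete, here is a minimal sketch, not anyone's definitive implementation. It uses in-memory streams instead of the Oracle BLOB from the question, and the class name BomDemo, the method names, the entry name, and the sample CSV text are all made up for illustration. It assumes Java 9 or later because of InputStream.readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; none of these names come from the question.
final class BomDemo {

    /** Bytes that reach the sink when the BOM is sent through PrintStream.write(int). */
    static byte[] viaWrite() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, false, "UTF-8");
        out.write('\ufeff'); // write(int) ignores the charset and keeps only the low byte
        out.flush();
        return sink.toByteArray();
    }

    /** Bytes that reach the sink when the BOM is sent through PrintStream.print(char). */
    static byte[] viaPrint() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, false, "UTF-8");
        out.print('\ufeff'); // print(char) encodes the character with the stream's charset
        out.flush();
        return sink.toByteArray();
    }

    /** Builds a ZIP archive with one CSV entry whose content starts with a UTF-8 BOM. */
    static byte[] zippedCsv(String entryName, String csv) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(sink)) {
            PrintStream out = new PrintStream(zip, false, "UTF-8");
            zip.putNextEntry(new ZipEntry(entryName)); // open the entry first
            out.print('\ufeff');                       // then emit the BOM inside it
            out.print(csv);
            out.flush();
            if (out.checkError()) {                    // PrintStream swallows IOExceptions
                throw new IOException("writing the CSV entry failed");
            }
            zip.closeEntry();
        }
        return sink.toByteArray();
    }

    /** Reads back the raw bytes of the first entry of a ZIP archive. */
    static byte[] firstEntryBytes(byte[] zipBytes) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            zin.getNextEntry();
            return zin.readAllBytes(); // reads up to the end of the current entry
        }
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("write(int):  " + hex(viaWrite())); // FF
        System.out.println("print(char): " + hex(viaPrint())); // EF BB BF
        byte[] entry = firstEntryBytes(zippedCsv("filename.csv", "\"Gr\u00fc\u00dfe\","));
        System.out.println("zip entry:   " + hex(entry));
    }
}
```

The checkError() call is there because, as discussed in the comments above, PrintStream does not throw on I/O failures, so a write that goes wrong would otherwise pass unnoticed.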


