Byte Order Mark - IBM


Unicode in the 16-bit UTF-16 form has no prescribed endian orientation for interchange. This requires communication processes to evaluate the endian orientation correctly. To aid in this, the character U+FEFF ZERO WIDTH NO-BREAK SPACE can be used as a Byte Order Mark (BOM). When interpreted in the incorrect endian orientation, it evaluates to U+FFFE, which is defined as NOT A CHARACTER.

Some applications, particularly on Windows systems, write a BOM character to the start of a file. In UTF-8, the BOM is the sequence of bytes EF BB BF. As a byte-oriented encoding, there are no endian issues with UTF-8, but some applications (primarily on Windows) write the BOM to the start of a UTF-8 encoded file. An IBM® Netezza® system does not load the BOM code point; you can use the -bom switch to remove an initial BOM code point.

You can remove a BOM from the start of a UTF-8 file by using the nzconvert command, as in the following example:

    nzconvert -f utf8 -t utf8 -bom -df input_file -of output_file

When you are converting from or to UTF-16, you can use one of three converters: UTF16, UTF16be, or UTF16le as the input (-f option) and output (-t option):

UTF16
    As input, Netezza checks for a BOM to indicate endianness; otherwise, Netezza interprets the input as big-endian. As output, Netezza writes a BOM and outputs in the native endianness of the machine. When converting from UTF-16 to any other encoding, such as UTF-8, the BOM is removed.

UTF16le
    As input, interprets the input as little-endian. As output, Netezza outputs as little-endian without a BOM. Any BOM is treated as data and converted, such as to UTF-8.

UTF16be
    As input, interprets all input as big-endian. As output, Netezza converts as big-endian without a BOM. Any BOM is treated as data and converted, such as to UTF-8.
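The BOM rules described above can be illustrated outside of Netezza with a short Python sketch. This is not how nzconvert is implemented; the helper names `strip_utf8_bom` and `decode_utf16` are hypothetical, and the code only mirrors the documented behavior (honor and drop a leading BOM, otherwise assume big-endian) using Python's standard `codecs` constants.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes in the way the UTF16 input converter is described:
    use a leading BOM to pick the byte order and drop it; with no BOM,
    assume big-endian. (Hypothetical helper, for illustration only.)"""
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    return data.decode("utf-16-be")


# U+FEFF serialized big-endian is FE FF; read with the wrong byte order
# the same two bytes form the code unit 0xFFFE, a noncharacter.
bom_bytes = "\ufeff".encode("utf-16-be")
swapped = int.from_bytes(bom_bytes, "little")

# An explicit-endian decoder treats the BOM as ordinary data, so it
# survives conversion to UTF-8 as EF BB BF.
kept_as_data = b"\xff\xfeA\x00".decode("utf-16-le").encode("utf-8")

if __name__ == "__main__":
    print(strip_utf8_bom(b"\xef\xbb\xbfabc"))
    print(decode_utf16(b"\xff\xfeA\x00"))
    print(hex(swapped))
    print(kept_as_data)
```

Note that Python's own generic `utf-16` codec falls back to the machine's native byte order when no BOM is present, which differs from the big-endian default described here; that is why the sketch selects `utf-16-be` explicitly.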


