Byte Encodings and Strings (The Java™ Tutorials ...


The Java Tutorials have been written for JDK 8. Examples and practices described in this page don't take advantage of improvements introduced in later releases and might use technology no longer available. See Java Language Changes for a summary of updated language features in Java SE 9 and subsequent releases. See JDK Release Notes for information about new features, enhancements, and removed or deprecated options for all JDK releases.

If a byte array contains non-Unicode text, you can convert the text to Unicode with one of the String constructor methods. Conversely, you can convert a String object into a byte array of non-Unicode characters with the String.getBytes method. When invoking either of these methods, you specify the encoding identifier as one of the parameters.

The example that follows converts characters between UTF-8 and Unicode. UTF-8 is a transmission format for Unicode that is safe for UNIX file systems. The full source code for the example is in the file StringConverter.java.
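The two conversions described above can be sketched with a genuinely non-Unicode encoding. This is a minimal illustration, not code from the tutorial: the class name Latin1RoundTrip, its decode and encode helpers, and the sample bytes are assumptions made for the example. It also passes a Charset constant from java.nio.charset.StandardCharsets (Java 7 and later) rather than an encoding name string, which avoids the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the tutorial's sources.
class Latin1RoundTrip {

    // Decode ISO-8859-1 bytes into a String via the String constructor.
    static String decode(byte[] latin1Bytes) {
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    // Encode a String back into ISO-8859-1 bytes via String.getBytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 'A', e-circumflex, n-tilde in ISO-8859-1: one byte per character.
        byte[] latin1 = {0x41, (byte) 0xEA, (byte) 0xF1};

        String text = decode(latin1);
        byte[] back = encode(text);

        System.out.println(text.length());               // prints 3
        System.out.println(Arrays.equals(latin1, back)); // prints true
    }
}
```

The example prints the character count rather than the text itself, because how non-ASCII characters appear on a console depends on the console's own encoding.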
The StringConverter program starts by creating a String containing Unicode characters:

    String original = new String("A" + "\u00ea" + "\u00f1" + "\u00fc" + "C");

When printed, the String named original appears as:

    AêñüC

To convert the String object to UTF-8, invoke the getBytes method and specify the appropriate encoding identifier as a parameter. The getBytes method returns an array of bytes in UTF-8 format. To create a String object from an array of non-Unicode bytes, invoke the String constructor with the encoding parameter. The code that makes these calls is enclosed in a try block, in case the specified encoding is unsupported:

    try {
        byte[] utf8Bytes = original.getBytes("UTF8");
        byte[] defaultBytes = original.getBytes();

        String roundTrip = new String(utf8Bytes, "UTF8");
        System.out.println("roundTrip = " + roundTrip);
        System.out.println();
        printBytes(utf8Bytes, "utf8Bytes");
        System.out.println();
        printBytes(defaultBytes, "defaultBytes");
    } catch (UnsupportedEncodingException e) {
        e.printStackTrace();
    }

The StringConverter program prints out the values in the utf8Bytes and defaultBytes arrays to demonstrate an important point: The length of the converted text might not be the same as the length of the source text. Some Unicode characters translate into single bytes, others into pairs or triplets of bytes.

The printBytes method displays the byte arrays by invoking the byteToHex method, which is defined in the source file, UnicodeFormatter.java. Here is the printBytes method:

    public static void printBytes(byte[] array, String name) {
        for (int k = 0; k
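The point about differing lengths can be checked with a short, self-contained sketch. It is an illustration under stated assumptions rather than the tutorial's own code: the class name Utf8Lengths and the helpers toHex and hexDump are hypothetical, and toHex only stands in for UnicodeFormatter.byteToHex, whose source is not shown on this page. It uses StandardCharsets.UTF_8 (Java 7 and later) instead of the "UTF8" name string, so no try block is needed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the tutorial's sources.
class Utf8Lengths {

    // Format one byte as two lowercase hex digits.
    // Masking with 0xff treats the signed byte as an unsigned value.
    static String toHex(byte b) {
        return String.format("%02x", b & 0xff);
    }

    // Join all bytes as space-separated hex pairs.
    static String hexDump(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < bytes.length; k++) {
            if (k > 0) {
                sb.append(' ');
            }
            sb.append(toHex(bytes[k]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same five characters as the tutorial's example string.
        String original = "A\u00ea\u00f1\u00fcC";
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);

        System.out.println(original.length());  // prints 5
        System.out.println(utf8Bytes.length);   // prints 8
        System.out.println(hexDump(utf8Bytes)); // prints 41 c3 aa c3 b1 c3 bc 43

        // The euro sign (code point 20AC) needs three bytes in UTF-8.
        byte[] euro = "\u20ac".getBytes(StandardCharsets.UTF_8);
        System.out.println(hexDump(euro));      // prints e2 82 ac
    }
}
```

The ASCII letters A and C each occupy one byte, the three accented letters occupy two bytes each, and the euro sign occupies three, which is why five characters become eight bytes here.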


