Byte Encodings and Strings (The Java™ Tutorials ...
The Java Tutorials have been written for JDK 8. Examples and practices described in this page don't take advantage of improvements introduced in later releases and might use technology no longer available. See Java Language Changes for a summary of updated language features in Java SE 9 and subsequent releases. See JDK Release Notes for information about new features, enhancements, and removed or deprecated options for all JDK releases.
Byte Encodings and Strings
If a byte array contains non-Unicode text, you can convert the text to Unicode with one of the String constructor methods. Conversely, you can convert a String object into a byte array of non-Unicode characters with the String.getBytes method. When invoking either of these methods, you specify the encoding identifier as one of the parameters.
The example that follows converts characters between UTF-8 and Unicode. UTF-8 is a transmission format for Unicode that is safe for UNIX file systems. The full source code for the example is in the file StringConverter.java.
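Before walking through StringConverter, the general pattern described above can be sketched for a non-Unicode encoding. This is a minimal illustration, not part of the tutorial's source: the class name Latin1RoundTrip and the helpers decode and encode are hypothetical, and ISO-8859-1 is chosen only because every Java platform is required to support it.

```java
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Hypothetical helper: decodes bytes in the named encoding into a String.
    static String decode(byte[] bytes, String encoding) {
        try {
            return new String(bytes, encoding);
        } catch (UnsupportedEncodingException e) {
            // Rethrown unchecked so callers do not need their own try block.
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    // Hypothetical helper: encodes a String into bytes in the named encoding.
    static byte[] encode(String text, String encoding) {
        try {
            return text.getBytes(encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // The three bytes 0x41, 0xEA, 0x43 spell "A", e-circumflex, "C" in ISO-8859-1.
        byte[] latin1 = { 0x41, (byte) 0xEA, 0x43 };
        String text = decode(latin1, "ISO-8859-1");
        byte[] back = encode(text, "ISO-8859-1");
        System.out.println("chars = " + text.length());
        System.out.println("round trip ok = " + Arrays.equals(latin1, back));
    }
}
```

The encoding name is passed as a plain String in both directions, which is why the underlying calls can throw UnsupportedEncodingException.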
The StringConverter program starts by creating a String containing Unicode characters:
String original = new String("A" + "\u00ea" + "\u00f1" + "\u00fc" + "C");
When printed, the String named original appears as:
AêñüC
To convert the String object to UTF-8, invoke the getBytes method and specify the appropriate encoding identifier as a parameter. The getBytes method returns an array of bytes in UTF-8 format. To create a String object from an array of non-Unicode bytes, invoke the String constructor with the encoding parameter. The code that makes these calls is enclosed in a try block, in case the specified encoding is unsupported:
try {
    byte[] utf8Bytes = original.getBytes("UTF8");
    byte[] defaultBytes = original.getBytes();
    String roundTrip = new String(utf8Bytes, "UTF8");
    System.out.println("roundTrip = " + roundTrip);
    System.out.println();
    printBytes(utf8Bytes, "utf8Bytes");
    System.out.println();
    printBytes(defaultBytes, "defaultBytes");
}
catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}
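As a variation on the try block above, the same round trip can be written with the Charset overloads of getBytes and the String constructor, which do not throw the checked UnsupportedEncodingException. This is a sketch rather than the tutorial's own code: the class name Utf8RoundTrip and the helpers toUtf8 and fromUtf8 are hypothetical, and it relies on java.nio.charset.StandardCharsets, available since Java 7.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {
    // Hypothetical helper: encodes a String as UTF-8 bytes.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: decodes UTF-8 bytes back into a String.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same characters as the tutorial's "original" string.
        String original = "A" + "\u00ea" + "\u00f1" + "\u00fc" + "C";
        byte[] utf8Bytes = toUtf8(original);
        String roundTrip = fromUtf8(utf8Bytes);
        // Compare instead of printing the text, so the result does not
        // depend on the console's own encoding.
        System.out.println("round trip equal = " + original.equals(roundTrip));
    }
}
```

Because the charset is passed as a constant rather than a name, a misspelled encoding becomes a compile-time error instead of a runtime exception.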
The StringConverter program prints out the values in the utf8Bytes and defaultBytes arrays to demonstrate an important point: The length of the converted text might not be the same as the length of the source text. Some Unicode characters translate into single bytes, others into pairs or triplets of bytes.
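The point about differing lengths can be checked directly by comparing a string's character count with its UTF-8 byte count. The following is a small sketch under assumptions: the class name Utf8Lengths, the helper utf8Length, and the euro sign sample (U+20AC, not used in the tutorial) are introduced here only for illustration.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Hypothetical helper: number of bytes the string occupies as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "A",                       // ASCII letter
            "\u00ea",                  // e with circumflex
            "\u20ac",                  // euro sign
            "A\u00ea\u00f1\u00fcC"     // the tutorial's "original" string
        };
        for (String sample : samples) {
            System.out.println(sample.length() + " char(s) -> "
                    + utf8Length(sample) + " UTF-8 byte(s)");
        }
    }
}
```

The first three samples are one character each but encode to 1, 2, and 3 bytes respectively, and the five-character tutorial string encodes to 8 bytes.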
The printBytes method displays the byte arrays by invoking the byteToHex method, which is defined in the source file, UnicodeFormatter.java. Here is the printBytes method:
public static void printBytes(byte[] array, String name) {
    for (int k = 0; k
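The listing above is cut off in this copy of the page, so the loop body is not available here. As a rough, self-contained sketch of how such a method could work, the version below loops over the array and prints each byte in hexadecimal. The output format and the local byteToHex are assumptions: byteToHex is a hypothetical stand-in for the UnicodeFormatter.byteToHex method mentioned above, not its actual source.

```java
import java.nio.charset.StandardCharsets;

public class PrintBytesSketch {
    // Hypothetical stand-in for UnicodeFormatter.byteToHex:
    // formats one byte as two lowercase hex digits.
    static String byteToHex(byte b) {
        // Mask with 0xff so negative bytes are not sign-extended.
        return String.format("%02x", b & 0xff);
    }

    // Assumed shape of the loop: one output line per byte, labeled with its index.
    public static void printBytes(byte[] array, String name) {
        for (int k = 0; k < array.length; k++) {
            System.out.println(name + "[" + k + "] = 0x" + byteToHex(array[k]));
        }
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = "A\u00eaC".getBytes(StandardCharsets.UTF_8);
        printBytes(utf8Bytes, "utf8Bytes");
    }
}
```

For the three-character input in main, this prints four lines (0x41, 0xc3, 0xaa, 0x43), since the middle character occupies two bytes in UTF-8.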