Encode a String to UTF-8 in Java - Baeldung
1. Overview
When dealing with Strings in Java, we sometimes need to encode them into a specific charset.
This tutorial is a practical guide showing different ways to encode a String to the UTF-8 charset.
For a more technical deep-dive, see our Guide to Character Encoding.
2. Defining the Problem
To showcase the Java encoding, we'll work with the German String “Entwickeln Sie mit Vergnügen”:
String germanString = "Entwickeln Sie mit Vergnügen";
byte[] germanBytes = germanString.getBytes();
String asciiEncodedString = new String(germanBytes, StandardCharsets.US_ASCII);
assertNotEquals(asciiEncodedString, germanString);
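Decoding these bytes as US_ASCII gives us a value such as “Entwickeln Sie mit Vergn?gen” when printed, because US_ASCII cannot represent the non-ASCII ü character and substitutes a replacement for it. Since getBytes() without arguments uses the platform default charset, the exact number of replacement characters depends on the platform.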
But when we do the same with a String that uses only ASCII characters, such as plain English text, we get the same string back:
String englishString = "Develop with pleasure";
byte[] englishBytes = englishString.getBytes();
String asciiEncodedEnglishString = new String(englishBytes, StandardCharsets.US_ASCII);
assertEquals(asciiEncodedEnglishString, englishString);
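The two cases above can be reproduced deterministically by fixing the charset used to obtain the bytes. The following is a minimal sketch, assuming the bytes are produced with UTF-8 rather than the platform default; the class and method names are hypothetical and only serve the illustration:

```java
import java.nio.charset.StandardCharsets;

public class AsciiDecodeSketch {

    // Hypothetical helper: encodes the text as UTF-8 bytes,
    // then decodes those bytes as US-ASCII.
    static String roundTripThroughAscii(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u00fc is the letter u with an umlaut, written as an escape
        // so the example does not depend on the source file encoding.
        String german = "Entwickeln Sie mit Vergn\u00fcgen";
        String english = "Develop with pleasure";

        String germanResult = roundTripThroughAscii(german);

        // The umlaut takes two bytes in UTF-8; neither is valid ASCII,
        // so each one is decoded as the replacement character U+FFFD.
        System.out.println(germanResult.equals(german));                  // false
        System.out.println(germanResult.contains("\uFFFD\uFFFD"));        // true

        // Pure ASCII text survives the round trip unchanged.
        System.out.println(roundTripThroughAscii(english).equals(english)); // true
    }
}
```

Whether the replacement shows up as “?” or as another glyph when printed depends on the console and its encoding.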
Let's see what happens when we use the UTF-8 encoding.
3. Encoding With Core Java
Let's start with the core library.
Strings are immutable in Java, which means we cannot change a String's character encoding. To achieve what we want, we need to copy the bytes of the String and then create a new one with the desired encoding.
First, we get the String's bytes, and then we create a new String using the retrieved bytes and the desired charset:
String rawString = "Entwickeln Sie mit Vergnügen";
byte[] bytes = rawString.getBytes(StandardCharsets.UTF_8);
String utf8EncodedString = new String(bytes, StandardCharsets.UTF_8);
assertEquals(rawString, utf8EncodedString);
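To make the effect of getBytes(StandardCharsets.UTF_8) more visible, it can help to look at the bytes themselves. The sketch below is only an illustration; the class name and the toUtf8Hex helper are hypothetical and not part of any library:

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesSketch {

    // Hypothetical helper: returns the UTF-8 bytes of the text
    // as space-separated uppercase hex pairs.
    static String toUtf8Hex(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            // Mask to 0..255 so negative byte values print as two hex digits.
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // \u00fc is the letter u with an umlaut.
        String raw = "Entwickeln Sie mit Vergn\u00fcgen";

        System.out.println(raw.length());                                // 28
        System.out.println(raw.getBytes(StandardCharsets.UTF_8).length); // 29
        System.out.println(toUtf8Hex("\u00fc"));                         // C3 BC
    }
}
```

The String has 28 characters but 29 UTF-8 bytes, because the umlaut is the only character that needs two bytes.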
4. Encoding With Java 7 StandardCharsets
Alternatively, we can use the StandardCharsets class introduced in Java 7 to encode the String.
First, we'll encode the String into a ByteBuffer of UTF-8 bytes, and second, we'll decode those bytes back into a String:
String rawString = "Entwickeln Sie mit Vergnügen";
ByteBuffer buffer = StandardCharsets.UTF_8.encode(rawString);
String utf8EncodedString = StandardCharsets.UTF_8.decode(buffer).toString();
assertEquals(rawString, utf8EncodedString);
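If a plain byte array is needed instead of a ByteBuffer, the readable bytes can be copied out of the buffer. The following sketch, with a hypothetical class name and encodeToArray helper, suggests that this yields the same bytes as getBytes(StandardCharsets.UTF_8):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetEncodeSketch {

    // Hypothetical helper: encodes the text with Charset.encode and
    // copies the buffer's readable bytes into a new array.
    static byte[] encodeToArray(String text) {
        ByteBuffer buffer = StandardCharsets.UTF_8.encode(text);
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // \u00fc is the letter u with an umlaut.
        String raw = "Entwickeln Sie mit Vergn\u00fcgen";

        byte[] viaBuffer = encodeToArray(raw);
        byte[] viaGetBytes = raw.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(viaBuffer, viaGetBytes)); // true
    }
}
```

Copying remaining() bytes is used here rather than buffer.array(), since the backing array can be larger than the encoded content.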
5. Encoding With Commons-Codec
Besides using core Java, we can alternatively use Apache Commons Codec to achieve the same results.
Apache Commons Codec is a handy package containing simple encoders and decoders for various formats.
First, let's start with the project configuration.
When using Maven, we have to add the commons-codec dependency to our pom.xml: