Encode a String to UTF-8 in Java - Baeldung
1. Overview
When dealing with Strings in Java, we sometimes need to encode them into a specific charset.
Further reading:
Guide to Character Encoding: Explore character encoding in Java and learn about common pitfalls.
Guide to Java URL Encoding/Decoding: The article discusses URL encoding in Java, some pitfalls, and how to avoid them.
Java Base64 Encoding and Decoding: How to do Base64 encoding and decoding in Java, using the new APIs introduced in Java 8 as well as Apache Commons.
This tutorial is a practical guide showing different ways to encode a String to the UTF-8 charset.
For a more technical deep-dive, see our Guide to Character Encoding.
2. Defining the Problem
To showcase the Java encoding, we'll work with the German String “Entwickeln Sie mit Vergnügen”:
String germanString = "Entwickeln Sie mit Vergnügen";
byte[] germanBytes = germanString.getBytes();
String asciiEncodedString = new String(germanBytes, StandardCharsets.US_ASCII);
assertNotEquals(asciiEncodedString, germanString);
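Decoding these bytes as US_ASCII gives us a value like “Entwickeln Sie mit Vergn?gen” when printed, because US_ASCII has no mapping for the non-ASCII ü character and substitutes a replacement character for each byte it cannot decode. Note that getBytes() without an argument uses the platform default charset, so the exact number of replacement characters depends on the platform.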
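To make the replacement behaviour reproducible, here is a minimal sketch that encodes with an explicit charset instead of the platform default. The class name AsciiDecodingDemo and the helper decodeAsAscii are hypothetical and not part of the original example. The ü is written as \u00FC so that the result does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class AsciiDecodingDemo {

    // Hypothetical helper: encodes the input with the given charset,
    // then decodes the resulting bytes as US-ASCII.
    static String decodeAsAscii(String input, Charset sourceCharset) {
        byte[] bytes = input.getBytes(sourceCharset);
        // Bytes outside the ASCII range are replaced with U+FFFD.
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String german = "Entwickeln Sie mit Vergn\u00FCgen";

        // UTF-8 stores the umlaut in two bytes, so two replacement characters appear.
        String fromUtf8 = decodeAsAscii(german, StandardCharsets.UTF_8);
        System.out.println(fromUtf8.length()); // 29

        // ISO-8859-1 stores it in one byte, so only one replacement character appears.
        String fromLatin1 = decodeAsAscii(german, StandardCharsets.ISO_8859_1);
        System.out.println(fromLatin1.length()); // 28
    }
}
```

Many consoles render U+FFFD as “?”, which is why the printed value looks like a question mark in place of the umlaut.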
But when we do the same with a String that contains only ASCII characters, we get the same string back:
String englishString = "Develop with pleasure";
byte[] englishBytes = englishString.getBytes();
String asciiEncodedEnglishString = new String(englishBytes, StandardCharsets.US_ASCII);
assertEquals(asciiEncodedEnglishString, englishString);
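One way to tell in advance whether a String survives an ASCII round trip is to ask a CharsetEncoder whether it can encode the text. The following is only a sketch; the class name AsciiCheckDemo and the helper isPureAscii are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

public class AsciiCheckDemo {

    // Hypothetical helper: true when every character of the input
    // can be represented in US-ASCII.
    static boolean isPureAscii(String input) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(input);
    }

    public static void main(String[] args) {
        System.out.println(isPureAscii("Develop with pleasure"));            // true
        System.out.println(isPureAscii("Entwickeln Sie mit Vergn\u00FCgen")); // false
    }
}
```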
Let's see what happens when we use the UTF-8 encoding.
3. Encoding With Core Java
Let's start with the core library.
Strings are immutable in Java, which means we cannot change a String's character encoding in place. To achieve what we want, we need to copy the bytes of the String and then create a new one with the desired encoding.
First, we get the String's bytes in the desired charset, and then we create a new String from the retrieved bytes using the same charset:
String rawString = "Entwickeln Sie mit Vergnügen";
byte[] bytes = rawString.getBytes(StandardCharsets.UTF_8);
String utf8EncodedString = new String(bytes, StandardCharsets.UTF_8);
assertEquals(rawString, utf8EncodedString);
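To see what the UTF-8 bytes actually look like, we can print them in hexadecimal. This is a minimal sketch rather than part of the original tutorial; the class name Utf8BytesDemo and the helpers toHex and utf8Hex are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesDemo {

    // Hypothetical helper: formats each byte as two uppercase hex digits,
    // separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes are not sign-extended.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Hypothetical helper: hex dump of the UTF-8 encoding of the input.
    static String utf8Hex(String input) {
        return toHex(input.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String rawString = "Entwickeln Sie mit Vergn\u00FCgen";

        // The umlaut takes two bytes in UTF-8.
        System.out.println(utf8Hex("\u00FC")); // C3 BC

        // 28 characters, but 29 bytes because of the two-byte umlaut.
        System.out.println(rawString.length());                                // 28
        System.out.println(rawString.getBytes(StandardCharsets.UTF_8).length); // 29
    }
}
```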
4. Encoding With Java 7 StandardCharsets
Alternatively, we can use the StandardCharsets class introduced in Java 7 to encode the String.
First, we'll encode the String into bytes, and second, we'll decode those bytes back into a String using UTF-8:
String rawString = "Entwickeln Sie mit Vergnügen";
ByteBuffer buffer = StandardCharsets.UTF_8.encode(rawString);
String utf8EncodedString = StandardCharsets.UTF_8.decode(buffer).toString();
assertEquals(rawString, utf8EncodedString);
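If we need a plain byte[] instead of a ByteBuffer, we can copy the remaining bytes out of the buffer. The sketch below assumes this is the goal; the class name ByteBufferDemo and the helper toUtf8Bytes are hypothetical. It copies only buffer.remaining() bytes rather than calling buffer.array(), because the backing array may be larger than the encoded content.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ByteBufferDemo {

    // Hypothetical helper: encodes the input to UTF-8 and copies exactly
    // the encoded bytes out of the returned buffer.
    static byte[] toUtf8Bytes(String input) {
        ByteBuffer buffer = StandardCharsets.UTF_8.encode(input);
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        String rawString = "Entwickeln Sie mit Vergn\u00FCgen";
        byte[] viaBuffer = toUtf8Bytes(rawString);
        byte[] viaGetBytes = rawString.getBytes(StandardCharsets.UTF_8);

        // Both approaches produce the same byte sequence.
        System.out.println(Arrays.equals(viaBuffer, viaGetBytes)); // true
    }
}
```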
5. Encoding With Commons-Codec
Besides using core Java, we can alternatively use Apache Commons Codec to achieve the same results.
Apache Commons Codec is a handy package containing simple encoders and decoders for various formats.
First, let's start with the project configuration.
When using Maven, we have to add the commons-codec dependency to our pom.xml: