Encode a String to UTF-8 in Java - Baeldung


1. Overview

When dealing with Strings in Java, we sometimes need to encode them into a specific charset.

This tutorial is a practical guide showing different ways to encode a String to the UTF-8 charset. For a more technical deep-dive, see our Guide to Character Encoding.

2. Defining the Problem

To showcase the Java encoding, we'll work with the German String "Entwickeln Sie mit Vergnügen":

    String germanString = "Entwickeln Sie mit Vergnügen";
    // No charset argument: the platform default charset is used
    byte[] germanBytes = germanString.getBytes();

    String asciiEncodedString = new String(germanBytes, StandardCharsets.US_ASCII);

    assertNotEquals(asciiEncodedString, germanString);

Decoding these bytes as US_ASCII gives us a value such as "Entwickeln Sie mit Vergn?gen" when printed, because US_ASCII cannot represent the non-ASCII ü character. Since getBytes() without arguments uses the platform default charset, the exact replacement characters shown can vary between machines.
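To make the replacement behavior concrete, here is a minimal sketch. The class name AsciiReplacementDemo and its method names are ours, chosen for illustration; they are not part of the article's code samples. The sketch names the charsets explicitly instead of relying on the platform default, so the outcome should not depend on the machine it runs on. Encoding to US_ASCII replaces the unmappable ü with the byte for '?', while decoding UTF-8 bytes as US_ASCII turns each of the two bytes of ü into the Unicode replacement character U+FFFD.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the article's repository.
class AsciiReplacementDemo {

    // Encodes to US_ASCII and decodes again; unmappable characters become '?'.
    static String roundTripThroughAscii(String input) {
        byte[] asciiBytes = input.getBytes(StandardCharsets.US_ASCII);
        return new String(asciiBytes, StandardCharsets.US_ASCII);
    }

    // Encodes to UTF-8 but decodes as US_ASCII; every byte outside the
    // ASCII range is decoded as the replacement character U+FFFD.
    static String decodeUtf8BytesAsAscii(String input) {
        byte[] utf8Bytes = input.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u00fc is the letter ü, written as an escape so the source file
        // compiles the same way regardless of its own encoding.
        String germanString = "Entwickeln Sie mit Vergn\u00fcgen";

        System.out.println(roundTripThroughAscii(germanString));
        System.out.println(decodeUtf8BytesAsAscii(germanString));
    }
}
```

How the second line looks on screen depends on whether the console can display U+FFFD; many consoles show it as a question mark or a box.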
By contrast, when we apply the same US_ASCII conversion to a String that contains only English characters, we get the original string back:

    String englishString = "Develop with pleasure";
    byte[] englishBytes = englishString.getBytes();

    String asciiEncodedEnglishString = new String(englishBytes, StandardCharsets.US_ASCII);

    assertEquals(asciiEncodedEnglishString, englishString);

Let's see what happens when we use the UTF-8 encoding.

3. Encoding With Core Java

Let's start with the core library.

Strings are immutable in Java, which means we cannot change a String character encoding. To achieve what we want, we need to copy the bytes of the String and then create a new one with the desired encoding.

First, we get the String bytes, and then we create a new one using the retrieved bytes and the desired charset:

    String rawString = "Entwickeln Sie mit Vergnügen";
    // Encode the String into UTF-8 bytes
    byte[] bytes = rawString.getBytes(StandardCharsets.UTF_8);

    // Decode the bytes back into a new String
    String utf8EncodedString = new String(bytes, StandardCharsets.UTF_8);

    assertEquals(rawString, utf8EncodedString);

4. Encoding With Java 7 StandardCharsets

Alternatively, we can use the StandardCharsets class introduced in Java 7 to encode the String.

First, we'll encode the String into bytes, and second, we'll decode those bytes back into a String using UTF-8:

    String rawString = "Entwickeln Sie mit Vergnügen";
    // Encode the String into a ByteBuffer of UTF-8 bytes
    ByteBuffer buffer = StandardCharsets.UTF_8.encode(rawString);

    // Decode the buffer back into a String
    String utf8EncodedString = StandardCharsets.UTF_8.decode(buffer).toString();

    assertEquals(rawString, utf8EncodedString);

5. Encoding With Commons-Codec

Besides using core Java, we can alternatively use Apache Commons Codec to achieve the same results.

Apache Commons Codec is a handy package containing simple encoders and decoders for various formats.

First, let's start with the project configuration.

When using Maven, we have to add the commons-codec dependency to our pom.xml:

    <dependency>
        <groupId>commons-codec</groupId>
        <artifactId>commons-codec</artifactId>
        <version>1.14</version>
    </dependency>

Then, in our case, the most interesting class is StringUtils, which provides methods to encode Strings.
Using this class, getting a UTF-8 encoded String is pretty straightforward:

    String rawString = "Entwickeln Sie mit Vergnügen";
    byte[] bytes = StringUtils.getBytesUtf8(rawString);

    String utf8EncodedString = StringUtils.newStringUtf8(bytes);

    assertEquals(rawString, utf8EncodedString);

6. Conclusion

Encoding a String into UTF-8 isn't difficult, but it's not that intuitive. This article presents three ways of doing it, using either core Java or Apache Commons Codec.

As always, the code samples can be found over on GitHub.
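As a closing illustration of the two core Java approaches, the following sketch prints the UTF-8 bytes in hexadecimal, which makes the two-byte encoding of ü visible. The class Utf8BytesDemo and its helper methods are hypothetical names introduced here, not part of the article's code samples, and the Commons Codec variant is left out so that the example runs without extra dependencies.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not taken from the article's repository.
class Utf8BytesDemo {

    // Encodes with String.getBytes(Charset).
    static byte[] encodeWithGetBytes(String input) {
        return input.getBytes(StandardCharsets.UTF_8);
    }

    // Encodes with Charset.encode and copies the buffer's remaining bytes.
    static byte[] encodeWithCharset(String input) {
        ByteBuffer buffer = StandardCharsets.UTF_8.encode(input);
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    // Formats bytes as space-separated, two-digit uppercase hex values.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // \u00fc is the letter ü, written as an escape so the source file
        // compiles the same way regardless of its own encoding.
        String word = "Vergn\u00fcgen";

        byte[] viaGetBytes = encodeWithGetBytes(word);
        byte[] viaCharset = encodeWithCharset(word);

        System.out.println(toHex(viaGetBytes));
        System.out.println("chars: " + word.length() + ", bytes: " + viaGetBytes.length);
        System.out.println("same bytes: " + Arrays.equals(viaGetBytes, viaCharset));
    }
}
```

Both approaches should produce the same byte array. The word has 9 characters but 10 UTF-8 bytes, because ü is encoded as the two bytes C3 BC.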


