Difference between UTF-8, UTF-16 and UTF-32 Character Encoding? Example


The main difference between the UTF-8, UTF-16, and UTF-32 character encodings is how many bytes each needs to represent a character. UTF-8 uses a minimum of one byte, while UTF-16 uses a minimum of two bytes. If a character's code point is greater than 127, the largest value in the 7-bit ASCII range, UTF-8 takes 2, 3, or 4 bytes, whereas UTF-16 takes either two or four bytes. UTF-32, on the other hand, is a fixed-width encoding scheme and always uses 4 bytes to encode a Unicode code point.

Now, let's start with what character encoding is and why it's important. Character encoding is the key concept in converting byte streams into characters that can be displayed.

Two things are needed to convert bytes to characters: a character set and an encoding. Since there are so many characters and symbols in the world, a character set is required to support all of them. A character set is nothing but a list of characters in which each symbol or character is mapped to a numeric value, also known as a code point.

UTF-16, UTF-32, and UTF-8, on the other hand, are encoding schemes, which describe how these values (code points) are mapped to bytes, using code units of different sizes as a basis: 16 bits for UTF-16, 32 bits for UTF-32, and 8 bits for UTF-8. UTF stands for Unicode Transformation Format, which defines an algorithm to map every Unicode code point to a unique byte sequence.
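To make the mapping from code points to bytes concrete, here is a minimal Java sketch that prints the bytes a few sample characters occupy under each encoding. The class name EncodingSizes and the helper hex are hypothetical names chosen for illustration. The sketch also assumes the JDK provides a UTF-32BE charset, which common JDKs do although the Java specification does not require it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizes {

    // Assumption: UTF-32BE is not in StandardCharsets, so it is looked up by name.
    // This throws UnsupportedCharsetException on a JDK that does not ship it.
    static final Charset UTF_32BE = Charset.forName("UTF-32BE");

    /** Encodes the text with the given charset and formats the bytes as hex pairs. */
    public static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Samples: A (U+0041), pound sign (U+00A3), euro sign (U+20AC),
        // and an emoji outside the Basic Multilingual Plane (U+1F600).
        String[] samples = { "A", "\u00A3", "\u20AC", "\uD83D\uDE00" };
        for (String s : samples) {
            System.out.printf("U+%04X  UTF-8: %-12s UTF-16BE: %-12s UTF-32BE: %s%n",
                    s.codePointAt(0),
                    hex(s, StandardCharsets.UTF_8),
                    hex(s, StandardCharsets.UTF_16BE),
                    hex(s, UTF_32BE));
        }
        // For "A" this prints 41, then 00 41, then 00 00 00 41.
    }
}
```

The UTF-8 column grows from one to four bytes as the code point increases, the UTF-16 column jumps from two bytes to four only for the emoji, and the UTF-32 column is always four bytes.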
For example, for the character A, which is Latin Capital Letter A, the Unicode code point is U+0041, the UTF-8 encoded byte is 41, the UTF-16 encoding is 0041, and the Java char literal is '\u0041'. In short, you need a character encoding scheme to interpret a stream of bytes; without one, you cannot display them correctly. The Java programming language has extensive support for different charsets and character encodings. When it converts between bytes and characters without an explicit charset, it uses the platform's default character encoding, which is UTF-8 on many systems (and on every platform from JDK 18 onward).

Difference between UTF-32, UTF-16 and UTF-8 encoding

As I said earlier, UTF-8, UTF-16, and UTF-32 are just different ways to store Unicode code points, i.e. those U+ magic numbers, using 8-, 16-, and 32-bit code units in the computer's memory. Once a Unicode character is converted into bytes, it can easily be persisted to disk, transferred over the network, and recreated at the other end.

The fundamental difference between UTF-32 and the other two is that the former is a fixed-width encoding scheme, while the latter duo are variable-length encodings. Although both UTF-8 and UTF-16 encode Unicode characters with a variable width, there are some differences between them as well.

1. UTF-8 uses one byte at the minimum in encoding the characters while UTF-16 uses minimum two bytes.

In UTF-8, every code point from 0 to 127 is stored in a single byte. Only code points 128 and above are stored using 2, 3, or up to 4 bytes. In short, UTF-8 is a variable-length encoding and takes 1 to 4 bytes, depending upon the code point. UTF-16 is also a variable-length character encoding, but it takes either 2 or 4 bytes. UTF-32, on the other hand, is fixed at 4 bytes.

2. UTF-8 is compatible with ASCII while UTF-16 is incompatible with ASCII

UTF-8 has an advantage where ASCII characters are the most used, because in that case most characters need only one byte. A UTF-8 file containing only ASCII characters has the same encoding as an ASCII file, which means English text looks exactly the same in UTF-8 as it did in ASCII. Given the dominance of ASCII in the past, this was a major reason for the initial acceptance of Unicode and UTF-8.

Here is an example of how a character is mapped to bytes under different encoding schemes: the euro sign € (U+20AC) is encoded as E2 82 AC in UTF-8, 20 AC in UTF-16, and 00 00 20 AC in UTF-32 (bytes shown in big-endian order). You can see how different schemes take a different number of bytes to represent the same character.

Summary

1) UTF-16 is not fixed width. It uses 2 or 4 bytes. Only UTF-32 is fixed width, and unfortunately it is rarely used for storage or transmission, because every code point costs four bytes.
Also worth knowing is that Java Strings are represented using UTF-16 code units; earlier they used UCS-2, which is fixed width.

2) You might think that because UTF-8 takes fewer bytes for many characters it would take less memory than UTF-16, but that really depends on what language the string is in. For many non-European scripts, such as Chinese, Japanese, or Devanagari, most characters take three bytes in UTF-8 but only two in UTF-16, so UTF-8 requires more memory.

3) ASCII is generally faster to process than a multi-byte encoding scheme because less data to process = faster.

That's all about Unicode, UTF-8, UTF-32, and UTF-16 character encoding. As we have learned, Unicode is a character set of various symbols, while UTF-8, UTF-16, and UTF-32 are different ways to represent them in byte format. Both UTF-8 and UTF-16 are variable-length encodings, where the number of bytes used depends upon the Unicode code point.

On the other hand, UTF-32 is a fixed-width encoding, where each code point takes 4 bytes. Unicode contains code points for almost all representable graphic symbols in the world, and it supports all major languages and scripts, e.g. English, Japanese, Mandarin, or Devanagari.

Always remember, UTF-32 is a fixed-width encoding that always takes 32 bits, but UTF-8 and UTF-16 are variable-length encodings, where UTF-8 can take 1 to 4 bytes while UTF-16 will take either 2 or 4 bytes.

By javinpaul

Labels: best of java revisited, core java, programming

11 comments:

KunalKrishna85 said...
"BTW, if character's code point is greater than 127," what is a character's CODE POINT? Please explain.
February 17, 2015 at 9:21 PM

Anonymous said...
You said: "Java programming language has extensive support for different charset and character encoding, by default it use UTF-8." Then you said: "Also, worth knowing is that Java Strings are represented using UTF-16 bit characters". Could you clear this up?
February 18, 2015 at 11:29 AM

gm said...
One question. You mention the default encoding in Java is UTF-8, but at least Character and String have the default UTF-16 (http://docs.oracle.com/javase/8/docs/api/java/lang/Character.html). Is there a different encoding you were referring to? Tx, nice blog
February 19, 2015 at 5:42 AM

Unknown said...
@Kunal "Code points are the numbers used in a coded character set, where a coded character set represents a collection of characters and each character is assigned a unique number. The coded character set defines the range of valid code points. Valid code points for Unicode are U+0000 to U+10FFFF." http://javarevisited.blogspot.com/2012/01/java-string-codepoint-get-unicode.html
February 19, 2015 at 8:42 AM

Anonymous said...
Hello. One point to note is that UTF-8 can go up to 6 bytes, I hope I am not wrong here. Thanks.
February 19, 2015 at 11:58 AM

javinpaul said...
@gm, Yes, Java String uses UTF-16, but when you convert a byte array to characters, Java uses the platform's default character encoding. It differs from place to place, e.g. in Eclipse it could be different from your Linux host.
February 21, 2015 at 11:45 PM

Anonymous said...
Hello there. What is the difference between UTF-16, UTF-16LE and UTF-16BE? Are they the same?
February 23, 2015 at 5:13 AM

Anonymous said...
@Anonymous, They are not the same. UTF-16LE stores bytes in little-endian order, while UTF-16BE stores bytes in big-endian order. Since UTF-16 uses a minimum of 2 bytes to represent a character, the order in which those two bytes are stored affects the value that is read back. In big endian, the most significant byte is stored first, at the lower address.
September 16, 2015 at 12:34 AM

vijaypratap said...
(£) We are reading this symbol from the database. It displays fine in the .jsp page, but when the value reaches our APIs it arrives as (A^£). We are using charset=utf-8. Could you please tell me why this happens and what the solution is?
October 24, 2018 at 8:40 PM

Unknown said...
Use utf16
August 21, 2019 at 4:44 PM

Anonymous said...
A character set is nothing but a list of characters, where each symbol or character is mapped to a numeric value, also known as code points.
December 11, 2020 at 12:21 PM
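Two points raised in the comments, byte order in UTF-16 and text that arrives garbled, can be sketched in Java. This is only an illustration: the class name CommentExamples and the helper hex are hypothetical, and the garbled pound sign is reproduced under the assumption that UTF-8 bytes were decoded as ISO-8859-1 somewhere along the way, which the comment does not give enough detail to confirm.

```java
import java.nio.charset.StandardCharsets;

class CommentExamples {

    /** Formats a byte array as space-separated hex pairs. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Byte order: the same 16-bit code unit 0x0041 is laid out differently.
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16BE))); // 00 41
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16LE))); // 41 00
        // Java's plain UTF-16 encoder writes a big-endian byte-order mark first.
        System.out.println(hex("A".getBytes(StandardCharsets.UTF_16)));   // FE FF 00 41

        // Wrong-charset decoding: the pound sign (U+00A3) is C2 A3 in UTF-8.
        byte[] utf8 = "\u00A3".getBytes(StandardCharsets.UTF_8);
        // Assumed failure mode: the receiver decodes those bytes as ISO-8859-1
        // and gets two characters, U+00C2 followed by U+00A3.
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);
        // Decoding with the charset that was used for encoding restores the text.
        String correct = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(garbled.length() + " chars vs " + correct.length() + " char");
    }
}
```

Passing the charset explicitly on both sides, rather than relying on the platform default, is the usual way to keep the two ends in agreement.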


