5 Solid Ways to Remove Unicode Characters in Python


We can remove Unicode characters from a string in Python with the help of methods such as encode() and decode(), ord(), and replace().

Contents

Introduction
What are Unicode characters?
Examples to remove Unicode characters
1. Using encode() and decode() method
2. Using replace() method to remove Unicode characters
3. Using character.isalnum() method to remove special characters in Python
4. Using regular expression to remove specific Unicode characters in Python
5. Using ord() method and for loop to remove Unicode characters in Python
Conclusion

Introduction

We have discussed many Python concepts and conversions. Sometimes, though, we need to remove the Unicode characters from a string. In this tutorial, we will discuss how to remove all the Unicode characters from a string in Python.

What are Unicode characters?

Unicode is an international encoding standard that is accepted all over the world. It covers different languages and scripts, and assigns each letter, digit, or symbol a unique numeric value (its code point) that applies across different platforms and programs. Every character in a Python 3 string is a Unicode character, so in this tutorial "removing Unicode characters" means removing characters outside the ASCII range (code points 128 and above).

Examples to remove Unicode characters

Here, we will discuss the different ways to remove Unicode characters from a string:

1. Using encode() and decode() method

In this example, we will use the encode() and decode() methods to remove the Unicode characters from the string. The encode() method encodes the string as 'ascii' with the error handler set to 'ignore', which drops every character that cannot be represented in ASCII. The decode() method then converts the resulting bytes back into a string. Let us look at the example to understand the concept in detail.
    # input string
    text = "This is Python \u500c Pool"
    # encode() method: characters that cannot be encoded as ASCII are dropped
    strencode = text.encode("ascii", "ignore")
    # decode() method: convert the bytes back into a str
    strdecode = strencode.decode()
    # output
    print("Output after removing Unicode characters:", strdecode)

Output:

    Output after removing Unicode characters: This is Python  Pool

Explanation:

First, we store the input string in the variable named text (a name such as str is best avoided because it shadows the built-in type). Then we apply the encode() method, which encodes the string as 'ascii' with errors set to 'ignore', so the non-ASCII character is dropped. After that, we apply the decode() method, which converts the byte string back into a normal string. Finally, we print the output. The non-ASCII character is gone, while the spaces on either side of it remain.

2. Using replace() method to remove Unicode characters

In this example, we will use the replace() method to remove a Unicode character from the string. When you need to remove one particular Unicode character, you can call string.replace() with that character and an empty string as the replacement. Let us look at the example to understand the concept in detail.

    # input string
    text = "This is Python \u300c Pool"
    # replace() method: replace the character with an empty string
    strreplaced = text.replace("\u300c", "")
    # output
    print("Output after removing Unicode characters:", strreplaced)

Output:

    Output after removing Unicode characters: This is Python  Pool

Explanation:

First, we store the input string in the variable named text. Then we apply the replace() method, which replaces the particular Unicode character with an empty string. Finally, we print the output. Only the character passed to replace() is removed; any other non-ASCII characters in the string would be left in place.

3. Using character.isalnum() method to remove special characters in Python

In this example, we will use the character.isalnum() method to remove special characters from the string. Suppose we encounter a string that contains slashes, whitespace, or question marks. All of these special characters can be removed with this method. Note that isalnum() also returns True for non-ASCII letters and digits, so this approach removes punctuation and whitespace rather than non-ASCII letters. Let us look at the example to understand the concept in detail.
    # input string
    text = "This is /i !? Python pool tutorial?"
    output = ""
    for character in text:
        # keep only letters and digits
        if character.isalnum():
            output += character
    print(output)

Output:

    ThisisiPythonpooltutorial

Explanation:

First, we store the input string in the variable named text. Then we create an empty string in the variable named output. After that, we loop over the string from the first character to the last. For each character, we check the if condition and append the character to output only when it is alphanumeric. Finally, we print the output. All the special characters and whitespace have been removed from the string.

4. Using regular expression to remove specific Unicode characters in Python

In this example, we will use a regular expression (the re.sub() method) to remove specific Unicode characters from the string. This method takes three required parameters: pattern, repl (the replacement), and string. Let us look at the example to understand the concept in detail.

    # import re module
    import re
    # input string
    text = "Pyéthonò Poòol!"
    # re.sub() method: \xe9 matches é and \362 (octal) matches ò
    output = re.sub(r"(\xe9|\362)", "", text)
    # output
    print("Removing specific character:", output)

Output:

    Removing specific character: Python Pool!

Explanation:

First, we import the re module. Then we store the input string in the variable named text. Next, we apply the re.sub() method to remove the specific characters from the string and store the result in the output variable. Finally, we print the output, which is the string with the specified characters removed.

5. Using ord() method and for loop to remove Unicode characters in Python

In this example, we will use the ord() function and a for loop to remove the Unicode characters from the string. The ord() function accepts a string of length 1 as an argument and returns the Unicode code point of that character. Let us look at the example to understand the concept in detail.
    # input string
    text = "This is Python \u500c Pool"
    # ord() function: keep a character only if its code point is below 128
    output = "".join([i if ord(i) < 128 else "" for i in text])
    # output
    print("After removing Unicode character:", output)

Output:

    After removing Unicode character: This is Python  Pool

Explanation:

First, we store the input string in the variable named text. Then we call join() on a list built with a for loop: each character is kept if ord() returns a code point below 128 (the ASCII range) and is replaced with an empty string otherwise. The result is stored in the output variable. Finally, we print the output, in which the Unicode characters have been removed from the string.

Conclusion

In this tutorial, we have learned how to remove Unicode characters from a string. We have discussed several ways to do so, each explained in detail with the help of an example. You can use any of these functions according to your choice and the requirements of your program.
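As a recap, the sketch below runs three of the approaches above on the same input so that their results can be compared side by side. The function names (strip_with_encode, strip_with_ord, strip_with_regex) and the sample string are hypothetical and chosen only for illustration. The regular expression here is a variation on method 4: it is assumed that you want to remove every non-ASCII character rather than a fixed list of specific ones.

```python
import re


# Hypothetical helper names, used only for this illustration.
def strip_with_encode(text: str) -> str:
    """Method 1: encode to ASCII, ignoring characters that do not fit."""
    return text.encode("ascii", "ignore").decode("ascii")


def strip_with_ord(text: str) -> str:
    """Method 5: keep only characters whose code point is below 128."""
    return "".join(ch for ch in text if ord(ch) < 128)


def strip_with_regex(text: str) -> str:
    """Variation on method 4: remove any character outside the ASCII range."""
    return re.sub(r"[^\x00-\x7f]", "", text)


# Sample input containing three kinds of non-ASCII characters.
sample = "Pyéthon \u300c Poòol!"

results = {
    "encode/decode": strip_with_encode(sample),
    "ord": strip_with_ord(sample),
    "regex": strip_with_regex(sample),
}

for name, cleaned in results.items():
    # repr() makes the remaining whitespace visible
    print(f"{name}: {cleaned!r}")
```

All three functions should return the same string for this input, so the choice between them is mostly a matter of readability. The replace() approach from method 2 is left out because it targets one known character rather than the whole non-ASCII range.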


