Python Extract Substring Using Regex - Linux Hint


In a programming language, a regular expression (RE or regex) is a text string that describes a search pattern. It's well suited to extracting data from text files, logs, spreadsheets, and even papers. When using a Python regular expression, remember that everything is fundamentally a character. We create patterns that match a specific sequence of characters, generally referred to as a string. ASCII covers the Latin letters you see on your keyboard; Unicode, on the other hand, is what lets a pattern match text in other scripts. All numerals, punctuation, and special characters, such as $#@!, are included.

A Python regular expression may, for example, instruct a program to search a string for specified text and then print the result. A sequence of characters is known as a "string." Whether we're working on software or on competitive programming, we're constantly dealing with strings. While developing programs, we occasionally need to access sub-parts of a string. These sub-parts are called substrings: a substring is a contiguous part of a string. We can extract one by using string slicing or a regular expression (RE).

Regular expressions cover text matching, branching, repetition, and pattern building. In Python, regular expressions (RE or RegEx) are provided by the re module, which ships with the standard library. RegEx in Python supports identifiers, modifiers, and whitespace characters. To use regular expressions you must import the re module first; without the import, a call such as re.search() fails with a NameError.

We have structured this piece into three sections that do not strictly depend on each other, so you may go right into any of them, but if you are new to RegEx, we recommend reading them in order. We'll use the findall, search, and match functions of the re module to solve our problems throughout this post. Let's get started.
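To make the difference between string slicing and a regular expression concrete, here is a minimal sketch. The sample sentence, the date pattern, and the variable names are illustrative assumptions, not taken from the original article.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Order 4821 shipped on 2023-05-17"

# String slicing: works when the position of the substring is known in advance.
order_id_by_slice = text[6:10]

# Regular expression: works when only the shape of the substring is known.
# \d{4}-\d{2}-\d{2} describes four digits, a hyphen, two digits, a hyphen, two digits.
match = re.search(r"\d{4}-\d{2}-\d{2}", text)
ship_date = match.group() if match else None

print(order_id_by_slice)  # 4821
print(ship_date)          # 2023-05-17
```

Slicing is the simpler tool when the offsets never change; a pattern is the more robust choice when the substring can move around inside the text.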
Example 1: We will use a regular expression in Python to extract a substring in this example. We will use Python's built-in package re for regular expressions. The search() function looks for the first instance of the pattern supplied as an argument in the passed text. It gives you a Match object as a result, or None when nothing matches. The span of the substring, as well as its starting and ending indexes, are available from the Match object through span(), start(), and end(). It's worth noting that some attributes may be missing from the list that dir() prints for a Match object, because dir() calls the object's __dir__() method, which provides the list of attributes, and that method can be changed or overridden.

Example 2: We will apply the re.match() function in our next example. In Python, re.match() looks for the regular expression pattern at the beginning of the string only. If a match is found there, a Match object is returned. If the pattern occurs only later in the string, for instance on a later line, re.match() returns None, even when the re.MULTILINE flag is set. In the pattern for this example, "\w+" matches one or more word characters and "\W" matches a single non-word character; combined with a leading "g", the pattern matches words that begin with the letter "g," and anything that does not begin with the letter "g" is ignored. In this Python re.match() example, we use a for loop to check for a match in each element of the list.

Example 3: In our last example, we will use Python's findall() function. findall() searches for all instances of a pattern in a given input. In contrast, search() returns only the first occurrence that matches the pattern. findall() scans the whole input, every line included, and returns the non-overlapping matches as a list in a single step. In this example we have some text that contains e-mail addresses and we want to fetch only the e-mail addresses, so we use the re.findall() function for this purpose. It searches the entire text for e-mail addresses.
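The three functions described above can be sketched side by side as follows. This is a minimal illustration, not the article's original listings: the sample strings, the patterns (including the deliberately simplified e-mail pattern), and the variable names are all assumptions chosen for the example.

```python
import re

# Hypothetical two-line sample text.
text = "the gray goose\ngoes home"

# re.search(): first occurrence of the pattern anywhere in the string.
first = re.search(r"g\w+", text)
print(first.group(), first.span(), first.start(), first.end())

# re.match(): only succeeds if the pattern matches at the very start of the string.
at_start = re.match(r"g\w+", text)
print(at_start)  # None, because the text starts with "the"

# re.match() in a for loop: keep the leading word of items that begin with "g".
items = ["guru get", "give up", "run go"]
matched = []
for item in items:
    m = re.match(r"g\w+", item)
    matched.append(m.group() if m else None)
print(matched)

# re.findall(): every non-overlapping match, returned as a list of strings.
# The e-mail pattern is intentionally simple and not a full address validator.
mail_text = "Contact alice@example.com or bob@example.org; ignore this text."
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.-]+", mail_text)
print(emails)
```

Because search() and match() can return None, checking the result before calling group() avoids an AttributeError on input that does not contain the pattern.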
Conclusion: Regular expressions (RegEx) are useful for extracting character patterns from text and processing them. They are concise, and they save you time by replacing the hand-written loops you would otherwise need in your application to match and retrieve data. In this post, we have shown you how to use regular expressions in Python to tackle specific situations, with examples that use RegEx to address various text-processing challenges. We mostly focused on extracting words from strings.

About the author
Kalsoom Bibi
Hello, I am a freelance writer and usually write Linux and other technology-related content.


