perlre - Perl regular expressions - Perldoc Browser


# NAME

perlre - Perl regular expressions

# DESCRIPTION

This page describes the syntax of regular expressions in Perl.

If you haven't used regular expressions before, a tutorial introduction is available in perlretut. If you know just a little about them, a quick-start introduction is available in perlrequick.
Except for "The Basics" section, this page assumes you are familiar with regular expression basics, like what is a "pattern", what does it look like, and how it is basically used. For a reference on how they are used, plus various examples of the same, see discussions of m//, s///, qr// and "??" in "Regexp Quote-Like Operators" in perlop.

New in v5.22, use re 'strict' applies stricter rules than otherwise when compiling regular expression patterns. It can find things that, while legal, may not be what you intended.

# The Basics

Regular expressions are strings with the very particular syntax and meaning described in this document and auxiliary documents referred to by this one. The strings are called "patterns". Patterns are used to determine if some other string, called the "target", has (or doesn't have) the characteristics specified by the pattern. We call this "matching" the target string against the pattern. Usually the match is done by having the target be the first operand, and the pattern be the second operand, of one of the two binary operators =~ and !~, listed in "Binding Operators" in perlop; and the pattern will have been converted from an ordinary string by one of the operators in "Regexp Quote-Like Operators" in perlop, like so:

    $foo =~ m/abc/

This evaluates to true if and only if the string in the variable $foo contains somewhere in it, the sequence of characters "a", "b", then "c". (The =~ m, or match operator, is described in "m/PATTERN/msixpodualngc" in perlop.)

Patterns that aren't already stored in some variable must be delimited, at both ends, by delimiter characters. These are often, as in the example above, forward slashes, and the typical way a pattern is written in documentation is with those slashes. In most cases, the delimiter is the same character, fore and aft, but there are a few cases where a character looks like it has a mirror-image mate, where the opening version is the beginning delimiter, and the closing one is the ending delimiter, like

    $foo =~ m<abc>

Most times, the pattern is evaluated in double-quotish context, but it is possible to choose delimiters to force single-quotish, like

    $foo =~ m'abc'

If the pattern contains its delimiter within it, that delimiter must be escaped. Prefixing it with a backslash (e.g., "/foo\/bar/") serves this purpose.
Any single character in a pattern matches that same character in the target string, unless the character is a metacharacter with a special meaning described in this document. A sequence of non-metacharacters matches the same sequence in the target string, as we saw above with m/abc/.

Only a few characters (all of them being ASCII punctuation characters) are metacharacters. The most commonly used one is a dot ".", which normally matches almost any character (including a dot itself).

You can cause characters that normally function as metacharacters to be interpreted literally by prefixing them with a "\", just like the pattern's delimiter must be escaped if it also occurs within the pattern. Thus, "\." matches just a literal dot, "." instead of its normal meaning. This means that the backslash is also a metacharacter, so "\\" matches a single "\". And a sequence that contains an escaped metacharacter matches the same sequence (but without the escape) in the target string. So, the pattern /blur\\fl/ would match any target string that contains the sequence "blur\fl".

The metacharacter "|" is used to match one thing or another. Thus

    $foo =~ m/this|that/

is TRUE if and only if $foo contains either the sequence "this" or the sequence "that". Like all metacharacters, prefixing the "|" with a backslash makes it match the plain punctuation character; in its case, the VERTICAL LINE.

    $foo =~ m/this\|that/

is TRUE if and only if $foo contains the sequence "this|that".

You aren't limited to just a single "|".

    $foo =~ m/fee|fie|foe|fum/

is TRUE if and only if $foo contains any of those 4 sequences from the children's story "Jack and the Beanstalk".

As you can see, the "|" binds less tightly than a sequence of ordinary characters. We can override this by using the grouping metacharacters, the parentheses "(" and ")".

    $foo =~ m/th(is|at) thing/

is TRUE if and only if $foo contains either the sequence "this thing" or the sequence "that thing". The portions of the string that match the portions of the pattern enclosed in parentheses are normally made available separately for use later in the pattern, substitution, or program. This is called "capturing", and it can get complicated. See "Capture groups".
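The escaping, alternation, and grouping rules above can be checked with a short script. This is only an illustrative sketch: the helper name matches() and the sample strings are made up for the example and are not part of perlre.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $target matches $re, otherwise 0.
sub matches {
    my ($target, $re) = @_;
    return $target =~ $re ? 1 : 0;
}

my @results = (
    matches('this one',   qr/this|that/),        # alternation
    matches('this|that',  qr/this\|that/),       # escaped "|" is a literal VERTICAL LINE
    matches('this one',   qr/this\|that/),       # ...so a plain "this" no longer matches
    matches('that thing', qr/th(is|at) thing/),  # grouping limits the alternation
    matches('blur\\fl',   qr/blur\\fl/),         # "\\" matches one literal backslash
);
print "@results\n";

# The parentheses also capture: the first group holds what (is|at) matched.
my ($inner) = 'that thing' =~ /th(is|at) thing/;
print "$inner\n";
```

Precompiling the patterns with qr// keeps the helper generic; the same checks could be written inline with m//.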
The first alternative includes everything from the last pattern delimiter ("(", "(?:" (described later), etc. or the beginning of the pattern) up to the first "|", and the last alternative contains everything from the last "|" to the next closing pattern delimiter. That's why it's common practice to include alternatives in parentheses: to minimize confusion about where they start and end.

Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen. This means that alternatives are not necessarily greedy. For example: when matching foo|foot against "barefoot", only the "foo" part will match, as that is the first alternative tried, and it successfully matches the target string. (This might not seem important, but it is important when you are capturing matched text using parentheses.)

Besides taking away the special meaning of a metacharacter, a prefixed backslash changes some letter and digit characters away from matching just themselves to instead have special meaning. These are called "escape sequences", and all such are described in perlrebackslash. A backslash sequence (of a letter or digit) that doesn't currently have special meaning to Perl will raise a warning if warnings are enabled, as those are reserved for potential future use.

One such sequence is \b, which matches a boundary of some sort. \b{wb} and a few others give specialized types of boundaries. (They are all described in detail starting at "\b{}, \b, \B{}, \B" in perlrebackslash.) Note that these don't match characters, but the zero-width spaces between characters. They are an example of a zero-width assertion. Consider again,

    $foo =~ m/fee|fie|foe|fum/

It evaluates to TRUE if, besides those 4 words, any of the sequences "feed", "field", "Defoe", "fume", and many others are in $foo. By judicious use of \b (or better (because it is designed to handle natural language) \b{wb}), we can make sure that only the Giant's words are matched:

    $foo =~ m/\b(fee|fie|foe|fum)\b/
    $foo =~ m/\b{wb}(fee|fie|foe|fum)\b{wb}/

The final example shows that the characters "{" and "}" are metacharacters.
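A small sketch of the two points just made: alternatives are tried left to right, and \b keeps a pattern from matching inside longer words. The variable names and the sample sentence are illustrative assumptions only.

```perl
use strict;
use warnings;

# Alternatives are tried in order, so the order decides what is captured.
my ($first)  = 'barefoot' =~ /(foo|foot)/;   # "foo" is tried first and succeeds
my ($longer) = 'barefoot' =~ /(foot|foo)/;   # "foot" is tried first and succeeds

# Without \b, the Giant's words are found inside other words.
my $sample = 'a fumed Defoe';
my $loose  = $sample =~ /fee|fie|foe|fum/        ? 1 : 0;
my $bound  = $sample =~ /\b(fee|fie|foe|fum)\b/  ? 1 : 0;

# With \b, only the stand-alone words match.
my $giant  = 'fee fie foe fum' =~ /\b(fee|fie|foe|fum)\b/ ? 1 : 0;

print "$first $longer $loose $bound $giant\n";
```

Putting the longer alternative first is one common way to prefer the longer match when both alternatives can succeed at the same position.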
Another use for escape sequences is to specify characters that cannot (or which you prefer not to) be written literally. These are described in detail in "Character Escapes" in perlrebackslash, but the next three paragraphs briefly describe some of them.

Various control characters can be written in C language style: "\n" matches a newline, "\t" a tab, "\r" a carriage return, "\f" a form feed, etc.

More generally, \nnn, where nnn is a string of three octal digits, matches the character whose native code point is nnn. You can easily run into trouble if you don't have exactly three digits. So always use three, or since Perl 5.14, you can use \o{...} to specify any number of octal digits.

Similarly, \xnn, where nn are hexadecimal digits, matches the character whose native ordinal is nn. Again, not using exactly two digits is a recipe for disaster, but you can use \x{...} to specify any number of hex digits.

Besides being a metacharacter, the "." is an example of a "character class", something that can match any single character of a given set of them. In its case, the set is just about all possible characters. Perl predefines several character classes besides the "."; there is a separate reference page about just these, perlrecharclass.
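As a rough illustration of the escapes described above, the following sketch matches the same character written several ways. It assumes an ASCII platform (where "A" is code point 65, octal 101, hex 41) and Perl 5.14 or later for \o{...}; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $tab_ok    = "a\tb"     =~ /a\tb/      ? 1 : 0;  # C-style control escape
my $octal_ok  = "A"        =~ /\101/      ? 1 : 0;  # exactly three octal digits
my $obrace_ok = "A"        =~ /\o{101}/   ? 1 : 0;  # braced octal, any number of digits
my $hex_ok    = "A"        =~ /\x41/      ? 1 : 0;  # exactly two hex digits
my $xbrace_ok = "\x{263A}" =~ /\x{263A}/  ? 1 : 0;  # braced hex, any number of digits

# The dot is a character class: it matches almost anything, but not a newline.
my $dot_letter  = "A"  =~ /./ ? 1 : 0;
my $dot_newline = "\n" =~ /./ ? 1 : 0;

print "$tab_ok $octal_ok $obrace_ok $hex_ok $xbrace_ok $dot_letter $dot_newline\n";
```

The braced forms are usually the safer choice, since they cannot be confused with a following digit or with a backreference.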
You can define your own custom character classes, by putting into your pattern in the appropriate place(s), a list of all the characters you want in the set. You do this by enclosing the list within [] bracket characters. These are called "bracketed character classes" when we are being precise, but often the word "bracketed" is dropped. (Dropping it usually doesn't cause confusion.) This means that the "[" character is another metacharacter. It doesn't match anything just by itself; it is used only to tell Perl that what follows it is a bracketed character class. If you want to match a literal left square bracket, you must escape it, like "\[". The matching "]" is also a metacharacter; again it doesn't match anything by itself, but just marks the end of your custom class to Perl. It is an example of a "sometimes metacharacter". It isn't a metacharacter if there is no corresponding "[", and matches its literal self:

    print "]" =~ /]/;  # prints 1

The list of characters within the character class gives the set of characters matched by the class. "[abc]" matches a single "a" or "b" or "c". But if the first character after the "[" is "^", the class instead matches any character not in the list. Within a list, the "-" character specifies a range of characters, so that a-z represents all characters between "a" and "z", inclusive. If you want either "-" or "]" itself to be a member of a class, put it at the start of the list (possibly after a "^"), or escape it with a backslash. "-" is also taken literally when it is at the end of the list, just before the closing "]". (The following all specify the same class of three characters: [-az], [az-], and [a\-z]. All are different from [a-z], which specifies a class containing twenty-six characters, even on EBCDIC-based character sets.)

There is lots more to bracketed character classes; full details are in "Bracketed Character Classes" in perlrecharclass.

# Metacharacters

"The Basics" introduced some of the metacharacters. This section gives them all. Most of them have the same meaning as in the egrep command.
Only the "\" is always a metacharacter. The others are metacharacters just sometimes. The following table lists all of them, summarizes their use, and gives the contexts where they are metacharacters. Outside those contexts or if prefixed by a "\", they match their corresponding punctuation character. In some cases, their meaning varies depending on various pattern modifiers that alter the default behaviors. See "Modifiers".

            PURPOSE                                  WHERE
     \   Escape the next character                    Always, except when
                                                      escaped by another \
     ^   Match the beginning of the string            Not in []
           (or line, if /m is used)
     ^   Complement the [] class                      At the beginning of []
     .   Match any single character except newline    Not in []
           (under /s, includes newline)
     $   Match the end of the string                  Not in [], but can
           (or before newline at the end of the       mean interpolate a
           string; or before any newline if /m is     scalar
           used)
     |   Alternation                                  Not in []
     ()  Grouping                                     Not in []
     [   Start Bracketed Character class              Not in []
     ]   End Bracketed Character class                Only in [], and
                                                        not first
     *   Matches the preceding element 0 or more      Not in []
           times
     +   Matches the preceding element 1 or more      Not in []
           times
     ?   Matches the preceding element 0 or 1         Not in []
           times
     {   Starts a sequence that gives number(s)       Not in []
           of times the preceding element can be
           matched
     {   when following certain escape sequences
           starts a modifier to the meaning of the
           sequence
     }   End sequence started by {
     -   Indicates a range                            Only in [] interior
     #   Beginning of comment, extends to line end    Only with /x modifier

Notice that most of the metacharacters lose their special meaning when they occur in a bracketed character class, except "^" has a different meaning when it is at the beginning of such a class. And "-" and "]" are metacharacters only at restricted positions within bracketed character classes; while "}" is a metacharacter only when closing a special construct started by "{".

In double-quotish context, as is usually the case, you need to be careful about "$" and the non-metacharacter "@". Those could interpolate variables, which may or may not be what you intended.
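The interpolation caveat can be seen in a short sketch. The variable $name and the sample strings are hypothetical; \Q...\E is the standard quoting escape that disables metacharacters in interpolated text.

```perl
use strict;
use warnings;

my $name = 'a.c';    # hypothetical text that happens to contain a metacharacter

# Interpolated as-is, the "." is still a metacharacter and matches "b".
my $as_pattern = 'abc' =~ /^$name$/      ? 1 : 0;

# \Q...\E quotes the interpolated text, so the "." must match literally.
my $as_literal = 'abc' =~ /^\Q$name\E$/  ? 1 : 0;
my $exact      = 'a.c' =~ /^\Q$name\E$/  ? 1 : 0;

# A backslash keeps "$" and "@" from being treated as the start of a variable.
my $dollar = 'cost: $5'       =~ /\$5/          ? 1 : 0;
my $at     = 'me@example.org' =~ /me\@example/  ? 1 : 0;

print "$as_pattern $as_literal $exact $dollar $at\n";
```

Whether interpolated text should act as a pattern or as a literal string depends on the program; the point is only that the choice has to be made explicitly.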
These rules were designed for compactness of expression, rather than legibility and maintainability. The "/x and /xx" pattern modifiers allow you to insert white space to improve readability. And use of re 'strict' adds extra checking to catch some typos that might silently compile into something unintended.

By default, the "^" character is guaranteed to match only the beginning of the string, the "$" character only the end (or before the newline at the end), and Perl does certain optimizations with the assumption that the string contains only one line. Embedded newlines will not be matched by "^" or "$". You may, however, wish to treat a string as a multi-line buffer, such that the "^" will match after any newline within the string (except if the newline is the last character in the string), and "$" will match before any newline. At the cost of a little more overhead, you can do this by using the "/m" modifier on the pattern match operator. (Older programs did this by setting $*, but this option was removed in perl 5.10.)

To simplify multi-line substitutions, the "." character never matches a newline unless you use the /s modifier, which in effect tells Perl to pretend the string is a single line, even if it isn't.

# Modifiers

# Overview

The default behavior for matching can be changed, using various modifiers. Modifiers that relate to the interpretation of the pattern are listed just below. Modifiers that alter the way a pattern is used by Perl are detailed in "Regexp Quote-Like Operators" in perlop and "Gory details of parsing quoted constructs" in perlop. Modifiers can be added dynamically; see "Extended Patterns" below.

# m

Treat the string being matched against as multiple lines. That is, change "^" and "$" from matching the start of the string's first line and the end of its last line to matching the start and end of each line within the string.

# s

Treat the string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.

Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.

# i

Do case-insensitive pattern matching. For example, "A" will match "a" under /i.
If locale matching rules are in effect, the case map is taken from the current locale for code points less than 255, and from Unicode rules for larger code points. However, matches that would cross the Unicode rules/non-Unicode rules boundary (ords 255/256) will not succeed, unless the locale is a UTF-8 one. See perllocale.

There are a number of Unicode characters that match a sequence of multiple characters under /i. For example, LATIN SMALL LIGATURE FI should match the sequence fi. Perl is not currently able to do this when the multiple characters are in the pattern and are split between groupings, or when one or more are quantified. Thus

    "\N{LATIN SMALL LIGATURE FI}" =~ /fi/i;          # Matches
    "\N{LATIN SMALL LIGATURE FI}" =~ /[fi][fi]/i;    # Doesn't match!
    "\N{LATIN SMALL LIGATURE FI}" =~ /fi*/i;         # Doesn't match!

    # The below doesn't match, and it isn't clear what $1 and $2 would
    # be even if it did!!
    "\N{LATIN SMALL LIGATURE FI}" =~ /(f)(i)/i;      # Doesn't match!

Perl doesn't match multiple characters in a bracketed character class unless the character that maps to them is explicitly mentioned, and it doesn't match them at all if the character class is inverted, which otherwise could be highly confusing. See "Bracketed Character Classes" in perlrecharclass, and "Negation" in perlrecharclass.

# x and xx

Extend your pattern's legibility by permitting whitespace and comments. Details in "/x and /xx".

# p

Preserve the string matched such that ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} are available for use after matching.

In Perl 5.20 and higher this is ignored. Due to a new copy-on-write mechanism, ${^PREMATCH}, ${^MATCH}, and ${^POSTMATCH} will be available after the match regardless of the modifier.

# a, d, l, and u

These modifiers, all new in 5.14, affect which character-set rules (Unicode, etc.) are used, as described below in "Character set modifiers".

# n

Prevent the grouping metacharacters () from capturing. This modifier, new in 5.22, will stop $1, $2, etc... from being filled in.

    "hello" =~ /(hi|hello)/;   # $1 is "hello"
    "hello" =~ /(hi|hello)/n;  # $1 is undef

This is equivalent to putting ?: at the beginning of every capturing group:

    "hello" =~ /(?:hi|hello)/; # $1 is undef

/n can be negated on a per-group basis. Alternatively, named captures may still be used.

    "hello" =~ /(?-n:(hi|hello))/n;   # $1 is "hello"
    "hello" =~ /(?<greet>hi|hello)/n; # $1 is "hello", $+{greet} is
                                      # "hello"

# Other Modifiers

There are a number of flags that can be found at the end of regular expression constructs that are not generic regular expression flags, but apply to the operation being performed, like matching or substitution (m// or s/// respectively).

Flags described further in "Using regular expressions in Perl" in perlretut are:

    c  - keep the current position during repeated matching
    g  - globally match the pattern repeatedly in the string

Substitution-specific modifiers described in "s/PATTERN/REPLACEMENT/msixpodualngcer" in perlop are:

    e  - evaluate the right-hand side as an expression
    ee - evaluate the right side as a string then eval the result
    o  - pretend to optimize your code, but actually introduce bugs
    r  - perform non-destructive substitution and return the new value

Regular expression modifiers are usually written in documentation as e.g., "the /x modifier", even though the delimiter in question might not really be a slash. The modifiers /imnsxadlup may also be embedded within the regular expression itself using the (?...) construct, see "Extended Patterns" below.

# Details on some modifiers

Some of the modifiers require more explanation than given in the "Overview" above.

# /x and /xx

A single /x tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a bracketed character class. You can use this to break up your regular expression into more readable parts. Also, the "#" character is treated as a metacharacter introducing a comment that runs up to the pattern's closing delimiter, or to the end of the current line if the pattern extends onto the next line. Hence, this is very much like an ordinary Perl code comment. (You can include the closing delimiter within the comment only if you precede it with a backslash, so be careful!)

Use of /x means that if you want real whitespace or "#" characters in the pattern (outside a bracketed character class, which is unaffected by /x), then you'll either have to escape them (using backslashes or \Q...\E) or encode them using octal, hex, or \N{} or \p{name=...} escapes. It is ineffective to try to continue a comment onto the next line by escaping the \n with a backslash or \Q.
You can use "(?#text)" to create a comment that ends earlier than the end of the current line, but text also can't contain the closing delimiter unless escaped with a backslash.

A common pitfall is to forget that "#" characters (outside a bracketed character class) begin a comment under /x and are not matched literally. Just keep that in mind when trying to puzzle out why a particular /x pattern isn't working as expected. Inside a bracketed character class, "#" retains its non-special, literal meaning.

Starting in Perl v5.26, if the modifier has a second "x" within it, the effect of a single /x is increased. The only difference is that inside bracketed character classes, non-escaped (by a backslash) SPACE and TAB characters are not added to the class, and hence can be inserted to make the classes more readable:

    / [d-e g-i 3-7]/xx
    /[ ! @ " # $ % ^ & * () = ? <> ' ]/xx

may be easier to grasp than the squashed equivalents

    /[d-eg-i3-7]/
    /[!@"#$%^&*()=?<>']/

Note that this unfortunately doesn't mean that your bracketed classes can contain comments or extend over multiple lines. A # inside a character class is still just a literal #, and doesn't introduce a comment. And, unless the closing bracket is on the same line as the opening one, the newline character (and everything on the next line(s) until terminated by a ]) will be part of the class, just as if you'd written \n.

Taken together, these features go a long way towards making Perl's regular expressions more readable. Here's an example:

    # Delete (most) C comments.
    $program =~ s {
        /\*     # Match the opening delimiter.
        .*?     # Match a minimal number of characters.
        \*/     # Match the closing delimiter.
    } []gsx;

Note that anything inside a \Q...\E stays unaffected by /x. And note that /x doesn't affect space interpretation within a single multi-character construct. For example (?:...) can't have a space between the "(", "?", and ":". Within any delimiters for such a construct, allowed spaces are not affected by /x, and depend on the construct. For example, all constructs using curly braces as delimiters, such as \x{...} can have blanks within but adjacent to the braces, but not elsewhere, and no non-blank space characters. An exception are Unicode properties which follow Unicode rules, for which see "Properties accessible through \p{} and \P{}" in perluniprops.
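The following sketch puts the /x and /xx rules described above to work on a made-up "key = value #comment" string. The pattern, the variable names, and the input are assumptions for illustration; the /xx part needs Perl v5.26 or later.

```perl
use strict;
use warnings;

# Under a single x, blanks and "#" comments in the pattern are ignored,
# so a literal space or "#" has to be escaped.
my $pair = qr/
    ^ \s* (\w+)     # the key
    \s* = \s*       # an equals sign with optional blanks around it
    ( .*? )         # the value, matched minimally
    \ \#            # an escaped space and an escaped "#" match literally
/x;

my ($key, $value) = 'color = red #ff0000' =~ $pair;
print "$key => $value\n";

# With a doubled x, blanks inside a bracketed class are ignored as well,
# so this class holds only d-e, g-i and 3-7 (no SPACE).
my $class = qr/^[ d-e g-i 3-7 ]$/xx;
my @hits  = grep { $_ =~ $class } ('d', 'f', 'h', ' ', '5');
print "@hits\n";
```

Note that the comments inside the pattern avoid the closing delimiter, as the text above warns.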
The set of characters that are deemed whitespace are those that Unicode calls "Pattern White Space", namely:

    U+0009 CHARACTER TABULATION
    U+000A LINE FEED
    U+000B LINE TABULATION
    U+000C FORM FEED
    U+000D CARRIAGE RETURN
    U+0020 SPACE
    U+0085 NEXT LINE
    U+200E LEFT-TO-RIGHT MARK
    U+200F RIGHT-TO-LEFT MARK
    U+2028 LINE SEPARATOR
    U+2029 PARAGRAPH SEPARATOR

# Character set modifiers

/d, /u, /a, and /l, available starting in 5.14, are called the character set modifiers; they affect the character set rules used for the regular expression.

The /d, /u, and /l modifiers are not likely to be of much use to you, and so you need not worry about them very much. They exist for Perl's internal use, so that complex regular expression data structures can be automatically serialized and later exactly reconstituted, including all their nuances. But, since Perl can't keep a secret, and there may be rare instances where they are useful, they are documented here.

The /a modifier, on the other hand, may be useful. Its purpose is to allow code that is to work mostly on ASCII data to not have to concern itself with Unicode.

Briefly, /l sets the character set to that of whatever Locale is in effect at the time of the execution of the pattern match.

/u sets the character set to Unicode.

/a also sets the character set to Unicode, BUT adds several restrictions for ASCII-safe matching.

/d is the old, problematic, pre-5.14 Default character set behavior. Its only use is to force that old behavior.

At any given time, exactly one of these modifiers is in effect. Their existence allows Perl to keep the originally compiled behavior of a regular expression, regardless of what rules are in effect when it is actually executed. And if it is interpolated into a larger regex, the original's rules continue to apply to it, and don't affect the other parts.
The /l and /u modifiers are automatically selected for regular expressions compiled within the scope of various pragmas, and we recommend that in general, you use those pragmas instead of specifying these modifiers explicitly. For one thing, the modifiers affect only pattern matching, and do not extend to even any replacement done, whereas using the pragmas gives consistent results for all appropriate operations within their scopes. For example,

    s/foo/\Ubar/il

will match "foo" using the locale's rules for case-insensitive matching, but the /l does not affect how the \U operates. Most likely you want both of them to use locale rules. To do this, instead compile the regular expression within the scope of "use locale". This both implicitly adds the /l, and applies locale rules to the \U. The lesson is to "use locale", and not /l explicitly.

Similarly, it would be better to use "use feature 'unicode_strings'" instead of,

    s/foo/\Lbar/iu

to get Unicode rules, as the \L in the former (but not necessarily the latter) would also use Unicode rules.

More detail on each of the modifiers follows. Most likely you don't need to know this detail for /l, /u, and /d, and can skip ahead to /a.

# /l

means to use the current locale's rules (see perllocale) when pattern matching. For example, \w will match the "word" characters of that locale, and "/i" case-insensitive matching will match according to the locale's case folding rules. The locale used will be the one in effect at the time of execution of the pattern match. This may not be the same as the compilation-time locale, and can differ from one match to another if there is an intervening call of the setlocale() function.

Prior to v5.20, Perl did not support multi-byte locales. Starting then, UTF-8 locales are supported. No other multi-byte locales are ever likely to be supported. However, in all locales, one can have code points above 255 and these will always be treated as Unicode no matter what locale is in effect.
Under Unicode rules, there are a few case-insensitive matches that cross the 255/256 boundary. Except for UTF-8 locales in Perls v5.20 and later, these are disallowed under /l. For example, 0xFF (on ASCII platforms) does not caselessly match the character at 0x178, LATIN CAPITAL LETTER Y WITH DIAERESIS, because 0xFF may not be LATIN SMALL LETTER Y WITH DIAERESIS in the current locale, and Perl has no way of knowing if that character even exists in the locale, much less what code point it is.

In a UTF-8 locale in v5.20 and later, the only visible difference between locale and non-locale in regular expressions should be tainting, if your perl supports taint checking (see perlsec).

This modifier may be specified to be the default by "use locale", but see "Which character set modifier is in effect?".

# /u

means to use Unicode rules when pattern matching. On ASCII platforms, this means that the code points between 128 and 255 take on their Latin-1 (ISO-8859-1) meanings (which are the same as Unicode's). (Otherwise Perl considers their meanings to be undefined.) Thus, under this modifier, the ASCII platform effectively becomes a Unicode platform; and hence, for example, \w will match any of the more than 100_000 word characters in Unicode.
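A minimal sketch of the /u behavior just described, assuming an ASCII platform. The sample characters (U+00E9 and U+09EA) and the variable names are chosen only for illustration.

```perl
use strict;
use warnings;

my $e_acute = "\xE9";         # LATIN SMALL LETTER E WITH ACUTE, in the 128..255 range
utf8::downgrade($e_acute);    # make sure the string is not UTF8-flagged

# Under /u the code point takes its Latin-1 meaning, so it counts as a word character.
my $word_u = $e_acute =~ /^\w$/u ? 1 : 0;

# Under Unicode rules, \d is not limited to the ten ASCII digits.
my $bengali = "\x{09EA}"  =~ /^\d$/u  ? 1 : 0;   # BENGALI DIGIT FOUR
my $mixed   = "5\x{09EA}" =~ /^\d+$/u ? 1 : 0;   # digits from two writing systems

print "$word_u $bengali $mixed\n";
```

The last line shows how a run of digits from mixed writing systems can still satisfy \d+ under Unicode rules.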
Unlike most locales, which are specific to a language and country pair, Unicode classifies all the characters that are letters somewhere in the world as \w. For example, your locale might not think that LATIN SMALL LETTER ETH is a letter (unless you happen to speak Icelandic), but Unicode does. Similarly, all the characters that are decimal digits somewhere in the world will match \d; this is hundreds, not 10, possible matches. And some of those digits look like some of the 10 ASCII digits, but mean a different number, so a human could easily think a number is a different quantity than it really is. For example, BENGALI DIGIT FOUR (U+09EA) looks very much like an ASCII DIGIT EIGHT (U+0038), and LEPCHA DIGIT SIX (U+1C46) looks very much like an ASCII DIGIT FIVE (U+0035). And, \d+, may match strings of digits that are a mixture from different writing systems, creating a security issue. A fraudulent website, for example, could display the price of something using U+1C46, and it would appear to the user that something cost 500 units, but it really costs 600. A browser that enforced script runs ("Script Runs") would prevent that fraudulent display. "num()" in Unicode::UCD can also be used to sort this out. Or the /a modifier can be used to force \d to match just the ASCII 0 through 9.

Also, under this modifier, case-insensitive matching works on the full set of Unicode characters. The KELVIN SIGN, for example, matches the letters "k" and "K"; and LATIN SMALL LIGATURE FF matches the sequence "ff", which, if you're not prepared, might make it look like a hexadecimal constant, presenting another potential security issue. See https://unicode.org/reports/tr36 for a detailed discussion of Unicode security issues.

This modifier may be specified to be the default by "use feature 'unicode_strings'", "use locale ':not_characters'", or "use v5.12" (or higher), but see "Which character set modifier is in effect?".

# /d

IMPORTANT: Because of the unpredictable behaviors this modifier causes, only use it to maintain weird backward compatibilities. Use the unicode_strings feature in new code to avoid inadvertently enabling this modifier by default.

What does this modifier do? It "Depends"!
This modifier means to use platform-native matching rules except when there is cause to use Unicode rules instead, as follows:

1. the target string's UTF8 flag (see below) is set; or
2. the pattern's UTF8 flag (see below) is set; or
3. the pattern explicitly mentions a code point that is above 255 (say by \x{100}); or
4. the pattern uses a Unicode name (\N{...}); or
5. the pattern uses a Unicode property (\p{...} or \P{...}); or
6. the pattern uses a Unicode break (\b{...} or \B{...}); or
7. the pattern uses "(?[ ])"; or
8. the pattern uses (*script_run: ...)

Regarding the "UTF8 flag" references above: normally Perl applications shouldn't think about that flag. It's part of Perl's internals, so it can change whenever Perl wants. /d may thus cause unpredictable results. See "The "Unicode Bug"" in perlunicode. This bug has become rather infamous, leading to yet other (without swearing) names for this modifier like "Dicey" and "Dodgy".

Here are some examples of how that works on an ASCII platform:

    $str =  "\xDF";
    utf8::downgrade($str); # $str is not UTF8-flagged.
    $str =~ /^\w/;         # No match, since no UTF8 flag.

    $str .= "\x{0e0b}";    # Now $str is UTF8-flagged.
    $str =~ /^\w/;         # Match! $str is now UTF8-flagged.
    chop $str;
    $str =~ /^\w/;         # Still a match! $str retains its UTF8 flag.

Under Perl's default configuration this modifier is automatically selected by default when none of the others are, so yet another name for it (unfortunately) is "Default".

Whenever you can, use the unicode_strings feature to cause /u to be the default instead.

# /a (and /aa)

This modifier stands for ASCII-restrict (or ASCII-safe). This modifier may be doubled-up to increase its effect.

When it appears singly, it causes the sequences \d, \s, \w, and the Posix character classes to match only in the ASCII range. They thus revert to their pre-5.6, pre-Unicode meanings. Under /a, \d always means precisely the digits "0" to "9"; \s means the five characters [ \f\n\r\t], and starting in Perl v5.18, the vertical tab; \w means the 63 characters [A-Za-z0-9_]; and likewise, all the Posix classes such as [[:print:]] match only the appropriate ASCII-range characters.

This modifier is useful for people who only incidentally use Unicode, and who do not wish to be burdened with its complexities and security concerns.
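The effect of a single /a on \d, \w, and the Posix classes can be sketched as follows. The sample characters and variable names are illustrative assumptions; the first check relies on the target string being UTF8-flagged (it holds a code point above 255), which is what triggers Unicode rules without /a.

```perl
use strict;
use warnings;

my $bengali_four = "\x{09EA}";    # BENGALI DIGIT FOUR, a non-ASCII decimal digit

my $without_a = $bengali_four =~ /^\d$/   ? 1 : 0;  # Unicode rules apply to this string
my $with_a    = $bengali_four =~ /^\d$/a  ? 1 : 0;  # /a restricts \d to "0".."9"
my $ascii_9   = '9'           =~ /^\d$/a  ? 1 : 0;  # ASCII digits still match

# \w and the Posix classes are restricted in the same way.
my $word_a  = "\xE9" =~ /^\w$/a           ? 1 : 0;
my $alpha_a = "\xE9" =~ /^[[:alpha:]]$/a  ? 1 : 0;

print "$without_a $with_a $ascii_9 $word_a $alpha_a\n";
```

This is the kind of check where /a is most useful: validating input that is expected to be plain ASCII digits or identifiers.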
With /a, one can write \d with confidence that it will only match ASCII characters, and should the need arise to match beyond ASCII, you can instead use \p{Digit} (or \p{Word} for \w). There are similar \p{...} constructs that can match beyond ASCII both white space (see "Whitespace" in perlrecharclass), and Posix classes (see "POSIX Character Classes" in perlrecharclass). Thus, this modifier doesn't mean you can't use Unicode, it means that to get Unicode matching you must explicitly use a construct (\p{}, \P{}) that signals Unicode.

As you would expect, this modifier causes, for example, \D to mean the same thing as [^0-9]; in fact, all non-ASCII characters match \D, \S, and \W. \b still means to match at the boundary between \w and \W, using the /a definitions of them (similarly for \B).

Otherwise, /a behaves like the /u modifier, in that case-insensitive matching uses Unicode rules; for example, "k" will match the Unicode \N{KELVIN SIGN} under /i matching, and code points in the Latin1 range, above ASCII will have Unicode rules when it comes to case-insensitive matching.

To forbid ASCII/non-ASCII matches (like "k" with \N{KELVIN SIGN}), specify the "a" twice, for example /aai or /aia. (The first occurrence of "a" restricts the \d, etc., and the second occurrence adds the /i restrictions.) But, note that code points outside the ASCII range will use Unicode rules for /i matching, so the modifier doesn't really restrict things to just ASCII; it just forbids the intermixing of ASCII and non-ASCII.

To summarize, this modifier provides protection for applications that don't wish to be exposed to all of Unicode. Specifying it twice gives added protection.

This modifier may be specified to be the default by "use re '/a'" or "use re '/aa'". If you do so, you may actually have occasion to use the /u modifier explicitly if there are a few regular expressions where you do want full Unicode rules (but even here, it's best if everything were under feature "unicode_strings", along with the "use re '/aa'"). Also see "Which character set modifier is in effect?".

# Which character set modifier is in effect?
Which of these modifiers is in effect at any given point in a regular expression depends on a fairly complex set of interactions. These have been designed so that in general you don't have to worry about it, but this section gives the gory details. As explained below in "Extended Patterns" it is possible to explicitly specify modifiers that apply only to portions of a regular expression. The innermost always has priority over any outer ones, and one applying to the whole expression has priority over any of the default settings that are described in the remainder of this section.

The "use re '/foo'" pragma can be used to set default modifiers (including these) for regular expressions compiled within its scope. This pragma has precedence over the other pragmas listed below that also change the defaults.

Otherwise, "use locale" sets the default modifier to /l; and "use feature 'unicode_strings'", or "use v5.12" (or higher) set the default to /u when not in the same scope as either "use locale" or "use bytes". ("use locale ':not_characters'" also sets the default to /u, overriding any plain "use locale".) Unlike the mechanisms mentioned above, these affect operations besides regular expression pattern matching, and so give more consistent results with other operators, including using \U, \l, etc. in substitution replacements.

If none of the above apply, for backwards compatibility reasons, the /d modifier is the one in effect by default. As this can lead to unexpected results, it is best to specify which other rule set should be used.

# Character set modifier behavior prior to Perl 5.14

Prior to 5.14, there were no explicit modifiers, but /l was implied for regexes compiled within the scope of "use locale", and /d was implied otherwise. However, interpolating a regex into a larger regex would ignore the original compilation in favor of whatever was in effect at the time of the second compilation. There were a number of inconsistencies (bugs) with the /d modifier, where Unicode rules would be used when inappropriate, and vice versa. \p{} did not imply Unicode rules, and neither did all occurrences of \N{}, until 5.12.
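One way to see which character set modifier a pattern was compiled with, on Perl 5.14 or later, is to print a qr// object: its string form shows any non-default modifiers after the "(?^". The sketch below relies on that stringification, which is not described in this section, and the variable names are illustrative; it assumes the script is run with no other default-changing pragmas in scope.

```perl
use strict;
use warnings;

my $default = qr/x/;                # no pragma in scope: /d is in effect

my $unicode;
{
    use feature 'unicode_strings';  # sets the default to /u in this block
    $unicode = qr/x/;
}

my $ascii;
{
    use re '/aa';                   # sets the default to /aa in this block
    $ascii = qr/x/;
}

my $explicit;
{
    use re '/aa';
    $explicit = qr/x/u;             # a modifier on the pattern overrides the default
}

print "$default $unicode $ascii $explicit\n";
```

Because the pragmas are lexically scoped, each block compiles its pattern under a different default, and each compiled pattern keeps its own rules afterwards.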
# Regular Expressions

# Quantifiers

Quantifiers are used when a particular portion of a pattern needs to match a certain number (or numbers) of times. If there isn't a quantifier the number of times to match is exactly one. The following standard quantifiers are recognized:

    *           Match 0 or more times
    +           Match 1 or more times
    ?           Match 1 or 0 times
    {n}         Match exactly n times
    {n,}        Match at least n times
    {,n}        Match at most n times
    {n,m}       Match at least n but not more than m times

(If a non-escaped curly bracket occurs in a context other than one of the quantifiers listed above, where it does not form part of a backslashed sequence like \x{...}, it is either a fatal syntax error, or treated as a regular character, generally with a deprecation warning raised. To escape it, you can precede it with a backslash ("\{") or enclose it within square brackets ("[{]"). This change will allow for future syntax extensions (like making the lower bound of a quantifier optional), and better error checking of quantifiers).

The "*" quantifier is equivalent to {0,}, the "+" quantifier to {1,}, and the "?" quantifier to {0,1}. n and m are limited to non-negative integral values less than a preset limit defined when perl is built. This is usually 65534 on the most common platforms. The actual limit can be seen in the error message generated by code such as this:

    $_ **= $_ , / {$_} / for 2 .. 42;

By default, a quantified subpattern is "greedy", that is, it will match as many times as possible (given a particular starting location) while still allowing the rest of the pattern to match. If you want it to match the minimum number of times possible, follow the quantifier with a "?". Note that the meanings don't change, just the "greediness":

    *?        Match 0 or more times, not greedily
    +?        Match 1 or more times, not greedily
    ??        Match 0 or 1 time, not greedily
    {n}?      Match exactly n times, not greedily (redundant)
    {n,}?     Match at least n times, not greedily
    {,n}?     Match at most n times, not greedily
    {n,m}?    Match at least n but not more than m times, not greedily

Normally when a quantified subpattern does not allow the rest of the overall pattern to match, Perl will backtrack. However, this behaviour is sometimes undesirable. Thus Perl provides the "possessive" quantifier form as well.
    *+          Match 0 or more times and give nothing back
    ++          Match 1 or more times and give nothing back
    ?+          Match 0 or 1 time and give nothing back
    {n}+        Match exactly n times and give nothing back (redundant)
    {n,}+       Match at least n times and give nothing back
    {,n}+       Match at most n times and give nothing back
    {n,m}+      Match at least n but not more than m times and give nothing back

For instance,

    'aaaa' =~ /a++a/

will never match, as the a++ will gobble up all the "a"'s in the string and won't leave any for the remaining part of the pattern. This feature can be extremely useful to give perl hints about where it shouldn't backtrack. For instance, the typical "match a double-quoted string" problem can be most efficiently performed when written as:

    /"(?:[^"\\]++|\\.)*+"/

as we know that if the final quote does not match, backtracking will not help. See the independent subexpression "(?>pattern)" for more details; possessive quantifiers are just syntactic sugar for that construct. For instance the above example could also be written as follows:

    /"(?>(?:(?>[^"\\]+)|\\.)*)"/

Note that the possessive quantifier modifier cannot be combined with the non-greedy modifier. This is because it would make no sense. Consider the following equivalency table:

    Illegal         Legal
    ------------    ------
    X??+            X{0}
    X+?+            X{1}
    X{min,max}?+    X{min}

#Escape sequences

Because patterns are processed as double-quoted strings, the following also work:

    \t          tab                   (HT, TAB)
    \n          newline               (LF, NL)
    \r          return                (CR)
    \f          form feed             (FF)
    \a          alarm (bell)          (BEL)
    \e          escape (think troff)  (ESC)
    \cK         control char          (example: VT)
    \x{}, \x00  character whose ordinal is the given hexadecimal number
    \N{name}    named Unicode character or character sequence
    \N{U+263D}  Unicode character     (example: FIRST QUARTER MOON)
    \o{}, \000  character whose ordinal is the given octal number
    \l          lowercase next char (think vi)
    \u          uppercase next char (think vi)
    \L          lowercase until \E (think vi)
    \U          uppercase until \E (think vi)
    \Q          quote (disable) pattern metacharacters until \E
    \E          end either case modification or quoted section, think vi

Details are in "Quote and Quote-like Operators" in perlop.
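Returning briefly to the quantifier forms described under "Quantifiers" above, the following sketch contrasts greedy, non-greedy, and possessive matching on small targets. The sample strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = '<b>bold</b> and <i>italic</i>';

# Greedy: .* takes as much as it can, then backs off to the last '>'.
my ($greedy) = $text =~ /<(.*)>/;

# Non-greedy: .*? stops at the first '>' that lets the match succeed.
my ($lazy) = $text =~ /<(.*?)>/;

# Possessive: a++ keeps every "a", so the trailing "a" can never match.
my $possessive_matches = ('aaaa' =~ /a++a/) ? 1 : 0;

# Plain greedy a+ gives one "a" back, so the same target matches.
my $greedy_matches = ('aaaa' =~ /a+a/) ? 1 : 0;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "a++a: $possessive_matches, a+a: $greedy_matches\n";
```

The possessive form is mainly an efficiency hint: it is worth using only where giving characters back could never produce a match anyway.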
#Character Classes and other Special Escapes

In addition, Perl defines the following:

    Sequence   Note  Description
    [...]      [1]   Match a character according to the rules of the
                     bracketed character class defined by the "...".
                     Example: [a-z] matches "a" or "b" or "c" ... or "z"
    [[:...:]]  [2]   Match a character according to the rules of the POSIX
                     character class "..." within the outer bracketed
                     character class.  Example: [[:upper:]] matches any
                     uppercase character.
    (?[...])   [8]   Extended bracketed character class
    \w         [3]   Match a "word" character (alphanumeric plus "_", plus
                     other connector punctuation chars plus Unicode
                     marks)
    \W         [3]   Match a non-"word" character
    \s         [3]   Match a whitespace character
    \S         [3]   Match a non-whitespace character
    \d         [3]   Match a decimal digit character
    \D         [3]   Match a non-digit character
    \pP        [3]   Match P, named property.  Use \p{Prop} for longer names
    \PP        [3]   Match non-P
    \X         [4]   Match Unicode "eXtended grapheme cluster"
    \1         [5]   Backreference to a specific capture group or buffer.
                     '1' may actually be any positive integer.
    \g1        [5]   Backreference to a specific or previous group,
    \g{-1}     [5]   The number may be negative indicating a relative
                     previous group and may optionally be wrapped in
                     curly brackets for safer parsing.
    \g{name}   [5]   Named backreference
    \k<name>   [5]   Named backreference
    \k'name'   [5]   Named backreference
    \k{name}   [5]   Named backreference
    \K         [6]   Keep the stuff left of the \K, don't include it in $&
    \N         [7]   Any character but \n.  Not affected by /s modifier
    \v         [3]   Vertical whitespace
    \V         [3]   Not vertical whitespace
    \h         [3]   Horizontal whitespace
    \H         [3]   Not horizontal whitespace
    \R         [4]   Linebreak

#[1]
See "Bracketed Character Classes" in perlrecharclass for details.

#[2]
See "POSIX Character Classes" in perlrecharclass for details.

#[3]
See "Unicode Character Properties" in perlunicode for details.

#[4]
See "Misc" in perlrebackslash for details.

#[5]
See "Capture groups" below for details.

#[6]
See "Extended Patterns" below for details.

#[7]
Note that \N has two meanings. When of the form \N{NAME}, it matches the character or character sequence whose name is NAME; and similarly when of the form \N{U+hex}, it matches the character whose Unicode code point is hex. Otherwise it matches any character but \n.
#[8]
See "Extended Bracketed Character Classes" in perlrecharclass for details.

#Assertions

Besides "^" and "$", Perl defines the following zero-width assertions:

    \b{}   Match at Unicode boundary of specified type
    \B{}   Match where corresponding \b{} doesn't match
    \b     Match a \w\W or \W\w boundary
    \B     Match except at a \w\W or \W\w boundary
    \A     Match only at beginning of string
    \Z     Match only at end of string, or before newline at the end
    \z     Match only at end of string
    \G     Match only at pos() (e.g. at the end-of-match position
           of prior m//g)

A Unicode boundary (\b{}), available starting in v5.22, is a spot between two characters, or before the first character in the string, or after the final character in the string where certain criteria defined by Unicode are met. See "\b{}, \b, \B{}, \B" in perlrebackslash for details.

A word boundary (\b) is a spot between two characters that has a \w on one side of it and a \W on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a \W. (Within character classes \b represents backspace rather than a word boundary, just as it normally does in any double-quoted string.) The \A and \Z are just like "^" and "$", except that they won't match multiple times when the /m modifier is used, while "^" and "$" will match at every internal line boundary. To match the actual end of the string and not ignore an optional trailing newline, use \z.

The \G assertion can be used to chain global matches (using m//g), as described in "Regexp Quote-Like Operators" in perlop. It is also useful when writing lex-like scanners, when you have several patterns that you want to match against successive substrings of your string; see the previous reference. The actual location where \G will match can also be influenced by using pos() as an lvalue: see "pos" in perlfunc. Note that the rule for zero-length matches (see "Repeated Patterns Matching a Zero-length Substring") is modified somewhat, in that contents to the left of \G are not counted when determining the length of the match. Thus the following will not match forever:

    my $string = 'ABC';
    pos($string) = 1;
    while ($string =~ /(.\G)/g) {
        print $1;
    }

It will print 'A' and then terminate, as it considers the match to be zero-width, and thus will not match at the same position twice in a row.
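The lex-like use of \G mentioned above can be sketched as follows. This is only an illustration under assumed inputs: the token names (NUM, ID, OP) and the tiny grammar are invented for the example. The /c modifier keeps pos() in place when one alternative fails, so the next pattern is tried at the same spot.

```perl
use strict;
use warnings;

my $input = 'x = 42 + y';
my @tokens;

pos($input) = 0;
while (pos($input) < length $input) {
    if    ($input =~ /\G\s+/gc)            { next }                     # skip blanks
    elsif ($input =~ /\G(\d+)/gc)          { push @tokens, "NUM($1)" }  # integer
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, "ID($1)" }   # identifier
    elsif ($input =~ /\G([=+])/gc)         { push @tokens, "OP($1)" }   # operator
    else  { die "Unexpected character at offset " . pos($input) . "\n" }
}

print join(' ', @tokens), "\n";
```

Each pattern is anchored with \G so that it can only match exactly where the previous token ended, which is what prevents the scanner from silently skipping unrecognised input.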
It is worth noting that \G improperly used can result in an infinite loop. Take care when using patterns that include \G in an alternation.

Note also that s/// will refuse to overwrite part of a substitution that has already been replaced; so for example this will stop after the first iteration, rather than iterating its way backwards through the string:

    $_ = "123456789";
    pos = 6;
    s/.(?=.\G)/X/g;
    print;      # prints 1234X6789, not XXXXX6789

#Capture groups

The grouping construct ( ... ) creates capture groups (also referred to as capture buffers). To refer to the current contents of a group later on, within the same pattern, use \g1 (or \g{1}) for the first, \g2 (or \g{2}) for the second, and so on. This is called a backreference. There is no limit to the number of captured substrings that you may use. Groups are numbered with the leftmost open parenthesis being number 1, etc. If a group did not match, the associated backreference won't match either. (This can happen if the group is optional, or in a different branch of an alternation.) You can omit the "g", and write "\1", etc, but there are some issues with this form, described below.

You can also refer to capture groups relatively, by using a negative number, so that \g-1 and \g{-1} both refer to the immediately preceding capture group, and \g-2 and \g{-2} both refer to the group before it. For example:

    /
     (Y)            # group 1
     (              # group 2
        (X)         # group 3
        \g{-1}      # backref to group 3
        \g{-3}      # backref to group 1
     )
    /x

would match the same as /(Y) ( (X) \g3 \g1 )/x. This allows you to interpolate regexes into larger regexes and not have to worry about the capture groups being renumbered.

You can dispense with numbers altogether and create named capture groups. The notation is (?<name>...) to declare and \g{name} to reference. (To be compatible with .Net regular expressions, \g{name} may also be written as \k{name}, \k<name> or \k'name'.) name must not begin with a number, nor contain hyphens. When different groups within the same pattern have the same name, any reference to that name assumes the leftmost defined group. Named groups count in absolute and relative numbering, and so can also be referred to by those numbers. (It's possible to do things with named capture groups that would otherwise require (??{}).)
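To make the named and relative forms described above concrete, here is a short sketch. The group name "word" and the sample strings are arbitrary choices for the example; it assumes Perl 5.10 or later, where \g, \k and %+ are available.

```perl
use strict;
use warnings;

# Named group "word" captures a word; \k<word> requires the same text again.
my $doubled = qr/\b(?<word>\w+)\s+\k<word>\b/;

my $found;
if ('this is is a test' =~ $doubled) {
    $found = $+{word};          # the repeated word, looked up by name
}

# A reusable fragment: \g{-1} refers to the group opened just before it,
# wherever the fragment ends up once it is interpolated.
my $quoted = qr/(['"])[^'"]*\g{-1}/;

# Here the fragment's own group becomes group 2, yet \g{-1} still points at it.
my ($key, $quote) = q{key="value"} =~ /(key)=$quoted/;

print "doubled word: $found\n";
print "key: $key, quote character: $quote\n";
```

Had the fragment used \1 instead of \g{-1}, interpolating it after (key) would have made the backreference point at the wrong group.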
Capture group contents are dynamically scoped and available to you outside the pattern until the end of the enclosing block or until the next successful match, whichever comes first. (See "Compound Statements" in perlsyn.) You can refer to them by absolute number (using "$1" instead of "\g1", etc); or by name via the %+ hash, using "$+{name}".

Braces are required in referring to named capture groups, but are optional for absolute or relative numbered ones. Braces are safer when creating a regex by concatenating smaller strings. For example if you have qr/$a$b/, and $a contained "\g1", and $b contained "37", you would get /\g137/ which is probably not what you intended.

If you use braces, you may also optionally add any number of blank (space or tab) characters within but adjacent to the braces, like \g{ -1 }, or \k{ name }.

The \g and \k notations were introduced in Perl 5.10.0. Prior to that there were no named nor relative numbered capture groups. Absolute numbered groups were referred to using \1, \2, etc., and this notation is still accepted (and likely always will be). But it leads to some ambiguities if there are more than 9 capture groups, as \10 could mean either the tenth capture group, or the character whose ordinal in octal is 010 (a backspace in ASCII). Perl resolves this ambiguity by interpreting \10 as a backreference only if at least 10 left parentheses have opened before it. Likewise \11 is a backreference only if at least 11 left parentheses have opened before it. And so on. \1 through \9 are always interpreted as backreferences. There are several examples below that illustrate these perils. You can avoid the ambiguity by always using \g{} or \g if you mean capturing groups; and for octal constants always using \o{}, or for \077 and below, using 3 digits padded with leading zeros, since a leading zero implies an octal constant.

The \digit notation also works in certain circumstances outside the pattern. See "Warning on \1 Instead of $1" below for details.
Examples:

    s/^([^ ]*) *([^ ]*)/$2 $1/;     # swap first two words

    /(.)\g1/                        # find first doubled char
         and print "'$1' is the first doubled character\n";

    /(?<char>.)\k<char>/            # ... a different way
         and print "'$+{char}' is the first doubled character\n";

    /(?'char'.)\g1/                 # ... mix and match
         and print "'$1' is the first doubled character\n";

    if (/Time: (..):(..):(..)/) {   # parse out values
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

    /(.)(.)(.)(.)(.)(.)(.)(.)(.)\g10/   # \g10 is a backreference
    /(.)(.)(.)(.)(.)(.)(.)(.)(.)\10/    # \10 is octal
    /((.)(.)(.)(.)(.)(.)(.)(.)(.))\10/  # \10 is a backreference
    /((.)(.)(.)(.)(.)(.)(.)(.)(.))\010/ # \010 is octal

    $a = '(.)\1';        # Creates problems when concatenated.
    $b = '(.)\g{1}';     # Avoids the problems.
    "aa" =~ /${a}/;      # True
    "aa" =~ /${b}/;      # True
    "aa0" =~ /${a}0/;    # False!
    "aa0" =~ /${b}0/;    # True
    "aa\x08" =~ /${a}0/;  # True!
    "aa\x08" =~ /${b}0/;  # False

Several special variables also refer back to portions of the previous match. $+ returns whatever the last bracket match matched. $& returns the entire matched string. (At one point $0 did also, but now it returns the name of the program.) $` returns everything before the matched string. $' returns everything after the matched string. And $^N contains whatever was matched by the most-recently closed group (submatch). $^N can be used in extended patterns (see below), for example to assign a submatch to a variable.

These special variables, like the %+ hash and the numbered match variables ($1, $2, $3, etc.) are dynamically scoped until the end of the enclosing block or until the next successful match, whichever comes first. (See "Compound Statements" in perlsyn.)

The @{^CAPTURE} array may be used to access ALL of the capture buffers as an array without needing to know how many there are. For instance

    $string =~ /$pattern/ and @captured = @{^CAPTURE};

will place a copy of each capture variable, $1, $2 etc, into the @captured array.

Be aware that when interpolating a subscript of the @{^CAPTURE} array you must use demarcated curly brace notation:

    print "@{^CAPTURE[0]}";

See "Demarcated variable names using braces" in perldata for more on this notation.

NOTE: Failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match.
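The NOTE above has a practical consequence that a short sketch may make clearer: after a failed match, $1 still holds the value from the last successful one, so it should only be trusted when the match itself returned true. The sample strings are arbitrary.

```perl
use strict;
use warnings;

'alpha-1' =~ /(\w+)-(\d)/;       # succeeds: $1 is "alpha", $2 is "1"
my $before = $1;

'no digits' =~ /(\w+)-(\d)/;     # fails: $1 and $2 are left untouched
my $after = $1;                  # still "alpha", from the earlier match

# Safer idiom: use the captures only when the match succeeded.
my $safe = ('no digits' =~ /(\w+)-(\d)/) ? $1 : undef;

print "before: $before, after: $after\n";
print "safe is ", (defined $safe ? $safe : 'undef'), "\n";
```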
WARNING: If your code is to run on Perl 5.16 or earlier, beware that once Perl sees that you need one of $&, $`, or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program.

Perl uses the same mechanism to produce $1, $2, etc, so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression (?: ... ) instead.) But if you never use $&, $` or $', then patterns without capturing parentheses will not be penalized. So avoid $&, $', and $` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price.

Perl 5.16 introduced a slightly more efficient mechanism that notes separately whether each of $`, $&, and $' have been seen, and thus may only need to copy part of the string. Perl 5.20 introduced a much more efficient copy-on-write mechanism which eliminates any slowdown.

As another workaround for this problem, Perl 5.10.0 introduced ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}, which are equivalent to $`, $& and $', except that they are only guaranteed to be defined after a successful match that was executed with the /p (preserve) modifier. The use of these variables incurs no global performance penalty, unlike their punctuation character equivalents, however at the trade-off that you have to tell perl when you want to use them. As of Perl 5.20, these three variables are equivalent to $`, $& and $', and /p is ignored.
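A minimal sketch of the /p modifier and its three variables, assuming Perl 5.10 or later; the target string is arbitrary.

```perl
use strict;
use warnings;

my ($pre, $match, $post);

# /p asks perl to preserve the pieces for ${^PREMATCH} and friends.
if ('hello world' =~ /o w/p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print "[$pre][$match][$post]\n";
```

Writing /p explicitly is harmless on 5.20 and later, where it is ignored, and keeps the code correct on older releases.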
#Quoting metacharacters

Backslashed metacharacters in Perl are alphanumeric, such as \b, \w, \n. Unlike some other regular expression languages, there are no backslashed symbols that aren't alphanumeric. So anything that looks like \\, \(, \), \[, \], \{, or \} is always interpreted as a literal character, not a metacharacter. This was once used in a common idiom to disable or quote the special meanings of regular expression metacharacters in a string that you want to use for a pattern. Simply quote all non-"word" characters:

    $pattern =~ s/(\W)/\\$1/g;

(If use locale is set, then this depends on the current locale.) Today it is more common to use the quotemeta() function or the \Q metaquoting escape sequence to disable all metacharacters' special meanings like this:

    /$unquoted\Q$quoted\E$unquoted/

Beware that if you put literal backslashes (those not inside interpolated variables) between \Q and \E, double-quotish backslash interpolation may lead to confusing results. If you need to use literal backslashes within \Q...\E, consult "Gory details of parsing quoted constructs" in perlop.

quotemeta() and \Q are fully described in "quotemeta" in perlfunc.

#Extended Patterns

Perl also defines a consistent extension syntax for features not found in standard tools like awk and lex. The syntax for most of these is a pair of parentheses with a question mark as the first thing within the parentheses. The character after the question mark indicates the extension.

A question mark was chosen for this and for the minimal-matching construct because 1) question marks are rare in older regular expressions, and 2) whenever you see one, you should stop and "question" exactly what is going on. That's psychology....

#(?#text)

A comment. The text is ignored. Note that Perl closes the comment as soon as it sees a ")", so there is no way to put a literal ")" in the comment. The pattern's closing delimiter must be escaped by a backslash if it appears in the comment.

See "/x" for another way to have comments in patterns.
Note that a comment can go just about anywhere, except in the middle of an escape sequence. Examples:

    qr/foo(?#comment)bar/  # Matches 'foobar'

    # The pattern below matches 'abcd', 'abccd', or 'abcccd'
    qr/abc(?#comment between literal and its quantifier){1,3}d/

    # The pattern below generates a syntax error, because the '\p' must
    # be followed immediately by a '{'.
    qr/\p(?#comment between \p and its property name){Any}/

    # The pattern below generates a syntax error, because the initial
    # '\(' is a literal opening parenthesis, and so there is nothing
    # for the closing ')' to match
    qr/\(?#the backslash means this isn't a comment)p{Any}/

    # Comments can be used to fold long patterns into multiple lines
    qr/First part of a long regex(?#
      )remaining part/

#(?adlupimnsx-imnsx)

#(?^alupimnsx)

Zero or more embedded pattern-match modifiers, to be turned on (or turned off if preceded by "-") for the remainder of the pattern or the remainder of the enclosing pattern group (if any).

This is particularly useful for dynamically-generated patterns, such as those read in from a configuration file, taken from an argument, or specified in a table somewhere. Consider the case where some patterns want to be case-sensitive and some do not: The case-insensitive ones merely need to include (?i) at the front of the pattern. For example:

    $pattern = "foobar";
    if ( /$pattern/i ) { }

    # more flexible:

    $pattern = "(?i)foobar";
    if ( /$pattern/ ) { }

These modifiers are restored at the end of the enclosing group. For example,

    ( (?i) blah ) \s+ \g1

will match blah in any case, some spaces, and an exact (including the case!) repetition of the previous word, assuming the /x modifier, and no /i modifier outside this group.

These modifiers do not carry over into named subpatterns called in the enclosing group. In other words, a pattern such as ((?i)(?&NAME)) does not change the case-sensitivity of the NAME pattern.
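The two behaviours described above, a leading (?i) carried inside a pattern string and modifiers being restored at the end of the enclosing group, can be sketched like this. The pattern strings stand in for values that might come from a configuration file; they are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical pattern strings, e.g. as read from a configuration file.
my @patterns = ('(?i)foobar', 'foobar');

# Only the pattern that carries its own (?i) matches the mixed-case target.
my @results = map { my $re = $_; 'FooBar' =~ /$re/ ? 1 : 0 } @patterns;

# (?i) inside a group is switched off again at the group's closing paren,
# so \g1 must repeat the captured text with exactly the same case.
my $scoped     = qr/((?i)blah)\s+\g1/;
my $same_case  = 'BLAH BLAH' =~ $scoped ? 1 : 0;
my $other_case = 'BLAH blah' =~ $scoped ? 1 : 0;

print "results: @results\n";
print "same case: $same_case, other case: $other_case\n";
```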
A modifier is overridden by later occurrences of this construct in the same scope containing the same modifier, so that

    /((?im)foo(?-m)bar)/

matches all of foobar case insensitively, but uses /m rules for only the foo portion. The "a" flag overrides aa as well; likewise aa overrides "a". The same goes for "x" and xx. Hence, in

    /(?-x)foo/xx

both /x and /xx are turned off during matching foo. And in

    /(?x)foo/x

/x but NOT /xx is turned on for matching foo. (One might mistakenly think that since the inner (?x) is already in the scope of /x, that the result would effectively be the sum of them, yielding /xx. It doesn't work that way.) Similarly, doing something like (?xx-x)foo turns off all "x" behavior for matching foo; it is not that you subtract 1 "x" from 2 to get 1 "x" remaining.

Any of these modifiers can be set to apply globally to all regular expressions compiled within the scope of a use re. See "'/flags' mode" in re.

Starting in Perl 5.14, a "^" (caret or circumflex accent) immediately after the "?" is a shorthand equivalent to d-imnsx. Flags (except "d") may follow the caret to override it. But a minus sign is not legal with it.

Note that the "a", "d", "l", "p", and "u" modifiers are special in that they can only be enabled, not disabled, and the "a", "d", "l", and "u" modifiers are mutually exclusive: specifying one de-specifies the others, and a maximum of one (or two "a"'s) may appear in the construct. Thus, for example, (?-p) will warn when compiled under use warnings; (?-d:...) and (?dl:...) are fatal errors.

Note also that the "p" modifier is special in that its presence anywhere in a pattern has a global effect.

Having zero modifiers makes this a no-op (so why did you specify it, unless it's generated code), and starting in v5.30, warns under use re 'strict'.

#(?:pattern)

#(?adluimnsx-imnsx:pattern)

#(?^aluimnsx:pattern)

This is for clustering, not capturing; it groups subexpressions like "()", but doesn't make backreferences as "()" does. So

    @fields = split(/\b(?:a|b|c)\b/)

matches the same field delimiters as

    @fields = split(/\b(a|b|c)\b/)

but doesn't spit out the delimiters themselves as extra fields (even though that's the behaviour of "split" in perlfunc when its pattern contains capturing groups). It's also cheaper not to capture characters if you don't need to.
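The difference between the two split calls above is easiest to see on concrete data. In this sketch the record string and the delimiter pattern are made up for illustration.

```perl
use strict;
use warnings;

my $record = '1 a 2 b 3';

# Clustering only: the delimiters are consumed and discarded.
my @clustered = split /\s(?:a|b)\s/, $record;

# Capturing: each delimiter's captured text is returned as an extra field.
my @captured  = split /\s(a|b)\s/, $record;

print "clustered: @clustered\n";
print "captured:  @captured\n";
```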
Any letters between "?" and ":" act as flags modifiers as with (?adluimnsx-imnsx). For example,

    /(?s-i:more.*than).*million/i

is equivalent to the more verbose

    /(?:(?s-i)more.*than).*million/i

Note that any () constructs enclosed within this one will still capture unless the /n modifier is in effect.

Like the "(?adlupimnsx-imnsx)" construct, aa and "a" override each other, as do xx and "x". They are not additive. So, doing something like (?xx-x:foo) turns off all "x" behavior for matching foo.

Starting in Perl 5.14, a "^" (caret or circumflex accent) immediately after the "?" is a shorthand equivalent to d-imnsx. Any positive flags (except "d") may follow the caret, so

    (?^x:foo)

is equivalent to

    (?x-imns:foo)

The caret tells Perl that this cluster doesn't inherit the flags of any surrounding pattern, but uses the system defaults (d-imnsx), modified by any flags specified.

The caret allows for simpler stringification of compiled regular expressions. These look like

    (?^:pattern)

with any non-default flags appearing between the caret and the colon. A test that looks at such stringification thus doesn't need to have the system default flags hard-coded in it, just the caret. If new flags are added to Perl, the meaning of the caret's expansion will change to include the default for those flags, so the test will still work, unchanged.

Specifying a negative flag after the caret is an error, as the flag is redundant.

Mnemonic for (?^...): A fresh beginning since the usual use of a caret is to match at the beginning.

#(?|pattern)

This is the "branch reset" pattern, which has the special property that the capture groups are numbered from the same starting point in each alternation branch. It is available starting from perl 5.10.0.

Capture groups are numbered from left to right, but inside this construct the numbering is restarted for each branch.

The numbering within each branch will be as normal, and any groups following this construct will be numbered as though the construct contained only one branch, that being the one with the most capture groups in it.

This construct is useful when you want to capture one of a number of alternative matches.

Consider the following pattern. The numbers underneath show in which group the captured content will be stored.
    # before  ---------------branch-reset----------- after
    / ( a )  (?| x ( y ) z | (p ( q ) r) | (t) u (v) ) ( z ) /x
    # 1            2         2  3          2     3     4

Be careful when using the branch reset pattern in combination with named captures. Named captures are implemented as being aliases to numbered groups holding the captures, and that interferes with the implementation of the branch reset pattern. If you are using named captures in a branch reset pattern, it's best to use the same names, in the same order, in each of the alternations:

    /(?|  (?<a> x ) (?<b> y )
       |  (?<a> z ) (?<b> w )) /x

Not doing so may lead to surprises:

    "12" =~ /(?| (?<a> \d+ ) | (?<b> \D+))/x;
    say $+{a};    # Prints '12'
    say $+{b};    # *Also* prints '12'.

The problem here is that both the group named a and the group named b are aliases for the group belonging to $1.

#Lookaround Assertions

Lookaround assertions are zero-width patterns which match a specific pattern without including it in $&. Positive assertions match when their subpattern matches, negative assertions match when their subpattern fails. Lookbehind matches text up to the current match position, lookahead matches text following the current match position.

#(?=pattern)

#(*pla:pattern)

#(*positive_lookahead:pattern)

A zero-width positive lookahead assertion. For example, /\w+(?=\t)/ matches a word followed by a tab, without including the tab in $&.

#(?!pattern)

#(*nla:pattern)

#(*negative_lookahead:pattern)

A zero-width negative lookahead assertion. For example /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar". Note however that lookahead and lookbehind are NOT the same thing. You cannot use this for lookbehind.

If you are looking for a "bar" that isn't preceded by a "foo", /(?!foo)bar/ will not do what you want. That's because the (?!foo) is just saying that the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will match. Use lookbehind instead (see below).

#(?<=pattern)

#\K

#(*plb:pattern)

#(*positive_lookbehind:pattern)

A zero-width positive lookbehind assertion. For example, /(?<=\t)\w+/ matches a word that follows a tab, without including the tab in $&.
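The pitfall described above, using a negative lookahead where a lookbehind is needed, can be checked with a small sketch. The target string is invented; the negative lookbehind form (?<!foo) is used here as the fix the text points to.

```perl
use strict;
use warnings;

my $target = 'foobar bazbar bar';

# (?!foo)bar only says "the next characters are not foo", which is always
# true when the next characters are "bar", so every "bar" is matched.
my @wrong = $target =~ /(?!foo)bar/g;

# (?<!foo)bar looks backwards, so the "bar" inside "foobar" is skipped.
my @right = $target =~ /(?<!foo)bar/g;

# Positive lookahead: the tab must follow, but it is not part of the match.
my ($word) = "name\tvalue" =~ /(\w+)(?=\t)/;

print scalar(@wrong), " matches with lookahead, ",
      scalar(@right), " with lookbehind\n";
print "word before tab: $word\n";
```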
Prior to Perl 5.30, it worked only for fixed-width lookbehind, but starting in that release, it can handle variable lengths from 1 to 255 characters as an experimental feature. The feature is enabled automatically if you use a variable length positive lookbehind assertion.

In Perl 5.35.10 the scope of the experimental nature of this construct has been reduced, and experimental warnings will only be produced when the construct contains capturing parentheses. The warnings will be raised at pattern compilation time, unless turned off, in the experimental::vlb category. This is to warn you that the exact contents of capturing buffers in a variable length positive lookbehind are not well defined and are subject to change in a future release of perl.

Currently if you use capture buffers inside of a positive variable length lookbehind the result will be the longest and thus leftmost match possible. This means that

    "aax" =~ /(?=x)(?<=(a|aa))/
    "aax" =~ /(?=x)(?<=(aa|a))/
    "aax" =~ /(?=x)(?<=(a{1,2}?))/
    "aax" =~ /(?=x)(?<=(a{1,2}))/

will all result in $1 containing "aa". It is possible in a future release of perl we will change this behavior.

There is a special form of this construct, called \K (available since Perl 5.10.0), which causes the regex engine to "keep" everything it had matched prior to the \K and not include it in $&. This effectively provides non-experimental variable-length lookbehind of any length.

And, there is a technique that can be used to handle variable length lookbehinds on earlier releases, and longer than 255 characters. It is described in http://www.drregex.com/2019/02/variable-length-lookbehinds-actually.html.

Note that under /i, a few single characters match two or three other characters. This makes them variable length, and the 255 length applies to the maximum number of characters in the match. For example qr/\N{LATIN SMALL LETTER SHARP S}/i matches the sequence "ss". Your lookbehind assertion could contain 127 Sharp S characters under /i, but adding a 128th would generate a compilation error, as that could match 256 "s" characters in a row.

The use of \K inside of another lookaround assertion is allowed, but the behaviour is currently not well defined.
For various reasons \K may be significantly more efficient than the equivalent (?<=...) construct, and it is especially useful in situations where you want to efficiently remove something following something else in a string. For instance

    s/(foo)bar/$1/g;

can be rewritten as the much more efficient

    s/foo\Kbar//g;

Use of the non-greedy modifier "?" may not give you the expected results if it is within a capturing group within the construct.

#(?<NAME>pattern)

#(?'NAME'pattern)

A named capture group. Identical in every respect to normal capturing parentheses () but for the additional fact that the group can be referred to by name in various regular expression constructs (like \g{NAME}) and can be accessed by name after a successful match via %+ or %-. See perlvar for more details on the %+ and %- hashes.

If multiple distinct capture groups have the same name, then $+{NAME} will refer to the leftmost defined group in the match.

The forms (?'NAME'pattern) and (?<NAME>pattern) are equivalent.

NOTE: While the notation of this construct is the same as the similar function in .NET regexes, the behavior is not. In Perl the groups are numbered sequentially regardless of being named or not. Thus in the pattern

    /(x)(?<foo>y)(z)/

$+{foo} will be the same as $2, and $3 will contain 'z' instead of the opposite which is what a .NET regex hacker might expect.

Currently NAME is restricted to simple identifiers only. In other words, it must match /^[_A-Za-z][_A-Za-z0-9]*\z/ or its Unicode extension (see utf8), though it isn't extended by the locale (see perllocale).

NOTE: In order to make things easier for programmers with experience with the Python or PCRE regex engines, the pattern (?P<NAME>pattern) may be used instead of (?<NAME>pattern); however this form does not support the use of single quotes as a delimiter for the name.

#\k<NAME>

#\k'NAME'

#\k{NAME}

Named backreference. Similar to numeric backreferences, except that the group is designated by name and not number. If multiple groups have the same name then it refers to the leftmost defined group in the current match.

It is an error to refer to a name not defined by a (?<NAME>) earlier in the pattern.

All three forms are equivalent, although with \k{ NAME }, you may optionally have blanks within but adjacent to the braces, as shown.
NOTE: In order to make things easier for programmers with experience with the Python or PCRE regex engines, the pattern (?P=NAME) may be used instead of \k<NAME>.

#(?{ code })

WARNING: Using this feature safely requires that you understand its limitations. Code executed that has side effects may not perform identically from version to version due to the effect of future optimisations in the regex engine. For more information on this, see "Embedded Code Execution Frequency".

This zero-width assertion executes any embedded Perl code. It always succeeds, and its return value is set as $^R.

In literal patterns, the code is parsed at the same time as the surrounding code. While within the pattern, control is passed temporarily back to the perl parser, until the logically-balancing closing brace is encountered. This is similar to the way that an array index expression in a literal string is handled, for example

    "abc$array[ 1 + f('[') + g()]def"

In particular, braces do not need to be balanced:

    s/abc(?{ f('{'); })/def/

Even in a pattern that is interpolated and compiled at run-time, literal code blocks will be compiled once, at perl compile time; the following prints "ABCD":

    print "D";
    my $qr = qr/(?{ BEGIN { print "A" } })/;
    my $foo = "foo";
    /$foo$qr(?{ BEGIN { print "B" } })/;
    BEGIN { print "C" }

In patterns where the text of the code is derived from run-time information rather than appearing literally in a source code /pattern/, the code is compiled at the same time that the pattern is compiled, and for reasons of security, use re 'eval' must be in scope. This is to stop user-supplied patterns containing code snippets from being executable.

In situations where you need to enable this with use re 'eval', you should also have taint checking enabled, if your perl supports it. Better yet, use the carefully constrained evaluation within a Safe compartment. See perlsec for details about both these mechanisms.
From the viewpoint of parsing, lexical variable scope and closures,

    /AAA(?{ BBB })CCC/

behaves approximately like

    /AAA/ && do { BBB } && /CCC/

Similarly,

    qr/AAA(?{ BBB })CCC/

behaves approximately like

    sub { /AAA/ && do { BBB } && /CCC/ }

In particular:

    { my $i = 1; $r = qr/(?{ print $i })/ }
    my $i = 2;
    /$r/; # prints "1"

Inside a (?{...}) block, $_ refers to the string the regular expression is matching against. You can also use pos() to know what is the current position of matching within this string.

The code block introduces a new scope from the perspective of lexical variable declarations, but not from the perspective of local and similar localizing behaviours. So later code blocks within the same pattern will still see the values which were localized in earlier blocks. These accumulated localizations are undone either at the end of a successful match, or if the assertion is backtracked (compare "Backtracking"). For example,

    $_ = 'a' x 8;
    m<
       (?{ $cnt = 0 })                   # Initialize $cnt.
       (
         a
         (?{
             local $cnt = $cnt + 1;      # Update $cnt,
                                         # backtracking-safe.
         })
       )*
       aaaa
       (?{ $res = $cnt })                # On success copy to
                                         # non-localized location.
     >x;

will initially increment $cnt up to 8; then during backtracking, its value will be unwound back to 4, which is the value assigned to $res. At the end of the regex execution, $cnt will be wound back to its initial value of 0.

This assertion may be used as the condition in a

    (?(condition)yes-pattern|no-pattern)

switch. If not used in this way, the result of evaluation of code is put into the special variable $^R. This happens immediately, so $^R can be used from other (?{ code }) assertions inside the same regular expression.

The assignment to $^R above is properly localized, so the old value of $^R is restored if the assertion is backtracked; compare "Backtracking".
Note that the special variable $^N is particularly useful with code blocks to capture the results of submatches in variables without having to keep track of the number of nested parentheses. For example:

    $_ = "The brown fox jumps over the lazy dog";
    /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i;
    print "color = $color, animal = $animal\n";

#(??{ code })

WARNING: Using this feature safely requires that you understand its limitations. Code executed that has side effects may not perform identically from version to version due to the effect of future optimisations in the regex engine. For more information on this, see "Embedded Code Execution Frequency".

This is a "postponed" regular subexpression. It behaves in exactly the same way as a (?{ code }) code block as described above, except that its return value, rather than being assigned to $^R, is treated as a pattern, compiled if it's a string (or used as-is if it's a qr// object), then matched as if it were inserted instead of this construct.

During the matching of this sub-pattern, it has its own set of captures which are valid during the sub-match, but are discarded once control returns to the main pattern. For example, the following matches, with the inner pattern capturing "B" and matching "BB", while the outer pattern captures "A":

    my $inner = '(.)\1';
    "ABBA" =~ /^(.)(??{ $inner })\1/;
    print $1; # prints "A";

Note that this means that there is no way for the inner pattern to refer to a capture group defined outside. (The code block itself can use $1, etc., to refer to the enclosing pattern's capture groups.) Thus, although

    ('a' x 100)=~/(??{'(.)' x 100})/

will match, it will not set $1 on exit.

The following pattern matches a parenthesized group:

    $re = qr{
               \(
               (?:
                  (?> [^()]+ )  # Non-parens without backtracking
                |
                  (??{ $re })   # Group with matching parens
               )*
               \)
            }x;

See also (?PARNO) for a different, more efficient way to accomplish the same task.

Executing a postponed regular expression too many times without consuming any input string will also result in a fatal error. The depth at which that happens is compiled into perl, so it can be changed with a custom build.
#(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)

Recursive subpattern. Treat the contents of a given capture buffer in the current pattern as an independent subpattern and attempt to match it at the current position in the string. Information about capture state from the caller for things like backreferences is available to the subpattern, but capture buffers set by the subpattern are not visible to the caller.

Similar to (??{ code }) except that it does not involve executing any code or potentially compiling a returned pattern string; instead it treats the part of the current pattern contained within a specified capture group as an independent pattern that must match at the current position. Also different is the treatment of capture buffers: unlike (??{ code }), recursive patterns have access to their caller's match state, so one can use backreferences safely.

PARNO is a sequence of digits (not starting with 0) whose value reflects the paren-number of the capture group to recurse to. (?R) recurses to the beginning of the whole pattern. (?0) is an alternate syntax for (?R). If PARNO is preceded by a plus or minus sign then it is assumed to be relative, with negative numbers indicating preceding capture groups and positive ones following. Thus (?-1) refers to the most recently declared group, and (?+1) indicates the next group to be declared. Note that the counting for relative recursion differs from that of relative backreferences, in that with recursion unclosed groups are included.

The following pattern matches a function foo() which may contain balanced parentheses as the argument.

    $re = qr{ (                   # paren group 1 (full function)
                foo
                (                 # paren group 2 (parens)
                  \(
                    (             # paren group 3 (contents of parens)
                    (?:
                     (?> [^()]+ ) # Non-parens without backtracking
                    |
                     (?2)         # Recurse to start of paren group 2
                    )*
                    )
                  \)
                )
              )
            }x;

If the pattern was used as follows

    'foo(bar(baz)+baz(bop))'=~/$re/
        and print "\$1 = $1\n",
                  "\$2 = $2\n",
                  "\$3 = $3\n";

the output produced should be the following:

    $1 = foo(bar(baz)+baz(bop))
    $2 = (bar(baz)+baz(bop))
    $3 = bar(baz)+baz(bop)

If there is no corresponding capture group defined, then it is a fatal error. Recursing deeply without consuming any input string will also result in a fatal error. The depth at which that happens is compiled into perl, so it can be changed with a custom build.
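As a compact illustration of recursion with (?1), the sketch below tests whole strings for balanced parentheses. The pattern is a simplified variant written for this example (anchored at both ends and using possessive quantifiers), not the foo() pattern shown above.

```perl
use strict;
use warnings;

# Group 1 is: "(", then any mix of non-paren runs or nested groups, then ")".
# (?1) recurses into group 1 itself to handle the nesting.
my $balanced = qr/^(\((?:[^()]++|(?1))*+\))$/;

my @samples  = ('(a(b)c)', '((()))', '(a(b c)', '()(');
my @verdicts = map { my $s = $_; $s =~ $balanced ? 1 : 0 } @samples;

for my $i (0 .. $#samples) {
    printf "%-10s %s\n", $samples[$i], $verdicts[$i] ? 'balanced' : 'not balanced';
}
```

Because the anchors sit outside group 1, the recursion re-enters only the parenthesised part and not the ^ and $ assertions.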
The following shows how using negative indexing can make it easier to embed recursive patterns inside of a qr// construct for later use:

    my $parens = qr/(\((?:[^()]++|(?-1))*+\))/;
    if (/foo $parens \s+ \+ \s+ bar $parens/x) {
       # do something here...
    }

Note that this pattern does not behave the same way as the equivalent PCRE or Python construct of the same form. In Perl you can backtrack into a recursed group; in PCRE and Python the recursed-into group is treated as atomic. Also, modifiers are resolved at compile time, so constructs like (?i:(?1)) or (?:(?i)(?1)) do not affect how the sub-pattern will be processed.

#(?&NAME)

Recurse to a named subpattern. Identical to (?PARNO) except that the parenthesis to recurse to is determined by name. If multiple parentheses have the same name, then it recurses to the leftmost.

It is an error to refer to a name that is not declared somewhere in the pattern.

NOTE: In order to make things easier for programmers with experience with the Python or PCRE regex engines the pattern (?P>NAME) may be used instead of (?&NAME).

#(?(condition)yes-pattern|no-pattern)
#(?(condition)yes-pattern)

Conditional expression. Matches yes-pattern if condition yields a true value, matches no-pattern otherwise. A missing pattern always matches.

(condition) should be one of:

#an integer in parentheses
(which is valid if the corresponding pair of parentheses matched);

#a lookahead/lookbehind/evaluate zero-width assertion;

#a name in angle brackets or single quotes
(which is valid if a group with the given name matched);

#the special symbol (R)
(true when evaluated inside of recursion or eval). Additionally the "R" may be followed by a number (which will be true when evaluated when recursing inside of the appropriate group), or by &NAME, in which case it will be true only when evaluated during recursion in the named group.

Here's a summary of the possible predicates:

#(1) (2) ...
Checks if the numbered capturing group has matched something. Full syntax: (?(1)then|else)

#(<NAME>) ('NAME')
Checks if a group with the given name has matched something. Full syntax: (?(<name>)then|else)

#(?=...) (?!...) (?<=...) (?<!...)
Checks whether the lookaround assertion matches (or does not match, for the "!" variants).

A (DEFINE) block holds named subpatterns that are executed only through recursion, for example:

    /(?<NAME>(?&NAME_PAT))(?<ADDR>(?&ADDRESS_PAT))
     (?(DEFINE)
       (?<NAME_PAT>....)
       (?<ADDRESS_PAT>....)
     )/x

Note that capture groups matched inside of recursion are not accessible after the recursion returns, so the extra layer of capturing groups is necessary. Thus $+{NAME_PAT} would not be defined even though $+{NAME} would be.

Finally, keep in mind that subpatterns created inside a DEFINE block count towards the absolute and relative number of captures, so this:

    my @captures = "a" =~ /(.)                  # First capture
                           (?(DEFINE)
                               (?<EXAMPLE> 1 )  # Second capture
                           )/x;
    say scalar @captures;

will output 2, not 1. This is particularly important if you intend to compile the definitions with the qr// operator, and later interpolate them in another pattern.

#(?>pattern)
#(*atomic:pattern)

An "independent" subexpression, one which matches the substring that a standalone pattern would match if anchored at the given position, and it matches nothing other than this substring. This construct is useful for optimizations of what would otherwise be "eternal" matches, because it will not backtrack (see "Backtracking"). It may also be useful in places where the "grab all you can, and do not give anything back" semantic is desirable.

For example: ^(?>a*)ab will never match, since (?>a*) (anchored at the beginning of string, as above) will match all characters "a" at the beginning of string, leaving no "a" for ab to match. In contrast, a*ab will match the same as a+b, since the match of the subgroup a* is influenced by the following group ab (see "Backtracking"). In particular, a* inside a*ab will match fewer characters than a standalone a*, since this makes the tail match.

(?>pattern) does not disable backtracking altogether once it has matched. It is still possible to backtrack past the construct, but not into it. So ((?>a*)|(?>b*))ar will still match "bar".

An effect similar to (?>pattern) may be achieved by writing (?=(pattern))\g{-1}. The lookahead matches the same substring as a standalone pattern would, and the following \g{-1} eats the matched string; it therefore makes a zero-length assertion into an analogue of (?>...). (The difference between these two constructs is that the second one uses a capturing group, thus shifting ordinals of backreferences in the rest of a regular expression.)
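The three claims above about (?>a*) can be checked with a short script. This is only a sketch; the hash name %matched and its keys are made up for the illustration.

```perl
use strict;
use warnings;

my %matched = (
    # (?>a*) keeps every "a", so nothing is left for the literal "ab".
    atomic_prefix  => ('aaab' =~ /^(?>a*)ab/         ? 1 : 0),
    # Plain a* gives one "a" back so that "ab" can match.
    greedy_prefix  => ('aaab' =~ /^a*ab/             ? 1 : 0),
    # Backtracking past an atomic group into the next alternative is allowed.
    past_construct => ('bar'  =~ /((?>a*)|(?>b*))ar/ ? 1 : 0),
);

print "$_ => $matched{$_}\n" for sort keys %matched;
```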
Consider this pattern:

    m{ \(
          (
            [^()]+           # x+
          |
            \( [^()]* \)
          )+
       \)
     }x

That will efficiently match a nonempty group with matching parentheses two levels deep or less. However, if there is no such group, it will take virtually forever on a long string. That's because there are so many different ways to split a long string into several substrings. This is what (.+)+ is doing, and (.+)+ is similar to a subpattern of the above pattern. Consider how the pattern above detects no-match on ((()aaaaaaaaaaaaaaaaaa in several seconds, but that each extra letter doubles this time. This exponential performance will make it appear that your program has hung. However, a tiny change to this pattern

    m{ \(
          (
            (?> [^()]+ )        # change x+ above to (?> x+ )
          |
            \( [^()]* \)
          )+
       \)
     }x

which uses (?>...) matches exactly when the one above does (verifying this yourself would be a productive exercise), but finishes in a fourth of the time when used on a similar string with 1000000 "a"s. Be aware, however, that, when this construct is followed by a quantifier, it currently triggers a warning message under the use warnings pragma or -w switch saying it "matches null string many times in regex".

On simple groups, such as the pattern (?> [^()]+ ), a comparable effect may be achieved by negative lookahead, as in [^()]+ (?! [^()] ). This was only 4 times slower on a string with 1000000 "a"s.

The "grab all you can, and do not give anything back" semantic is desirable in many situations where at first sight a simple ()* looks like the correct solution. Suppose we parse text with comments being delimited by "#" followed by some optional (horizontal) whitespace. Contrary to its appearance, #[ \t]* is not the correct subexpression to match the comment delimiter, because it may "give up" some whitespace if the remainder of the pattern can be made to match that way. The correct answer is either one of these:

    (?>#[ \t]*)
    #[ \t]*(?![ \t])

For example, to grab non-empty comments into $1, one should use either one of these:

    / (?> \# [ \t]* ) (        .+ ) /x;
    /     \# [ \t]*   ( [^ \t] .* ) /x;

Which one you pick depends on which of these expressions better reflects the above specification of comments.

In some literature this construct is called "atomic matching" or "possessive matching".
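The difference between the plain and the atomic comment delimiter only shows up when the comment body is empty. The sketch below uses a made-up input ($blank_comment) to make that visible; the variable names are assumptions for the illustration.

```perl
use strict;
use warnings;

my $blank_comment = "#   ";   # delimiter plus blanks, no comment text

# Plain quantifier: [ \t]* hands back one blank so that (.+) can succeed.
my $loose  = $blank_comment =~ /\#[ \t]*(.+)/     ? "<$1>" : 'no match';

# Atomic group: the blanks stay consumed, so an empty comment is rejected.
my $strict = $blank_comment =~ /(?>\#[ \t]*)(.+)/ ? "<$1>" : 'no match';

print "loose:  $loose\n";    # loose:  < >
print "strict: $strict\n";   # strict: no match
```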
Possessive quantifiers are equivalent to putting the item they are applied to inside of one of these constructs. The following equivalences apply:

    Quantifier Form     Bracketing Form
    ---------------     ---------------
    PAT*+               (?>PAT*)
    PAT++               (?>PAT+)
    PAT?+               (?>PAT?)
    PAT{min,max}+       (?>PAT{min,max})

Nested (?>...) constructs are not no-ops, even if at first glance they might seem to be. This is because the nested (?>...) can restrict internal backtracking that otherwise might occur. For example,

    "abc" =~ /(?>a[bc]*c)/

matches, but

    "abc" =~ /(?>a(?>[bc]*)c)/

does not.

#(?[ ])

See "Extended Bracketed Character Classes" in perlrecharclass.

#Backtracking

NOTE: This section presents an abstract approximation of regular expression behavior. For a more rigorous (and complicated) view of the rules involved in selecting a match among possible alternatives, see "Combining RE Pieces".

A fundamental feature of regular expression matching involves the notion called backtracking, which is currently used (when needed) by all regular non-possessive expression quantifiers, namely "*", *?, "+", +?, {n,m}, and {n,m}?. Backtracking is often optimized internally, but the general principle outlined here is valid.

For a regular expression to match, the entire regular expression must match, not just part of it. So if the beginning of a pattern containing a quantifier succeeds in a way that causes later parts in the pattern to fail, the matching engine backs up and recalculates the beginning part--that's why it's called backtracking.

Here is an example of backtracking: Let's say you want to find the word following "foo" in the string "Food is on the foo table.":

    $_ = "Food is on the foo table.";
    if ( /\b(foo)\s+(\w+)/i ) {
        print "$2 follows $1.\n";
    }

When the match runs, the first part of the regular expression (\b(foo)) finds a possible match right at the beginning of the string, and loads up $1 with "Foo". However, as soon as the matching engine sees that there's no whitespace following the "Foo" that it had saved in $1, it realizes its mistake and starts over again one character after where it had the tentative match. This time it goes all the way until the next occurrence of "foo". The complete regular expression matches this time, and you get the expected output of "table follows foo."
Sometimes minimal matching can help a lot. Imagine you'd like to match everything between "foo" and "bar". Initially, you write something like this:

    $_ =  "The food is under the bar in the barn.";
    if ( /foo(.*)bar/ ) {
        print "got <$1>\n";
    }

Which perhaps unexpectedly yields:

    got <d is under the bar in the >

That's because .* was greedy, so you get everything between the first "foo" and the last "bar". Here it's more effective to use minimal matching to make sure you get the text between a "foo" and the first "bar" thereafter.

    if ( /foo(.*?)bar/ ) { print "got <$1>\n" }
    got <d is under the >

Here's another example. Let's say you'd like to match a number at the end of a string, and you also want to keep the preceding part of the match. So you write this:

    $_ = "I have 2 numbers: 53147";
    if ( /(.*)(\d*)/ ) {                                # Wrong!
        print "Beginning is <$1>, number is <$2>.\n";
    }

That won't work at all, because .* was greedy and gobbled up the whole string. As \d* can match on an empty string the complete regular expression matched successfully.

    Beginning is <I have 2 numbers: 53147>, number is <>.

Here are some variants, most of which don't work:

    $_ = "I have 2 numbers: 53147";
    @pats = qw{
        (.*)(\d*)
        (.*)(\d+)
        (.*?)(\d*)
        (.*?)(\d+)
        (.*)(\d+)$
        (.*?)(\d+)$
        (.*)\b(\d+)$
        (.*\D)(\d+)$
    };

    for $pat (@pats) {
        printf "%-12s ", $pat;
        if ( /$pat/ ) {
            print "<$1> <$2>\n";
        } else {
            print "FAIL\n";
        }
    }

That will print out:

    (.*)(\d*)    <I have 2 numbers: 53147> <>
    (.*)(\d+)    <I have 2 numbers: 5314> <7>
    (.*?)(\d*)   <> <>
    (.*?)(\d+)   <I have > <2>
    (.*)(\d+)$   <I have 2 numbers: 5314> <7>
    (.*?)(\d+)$  <I have 2 numbers: > <53147>
    (.*)\b(\d+)$ <I have 2 numbers: > <53147>
    (.*\D)(\d+)$ <I have 2 numbers: > <53147>

As you see, this can be a bit tricky. It's important to realize that a regular expression is merely a set of assertions that gives a definition of success. There may be 0, 1, or several different ways that the definition might succeed against a particular string. And if there are multiple ways it might succeed, you need to understand backtracking to know which variety of success you will achieve.

When using lookahead assertions and negations, this can all get even trickier. Imagine you'd like to find a sequence of non-digits not followed by "123". You might try to write that as

    $_ = "ABC123";
    if ( /^\D*(?!123)/ ) {                # Wrong!
        print "Yup, no 123 in $_\n";
    }

But that isn't going to match; at least, not the way you're hoping. It claims that there is no 123 in the string. Here's a clearer picture of why that pattern matches, contrary to popular expectations:

    $x = 'ABC123';
    $y = 'ABC445';

    print "1: got $1\n" if $x =~ /^(ABC)(?!123)/;
    print "2: got $1\n" if $y =~ /^(ABC)(?!123)/;

    print "3: got $1\n" if $x =~ /^(\D*)(?!123)/;
    print "4: got $1\n" if $y =~ /^(\D*)(?!123)/;

This prints

    2: got ABC
    3: got AB
    4: got ABC

You might have expected test 3 to fail because it seems to be a more general purpose version of test 1. The important difference between them is that test 3 contains a quantifier (\D*) and so can use backtracking, whereas test 1 will not. What's happening is that you've asked "Is it true that at the start of $x, following 0 or more non-digits, you have something that's not 123?" If the pattern matcher had let \D* expand to "ABC", this would have caused the whole pattern to fail.

The search engine will initially match \D* with "ABC". Then it will try to match (?!123) with "123", which fails. But because a quantifier (\D*) has been used in the regular expression, the search engine can backtrack and retry the match differently in the hope of matching the complete regular expression.

The pattern really, really wants to succeed, so it uses the standard pattern back-off-and-retry and lets \D* expand to just "AB" this time. Now there's indeed something following "AB" that is not "123". It's "C123", which suffices.
We can deal with this by using both an assertion and a negation. We'll say that the first part in $1 must be followed both by a digit and by something that's not "123". Remember that the lookaheads are zero-width expressions--they only look, but don't consume any of the string in their match. So rewriting this way produces what you'd expect; that is, case 5 will fail, but case 6 succeeds:

    print "5: got $1\n" if $x =~ /^(\D*)(?=\d)(?!123)/;
    print "6: got $1\n" if $y =~ /^(\D*)(?=\d)(?!123)/;

    6: got ABC

In other words, the two zero-width assertions next to each other work as though they're ANDed together, just as you'd use any built-in assertions: /^$/ matches only if you're at the beginning of the line AND the end of the line simultaneously. The deeper underlying truth is that juxtaposition in regular expressions always means AND, except when you write an explicit OR using the vertical bar. /ab/ means match "a" AND (then) match "b", although the attempted matches are made at different positions because "a" is not a zero-width assertion, but a one-width assertion.

WARNING: Particularly complicated regular expressions can take exponential time to solve because of the immense number of possible ways they can use backtracking to try for a match. For example, without internal optimizations done by the regular expression engine, this will take a painfully long time to run:

    'aaaaaaaaaaaa' =~ /((a{0,5}){0,5})*[c]/

And if you used "*"'s in the internal groups instead of limiting them to 0 through 5 matches, then it would take forever--or until you ran out of stack space. Moreover, these internal optimizations are not always applicable. For example, if you put {0,5} instead of "*" on the external group, no current optimization is applicable, and the match takes a long time to finish.

A powerful tool for optimizing such beasts is what is known as an "independent group", which does not backtrack (see "(?>pattern)"). Note also that zero-length lookahead/lookbehind assertions will not backtrack to make the tail match, since they are in "logical" context: only whether they match is considered relevant. For an example where side-effects of lookahead might have influenced the following match, see "(?>pattern)".
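The ANDing of the two adjacent lookaheads discussed above can be confirmed with a runnable form of cases 5 and 6. The names $prefix_re and %prefix are illustrative, not part of perlre.

```perl
use strict;
use warnings;

# Both lookaheads are tested at the same position, right after (\D*):
# the next character must be a digit AND the next three must not be "123".
my $prefix_re = qr/^(\D*)(?=\d)(?!123)/;

my %prefix;
for my $text ('ABC123', 'ABC445') {
    $prefix{$text} = $text =~ $prefix_re ? $1 : undef;
    printf "%s: %s\n", $text, $prefix{$text} // 'no match';
}
```

Backtracking \D* to "AB" no longer rescues the first string, because (?=\d) then sees "C" and fails.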
#Script Runs

A script run is basically a sequence of characters, all from the same Unicode script (see "Scripts" in perlunicode), such as Latin or Greek. In most places a single word would never be written in multiple scripts, unless it is a spoofing attack. An infamous example is

    paypal.com

Those letters could all be Latin (as in the example just above), or they could be all Cyrillic (except for the dot), or they could be a mixture of the two. In the case of an internet address the .com would be in Latin, and any Cyrillic ones would cause it to be a mixture, not a script run. Someone clicking on such a link would not be directed to the real Paypal website, but an attacker would craft a look-alike one to attempt to gather sensitive information from the person.

Starting in Perl 5.28, it is now easy to detect strings that aren't script runs. Simply enclose just about any pattern like either of these:

    (*script_run:pattern)
    (*sr:pattern)

What happens is that after pattern succeeds in matching, it is subjected to the additional criterion that every character in it must be from the same script (see exceptions below). If this isn't true, backtracking occurs until something all in the same script is found that matches, or all possibilities are exhausted. This can cause a lot of backtracking, but generally, only malicious input will result in this, though the slow down could cause a denial of service attack. If your needs permit, it is best to make the pattern atomic to cut down on the amount of backtracking. This is so likely to be what you want, that instead of writing this:

    (*script_run:(?>pattern))

you can write either of these:

    (*atomic_script_run:pattern)
    (*asr:pattern)

(See "(?>pattern)".)

In Taiwan, Japan, and Korea, it is common for text to have a mixture of characters from their native scripts and base Chinese. Perl follows Unicode's UTS 39 (https://unicode.org/reports/tr39/) Unicode Security Mechanisms in allowing such mixtures. For example, the Japanese scripts Katakana and Hiragana are commonly mixed together in practice, along with some Chinese characters, and hence are treated as being in a single script run by Perl.
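As a rough sketch of the spoofing check described above, the script below compares a pure Latin word with a look-alike in which one letter is Cyrillic. It assumes Perl 5.28 or later; on 5.28 and 5.30 the construct is still experimental and emits a warning. The variable names are made up for the example.

```perl
use v5.28;
use warnings;

my $latin = "paypal";
my $mixed = "p\x{430}ypal";   # U+0430 CYRILLIC SMALL LETTER A replaces the Latin "a"

# Without the script-run check, both strings look like a plain word.
my @plain   = map { /^\w+$/ ? 1 : 0 } $latin, $mixed;

# With it, every character of the match must come from one script.
my @checked = map { /^(*sr:\w+)$/ ? 1 : 0 } $latin, $mixed;

say "plain:   @plain";
say "checked: @checked";
```

Anchoring with ^ and $ matters here: without the anchors the engine could backtrack to a shorter, single-script substring and still report a match.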
The rules used for matching decimal digits are slightly stricter. Many scripts have their own sets of digits equivalent to the Western 0 through 9 ones. A few, such as Arabic, have more than one set. For a string to be considered a script run, all digits in it must come from the same set of ten, as determined by the first digit encountered. As an example,

    qr/(*script_run: \d+ \b )/x

guarantees that the digits matched will all be from the same set of 10. You won't get a look-alike digit from a different script that has a different value than what it appears to be.

Unicode has three pseudo scripts that are handled specially.

"Unknown" is applied to code points whose meaning has yet to be determined. Perl currently will match as a script run, any single character string consisting of one of these code points. But any string longer than one code point containing one of these will not be considered a script run.

"Inherited" is applied to characters that modify another, such as an accent of some type. These are considered to be in the script of the master character, and so never cause a script run to not match.

The other one is "Common". This consists of mostly punctuation, emoji, and characters used in mathematics and music, the ASCII digits 0 through 9, and full-width forms of these digits. These characters can appear intermixed in text in many of the world's scripts. These also don't cause a script run to not match. But like other scripts, all digits in a run must come from the same set of 10.

This construct is non-capturing. You can add parentheses to pattern to capture, if desired. You will have to do this if you plan to use "(*ACCEPT) (*ACCEPT:arg)" and not have it bypass the script run checking.

The Script_Extensions property as modified by UTS 39 (https://unicode.org/reports/tr39/) is used as the basis for this feature.

To summarize,

  * All length 0 or length 1 sequences are script runs.

  * A longer sequence is a script run if and only if all of the following conditions are met:

    1. No code point in the sequence has the Script_Extension property of Unknown.

       This currently means that all code points in the sequence have been assigned by Unicode to be characters that aren't private use nor surrogate code points.

    2. All characters in the sequence come from the Common script and/or the Inherited script and/or a single other script.

       The script of a character is determined by the Script_Extensions property as modified by UTS 39 (https://unicode.org/reports/tr39/), as described above.

    3. All decimal digits in the sequence come from the same block of 10 consecutive digits.

#Special Backtracking Control Verbs

These special patterns are generally of the form (*VERB:arg). Unless otherwise stated the arg argument is optional; in some cases, it is mandatory.

Any pattern containing a special backtracking verb that allows an argument has the special behaviour that when executed it sets the current package's $REGERROR and $REGMARK variables. When doing so the following rules apply:

On failure, the $REGERROR variable will be set to the arg value of the verb pattern, if the verb was involved in the failure of the match. If the arg part of the pattern was omitted, then $REGERROR will be set to the name of the last (*MARK:NAME) pattern executed, or to TRUE if there was none. Also, the $REGMARK variable will be set to FALSE.

On a successful match, the $REGERROR variable will be set to FALSE, and the $REGMARK variable will be set to the name of the last (*MARK:NAME) pattern executed. See the explanation for the (*MARK:NAME) verb below for more details.

NOTE: $REGERROR and $REGMARK are not magic variables like $1 and most other regex-related variables. They are not local to a scope, nor readonly, but instead are volatile package variables similar to $AUTOLOAD. They are set in the package containing the code that executed the regex (rather than the one that compiled it, where those differ). If necessary, you can use local to localize changes to these variables to a specific scope before executing a regex.

If a pattern does not contain a special backtracking verb that allows an argument, then $REGERROR and $REGMARK are not touched at all.

#Verbs

#(*PRUNE) (*PRUNE:NAME)

This zero-width pattern prunes the backtracking tree at the current point when backtracked into on failure. Consider the pattern /A (*PRUNE) B/, where A and B are complex patterns. Until the (*PRUNE) verb is reached, A may backtrack as necessary to match. Once it is reached, matching continues in B, which may also backtrack as necessary; however, should B not match, then no further backtracking will take place, and the pattern will fail outright at the current starting position.
The following example counts all the possible matching strings in a pattern (without actually matching any of them).

    'aaab' =~ /a+b?(?{print "$&\n"; $count++})(*FAIL)/;
    print "Count=$count\n";

which produces:

    aaab
    aaa
    aa
    a
    aab
    aa
    a
    ab
    a
    Count=9

If we add a (*PRUNE) before the count like the following

    'aaab' =~ /a+b?(*PRUNE)(?{print "$&\n"; $count++})(*FAIL)/;
    print "Count=$count\n";

we prevent backtracking and find the count of the longest matching string at each matching starting point like so:

    aaab
    aab
    ab
    Count=3

Any number of (*PRUNE) assertions may be used in a pattern.

See also "(?>pattern)" and possessive quantifiers for other ways to control backtracking. In some cases, the use of (*PRUNE) can be replaced with a (?>pattern) with no functional difference; however, (*PRUNE) can be used to handle cases that cannot be expressed using a (?>pattern) alone.

#(*SKIP) (*SKIP:NAME)

This zero-width pattern is similar to (*PRUNE), except that on failure it also signifies that whatever text that was matched leading up to the (*SKIP) pattern being executed cannot be part of any match of this pattern. This effectively means that the regex engine "skips" forward to this position on failure and tries to match again (assuming that there is sufficient room to match).

The name of the (*SKIP:NAME) pattern has special significance. If a (*MARK:NAME) was encountered while matching, then it is that position which is used as the "skip point". If no (*MARK) of that name was encountered, then the (*SKIP) operator has no effect. When used without a name the "skip point" is where the match point was when executing the (*SKIP) pattern.

Compare the following to the examples in (*PRUNE); note the string is twice as long:

    'aaabaaab' =~ /a+b?(*SKIP)(?{print "$&\n"; $count++})(*FAIL)/;
    print "Count=$count\n";

outputs

    aaab
    aaab
    Count=2

Once the 'aaab' at the start of the string has matched, and the (*SKIP) executed, the next starting point will be where the cursor was when the (*SKIP) was executed.
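One common way to put the skip behaviour described above to work is to pair (*SKIP) with (*FAIL) in an alternation, so that unwanted text is consumed and then discarded. The sketch below is an illustration of that idiom rather than an example from perlre; the input string and the name @bare_words are assumptions.

```perl
use strict;
use warnings;

my $text = 'alpha "beta gamma" delta "epsilon" zeta';

# First branch: consume a whole quoted string, then (*SKIP)(*FAIL) rejects
# it and restarts the search just after the closing quote.
# Second branch: an ordinary word outside quotes.
my @bare_words = $text =~ /"[^"]*"(*SKIP)(*FAIL)|\w+/g;

print "@bare_words\n";    # alpha delta zeta
```

Without the (*SKIP), the engine would retry one character after the opening quote and report "beta" as a bare word.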
#(*MARK:NAME) (*:NAME)

This zero-width pattern can be used to mark the point reached in a string when a certain part of the pattern has been successfully matched. This mark may be given a name. A later (*SKIP) pattern will then skip forward to that point if backtracked into on failure. Any number of (*MARK) patterns are allowed, and the NAME portion may be duplicated.

In addition to interacting with the (*SKIP) pattern, (*MARK:NAME) can be used to "label" a pattern branch, so that after matching, the program can determine which branches of the pattern were involved in the match.

When a match is successful, the $REGMARK variable will be set to the name of the most recently executed (*MARK:NAME) that was involved in the match.

This can be used to determine which branch of a pattern was matched without using a separate capture group for each branch, which in turn can result in a performance improvement, as perl cannot optimize /(?:(x)|(y)|(z))/ as efficiently as something like /(?:x(*MARK:x)|y(*MARK:y)|z(*MARK:z))/.

When a match has failed, and unless another verb has been involved in failing the match and has provided its own name to use, the $REGERROR variable will be set to the name of the most recently executed (*MARK:NAME).

See "(*SKIP)" for more details.

As a shortcut (*MARK:NAME) can be written (*:NAME).

#(*THEN) (*THEN:NAME)

This is similar to the "cut group" operator :: from Raku. Like (*PRUNE), this verb always matches, and when backtracked into on failure, it causes the regex engine to try the next alternation in the innermost enclosing group (capturing or otherwise) that has alternations. The two branches of a (?(condition)yes-pattern|no-pattern) do not count as an alternation, as far as (*THEN) is concerned.

Its name comes from the observation that this operation combined with the alternation operator ("|") can be used to create what is essentially a pattern-based if/then/else block:

    ( COND (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ )

Note that if this operator is used and NOT inside of an alternation then it acts exactly like the (*PRUNE) operator.

    / A (*PRUNE) B /

is the same as

    / A (*THEN) B /

but

    / ( A (*THEN) B | C ) /

is not the same as

    / ( A (*PRUNE) B | C ) /

as after matching the A but failing on the B the (*THEN) verb will backtrack and try C; but the (*PRUNE) verb will simply fail.
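Both ideas above, labelling branches with (*MARK:NAME) and the difference between (*THEN) and (*PRUNE) inside an alternation, can be tried in a few lines. This is a sketch only: the mark names (saw_x and so on) and the variable names are invented for the illustration.

```perl
use strict;
use warnings;

our $REGMARK;    # package variable set by the regex engine

# Label each branch instead of capturing it.
my %branch_for;
for my $letter (qw(x y z)) {
    if ($letter =~ /(?:x(*MARK:saw_x)|y(*MARK:saw_y)|z(*MARK:saw_z))/) {
        $branch_for{$letter} = $REGMARK;
    }
}
print "$_ took branch $branch_for{$_}\n" for sort keys %branch_for;

# After "a" matches and "b" fails, (*THEN) moves on to the next
# alternative, while (*PRUNE) fails at this starting position.
my $then_ok  = 'ac' =~ /^(?:a(*THEN)b|ac)$/  ? 1 : 0;
my $prune_ok = 'ac' =~ /^(?:a(*PRUNE)b|ac)$/ ? 1 : 0;
print "then=$then_ok prune=$prune_ok\n";
```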
#(*COMMIT) (*COMMIT:arg)

This is the Raku "commit pattern" or :::. It's a zero-width pattern similar to (*SKIP), except that when backtracked into on failure it causes the match to fail outright. No further attempts to find a valid match by advancing the start pointer will occur again. For example,

    'aaabaaab' =~ /a+b?(*COMMIT)(?{print "$&\n"; $count++})(*FAIL)/;
    print "Count=$count\n";

outputs

    aaab
    Count=1

In other words, once the (*COMMIT) has been entered, and if the pattern does not match, the regex engine will not try any further matching on the rest of the string.

#(*FAIL) (*F) (*FAIL:arg)

This pattern matches nothing and always fails. It can be used to force the engine to backtrack. It is equivalent to (?!), but easier to read. In fact, (?!) gets optimised into (*FAIL) internally. You can provide an argument so that if the match fails because of this FAIL directive the argument can be obtained from $REGERROR.

It is probably useful only when combined with (?{}) or (??{}).

#(*ACCEPT) (*ACCEPT:arg)

This pattern matches nothing and causes the end of successful matching at the point at which the (*ACCEPT) pattern was encountered, regardless of whether there is actually more to match in the string. When inside of a nested pattern, such as recursion, or in a subpattern dynamically generated via (??{}), only the innermost pattern is ended immediately.

If the (*ACCEPT) is inside of capturing groups then the groups are marked as ended at the point at which the (*ACCEPT) was encountered. For instance:

    'AB' =~ /(A (A|B(*ACCEPT)|C) D)(E)/x;

will match, and $1 will be AB and $2 will be "B", $3 will not be set. If another branch in the inner parentheses was matched, such as in the string 'ACDE', then the "D" and "E" would have to be matched as well.

You can provide an argument, which will be available in the var $REGMARK after the match completes.
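The capture behaviour of (*ACCEPT) described above can be observed directly. The sketch below runs the documented pattern against both inputs mentioned in the text; the names $accept_re and %groups are assumptions for the illustration.

```perl
use strict;
use warnings;

my $accept_re = qr/(A(A|B(*ACCEPT)|C)D)(E)/;

my %groups;
for my $input ('AB', 'ACDE') {
    if ($input =~ $accept_re) {
        # Groups closed early by (*ACCEPT) keep what they had matched so far;
        # groups never reached stay undefined.
        $groups{$input} = [ map { $_ // 'unset' } $1, $2, $3 ];
        print "$input: @{ $groups{$input} }\n";
    }
}
```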
#Warning on \1 Instead of $1

Some people get too used to writing things like:

    $pattern =~ s/(\W)/\\\1/g;

This is grandfathered (for \1 to \9) for the RHS of a substitute to avoid shocking the sed addicts, but it's a dirty habit to get into. That's because in PerlThink, the righthand side of an s/// is a double-quoted string. \1 in the usual double-quoted string means a control-A. The customary Unix meaning of \1 is kludged in for s///. However, if you get into the habit of doing that, you get yourself into trouble if you then add an /e modifier.

    s/(\d+)/ \1 + 1 /eg;            # causes warning under -w

Or if you try to do

    s/(\d+)/\1000/;

You can't disambiguate that by saying \{1}000, whereas you can fix it with ${1}000. The operation of interpolation should not be confused with the operation of matching a backreference. Certainly they mean two different things on the left side of the s///.

#Repeated Patterns Matching a Zero-length Substring

WARNING: Difficult material (and prose) ahead. This section needs a rewrite.

Regular expressions provide a terse and powerful programming language. As with most other power tools, power comes together with the ability to wreak havoc.

A common abuse of this power stems from the ability to make infinite loops using regular expressions, with something as innocuous as:

    'foo' =~ m{ ( o? )* }x;

The o? matches at the beginning of "foo", and since the position in the string is not moved by the match, o? would match again and again because of the "*" quantifier. Another common way to create a similar cycle is with the looping modifier /g:

    @matches = ( 'foo' =~ m{ o? }xg );

or

    print "match: <$&>\n" while 'foo' =~ m{ o? }xg;

or the loop implied by split().

However, long experience has shown that many programming tasks may be significantly simplified by using repeated subexpressions that may match zero-length substrings. Here are two simple examples:

    @chars = split //, $string;           # // is not magic in split
    ($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /

Thus Perl allows such constructs, by forcefully breaking the infinite loop. The rules for this are different for lower-level loops given by the greedy quantifiers *+{}, and for higher-level ones like the /g modifier or split() operator.
The lower-level loops are interrupted (that is, the loop is broken) when Perl detects that a repeated expression matched a zero-length substring. Thus

    m{ (?: NON_ZERO_LENGTH | ZERO_LENGTH )* }x;

is made equivalent to

    m{ (?: NON_ZERO_LENGTH )* (?: ZERO_LENGTH )? }x;

For example, this program

    #!perl -l
    "aaaaab" =~ /
      (?:
         a                 # non-zero
         |                 # or
        (?{print "hello"}) # print hello whenever this
                           #    branch is tried
        (?=(b))            # zero-width assertion
      )*  # any number of times
     /x;
    print $&;
    print $1;

prints

    hello
    aaaaa
    b

Notice that "hello" is only printed once, as when Perl sees that the sixth iteration of the outermost (?:)* matches a zero-length string, it stops the "*".

The higher-level loops preserve an additional state between iterations: whether the last match was zero-length. To break the loop, the following match after a zero-length match is prohibited to have a length of zero. This prohibition interacts with backtracking (see "Backtracking"), and so the second best match is chosen if the best match is of zero length.

For example:

    $_ = 'bar';
    s/\w??/<$&>/g;

results in <><b><><a><><r><>. At each position of the string the best match given by non-greedy ?? is the zero-length match, and the second best match is what is matched by \w. Thus zero-length matches alternate with one-character-long matches.

Similarly, for repeated m/()/g the second-best match is the match at the position one notch further in the string.

The additional state of being matched with zero-length is associated with the matched string, and is reset by each assignment to pos(). Zero-length matches at the end of the previous match are ignored during split.
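The alternation between zero-length and one-character matches in a /g loop, and the behaviour of split with an empty pattern, can both be reproduced in a short script. The variable names here are illustrative.

```perl
use strict;
use warnings;

# /g loop: an empty match may not be followed by another empty match at the
# same position, so the engine falls back to the one-character alternative.
(my $tagged = 'bar') =~ s/\w??/<$&>/g;
print "$tagged\n";             # <><b><><a><><r><>

# split with an empty pattern returns the individual characters.
my @chars = split //, 'bar';
print scalar(@chars), "\n";    # 3
```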
Such combinations can include alternatives, leading to a problem of choice: if we match a regular expression a|ab against "abc", will it match substring "a" or "ab"? One way to describe which substring is actually matched is the concept of backtracking (see "Backtracking"). However, this description is too low-level and makes you think in terms of a particular implementation.

Another description starts with notions of "better"/"worse". All the substrings which may be matched by the given regular expression can be sorted from the "best" match to the "worst" match, and it is the "best" match which is chosen. This substitutes the question of "what is chosen?" by the question of "which matches are better, and which are worse?".

Again, for elementary pieces there is no such question, since at most one match at a given position is possible. This section describes the notion of better/worse for combining operators. In the description below "S" and "T" are regular subexpressions.

#ST

Consider two possible matches, AB and A'B', "A" and A' are substrings which can be matched by "S", "B" and B' are substrings which can be matched by "T".

If "A" is a better match for "S" than A', AB is a better match than A'B'.

If "A" and A' coincide: AB is a better match than AB' if "B" is a better match for "T" than B'.

#S|T

When "S" can match, it is a better match than when only "T" can match.

Ordering of two matches for "S" is the same as for "S". Similar for two matches for "T".

#S{REPEAT_COUNT}

Matches as SSS...S (repeated as many times as necessary).

#S{min,max}

Matches as S{max}|S{max-1}|...|S{min+1}|S{min}.

#S{min,max}?

Matches as S{min}|S{min+1}|...|S{max-1}|S{max}.

#S?, S*, S+

Same as S{0,1}, S{0,BIG_NUMBER}, S{1,BIG_NUMBER} respectively.

#S??, S*?, S+?

Same as S{0,1}?, S{0,BIG_NUMBER}?, S{1,BIG_NUMBER}? respectively.

#(?>S)

Matches the best match for "S" and only that.

#(?=S), (?<=S)

Only the best match for "S" is considered. (This is important only if "S" has capturing parentheses, and backreferences are used somewhere else in the whole regular expression.)

#(?!S), (?<!S)

#Creating Custom RE Engines

        \&convert;
    }

    sub invalid { die "/$_[0]/: invalid escape '\\$_[1]'"}

    # We must also take care of not escaping the legitimate \\Y|
    # sequence, hence the presence of '\\' in the conversion rules.
    my %rules = ( '\\' => '\\\\',
                  'Y|' => qr/(?=\S)(?;
    chomp $re;
    $re = customre::convert $re;
    /\Y|$re\Y|/;

#Embedded Code Execution Frequency

The exact rules for how often (??{}) and (?{}) are executed in a pattern are unspecified. In the case of a successful match you can assume that they DWIM and will be executed in left to right order the appropriate number of times in the accepting path of the pattern as would any other meta-pattern. How non-accepting pathways and match failures affect the number of times a pattern is executed is specifically unspecified and may vary depending on what optimizations can be applied to the pattern and is likely to change from version to version.

For instance in

    "aaabcdeeeee"=~/a(?{print "a"})b(?{print "b"})cde/;

the exact number of times "a" or "b" are printed out is unspecified for failure, but you may assume they will be printed at least once during a successful match; additionally you may assume that if "b" is printed, it will be preceded by at least one "a".

In the case of branching constructs like the following:

    /a(b|(?{ print "a" }))c(?{ print "c" })/;

you can assume that the input "ac" will output "ac", and that "abc" will output only "c".

When embedded code is quantified, successful matches will call the code once for each matched iteration of the quantifier. For example:

    "good" =~ /g(?:o(?{print "o"}))*d/;

will output "o" twice.

#PCRE/Python Support

As of Perl 5.10.0, Perl supports several Python/PCRE-specific extensions to the regex syntax. While Perl programmers are encouraged to use the Perl-specific syntax, the following are also accepted:

#(?P<NAME>pattern)

Define a named capture group. Equivalent to (?<NAME>pattern).

#(?P=NAME)

Backreference to a named capture group. Equivalent to \g{NAME}.

#(?P>NAME)

Subroutine call to a named capture group. Equivalent to (?&NAME).

#BUGS

There are a number of issues with regard to case-insensitive matching in Unicode rules. See "i" under "Modifiers" above.

This document varies from difficult to understand to completely and utterly opaque. The wandering prose riddled with jargon is hard to fathom in several places.

This document needs a rewrite that separates the tutorial content from the reference content.
#SEE ALSO

The syntax of patterns used in Perl pattern matching evolved from those supplied in the Bell Labs Research Unix 8th Edition (Version 8) regex routines. (The code is actually derived (distantly) from Henry Spencer's freely redistributable reimplementation of those V8 routines.)

perlrequick.

perlretut.

"Regexp Quote-Like Operators" in perlop.

"Gory details of parsing quoted constructs" in perlop.

perlfaq6.

"pos" in perlfunc.

perllocale.

perlebcdic.

Mastering Regular Expressions by Jeffrey Friedl, published by O'Reilly and Associates.

The Perl documentation is maintained by the Perl 5 Porters in the development of Perl. Please contact them via the Perl issue tracker, the mailing list, or IRC to report any issues with the contents or format of the documentation.


