4.8 Regular Expressions - Racket Documentation

Regular Expressions in The Racket Guide introduces regular expressions.

Regular expressions are specified as strings or byte strings, using the same pattern language as either the Unix utility egrep or Perl. A string-specified pattern produces a character
regexp matcher, and a byte-string pattern produces a byte regexp matcher. If a character regexp is used with a byte string or input port, it matches UTF-8 encodings (see Encodings and Locales) of matching character streams; if a byte regexp is used with a character string, it matches bytes in the UTF-8 encoding of the string.

A regular expression that is represented as a string or byte string can be compiled to a regexp value, which can be used more efficiently by functions such as regexp-match compared to the string or byte string form. The regexp and byte-regexp procedures convert a string or byte string (respectively) into a regexp value using a syntax of regular expressions that is most compatible with egrep. The pregexp and byte-pregexp procedures produce a regexp value using a slightly different syntax of regular expressions that is more compatible with Perl.

Two regexp values are equal? if they have the same source, use the same pattern language, and are both character regexps or both byte regexps.

A literal or printed regexp value starts with #rx or #px. See Reading Regular Expressions for information on reading regular expressions and Printing Regular Expressions for information on printing regular expressions. Regexp values produced by the default reader are interned in read-syntax mode.

On the BC variant of Racket, the internal size of a regexp value is limited to 32 kilobytes; this limit roughly corresponds to a source string with 32,000 literal characters or 5,000 operators.

4.8.1 Regexp Syntax

The following syntax specifications describe the content of a string that represents a regular expression. The syntax of the corresponding string may involve extra escape characters. For example, the regular expression (.*)\1 can be represented with the string "(.*)\\1" or the regexp constant #rx"(.*)\\1"; the \ in the regular expression must be escaped to include it in a string or regexp constant.

The regexp and pregexp syntaxes share a common core:

 ‹regexp›  ::= ‹pces›                    Match ‹pces›
            |  ‹regexp›|‹regexp›         Match either ‹regexp›, try left first          ex1
 ‹pces›    ::=                           Match empty
            |  ‹pce›‹pces›               Match ‹pce› followed by ‹pces›
 ‹pce›     ::= ‹repeat›                  Match ‹repeat›, longest possible               ex3
            |  ‹repeat›?
                                         Match ‹repeat›, shortest possible              ex6
            |  ‹atom›                    Match ‹atom› exactly once
 ‹repeat›  ::= ‹atom›*                   Match ‹atom› 0 or more times                   ex3
            |  ‹atom›+                   Match ‹atom› 1 or more times                   ex4
            |  ‹atom›?                   Match ‹atom› 0 or 1 times                      ex5
 ‹atom›    ::= (‹regexp›)                Match sub-expression ‹regexp› and report       ex11
            |  [‹rng›]                   Match any character in ‹rng›                   ex2
            |  [^‹rng›]                  Match any character not in ‹rng›               ex12
            |  .                         Match any (except newline in multi mode)       ex13
            |  ^                         Match start (or after newline in multi mode)   ex14
            |  $                         Match end (or before newline in multi mode)    ex15
            |  ‹literal›                 Match a single literal character               ex1
            |  (?‹mode›:‹regexp›)        Match ‹regexp› using ‹mode›                    ex35
            |  (?>‹regexp›)              Match ‹regexp›, only first possible
            |  ‹look›                    Match empty if ‹look› matches
            |  (?‹tst›‹pces›|‹pces›)     Match 1st ‹pces› if ‹tst›, else 2nd ‹pces›     ex36
            |  (?‹tst›‹pces›)            Match ‹pces› if ‹tst›, empty if not ‹tst›
            |  \ at end of pattern       Match the nul character (ASCII 0)
 ‹rng›     ::= ]                         ‹rng› contains ] only                          ex27
            |  -                         ‹rng› contains - only                          ex28
            |  ‹mrng›                    ‹rng› contains everything in ‹mrng›
            |  ‹mrng›-                   ‹rng› contains - and everything in ‹mrng›
 ‹mrng›    ::= ]‹lrng›                   ‹mrng› contains ] and everything in ‹lrng›     ex29
            |  -‹lrng›                   ‹mrng› contains - and everything in ‹lrng›     ex29
            |  ‹lirng›                   ‹mrng› contains everything in ‹lirng›
 ‹lirng›   ::= ‹riliteral›               ‹lirng› contains a literal character
            |  ‹riliteral›-‹rliteral›    ‹lirng› contains Unicode range inclusive       ex22
            |  ‹lirng›‹lrng›             ‹lirng› contains everything in both
 ‹lrng›    ::= ^                         ‹lrng› contains ^                              ex30
            |  ‹rliteral›-‹rliteral›     ‹lrng› contains Unicode range inclusive
            |  ^‹lrng›                   ‹lrng› contains ^ and more
            |  ‹lirng›                   ‹lrng› contains everything in ‹lirng›
 ‹look›    ::= (?=‹regexp›)              Match if ‹regexp› matches                      ex31
            |  (?!‹regexp›)              Match if ‹regexp› doesn't match                ex32
            |  (?<=‹regexp›)             Match if ‹regexp› matches preceding            ex33
            |  (?<!‹regexp›)             Match if ‹regexp› doesn't match preceding      ex34
 ...

Examples:
> (regexp-match #rx"a|b" "cat") ; ex1
'("a")
> (regexp-match #rx"[at]" "cat") ; ex2
'("a")
> (regexp-match #rx"ca*[at]" "caaat") ; ex3
'("caaat")
> (regexp-match #rx"ca+[at]" "caaat") ; ex4
'("caaat")
> (regexp-match #rx"ca?t?" "ct") ; ex5
'("ct")
> (regexp-match #rx"ca*?[at]" "caaat") ; ex6
'("ca")
> (regexp-match #px"ca{2}" "caaat") ; ex7, uses #px
'("caa")
> (regexp-match #px"ca{2,}t" "catcaat") ; ex8, uses #px
'("caat")
> (regexp-match #px"ca{,2}t" "caaatcat") ; ex9, uses #px
'("cat")
> (regexp-match #px"ca{1,2}t" "caaatcat") ; ex10, uses #px
'("cat")
> (regexp-match #rx"(c*)(a*)" "caat") ; ex11
'("caa" "c" "aa")
> (regexp-match #rx"[^ca]" "caat") ; ex12
'("t")
> (regexp-match #rx".(.)." "cat") ; ex13
'("cat" "a")
> (regexp-match #rx"^a|^c" "cat") ; ex14
'("c")
> (regexp-match #rx"a$|t$" "cat") ; ex15
'("t")
> (regexp-match #px"c(.)\\1t" "caat") ; ex16, uses #px
'("caat" "a")
> (regexp-match #px".\\b." "cat in hat") ; ex17, uses #px
'("t ")
> (regexp-match #px".\\B." "cat in hat") ; ex18, uses #px
'("ca")
> (regexp-match #px"\\p{Ll}" "Cat") ; ex19, uses #px
'("a")
> (regexp-match #px"\\P{Ll}" "cat!") ; ex20, uses #px
'("!")
> (regexp-match #rx"\\|" "c|t") ; ex21
'("|")
> (regexp-match #rx"[a-f]*" "cat") ; ex22
'("ca")
> (regexp-match #px"[a-f\\d]*" "1cat") ; ex23, uses #px
'("1ca")
> (regexp-match #px" [\\w]" "cat hat") ; ex24, uses #px
'(" h")
> (regexp-match #px"t[\\s]" "cat\nhat") ; ex25, uses #px
'("t\n")
> (regexp-match #px"[[:lower:]]+" "Cat") ; ex26, uses #px
'("at")
> (regexp-match #rx"[]]" "c]t") ; ex27
'("]")
> (regexp-match #rx"[-]" "c-t") ; ex28
'("-")
> (regexp-match #rx"[]a[]+" "c[a]t") ; ex29
'("[a]")
> (regexp-match #rx"[a^]+" "ca^t") ; ex30
'("a^")
> (regexp-match #rx".a(?=p)" "catnap") ; ex31
'("na")
> (regexp-match #rx".a(?!t)" "catnap") ; ex32
'("na")
> (regexp-match #rx"(?<=n)a." "catnap") ; ex33
'("ap")
> (regexp-match #rx"(?<!c)a." "catnap") ; ex34
'("ap")
> (regexp-match #rx"(?i:a)[tp]" "cATnAp") ; ex35
'("Ap")
> (regexp-match #rx"(?(?<=c)a|b)+" "cabal") ; ex36
'("ab")

4.8.2 Additional Syntactic Constraints

In addition to matching a grammar, regular expressions must meet two syntactic restrictions:

  - In a ‹repeat› other than ‹atom›?, the ‹atom› must not match an empty sequence.
  - In a (?<=‹regexp›) or (?<!‹regexp›), the ‹regexp› must match a bounded sequence only.

 ...

Each rule below lists its premises, then =>, then its conclusion; entries without => have no premises.

  ‹atom› : [n,m]   n>0   =>   ‹atom›* : [0,∞]
  ‹atom› : [n,m]   n>0   =>   ‹atom›+ : [1,∞]
  ‹atom› : [n,m]   =>   ‹atom›? : [0,m]
  ‹atom› : [n,m]   n>0   =>   ‹atom›{‹n›} : [n*‹n›,m*‹n›]
  ‹atom› : [n,m]   n>0   =>   ‹atom›{‹n›,} : [n*‹n›,∞]
  ‹atom› : [n,m]   n>0   =>   ‹atom›{,‹m›} : [0,m*‹m›]
  ‹atom› : [n,m]   n>0   =>   ‹atom›{‹n›,‹m›} : [n*‹n›,m*‹m›]
  ‹regexp› : [n,m]   =>   (‹regexp›) : [n,m]   αN = n
  ‹regexp› : [n,m]   =>   (?‹mode›:‹regexp›) : [n,m]
  ‹regexp› : [n,m]   =>   (?=‹regexp›) : [0,0]
  ‹regexp› : [n,m]   =>   (?!‹regexp›) : [0,0]
  ‹regexp› : [n,m]   m<∞   =>   (?<=‹regexp›) : [0,0]
  ‹regexp› : [n,m]   m<∞   =>   (?<!‹regexp›) : [0,0]
  ‹regexp› : [n,m]   =>   (?>‹regexp›) : [n,m]
  ‹tst› : [n0,m0]   ‹pces›1 : [n1,m1]   ‹pces›2 : [n2,m2]   =>   (?‹tst›‹pces›1|‹pces›2) : [min(n1,n2),max(m1,m2)]
  ‹tst› : [n0,m0]   ‹pces› : [n1,m1]   =>   (?‹tst›‹pces›) : [0,m1]

  (‹n›) : [αN,∞]
  [‹rng›] : [1,1]
  [^‹rng›] : [1,1]
  . : [1,1]
  ^ : [0,0]
  $ : [0,0]
  ‹literal› : [1,1]
  \‹n› : [αN,∞]
  ‹class› : [1,1]
  \b : [0,0]
  \B : [0,0]
  \p{‹property›} : [1,6]
  \P{‹property›} : [1,6]

4.8.3 Regexp Constructors

procedure
(regexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by regexp or pregexp, #f otherwise.

procedure
(pregexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by pregexp (not regexp), #f otherwise.

procedure
(byte-regexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by byte-regexp or byte-pregexp, #f otherwise.

procedure
(byte-pregexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by byte-pregexp (not byte-regexp), #f otherwise.

procedure
(regexp str) → regexp?
  str : string?
(regexp str handler) → any
  str : string?
  handler : (or/c #f (string?
                      -> any))

Takes a string representation of a regular expression (using the syntax in Regexp Syntax) and compiles it into a regexp value. Other regular expression procedures accept either a string or a regexp value as the matching pattern. If a regular expression string is used multiple times, it is faster to compile the string once to a regexp value and use it for repeated matches instead of using the string each time.

If handler is provided and not #f, it is called and its result is returned when str is not a valid representation of a regular expression; the argument to handler is a string that describes the problem with str. If handler is #f or not provided, then an exn:fail:contract exception is raised.

The object-name procedure returns the source string for a regexp value.

Examples:
> (regexp "ap*le")
#rx"ap*le"
> (object-name #rx"ap*le")
"ap*le"
> (regexp "+" (λ (s) (list s)))
'("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure
(pregexp str) → pregexp?
  str : string?
(pregexp str handler) → any
  str : string?
  handler : (or/c #f (string? -> any))

Like regexp, except that it uses a slightly different syntax (see Regexp Syntax). The result can be used with regexp-match, etc., just like the result from regexp.

Examples:
> (pregexp "ap*le")
#px"ap*le"
> (regexp? #px"ap*le")
#t
> (pregexp "+" (λ (s) (vector s)))
'#("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure
(byte-regexp bstr) → byte-regexp?
  bstr : bytes?
(byte-regexp bstr handler) → any
  bstr : bytes?
  handler : (or/c #f (bytes? -> any))

Takes a byte-string representation of a regular expression (using the syntax in Regexp Syntax) and compiles it into a byte-regexp value.

If handler is provided, it is called and its result is returned if bstr is not a valid representation of a regular expression.

The object-name procedure returns the source byte string for a regexp value.

Examples:
> (byte-regexp #"ap*le")
#rx#"ap*le"
> (object-name #rx#"ap*le")
#"ap*le"
> (byte-regexp "ap*le")
byte-regexp: contract violation
  expected: bytes?
  given: "ap*le"
> (byte-regexp #"+" (λ (s) (list s)))
'("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure
(byte-pregexp bstr) → byte-pregexp?
  bstr : bytes?
(byte-pregexp bstr handler) → any
  bstr : bytes?
  handler : (or/c #f (bytes? -> any))

Like byte-regexp, except that it uses a slightly different syntax (see Regexp Syntax). The result can be used with regexp-match, etc., just like the result from byte-regexp.

Examples:
> (byte-pregexp #"ap*le")
#px#"ap*le"
> (byte-pregexp #"+" (λ (s) (vector s)))
'#("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure
(regexp-quote str [case-sensitive?]) → string?
  str : string?
  case-sensitive? : any/c = #t
(regexp-quote bstr [case-sensitive?]) → bytes?
  bstr : bytes?
  case-sensitive? : any/c = #t

Produces a string or byte string suitable for use with regexp to match the literal sequence of characters in str or sequence of bytes in bstr. If case-sensitive? is true (the default), the resulting regexp matches letters in str or bytes case-sensitively, otherwise it matches case-insensitively.

Examples:
> (regexp-match "." "apple.scm")
'("a")
> (regexp-match (regexp-quote ".") "apple.scm")
'(".")

procedure
(regexp-max-lookbehind pattern) → exact-nonnegative-integer?
  pattern : (or/c regexp? byte-regexp?)

Returns the maximum number of bytes that pattern may consult before the starting position of a match to determine the match. For example, the pattern (?<=abc)d consults three bytes preceding a matching d, while e(?<=a..)d consults two bytes before a matching ed. A ^ pattern may consult a preceding byte to determine whether the current position is the start of the input or of a line.

4.8.4 Regexp Matching

procedure
(regexp-match pattern input [start-pos end-pos output-port input-prefix])
 → (if (and (or (string? pattern) (regexp? pattern))
            (or (string? input) (path? input)))
       (or/c #f (cons/c string? (listof (or/c string? #f))))
       (or/c #f (cons/c bytes?  (listof (or/c bytes?  #f)))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes?
                path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""

Attempts to match pattern (a string, byte string, regexp value, or byte-regexp value) once to a portion of input. The matcher finds a portion of input that matches and is closest to the start of the input (after start-pos).

If input is a path, it is converted to a byte string with path->bytes if pattern is a byte string or a byte-based regexp. Otherwise, input is converted to a string with path->string.

The optional start-pos and end-pos arguments select a portion of input for matching; the default is the entire string or the stream up to an end-of-file. When input is a string, start-pos is a character position; when input is a byte string, start-pos is a byte position; and when input is an input port, start-pos is the number of bytes to skip before starting to match. The end-pos argument can be #f, which corresponds to the end of the string or an end-of-file in the stream; otherwise, it is a character or byte position, like start-pos. If input is an input port, and if an end-of-file is reached before start-pos bytes are skipped, then the match fails.

In pattern, a start-of-string ^ refers to the first position of input after start-pos, assuming that input-prefix is #"". The end-of-input $ refers to position end-pos or (in the case of an input port) an end-of-file, whichever comes first.

The input-prefix specifies bytes that effectively precede input for the purposes of ^ and other look-behind matching. For example, a #"" prefix means that ^ matches at the beginning of the stream, while a #"\n" input-prefix means that a start-of-line ^ can match the beginning of the input, while a start-of-file ^ cannot.

If the match fails, #f is returned. If the match succeeds, a list containing strings or byte strings, and possibly #f, is returned. The list contains strings only if input is a string and pattern is not a byte regexp. Otherwise, the list contains byte strings (substrings of the UTF-8 encoding of input, if input is a string).

The first [byte] string in a result list is the portion of input that matched pattern. If two portions of input can match pattern, then the match that starts
earliest is found.

Additional [byte] strings are returned in the list if pattern contains parenthesized sub-expressions (but not when the opening parenthesis is followed by ?). Matches for the sub-expressions are provided in the order of the opening parentheses in pattern. When sub-expressions occur in branches of an | “or” pattern, in a * “zero or more” pattern, or other places where the overall pattern can succeed without a match for the sub-expression, then a #f is returned for the sub-expression if it did not contribute to the final match. When a single sub-expression occurs within a * “zero or more” pattern or other multiple-match positions, then the rightmost match associated with the sub-expression is returned in the list.

If the optional output-port is provided as an output port, the part of input from its beginning (not start-pos) that precedes the match is written to the port. All of input up to end-pos is written to the port if no match is found. This functionality is most useful when input is an input port.

When matching an input port, a match failure reads up to end-pos bytes (or end-of-file), even if pattern begins with a start-of-string ^; see also regexp-try-match. On success, all bytes up to and including the match are eventually read from the port, but matching proceeds by first peeking bytes from the port (using peek-bytes-avail!), and then (re-)reading matching bytes to discard them after the match result is determined. Non-matching bytes may be read and discarded before the match is determined. The matcher peeks in blocking mode only as far as necessary to determine a match, but it may peek extra bytes to fill an internal buffer if immediately available (i.e., without blocking). Greedy repeat operators in pattern, such as * or +, tend to force reading the entire content of the port (up to end-pos) to determine a match.

If the input port is read simultaneously by another thread, or if the port is a custom port with inconsistent reading and peeking procedures (see Custom Ports), then the bytes that are peeked and used for matching may be different than the bytes read and discarded after the match completes; the matcher inspects only the peeked bytes. To avoid such interleaving, use regexp-match-peek (with a progress-evt argument) followed by port-commit-peeked.

Examples:
> (regexp-match
    #rx"x." "12x4x6")
'("x4")
> (regexp-match #rx"y." "12x4x6")
#f
> (regexp-match #rx"x." "12x4x6" 3)
'("x6")
> (regexp-match #rx"x." "12x4x6" 3 4)
#f
> (regexp-match #rx#"x." "12x4x6")
'(#"x4")
> (regexp-match #rx"x." "12x4x6" 0 #f (current-output-port))
12
'("x4")
> (regexp-match #rx"(-[0-9]*)+" "a-12--345b")
'("-12--345" "-345")

procedure
(regexp-match* pattern input [start-pos end-pos input-prefix
               #:match-select match-select
               #:gap-select? gap-select])
 → (if (and (or (string? pattern) (regexp? pattern))
            (or (string? input) (path? input)))
       (listof (or/c string? (listof (or/c #f string?))))
       (listof (or/c bytes? (listof (or/c #f bytes?)))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  input-prefix : bytes? = #""
  match-select : (or/c (list? . -> . (or/c any/c list?)) #f) = car
  gap-select : any/c = #f

Like regexp-match, but the result is a list of strings or byte strings corresponding to a sequence of matches of pattern in input.

The pattern is used in order to find matches, where each match attempt starts at the end of the last match, and ^ is allowed to match the beginning of the input (if input-prefix is #"") only for the first match. Empty matches are handled like other matches, returning a zero-length string or byte sequence (they are more useful in making this a complement of regexp-split), but pattern is restricted from matching an empty sequence immediately after an empty match.

If input contains no matches (in the range start-pos to end-pos), null is returned. Otherwise, each item in the resulting list is a distinct substring or byte sequence from input that matches pattern. The end-pos argument can be #f to match to the end of input (which corresponds to an end-of-file if input is an input port).

Examples:
> (regexp-match* #rx"x." "12x4x6")
'("x4" "x6")
> (regexp-match* #rx"x*" "12x4x6")
'("" "" "x" "" "x" "" "")

match-select specifies the collected results. The default of car means that the result is the list of matches without returning parenthesized sub-patterns. It can be given as a ‘selector’ function which chooses an item from a list, or it can choose a list of items. For example, you can use cdr to get a list of lists of parenthesized sub-pattern matches, or values (as an identity function) to get the full matches as well. (Note that the selector must choose an element of its input list or a list of elements, but it must not inspect its input, which can be either a list of strings or a list of position pairs. Furthermore, the selector must be consistent in its choice(s).)

Examples:
> (regexp-match* #rx"x(.)" "12x4x6" #:match-select cadr)
'("4" "6")
> (regexp-match* #rx"x(.)" "12x4x6" #:match-select values)
'(("x4" "4") ("x6" "6"))

In addition, specifying gap-select as a non-#f value will make the result an interleaved list of the matches as well as the separators between the matches, starting and ending with a separator. In this case, match-select can be given as #f to return only the separators, making such uses equivalent to regexp-split.

Examples:
> (regexp-match* #rx"x(.)" "12x4x6" #:match-select cadr #:gap-select? #t)
'("12" "4" "" "6" "")
> (regexp-match* #rx"x(.)" "12x4x6" #:match-select #f #:gap-select? #t)
'("12" "" "")

procedure
(regexp-try-match pattern input [start-pos end-pos output-port input-prefix])
 → (or/c #f (cons/c bytes? (listof (or/c bytes? #f))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes?
                 = #""

Like regexp-match on input ports, except that if the match fails, no characters are read and discarded from input.

This procedure is especially useful with a pattern that begins with a start-of-string ^ or with a non-#f end-pos, since each limits the amount of peeking into the port. Otherwise, beware that a large portion of the stream may be peeked (and therefore pulled into memory) before the match succeeds or fails.

procedure
(regexp-match-positions pattern input [start-pos end-pos output-port input-prefix])
 → (or/c (cons/c (cons/c exact-nonnegative-integer?
                         exact-nonnegative-integer?)
                 (listof (or/c (cons/c exact-integer?
                                       exact-integer?)
                               #f)))
         #f)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""

Like regexp-match, but returns a list of number pairs (and #f) instead of a list of strings. Each pair of numbers refers to a range of characters or bytes in input. If the result for the same arguments with regexp-match would be a list of byte strings, the resulting ranges correspond to byte ranges; in that case, if input is a character string, the byte ranges correspond to bytes in the UTF-8 encoding of the string.

Range results are returned in a substring- and subbytes-compatible manner, independent of start-pos. In the case of an input port, the returned positions indicate the number of bytes that were read, including start-pos, before the first matching byte.

Examples:
> (regexp-match-positions #rx"x." "12x4x6")
'((2 . 4))
> (regexp-match-positions #rx"x." "12x4x6" 3)
'((4 . 6))
> (regexp-match-positions #rx"(-[0-9]*)+" "a-12--345b")
'((1 . 9) (5 . 9))

Range results after the first one can include negative numbers if input-prefix is non-empty and if pattern includes a lookbehind pattern. Such ranges start in the input-prefix instead of input. More generally, when start-pos is positive, then range results that are less than start-pos start in input-prefix.

Examples:
> (regexp-match-positions #rx"(?<=(.))." "a" 0 #f #f #"x")
'((0 . 1) (-1 . 0))
> (regexp-match-positions #rx"(?<=(..))." "a" 0 #f #f #"x")
#f
> (regexp-match-positions #rx"(?<=(..))." "_a" 1 #f #f #"x")
#f

Although input-prefix is always a byte string, when the returned positions are string indices and they refer to a portion of input-prefix, then they correspond to a UTF-8 decoding of a tail of input-prefix.

Examples:
> (bytes-length (string->bytes/utf-8 "λ"))
2
> (regexp-match-positions #rx"(?<=(.))." "a" 0 #f #f (string->bytes/utf-8 "λ"))
'((0 . 1) (-1 . 0))

procedure
(regexp-match-positions* pattern input [start-pos end-pos input-prefix
                         #:match-select match-select])
 → (or/c (listof (cons/c exact-nonnegative-integer?
                         exact-nonnegative-integer?))
         (listof (listof (or/c #f (cons/c exact-nonnegative-integer?
                                          exact-nonnegative-integer?)))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  input-prefix : bytes? = #""
  match-select : (list? . -> . (or/c any/c list?)) = car

Like regexp-match-positions, but returns multiple matches like regexp-match*.

Examples:
> (regexp-match-positions* #rx"x." "12x4x6")
'((2 . 4) (4 . 6))
> (regexp-match-positions* #rx"x(.)" "12x4x6" #:match-select cadr)
'((3 . 4) (5 . 6))

Note that unlike regexp-match*, there is no #:gap-select? input keyword, as this information can be easily inferred from the resulting matches.

procedure
(regexp-match? pattern input [start-pos end-pos output-port input-prefix]) → boolean?
  pattern : (or/c string?
                  bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""

Like regexp-match, but returns merely #t when the match succeeds, #f otherwise.

Examples:
> (regexp-match? #rx"x." "12x4x6")
#t
> (regexp-match? #rx"y." "12x4x6")
#f

procedure
(regexp-match-exact? pattern input) → boolean?
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path?)

Like regexp-match?, but #t is only returned when the first found match covers the entire content of input.

Examples:
> (regexp-match-exact? #rx"x." "12x4x6")
#f
> (regexp-match-exact? #rx"1.*x." "12x4x6")
#t

Beware that regexp-match-exact? can return #f if pattern generates a partial match for input first, even if pattern could also generate a complete match. To check if there is any match of pattern that covers all of input, use regexp-match? with ^(?:pattern)$ instead.

Examples:
> (regexp-match-exact? #rx"a|ab" "ab")
#f
> (regexp-match? #rx"^(?:a|ab)$" "ab")
#t

The (?:) grouping is necessary because alternation has lower precedence than concatenation; the regular expression without it, ^a|ab$, matches any input that either starts with a or ends with ab.

Example:
> (regexp-match? #rx"^a|ab$" "123ab")
#t

procedure
(regexp-match-peek pattern input [start-pos end-pos progress input-prefix])
 → (or/c (cons/c bytes? (listof (or/c bytes? #f)))
         #f)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  progress : (or/c evt #f) = #f
  input-prefix : bytes?
                 = #""

Like regexp-match on input ports, but only peeks bytes from input instead of reading them. Furthermore, instead of an output port, the last optional argument is a progress event for input (see port-progress-evt). If progress becomes ready, then the match stops peeking from input and returns #f. The progress argument can be #f, in which case the peek may continue with inconsistent information if another process meanwhile reads from input.

Examples:
> (define p (open-input-string "a abcd"))
> (regexp-match-peek ".*bc" p)
'(#"a abc")
> (regexp-match-peek ".*bc" p 2)
'(#"abc")
> (regexp-match ".*bc" p 2)
'(#"abc")
> (peek-char p)
#\d
> (regexp-match ".*bc" p)
#f
> (peek-char p)
#<eof>

procedure
(regexp-match-peek-positions pattern input [start-pos end-pos progress input-prefix])
 → (or/c (cons/c (cons/c exact-nonnegative-integer?
                         exact-nonnegative-integer?)
                 (listof (or/c (cons/c exact-nonnegative-integer?
                                       exact-nonnegative-integer?)
                               #f)))
         #f)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  progress : (or/c evt #f) = #f
  input-prefix : bytes? = #""

Like regexp-match-positions on input ports, but only peeks bytes from input instead of reading them, and with a progress argument like regexp-match-peek.

procedure
(regexp-match-peek-immediate pattern input [start-pos end-pos progress input-prefix])
 → (or/c (cons/c bytes? (listof (or/c bytes? #f)))
         #f)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  progress : (or/c evt #f) = #f
  input-prefix : bytes?
                 = #""

Like regexp-match-peek, but it attempts to match only bytes that are available from input without blocking. The match fails if not-yet-available characters might be used to match pattern.

procedure
(regexp-match-peek-positions-immediate pattern input [start-pos end-pos progress input-prefix])
 → (or/c (cons/c (cons/c exact-nonnegative-integer?
                         exact-nonnegative-integer?)
                 (listof (or/c (cons/c exact-nonnegative-integer?
                                       exact-nonnegative-integer?)
                               #f)))
         #f)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  progress : (or/c evt #f) = #f
  input-prefix : bytes? = #""

Like regexp-match-peek-positions, but it attempts to match only bytes that are available from input without blocking. The match fails if not-yet-available characters might be used to match pattern.

procedure
(regexp-match-peek-positions* pattern input [start-pos end-pos input-prefix
                              #:match-select match-select])
 → (or/c (listof (cons/c exact-nonnegative-integer?
                         exact-nonnegative-integer?))
         (listof (listof (or/c #f (cons/c exact-nonnegative-integer?
                                          exact-nonnegative-integer?)))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  input-prefix : bytes? = #""
  match-select : (list? . -> . (or/c any/c list?)) = car

Like regexp-match-peek-positions, but returns multiple matches like regexp-match-positions*.

procedure
(regexp-match/end pattern input [start-pos end-pos output-port input-prefix count])
 → (if (and (or (string? pattern) (regexp? pattern))
            (or (string? input) (path? input)))
       (or/c #f (cons/c string? (listof (or/c string? #f))))
       (or/c #f (cons/c bytes?  (listof (or/c bytes?  #f)))))
   (or/c #f bytes?)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""
  count : exact-nonnegative-integer? = 1

Like regexp-match, but with a second result: a byte string of up to count bytes that correspond to the input (possibly including the input-prefix) leading to the end of the match; the second result is #f if no match is found.

The second result can be useful as an input-prefix for attempting a second match on input starting from the end of the first match. In that case, use regexp-max-lookbehind to determine an appropriate value for count.

procedure
(regexp-match-positions/end pattern input [start-pos end-pos input-prefix count])
 → (listof (cons/c exact-nonnegative-integer?
                   exact-nonnegative-integer?))
   (or/c #f bytes?)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  input-prefix : bytes? = #""
  count : exact-nonnegative-integer? = 1

procedure
(regexp-match-peek-positions/end pattern input [start-pos end-pos progress input-prefix count])
 → (or/c (cons/c (cons/c exact-nonnegative-integer?
                         exact-nonnegative-integer?)
                 (listof (or/c (cons/c exact-nonnegative-integer?
                                       exact-nonnegative-integer?)
                               #f)))
         #f)
   (or/c #f bytes?)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  progress : (or/c evt #f) = #f
  input-prefix : bytes? = #""
  count : exact-nonnegative-integer?
          = 1

procedure
(regexp-match-peek-positions-immediate/end pattern input [start-pos end-pos progress input-prefix count])
 → (or/c (cons/c (cons/c exact-nonnegative-integer?
                         exact-nonnegative-integer?)
                 (listof (or/c (cons/c exact-nonnegative-integer?
                                       exact-nonnegative-integer?)
                               #f)))
         #f)
   (or/c #f bytes?)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  progress : (or/c evt #f) = #f
  input-prefix : bytes? = #""
  count : exact-nonnegative-integer? = 1

Like regexp-match-positions, etc., but with a second result like regexp-match/end.

4.8.5 Regexp Splitting

procedure
(regexp-split pattern input [start-pos end-pos input-prefix])
 → (if (and (or (string? pattern) (regexp? pattern))
            (string? input))
       (cons/c string? (listof string?))
       (cons/c bytes? (listof bytes?)))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  input-prefix : bytes?
= #""Thecomplementofregexp-match*:theresultisalistof strings(ifpatternisastringorcharacterregexpand inputisastring)orbytestrings(otherwise)from inputthatareseparatedbymatchesto pattern.Adjacentmatchesareseparatedwith""or #"".Zero-lengthmatchesaretreatedthesameasfor regexp-match*.Ifinputcontainsnomatches(intherangestart-pos toend-pos),theresultisalistcontaininginput’s content(fromstart-postoend-pos)asasingle element.Ifamatchoccursatthebeginningofinput(at start-pos),theresultinglistwillstartwithanempty stringorbytestring,andifamatchoccursattheend(at end-pos),thelistwillendwithanemptystringorbyte string.Theend-posargumentcanbe#f,inwhich casesplittinggoestotheendofinput(whichcorrespondsto anend-of-fileifinputisaninputport).Examples:>(regexp-split #rx"+" "12  34")'("12""34")>(regexp-split #rx"." "12  34")'("""""""""""""")>(regexp-split #rx"" "12  34")'("""1""2""""""3""4""")>(regexp-split #rx"*" "12  34")'("""1""2""""3""4""")>(regexp-split #px"\\b" "12,13and14.")'("""12"",""13""""and""""14"".")>(regexp-split #rx"+" "")'("")4.8.6 RegexpSubstitutionprocedure(regexp-replace pattern   input   insert   [input-prefix])  → (if (and (or (string? pattern) (regexp? pattern))         (string? input))    string?    bytes?)  pattern : (or/c string? bytes? regexp? byte-regexp?)  input : (or/c string? bytes?)  insert : (or/c string? bytes?      ((string?) () #:rest (listof string?) .->*. string?)      ((bytes?) () #:rest (listof bytes?) .->*. bytes?))  input-prefix : bytes? 
= #""Performsamatchusingpatternoninput,andthen returnsastringorbytestringinwhichthematchingportionof inputisreplacedwithinsert.Ifpattern matchesnopartofinput,theninputisreturned unmodified.Theinsertargumentcanbeeithera(byte)string,ora functionthatreturnsa(byte)string.Inthelattercase,the functionisappliedonthelistofvaluesthatregexp-match wouldreturn(i.e.,thefirstargumentisthecompletematch,andthen oneargumentforeachparenthesizedsub-expression)toobtaina replacement(byte)string.Ifpatternisastringorcharacterregexpandinput isastring,theninsertmustbeastringoraprocedurethat acceptstrings,andtheresultisastring.Ifpatternisa bytestringorbyteregexp,orifinputisabytestring, theninsertasastringisconvertedtoabytestring, insertasaprocedureiscalledwithabytestring,andthe resultisabytestring.Ifinsertcontains&,then& isreplacedwiththematchingportionofinputbeforeitis substitutedintothematch’splace.Ifinsertcontains \‹n›forsomeinteger‹n›,thenitis replacedwiththe‹n›thmatchingsub-expressionfrom input.A&and\0arealiases.If the‹n›thsub-expressionwasnotusedinthematch,orif ‹n›isgreaterthanthenumberofsub-expressionsin pattern,then\‹n›isreplacedwiththe emptystring.Tosubstitutealiteral&or\,use \&and\\,respectively,in insert.A\$ininsertis equivalenttoanemptysequence;thiscanbeusedtoterminatea number‹n›following\.Ifa\in insertisfollowedbyanythingotherthanadigit, &,\,or$,thenthe\ byitselfistreatedas\0.Notethatthe\describedinthepreviousparagraphsisa characterorbyteofinsert.Towritesuchaninsert asaRacketstringliteral,anescaping\isneeded beforethe\.Forexample,theRacketconstant "\\1"is\1.Examples:>(regexp-replace #rx"mi" "micasa" "su")"sucasa">(regexp-replace #rx"mi" "micasa" string-upcase)"MIcasa">(regexp-replace #rx"([Mm])i([a-zA-Z]*)" "MiCasa" "\\1y\\2")"MyCasa">(regexp-replace #rx"([Mm])i([a-zA-Z]*)" "micervezaMiMiMi"                  "\\1y\\2")"mycervezaMiMiMi">(regexp-replace #rx"x" "12x4x6" "\\\\")"12\\4x6">(display (regexp-replace #rx"x" "12x4x6" "\\\\"))12\4x6procedure(regexp-replace* pattern      
input      insert      [start-pos      end-pos      input-prefix]) → (or/c string? bytes?)  pattern : (or/c string? bytes? regexp? byte-regexp?)  input : (or/c string? bytes?)  insert : (or/c string? bytes?      ((string?) () #:rest (listof string?) .->*. string?)      ((bytes?) () #:rest (listof bytes?) .->*. bytes?))  start-pos : exact-nonnegative-integer? = 0  end-pos : (or/c exact-nonnegative-integer? #f) = #f  input-prefix : bytes? = #""Likeregexp-replace,exceptthateveryinstanceof patternininputisreplacedwithinsert, insteadofjustthefirstmatch.Theresultisinputonlyif therearenomatches,start-posis0,and end-posis#forthelengthofinput. Onlynon-overlappinginstancesof patternininputarereplaced,soinstancesof patternwithininsertedstringsarenotreplaced recursively.Zero-lengthmatchesaretreatedthesameasin regexp-match*.Theoptionalstart-posandend-posargumentsselect aportionofinputformatching;thedefaultistheentire stringorthestreamuptoanend-of-file.Examples:>(regexp-replace* #rx"([Mm])i([a-zA-Z]*)" "micervezaMiMiMi"                   "\\1y\\2")"mycervezaMyMiMi">(regexp-replace* #rx"([Mm])i([a-zA-Z]*)" "micervezaMiMiMi"                   (lambda (all one two)                     (string-append (string-downcase one) "y"                                    (string-upcase two))))"myCERVEZAmyMIMi">(regexp-replace* #px"\\w" "helloworld" string-upcase 0 5)"HELLOworld">(display (regexp-replace* #rx"x" "12x4x6" "\\\\"))12\4\6Changedinversion8.1.0.7ofpackagebase:Changedtoreturninputwhenno replacementsareperformed.procedure(regexp-replaces input replacements) → (or/c string? bytes?)  input : (or/c string? bytes?)  replacements : (listof (list/c (or/c string? bytes? regexp? byte-regexp?)         (or/c string? bytes?             ((string?) () #:rest (listof string?) .->*. string?)             ((bytes?) () #:rest (listof bytes?) .->*. 
bytes?))))Performsachainofregexp-replace*operations,whereeach elementinreplacementsspecifiesareplacementasa (listpatternreplacement).Thereplacementsaredonein order,solaterreplacementscanapplytopreviousinsertions.Examples:>(regexp-replaces "zero-or-more?"                   '([#rx"-" "_"] [#rx"(.*)\\?$" "is_\\1"]))"is_zero_or_more">(regexp-replaces "zero-or-more?"                   '([#rx"e" "o"] [#rx"o" "oo"]))"zooroo-oor-mooroo?"procedure(regexp-replace-quote str) → string?  str : string?(regexp-replace-quote bstr) → bytes?  bstr : bytes?Producesastringsuitableforuseasthethirdargumentto regexp-replacetoinserttheliteralsequenceofcharacters instrorbytesinbstrasareplacement. Concretely,every\and&instror bstrisprotectedbyaquoting\.Examples:>(regexp-replace #rx"UT" "GoUT!" "A&M")"GoAUTM!">(regexp-replace #rx"UT" "GoUT!" (regexp-replace-quote "A&M"))"GoA&M!"  top  contents  ←prev  up  next→  
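The suggestion above, that the second result of regexp-match/end can serve as the input-prefix of a follow-up match, can be sketched roughly as follows. This is only an illustration under assumptions: the pattern, the tiny inputs, and the names rx, lookbehind, first-match, tail, with-prefix, and without-prefix are all made up for the example, and the second match is run on a separate string that stands in for "the rest of the input".

```racket
#lang racket

;; A pattern with a one-character lookbehind: "b" only when preceded by "a".
(define rx #rx"(?<=a)b")

;; How many bytes of preceding context this pattern may need to inspect.
(define lookbehind (regexp-max-lookbehind rx))

;; First match: also request up to `lookbehind` bytes leading to the match end.
;; Optional arguments, in order: start-pos end-pos output-port input-prefix count.
(define-values (first-match tail)
  (regexp-match/end #rx"a" "ab" 0 #f #f #"" lookbehind))

;; Second match on the remaining input ("b"), passing `tail` as input-prefix
;; so that the lookbehind can see what preceded it.
(define with-prefix (regexp-match rx "b" 0 #f #f tail))

;; For comparison: the same match attempted with no prefix at all.
(define without-prefix (regexp-match rx "b"))
```

In this sketch the prefix is what lets the lookbehind succeed on the second attempt; without it, the pattern has no preceding context to examine.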

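As a small illustration of the regexp-split rules described above (empty strings at the edges, a single-element result when nothing matches, and byte-string results for byte-string input), here is a minimal sketch. The sample inputs and the names fields, edges, none, and bs are hypothetical and chosen only for the example.

```racket
#lang racket

;; Split on commas, absorbing any spaces around them.
(define fields (regexp-split #rx" *, *" "a, b ,c"))
;; fields is '("a" "b" "c")

;; A match at the very start or end of the input leaves an empty string there.
(define edges (regexp-split #rx"," ",a,b,"))
;; edges is '("" "a" "b" "")

;; No match: the result is a one-element list holding the input's content.
(define none (regexp-split #rx"," "abc"))
;; none is '("abc")

;; Byte-string input produces byte-string pieces, even with a character regexp.
(define bs (regexp-split #rx"," #"x,y"))
;; bs is '(#"x" #"y")
```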

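The insert syntax of regexp-replace described above (&, \‹n›, \$, a procedure as insert, and regexp-replace-quote) may be easier to follow with a few side-by-side cases. The following is a hedged sketch, not part of the reference text: the inputs and the names wrapped, swapped, digit-after, reordered, and literal are invented for illustration.

```racket
#lang racket

;; & stands for the whole match (an alias of \0).
(define wrapped (regexp-replace #rx"[0-9]+" "room 42" "<&>"))
;; wrapped is "room <42>"

;; \1 and \2 select parenthesized sub-expressions; in a Racket string
;; literal each backslash of the insert is written twice.
(define swapped (regexp-replace #rx"([a-z]+)=([0-9]+)" "x=10" "\\2=\\1"))
;; swapped is "10=x"

;; \$ is an empty sequence that ends the group number, so the 0 is literal.
(define digit-after (regexp-replace #rx"(a)" "a" "\\1\\$0"))
;; digit-after is "a0"

;; A procedure receives the whole match, then one argument per group.
(define reordered
  (regexp-replace #rx"([a-z]+)@([a-z]+)" "bob@example"
                  (lambda (all user host) (string-append host "!" user))))
;; reordered is "example!bob"

;; regexp-replace-quote protects & and \ so the text is inserted literally.
(define literal (regexp-replace #rx"X" "aXb" (regexp-replace-quote "R&D")))
;; literal is "aR&Db"
```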

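To contrast regexp-replace, regexp-replace*, and regexp-replaces as described above, here is a short sketch. The sample strings and the names first-only, every, doubled, and slug are assumptions made for the example; the "slug" transformation in particular is just one hypothetical use of chained replacements.

```racket
#lang racket

;; regexp-replace changes only the first match; regexp-replace* changes all.
(define first-only (regexp-replace  #rx"a" "banana" "o"))
;; first-only is "bonana"
(define every (regexp-replace* #rx"a" "banana" "o"))
;; every is "bonono"

;; Inserted text is not rescanned: each "a" becomes "aa" exactly once.
(define doubled (regexp-replace* #rx"a" "banana" "aa"))
;; doubled is "baanaanaa"

;; regexp-replaces applies the pairs in order, so the second pattern
;; sees the output of the first.
(define slug
  (regexp-replaces "Hello,  World!"
                   '([#rx"[^A-Za-z]+" "-"] [#rx"-$" ""])))
;; slug is "Hello-World"
```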