4.8 Regular Expressions - Racket Documentation
Regular Expressions in The Racket Guide introduces regular expressions.

Regular expressions are specified as strings or byte strings, using the same pattern language as either the Unix utility egrep or Perl. A string-specified pattern produces a character regexp matcher, and a byte-string pattern produces a byte regexp matcher. If a character regexp is used with a byte string or input port, it matches UTF-8 encodings (see Encodings and Locales) of matching character streams; if a byte regexp is used with a character string, it matches bytes in the UTF-8 encoding of the string.

A regular expression that is represented as a string or byte string can be compiled to a regexp value, which functions such as regexp-match can use more efficiently than the string or byte string form. The regexp and byte-regexp procedures convert a string or byte string (respectively) into a regexp value using a syntax of regular expressions that is most compatible with egrep. The pregexp and byte-pregexp procedures produce a regexp value using a slightly different syntax of regular expressions that is more compatible with Perl.

Two regexp values are equal? if they have the same source, use the same pattern language, and are both character regexps or both byte regexps.

A literal or printed regexp value starts with #rx or #px. See Reading Regular Expressions for information on reading regular expressions and Printing Regular Expressions for information on printing regular expressions. Regexp values produced by the default reader are interned in read-syntax mode.

On the BC variant of Racket, the internal size of a regexp value is limited to 32 kilobytes; this limit roughly corresponds to a source string with 32,000 literal characters or 5,000 operators.

4.8.1 Regexp Syntax

The following syntax specifications describe the content of a string that represents a regular expression. The syntax of the corresponding string may involve extra escape characters. For example, the regular expression (.*)\1 can be represented with the string "(.*)\\1" or the regexp constant #rx"(.*)\\1"; the \ in the regular expression must be escaped to include it
in a string or regexp constant.

The regexp and pregexp syntaxes share a common core:

  ‹regexp›  ::= ‹pces›                  Match ‹pces›
             |  ‹regexp›|‹regexp›       Match either ‹regexp›, try left first          ex1
  ‹pces›    ::=                         Match empty
             |  ‹pce›‹pces›             Match ‹pce› followed by ‹pces›
  ‹pce›     ::= ‹repeat›                Match ‹repeat›, longest possible               ex3
             |  ‹repeat›?               Match ‹repeat›, shortest possible              ex6
             |  ‹atom›                  Match ‹atom› exactly once
  ‹repeat›  ::= ‹atom›*                 Match ‹atom› 0 or more times                   ex3
             |  ‹atom›+                 Match ‹atom› 1 or more times                   ex4
             |  ‹atom›?                 Match ‹atom› 0 or 1 times                      ex5
  ‹atom›    ::= (‹regexp›)              Match sub-expression ‹regexp› and report       ex11
             |  [‹rng›]                 Match any character in ‹rng›                   ex2
             |  [^‹rng›]                Match any character not in ‹rng›               ex12
             |  .                       Match any (except newline in multi mode)       ex13
             |  ^                       Match start (or after newline in multi mode)   ex14
             |  $                       Match end (or before newline in multi mode)    ex15
             |  ‹literal›               Match a single literal character               ex1
             |  (?‹mode›:‹regexp›)      Match ‹regexp› using ‹mode›                    ex35
             |  (?>‹regexp›)            Match ‹regexp›, only first possible
             |  ‹look›                  Match empty if ‹look› matches
             |  (?‹tst›‹pces›|‹pces›)   Match 1st ‹pces› if ‹tst›, else 2nd ‹pces›     ex36
             |  (?‹tst›‹pces›)          Match ‹pces› if ‹tst›, empty if not ‹tst›
             |  \ at end of pattern     Match the nul character (ASCII 0)
  ‹rng›     ::= ]                       ‹rng› contains ] only                          ex27
             |  -                       ‹rng› contains - only                          ex28
             |  ‹mrng›                  ‹rng› contains everything in ‹mrng›
             |  ‹mrng›-                 ‹rng› contains - and everything in ‹mrng›
  ‹mrng›    ::= ]‹lrng›                 ‹mrng› contains ] and everything in ‹lrng›     ex29
             |  -‹lrng›                 ‹mrng› contains - and everything in ‹lrng›     ex29
             |  ‹lirng›                 ‹mrng› contains everything in ‹lirng›
  ‹lirng›   ::= ‹riliteral›             ‹lirng› contains a literal character
             |  ‹riliteral›-‹rliteral›  ‹lirng› contains Unicode range inclusive       ex22
             |  ‹lirng›‹lrng›           ‹lirng› contains everything in both
  ‹lrng›    ::= ^                       ‹lrng› contains ^                              ex30
             |  ‹rliteral›-‹rliteral›   ‹lrng› contains Unicode range inclusive
             |  ^‹lrng›                 ‹lrng› contains ^ and more
             |  ‹lirng›                 ‹lrng› contains everything in ‹lirng›
  ‹look›    ::= (?=‹regexp›)            Match if ‹regexp› matches                      ex31
             |  (?!‹regexp›)            Match if ‹regexp› doesn't match                ex32
             |  (?<=‹regexp›)           Match if ‹regexp› matches preceding            ex33
             |  (?<!‹regexp›)           Match if ‹regexp› doesn't match preceding

Examples:

  > (regexp-match #rx"a|b" "cat") ; ex1
  '("a")
  > (regexp-match #rx"[at]" "cat") ; ex2
  '("a")
  > (regexp-match #rx"ca*[at]" "caaat") ; ex3
  '("caaat")
  > (regexp-match #rx"ca+[at]" "caaat") ; ex4
  '("caaat")
  > (regexp-match #rx"ca?t?" "ct") ; ex5
  '("ct")
  > (regexp-match #rx"ca*?[at]" "caaat") ; ex6
  '("ca")
  > (regexp-match #px"ca{2}" "caaat") ; ex7, uses #px
  '("caa")
  > (regexp-match #px"ca{2,}t" "catcaat") ; ex8, uses #px
  '("caat")
  > (regexp-match #px"ca{,2}t" "caaatcat") ; ex9, uses #px
  '("cat")
  > (regexp-match #px"ca{1,2}t" "caaatcat") ; ex10, uses #px
  '("cat")
  > (regexp-match #rx"(c*)(a*)" "caat") ; ex11
  '("caa" "c" "aa")
  > (regexp-match #rx"[^ca]" "caat") ; ex12
  '("t")
  > (regexp-match #rx".(.)." "cat") ; ex13
  '("cat" "a")
  > (regexp-match #rx"^a|^c" "cat") ; ex14
  '("c")
  > (regexp-match #rx"a$|t$" "cat") ; ex15
  '("t")
  > (regexp-match #px"c(.)\\1t" "caat") ; ex16, uses #px
  '("caat" "a")
  > (regexp-match #px".\\b." "cat in hat") ; ex17, uses #px
  '("t ")
  > (regexp-match #px".\\B." "cat in hat") ; ex18, uses #px
  '("ca")
  > (regexp-match #px"\\p{Ll}" "Cat") ; ex19, uses #px
  '("a")
  > (regexp-match #px"\\P{Ll}" "cat!") ; ex20, uses #px
  '("!")
  > (regexp-match #rx"\\|" "c|t") ; ex21
  '("|")
  > (regexp-match #rx"[a-f]*" "cat") ; ex22
  '("ca")
  > (regexp-match #px"[a-f\\d]*" "1cat") ; ex23, uses #px
  '("1ca")
  > (regexp-match #px" [\\w]" "cat hat") ; ex24, uses #px
  '(" h")
  > (regexp-match #px"t[\\s]" "cat\nhat") ; ex25, uses #px
  '("t\n")
  > (regexp-match #px"[[:lower:]]+" "Cat") ; ex26, uses #px
  '("at")
  > (regexp-match #rx"[]]" "c]t") ; ex27
  '("]")
  > (regexp-match #rx"[-]" "c-t") ; ex28
  '("-")
  > (regexp-match #rx"[]a[]+" "c[a]t") ; ex29
  '("[a]")
  > (regexp-match #rx"[a^]+" "ca^t") ; ex30
  '("a^")
  > (regexp-match #rx".a(?=p)" "catnap") ; ex31
  '("na")
  > (regexp-match #rx".a(?!t)" "catnap") ; ex32
  '("na")
  > (regexp-match #rx"(?<=n)a." "catnap") ; ex33
  '("ap")
  > (regexp-match #rx"(?i:a)[tp]" "cATnAp") ; ex35
  '("Ap")
  > (regexp-match #rx"(?(?<=c)a|b)+" "cabal") ; ex36
  '("ab")

4.8.2 Additional Syntactic Constraints

In addition to matching a grammar, regular expressions must meet two
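syntactic restrictions:

  In a ‹repeat› other than ‹atom›?, the ‹atom› must not match an empty sequence.

  In a (?<=‹regexp›) or (?<!‹regexp›), the ‹regexp› must match a bounded sequence only.

These constraints are checked syntactically by a type system in which a type [n, m] corresponds to an expression that matches between n and m characters. In the rule for (‹regexp›), N is the number such that the opening parenthesis is the Nth opening parenthesis for collecting match reports. Each rule lists its premises, then =>, then its conclusion:

  ‹atom› : [n,m], n>0                                    =>  ‹atom›* : [0,∞]
  ‹atom› : [n,m], n>0                                    =>  ‹atom›+ : [1,∞]
  ‹atom› : [n,m]                                         =>  ‹atom›? : [0,m]
  ‹atom› : [n,m], n>0                                    =>  ‹atom›{‹n›} : [n*‹n›,m*‹n›]
  ‹atom› : [n,m], n>0                                    =>  ‹atom›{‹n›,} : [n*‹n›,∞]
  ‹atom› : [n,m], n>0                                    =>  ‹atom›{,‹m›} : [0,m*‹m›]
  ‹atom› : [n,m], n>0                                    =>  ‹atom›{‹n›,‹m›} : [n*‹n›,m*‹m›]
  ‹regexp› : [n,m]                                       =>  (‹regexp›) : [n,m], αN=n
  ‹regexp› : [n,m]                                       =>  (?‹mode›:‹regexp›) : [n,m]
  ‹regexp› : [n,m]                                       =>  (?=‹regexp›) : [0,0]
  ‹regexp› : [n,m]                                       =>  (?!‹regexp›) : [0,0]
  ‹regexp› : [n,m], m<∞                                  =>  (?<=‹regexp›) : [0,0]
  ‹regexp› : [n,m], m<∞                                  =>  (?<!‹regexp›) : [0,0]
  ‹regexp› : [n,m]                                       =>  (?>‹regexp›) : [n,m]
  ‹tst› : [n0,m0], ‹pces›1 : [n1,m1], ‹pces›2 : [n2,m2]  =>  (?‹tst›‹pces›1|‹pces›2) : [min(n1,n2),max(m1,m2)]
  ‹tst› : [n0,m0], ‹pces› : [n1,m1]                      =>  (?‹tst›‹pces›) : [0,m1]

  (‹n›) : [αN,∞]
  [‹rng›] : [1,1]
  [^‹rng›] : [1,1]
  . : [1,1]
  ^ : [0,0]
  $ : [0,0]
  ‹literal› : [1,1]
  \‹n› : [αN,∞]
  ‹class› : [1,1]
  \b : [0,0]
  \B : [0,0]
  \p{‹property›} : [1,6]
  \P{‹property›} : [1,6]

4.8.3 Regexp Constructors

procedure

(regexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by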
regexp or pregexp, #f otherwise.

procedure

(pregexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by pregexp (not regexp), #f otherwise.

procedure

(byte-regexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by byte-regexp or byte-pregexp, #f otherwise.

procedure

(byte-pregexp? v) → boolean?
  v : any/c

Returns #t if v is a regexp value created by byte-pregexp (not byte-regexp), #f otherwise.

procedure

(regexp str) → regexp?
  str : string?
(regexp str handler) → any
  str : string?
  handler : (or/c #f (string? -> any))

Takes a string representation of a regular expression (using the
syntax in Regexp Syntax) and compiles it into a regexp value. Other regular expression procedures accept either a string or a regexp value as the matching pattern. If a regular expression string is used multiple times, it is faster to compile the string once to a regexp value and use it for repeated matches instead of using the string each time.

If handler is provided and not #f, it is called and its result is returned when str is not a valid representation of a regular expression; the argument to handler is a string that describes the problem with str. If handler is #f or not provided, then the exn:fail:contract exception is raised.

The object-name procedure returns the source string for a regexp value.

Examples:

  > (regexp "ap*le")
  #rx"ap*le"
  > (object-name #rx"ap*le")
  "ap*le"
  > (regexp "+" (λ (s) (list s)))
  '("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure

(pregexp str) → pregexp?
  str : string?
(pregexp str handler) → any
  str : string?
  handler : (or/c #f (string? -> any))

Like regexp, except that it uses a slightly different syntax
(see Regexp Syntax). The result can be used with regexp-match, etc., just like the result from regexp.

Examples:

  > (pregexp "ap*le")
  #px"ap*le"
  > (regexp? #px"ap*le")
  #t
  > (pregexp "+" (λ (s) (vector s)))
  '#("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure

(byte-regexp bstr) → byte-regexp?
  bstr : bytes?
(byte-regexp bstr handler) → any
  bstr : bytes?
  handler : (or/c #f (bytes? -> any))

Takes a byte-string representation of a regular expression (using the syntax in Regexp Syntax) and compiles it into a byte-regexp value.

If handler is provided, it is called and its result is returned if bstr is not a valid representation of a regular expression.

The object-name procedure returns the source byte string for a regexp value.

Examples:

  > (byte-regexp #"ap*le")
  #rx#"ap*le"
  > (object-name #rx#"ap*le")
  #"ap*le"
  > (byte-regexp "ap*le")
  byte-regexp: contract violation
    expected: bytes?
    given: "ap*le"
  > (byte-regexp #"+" (λ (s) (list s)))
  '("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure

(byte-pregexp bstr) → byte-pregexp?
  bstr : bytes?
(byte-pregexp bstr handler) → any
  bstr : bytes?
  handler : (or/c #f (bytes? -> any))

Like byte-regexp, except that it uses a slightly different
syntax (see Regexp Syntax). The result can be used with regexp-match, etc., just like the result from byte-regexp.

Examples:

  > (byte-pregexp #"ap*le")
  #px#"ap*le"
  > (byte-pregexp #"+" (λ (s) (vector s)))
  '#("`+` follows nothing in pattern")

Changed in version 6.5.0.1 of package base: Added the handler argument.

procedure

(regexp-quote str [case-sensitive?]) → string?
  str : string?
  case-sensitive? : any/c = #t
(regexp-quote bstr [case-sensitive?]) → bytes?
  bstr : bytes?
  case-sensitive? : any/c = #t

Produces a string or byte string suitable for use with regexp to match the literal sequence of characters in str or sequence of bytes in bstr. If case-sensitive? is true (the default), the resulting regexp matches letters in str or bytes case-sensitively, otherwise it matches case-insensitively.

Examples:

  > (regexp-match "." "apple.scm")
  '("a")
  > (regexp-match (regexp-quote ".") "apple.scm")
  '(".")

procedure

(regexp-max-lookbehind pattern) → exact-nonnegative-integer?
  pattern : (or/c regexp? byte-regexp?)

Returns the maximum number of bytes that pattern may consult
before the starting position of a match to determine the match. For example, the pattern (?<=abc)d consults three bytes preceding a matching d, while e(?<=a..)d consults two bytes before a matching ed. A ^ pattern may consult a preceding byte to determine whether the current position is the start of the input or of a line.

4.8.4 Regexp Matching

procedure

(regexp-match pattern input [start-pos end-pos output-port input-prefix])
 → (if (and (or (string? pattern) (regexp? pattern))
            (or (string? input) (path? input)))
       (or/c #f (cons/c string? (listof (or/c string? #f))))
       (or/c #f (cons/c bytes? (listof (or/c bytes? #f)))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""

Attempts to match pattern (a string, byte string,
regexp value, or byte-regexp value) once to a portion of input. The matcher finds a portion of input that matches and is closest to the start of the input (after start-pos).

If input is a path, it is converted to a byte string with path->bytes if pattern is a byte string or a byte-based regexp. Otherwise, input is converted to a string with path->string.

The optional start-pos and end-pos arguments select a portion of input for matching; the default is the entire string or the stream up to an end-of-file. When input is a string, start-pos is a character position; when input is a byte string, then start-pos is a byte position; and when input is an input port, start-pos is the number of bytes to skip before starting to match. The end-pos argument can be #f, which corresponds to the end of the string or an end-of-file in the stream; otherwise, it is a character or byte position, like start-pos. If input is an input port, and if an end-of-file is reached before start-pos bytes are skipped, then the match fails.

In pattern, a start-of-string ^ refers to the first position of input after start-pos, assuming that input-prefix is #"". The end-of-input $ refers to the end-pos-th position or (in the case of an input port) an end-of-file, whichever comes first.

The input-prefix specifies bytes that effectively precede input for the purposes of ^ and other lookbehind matching. For example, a #"" prefix means that ^ matches at the beginning of the stream, while a #"\n" input-prefix means that a start-of-line ^ can match the beginning of the input, while a start-of-file ^ cannot.

If the match fails, #f is returned. If the match succeeds, a list containing strings or byte strings, and possibly #f, is returned. The list contains strings only if input is a string and pattern is not a byte regexp. Otherwise, the list contains byte strings (substrings of the UTF-8 encoding of input, if input is a string).

The first [byte] string in a result list is the portion of input that matched pattern. If two portions of input can match pattern, then the match that starts earliest is found.

Additional [byte] strings are returned in the list if pattern
contains parenthesized sub-expressions (but not when the opening parenthesis is followed by ?). Matches for the sub-expressions are provided in the order of the opening parentheses in pattern. When sub-expressions occur in branches of an | "or" pattern, in a * "zero or more" pattern, or other places where the overall pattern can succeed without a match for the sub-expression, then a #f is returned for the sub-expression if it did not contribute to the final match. When a single sub-expression occurs within a * "zero or more" pattern or other multiple-match positions, then the rightmost match associated with the sub-expression is returned in the list.

If the optional output-port is provided as an output port, the part of input from its beginning (not start-pos) that precedes the match is written to the port. All of input up to end-pos is written to the port if no match is found. This functionality is most useful when input is an input port.

When matching an input port, a match failure reads up to end-pos bytes (or end-of-file), even if pattern begins with a start-of-string ^; see also regexp-try-match. On success, all bytes up to and including the match are eventually read from the port, but matching proceeds by first peeking bytes from the port (using peek-bytes-avail!), and then (re-)reading matching bytes to discard them after the match result is determined. Non-matching bytes may be read and discarded before the match is determined. The matcher peeks in blocking mode only as far as necessary to determine a match, but it may peek extra bytes to fill an internal buffer if immediately available (i.e., without blocking). Greedy repeat operators in pattern, such as * or +, tend to force reading the entire content of the port (up to end-pos) to determine a match.

If the input port is read simultaneously by another thread, or if the port is a custom port with inconsistent reading and peeking procedures (see Custom Ports), then the bytes that are peeked and used for matching may be different than the bytes read and discarded after the match completes; the matcher inspects only the peeked bytes. To avoid such interleaving, use regexp-match-peek (with a progress-evt argument) followed by port-commit-peeked.

Examples:

  > (regexp-match #rx"x." "12x4x6")
  '("x4")
  > (regexp-match #rx"y." "12x4x6")
  #f
  > (regexp-match #rx"x." "12x4x6" 3)
  '("x6")
  > (regexp-match #rx"x." "12x4x6" 3 4)
  #f
  > (regexp-match #rx#"x." "12x4x6")
  '(#"x4")
  > (regexp-match #rx"x." "12x4x6" 0 #f (current-output-port))
  12
  '("x4")
  > (regexp-match #rx"(-[0-9]*)+" "a-12--345b")
  '("-12--345" "-345")

procedure

(regexp-match* pattern input [start-pos end-pos input-prefix
               #:match-select match-select
               #:gap-select? gap-select])
 → (if (and (or (string? pattern) (regexp? pattern))
            (or (string? input) (path? input)))
       (listof (or/c string? (listof (or/c #f string?))))
       (listof (or/c bytes? (listof (or/c #f bytes?)))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  input-prefix : bytes? = #""
  match-select : (or/c (list? . -> . (or/c any/c list?)) #f) = car
  gap-select : any/c = #f

Like regexp-match, but the result is a list of strings or
byte strings corresponding to a sequence of matches of pattern in input.

The pattern is used in order to find matches, where each match attempt starts at the end of the last match, and ^ is allowed to match the beginning of the input (if input-prefix is #"") only for the first match. Empty matches are handled like other matches, returning a zero-length string or byte sequence (they are more useful in making this a complement of regexp-split), but pattern is restricted from matching an empty sequence immediately after an empty match.

If input contains no matches (in the range start-pos to end-pos), null is returned. Otherwise, each item in the resulting list is a distinct substring or byte sequence from input that matches pattern. The end-pos argument can be #f to match to the end of input (which corresponds to an end-of-file if input is an input port).

Examples:

  > (regexp-match* #rx"x." "12x4x6")
  '("x4" "x6")
  > (regexp-match* #rx"x*" "12x4x6")
  '("" "" "x" "" "x" "" "")

match-select specifies the collected results. The default of
car means that the result is the list of matches without returning parenthesized sub-patterns. It can be given as a "selector" function which chooses an item from a list, or it can choose a list of items. For example, you can use cdr to get a list of lists of parenthesized sub-pattern matches, or values (as an identity function) to get the full matches as well. (Note that the selector must choose an element of its input list or a list of elements, but it must not inspect its input, as it can be either a list of strings or a list of position pairs. Furthermore, the selector must be consistent in its choice(s).)

Examples:

  > (regexp-match* #rx"x(.)" "12x4x6" #:match-select cadr)
  '("4" "6")
  > (regexp-match* #rx"x(.)" "12x4x6" #:match-select values)
  '(("x4" "4") ("x6" "6"))

In addition, specifying gap-select as a non-#f value will make the result an interleaved list of the matches as well as the separators between the matches, starting and ending with a separator. In this case, match-select can be given as #f to return only the separators, making such uses equivalent to regexp-split.

Examples:

  > (regexp-match* #rx"x(.)" "12x4x6" #:match-select cadr #:gap-select? #t)
  '("12" "4" "" "6" "")
  > (regexp-match* #rx"x(.)" "12x4x6" #:match-select #f #:gap-select? #t)
  '("12" "" "")

procedure

(regexp-try-match pattern input [start-pos end-pos output-port input-prefix])
 → (or/c #f (cons/c bytes? (listof (or/c bytes? #f))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""

Like regexp-match on input ports, except that if the match
fails, no characters are read and discarded from input.

This procedure is especially useful with a pattern that begins with a start-of-string ^ or with a non-#f end-pos, since each limits the amount of peeking into the port. Otherwise, beware that a large portion of the stream may be peeked (and therefore pulled into memory) before the match succeeds or fails.

procedure

(regexp-match-positions pattern input [start-pos end-pos output-port input-prefix])
 → (or/c (cons/c (cons/c exact-nonnegative-integer? exact-nonnegative-integer?)
                 (listof (or/c (cons/c exact-integer? exact-integer?) #f)))
         #f)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""

Like regexp-match, but returns a list of number pairs (and
#f) instead of a list of strings. Each pair of numbers refers to a range of characters or bytes in input. If the result for the same arguments with regexp-match would be a list of byte strings, the resulting ranges correspond to byte ranges; in that case, if input is a character string, the byte ranges correspond to bytes in the UTF-8 encoding of the string.

Range results are returned in a substring- and subbytes-compatible manner, independent of start-pos. In the case of an input port, the returned positions indicate the number of bytes that were read, including start-pos, before the first matching byte.

Examples:

  > (regexp-match-positions #rx"x." "12x4x6")
  '((2 . 4))
  > (regexp-match-positions #rx"x." "12x4x6" 3)
  '((4 . 6))
  > (regexp-match-positions #rx"(-[0-9]*)+" "a-12--345b")
  '((1 . 9) (5 . 9))

Range results after the first one can include negative numbers if input-prefix is non-empty and if pattern includes a lookbehind pattern. Such ranges start in the input-prefix instead of input. More generally, when start-pos is positive, then range results that are less than start-pos start in input-prefix.

Examples:

  > (regexp-match-positions #rx"(?<=(.))." "a" 0 #f #f #"x")
  '((0 . 1) (-1 . 0))
  > (regexp-match-positions #rx"(?<=(..))." "a" 0 #f #f #"x")
  #f
  > (regexp-match-positions #rx"(?<=(..))." "_a" 1 #f #f #"x")
  #f

Although input-prefix is always a byte string, when the returned positions are string indices and they refer to a portion of input-prefix, then they correspond to a UTF-8 decoding of a tail of input-prefix.

Examples:

  > (bytes-length (string->bytes/utf-8 "λ"))
  2
  > (regexp-match-positions #rx"(?<=(.))." "a" 0 #f #f (string->bytes/utf-8 "λ"))
  '((0 . 1) (-1 . 0))

procedure

(regexp-match-positions* pattern input [start-pos end-pos input-prefix
                         #:match-select match-select])
 → (or/c (listof (cons/c exact-nonnegative-integer? exact-nonnegative-integer?))
         (listof (listof (or/c #f (cons/c exact-nonnegative-integer?
                                          exact-nonnegative-integer?)))))
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  input-prefix : bytes? = #""
  match-select : (list? . -> . (or/c any/c list?)) = car

Like regexp-match-positions, but returns multiple matches
like regexp-match*.

Examples:

  > (regexp-match-positions* #rx"x." "12x4x6")
  '((2 . 4) (4 . 6))
  > (regexp-match-positions* #rx"x(.)" "12x4x6" #:match-select cadr)
  '((3 . 4) (5 . 6))

Note that unlike regexp-match*, there is no #:gap-select? input keyword, as this information can be easily inferred from the resulting matches.

procedure

(regexp-match? pattern input [start-pos end-pos output-port input-prefix]) → boolean?
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path? input-port?)
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  output-port : (or/c output-port? #f) = #f
  input-prefix : bytes? = #""

Like regexp-match, but returns merely #t when the match succeeds, #f otherwise.

Examples:

  > (regexp-match? #rx"x." "12x4x6")
  #t
  > (regexp-match? #rx"y." "12x4x6")
  #f

procedure

(regexp-match-exact? pattern input) → boolean?
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : (or/c string? bytes? path?)

Like regexp-match?, but #t is only returned when the first found match is to the entire content of input.

Examples:

  > (regexp-match-exact? #rx"x." "12x4x6")
  #f
  > (regexp-match-exact? #rx"1.*x." "12x4x6")
  #t

Beware that regexp-match-exact? can return #f if
pattern generates a partial match for input first, even if pattern could also generate a complete match. To check if there is any match of pattern that covers all of input, use regexp-match? with ^(?:pattern)$ instead.

Examples:

  > (regexp-match-exact? #rx"a|ab" "ab")
  #f
  > (regexp-match? #rx"^(?:a|ab)$" "ab")
  #t

The (?:) grouping is necessary because alternation has lower precedence than concatenation; the regular expression without it, ^a|ab$, matches any input that either starts with a or ends with ab.

Example:

  > (regexp-match? #rx"^a|ab$" "123ab")
  #t

procedure

(regexp-match-peek pattern input [start-pos end-pos progress input-prefix])
 → (or/c (cons/c bytes? (listof (or/c bytes? #f))) #f)
  pattern : (or/c string? bytes? regexp? byte-regexp?)
  input : input-port?
  start-pos : exact-nonnegative-integer? = 0
  end-pos : (or/c exact-nonnegative-integer? #f) = #f
  progress : (or/c evt #f) = #f
  input-prefix : bytes? = #""

Like regexp-match on input ports, but only peeks bytes from
input instead of reading them. Furthermore, instead of an output port, the last optional argument is a progress event for input (see port-progress-evt). If progress becomes ready, then the match stops peeking from input and returns #f. The progress argument can be #f, in which case the peek may continue with inconsistent information if another process meanwhile reads from input.

Examples:

  > (define p (open-input-string "a abcd"))
  > (regexp-match-peek ".*bc" p)
  '(#"a abc")
  > (regexp-match-peek ".*bc" p 2)
  '(#"abc")
  > (regexp-match ".*bc" p 2)
  '(#"abc")
  > (peek-char p)
  #\d
  > (regexp-match ".*bc" p)
  #f
  > (peek-char p)
  #<eof>
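
The difference between peeking, reading, and try-matching on an input port can be seen side by side in a small sketch. The port contents and the variable names (in, peeked, consumed, failed, remaining) are illustrative assumptions, not part of the reference; only regexp-match-peek, regexp-match, regexp-try-match, open-input-string, and read-string are taken from Racket itself.

```racket
#lang racket/base

;; Illustrative port; the contents "abc123" are an assumption for this sketch.
(define in (open-input-string "abc123"))

;; Peeking finds the match but leaves every byte in the port.
(define peeked (regexp-match-peek #rx"^[a-z]+" in))

;; Reading finds the same match and then discards the matched bytes.
(define consumed (regexp-match #rx"^[a-z]+" in))

;; The port now starts at "123", so the pattern cannot match.
;; regexp-try-match reports the failure without reading anything.
(define failed (regexp-try-match #rx"^[a-z]+" in))

;; The digits are still available after the failed try-match.
(define remaining (read-string 3 in))

(printf "~s ~s ~s ~s\n" peeked consumed failed remaining)
```

Had the third step used regexp-match instead of regexp-try-match, the failed match would have read up to the end of the port, so the final read-string would likely see an end-of-file rather than the digits.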