Add ANYOFM regnode
This is a specialized ANYOF node for use when the code points in it
have characteristics that allow them to be matched with a mask instead
of a bit map. When this happens, the speedup is pretty spectacular:
Key:
    Ir   Instruction read
    Dr   Data read
    Dw   Data write
    COND conditional branches
    IND  indirect branches
The numbers represent raw counts per loop iteration.
Results of ('b' x 10000) . 'a' =~ /[Aa]/
            blead    mask Ratio %
         -------- ------- -------
    Ir   153132.0 25636.0   597.3
    Dr    40909.0  2155.0  1898.3
    Dw    20593.0   593.0  3472.7
    COND  20529.0  3028.0   678.0
    IND      22.0    22.0   100.0
See the comments in regcomp.c or
http://nntp.perl.org/group/perl.perl5.porters/249001 for a description
of the cases that this new technique can handle. But several common
ones include the C0 controls (on ASCII platforms), [01], [0-7], [Aa] and
any other ASCII case pair.
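
As an illustration of the idea only (not the code in regcomp.c), the
following minimal sketch shows how a set of distinct bytes can be tested
for "maskability" and then matched with a single AND and compare. The
names byte_mask, derive_mask and mask_match are hypothetical, and the
byte values assume an ASCII platform.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; none of these come from regcomp.c. */
typedef struct {
    uint8_t mask;   /* bits that take part in the comparison */
    uint8_t base;   /* value those bits must have */
} byte_mask;

/* Try to express a set of n DISTINCT bytes as (c & mask) == base.
 * Returns true and fills *out when the set has exactly that shape. */
static bool derive_mask(const uint8_t *set, size_t n, byte_mask *out)
{
    uint8_t all_and = 0xFF, all_or = 0x00;
    unsigned bits = 0;

    if (n == 0)
        return false;

    for (size_t i = 0; i < n; i++) {
        all_and &= set[i];
        all_or  |= set[i];
    }

    /* Bits that are not the same in every member of the set. */
    uint8_t varying = all_and ^ all_or;

    for (uint8_t v = varying; v; v &= (uint8_t)(v - 1))
        bits++;

    /* Ignoring `bits` bit positions accepts exactly 2**bits bytes, so the
     * set must contain all of them for the mask form to be exact. */
    if (n != ((size_t)1 << bits))
        return false;

    out->mask = (uint8_t)~varying;
    out->base = all_and;
    return true;
}

/* The per-character test: one AND and one compare, no bit map lookup. */
static bool mask_match(byte_mask m, uint8_t c)
{
    return (c & m.mask) == m.base;
}
```

Under these assumptions, [Aa] (0x41, 0x61) yields mask 0xDF and base
0x41, and [0-7] yields mask 0xF8 and base 0x30, while [0-9] is rejected
because its 10 members do not fill the 16 values a 4-bit wildcard would
accept.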
The set of ASCII characters could also be handled by this node instead
of by the special ASCII regnode, reducing code size and complexity.
I haven't investigated the speed loss of doing so.
A NANYOFM node could be created to match the complement of what this
one matches.
A pattern like /A/i is not affected by this commit, but the regex
optimizer could be changed to take advantage of this commit. What would
need to be done is for it to look at the first byte of an EXACTFish node
and if it's one of the case pairs this handles, to generate a synthetic
start class for it. This would automatically invoke the sped-up code.
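
A rough sketch of that check, assuming an ASCII platform and a
hypothetical helper name (case_pair_start_class is not an existing
function in the regex compiler): given the first byte of a
case-insensitive literal, it reports the mask and base that would match
both cases of an ASCII letter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.  If `first` is an ASCII
 * letter, fill in a mask/base pair matching both of its cases and return
 * true; otherwise return false and leave the outputs untouched. */
static bool case_pair_start_class(uint8_t first, uint8_t *mask, uint8_t *base)
{
    /* In ASCII, upper and lower case letters differ only in bit 0x20. */
    uint8_t upper = first & 0xDF;

    if (upper < 0x41 || upper > 0x5A)   /* not in 'A'..'Z' */
        return false;

    *mask = 0xDF;
    *base = upper;
    return true;
}
```

For /A/i this would give mask 0xDF and base 0x41, the same pair that
[Aa] produces.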