=head1 NAME

perlreapi - Perl regular expression plugin interface

=head1 DESCRIPTION

As of Perl 5.9.5 there is a new interface for plugging in and using
regular expression engines other than the default one.

Each engine is supposed to provide access to a constant structure of the
following format:
    typedef struct regexp_engine {
        REGEXP* (*comp) (pTHX_
                         const SV * const pattern, const U32 flags);
        I32     (*exec) (pTHX_
                         REGEXP * const rx,
                         char* stringarg,
                         char* strend, char* strbeg,
                         I32 minend, SV* screamer,
                         void* data, U32 flags);
        char*   (*intuit) (pTHX_
                           REGEXP * const rx, SV *sv,
                           char *strpos, char *strend, U32 flags,
                           struct re_scream_pos_data_s *data);
        SV*     (*checkstr) (pTHX_ REGEXP * const rx);
        void    (*free) (pTHX_ REGEXP * const rx);
        void    (*numbered_buff_FETCH) (pTHX_
                                        REGEXP * const rx,
                                        const I32 paren,
                                        SV * const sv);
        void    (*numbered_buff_STORE) (pTHX_
                                        REGEXP * const rx,
                                        const I32 paren,
                                        SV const * const value);
        I32     (*numbered_buff_LENGTH) (pTHX_
                                         REGEXP * const rx,
                                         const SV * const sv,
                                         const I32 paren);
        SV*     (*named_buff) (pTHX_
                               REGEXP * const rx,
                               SV * const key,
                               SV * const value,
                               U32 flags);
        SV*     (*named_buff_iter) (pTHX_
                                    REGEXP * const rx,
                                    const SV * const lastkey,
                                    const U32 flags);
        SV*     (*qr_package)(pTHX_ REGEXP * const rx);
    #ifdef USE_ITHREADS
        void*   (*dupe) (pTHX_ REGEXP * const rx, CLONE_PARAMS *param);
    #endif
        REGEXP* (*op_comp) (...);
    } regexp_engine;
56 When a regexp is compiled, its C<engine> field is then set to point at
57 the appropriate structure, so that when it needs to be used Perl can find
58 the right routines to do so.
In order to install a new regexp handler, C<$^H{regcomp}> is set
to an integer which (when cast appropriately) resolves to one of these
structures. When compiling, the C<comp> method is executed, and the
resulting C<regexp> structure's engine field is expected to point back at
the same structure.
66 The pTHX_ symbol in the definition is a macro used by Perl under threading
67 to provide an extra argument to the routine holding a pointer back to
68 the interpreter that is executing the regexp. So under threading all
69 routines get an extra argument.
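
The effect of the macro can be pictured with a small self-contained model.
This is only a sketch: C<demo_interp>, C<DEMO_THREADED>, C<DEMO_pTHX_> and
C<DEMO_aTHX_> are hypothetical stand-ins invented for illustration, not Perl's
real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: these are NOT Perl's real macros or types. */
typedef struct demo_interp { int calls; } demo_interp;

#define DEMO_THREADED 1          /* remove to model an unthreaded build */

#ifdef DEMO_THREADED
#  define DEMO_pTHX_ demo_interp *my_perl,   /* used in declarations */
#  define DEMO_aTHX_ my_perl,                /* used at call sites   */
#else
#  define DEMO_pTHX_
#  define DEMO_aTHX_
#endif

/* A callback written once; the macro decides whether it takes the
 * extra interpreter argument. */
static int demo_patlen(DEMO_pTHX_ const char *pattern)
{
    int len = 0;
    assert(pattern != NULL);
#ifdef DEMO_THREADED
    my_perl->calls++;            /* per-interpreter state is reachable */
#endif
    while (pattern[len] != '\0')
        len++;
    return len;
}

/* A caller that already holds the interpreter pointer forwards it. */
static int demo_call(DEMO_pTHX_ const char *pattern)
{
    return demo_patlen(DEMO_aTHX_ pattern);
}
```

With C<DEMO_THREADED> defined the call is C<demo_call(&interp, "eek")>;
without it the same source compiles to a one-argument function.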
=head2 comp

    REGEXP* comp(pTHX_ const SV * const pattern, const U32 flags);
77 Compile the pattern stored in C<pattern> using the given C<flags> and
78 return a pointer to a prepared C<REGEXP> structure that can perform
79 the match. See L</The REGEXP structure> below for an explanation of
80 the individual fields in the REGEXP struct.
82 The C<pattern> parameter is the scalar that was used as the
83 pattern. Previous versions of Perl would pass two C<char*> indicating
84 the start and end of the stringified pattern; the following snippet can
85 be used to get the old parameters:
    STRLEN plen;
    char*  exp = SvPV(pattern, plen);
    char* xend = exp + plen;
91 Since any scalar can be passed as a pattern, it's possible to implement
92 an engine that does something with an array (C<< "ook" =~ [ qw/ eek
93 hlagh / ] >>) or with the non-stringified form of a compiled regular
94 expression (C<< "ook" =~ qr/eek/ >>). Perl's own engine will always
95 stringify everything using the snippet above, but that doesn't mean
96 other engines have to.
The C<flags> parameter is a bitfield which indicates which of the
C<msixp> flags the regex was compiled with. It also contains
additional info, such as whether C<use locale> is in effect.

The C<eogc> flags are stripped out before being passed to the comp
routine. The regex engine does not need to know if any of these
are set, as those flags should only affect what Perl does with the
pattern and its match variables, not how it gets compiled and
executed.
108 By the time the comp callback is called, some of these flags have
109 already had effect (noted below where applicable). However most of
110 their effect occurs after the comp callback has run, in routines that
111 read the C<< rx->extflags >> field which it populates.
113 In general the flags should be preserved in C<< rx->extflags >> after
114 compilation, although the regex engine might want to add or delete
115 some of them to invoke or disable some special behavior in Perl. The
116 flags along with any special behavior they cause are documented below:
118 The pattern modifiers:
122 =item C</m> - RXf_PMf_MULTILINE
124 If this is in C<< rx->extflags >> it will be passed to
125 C<Perl_fbm_instr> by C<pp_split> which will treat the subject string
126 as a multi-line string.
128 =item C</s> - RXf_PMf_SINGLELINE
130 =item C</i> - RXf_PMf_FOLD
132 =item C</x> - RXf_PMf_EXTENDED
134 If present on a regex, C<"#"> comments will be handled differently by the
135 tokenizer in some cases.
137 TODO: Document those cases.
139 =item C</p> - RXf_PMf_KEEPCOPY
The character set semantics are determined by an enum that is contained
in this field. This is still experimental and subject to change, but
the current interface returns the rules by use of the in-line function
C<get_regex_charset(const U32 flags)>. The only currently documented
value returned from it is REGEX_LOCALE_CHARSET, which is set if
C<use locale> is in effect. If present in C<< rx->extflags >>,
C<split> will use the locale-dependent definition of whitespace
when RXf_SKIPWHITE or RXf_WHITE is in effect. ASCII whitespace
is defined as per L<isSPACE|perlapi/isSPACE>, and by the internal
macros C<is_utf8_space> under UTF-8, and C<isSPACE_LC> under C<use
locale>.
165 If C<split> is invoked as C<split ' '> or with no arguments (which
166 really means C<split(' ', $_)>, see L<split|perlfunc/split>), Perl will
167 set this flag. The regex engine can then check for it and set the
168 SKIPWHITE and WHITE extflags. To do this, the Perl engine does:
    if (flags & RXf_SPLIT && r->prelen == 1 && r->precomp[0] == ' ')
        r->extflags |= (RXf_SKIPWHITE|RXf_WHITE);
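
The same check can be tried outside of Perl with a small model. This is a
sketch under stated assumptions: the C<DEMO_*> flag values and the
C<demo_regexp> struct are hypothetical placeholders, since the real
definitions live in F<regexp.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; the real ones are defined in regexp.h. */
enum {
    DEMO_RXf_SPLIT     = 1 << 0,
    DEMO_RXf_SKIPWHITE = 1 << 1,
    DEMO_RXf_WHITE     = 1 << 2
};

/* Hypothetical cut-down stand-in for the regexp structure. */
typedef struct demo_regexp {
    unsigned    extflags;
    const char *precomp;
    size_t      prelen;
} demo_regexp;

/* Run at the end of a comp routine: preserve the incoming flags, then
 * opt in to the split optimisation for a single-space pattern. */
static void demo_finish_comp(demo_regexp *r, unsigned flags)
{
    assert(r != NULL && r->precomp != NULL);
    r->extflags = flags;
    if ((flags & DEMO_RXf_SPLIT) && r->prelen == 1 && r->precomp[0] == ' ')
        r->extflags |= (DEMO_RXf_SKIPWHITE | DEMO_RXf_WHITE);
}
```

An engine with a different pattern syntax could apply the same idea with
its own test in place of the single-space comparison.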
175 These flags can be set during compilation to enable optimizations in
176 the C<split> operator.
If the flag is present in C<< rx->extflags >>, C<split> will delete
whitespace from the start of the subject string before it's operated
on. What is considered whitespace depends on whether the subject is a
UTF-8 string and whether the C<RXf_PMf_LOCALE> flag is set.

If RXf_WHITE is set in addition to this flag, C<split> will behave like
C<split " "> under the Perl engine.
192 Tells the split operator to split the target string on newlines
193 (C<\n>) without invoking the regex engine.
Perl's engine sets this if the pattern is C</^/> (C<plen == 1 && *exp
== '^'>), even under C</^/s>; see L<split|perlfunc/split>. Of course a
different regex engine might want to use the same optimizations
with a different syntax.
Tells the split operator to split the target string on whitespace
without invoking the regex engine. The definition of whitespace varies
depending on whether the target string is a UTF-8 string and on
whether RXf_PMf_LOCALE is set.

Perl's engine sets this flag if the pattern is C<\s+>.
211 Tells the split operator to split the target string on
212 characters. The definition of character varies depending on if
213 the target string is a UTF-8 string.
Perl's engine sets this flag on empty patterns. This optimization
makes C<split //> much faster than it would otherwise be. It's even
faster than C<unpack>.
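
How an engine might decide which of these C<split> hints to set can be
sketched as follows. The C<DEMO_*> names and values are hypothetical
placeholders, and the three pattern tests simply mirror what the text above
says Perl's own engine looks for; a real engine would set the corresponding
C<RXf_*> bits in C<< rx->extflags >>.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values standing in for the split optimisation bits. */
enum {
    DEMO_RXf_START_ONLY = 1 << 0,   /* split on newlines            */
    DEMO_RXf_WHITE      = 1 << 1,   /* split on whitespace          */
    DEMO_RXf_NULL       = 1 << 2    /* split into single characters */
};

/* Return the hint that lets split bypass the engine, or 0 for none. */
static unsigned demo_split_hints(const char *pat, size_t len)
{
    assert(pat != NULL);
    if (len == 0)
        return DEMO_RXf_NULL;                     /* split //    */
    if (len == 1 && pat[0] == '^')
        return DEMO_RXf_START_ONLY;               /* split /^/   */
    if (len == 3 && memcmp(pat, "\\s+", 3) == 0)
        return DEMO_RXf_WHITE;                    /* split /\s+/ */
    return 0;                                     /* run the engine */
}
```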
=head2 exec

    I32 exec(pTHX_ REGEXP * const rx,
             char *stringarg, char* strend, char* strbeg,
             I32 minend, SV* screamer,
             void* data, U32 flags);
Execute a regexp. The arguments are:

=over 4

=item rx

The regular expression to execute.

=item screamer

This strangely-named arg is the SV to be matched against. Note that the
actual char array to be matched against is supplied by the arguments
described below; the SV is just used to determine UTF8ness, C<pos()> etc.

=item strbeg

Pointer to the physical start of the string.

=item strend

Pointer to the character following the physical end of the string (i.e.
the C<\0>, if any).

=item stringarg

Pointer to the position in the string where matching should start; it might
not be equal to C<strbeg> (for example in a later iteration of C</.../g>).

=item minend

Minimum length of string (measured in bytes from C<stringarg>) that must
match; if the engine reaches the end of the match but hasn't reached this
position in the string, it should fail.

=item data

Optimisation data; subject to change.

=item flags

Optimisation flags; subject to change.

=back
=head2 intuit

    char* intuit(pTHX_ REGEXP * const rx,
                  SV *sv, char *strpos, char *strend,
                  const U32 flags, struct re_scream_pos_data_s *data);

Find the start position where a regex match should be attempted,
or determine that the regex engine should not be run because the
pattern can't match. This is called, as appropriate, by the core,
depending on the values of the C<extflags> member of the C<regexp>
structure.
=head2 checkstr

    SV* checkstr(pTHX_ REGEXP * const rx);

Return an SV containing a string that must appear in the pattern. Used
by C<split> for optimising matches.
=head2 free

    void free(pTHX_ REGEXP * const rx);
295 Called by Perl when it is freeing a regexp pattern so that the engine
296 can release any resources pointed to by the C<pprivate> member of the
297 C<regexp> structure. This is only responsible for freeing private data;
298 Perl will handle releasing anything else contained in the C<regexp> structure.
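
The division of labour described above can be sketched with a stand-alone
model. C<demo_regexp> and C<demo_private> are hypothetical types invented
for this example; a real callback would receive a C<REGEXP *> and free
whatever its own C<comp> routine stored in C<pprivate>.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical private data an engine might hang off pprivate. */
typedef struct demo_private { char *program; } demo_private;

/* Hypothetical cut-down stand-in for the regexp structure. */
typedef struct demo_regexp { void *pprivate; } demo_regexp;

/* Release only what the engine itself allocated. The regexp structure
 * is deliberately left alone, since its owner disposes of it. */
static void demo_free(demo_regexp *rx)
{
    demo_private *priv;
    assert(rx != NULL);
    priv = (demo_private *)rx->pprivate;
    if (priv == NULL)
        return;
    free(priv->program);
    free(priv);
    rx->pprivate = NULL;   /* defensive choice of this sketch only */
}
```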
300 =head2 Numbered capture callbacks
Called to get/set the value of C<$`>, C<$'>, C<$&> and their named
equivalents, C<${^PREMATCH}>, C<${^POSTMATCH}> and C<${^MATCH}>, as well as the
numbered capture groups (C<$1>, C<$2>, ...).
The C<paren> parameter will be C<1> for C<$1>, C<2> for C<$2> and so
forth, and have these symbolic values for the special variables:

    ${^PREMATCH}    RX_BUFF_IDX_CARET_PREMATCH
    ${^POSTMATCH}   RX_BUFF_IDX_CARET_POSTMATCH
    ${^MATCH}       RX_BUFF_IDX_CARET_FULLMATCH
    $`              RX_BUFF_IDX_PREMATCH
    $'              RX_BUFF_IDX_POSTMATCH
    $&              RX_BUFF_IDX_FULLMATCH
316 Note that in Perl 5.17.3 and earlier, the last three constants were also
317 used for the caret variants of the variables.
The names have been chosen by analogy with L<Tie::Scalar> method
names, with an additional B<LENGTH> callback for efficiency. However,
named capture variables are currently not tied internally but
implemented via magic.
325 =head3 numbered_buff_FETCH
    void numbered_buff_FETCH(pTHX_ REGEXP * const rx, const I32 paren,
                             SV * const sv);

Fetch a specified numbered capture. C<sv> should be set to the scalar
to return. The scalar is passed as an argument rather than being
returned from the function because Perl already has a scalar to store
the value in when the callback is invoked, so creating another one would
be redundant. The scalar can be set with C<sv_setsv>, C<sv_setpvn> and
friends; see L<perlapi>.
337 This callback is where Perl untaints its own capture variables under
338 taint mode (see L<perlsec>). See the C<Perl_reg_numbered_buff_fetch>
339 function in F<regcomp.c> for how to untaint capture variables if
340 that's something you'd like your engine to do as well.
342 =head3 numbered_buff_STORE
    void    (*numbered_buff_STORE) (pTHX_
                                    REGEXP * const rx,
                                    const I32 paren,
                                    SV const * const value);
349 Set the value of a numbered capture variable. C<value> is the scalar
350 that is to be used as the new value. It's up to the engine to make
351 sure this is used as the new value (or reject it).
    if ("ook" =~ /(o*)/) {
        # 'paren' will be '1' and 'value' will be 'ee'
        $1 =~ tr/o/e/;
    }
Perl's own engine will croak on any attempt to modify the capture
variables. To do this in another engine, use the following callback
(copied from C<Perl_reg_numbered_buff_store>):

    void
    Example_reg_numbered_buff_store(pTHX_ REGEXP * const rx,
                                    const I32 paren,
                                    SV const * const value)
    {
        PERL_UNUSED_ARG(rx);
        PERL_UNUSED_ARG(paren);
        PERL_UNUSED_ARG(value);
        if (!PL_localizing) Perl_croak(aTHX_ PL_no_modify);
    }
378 Actually Perl will not I<always> croak in a statement that looks
379 like it would modify a numbered capture variable. This is because the
380 STORE callback will not be called if Perl can determine that it
381 doesn't have to modify the value. This is exactly how tied variables
382 behave in the same situation:
    package CaptureVar;
    use base 'Tie::Scalar';

    sub TIESCALAR { bless [] }
    sub FETCH { undef }
    sub STORE { die "This doesn't get called" }

    package main;

    tie my $sv => "CaptureVar";
    $sv =~ y/a/b/;
396 Because C<$sv> is C<undef> when the C<y///> operator is applied to it,
397 the transliteration won't actually execute and the program won't
398 C<die>. This is different to how 5.8 and earlier versions behaved
399 since the capture variables were READONLY variables then; now they'll
400 just die when assigned to in the default engine.
402 =head3 numbered_buff_LENGTH
    I32 numbered_buff_LENGTH (pTHX_
                              REGEXP * const rx,
                              const SV * const sv,
                              const I32 paren);
Get the C<length> of a capture variable. There's a special callback
for this so that Perl doesn't have to do a FETCH and run C<length> on
the result. Since the length is (in Perl's case) known from an offset
stored in C<< rx->offs >>, this is much more efficient:

    I32 s1  = rx->offs[paren].start;
    I32 s2  = rx->offs[paren].end;
    I32 len = s2 - s1;
418 This is a little bit more complex in the case of UTF-8, see what
419 C<Perl_reg_numbered_buff_length> does with
420 L<is_utf8_string_loclen|perlapi/is_utf8_string_loclen>.
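
To make the byte/character distinction concrete, here is a stand-alone
sketch. It is not what C<Perl_reg_numbered_buff_length> does internally:
C<demo_paren_pair> is a hypothetical stand-in for C<regexp_paren_pair>, the
character count uses a simple "skip continuation bytes" loop that assumes
well-formed UTF-8, and returning C<-1> for a group that did not match is a
choice made for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for regexp_paren_pair. */
typedef struct demo_paren_pair { int start; int end; } demo_paren_pair;

/* Length in bytes of capture `paren`, or -1 if it did not match. */
static int demo_capture_bytes(const demo_paren_pair *offs, int paren)
{
    assert(offs != NULL && paren >= 0);
    if (offs[paren].start == -1 || offs[paren].end == -1)
        return -1;
    return offs[paren].end - offs[paren].start;
}

/* Length in characters, assuming buf holds well-formed UTF-8: every
 * byte that is not a continuation byte (10xxxxxx) starts a character. */
static int demo_capture_chars(const char *buf,
                              const demo_paren_pair *offs, int paren)
{
    int i, chars = 0;
    assert(buf != NULL);
    if (demo_capture_bytes(offs, paren) < 0)
        return -1;
    for (i = offs[paren].start; i < offs[paren].end; i++)
        if (((unsigned char)buf[i] & 0xC0) != 0x80)
            chars++;
    return chars;
}
```

For plain byte strings the two functions agree, which is why the simple
subtraction is enough in the non-UTF-8 case.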
422 =head2 Named capture callbacks
424 Called to get/set the value of C<%+> and C<%->, as well as by some
425 utility functions in L<re>.
427 There are two callbacks, C<named_buff> is called in all the cases the
428 FETCH, STORE, DELETE, CLEAR, EXISTS and SCALAR L<Tie::Hash> callbacks
429 would be on changes to C<%+> and C<%-> and C<named_buff_iter> in the
430 same cases as FIRSTKEY and NEXTKEY.
The C<flags> parameter can be used to determine which of these
operations the callbacks should respond to. The following flags are
currently defined:

Which L<Tie::Hash> operation is being performed from the Perl level on
C<%+> or C<%->, if any:

    RXapif_FETCH
    RXapif_STORE
    RXapif_DELETE
    RXapif_CLEAR
    RXapif_EXISTS
    RXapif_SCALAR
    RXapif_FIRSTKEY
    RXapif_NEXTKEY

If C<%+> or C<%-> is being operated on, if any.

    RXapif_ONE /* %+ */
    RXapif_ALL /* %- */

If this is being called as C<re::regname>, C<re::regnames> or
C<re::regnames_count>, if any. The first two will be combined with
C<RXapif_ONE> or C<RXapif_ALL>.

    RXapif_REGNAME
    RXapif_REGNAMES
    RXapif_REGNAMES_COUNT
461 Internally C<%+> and C<%-> are implemented with a real tied interface
462 via L<Tie::Hash::NamedCapture>. The methods in that package will call
463 back into these functions. However the usage of
464 L<Tie::Hash::NamedCapture> for this purpose might change in future
465 releases. For instance this might be implemented by magic instead
466 (would need an extension to mgvtbl).
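
A callback that dispatches on these flags might be organised along the
following lines. This is a sketch only: the C<DEMO_*> constants are
hypothetical placeholders with made-up values, just a few operations are
shown, and the function returns a description of what it would do instead
of touching any hash. Treating C<ONE> as C<%+> and C<ALL> as C<%-> follows
the flag descriptions above, and refusing stores is an assumption modelled
on the read-only capture variables.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag values; the real RXapif_* bits are in regexp.h. */
enum {
    DEMO_RXapif_FETCH  = 1 << 0,
    DEMO_RXapif_STORE  = 1 << 1,
    DEMO_RXapif_EXISTS = 1 << 2,
    DEMO_RXapif_ONE    = 1 << 3,   /* operating on %+ */
    DEMO_RXapif_ALL    = 1 << 4    /* operating on %- */
};

/* Describe the action a named_buff-style callback would take. */
static const char *demo_named_buff(unsigned flags)
{
    /* One of ONE/ALL is expected to accompany the operation bit. */
    assert((flags & (DEMO_RXapif_ONE | DEMO_RXapif_ALL)) != 0);
    if (flags & DEMO_RXapif_FETCH)
        return (flags & DEMO_RXapif_ALL) ? "fetch from %-"
                                         : "fetch from %+";
    if (flags & DEMO_RXapif_EXISTS)
        return (flags & DEMO_RXapif_ALL) ? "exists in %-"
                                         : "exists in %+";
    if (flags & DEMO_RXapif_STORE)
        return "refuse: read-only";
    return "ignored";
}
```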
=head3 named_buff

    SV*     (*named_buff) (pTHX_ REGEXP * const rx, SV * const key,
                           SV * const value, U32 flags);

=head3 named_buff_iter

    SV*     (*named_buff_iter) (pTHX_
                                REGEXP * const rx,
                                const SV * const lastkey,
                                const U32 flags);
=head2 qr_package

    SV* qr_package(pTHX_ REGEXP * const rx);

The package the qr// magic object is blessed into (as seen by C<ref
qr//>). It is recommended that engines change this to their package
name for identification regardless of whether they implement methods
on the object.
489 The package this method returns should also have the internal
490 C<Regexp> package in its C<@ISA>. C<< qr//->isa("Regexp") >> should always
491 be true regardless of what engine is being used.
An example implementation might be:

    SV* Example_qr_package(pTHX_ REGEXP * const rx)
    {
        return newSVpvs("re::engine::Example");
    }
502 Any method calls on an object created with C<qr//> will be dispatched to the
503 package as a normal object.
    use re::engine::Example;
    my $re = qr//;
    $re->meth; # dispatched to re::engine::Example::meth()
To retrieve the C<REGEXP> object from the scalar in an XS function use
the C<SvRX> macro, see L<"REGEXP Functions" in perlapi|perlapi/REGEXP
Functions>.

    REGEXP * re = SvRX(sv);
=head2 dupe

    void* dupe(pTHX_ REGEXP * const rx, CLONE_PARAMS *param);
521 On threaded builds a regexp may need to be duplicated so that the pattern
522 can be used by multiple threads. This routine is expected to handle the
523 duplication of any private data pointed to by the C<pprivate> member of
524 the C<regexp> structure. It will be called with the preconstructed new
525 C<regexp> structure as an argument, the C<pprivate> member will point at
526 the B<old> private structure, and it is this routine's responsibility to
527 construct a copy and return a pointer to it (which Perl will then use to
528 overwrite the field as passed to this routine.)
530 This allows the engine to dupe its private data but also if necessary
531 modify the final structure if it really must.
533 On unthreaded builds this field doesn't exist.
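
The contract described above (read the old private data through the new
structure, return a fresh copy) can be modelled without Perl. Everything in
this sketch is hypothetical: C<demo_regexp> and C<demo_private> are
stand-ins, plain C<malloc> replaces Perl's allocators, and the
C<CLONE_PARAMS> argument is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical private data: a compiled program of `len` bytes. */
typedef struct demo_private { size_t len; char *program; } demo_private;

/* Hypothetical cut-down stand-in for the regexp structure. */
typedef struct demo_regexp { void *pprivate; } demo_regexp;

/* rx is the NEW structure, whose pprivate still points at the OLD
 * private data. Return a deep copy (NULL on allocation failure); the
 * caller then overwrites rx->pprivate with the returned pointer. */
static void *demo_dupe(const demo_regexp *rx)
{
    const demo_private *old;
    demo_private *copy;
    assert(rx != NULL && rx->pprivate != NULL);
    old = (const demo_private *)rx->pprivate;
    copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    copy->len = old->len;
    copy->program = malloc(old->len);
    if (copy->program == NULL) {
        free(copy);
        return NULL;
    }
    memcpy(copy->program, old->program, old->len);
    return copy;
}
```

A deep copy matters here because two threads must not share, and later
both free, the same private allocation.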
=head2 op_comp

This is private to the Perl core and subject to change. Should be left
null.
540 =head1 The REGEXP structure
542 The REGEXP struct is defined in F<regexp.h>. All regex engines must be able to
543 correctly build such a structure in their L</comp> routine.
The REGEXP structure contains all the data that Perl needs to be aware of
to properly work with the regular expression. It includes data about
optimisations that Perl can use to determine if the regex engine should
really be used, and various other control info that is needed to properly
execute patterns in various contexts, such as whether the pattern is
anchored in some way, what flags were used during the compile, or whether
the program contains special constructs that Perl needs to be aware of.

In addition it contains two fields that are intended for the private
use of the regex engine that compiled the pattern. These are the
C<intflags> and C<pprivate> members. C<pprivate> is a void pointer to
an arbitrary structure, whose use and management is the responsibility
of the compiling engine. Perl will never modify either of these
values.
    typedef struct regexp {
        /* what engine created this regexp? */
        const struct regexp_engine* engine;

        /* what re is this a lightweight copy of? */
        struct regexp* mother_re;

        /* Information about the match that the Perl core uses to manage
         * things */
        U32 extflags;   /* Flags used both externally and internally */
        I32 minlen;     /* minimum possible number of chars in
                           string to match */
        I32 minlenret;  /* minimum possible number of chars in $& */
        U32 gofs;       /* chars left of pos that we search from */

        /* substring data about strings that must appear
           in the final match, used for optimisations */
        struct reg_substr_data *substrs;

        U32 nparens;    /* number of capture groups */

        /* private engine specific data */
        U32 intflags;   /* Engine Specific Internal flags */
        void *pprivate; /* Data private to the regex engine which
                           created this object. */

        /* Data about the last/current match. These are modified during
         * matching */
        U32 lastparen;            /* highest close paren matched ($+) */
        U32 lastcloseparen;       /* last close paren matched ($^N) */
        regexp_paren_pair *swap;  /* Swap copy of *offs */
        regexp_paren_pair *offs;  /* Array of offsets for (@-) and
                                     (@+) */

        char *subbeg;   /* saved or original string so \digit works
                           forever. */
        SV_SAVED_COPY   /* If non-NULL, SV which is COW from original */
        I32 sublen;     /* Length of string pointed by subbeg */
        I32 suboffset;  /* byte offset of subbeg from logical start of
                           str */
        I32 subcoffset; /* suboffset equiv, but in chars (for @-/@+) */

        /* Information about the match that isn't often used */
        I32 prelen;           /* length of precomp */
        const char *precomp;  /* pre-compilation regular expression */

        char *wrapped;  /* wrapped version of the pattern */
        I32 wraplen;    /* length of wrapped */

        I32 seen_evals;   /* number of eval groups in the pattern - for
                             security checks */
        HV *paren_names;  /* Optional hash of paren names */

        /* Refcount of this regexp */
        I32 refcnt;
    } regexp;
617 The fields are discussed in more detail below:
=head2 C<engine>

This field points at a C<regexp_engine> structure which contains pointers
to the subroutines that are to be used for performing a match. It
is the compiling routine's responsibility to populate this field before
returning the regexp object.

Internally this is set to C<NULL> unless a custom engine is specified in
C<$^H{regcomp}>. Perl's own set of callbacks can be accessed in the struct
pointed to by C<RE_ENGINE_PTR>.
=head2 C<mother_re>

TODO, see L<http://www.mail-archive.com/perl5-changes@perl.org/msg17328.html>
=head2 C<extflags>

This will be used by Perl to see what flags the regexp was compiled
with. It will normally be set to the value of the flags parameter by
the L<comp|/comp> callback. See the L<comp|/comp> documentation for
valid flags.
641 =head2 C<minlen> C<minlenret>
The minimum string length (in characters) required for the pattern to match.
This is used to prune the search space by not bothering to match any closer
to the end of a string than would allow a match. For instance there is no
point in even starting the regex engine if the minlen is 10 but the string
is only 5 characters long. There is no way that the pattern can match.

C<minlenret> is the minimum length (in characters) of the string that would
be found in C<$&> after a match.

The difference between C<minlen> and C<minlenret> can be seen in the
following pattern:

    /ns(?=\d)/

where the C<minlen> would be 3 but C<minlenret> would only be 2 as the C<\d> is
required to match but is not actually included in the matched content. This
distinction is particularly important as the substitution logic uses the
C<minlenret> to tell if it can do in-place substitutions (these can
result in considerable speed-up).
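
The pruning that C<minlen> allows amounts to simple arithmetic, sketched
below. C<demo_start_positions> is a hypothetical helper written for this
example, not a function in the Perl core; it counts how many start offsets
(in characters) leave enough room for a match.

```c
#include <assert.h>

/* Number of start positions worth trying in a string of `str_chars`
 * characters when any match needs at least `minlen` characters.
 * A result of 0 means the engine need not be started at all. */
static int demo_start_positions(int str_chars, int minlen)
{
    assert(str_chars >= 0 && minlen >= 0);
    if (str_chars < minlen)
        return 0;
    return str_chars - minlen + 1;
}
```

With the figures from the text, a 5-character string and a C<minlen> of 10
give zero candidate positions, so the match can be rejected immediately.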
=head2 C<gofs>

Left offset from pos() to start match at.
=head2 C<substrs>

Substring data about strings that must appear in the final match. This
is currently only used internally by Perl's engine, but might be
used in the future for all engines for optimisations.
674 =head2 C<nparens>, C<lastparen>, and C<lastcloseparen>
676 These fields are used to keep track of how many paren groups could be matched
677 in the pattern, which was the last open paren to be entered, and which was
678 the last close paren to be entered.
=head2 C<intflags>

The engine's private copy of the flags the pattern was compiled with. Usually
this is the same as C<extflags> unless the engine chose to modify one of them.
=head2 C<pprivate>

A void* pointing to an engine-defined data structure. The Perl engine uses the
C<regexp_internal> structure (see L<perlreguts/Base Structures>) but a custom
engine should use something else.
=head2 C<swap>

Unused. Left in for compatibility with Perl 5.10.0.
=head2 C<offs>

A C<regexp_paren_pair> structure which defines offsets into the string being
matched which correspond to the C<$&> and C<$1>, C<$2> etc. captures. The
C<regexp_paren_pair> struct is defined as follows:

    typedef struct regexp_paren_pair {
        I32 start;
        I32 end;
    } regexp_paren_pair;

If C<< ->offs[num].start >> or C<< ->offs[num].end >> is C<-1> then that
capture group did not match. C<< ->offs[0].start/end >> represents C<$&> (or
C<${^MATCH}> under C<//p>) and C<< ->offs[paren].end >> matches C<$$paren> where
C<< $paren >= 1 >>.
711 =head2 C<precomp> C<prelen>
713 Used for optimisations. C<precomp> holds a copy of the pattern that
714 was compiled and C<prelen> its length. When a new pattern is to be
715 compiled (such as inside a loop) the internal C<regcomp> operator
716 checks if the last compiled C<REGEXP>'s C<precomp> and C<prelen>
717 are equivalent to the new one, and if so uses the old pattern instead
718 of compiling a new one.
720 The relevant snippet from C<Perl_pp_regcomp>:
    if (!re || !re->precomp || re->prelen != (I32)len ||
        memNE(re->precomp, t, len))
    /* Compile a new pattern */
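
The reuse test can be exercised in isolation with a small model. This is a
sketch: C<demo_regexp> is a hypothetical stand-in holding only the two
fields involved, and the standard C<memcmp> takes the place of Perl's
C<memNE> macro.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in with just the fields the check reads. */
typedef struct demo_regexp {
    const char *precomp;   /* text of the last compiled pattern */
    int         prelen;    /* its length in bytes               */
} demo_regexp;

/* Non-zero when the pattern text t (len bytes) differs from the cached
 * one, i.e. when a new pattern has to be compiled. The length test
 * comes first so memcmp only runs on equally long buffers. */
static int demo_needs_compile(const demo_regexp *re,
                              const char *t, size_t len)
{
    assert(t != NULL);
    return !re || !re->precomp || re->prelen != (int)len
        || memcmp(re->precomp, t, len) != 0;
}
```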
726 =head2 C<paren_names>
This is a hash used internally to track named capture groups and their
offsets. The keys are the names of the buffers; the values are dualvars,
with the IV slot holding the number of buffers with the given name and the
PV being an embedded array of I32. The values may also be contained
independently in the data array in cases where named backreferences are
used.
=head2 C<substrs>

Holds information on the longest string that must occur at a fixed
offset from the start of the pattern, and the longest string that must
occur at a floating offset from the start of the pattern. Used to do
Fast-Boyer-Moore searches on the string to find out if it's worth using
the regex engine at all, and if so where in the string to search.
743 =head2 C<subbeg> C<sublen> C<saved_copy> C<suboffset> C<subcoffset>
745 Used during the execution phase for managing search and replace patterns,
746 and for providing the text for C<$&>, C<$1> etc. C<subbeg> points to a
747 buffer (either the original string, or a copy in the case of
748 C<RX_MATCH_COPIED(rx)>), and C<sublen> is the length of the buffer. The
749 C<RX_OFFS> start and end indices index into this buffer.
751 In the presence of the C<REXEC_COPY_STR> flag, but with the addition of
752 the C<REXEC_COPY_SKIP_PRE> or C<REXEC_COPY_SKIP_POST> flags, an engine
753 can choose not to copy the full buffer (although it must still do so in
754 the presence of C<RXf_PMf_KEEPCOPY> or the relevant bits being set in
755 C<PL_sawampersand>). In this case, it may set C<suboffset> to indicate the
756 number of bytes from the logical start of the buffer to the physical start
757 (i.e. C<subbeg>). It should also set C<subcoffset>, the number of
758 characters in the offset. The latter is needed to support C<@-> and C<@+>
759 which work in characters, not bytes.
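
One way to read the relationship between these fields is sketched below.
It rests on an assumption drawn from the description above: capture offsets
are relative to the logical start of the string, so C<suboffset> has to be
subtracted before indexing into C<subbeg>. The types and the
C<demo_capture> helper are hypothetical and not part of the Perl API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for regexp_paren_pair and the regexp fields. */
typedef struct demo_paren_pair { int start; int end; } demo_paren_pair;
typedef struct demo_regexp {
    const char            *subbeg;     /* physical start of saved text */
    int                    sublen;     /* bytes available at subbeg    */
    int                    suboffset;  /* logical-to-physical offset   */
    const demo_paren_pair *offs;       /* logical capture offsets      */
} demo_regexp;

/* Return a pointer to the text of capture `paren` and store its byte
 * length in *len. Returns NULL if the group did not match or if its
 * text was not kept in the (partial) saved buffer. */
static const char *demo_capture(const demo_regexp *rx, int paren, int *len)
{
    int s, e;
    assert(rx != NULL && rx->offs != NULL && len != NULL);
    s = rx->offs[paren].start;
    e = rx->offs[paren].end;
    if (s == -1 || e == -1)
        return NULL;
    s -= rx->suboffset;               /* logical -> physical index */
    e -= rx->suboffset;
    if (s < 0 || e < s || e > rx->sublen)
        return NULL;
    *len = e - s;
    return rx->subbeg + s;
}
```

In the test case used to check this sketch, the logical string is
C<"hello world"> but only C<"world"> was saved, so C<suboffset> is 6.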
761 =head2 C<wrapped> C<wraplen>
763 Stores the string C<qr//> stringifies to. The Perl engine for example
764 stores C<(?^:eek)> in the case of C<qr/eek/>.
When using a custom engine that doesn't support the C<(?:)> construct
for inline modifiers, it's probably best to have C<qr//> stringify to
the supplied pattern. Note that this will create undesired patterns in
cases such as:

    my $x = qr/a|b/;  # "a|b"
    my $y = qr/c/i;   # "c"
    my $z = qr/$x$y/; # "a|bc"
775 There's no solution for this problem other than making the custom
776 engine understand a construct like C<(?:)>.
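
For comparison, building a wrapped form in the style of Perl's own engine
is a matter of string formatting, as in the sketch below. C<demo_wrap> is a
hypothetical helper, not a core function, and the C<(?^mods:pattern)> shape
is taken from the C<(?^:eek)> example above; passing the modifier letters
as a separate string is an assumption of this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "(?^<mods>:<pattern>)" into buf. Returns the length written,
 * or -1 if the result does not fit into `size` bytes. */
static int demo_wrap(char *buf, size_t size,
                     const char *mods, const char *pattern)
{
    int n;
    assert(buf != NULL && mods != NULL && pattern != NULL);
    n = snprintf(buf, size, "(?^%s:%s)", mods, pattern);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Because each wrapped piece carries its own group and modifiers, placing two
of them side by side keeps the alternation in the first from swallowing the
second.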
=head2 C<seen_evals>

This stores the number of eval groups in the pattern. This is used for security
purposes when embedding compiled regexes into larger patterns with C<qr//>.
=head2 C<refcnt>

The number of times the structure is referenced. When this falls to 0, the
regexp is automatically freed by a call to C<pregfree>. This should be set to 1 in
each engine's L</comp> routine.
=head1 HISTORY

Originally part of L<perlreguts>.

=head1 AUTHORS

Originally written by Yves Orton, expanded by E<AElig>var ArnfjE<ouml>rE<eth>
Bjarmason.

=head1 LICENSE

Copyright 2006 Yves Orton and 2007 E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason.

This program is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.