This is a live mirror of the Perl 5 development currently hosted at
22 months agoFix typos in HACKERS; add clarification
Karl Williamson [Sun, 6 Oct 2019 03:29:07 +0000 (21:29 -0600)]
Fix typos in HACKERS; add clarification

(cherry picked from commit 3f7ae74acac1e4fa34913985e9fc9119d517ca6c)
Signed-off-by: Nicolas R <>
22 months agoBackport UTF8_CHK_SKIP
Karl Williamson [Wed, 9 Oct 2019 15:12:49 +0000 (09:12 -0600)]
Backport UTF8_CHK_SKIP

And revise an existing item to use it.

(cherry picked from commit 45578f404ccd72c0dbfb539cb5c438e8a46106fa)
Signed-off-by: Nicolas R <>
22 months agoBackport UTF8_SKIP
Karl Williamson [Wed, 9 Oct 2019 15:10:42 +0000 (09:10 -0600)]
Backport UTF8_SKIP

which is just a synonym for UTF8SKIP

(cherry picked from commit 393aff27a03ba3dfe30531be2950e3702fc07bd2)
Signed-off-by: Nicolas R <>
22 months agoparts/inc/misc: Backport UNI to/from NATIVE
Karl Williamson [Wed, 9 Oct 2019 15:05:06 +0000 (09:05 -0600)]
parts/inc/misc: Backport UNI to/from NATIVE

And change the way we set up similar defines.  On perls before EBCDIC
existed, these simply return their arguments.  But a module now can
unconditionally include these, and it will do the right thing.

(cherry picked from commit e836478411ddc77f5155d0fd4e4f2a25f28dc47e)
Signed-off-by: Nicolas R <>
22 months agoUpdated apidoc.fnc to latest blead
Karl Williamson [Sun, 6 Oct 2019 04:02:05 +0000 (22:02 -0600)]
Updated apidoc.fnc to latest blead

(cherry picked from commit 89cfe197a6de2c56f9d7ef5e75ce074b6f7c8579)
Signed-off-by: Nicolas R <>
22 months agoGet latest blead embed.fnc
Karl Williamson [Sun, 6 Oct 2019 04:01:46 +0000 (22:01 -0600)]
Get latest blead embed.fnc

(cherry picked from commit 1d506a06fb1782b574eae0c8177b0af23823f1ed)
Signed-off-by: Nicolas R <>
22 months agodevel/regenerate: Add --yes option
Karl Williamson [Wed, 9 Oct 2019 14:52:19 +0000 (08:52 -0600)]
devel/regenerate: Add --yes option

This answers yes to the standard questions automatically.  It's handy
when you want to run the job nohup, and really know what you're doing.

(cherry picked from commit e8dfa8b9ad3c70c21ed7c7c33d9049c94ea8cbbc)
Signed-off-by: Nicolas R <>
22 months agocat_file util in Makefile
Nicolas R [Tue, 8 Oct 2019 15:26:36 +0000 (09:26 -0600)]
cat_file util in Makefile

References #137

(cherry picked from commit fb6dad1b13facfbcdf834a6b95db2320410d7c78)
Signed-off-by: Nicolas R <>
22 months agoPATCH: gh#17227 heap-buffer-overflow
Karl Williamson [Fri, 8 Nov 2019 17:29:05 +0000 (10:29 -0700)]
PATCH: gh#17227 heap-buffer-overflow

There were two problems this uncovered.  One was that a floating point
expression with both operands ints truncated before becoming floating.
One operand needs to be floating.

The second is that the expansion of a non-UTF-8 byte needs to be
considered based on non-UTF-8, rather than its UTF-8 representation.
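
The first of the two problems can be shown in isolation. The sketch below is not the patched core code; the function names and the size_t operands are assumptions made only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: estimates a growth ratio from two integer counts.
 * Buggy form: the division is done in integer arithmetic and truncates
 * before the result is converted to double. */
static double ratio_truncated(size_t expanded, size_t original)
{
    return expanded / original;          /* 3 / 2 yields 1, then 1.0 */
}

/* Fixed form: cast one operand so the division is done in floating point. */
static double ratio_exact(size_t expanded, size_t original)
{
    return (double) expanded / original; /* 3 / 2 yields 1.5 */
}
```

An under-estimate of this kind is enough to size a buffer too small, which is one way a truncated ratio can end in an overflow.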

22 months agoFix tr/// compilation on VMS
Karl Williamson [Fri, 8 Nov 2019 17:14:33 +0000 (10:14 -0700)]
Fix tr/// compilation on VMS

64-bits on that platform require a long long, and 1UL isn't.  I should
have copied more carefully the similar code in utf8.h

(reported to me privately by Craig Berry)
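
A minimal sketch of the portability point, assuming only standard C (the helper name bit64 is hypothetical and is not the macro used in the core):

```c
#include <stdint.h>

/* Builds a mask with bit `n` set, valid for 0 <= n <= 63.
 * Writing `1UL << n` is only safe where unsigned long is 64 bits wide;
 * where it is 32 bits, shifting by 32 or more is undefined behaviour.
 * Widening the constant to a 64-bit type first avoids that. */
static uint64_t bit64(unsigned n)
{
    return (uint64_t) 1 << n;
}
```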

22 months agoLink to more useful section of perlop from readpipe
Dan Book [Thu, 7 Nov 2019 23:44:47 +0000 (18:44 -0500)]
Link to more useful section of perlop from readpipe

qx is only briefly mentioned in the "I/O Operators" section of perlop. It is better to link to the section where it is discussed in detail.

22 months agoperlop - Make "STRING" section heading consistent
Dan Book [Thu, 7 Nov 2019 23:53:14 +0000 (18:53 -0500)]
perlop - Make "STRING" section heading consistent

All of the similar section headings are enclosed in C<>.

22 months agoperlguts: Revise pod of UTF8f
Karl Williamson [Thu, 7 Nov 2019 20:32:18 +0000 (13:32 -0700)]
perlguts: Revise pod of UTF8f

This is really about strings.  SVs are more conveniently printed using

22 months agoSilence some compiler warnings
Karl Williamson [Thu, 7 Nov 2019 17:41:28 +0000 (10:41 -0700)]
Silence some compiler warnings

These were introduced in the tr/// changes in the series
merged in 240494d6992696a7a350217c131e1d5dc1444a0c

22 months agoMerge branch 'Remove swashes from core' into blead
Karl Williamson [Thu, 7 Nov 2019 04:23:18 +0000 (21:23 -0700)]
Merge branch 'Remove swashes from core' into blead

This branch reimplements the final use of swashes in core, tr///, and
then proceeds to remove the swash implementation from core.

Swashes are still used in Unicode::UCD, though this can also be changed.
But there are higher priority tasks to do at the moment.

I started work on this more than two releases ago, and it finally is

22 months agoRemove lib/unicore/
Karl Williamson [Wed, 6 Nov 2019 17:32:31 +0000 (10:32 -0700)]
Remove lib/unicore/

This file was for the use of  But now that that is
incorporated into Unicode::UCD, move the definitions from to
lib/unicore/ which is used by Unicode::UCD.  This allows removing
package names.

22 months agoRemove 'none' from swash
Karl Williamson [Wed, 6 Nov 2019 17:02:45 +0000 (10:02 -0700)]
Remove 'none' from swash

This was only used by tr///, and hence is no longer relevant.  I never
really understood it.

22 months agoRemove
Karl Williamson [Wed, 6 Nov 2019 16:40:11 +0000 (09:40 -0700)]

The only remaining user of this is Unicode::UCD, and so most of the code
from is moved into that

It removes a no-longer relevant test (that had been changed into a skip
anyway), and it changes or removes the no-longer relevant references in
comments to

Later commits will do some simplification as not all the previous
functionality is needed.  This commit removed only the parts that were
preventing compilation and tests passing.

22 months agoRemove swashes from core
Karl Williamson [Tue, 5 Nov 2019 05:27:39 +0000 (22:27 -0700)]
Remove swashes from core

Also references to the term.

22 months agoop.c: Remove no-longer used function
Karl Williamson [Tue, 5 Nov 2019 05:18:05 +0000 (22:18 -0700)]
op.c: Remove no-longer used function

22 months agohandy.h: Change references to swashes
Karl Williamson [Tue, 5 Nov 2019 05:17:08 +0000 (22:17 -0700)]
handy.h: Change references to swashes

As these are no longer used.

22 months agoPorting/todo.pod: Rmv reference to fixing swashes
Karl Williamson [Tue, 5 Nov 2019 05:14:30 +0000 (22:14 -0700)]
Porting/todo.pod: Rmv reference to fixing swashes

22 months agoUnTODO some tests fixed by the previous commit
Karl Williamson [Tue, 5 Nov 2019 05:10:56 +0000 (22:10 -0700)]
UnTODO some tests fixed by the previous commit

22 months agoReimplement tr/// without swashes
Karl Williamson [Tue, 5 Nov 2019 04:30:48 +0000 (21:30 -0700)]
Reimplement tr/// without swashes

This large commit removes the last use of swashes from core.

It replaces swashes by inversion maps.  This data structure is already
in use for some Unicode properties, such as case changing.
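
As a rough illustration of the idea, here is a simplified inversion map with a binary-search lookup. This is a sketch under assumptions, not the core's data layout: the struct, the UNMAPPED marker and the function name are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define UNMAPPED UINT32_MAX   /* hypothetical marker: code point maps to itself */

/* Range i covers starts[i] .. starts[i+1]-1 (the last range is open-ended).
 * maps[i] is the target of starts[i]; later code points in the range map
 * to consecutive targets. */
typedef struct {
    const uint32_t *starts;   /* sorted ascending, starts[0] == 0 */
    const uint32_t *maps;
    size_t          len;
} invmap;

static uint32_t invmap_lookup(const invmap *m, uint32_t cp)
{
    size_t lo = 0, hi = m->len;          /* the answer lies in [lo, hi) */

    while (hi - lo > 1) {                /* find the last start <= cp */
        size_t mid = lo + (hi - lo) / 2;
        if (m->starts[mid] <= cp) lo = mid;
        else                      hi = mid;
    }
    if (m->maps[lo] == UNMAPPED) return cp;
    return m->maps[lo] + (cp - m->starts[lo]);
}

/* tr/A-Z/a-z/ as a three-range map: below 'A', 'A'..'Z', above 'Z'. */
static const uint32_t az_starts[] = { 0, 'A', 'Z' + 1 };
static const uint32_t az_maps[]   = { UNMAPPED, 'a', UNMAPPED };
static const invmap   az_map      = { az_starts, az_maps, 3 };
```

A typical tr/// needs only a handful of ranges, so the search touches very few elements.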

The inversion map data structure leads to straightforward
implementation code, so I collapsed the two doop.c routines
do_trans_complex_utf8() and do_trans_simple_utf8() into one.  A few
conditionals could be avoided in the loop if this function were split so
that one version didn't have to test for, e.g., squashing, but I suspect
these are in the noise in the loop, which has to deal with UTF-8
conversions.  This should be faster than the previous implementation
anyway.  I measured the differences some releases back, and inversion
maps were faster than the equivalent swash for up to 512 or 1024
different ranges.  These numbers are unlikely to be exceeded in tr///
except possibly in machine-generated ones.

Inversion maps are capable of handling both UTF-8 and non-UTF-8 cases,
but I left in the existing non-UTF-8 implementation, which uses tables,
because I suspect it is faster.  This means that there is extra code,
purely for runtime performance.

An inversion map is always created from the input, and then if the table
implementation is to be used, the table is easily derived from the map.
Prior to this commit, the table implementation was used in certain edge
cases involving code points above 255.  Those cases are now handled by
the inversion map implementation, because it would have taken extra code
to detect them, and I didn't think it was worth it.  That could be
changed if I am wrong.

Creating an inversion map for all inputs essentially normalizes them,
and then the same logic is usable for all.  This fixes some false
negatives in the previous implementation.  It also allows for detecting
if the actual transliteration can be done in place.  Previously, the
code mostly punted on that detection for the UTF-8 case.

This also allows for accurate counting of the lengths of the two sides,
fixing some longstanding TODO warning tests.

A new flag is created, OPpTRANS_CAN_FORCE_UTF8, when the tr/// has a
below 256 character resolving to one that requires UTF-8.  If this isn't
set, the code knows that a non-UTF-8 input won't become UTF-8 in the
process, and so can take short cuts.  The bit representing this flag is
the same as OPpTRANS_FROM_UTF, which is no longer used.  That name is
left in so that the dozen-ish modules on CPAN that refer to it can still
compile.  AFAICT none of them actually use the flag, as well they
shouldn't since it is private to the core.

Inversion maps are ideally suited for tr/// implementations.  An issue
with them in general is that for some pathological data, they can become
fragmented requiring more space than you would expect, to represent the
underlying data.  However, the typical tr/// would not have this issue,
requiring only very short inversion maps to represent; in some cases
shorter than the table implementation.

Inversion maps are also easier to deparse than swashes.  A deparse TODO
was also fixed by this commit, and the code to deparse UTF-8 inputs is

One could implement specialized data structures for specific types of
inputs.  For example, a common tr/// form is a single range, like
tr/A-Z/a-z/.  That could be implemented without a table and be quite
fast.  An intermediate step would be to use the inversion map
implementation always when the transliteration is a single range, and
then special case length=1 maps at execution time.
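
The table-free single-range case could look something like the following. This is only a sketch of the idea floated above; nothing like it exists in this commit, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical description of a tr/// whose search list is one range
 * mapped onto an equally long replacement range, e.g. tr/A-Z/a-z/. */
typedef struct {
    uint32_t from_lo;   /* first code point of the search range */
    uint32_t from_hi;   /* last code point of the search range */
    uint32_t to_lo;     /* first code point of the replacement range */
} single_range;

static uint32_t tr_single_range(const single_range *r, uint32_t cp)
{
    if (cp < r->from_lo || cp > r->from_hi)
        return cp;                          /* outside the range: unchanged */
    return r->to_lo + (cp - r->from_lo);    /* same offset in the target range */
}

/* Example instance for tr/A-Z/a-z/. */
static const single_range upper_to_lower = { 'A', 'Z', 'a' };
```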

Thanks to Nicholas Rochemagne for his help on B

22 months agointrpvar.h: Add variable for use in tr///
Karl Williamson [Thu, 3 Oct 2019 04:34:37 +0000 (22:34 -0600)]
intrpvar.h: Add variable for use in tr///

This is part of this branch of changes.

22 months agoop.c: Add debugging dump function
Karl Williamson [Tue, 19 Feb 2019 04:14:47 +0000 (21:14 -0700)]
op.c: Add debugging dump function

This function dumps out an inversion map

22 months agoop.h: Add synonyms for some tr/// values
Karl Williamson [Mon, 4 Nov 2019 21:59:02 +0000 (14:59 -0700)]
op.h: Add synonyms for some tr/// values

22 months agoChange names of some OPpTRANS flags
Karl Williamson [Mon, 4 Nov 2019 21:55:16 +0000 (14:55 -0700)]
Change names of some OPpTRANS flags

These two flags will shortly become obsolete, replaced by ones with
different meanings.  This commit makes the new ones the normal ones, and
makes the old names synonyms so that code that refers to them can

22 months agodoop.c: Refactor do_trans_complex()
Karl Williamson [Mon, 4 Nov 2019 21:38:58 +0000 (14:38 -0700)]
doop.c: Refactor do_trans_complex()

I had trouble understanding how this uncommented routine worked.  And it
turned out to be broken, squeezing the pre-transliterated characters
instead of the post-transliterated ones.  This fixes the TODO test added
in the previous commit.
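
The difference between squeezing pre- and post-transliterated characters can be sketched on a byte table. This is a simplified stand-in, not do_trans_complex() itself; all names are hypothetical, and every byte is treated as part of the search list.

```c
#include <stddef.h>

/* Transliterates `src` through a 256-entry byte table and squeezes runs,
 * in the manner of tr/.../.../s. `dst` must have room for `len` bytes.
 * Returns the number of bytes written. */
static size_t tr_squeeze(const unsigned char *src, size_t len,
                         const unsigned char table[256], unsigned char *dst)
{
    size_t out = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char mapped = table[src[i]];

        /* Compare against the previous *output* byte. Comparing src[i]
         * with src[i-1] instead would squeeze pre-transliterated runs
         * and miss "ab" when both map to 'x'. */
        if (out > 0 && dst[out - 1] == mapped)
            continue;
        dst[out++] = mapped;
    }
    return out;
}

/* Usage sketch: with 'a' and 'b' both mapped to 'x', "aabb" becomes "x". */
static unsigned char demo_out[4];

static size_t demo_squeeze(void)
{
    unsigned char table[256];

    for (int i = 0; i < 256; i++)
        table[i] = (unsigned char) i;       /* identity by default */
    table['a'] = table['b'] = 'x';
    return tr_squeeze((const unsigned char *) "aabb", 4, table, demo_out);
}
```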

22 months agot/op/tr.t: Add tests, incl. a TODO
Karl Williamson [Tue, 5 Nov 2019 05:13:43 +0000 (22:13 -0700)]
t/op/tr.t: Add tests, incl. a TODO

This adds a TODO test which demonstrates that the current tr/// is
broken, to be fixed by the next commit.

It adds other tests designed to stress the forthcoming revisions in the
implementation of tr///.

22 months agodoop.c: Change name of variable
Karl Williamson [Mon, 4 Nov 2019 21:17:09 +0000 (14:17 -0700)]
doop.c: Change name of variable

This helped me understand what was going on in this function

22 months agodoop.c: Change out-of-bounds value
Karl Williamson [Mon, 4 Nov 2019 20:12:21 +0000 (13:12 -0700)]
doop.c: Change out-of-bounds value

This currently uses 0xfeedface as a marker for something that isn't a
legal value.  But that could in fact become legal at some point.  This
defines a value TR_OOB that can be guaranteed not to become legal.

22 months agodoop.c: Add, revise comments
Karl Williamson [Sat, 2 Nov 2019 16:15:22 +0000 (10:15 -0600)]
doop.c: Add, revise comments

22 months agoop.c: Simplify expression.
Karl Williamson [Sun, 27 Oct 2019 15:55:10 +0000 (09:55 -0600)]
op.c: Simplify expression.

This also makes sure 'struct_size' has the correct value in it for any
future uses.

22 months agoregen/ Add tables that partition by UTF-8 length
Karl Williamson [Wed, 2 Oct 2019 21:36:19 +0000 (15:36 -0600)]
regen/ Add tables that partition by UTF-8 length

These will be used in a future commit.  This creates equivalence classes
of ranges of code points whose UTF-8 representations are the same length
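
For reference, the partition on ASCII platforms follows the standard UTF-8 length boundaries. The helper below is a sketch and not the generated tables; it covers only code points up to U+10FFFF and ignores Perl's extended UTF-8 and EBCDIC.

```c
#include <stdint.h>

/* Number of bytes needed to encode `cp` in standard UTF-8. Each return
 * value corresponds to one equivalence class of code points:
 * 0..0x7F, 0x80..0x7FF, 0x800..0xFFFF, 0x10000..0x10FFFF. */
static unsigned utf8_len(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}
```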

22 months agoop.c, doop.c Use mnemonics instead of numeric values
Karl Williamson [Wed, 2 Oct 2019 20:47:24 +0000 (14:47 -0600)]
op.c, doop.c Use mnemonics instead of numeric values

For legibility and maintainability

22 months agodoop.c: Add a parameter to a few fcns
Karl Williamson [Wed, 2 Oct 2019 18:33:01 +0000 (12:33 -0600)]
doop.c: Add a parameter to a few fcns

instead of deriving it each time from inside the function.  This is in
preparation for future commits.

22 months agoChange macro name in tr/// code
Karl Williamson [Wed, 2 Oct 2019 15:38:53 +0000 (09:38 -0600)]
Change macro name in tr/// code

This makes it more mnemonic.  Also add an explanation in toke.c

22 months agoop.c: Comments only
Karl Williamson [Wed, 27 Feb 2019 21:05:18 +0000 (14:05 -0700)]
op.c: Comments only

Indent for clarity, and add a comment

22 months agodoop.c, op.c: White-space only
Karl Williamson [Wed, 27 Feb 2019 20:56:00 +0000 (13:56 -0700)]
doop.c, op.c: White-space only

Remove trailing blanks and outdent a doubly indented block

22 months agoop.c: Indent some code
Karl Williamson [Wed, 27 Feb 2019 20:26:53 +0000 (13:26 -0700)]
op.c: Indent some code

This is in preparation for a future commit which will surround this with
an 'if'.

22 months agotoke.c: comment, White-space only
Karl Williamson [Sat, 2 Nov 2019 20:45:23 +0000 (14:45 -0600)]
toke.c: comment, White-space only

Wrap a too-long line

22 months agoAllow core to work with code points above IV_MAX
Karl Williamson [Wed, 2 Oct 2019 21:29:05 +0000 (15:29 -0600)]
Allow core to work with code points above IV_MAX

Higher has been reserved for core use, and a future commit will want to
finally do this.

22 months agoMove some static fcns from regcomp.c to invlist_inline.h
Karl Williamson [Tue, 19 Feb 2019 02:37:53 +0000 (19:37 -0700)]
Move some static fcns from regcomp.c to invlist_inline.h

They are still only accessible from regcomp.c, but this is in
preparation for them to be usable from other core files as well.

22 months agoregcomp.c: Change name of static function.
Karl Williamson [Wed, 2 Oct 2019 03:57:59 +0000 (21:57 -0600)]
regcomp.c: Change name of static function.

This removes an unnecessary leading underscore

22 months agoinvlist_inline.h: White space only
Karl Williamson [Tue, 19 Feb 2019 02:31:25 +0000 (19:31 -0700)]
invlist_inline.h: White space only

Fold a too-long line

22 months agoinvlist_inline.h: Restrict files symbols are in
Karl Williamson [Tue, 19 Feb 2019 02:30:31 +0000 (19:30 -0700)]
invlist_inline.h: Restrict files symbols are in

These are only needed in regcomp.c, so restrict them to that file

22 months agoUpdate IO-Compress to CPAN version 2.089
Chris 'BinGOs' Williams [Wed, 6 Nov 2019 22:06:58 +0000 (22:06 +0000)]
Update IO-Compress to CPAN version 2.089


  2.089 3 November 2019

      * bin/streamzip
        Add zipstream to EXE_FILES

  2.088 31 October 2019

      * t/105oneshot-zip-only.t
        Fix reset of CompSize

      * Add Support Details

      * Update site for Bzip2 to sourceware

      * Fix number of tests

      * Add streamzip script to bin

      * zipdetails

        * Update zipdetails to version 1.11

        * Zip64 extra field typo

      * t/105oneshot-zip-only.t
        test with deflated directory

      * t/105oneshot-zip-only.t
        Add test for encrypted Zip files

      * Documentation Updates

      * Mention xz, lzma etc

22 months agoUpdate Compress-Raw-Bzip2 to CPAN version 2.089
Chris 'BinGOs' Williams [Wed, 6 Nov 2019 22:00:47 +0000 (22:00 +0000)]
Update Compress-Raw-Bzip2 to CPAN version 2.089


  2.089 3 November 2019

      * No Changes

  2.088 31 October 2019

      * Add Support Details

      * upgrade to Bzip2 1.0.8

22 months agoUpdate Compress-Raw-Zlib to CPAN version 2.089
Chris 'BinGOs' Williams [Wed, 6 Nov 2019 21:59:15 +0000 (21:59 +0000)]
Update Compress-Raw-Zlib to CPAN version 2.089


  2.089 3 November 2019

      * No Changes

  2.088 31 October 2019

      * Add SUPPORT section

      * 000prereq.t: dump Perl version

22 months agoprevent a race between name-based stat and an open modifying atime
Tony Cook [Sun, 3 Nov 2019 22:52:22 +0000 (09:52 +1100)]
prevent a race between name-based stat and an open modifying atime

Most Linux systems rarely update atime, so it's very unlikely
for this issue to trigger there, but on a system with classic atime
behaviour this was a race between open modifying atime and time()
ticking over.

gh #17234

22 months agoadd defensive parens
Yves Orton [Tue, 5 Nov 2019 23:32:06 +0000 (00:32 +0100)]
add defensive parens

22 months agorework U8TOxx_LE macros to force unsigned access
Yves Orton [Tue, 5 Nov 2019 23:05:17 +0000 (00:05 +0100)]
rework U8TOxx_LE macros to force unsigned access

This introduces a _shifted_octet() utility macro to make things
clearer.  It also adds support for USE_UNALIGNED_PTR_DEREF for
little-endian platforms that allow unaligned access. This must
be manually defined and ONLY affects little endian builds currently,
and is there primarily for -g builds on x86 (eg for perl developers

22 months agoFix the U8TO*_LE macros
Matt Turner [Tue, 5 Nov 2019 18:22:58 +0000 (10:22 -0800)]
Fix the U8TO*_LE macros

Embarrassingly I got confused and swapped them (BYTEORDER == 0x1234 etc
is not great...). Somehow the perl test suite still passes with this,
but fortunately the test suite for modules like
Algorithm-MinPerfHashTwoLevel caught the problem.
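
A byte-at-a-time little-endian load avoids depending on BYTEORDER at all, which is roughly what the portable path of such macros has to do. The sketch below is an assumption-laden illustration: SHIFTED_OCTET and load_u32_le are hypothetical names, not the macros in the core headers.

```c
#include <stdint.h>

/* Hypothetical stand-in for a shifted-octet helper: read byte `idx` through
 * an unsigned char pointer, widen it to the result type, then shift.
 * Widening first stops the byte from being promoted to a signed int, where
 * `0x80 << 24` would overflow. */
#define SHIFTED_OCTET(type, ptr, idx, shift) \
    ((type) ((const unsigned char *) (ptr))[idx] << (shift))

/* Assemble a little-endian 32-bit value one byte at a time; this works
 * regardless of host byte order or pointer alignment. */
static uint32_t load_u32_le(const void *p)
{
    return SHIFTED_OCTET(uint32_t, p, 0,  0)
         | SHIFTED_OCTET(uint32_t, p, 1,  8)
         | SHIFTED_OCTET(uint32_t, p, 2, 16)
         | SHIFTED_OCTET(uint32_t, p, 3, 24);
}
```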


22 months agoRemove compiler in .travis.yml
Steve Peters [Tue, 5 Nov 2019 17:08:13 +0000 (11:08 -0600)]
Remove compiler  in .travis.yml

Since `clang` was removed as a compiler option in the previous change to `.travis.yml`, no need to exclude Linux from using it.  Also, no need for the excludes for OS X since was removed.

22 months agoperlop.pod: Slight clarification
Karl Williamson [Sat, 2 Nov 2019 20:41:49 +0000 (14:41 -0600)]
perlop.pod: Slight clarification

22 months agotoke.c: comment changes
Karl Williamson [Tue, 5 Nov 2019 04:55:53 +0000 (21:55 -0700)]
toke.c: comment changes

These should have been included in

22 months agogv.c: SVf needs to be surrounded by spaces
Karl Williamson [Mon, 4 Nov 2019 20:54:41 +0000 (13:54 -0700)]
gv.c: SVf needs to be surrounded by spaces

perl can be compiled with C++, and this is illegal there.
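
The underlying rule: since C++11, a string literal immediately followed by an identifier is lexed as a user-defined literal, so the format macro never gets expanded. A sketch with a hypothetical macro MY_FMT standing in for SVf:

```c
#include <stdio.h>

/* Hypothetical format macro standing in for SVf. */
#define MY_FMT "s"

/* Writes "name=<value>" into buf. The space before MY_FMT matters when the
 * file is compiled as C++11 or later: "name=%"MY_FMT would be lexed as a
 * string literal with a literal suffix, not as a string plus a macro. */
static int format_name(char *buf, size_t size, const char *value)
{
    return snprintf(buf, size, "name=%" MY_FMT, value);
}

/* Scratch buffer for the usage sketch. */
static char demo_buf[32];
```

In plain C both spellings compile, which is why the problem only shows up in C++ builds.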

22 months agoRemove unused `key` and `orig_keyword` parameters from `yyl_key_core`
Dagfinn Ilmari Mannsåker [Tue, 5 Nov 2019 12:31:28 +0000 (12:31 +0000)]
Remove unused `key` and `orig_keyword` parameters from `yyl_key_core`

They were only ever passed as zeros, so just make them local to the

22 months agoRename `tmp` local to `key` in `yyl_keylookup`
Dagfinn Ilmari Mannsåker [Tue, 5 Nov 2019 12:29:39 +0000 (12:29 +0000)]
Rename `tmp` local to `key` in `yyl_keylookup`

Also only initialise it just before it's actually used.

22 months agoRemove unused `key` parameter from `yyl_just_a_word`
Dagfinn Ilmari Mannsåker [Tue, 5 Nov 2019 12:25:14 +0000 (12:25 +0000)]
Remove unused `key` parameter from `yyl_just_a_word`

22 months agot/re/speed.t: increase timeout
David Mitchell [Tue, 5 Nov 2019 11:43:57 +0000 (11:43 +0000)]
t/re/speed.t: increase timeout

Test 58 is supposed to finish in milliseconds, or take 10s of seconds
if it hits the bug. It currently tests for <= 1s and is failing
occasionally in smokes. Increase the timeout to 2s and see if the issue
goes away. Also add a diag() to display the elapsed time on failure.

22 months agoenforce strict for barewords in multiconcat
Tony Cook [Mon, 4 Nov 2019 21:00:25 +0000 (08:00 +1100)]
enforce strict for barewords in multiconcat

gh #17254

22 months agoDoc: fix sv_setsv_cow reference
Sergey Aleynikov [Mon, 4 Nov 2019 17:58:32 +0000 (20:58 +0300)]
Doc: fix sv_setsv_cow reference

It's no longer present in pp_hot.c

22 months agoS_gv_fetchmeth_internal fix STRLEN warning
Nicolas R [Mon, 4 Nov 2019 19:03:54 +0000 (13:03 -0600)]
S_gv_fetchmeth_internal fix STRLEN warning

Fixes #17250

cast STRLEN to int

Fix warnings from recent change GH #17222
We could also consider casting it using '

22 months ago[MERGE] Even smaller toke
Aaron Crane [Mon, 4 Nov 2019 11:00:14 +0000 (11:00 +0000)]
[MERGE] Even smaller toke

This continues the work merged in 5015bd0bb5ee7e0fa1ede1669bdfcd7bb5f10ebd,
factoring code out of Perl_yylex() and its callees, in the hope of making
the lexer easier to understand locally.

The work in the previous branch used unbounded tail recursion in some
places. Some compilers may have been able to compile the tail-recursive
calls as jumps under some optimisation modes, but we cannot rely on that. So
this further work replaces the unbounded recursion with gotos or other
control structures. It's a little annoying not to have been able to
eliminate those gotos, but these remaining gotos all occur within a function
that's considerably smaller and easier to understand, and they merely jump
all the way back to the top of that function.

After these changes, the largest remaining piece of Perl_yylex() is just
over 900 lines (down from originally >4100), and consists of a single switch
statement, all of whose case groups are independent.

This branch also contains a note in perldelta that this major refactoring
has taken place.

Closes #17220

22 months agoperldelta for recent toke.c refactoring 17241/head
Aaron Crane [Fri, 25 Oct 2019 10:21:51 +0000 (11:21 +0100)]
perldelta for recent toke.c refactoring

22 months agotoke.c: const-ify formbrack parameters
Aaron Crane [Fri, 1 Nov 2019 15:29:03 +0000 (15:29 +0000)]
toke.c: const-ify formbrack parameters

22 months agotoke.c: replace recursive calls to yyl_try() with goto
Aaron Crane [Fri, 1 Nov 2019 15:10:14 +0000 (15:10 +0000)]
toke.c: replace recursive calls to yyl_try() with goto

The downside of writing these calls recursively is that not all compilers
will compile the tail-position calls as jumps; that's especially true in
earlier versions of this refactoring process (where yyl_try() took a large
number of arguments), but it's not in general something we can expect to
happen — especially in the presence of `-O0` or similar compiler options.
This can lead to call-stack overflow in some circumstances.

Most recursive calls to yyl_try() occur within yyl_try() itself, so we can
easily replace them with an explicit `goto` (which is what most compilers
would use for the recursive calls anyway, now that yyl_try() takes ≤3

There are only two other recursive-call cases. One is yyl_fake_eof(), which
as far as I can tell is never called repeatedly within a single file; this
seems safe.

The other is yyl_eol(). It has exactly two distinct return paths, so this
commit moves the retry logic into its yyl_try() caller.

With this change, we no longer seem to trigger call-stack overflow.

Closes #17220
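
The transformation is the usual one for a self-call in tail position. A toy example, unrelated to the actual lexer code (both function names are hypothetical):

```c
/* Skips leading blanks by calling itself in tail position. Without
 * tail-call optimisation (e.g. at -O0) each blank costs a stack frame. */
static const char *skip_blanks_recursive(const char *s)
{
    if (*s == ' ' || *s == '\t')
        return skip_blanks_recursive(s + 1);
    return s;
}

/* Same logic with the tail call replaced by a jump back to the top,
 * so stack use stays constant however long the input is. */
static const char *skip_blanks_goto(const char *s)
{
  retry:
    if (*s == ' ' || *s == '\t') {
        s++;
        goto retry;
    }
    return s;
}
```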

22 months agotoke.c: delete unused bof parameters
Aaron Crane [Fri, 25 Oct 2019 14:43:00 +0000 (15:43 +0100)]
toke.c: delete unused bof parameters

22 months agotoke.c: don't pass around a copy of PL_parser->saw_infix_sigil
Aaron Crane [Fri, 25 Oct 2019 10:18:39 +0000 (11:18 +0100)]
toke.c: don't pass around a copy of PL_parser->saw_infix_sigil

There's exactly one place where we need to consult it (and that only for
producing good error messages in a specific group of term-after-term

The reason for passing it around was so that it could be reset to false
early on in the process of lexing a token, while then allowing the three
separate cases that might need to set it true to do so independently.

Instead, centralise the logic of determining when it needs to be true.

22 months agotoke.c: remove some spurious orig_keyword uses
Aaron Crane [Wed, 23 Oct 2019 21:00:47 +0000 (22:00 +0100)]
toke.c: remove some spurious orig_keyword uses

22 months agotoke.c: remove formbrack argument from yyl_try()
Aaron Crane [Wed, 23 Oct 2019 20:50:28 +0000 (21:50 +0100)]
toke.c: remove formbrack argument from yyl_try()

With this commit, yyl_try() has few enough arguments that the RETRY()
macro no longer serves any useful purpose; delete it too.

22 months agotoke.c: delete weird initial_state arg to yyl_try()
Aaron Crane [Wed, 23 Oct 2019 20:32:51 +0000 (21:32 +0100)]
toke.c: delete weird initial_state arg to yyl_try()

I thought I was going to end up using this for more stuff, but I've
found better approaches.

This commit also removes two more goto targets.

22 months agotoke.c: factor out static yyl_keylookup()
Aaron Crane [Wed, 23 Oct 2019 16:14:16 +0000 (17:14 +0100)]
toke.c: factor out static yyl_keylookup()

22 months agotoke.c: factor out static yyl_key_core() and yyl_word_or_keyword()
Aaron Crane [Wed, 23 Oct 2019 15:51:58 +0000 (16:51 +0100)]
toke.c: factor out static yyl_key_core() and yyl_word_or_keyword()

22 months agotoke.c: bundle some yyl_just_a_word() params into a struct
Aaron Crane [Wed, 23 Oct 2019 12:11:02 +0000 (13:11 +0100)]
toke.c: bundle some yyl_just_a_word() params into a struct

This makes calls to it much easier to understand.

22 months agotoke.c: factor out static yyl_just_a_word()
Aaron Crane [Wed, 23 Oct 2019 11:21:53 +0000 (12:21 +0100)]
toke.c: factor out static yyl_just_a_word()

22 months agotoke.c: stop passing around several needless local variables
Aaron Crane [Tue, 22 Oct 2019 21:49:00 +0000 (22:49 +0100)]
toke.c: stop passing around several needless local variables

I introduced these parameters as part of mechanically refactoring goto-heavy
logic into subroutines. However, they aren't actually needed through most of
the code. Even in the recursive case (in which yyl_try() or one of its
callees will call itself), we can reset the variables to zero.

22 months agotoke.c: factor out static yyl_strictwarn_bareword()
Aaron Crane [Mon, 21 Oct 2019 15:51:54 +0000 (17:51 +0200)]
toke.c: factor out static yyl_strictwarn_bareword()

22 months agotoke.c: remove the really_sub goto label
Aaron Crane [Mon, 21 Oct 2019 12:12:16 +0000 (14:12 +0200)]
toke.c: remove the really_sub goto label

This permits some additional pleasing simplifications.

22 months agotoke.c: factor out static yyl_constant_op()
Aaron Crane [Mon, 21 Oct 2019 11:55:22 +0000 (13:55 +0200)]
toke.c: factor out static yyl_constant_op()

With the removal of another goto label!

22 months agotoke.c: factor out static yyl_safe_bareword()
Aaron Crane [Mon, 21 Oct 2019 11:39:10 +0000 (13:39 +0200)]
toke.c: factor out static yyl_safe_bareword()

22 months agotoke.c: fold some initialisations into the corresponding declarations
Aaron Crane [Mon, 21 Oct 2019 11:34:33 +0000 (13:34 +0200)]
toke.c: fold some initialisations into the corresponding declarations

22 months agotoke.c: factor out static yyl_fatcomma()
Aaron Crane [Mon, 21 Oct 2019 10:43:17 +0000 (12:43 +0200)]
toke.c: factor out static yyl_fatcomma()

This removes a goto label.

22 months agotoke.c: factor out static yyl_fake_eof()
Aaron Crane [Mon, 21 Oct 2019 10:17:40 +0000 (12:17 +0200)]
toke.c: factor out static yyl_fake_eof()

22 months agoFix copy-and-paste mistake in U8TO64_LE
Matt Turner [Mon, 4 Nov 2019 00:13:48 +0000 (16:13 -0800)]
Fix copy-and-paste mistake in U8TO64_LE


22 months agoFactor out common code from sv_derived_from_* subs family
Sergey Aleynikov [Tue, 29 Oct 2019 20:40:03 +0000 (23:40 +0300)]
Factor out common code from sv_derived_from_* subs family
into one that takes both SV*/char*+len arguments, like hv_common,
to be able to use speedups from SV* stash lookup API.

22 months agoCorrections: one grammatical; one POD formatting
James E Keenan [Sun, 3 Nov 2019 20:24:36 +0000 (15:24 -0500)]
Corrections: one grammatical; one POD formatting

22 months agoop.h: Remove obsolete #define
Karl Williamson [Sat, 2 Nov 2019 21:02:19 +0000 (15:02 -0600)]
op.h: Remove obsolete #define

This is no longer used.

22 months agotoke.c: Fix bug tr/// upgrading to UTF-8 in middle
Karl Williamson [Sat, 2 Nov 2019 19:59:38 +0000 (13:59 -0600)]
toke.c: Fix bug tr/// upgrading to UTF-8 in middle

Consider tr/\x{ff}-\x{100}/AB/.

While parsing, the code keeps an offset from the beginning of the output
to the beginning of the second number in the range.  This is purely for
speed so that it wouldn't have to re-find the beginning of that value,
when it already knew it.

But the example above shows the folly of this shortcut.  The second
number in the range causes the output to be upgraded to UTF-8, which
makes that offset invalid in general.  Change to re-find the beginning.
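
Why the saved offset goes stale can be seen by recomputing it. The helper below is a hypothetical sketch for ASCII platforms, not the toke.c fix (which re-finds the position instead); the '-' in the usage note is only a stand-in for the parser's internal range marker.

```c
#include <stddef.h>

/* Length in bytes of the first `n` Latin-1 bytes of `s` once upgraded to
 * UTF-8: bytes below 0x80 stay one byte, the rest become two. An offset
 * recorded before the upgrade is wrong afterwards whenever a high byte
 * precedes it, because each such byte shifts the position by one. */
static size_t offset_after_upgrade(const unsigned char *s, size_t n)
{
    size_t upgraded = 0;

    for (size_t i = 0; i < n; i++)
        upgraded += (s[i] < 0x80) ? 1 : 2;
    return upgraded;
}
```

With "\xff-" already in the output, the second number would start at offset 2 before the upgrade and at offset 3 after it.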

22 months agomathoms.c,utf8.c: Update to use UTF8_CHK_SKIP
Karl Williamson [Sun, 3 Nov 2019 17:23:05 +0000 (10:23 -0700)]
mathoms.c,utf8.c: Update to use UTF8_CHK_SKIP

This new macro expands to what these did.  Update to use the common

22 months agoPATCH: Character class code broke MSWin32 compilation
Karl Williamson [Sun, 3 Nov 2019 16:50:56 +0000 (09:50 -0700)]
PATCH: Character class code broke MSWin32 compilation

I'm not sure why this didn't show up elsewhere, but we have embed.fnc
entries for non-existent functions that should have been removed in

In addition, I see several more functions that should have been removed,
and this commit removes them.

22 months agoFix typo
Max Maischein [Sun, 3 Nov 2019 19:44:44 +0000 (20:44 +0100)]
Fix typo

As noted by @bulk88 in #12898

22 months agoAnd silence some silly examples.
Max Maischein [Fri, 25 Oct 2019 18:20:56 +0000 (20:20 +0200)]
And silence some silly examples.

From RT88754

Adapted to the current Perl by Max Maischein

Created from

22 months agoAdd explicit list of supported Perl versions and URL where to find it
Max Maischein [Sun, 3 Nov 2019 18:15:29 +0000 (19:15 +0100)]
Add explicit list of supported Perl versions and URL where to find it

This addresses the comments of @bulk88 made in #12898

22 months agoa template for vulnerability announcements
Tony Cook [Tue, 10 Dec 2013 04:42:30 +0000 (15:42 +1100)]
a template for vulnerability announcements

not that we ever make mistakes

22 months agoInversion lists are SvPOK
Karl Williamson [Sat, 2 Nov 2019 16:06:33 +0000 (10:06 -0600)]
Inversion lists are SvPOK

They always have a string, and making them this allows B to access it.

22 months agoOn OP_READLINE, OPf_SPECIAL is set for <<>>, clear for <>.
Nicholas Clark [Sat, 2 Nov 2019 21:08:48 +0000 (22:08 +0100)]
On OP_READLINE, OPf_SPECIAL is set for <<>>, clear for <>.

22 months agoUpdate perlfaq to CPAN version 5.20191102
Karen Etheridge [Sat, 2 Nov 2019 05:39:50 +0000 (22:39 -0700)]
Update perlfaq to CPAN version 5.20191102


5.20191102  2019-11-02 05:34:43Z
  * fix bad pod markup in perlfaq8 (PR #78; thanks, Joaquín Ferrero!)
  * remove stale section about (PR #82, Dan Book)
  * update perlfaq9 to reference Email::Stuffer (PR #79, Dan Book)
  * update perlfaq9 to reference URL::Search (PR #80, Dan Book)
  * update perlfaq9 to use HTTP::Tiny (PR #81, Dan Book)
  * fix some broken links (issue #71, dctabuyz)