This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
perl5.git
8 years agoUpgrade libnet from version 3.07 to 3.08
Steve Hay [Wed, 6 Jan 2016 08:14:36 +0000 (08:14 +0000)]
Upgrade libnet from version 3.07 to 3.08

8 years agofix a typo in perl5233delta.pod
Tony Cook [Wed, 6 Jan 2016 05:16:04 +0000 (16:16 +1100)]
fix a typo in perl5233delta.pod

Pointed out by Andrew Rodland (hobbs) on #p5p

8 years ago[perl #126240] avoid leaking memory when setting $ENV{foo} on darwin
Tony Cook [Wed, 6 Jan 2016 03:27:46 +0000 (14:27 +1100)]
[perl #126240] avoid leaking memory when setting $ENV{foo} on darwin

My change in e396210 was incomplete: it was intended to force use of
setenv()/unsetenv() on Darwin, but ended up using putenv() instead,
which is a leaky mechanism.

Added darwin to the list of many others that work better with setenv()/
unsetenv().
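
The difference between the two interfaces can be sketched in C. This is only an illustration, not the code perl uses; the helper name set_env_copying is hypothetical. With setenv() the C library copies the name and value, so the caller keeps ownership of its own buffers, whereas putenv() stores the caller's pointer in the environment, so a caller that allocates a fresh "name=value" string per assignment has to track and free the old ones itself.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: set or clear NAME using the copying interfaces.
 * setenv() copies both strings, so the caller has nothing to free
 * later; passing value == NULL removes the variable. */
static int set_env_copying(const char *name, const char *value)
{
    assert(name != NULL && strchr(name, '=') == NULL);
    if (value == NULL)
        return unsetenv(name);      /* remove the variable entirely */
    return setenv(name, value, 1);  /* 1 = overwrite an existing value */
}
```

Both calls return 0 on success, so a caller can check the result the same way for setting and clearing.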

8 years agoperlsyn: change = to == in conditional in do/while example
Lukas Mai [Tue, 5 Jan 2016 23:35:24 +0000 (00:35 +0100)]
perlsyn: change = to == in conditional in do/while example

... also remove unused LOOP label from 'last' example, mention 'redo'
(works like 'next' in this case), add example that combines
'next'/'last' (and requires the label).

8 years agoUpgrade Unicode-Normalize from version 1.24 to 1.25
Steve Hay [Tue, 5 Jan 2016 17:37:21 +0000 (17:37 +0000)]
Upgrade Unicode-Normalize from version 1.24 to 1.25

8 years agoUpgrade bignum from version 0.41 to 0.42
Steve Hay [Tue, 5 Jan 2016 13:12:45 +0000 (13:12 +0000)]
Upgrade bignum from version 0.41 to 0.42

8 years agoSilence t/porting/cmp_version.t after Math-Big* upgrades
Steve Hay [Mon, 4 Jan 2016 14:15:43 +0000 (14:15 +0000)]
Silence t/porting/cmp_version.t after Math-Big* upgrades

8 years agoUpgrade Math-BigRat from version 0.260801 to 0.260802
Steve Hay [Mon, 4 Jan 2016 13:55:50 +0000 (13:55 +0000)]
Upgrade Math-BigRat from version 0.260801 to 0.260802

(This maintains the one minor divergence between blead and cpan. The blead
version first appeared in 50a54b125c. I haven't examined whether this
difference needs to remain, or whether we can switch to the cpan version.)

8 years agoUpgrade Math-BigInt-FastCalc from version 0.38 to 0.40
Steve Hay [Mon, 4 Jan 2016 13:33:22 +0000 (13:33 +0000)]
Upgrade Math-BigInt-FastCalc from version 0.38 to 0.40

8 years agoUpgrade Math-BigInt from version 1.999710 to 1.999714
Steve Hay [Mon, 4 Jan 2016 13:32:30 +0000 (13:32 +0000)]
Upgrade Math-BigInt from version 1.999710 to 1.999714

8 years agoperlgit: many small changes
Lukas Mai [Tue, 5 Jan 2016 12:04:24 +0000 (13:04 +0100)]
perlgit: many small changes

- verbatimize a paragraph of sample commands
- grammar: sent -> send
- consistently hyperlink all email addresses
- hyperlink RT tickets
- hyperlink commit hashes
- consistently refer to bisect.pl as F<Porting/bisect.pl>
- add F< > to things that look like filenames

8 years agoclarify meaning of waitpid returning 0 [perl #127080]
Lukas Mai [Fri, 1 Jan 2016 14:45:47 +0000 (15:45 +0100)]
clarify meaning of waitpid returning 0 [perl #127080]

8 years agoexplain meaning of negative PIDs in waitpid [perl #127080]
Lukas Mai [Fri, 1 Jan 2016 14:35:58 +0000 (15:35 +0100)]
explain meaning of negative PIDs in waitpid [perl #127080]

8 years ago[perl #126922] avoid access to uninitialized memory in win32 crypt()
Tony Cook [Thu, 17 Dec 2015 00:15:31 +0000 (11:15 +1100)]
[perl #126922] avoid access to uninitialized memory in win32 crypt()

Previously the Win32 crypt() implementation would access the first
and second characters of the salt, even if the salt was zero length.

Add validation that will detect both a short salt and invalid
characters in the salt.
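
A check of this kind might look like the following C sketch. It assumes the traditional DES-style salt of two characters drawn from [./0-9A-Za-z]; the function name salt_is_valid is hypothetical, and this is not the code from the Win32 implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical validator for a traditional two-character crypt() salt.
 * Returns 1 if the first two characters exist and are both in the
 * set [./0-9A-Za-z], 0 otherwise.  It never reads past the NUL. */
static int salt_is_valid(const char *salt)
{
    static const char allowed[] =
        "./0123456789"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";
    size_t i;

    assert(salt != NULL);
    for (i = 0; i < 2; i++) {
        /* A NUL here means the salt is shorter than two characters.
         * Test for it first: strchr() would otherwise match the
         * terminator of `allowed`. */
        if (salt[i] == '\0' || strchr(allowed, salt[i]) == NULL)
            return 0;
    }
    return 1;
}
```

Checking for the terminator before each character is what prevents the read of uninitialized memory on an empty or one-character salt.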

8 years agoAdd Configure support for fdclose() for [perl #126847].
Andy Dougherty [Wed, 30 Dec 2015 03:58:51 +0000 (22:58 -0500)]
Add Configure support for fdclose() for [perl #126847].

This patch also adjusts the generated files suggested by
Porting/checkcfgvar.pl.

8 years agoPATCH: Re: [perl #126847] fdclose(3) patch
Andy Dougherty [Wed, 30 Dec 2015 03:47:42 +0000 (22:47 -0500)]
PATCH: Re: [perl #126847] fdclose(3) patch

This patch uses the fdclose() function from FreeBSD if it
is available.  It is based on the original patch supplied
by Mariusz Zaborski <oshogbo@FreeBSD.org> in the RT ticket.

The next patch will add Configure support for HAS_FDCLOSE.

8 years agoPorting/bench.pl: add --compact option
David Mitchell [Mon, 4 Jan 2016 13:15:19 +0000 (13:15 +0000)]
Porting/bench.pl: add --compact option

With this, you specify which perl executable you want the results for,
and it will display the result in a much more compact form than when
displaying the results for all perls, with just one line per test.

8 years agoPorting/bench.pl: preserve test order
David Mitchell [Mon, 4 Jan 2016 11:47:18 +0000 (11:47 +0000)]
Porting/bench.pl: preserve test order

In the absence of a --sort option, process and display the tests in the
order they appear in the test file, rather than in alphabetical order.

This is because the layout in the benchmark file usually follows some sort
of logical order.

8 years agoRemove superfluous entry in checkAUTHORS.pl.
James E Keenan [Sun, 3 Jan 2016 23:00:17 +0000 (18:00 -0500)]
Remove superfluous entry in checkAUTHORS.pl.

8 years agoperldelta for 4732711e2548
Tony Cook [Sun, 3 Jan 2016 22:43:18 +0000 (09:43 +1100)]
perldelta for 4732711e2548

8 years agoRemove nm from libswanted
Andreas Koenig [Sun, 3 Jan 2016 07:40:33 +0000 (08:40 +0100)]
Remove nm from libswanted

Nm stood for the "New Math" library in the context of 1994. In 2014 a
conflicting library, libnm, appeared that has a network manager context.

8 years agoProvide additional email address for contributor Mattia Barbon.
James E Keenan [Sun, 3 Jan 2016 21:59:46 +0000 (16:59 -0500)]
Provide additional email address for contributor Mattia Barbon.

8 years agoReplace :: with __ in THIS like it's done for parameters/return values
Mattia Barbon [Sun, 3 Jan 2016 20:54:31 +0000 (21:54 +0100)]
Replace :: with __ in THIS like it's done for parameters/return values

Apart from being more consistent, this simplifies writing XS code
wrapping C++ classes into a nested Perl namespace (it requires only
a typedef for Foo__Bar rather than two, one for Foo_Bar and the other
for Foo::Bar).

Impact is likely to be minimal. It will only affect classes:
- in C++ extensions (there is no way to make Foo::Bar *THIS compile in C)
- that use Foo::Bar only as a receiver (if they use it as a
  parameter/return value the typedef is already there)

Given that a class is always used as the return value in a normal
constructor, this case should be relatively rare.

Given this Foo.xs file:

    MODULE=Foo PACKAGE=Foo::Bar

    TYPEMAP: <<EOT
    TYPEMAP
    Foo::Bar * T_PTRREF
    EOT

    Foo::Bar *
    Foo::Bar::moo(Foo::Bar *foo)

the output of

     perl -Ilib lib/ExtUtils/xsubpp -prototypes Foo.xs \
        | grep -A8 moo | head -n 10

changes from:

    XS_EUPXS(XS_Foo__Bar_moo); /* prototype to pass -Wmissing-prototypes */
    XS_EUPXS(XS_Foo__Bar_moo)
    {
        dVAR; dXSARGS;
        if (items != 2)
            croak_xs_usage(cv,  "THIS, foo");
        {
            Foo::Bar *      THIS;
            Foo__Bar *      RETVAL;
            Foo__Bar *      foo;

to:

    XS_EUPXS(XS_Foo__Bar_moo); /* prototype to pass -Wmissing-prototypes */
    XS_EUPXS(XS_Foo__Bar_moo)
    {
        dVAR; dXSARGS;
        if (items != 2)
            croak_xs_usage(cv,  "THIS, foo");
        {
            Foo__Bar *      THIS;
            Foo__Bar *      RETVAL;
            Foo__Bar *      foo;

8 years agoperldelta: podlators upgrade to 4.04
James E Keenan [Sun, 3 Jan 2016 19:51:52 +0000 (14:51 -0500)]
perldelta: podlators upgrade to 4.04

8 years agoupdate podlators to 4.04
Karen Etheridge [Sun, 3 Jan 2016 19:05:17 +0000 (11:05 -0800)]
update podlators to 4.04

8 years agofix -DPERL_GLOBAL_STRUCT_PRIVATE builds
David Mitchell [Sun, 3 Jan 2016 19:34:26 +0000 (19:34 +0000)]
fix -DPERL_GLOBAL_STRUCT_PRIVATE builds

t/porting/libperl.t checks that, under -DPERL_GLOBAL_STRUCT_PRIVATE
builds, there are no bss symbols. This line in locale.c was failing that
test:

    static char ret[128] = "";

By changing it to

    static char ret[128] = "x";

it's no longer BSS data and the test passes.

Bit of a hack, but that function only exists in debugging builds, so it
doesn't matter too much.

8 years agoRemove unwarranted assertion in Perl_newATTRSUB_x()
Aaron Crane [Sun, 3 Jan 2016 14:29:43 +0000 (14:29 +0000)]
Remove unwarranted assertion in Perl_newATTRSUB_x()

RT #126845: if a stub subroutine definition with a prototype has been seen,
then any subsequent stub (or definition) of the same subroutine with an
attribute was causing an assertion failure because of a null pointer.

This assertion was added in 2eaf799e74b14dc77b90d5484a3fd4ceac12b46a, which
itself would already have triggered this assertion failure, even though all
subsequent uses of the pointer in question were guarded with non-null
conditions. So merely deleting the assertion is the right thing.

8 years ago*glob{FILEHANDLE} is no longer deprecated
Ricardo Signes [Fri, 1 Jan 2016 02:54:49 +0000 (21:54 -0500)]
*glob{FILEHANDLE} is no longer deprecated

We are now trying to use deprecation warnings only when we believe
that a behavior will really be removed or made an error.  Since we
don't plan to do that with *glob{FILEHANDLE}, the warning is not
useful and may be harmful.

See discussion at [perl #127060].

8 years agoUpdate podlators to version 4.03
Karen Etheridge [Sun, 20 Dec 2015 03:08:24 +0000 (19:08 -0800)]
Update podlators to version 4.03

8 years agopodcheck.t: tell the author where the problems db is located
Karen Etheridge [Fri, 1 Jan 2016 18:46:31 +0000 (10:46 -0800)]
podcheck.t: tell the author where the problems db is located

8 years agoRMG: fix typo, clarify instructions a bit
Karen Etheridge [Mon, 21 Dec 2015 05:22:08 +0000 (21:22 -0800)]
RMG: fix typo, clarify instructions a bit

8 years ago[PATCH] Try more crypt algorithms in the tests, for OpenBSD.
Andy Dougherty [Thu, 31 Dec 2015 14:01:06 +0000 (09:01 -0500)]
[PATCH] Try more crypt algorithms in the tests, for OpenBSD.

OpenBSD implements the Blowfish algorithm, but not the MD5 one used
by glibc.   Enhance the crypt and taint tests to try both algorithms.
If neither works, fall back to no algorithm.  The Blowfish salt
is taken from the OpenBSD crypt(3) page.

8 years agorelease schedule: add release managers for 2016Q1
Ricardo Signes [Tue, 29 Dec 2015 21:01:39 +0000 (16:01 -0500)]
release schedule: add release managers for 2016Q1

8 years agoFile::Find: update POD/comments
Lukas Mai [Mon, 28 Dec 2015 01:03:20 +0000 (02:03 +0100)]
File::Find: update POD/comments

- change double spaces to single spaces
- remove comment that got lost during the POD reshuffling in f4eedc6b8c8
  (and probably should have been a commit message in the first place)
- remove use of "EG:" that makes no sense to me
- remove reference to hints/machten.sh (removed in e94c1c0554 6 years
  ago)
- change L<The wanted function> to L</The wanted function> because
  that's what internal links should look like according to perlpod
- change S<_> to C<_> (it was S< _> originally but the space got lost
  during a revert, making S<> into a no-op (but why would you write
  S< _> in the first place?))
- link "taint-mode" to perlsec (probably only makes a difference in
  HTML, not man)
- various typo/grammar fixes
- teach podcheck.t about find(1)
- bump version

8 years agoperlpodspec: fix typo
Lukas Mai [Mon, 28 Dec 2015 00:58:50 +0000 (01:58 +0100)]
perlpodspec: fix typo

8 years agoregcomp.c: Add comment.
Karl Williamson [Sat, 26 Dec 2015 19:37:00 +0000 (12:37 -0700)]
regcomp.c: Add comment.

This should have been included in commit
285b5ca0145796a915dec03e87e0176fd4681041

8 years agoregexec.c: Avoid a function call
Karl Williamson [Sat, 26 Dec 2015 19:35:32 +0000 (12:35 -0700)]
regexec.c: Avoid a function call

Not infrequently, a UTF-8 string will contain ASCII.  In this case, by
adding a test for this we can avoid the function call that is needed for
more complicated cases.
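
The idea can be sketched in C as follows. This is an illustration of the technique only, not the code in regexec.c; both function names are hypothetical and the decoder assumes well-formed UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical general decoder for a multi-byte UTF-8 sequence.
 * Assumes the input is well formed. */
static uint32_t decode_multibyte(const unsigned char *s, size_t *len)
{
    uint32_t cp;
    size_t n, i;

    if ((s[0] & 0xE0) == 0xC0)      { cp = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; n = 3; }
    else                            { cp = s[0] & 0x07; n = 4; }
    for (i = 1; i < n; i++)
        cp = (cp << 6) | (s[i] & 0x3F);   /* append 6 payload bits */
    *len = n;
    return cp;
}

/* Fast path: a byte below 0x80 is a complete ASCII character, so it
 * is returned inline and the decoder is never called for it. */
static uint32_t next_code_point(const unsigned char *s, size_t *len)
{
    assert(s != NULL && len != NULL);
    if (s[0] < 0x80) {
        *len = 1;
        return s[0];
    }
    return decode_multibyte(s, len);
}
```

The single comparison costs little on the multi-byte path and saves a call on every ASCII byte.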

8 years agoregcomp.h: Remove extraneous 'struct's
Karl Williamson [Sat, 26 Dec 2015 19:34:07 +0000 (12:34 -0700)]
regcomp.h: Remove extraneous 'struct's

Better to not have this clutter.

8 years agoregcomp.h: Fix shift and mask
Karl Williamson [Sat, 26 Dec 2015 18:47:26 +0000 (11:47 -0700)]
regcomp.h: Fix shift and mask

The mask removed here was to make sure that right shifting didn't
propagate the sign bit, but is unnecessary as the value shifted is
unsigned.  And confining things to a U8 with that mask assumes that the
bit vector being operated on has 256 elements max.  This isn't
necessarily true these days, as one can change ANYOF_BITMAP_SIZE.
In fact changing that number was failing until this commit.

It also adds white space to make it easier to read.
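
A generic bit vector shows why the mask is unnecessary and why truncating the index to 8 bits is harmful. This is a sketch under assumptions: the type, the function names and the size of 512 are hypothetical and are not the macros from regcomp.h.

```c
#include <assert.h>

#define BITMAP_SIZE 512   /* hypothetical: more than 256 elements */

typedef struct {
    unsigned char bits[BITMAP_SIZE / 8];
} bitmap_t;

/* The index is unsigned, so >> cannot propagate a sign bit and the
 * byte index needs no mask.  Only the bit position within the byte
 * is masked, with & 7. */
static void bitmap_set(bitmap_t *bm, unsigned c)
{
    assert(c < BITMAP_SIZE);
    bm->bits[c >> 3] |= (unsigned char)(1U << (c & 7));
}

static int bitmap_test(const bitmap_t *bm, unsigned c)
{
    assert(c < BITMAP_SIZE);
    return (bm->bits[c >> 3] >> (c & 7)) & 1;
}
```

If the index were first confined to 8 bits, element 300 would alias element 44 (300 & 0xFF), which is the kind of failure a larger bitmap size exposes.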

8 years agoregcomp.h: Use more basic macro in #defines
Karl Williamson [Sat, 26 Dec 2015 18:28:09 +0000 (11:28 -0700)]
regcomp.h: Use more basic macro in #defines

Instead of having this code repeated in several places, call
the more base macro from the others.

8 years agoregcomp.h: Free up bit in ANYOF FLAGS field
Karl Williamson [Fri, 25 Dec 2015 05:42:08 +0000 (22:42 -0700)]
regcomp.h: Free up bit in ANYOF FLAGS field

I've long been confronted with trying to do things to create a spare bit
to use.  I thought it easier now, while it's fresh in my mind, to free
up one for future use, rather than re-learn things when it next becomes
necessary.  It would have been a different story if the freed bit had
required a performance penalty.

This commit also updates the comments about how to create even more
spare bits should it become necessary.

8 years agoregcomp.h: Shorten, clarify names of internal flags
Karl Williamson [Wed, 23 Dec 2015 19:43:30 +0000 (12:43 -0700)]
regcomp.h: Shorten, clarify names of internal flags

Some of the names are expanded slightly and not shortened

8 years agoAPItest.xs: Silence compiler warning on 32-bit machines
Karl Williamson [Wed, 23 Dec 2015 19:38:23 +0000 (12:38 -0700)]
APItest.xs: Silence compiler warning on 32-bit machines

One warning remains, otherwise things don't work.

8 years agomktables: Free up some memory after final use
Karl Williamson [Wed, 23 Dec 2015 19:30:40 +0000 (12:30 -0700)]
mktables: Free up some memory after final use

This may be enough for some platforms that aren't able to compile the
Unicode tables to work.  But it's quite late in the process.  The
ultimate solution would be for the tables to all be compiled ahead of
time.  That is under consideration for the future.

8 years agot/thread_it.pl: Increase stack size for AIX
Karl Williamson [Wed, 23 Dec 2015 18:29:08 +0000 (11:29 -0700)]
t/thread_it.pl: Increase stack size for AIX

This is enough to get the smoker to pass t/re/pat_thr.t

8 years agoUpdate release manager's guide
David Golden [Tue, 22 Dec 2015 20:49:17 +0000 (15:49 -0500)]
Update release manager's guide

8 years agoPATCH: [perl #126261]: Assertion failure on missing [ in qr//
Karl Williamson [Tue, 22 Dec 2015 03:52:50 +0000 (20:52 -0700)]
PATCH: [perl #126261]: Assertion failure on missing [ in qr//

This is the result of the regex compiler creating a temporary buffer to
parse a portion of the input pattern, and then when an error or warning
occurs in that buffer, trying to use addresses both inside it and the
original pattern.

The solution here is a general one, that confines the heavy lifting to
one macro, plus a little setup and tear-down around the use of the
temporary buffer.  The comments in the code detail how we relate the
address of the error in the temporary back to the parallel address in
the input pattern.

8 years agoregcomp.c: update RExC_start when parsing outside input
Karl Williamson [Tue, 22 Dec 2015 03:38:14 +0000 (20:38 -0700)]
regcomp.c: update RExC_start when parsing outside input

I noticed this while code reading.  In places, regcomp parses not the
input pattern but a temporary buffer it constructs, based on that input
pattern.  RExC_start should be updated so it always is pointing to the
same buffer as the parse pointer; otherwise segfaults can happen.

I have no idea how one currently can get into the situation this
protects against, so there are no tests added.

8 years agoregcomp.c: Add a stable pattern end pointer.
Karl Williamson [Tue, 22 Dec 2015 01:26:37 +0000 (18:26 -0700)]
regcomp.c: Add a stable pattern end pointer.

RExC_end is sometimes set during pattern compilation to point into
another string in memory.  Messages are output based on the original
string, so create an end pointer that is in terms of that original
string; otherwise we could get segfaults.

8 years agot/lib/warnings/regcomp: Fix typo in comment
Karl Williamson [Tue, 22 Dec 2015 01:18:36 +0000 (18:18 -0700)]
t/lib/warnings/regcomp: Fix typo in comment

8 years agoregcomp.c: Use macro instead of recalculating
Karl Williamson [Tue, 22 Dec 2015 00:56:13 +0000 (17:56 -0700)]
regcomp.c: Use macro instead of recalculating

There is a macro that does the job that this code does.  Use it.

8 years agoregcomp.c: Move calculations to common macro
Karl Williamson [Mon, 21 Dec 2015 04:48:04 +0000 (21:48 -0700)]
regcomp.c: Move calculations to common macro

This consolidates identical calculations into a single place, which
makes things easier to maintain.

Probably the reason they were previously dispersed is that the common
macro now has to evaluate the same expression more than once.  Since
the macro is used to return a list, it can't be turned into a single
statement.

Any decent optimizing compiler will extract the common subexpressions
and evaluate them just once.  But even if not, the macro is called only
in the event of a fatal error (in which case speed is not important), or
to raise a warning, which we expect to be rare, and the extra work is
negligible in comparison with what is needed to output the message.
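
A macro that "returns a list" can be sketched in C like this. The macro name, the helper and the message format are hypothetical, not the ones in regcomp.c. Because the macro expands to several comma-separated arguments rather than to one expression or statement, its parameters appear, and are evaluated, more than once.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro expanding to the three arguments consumed by the
 * format "%.*s <-- HERE %s".  `pat` and `off` are each evaluated
 * twice, so they should be free of side effects. */
#define HERE_ARGS(pat, off)  (int)(off), (pat), (pat) + (off)

/* Format PAT with a marker inserted after the first OFF bytes. */
static int format_here(char *out, size_t outlen, const char *pat, size_t off)
{
    assert(out != NULL && pat != NULL);
    assert(off <= strlen(pat));
    return snprintf(out, outlen, "%.*s <-- HERE %s", HERE_ARGS(pat, off));
}
```

Called with the pattern "ab[cd" and offset 3, the helper writes "ab[ <-- HERE cd" into the output buffer.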

8 years agoregcomp.h: reword some comments
Karl Williamson [Mon, 21 Dec 2015 20:37:20 +0000 (13:37 -0700)]
regcomp.h: reword some comments

8 years agoregcomp.c: Make some params to a static fcn const
Karl Williamson [Mon, 21 Dec 2015 21:47:05 +0000 (14:47 -0700)]
regcomp.c: Make some params to a static fcn const

This is just acting on the TODO comment.

8 years agoregcomp.c: Add 2 basic assertions
Karl Williamson [Fri, 20 Nov 2015 03:51:04 +0000 (20:51 -0700)]
regcomp.c: Add 2 basic assertions

These should be true because an SV* should always have a trailing NUL,
but a lot of things in this code depend on it.  It's worthwhile to point
that out; I wasn't sure it was true until I investigated.  And an
assert() makes sure it is really true
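
As an illustration of this kind of assertion, here is a C sketch of a scanner that depends on the byte just past the end of its buffer being NUL. The function is hypothetical and is not taken from regcomp.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner that uses the trailing NUL as its loop
 * sentinel instead of comparing against buf + len on every step. */
static size_t count_open_brackets(const char *buf, size_t len)
{
    size_t n = 0;
    const char *p;

    assert(buf != NULL);
    assert(buf[len] == '\0');   /* the invariant the loop relies on */

    for (p = buf; *p != '\0'; p++)
        if (*p == '[')
            n++;
    return n;
}
```

In a build with assertions enabled, a caller that passes a length that does not land on the terminator fails at the assert rather than somewhere harder to diagnose.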

8 years agopp_hot.c: Add assertion
Karl Williamson [Wed, 21 Oct 2015 04:23:00 +0000 (22:23 -0600)]
pp_hot.c: Add assertion

This will make the cause of any future failures more clear.

8 years agoperlapi: Clarify 'string' vs. buffer
Karl Williamson [Wed, 21 Oct 2015 04:21:42 +0000 (22:21 -0600)]
perlapi: Clarify 'string' vs. buffer

A string strictly is NUL terminated, but our terminology is lax

8 years agoutf8.h: Add 2 assertions
Karl Williamson [Wed, 21 Oct 2015 04:08:59 +0000 (22:08 -0600)]
utf8.h: Add 2 assertions

This makes sure in DEBUGGING builds that the macro is called correctly.

8 years agoControlled demolition, CoreList is 5.20151220
Chris 'BinGOs' Williams [Tue, 22 Dec 2015 14:29:48 +0000 (14:29 +0000)]
Controlled demolition, CoreList is 5.20151220

8 years agoDeprecate to_utf8_case()
Karl Williamson [Tue, 22 Dec 2015 04:29:12 +0000 (21:29 -0700)]
Deprecate to_utf8_case()

See http://nntp.perl.org/group/perl.perl5.porters/233287

8 years agoBump the perl version in various places for 5.23.7
David Golden [Mon, 21 Dec 2015 23:17:43 +0000 (18:17 -0500)]
Bump the perl version in various places for 5.23.7

8 years agoCreate new perldelta.pod for v5.23.7
David Golden [Mon, 21 Dec 2015 23:07:32 +0000 (18:07 -0500)]
Create new perldelta.pod for v5.23.7

8 years agoUpdated release schedule
David Golden [Mon, 21 Dec 2015 22:59:15 +0000 (17:59 -0500)]
Updated release schedule

8 years agoUpdated Porting/epigraphs.pod for v5.23.6
David Golden [Mon, 21 Dec 2015 22:58:32 +0000 (17:58 -0500)]
Updated Porting/epigraphs.pod for v5.23.6

8 years agoadd new release to perlhist v5.23.6
David Golden [Mon, 21 Dec 2015 18:37:03 +0000 (13:37 -0500)]
add new release to perlhist

8 years agoUpdate perldelta with additional module updates
David Golden [Mon, 21 Dec 2015 18:31:37 +0000 (13:31 -0500)]
Update perldelta with additional module updates

8 years agoUpdate perldelta with Module::CoreList version bump
David Golden [Mon, 21 Dec 2015 18:15:03 +0000 (13:15 -0500)]
Update perldelta with Module::CoreList version bump

8 years agoUpdate Module::CoreList from 5.23.6
David Golden [Mon, 21 Dec 2015 18:14:48 +0000 (13:14 -0500)]
Update Module::CoreList from 5.23.6

8 years agoUpdate perldelta to near-final state
David Golden [Mon, 21 Dec 2015 17:01:22 +0000 (12:01 -0500)]
Update perldelta to near-final state

8 years agoperldelta for case changing on caseless language
Karl Williamson [Mon, 21 Dec 2015 15:38:38 +0000 (08:38 -0700)]
perldelta for case changing on caseless language

8 years agoperldelta for -Dr fix
Karl Williamson [Mon, 21 Dec 2015 05:28:38 +0000 (22:28 -0700)]
perldelta for -Dr fix

8 years agoUpdate perldelta
David Golden [Mon, 21 Dec 2015 04:52:01 +0000 (23:52 -0500)]
Update perldelta

This commit adds various release notes covering:

* module updates
* documentation updates
* some bug fixes and internal changes

8 years agoCorrect perldelta typo
David Golden [Mon, 21 Dec 2015 02:16:19 +0000 (21:16 -0500)]
Correct perldelta typo

8 years agoAdd alternate email address for dagolden to checkAUTHORS.pl
David Golden [Mon, 21 Dec 2015 02:19:47 +0000 (21:19 -0500)]
Add alternate email address for dagolden to checkAUTHORS.pl

8 years agoperldelta for 18371617dfb (B::Deparse)
Lukas Mai [Mon, 21 Dec 2015 02:23:05 +0000 (03:23 +0100)]
perldelta for 18371617dfb (B::Deparse)

8 years agoDo not define invlistEQ in the re extension.
Craig A. Berry [Sun, 20 Dec 2015 16:12:36 +0000 (10:12 -0600)]
Do not define invlistEQ in the re extension.

Because it's already defined in regcomp.c and the VMS build was
failing with a linker error (multiply-defined symbol).

8 years agoregcomp.c: Skip some work
Karl Williamson [Sat, 19 Dec 2015 18:22:04 +0000 (11:22 -0700)]
regcomp.c: Skip some work

We can optimize ANYOF nodes that are equivalent to POSIX character
classes.  Discovering if they are equivalent takes work, which can be
skipped with a simple test that will rule out many run-of-the-mill
character classes.

8 years agoregcomp.c: White space only
Karl Williamson [Sat, 19 Dec 2015 18:19:35 +0000 (11:19 -0700)]
regcomp.c: White space only

Indent a section of code in preparation for the next commit which will
make it into a block.

8 years agoregcomp.c: Add comments
Karl Williamson [Sat, 19 Dec 2015 18:14:07 +0000 (11:14 -0700)]
regcomp.c: Add comments

8 years agomktables: Add "$0:" to its first output
Karl Williamson [Sat, 19 Dec 2015 16:49:00 +0000 (09:49 -0700)]
mktables: Add "$0:" to its first output

So in a make, it is abundantly clear where the messages are coming from

8 years agoregcomp.c: Silence uninit compiler warning
Karl Williamson [Sat, 19 Dec 2015 05:59:35 +0000 (22:59 -0700)]
regcomp.c: Silence uninit compiler warning

This shouldn't actually happen, and g++ under -O0 didn't flag it, but
gcc under -O2 does, so initialize to an illegal value.

8 years agoregcomp.c: Remove outdated comments
Karl Williamson [Sat, 19 Dec 2015 05:51:23 +0000 (22:51 -0700)]
regcomp.c: Remove outdated comments

These were invalidated by commit
709be747a32edc503b4645d9c5396bd4b40100d2

8 years agoFix -Dr problems.
Karl Williamson [Sat, 19 Dec 2015 05:04:20 +0000 (22:04 -0700)]
Fix -Dr problems.

Commits 108316fb65dc7243a1c5d87b4b29068b7d62d32e
and 5e85fd899767ba3003766fc9289c0ee2d8427d10
broke -Dr output in rare cases.

8 years agoperldelta for 572cd850,406d5545 (signbit)
Jarkko Hietaniemi [Fri, 18 Dec 2015 13:36:25 +0000 (08:36 -0500)]
perldelta for 572cd850,406d5545 (signbit)

8 years agoperldelta for the hexfp %a fixes.
Jarkko Hietaniemi [Fri, 18 Dec 2015 13:26:41 +0000 (08:26 -0500)]
perldelta for the hexfp %a fixes.

8 years agoperldelta for 3118d7d,74c6ce8,1f02ab1 (ppc64el fp)
Jarkko Hietaniemi [Fri, 18 Dec 2015 13:13:39 +0000 (08:13 -0500)]
perldelta for 3118d7d,74c6ce8,1f02ab1 (ppc64el fp)

8 years agoperldelta for 68bcb86 (openindiana: useshrplib for all solaris)
Jarkko Hietaniemi [Fri, 18 Dec 2015 13:12:57 +0000 (08:12 -0500)]
perldelta for 68bcb86 (openindiana: useshrplib for all solaris)

8 years agoConfigure: notes on the m68881 extended precision format
Jarkko Hietaniemi [Thu, 17 Dec 2015 02:57:31 +0000 (21:57 -0500)]
Configure: notes on the m68881 extended precision format

8 years agoDouble-double implementations differ.
Jarkko Hietaniemi [Fri, 18 Dec 2015 12:19:12 +0000 (07:19 -0500)]
Double-double implementations differ.

8 years agoOptimize some qr/[...]/ classes
Karl Williamson [Thu, 17 Dec 2015 17:22:44 +0000 (10:22 -0700)]
Optimize some qr/[...]/ classes

Bracketed character classes generally generate an ANYOF-type regnode,
which consists of a bitmap for the lower code points, and an inversion
list or swash to handle ones not in the bitmap.  They take up more
memory than other regnode types.  There are already some optimizations
that use a smaller and/or faster regnode instead.  For example, some
people prefer not to use a backslash to escape metacharacters, instead
writing something like /abc[.]def/.  This has for some time generated
the same thing as /abc\.def/ does, namely a single EXACT node, which is
both smaller and faster than an ANYOF node in the middle of two EXACT
nodes.

This commit adds some optimizations that hadn't been done previously.
Now things like /[\p{Word}]/ will optimize to \w, for example.  I had
not done this before, because my tests had shown very little performance
difference, but I had added most of the code to regcomp.c so it wouldn't
get lost, #ifdef'd out.

It turns out that I hadn't tested on code points above the bitmap, which
with this commit have a small, but appreciable speed up in matching, so
this commit enables and finishes that code.

Prior to this commit, things like /[[:word:]]/ were optimized to \w, but
things like /[_[:word:]]/ were not.  This commit fixes that.

If the following command is run on a perl compiled with -O2 and no
DEBUGGING:

    blead Porting/bench.pl --raw --benchfile=charclass_perf --perlargs=-Ilib /path_to_prior_perl="before this commit" /path_to_this_perl=after

and the file 'charclass_perf' contains

    [
        'regex::charclass::ascii' => {
            desc    => 'charclass, ascii range',
            setup   => 'my $a = qr/[\p{Word}]/',
            code    => '"A" =~ $a'
        },
        'regex::charclass::upper_latin1' => {
            desc    => 'charclass, upper latin1 range',
            setup   => 'my $a = qr/[\p{Word}]/',
            code    => '"\x{e0}" =~ $a'
        },
        'regex::charclass::above_latin1' => {
            desc    => 'charclass, above latin1 range',
            setup   => 'my $a = qr/[\p{Word}]/',
            code    => '"\x{100}" =~ $a'
        },
        'regex::charclass::high_Unicode' => {
            desc    => 'charclass, high Unicode code point',
            setup   => 'my $a = qr/[\p{Word}]/',
            code    => '"\x{10FFFF}" =~ $a'
        },
    ];

the following results are obtained:

The numbers represent raw counts per loop iteration.

regex::charclass::above_latin1
charclass, above latin1 range

       before this commit    after
       ------------------ --------
    Ir             3344.0   2888.0
    Dr              971.0    855.0
    Dw              604.0    541.0
  COND              575.0    504.0
   IND               25.0     25.0

COND_m               11.0     10.7
 IND_m               10.0     10.0

 Ir_m1                8.9      6.0
 Dr_m1                3.0      3.2
 Dw_m1                1.5      1.4

 Ir_mm                0.0      0.0
 Dr_mm                0.0      0.0
 Dw_mm                0.0      0.0

regex::charclass::ascii
charclass, ascii range

       before this commit    after
       ------------------ --------
    Ir             2661.0   2649.0
    Dr              798.0    795.0
    Dw              516.0    517.0
  COND              467.0    465.0
   IND               23.0     23.0

COND_m               10.0      8.8
 IND_m               10.0     10.0

 Ir_m1                7.9      0.0
 Dr_m1                2.9      3.1
 Dw_m1                1.3      1.3

 Ir_mm                0.0      0.0
 Dr_mm                0.0      0.0
 Dw_mm                0.0      0.0

regex::charclass::high_Unicode
charclass, high Unicode code point

       before this commit    after
       ------------------ --------
    Ir             3344.0   2888.0
    Dr              971.0    855.0
    Dw              604.0    541.0
  COND              575.0    504.0
   IND               25.0     25.0

COND_m               11.0     10.7
 IND_m               10.0     10.0

 Ir_m1                8.9      6.0
 Dr_m1                3.0      3.2
 Dw_m1                1.5      1.4

 Ir_mm                0.0      0.0
 Dr_mm                0.0      0.0
 Dw_mm                0.0      0.0

regex::charclass::upper_latin1
charclass, upper latin1 range

       before this commit    after
       ------------------ --------
    Ir             2661.0   2651.0
    Dr              798.0    796.0
    Dw              516.0    517.0
  COND              467.0    466.0
   IND               23.0     23.0

COND_m               11.0      8.8
 IND_m               10.0     10.0

 Ir_m1                7.9      0.0
 Dr_m1                2.9      3.3
 Dw_m1                1.5      1.2

 Ir_mm                0.0      0.0
 Dr_mm                0.0      0.0
 Dw_mm                0.0      0.0

8 years agoregcomp.h: Add comments
Karl Williamson [Wed, 16 Dec 2015 20:24:45 +0000 (13:24 -0700)]
regcomp.h: Add comments

8 years agoregex matching: Don't do unnecessary work
Karl Williamson [Wed, 16 Dec 2015 19:06:46 +0000 (12:06 -0700)]
regex matching: Don't do unnecessary work

This commit sets a flag at pattern compilation time to indicate if
a rare case is present that requires special handling, so that that
handling can be avoided unless necessary.
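
The general technique can be sketched in C with a toy matcher. Everything here is hypothetical: the flag name, the struct, and the choice of '%' as the "rare case" (a single-character wildcard) are stand-ins, not the regex engine's actual flag or condition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAT_HAS_RARE_CASE 0x01   /* hypothetical flag bit */

typedef struct {
    const char *text;
    unsigned flags;
} pattern_t;

/* "Compile": scan the pattern once and record whether the rare case
 * (here, a '%' wildcard) occurs anywhere in it. */
static pattern_t pattern_compile(const char *text)
{
    pattern_t p;

    assert(text != NULL);
    p.text = text;
    p.flags = strchr(text, '%') != NULL ? PAT_HAS_RARE_CASE : 0;
    return p;
}

/* "Match": take the slower wildcard-aware loop only when the flag
 * says the pattern needs it. */
static int pattern_match(const pattern_t *p, const char *s)
{
    size_t i;

    if (!(p->flags & PAT_HAS_RARE_CASE))
        return strcmp(p->text, s) == 0;      /* common, cheap path */

    for (i = 0; p->text[i] != '\0'; i++) {
        if (s[i] == '\0')
            return 0;                        /* subject too short */
        if (p->text[i] != '%' && p->text[i] != s[i])
            return 0;
    }
    return s[i] == '\0';                     /* lengths must agree */
}
```

The scan for the rare case happens once per pattern, while the saving applies to every match attempt.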

8 years agoregcomp.h: Renumber 2 flag bits
Karl Williamson [Wed, 16 Dec 2015 18:40:18 +0000 (11:40 -0700)]
regcomp.h: Renumber 2 flag bits

This changes the spare bit to be adjacent to the LOC_FOLD bit, in
preparation for the next commit, which will use that bit for a
LOC_FOLD-related use.

8 years agoregex: Free an ANYOF node bit
Karl Williamson [Wed, 16 Dec 2015 18:05:17 +0000 (11:05 -0700)]
regex: Free an ANYOF node bit

This is done by combining 2 mutually exclusive bits into one.  I hadn't
seen this possibility before because the name of one of them misled me.
It also misled me into turning on that flag unnecessarily, and to
miss opportunities to not have to create a swash at runtime.  This
commit corrects those things as well.

8 years agoregcomp.c: Move comments adjacent to their object
Karl Williamson [Wed, 16 Dec 2015 05:42:18 +0000 (22:42 -0700)]
regcomp.c: Move comments adjacent to their object

8 years agoregcomp.c: Try simplifications in some qr/[...]/d
Karl Williamson [Wed, 16 Dec 2015 05:20:20 +0000 (22:20 -0700)]
regcomp.c: Try simplifications in some qr/[...]/d

Characters in a bracketed character class can come from a bunch of
sources, all bundled together.  Some things under /d match only when the
target string is UTF-8; some match only when it isn't UTF-8.  Other
sources may introduce ones that match regardless.  It may be that some
things are specified as conditionally matching from one source, and as
unconditionally matching from another.  We can subtract the
unconditionals from the conditionals, leaving a simpler set of things
that must be conditionally matched.  In some cases, the conditional set
may go to zero, allowing other optimizations to happen that otherwise
couldn't.  An example is

    qr/[\W\xAB]/

which before this commit compiled to:

    ANYOFD[^0-9A-Z_a-z\x{80}-\x{AA}\x{AC}-\x{FF}][{non-utf8-latin1-all}
    {utf8}0080-00A9 00AC-00B4 00B6-00B9 00BB-00BF 00D7 00F7
    02C2-02C5...] (12)

and after it, compiles to

    ANYOFD[^0-9A-Z_a-z\x{AA}\x{B5}\x{BA}\x{C0}-\x{D6}\x{D8}-\x{F6}
    \x{F8}-\x{FF}][{non-utf8-latin1-all}{utf8}02C2-02C5...] (12)

Notice that the {utf8} component has been stripped of everything below
256.  That means no swash has to be created at runtime when matching
code points below 256, unlike the case before this commit.

A starker example, though unlikely in real life except in
machine-generated code, is

    qr/[\w\W]/

Before this commit, it would generate:

    ANYOFD[\x{00}-\x{7F}][{non-utf8-latin1-all}{above_bitmap_all}
    {utf8}0080-00FF]

and afterwards, simply:

    SANY
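
The subtraction of the unconditional matches from the conditional ones is a set difference. The following C sketch shows it on a 256-bit bitmap; the type and function names are hypothetical, and regcomp.c works with its own data structures rather than this one.

```c
#include <assert.h>
#include <stdint.h>

#define SET_WORDS 4   /* 4 x 64 bits: one bit per code point 0..255 */

typedef struct { uint64_t w[SET_WORDS]; } cpset_t;

static void cpset_add(cpset_t *s, unsigned cp)
{
    assert(cp < 64 * SET_WORDS);
    s->w[cp >> 6] |= (uint64_t)1 << (cp & 63);
}

/* Remove from `conditional` every code point that already matches
 * unconditionally.  Returns 1 if the conditional set ends up empty,
 * which is the case that allows further simplification. */
static int cpset_subtract(cpset_t *conditional, const cpset_t *unconditional)
{
    int empty = 1;
    unsigned i;

    for (i = 0; i < SET_WORDS; i++) {
        conditional->w[i] &= ~unconditional->w[i];
        if (conditional->w[i] != 0)
            empty = 0;
    }
    return empty;
}
```

With 0xAB and 0xE0 in the conditional set and only 0xAB in the unconditional set, the function leaves 0xE0 behind and returns 0; once 0xE0 is also unconditional, it returns 1.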

8 years agoregcomp.c: Change variable name to be clearer
Karl Williamson [Wed, 16 Dec 2015 04:46:42 +0000 (21:46 -0700)]
regcomp.c: Change variable name to be clearer

This name confused me, and led to suboptimal code.  The new name is more
cumbersome, but won't confuse (at least it won't confuse me).

8 years agoConfigure: grep -q is not portable
Jarkko Hietaniemi [Thu, 17 Dec 2015 01:19:03 +0000 (20:19 -0500)]
Configure: grep -q is not portable

It does not work in SysV (solaris) or old BSD greps.

8 years agoRevert "Upgrade Socket from 2.020 to 2.021"
Steve Hay [Thu, 17 Dec 2015 11:08:16 +0000 (11:08 +0000)]
Revert "Upgrade Socket from 2.020 to 2.021"

This reverts commit 0bd66ca801c5fb84ee6a8feeb8114f0d8248029f.

Worked for me, but Jenkins isn't happy :-(

8 years agoUpdate META.yml following commit 0d99ea0387
Steve Hay [Thu, 17 Dec 2015 10:55:40 +0000 (10:55 +0000)]
Update META.yml following commit 0d99ea0387