This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
Sawyer X [Sat, 20 Feb 2016 19:46:45 +0000 (20:46 +0100)]
Typos, POD errors, etc.
Sawyer X [Sat, 20 Feb 2016 19:22:02 +0000 (20:22 +0100)]
Update perldelta module versions (Porting/corelist-perldelta.pl)
Sawyer X [Sat, 20 Feb 2016 18:52:03 +0000 (19:52 +0100)]
update Module::CoreList (Porting/corelist.pl)
Sawyer X [Sat, 20 Feb 2016 17:22:33 +0000 (18:22 +0100)]
cleanup perldelta
Sawyer X [Sat, 20 Feb 2016 14:31:02 +0000 (15:31 +0100)]
Document 38e3b24
Sawyer X [Fri, 19 Feb 2016 23:20:38 +0000 (00:20 +0100)]
Document e57270be442bfaa9dc23eebd67485e5a806b44e3:
I wasn't sure where or how much of it to document, but it seems
like it's important, and this relating to permissions not being
removed, I consider it security-related. This is similar to what
the original Debian ticket that relates to it mentioned.
I've cut Niko's text a bit shorter, taken from the commit message
itself.
Sawyer X [Thu, 18 Feb 2016 16:30:07 +0000 (17:30 +0100)]
Sawyer X [Wed, 17 Feb 2016 18:15:58 +0000 (19:15 +0100)]
Update perldelta.pod:
This includes most changes I've noticed. I ran through all
commits since the last commit Stevan did. There are some commits
there which I haven't reflected, and I am not sure whether they should
be, but I will take that up with their respective authors.
Jarkko Hietaniemi [Thu, 18 Feb 2016 02:04:18 +0000 (21:04 -0500)]
Upgrade to IPC-SysV 2.05.
Craig A. Berry [Fri, 19 Feb 2016 21:33:10 +0000 (15:33 -0600)]
Cast PL_dump_re_max_len to avoid type mismatch warning.
Specifically this one on VMS:
|| ! grok_atoUV(dump_len_string, &PL_dump_re_max_len, NULL))
.....................................^
%CC-W-PTRMISMATCH, In this statement, the referenced type of the
pointer value "&(my_perl->Idump_re_max_len)" is "unsigned int",
which is not compatible with "unsigned long".
This was new code in 2bfbbbaf9ef1783ba.
Karl Williamson [Fri, 19 Feb 2016 18:16:06 +0000 (11:16 -0700)]
regcomp.c: White-space only
Re-indent and reflow to fit in 80 cols after previous commit
Karl Williamson [Fri, 19 Feb 2016 18:11:40 +0000 (11:11 -0700)]
regcomp.c: Can't do optimization if inverting
Something like /[^\W_0-9]/ was getting optimized into something it
shouldn't have been. Although, the way the execution code is
structured, I couldn't find a case where it actually made a difference.
So skip the optimization if inverting.
Karl Williamson [Thu, 18 Feb 2016 21:51:01 +0000 (14:51 -0700)]
regcomp.c: Use colors for -Dr metanotation
Use a different color for the metanotation than the code points being
output under Debugcolor. The default is to have stand out mode for the
code points, and not for the meta.
Karl Williamson [Thu, 18 Feb 2016 21:44:02 +0000 (14:44 -0700)]
regcomp.c: Backslash {} in -Dr output
This is because it could otherwise be confused with the meta notation
used there.
Karl Williamson [Thu, 18 Feb 2016 21:43:14 +0000 (14:43 -0700)]
Revamp -Dr handling of /[...]/
This revamps the handling of -Dr for bracketed character classes. There
were bugs introduced earlier in 5.23, and this consolidates the handling
of /d classes so that the interactions can be better considered. It
tries inverting the portion that is in the bitmap range to see if the
output is shorter, and clearer that way. And it always makes the
above-bitmap code points show as not-inverted, as that is clearer.
I ran out of time before the freeze, so I had to not invert in some
cases.
Karl Williamson [Fri, 19 Feb 2016 04:47:15 +0000 (21:47 -0700)]
Add environment variable for -Dr: PERL_DUMP_RE_MAX_LEN
The regex engine when displaying debugging info, say under -Dr, will elide
data in order to keep the output from getting too long. For example,
the number of code points in all of Unicode matched by \w is quite
large, and so when displaying a pattern that matches this, only the
first some number of them are printed, and the rest are truncated,
represented by "...".
Sometimes, one wants to see more than what the
compiled-into-the-engine-max shows. This commit creates code to read
this environment variable to override the default max lengths. This
changes the lengths for everything to the input number, even if they
have different compiled maximums in the absence of this variable.
I'm not currently documenting this variable, as I don't think it works
properly under threads, and we may want to alter the behavior in various
ways as a result of gaining experience with using it.
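A minimal sketch of the override idea, not the actual regcomp.c code. The helper name effective_max_len is hypothetical, strtoul() stands in for the core's own number parser, and falling back to the compiled default on unparsable input is an assumption of this sketch.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: decide which display limit to use.
 * env_value is the raw string from the environment (may be NULL);
 * compiled_default is the built-in maximum for this kind of output. */
static unsigned long
effective_max_len(const char *env_value, unsigned long compiled_default)
{
    char *end;
    unsigned long n;

    if (env_value == NULL || *env_value == '\0')
        return compiled_default;

    /* strtoul() accepts leading whitespace and signs; require a digit. */
    if (*env_value < '0' || *env_value > '9')
        return compiled_default;

    errno = 0;
    n = strtoul(env_value, &end, 10);
    if (errno != 0 || *end != '\0')     /* overflow or trailing junk */
        return compiled_default;

    return n;   /* one number overrides every compiled maximum */
}
```

A caller would pass something like effective_max_len(getenv("PERL_DUMP_RE_MAX_LEN"), 60), where 60 is only an illustrative default.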
Karl Williamson [Mon, 15 Feb 2016 23:27:20 +0000 (16:27 -0700)]
regcomp.c: Save a branch test
This branch will only be true if the answer to the previous branch was
also true, so can just move it to within that to avoid an unnecessary
test.
Karl Williamson [Mon, 15 Feb 2016 23:20:43 +0000 (16:20 -0700)]
regcomp.c: Clarify -Dr output under /l
It is now redundant to indicate that an ANYOF node is for locale, as the
regnode type ANYOFL now clearly indicates that. But also sometimes the
node is only valid if the runtime locale is a UTF-8 one. That was not
clearly indicated.
Karl Williamson [Sun, 14 Feb 2016 01:00:36 +0000 (18:00 -0700)]
regcomp.c: Comments, white-space, add grouping () for clarity
Karl Williamson [Fri, 19 Feb 2016 04:43:14 +0000 (21:43 -0700)]
Add a parameter to a static function
This parameter will be used in a future commit; it changes the output
format of this function that displays the contents of an inversion list
so that it won't have to be parsed later, simplifying the code at that
time.
Karl Williamson [Fri, 19 Feb 2016 04:36:04 +0000 (21:36 -0700)]
Change private function to static
This function was used outside the file it contains, but was only
defined (by #ifdef's) for those few internal core files for which it was
needed. Now all those uses have gone, save for the one file. Better to
make it static so no one can circumvent those #ifdef's.
Karl Williamson [Fri, 19 Feb 2016 04:08:24 +0000 (21:08 -0700)]
regcomp.c: Change structure element size and loc
The 'strict' field is only a bool, but was declared I32, which led to
warnings on some compilers when it was passed to a function expecting a
bool. It is moved to the end of the structure, since it doesn't pack
well with the rest.
Karl Williamson [Thu, 18 Feb 2016 04:24:37 +0000 (21:24 -0700)]
regcomp.c: Move static declaration to file level
This array will be used in a future commit outside the function it
previously was declared in
Karl Williamson [Thu, 18 Feb 2016 04:17:46 +0000 (21:17 -0700)]
regcomp.c: optimization for qr/[...]/il
Certain matches are calculated as being legal only when the current
execution time locale is a UTF-8 one. However, a character class can
have multiple components (and usually does), and some of those components
may be duplicates of some of these matches, and be valid regardless of
the locale. This commit removes them from the tentative list, and if it
goes to zero, clears it. This will improve execution time slightly.
Karl Williamson [Thu, 18 Feb 2016 04:15:52 +0000 (21:15 -0700)]
regcomp.c: Avoid a segfault
I stumbled across this in adding more code elsewhere, so I don't know
how to trigger it. This is in the intersection routine for two
inversion lists. The corresponding union code correctly handles the
case when the input is NULL, so just copy that to here.
Karl Williamson [Sat, 13 Feb 2016 22:35:11 +0000 (15:35 -0700)]
PATCH: [perl 127537] /\W/ regression with UTF-8
This bug is apparently uncommon in the field, as I was the one who
discovered it. It requires a UTF-8 pattern containing a complemented
posix class, like \W or \S, in an inverted character class, like
[^\Wfoo] in a pattern that also has a synthetic start class generated by
the regex optimizer for it.
The fix is trivial.
Karl Williamson [Sat, 13 Feb 2016 18:53:50 +0000 (11:53 -0700)]
regcomp.c, toke.c: swap functions being inline static
grok_bslash_x() is so large that no compiler will inline it. Move it to
dquote.c from dq_inline.c. Conversely, move form_octal_warning() to
dq_inline.c. It is so tiny that the function call overhead is scarcely
smaller than the function body.
This also moves things in embed.fnc so all these functions are not
visible outside the few files they are supposed to be used in.
Karl Williamson [Mon, 15 Feb 2016 18:02:07 +0000 (11:02 -0700)]
Cast correctly to U8, not char
U8 is what the function being called is expecting
Karl Williamson [Tue, 16 Feb 2016 03:59:10 +0000 (20:59 -0700)]
perlapi: Hide the swash functions
These should be internal only, and we may want to get rid of them
someday. Hide their existence so that people who don't already know
about them won't be tempted to try to use them.
Karl Williamson [Tue, 16 Feb 2016 03:32:32 +0000 (20:32 -0700)]
regcomp.h: Not all ANYOF flags are in use.
So, it's better to not have a mask to include the unused ones.
Karl Williamson [Sun, 14 Feb 2016 00:21:28 +0000 (17:21 -0700)]
regcomp.c: Simplify a few lines of code
This code had been written before the isMNEMONIC_CNTRL() macro was
created. Using the macro simplifies things a little.
Karl Williamson [Sat, 13 Feb 2016 22:51:50 +0000 (15:51 -0700)]
regcomp.c: Clean up logic in function
This function uses some crude heuristics to decide whether to make a
synthetic start class or not. This commit removes some redundancies.
Karl Williamson [Thu, 11 Feb 2016 17:25:04 +0000 (10:25 -0700)]
regcomp.c: -Dr \xZZ instead of \x{ZZ}
The brackets are unnecessary and clutter the output.
Karl Williamson [Thu, 11 Feb 2016 17:12:57 +0000 (10:12 -0700)]
regcomp.c: Fix -Dr bug
It was using a wrong length calculation, which under some circumstances
caused the output to include extra bytes. Also I added comments, and
changed a variable name, so I don't have to figure this out again from
scratch.
Karl Williamson [Mon, 15 Feb 2016 18:04:36 +0000 (11:04 -0700)]
regcomp.c: Use macro to hide complexity
There is an existing macro that does these three lines in one source
line.
Karl Williamson [Sat, 13 Feb 2016 20:49:00 +0000 (13:49 -0700)]
Don't allow /\N{}/ under 're strict'
This is the one remaining empty {} that was accepted under the
experimental 'use re "strict"'.
Karl Williamson [Sat, 13 Feb 2016 22:20:49 +0000 (15:20 -0700)]
perlrecharclass: Add some missing info
Tom Hukins [Wed, 17 Feb 2016 15:04:10 +0000 (15:04 +0000)]
Remove an unused variable
Other Time::HiRes test scripts define and use $limit to cope with timing
on heavily loaded systems.
This test script defined the variable but never used it.
Daniel Dragan [Thu, 18 Feb 2016 00:15:39 +0000 (11:15 +1100)]
[perl #127556] update installperl to new location of W32 libperl link lib
commit bf543eaf90 made the Win32 GCC or VC linkers produce
[lib]perl[5xx].[a/lib] in the /lib/CORE dir to reduce the prereq recipe
lines needing to run until XS modules can be built ("Extensions" which
builds all DLL XS modules is the longest running target and every effort
should be made for it to be started sooner by the make tool in parallel
build). The file is now made in /lib/CORE, previously it was made in root
and xcopy-ed to /lib/CORE in the same target that built the file. xcopy is
a separate process run, so it was removed in that commit.
installperl doesn't use uninstalled /lib/CORE to determine the contents of
installed /lib/CORE (maybe that is a bug or bad design?), so the linking
lib was not being installed after a "[g/d]make install" making it
impossible to compile XS code on Win32 Perl. Change installperl
to look for the linking lib in /lib/CORE on Win32 and not in root. Even
though the nmake makefile still does the XCOPY since it is older/less
maintained, the installperl code still works since the root and /lib/CORE
files are identical on the nmake build and built in the same target.
James E Keenan [Wed, 17 Feb 2016 23:50:57 +0000 (18:50 -0500)]
Jarkko Hietaniemi [Wed, 17 Feb 2016 14:40:00 +0000 (09:40 -0500)]
Time::HiRes version bump.
Jarkko Hietaniemi [Wed, 17 Feb 2016 14:37:25 +0000 (09:37 -0500)]
Allow TIME_HIRES_DONT_RUN_PROBES=1 to aid cross-compiling
If that is true, the probes are compiled but not run.
https://rt.cpan.org/Ticket/Display.html?id=111391
Patch kindly supplied by Niko Tyni.
Karl Williamson [Tue, 16 Feb 2016 19:08:20 +0000 (12:08 -0700)]
t/re/reg_mesg.t: Add a couple of tests
Daniel Dragan [Thu, 11 Feb 2016 10:48:58 +0000 (05:48 -0500)]
minor comment improvements in hv.h and scope.h
-perl doesn't use malloc, it uses Newx (per interp memory)
-say what the return type is of SSNEW
Lukas Mai [Mon, 15 Feb 2016 20:17:18 +0000 (21:17 +0100)]
Revert "tweak NOT_REACHED in DEBUGGING builds"
This reverts commit 5b48e25f83f62f48ea280c49b00302e063384348.
The above commit breaks win32 builds:
IO.xs(73) : error C2065: 'my_perl' : undeclared identifier
IO.xs(73) : error C2223: left of '->IProc' must point to struct/union
where dist/IO/IO.xs contains:
69: static int
70: not_here(const char *s)
71: {
72: croak("%s not implemented on this architecture", s);
73: NORETURN_FUNCTION_END;
74: }
and perl.h contains:
# define NORETURN_FUNCTION_END NOT_REACHED;
David Mitchell [Mon, 15 Feb 2016 15:37:33 +0000 (15:37 +0000)]
doop.c: fix typo in header comment
Tony Cook [Mon, 15 Feb 2016 02:05:18 +0000 (13:05 +1100)]
perldelta: move the two Win32 gmake improvements to where they belong
Daniel Dragan [Sat, 13 Feb 2016 09:05:24 +0000 (04:05 -0500)]
fix win32 gmake with win64 VC with 32 bit GCC in PATH build failure
The assignment of PROCESSOR_ARCHITEW6432 to PROCESSOR_ARCHITECTURE near
the "When we are running from a 32bit cmd.exe on AMD64 then" comment
doesn't happen if WIN64 var was already assigned to. Do the 32/64 auto
detection only for GCC builds, not for VC builds. I am not implementing 32/64
and cl version (CCTYPE setting) detection by parsing stdout of "cl<enter>"
with batch and gmake syntax at this time.
failure message:
generate_uudmap.obj : fatal error LNK1112: module machine type 'x64'
conflicts with target machine type 'X86'
GNUmakefile:1416: recipe for target '..\generate_uudmap.exe' failed
gmake: *** [..\generate_uudmap.exe] Error 2
Jarkko Hietaniemi [Sun, 14 Feb 2016 00:34:20 +0000 (19:34 -0500)]
Skip the length sanity check if d_name is pointer or less.
[perl #127511] v5.23.7-308-g1d41bb7 broke t/op/threads-dirh.t on solaris threaded builds
In other words, skip it if the dirent->d_name is a pointer (char *)
or less (struct-final char d_name[1], as it seems to be in Solaris).
The length sanity check is meant for places where the d_name is
a true array.
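A rough sketch of the "pointer or less" test, assuming it can be expressed with sizeof. The three struct layouts and the macro name are hypothetical and only illustrate the cases named above; they are not the real dirent definitions or the code in the commit.

```c
#include <stddef.h>

/* Three hypothetical dirent-like layouts, for illustration only. */
struct ent_array   { long ino; char d_name[256]; };  /* a true array       */
struct ent_pointer { long ino; char *d_name;     };  /* a pointer          */
struct ent_final1  { long ino; char d_name[1];   };  /* struct-final char  */

/* Nonzero only when sizeof(d_name) is larger than a pointer, i.e. when
 * it plausibly describes the real storage available for the name.
 * sizeof does not evaluate its operand, so the null pointer is never
 * dereferenced. */
#define D_NAME_IS_TRUE_ARRAY(type) \
    (sizeof(((type *)0)->d_name) > sizeof(char *))
```

Only when the macro is true does it make sense to compare the name's length against sizeof(d_name).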
Follow-up to 1d41bb72.
Craig A. Berry [Sat, 13 Feb 2016 22:29:14 +0000 (16:29 -0600)]
DynaLoader shouldn't use mod2fname when finding .bs files.
Some platforms (probably only VMS and Android at the moment) take
special steps via the function DynaLoader::mod2fname to construct
a dynamic library name that will be unique and (if too long)
truncated. Then DynaLoader looks for a bootstrap file with the
exact same name as the dynamic library except with a .bs file
extension.
However, ExtUtils::MakeMaker has never produced bootstrap files
that have been run through mod2fname, so while a Foo::Bar extension
would produce a loadable library named PL__Foo_Bar.exe, the
bootstrap would be called Bar.bs. That shouldn't be a problem
since the bootstrap file is just Perl code read by Perl, but
DynaLoader has (apparently forever) been looking for
PL__Foo_Bar.bs and not finding it. So let's look for it by the
name under which it actually exists.
There are no core extensions that produce non-empty bootstrap
files and no existing test coverage, but as-yet-unintegrated
versions of MakeMaker do have such tests. See, for example,
https://github.com/Perl-Toolchain-Gang/ExtUtils-MakeMaker/commit/7f5e9a35addeea7ebfcded28277c85f723e1a5de
Jarkko Hietaniemi [Sat, 13 Feb 2016 23:26:09 +0000 (18:26 -0500)]
Solaris /usr/bin/sed cannot handle labels of length eight.
You will just get a bunch of these from makedepend:
Finding dependencies for av.o.
Label too long: : testcont
According to POSIX, it should, eight being the minimum supported length.
Shoulda woulda coulda. Label of length seven seems to be fine.
Lukas Mai [Sat, 13 Feb 2016 21:11:59 +0000 (22:11 +0100)]
tweak NOT_REACHED in DEBUGGING builds
Previously, NOT_REACHED was defined as ASSUME(0), which is assert(0) in
DEBUGGING builds.
On some platforms (HP-UX/IA64, others?) gcc doesn't know that assert(0)
never returns. We have some code of the form (simplified):
int x;
switch (A) {
case B: x = 0; break;
case C: x = 1; break;
default: NOT_REACHED;
}
return x;
Now gcc thinks control can pass through the default branch and hit the
'return'. This triggers a warning of the form
warning: 'x' may be used uninitialized in this function
[-Wmaybe-uninitialized]
This commit defines NOT_REACHED as abort() in DEBUGGING builds.
Hopefully every compiler knows that abort() can't return, thus getting
rid of the bogus warnings.
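The simplified switch above, written out as a compilable sketch. MY_NOT_REACHED, enum kind and classify() are hypothetical names standing in for the real macro and callers; the sketch assumes the platform headers declare abort() as non-returning, which is what lets the compiler see that the default branch cannot reach the return.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the DEBUGGING definition described above. */
#define MY_NOT_REACHED abort()

enum kind { KIND_B, KIND_C };

static int
classify(enum kind a)
{
    int x;

    switch (a) {
    case KIND_B: x = 0; break;
    case KIND_C: x = 1; break;
    default:     MY_NOT_REACHED;  /* control never continues past here */
    }
    return x;   /* x is assigned on every path that reaches this line */
}
```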
Jarkko Hietaniemi [Sat, 13 Feb 2016 14:55:01 +0000 (09:55 -0500)]
Time::HiRes version bump.
Jarkko Hietaniemi [Sat, 13 Feb 2016 16:38:30 +0000 (11:38 -0500)]
Whitespace only: zap EOL spaces
Jarkko Hietaniemi [Sat, 13 Feb 2016 14:50:03 +0000 (09:50 -0500)]
Add the Time-HiRes Changes file from CPAN.
Jarkko Hietaniemi [Thu, 14 Jan 2016 15:31:01 +0000 (10:31 -0500)]
Add caveat on the clock_getres() resolution
Jarkko Hietaniemi [Sat, 13 Feb 2016 15:52:46 +0000 (10:52 -0500)]
Mention the OS X get_clock...() emulations.
Jarkko Hietaniemi [Thu, 14 Jan 2016 02:47:44 +0000 (21:47 -0500)]
OS X clock_nanosleep() emulation
https://developer.apple.com/library/ios/technotes/tn2169/_index.html
"High Precision Timers in iOS / OS X"
Returning the unspent time in the case of relative sleep is not implemented.
Jarkko Hietaniemi [Thu, 14 Jan 2016 13:06:26 +0000 (08:06 -0500)]
OS X clock_gettime() and clock_getres() emulation
Note that CLOCK_REALTIME and CLOCK_MONOTONIC are the same clock,
so both are monotonic.
The difference is that the CLOCK_REALTIME is offset by the
gettimeofday() result on the first use of these interfaces,
and thereafter will closely track the gettimeofday() values.
https://developer.apple.com/library/mac/qa/qa1398/_index.html
"Mach Absolute Time Units"
Jarkko Hietaniemi [Thu, 14 Jan 2016 00:11:57 +0000 (19:11 -0500)]
Add the new Time::HiRes constants to @EXPORT_OK.
Jarkko Hietaniemi [Wed, 13 Jan 2016 18:06:28 +0000 (13:06 -0500)]
Add FreeBSD specific clock_gettime() constants.
https://www.freebsd.org/cgi/man.cgi?query=clock_gettime
Jarkko Hietaniemi [Wed, 13 Jan 2016 18:03:07 +0000 (13:03 -0500)]
Add Linux-specific clock_gettime() constants.
http://man7.org/linux/man-pages/man2/clock_gettime.2.html
Jarkko Hietaniemi [Wed, 13 Jan 2016 18:01:35 +0000 (13:01 -0500)]
Sort the Time::HiRes constants, one per line
Craig A. Berry [Sat, 13 Feb 2016 15:47:11 +0000 (09:47 -0600)]
Make File::Spec::VMS->abs2rel handle Unix-format input.
We had been living under the illusion that when passed Unix-format
input, this routine could just punt to File::Spec::Unix->abs2rel.
However, the latter calls canonpath, which returns native specs,
and we ended up mixing native semantics with Unix-format
semantics and got nonsense.
For example, abs2rel('/d1/foo/bar.pl') could become '[bar.pl]'.
So instead we now follow the same basic logic regardless of input
format and there are tests to make sure abs2rel works with both.
Karl Williamson [Fri, 12 Feb 2016 03:24:37 +0000 (20:24 -0700)]
Remove POSIX isfoo() as scheduled
The functions like isalnum() have been scheduled for removal in 5.24.
This does that.
David Mitchell [Thu, 11 Feb 2016 17:31:11 +0000 (17:31 +0000)]
run regen_perly.pl
David Mitchell [Wed, 10 Feb 2016 10:53:03 +0000 (10:53 +0000)]
regen_perly.pl: improve action extracting
The regex was sometimes missing final cases from the big
action switch.
This simplifies the regex, but assumes that 'default: break;' is the last
case. This is the case in bison 2.7 and 3.0.2.
David Mitchell [Wed, 10 Feb 2016 10:21:08 +0000 (10:21 +0000)]
regen_perly.pl: print command with -v
when run verbose, print the bison command that is run
H.Merijn Brand [Thu, 11 Feb 2016 07:33:40 +0000 (08:33 +0100)]
Updated outdated link to smoke reports for HP-UX
Karl Williamson [Wed, 10 Feb 2016 22:05:45 +0000 (15:05 -0700)]
regcomp.c: Clarify error message
It is an error to specify an empty Unicode property name, like in
qr/\p{}/. It also is illegal to just say qr/\p/. Prior to this commit
the error message for that latter construct misleadingly referred to
braces. Since there are no braces in the input, they shouldn't be
mentioned.
Karl Williamson [Wed, 10 Feb 2016 21:25:31 +0000 (14:25 -0700)]
t/re/regex_sets.t: Add some tests
Karl Williamson [Wed, 10 Feb 2016 21:21:24 +0000 (14:21 -0700)]
sv.c: Handle radix being multi-byte and not UTF-8
While reviewing this code, I realized that the decimal point could
legally be a sequence of characters, not just a single one. I don't
know of any cases of that happening, but it's easy to handle that
possibility.
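A small sketch of matching a radix that may be longer than one byte. The helper name radix_len_at is hypothetical and this is not the sv.c code; it only shows comparing the whole radix string rather than a single character.

```c
#include <string.h>

/* Hypothetical helper: if the text starting at s (and ending just
 * before send) begins with the radix string, return the radix length
 * in bytes; otherwise return 0. */
static size_t
radix_len_at(const char *s, const char *send, const char *radix)
{
    size_t len = strlen(radix);

    if (len == 0 || (size_t)(send - s) < len)
        return 0;                       /* not enough input left */
    return memcmp(s, radix, len) == 0 ? len : 0;
}
```

The caller advances by the returned length instead of assuming the decimal point occupies exactly one byte.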
Karl Williamson [Wed, 10 Feb 2016 21:13:10 +0000 (14:13 -0700)]
regexec.c: Skip duplicate work
By changing the fallthrough of a case of a switch to a goto we can avoid
re-executing the test that was just done.
Karl Williamson [Wed, 10 Feb 2016 18:28:58 +0000 (11:28 -0700)]
regcomp.c: Replace invalid assertion
A future commit shows that this assertion is not valid. I don't know
how it can currently be triggered, but fix the code to properly handle
the case.
Karl Williamson [Wed, 10 Feb 2016 18:22:21 +0000 (11:22 -0700)]
regcomp.c: Avoid a function call in a common case
When the regex pattern is in UTF-8, we can avoid calling the function to
convert it to a code point for the common case where it is the same in
UTF-8 as not, i.e. if the character is ASCII on ASCII platforms (and
additionally any control on EBCDIC)
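A sketch of the fast path for ASCII platforms only (the EBCDIC case is not covered). Both function names are hypothetical; decode_utf8() stands in for the real conversion function and assumes well-formed input of at most four bytes.

```c
#include <stddef.h>

/* Hypothetical full decoder, standing in for the function call that the
 * fast path avoids.  Assumes well-formed UTF-8. */
static unsigned long
decode_utf8(const unsigned char *p)
{
    if (p[0] < 0x80)
        return p[0];
    if ((p[0] & 0xE0) == 0xC0)
        return ((unsigned long)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
    if ((p[0] & 0xF0) == 0xE0)
        return ((unsigned long)(p[0] & 0x0F) << 12)
             | ((unsigned long)(p[1] & 0x3F) << 6)
             | (p[2] & 0x3F);
    return ((unsigned long)(p[0] & 0x07) << 18)
         | ((unsigned long)(p[1] & 0x3F) << 12)
         | ((unsigned long)(p[2] & 0x3F) << 6)
         | (p[3] & 0x3F);
}

/* On ASCII platforms a byte below 0x80 means the same thing whether or
 * not the pattern is UTF-8, so no conversion call is needed. */
static unsigned long
code_point_at(const unsigned char *p)
{
    if (*p < 0x80)
        return *p;              /* common case: no function call */
    return decode_utf8(p);
}
```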
Karl Williamson [Wed, 10 Feb 2016 18:04:44 +0000 (11:04 -0700)]
regcomp.c: Add some grouping parens
&& has lower precedence than &, but it's better to be clear.
Karl Williamson [Wed, 10 Feb 2016 17:54:42 +0000 (10:54 -0700)]
utf8.h: Guard some macros against improper calls
The UTF8_IS_foo() macros have an inconsistent API. In some, the
parameter is a pointer, and in others it is a byte. In the former case,
a call of the wrong type will not compile, as it will try to dereference
a non-ptr. This commit makes the other ones not compile when called
wrongly, by using the technique shown by Lukas Mai (in
9c903d5937fa3682f21b2aece7f6011b6fcb2750) of ORing the argument with a
constant 0, which should get optimized out.
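A sketch of the OR-with-0 guard on a byte-taking macro. IS_CONTINUATION_BYTE and count_continuations() are hypothetical names, not the utf8.h macros; the point is only that "pointer | 0" is not valid C, while "byte | 0" leaves the value unchanged.

```c
#include <stddef.h>

/* Hypothetical byte-taking macro.  The "| 0" does nothing for integer
 * arguments, but it makes a pointer argument a compile-time error. */
#define IS_CONTINUATION_BYTE(c)  ((((c) | 0) & 0xC0) == 0x80)

/* Count UTF-8 continuation bytes in the first len bytes of s. */
static size_t
count_continuations(const unsigned char *s, size_t len)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (IS_CONTINUATION_BYTE(s[i]))     /* correct: passes a byte */
            n++;
        /* IS_CONTINUATION_BYTE(s + i) would be rejected by the compiler */
    }
    return n;
}
```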
Karl Williamson [Wed, 10 Feb 2016 23:27:07 +0000 (16:27 -0700)]
regcomp.c, regexec.c: Comments, white-space only
Karl Williamson [Wed, 10 Feb 2016 23:27:13 +0000 (16:27 -0700)]
regcomp.c: Fix some parsing glitches
I undertook a code review of how regcomp.c parses things in light of the
tickets found by the fuzzer,
https://rt.perl.org/Ticket/Display.html?id=126546. This commit is the
result of my efforts so far. I was not planning to push it now, but the
work found a couple of new error messages that should be raised, and
this has to be done before the visible changes code freeze coming up all
too soon. I will add test cases after that freeze, including ones to see
whether these changes fix all the observed issues.
The audit was tedious, and may have missed some things. Several issues
occurred in multiple places. One is to not advance the parse by
UTF8SKIP appropriately; another is to subtract one byte from the parse
and assume that that is pointing to the beginning of the previous
character (which under UTF-8 it may not). Another is to assume that
the pattern is a C string, i.e. that there are no interior NULs in it.
I also found unnecessary tests, given that an SV always has a
terminating NUL.
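To illustrate the "subtract one byte" issue, here is a sketch of stepping back to the start of the previous character in UTF-8 text. The helper prev_char_start is hypothetical and is not what regcomp.c uses; it assumes well-formed input.

```c
#include <stddef.h>

/* Hypothetical helper: move from p back to the first byte of the
 * previous character, never going before start.  Continuation bytes
 * have the form 10xxxxxx and are skipped over. */
static const unsigned char *
prev_char_start(const unsigned char *start, const unsigned char *p)
{
    if (p > start) {
        p--;
        while (p > start && (*p & 0xC0) == 0x80)
            p--;
    }
    return p;
}
```

Simply using p - 1 would land in the middle of a multi-byte character whenever the previous character is not ASCII.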
Karl Williamson [Sun, 27 Dec 2015 17:39:02 +0000 (10:39 -0700)]
regcomp.c: Extract duped code into one fcn
This takes code that was duplicated and makes it into a single static
inline function, so that maintenance tasks don't have to be done on both
copies.
Karl Williamson [Thu, 11 Feb 2016 03:46:39 +0000 (20:46 -0700)]
porting/diag.t: Handle some E<> pod escapes
These can occur in perldiag, and so must be converted into the character
that the internal message outputs. This commit causes the major ones of
these to be converted.
Karl Williamson [Thu, 11 Feb 2016 03:31:13 +0000 (20:31 -0700)]
podcheck.t: Need to translate E<lt> and E<gt>
These can appear in links, and need to be translated into their correct
character.
Ricardo Signes [Thu, 11 Feb 2016 02:56:35 +0000 (21:56 -0500)]
release schedule: September 2016 is scheduled
Daniel Dragan [Wed, 10 Feb 2016 20:47:44 +0000 (15:47 -0500)]
add shortcut around syscalls when file not found in win32_stat
win32_stat on success makes ~7 system calls, some from perl, some from CRT,
but on failure, typically file not found, the perl syscall fails, then the
CRT stat runs, and fails too, so 5 mostly failing system calls are done
for file not found. If the perl syscall says file not found, the
file won't magically come into existence in the next 10-1000 us for the
CRT's syscalls, so skip calling the CRT and the additional syscalls
if perl didn't find the file. This patch reduces the number of syscalls
from 5 to 1 for file not found for a win32 perl stat. Benchmark and
profiling info is attached to RT ticket for this patch. Note CreateFile on
a dir fails with ERROR_ACCESS_DENIED so in some cases, a failed CreateFile
is still a successful CRT stat() which does things differently so dirs can
be opened.
Ricardo Signes [Wed, 10 Feb 2016 16:06:50 +0000 (11:06 -0500)]
update release schedule for beginnings of 5.25
Karl Williamson [Tue, 9 Feb 2016 18:50:04 +0000 (11:50 -0700)]
PATCH: [perl #8904] Revamp [:posix:] parsing
A problem with bracketed character classes, qr/[foo]/, is that there is
very little structure about them, so almost anything is legal, and so
typos just silently compile into something unintended. One of the
possible components is a posix character class. There are 14 of them,
and they have a very restricted structure, which is easy to get slightly
wrong, so that instead of the intended posix class being compiled,
something else silently is created. This commit causes the regex
compiler to look for slightly misspelled posix character classes and to
raise a warning when found. It does not change the results of the
compilation.
To do this, it introduces fuzzy parsing into the regex compiler, using
the Damerau-Levenshtein algorithm to find out how many single character
edits it would take to transform the input into one of the 14 classes.
If it is 1 or 2 off, it considers the input to have been intended to be
that class and raises the warning. If more edits would be needed, it
remains silent.
This is a heuristic, and someone could have made enough typos that this
thinks a class wasn't intended that was. Conversely it could raise a
warning when no class was intended, though warnings only happen when the
input very closely resembles one of the 14 legal posix classes.
The algorithm can be tweaked if experience indicates it should. But the
bottom line is that many more cases of unintended results will now be
warned about.
Things like having blanks in the construct and having the '^' before the
colon are recognized as being intended posix classes (given that the
actual names are close to one of the 14), and raise warnings. Again
this commit does not change what gets compiled. This found a bug in
autodoc.pl which was fixed a few commits ago.
The [. .] and [= =] POSIX constructs cause perl to croak that they are
unimplemented. This commit improves the parsing of these two, and fixes
some false positives. See
http://nntp.perl.org/group/perl.perl5.porters/230975
The new code combines two functions in regcomp.c into one new one.
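A sketch of the fuzzy-match heuristic described above, not the regcomp.c implementation. For brevity it uses the restricted (optimal string alignment) variant of Damerau-Levenshtein rather than the code taken from CPAN, and it compares only the bare class name. All function names and the MAX_LEN truncation are assumptions of this sketch.

```c
#include <string.h>

#define MAX_LEN 31   /* longer inputs are truncated in this sketch */

/* Restricted Damerau-Levenshtein (optimal string alignment) distance:
 * insertions, deletions, substitutions and adjacent transpositions. */
static size_t
osa_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t d[MAX_LEN + 1][MAX_LEN + 1];
    size_t i, j;

    if (la > MAX_LEN) la = MAX_LEN;
    if (lb > MAX_LEN) lb = MAX_LEN;

    for (i = 0; i <= la; i++) d[i][0] = i;
    for (j = 0; j <= lb; j++) d[0][j] = j;

    for (i = 1; i <= la; i++) {
        for (j = 1; j <= lb; j++) {
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            size_t best = d[i - 1][j - 1] + cost;       /* substitute */

            if (d[i - 1][j] + 1 < best) best = d[i - 1][j] + 1; /* delete */
            if (d[i][j - 1] + 1 < best) best = d[i][j - 1] + 1; /* insert */
            if (i > 1 && j > 1
                && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
                && d[i - 2][j - 2] + 1 < best)
                best = d[i - 2][j - 2] + 1;             /* transpose */
            d[i][j] = best;
        }
    }
    return d[la][lb];
}

static const char *const posix_names[] = {
    "alpha", "digit", "alnum", "upper", "lower", "space", "blank",
    "punct", "print", "graph", "cntrl", "xdigit", "word", "ascii"
};

/* Return the closest class name when the input is 1 or 2 edits away
 * (a likely typo worth warning about); return NULL when the input is
 * an exact name or is too far from every class. */
static const char *
likely_intended_class(const char *input)
{
    const char *best = NULL;
    size_t best_dist = 3;
    size_t i;

    for (i = 0; i < sizeof posix_names / sizeof posix_names[0]; i++) {
        size_t dist = osa_distance(input, posix_names[i]);

        if (dist == 0)
            return NULL;            /* spelled correctly: stay silent */
        if (dist < best_dist) {
            best_dist = dist;
            best = posix_names[i];
        }
    }
    return best;
}
```

A non-NULL result would only drive a warning; as stated above, what gets compiled is unchanged.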
Karl Williamson [Tue, 9 Feb 2016 21:00:23 +0000 (14:00 -0700)]
regcomp.c: Fix recursive parsing bug
In certain cases, regex compilation will use a substitute input string
when parsing what it thinks is a bracketed character class /[ ... ] /.
The substitute automatically had a ']' appended to it, even if the
original didn't have one, leading to wrong results.
I did not add a test for this, as the next commit causes current tests
to fail if this one isn't done.
Karl Williamson [Tue, 9 Feb 2016 18:28:32 +0000 (11:28 -0700)]
regcomp.c: White-space, variable name-change only
This changes indents, and the names of two variables in preparation for
the next commit, so the difference listing won't be so large for that.
Karl Williamson [Tue, 9 Feb 2016 18:00:58 +0000 (11:00 -0700)]
autodoc.pl: Fix misspelled /[[:alpha:]]/
This typo was caught by the work for a couple of commits down the road.
Karl Williamson [Tue, 9 Feb 2016 17:49:29 +0000 (10:49 -0700)]
Add Nick Logan to AUTHORS
The previous commit used code written by him.
Karl Williamson [Tue, 9 Feb 2016 17:40:38 +0000 (10:40 -0700)]
regcomp.c: Add code to compute edit distance (Damerau–Levenshtein)
This will be used in a future commit.
This code is taken from CPAN Text::Levenshtein::Damerau::XS with the
author's knowledge. There have been white-space changes to make it
conform better to perl's core coding standards, and declaration changes
to make it more portable, such as using UV instead of 'unsigned int',
and PERL_STATIC_INLINE instead of a less portable form, but the logic is
unchanged. One variable was changed to signed from unsigned to avoid a
warning message from some compilers.
The author and I will decide later about keeping the cpan module and
this code in sync. It changes very rarely.
Tony Cook [Wed, 10 Feb 2016 05:07:42 +0000 (16:07 +1100)]
perldelta for 1bb1a3d6d35
Tony Cook [Wed, 10 Feb 2016 05:03:22 +0000 (16:03 +1100)]
[perl #127334] S_incline: avoid overrunning the end of the parse buffer
If the rest of the allocation up to the end of addressable memory was
non-spaces, this loop could cause a segmentation fault.
Avoid that by ensuring we stop when we see a NUL.
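A sketch of the kind of loop guard this describes. The helper skip_non_blank is hypothetical and the real S_incline loop is not shown here; the point is only that a scan for the next space must also stop at the terminating NUL.

```c
#include <stddef.h>

/* Hypothetical helper: advance to the next space or tab, but never past
 * the terminating NUL.  Without the NUL test, a buffer with no further
 * blanks would be read beyond its end. */
static const char *
skip_non_blank(const char *s)
{
    while (*s != '\0' && *s != ' ' && *s != '\t')
        s++;
    return s;
}
```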
Tony Cook [Wed, 10 Feb 2016 03:35:53 +0000 (14:35 +1100)]
Tony Cook [Wed, 10 Feb 2016 03:30:08 +0000 (14:30 +1100)]
[perl #127494] don't cache AUTOLOAD as DESTROY
Otherwise S_curse() would need to do all the work gv_autoload_pvn()
already does to set up to call AUTOLOAD() (setting $AUTOLOAD etc.)
Instead, by not caching it, we ensure gv_autoload_pvn() is called
each time to perform the required setup.
This has a performance cost over adding that code to S_curse(), but the
cost of actually running the AUTOLOAD sub is likely to drown that out,
and is easily avoided by adding "sub DESTROY {}" to the module.
Tony Cook [Wed, 10 Feb 2016 00:46:48 +0000 (11:46 +1100)]
[perl #127494] TODO test for $AUTOLOAD being set for DESTROY
000814da allowed the cached DESTROY method to be an AUTOLOAD method,
but didn't ensure that $AUTOLOAD (or the equivalent for XS AUTOLOADS)
was set when AUTOLOAD was called.
Add a TODO test for this behaviour.
James E Keenan [Sun, 7 Feb 2016 12:58:29 +0000 (07:58 -0500)]
Update guidance on naming of modules.
Delete reference to comp.lang.perl.misc. Add references to module-authors
list/newsgroup and to PAUSE.
For: RT # 127435
Sawyer X [Tue, 9 Feb 2016 18:57:58 +0000 (19:57 +0100)]
Remove outdated task in release:
I checked with Graham Barr, who said the list of PAUSE accounts
that can upload perl distributions is automated and taken from:
http://pause.perl.org/pause/query?ACTION=who_pumpkin;OF=YAML
This means that if you're already on the list, you do not need
to check again on search.cpan.org or to bug Graham. :)
Tom Hukins [Tue, 9 Feb 2016 11:15:53 +0000 (11:15 +0000)]
Time::HiRes moved from "cpan" to "dist" in 91ba54
Karl Williamson [Mon, 1 Feb 2016 18:08:29 +0000 (11:08 -0700)]
locale.c: Improve -DL debug info
This changes the debug info to include if the LC_NUMERIC decimal point
(radix) character string is UTF-8 encoded or not, and it uses the actual
value stored in Perl for that string instead of the POSIX origin of that
data, thus being more accurate should they ever get out-of-sync