	1	=head1 NAME
	2
	3	perldebguts - Guts of Perl debugging
	4
	5	=head1 DESCRIPTION
	6
	7	This is not L<perldebug>, which tells you how to use
	8	the debugger. This manpage describes low-level details concerning
	9	the debugger's internals, which range from difficult to impossible
	10	to understand for anyone who isn't incredibly intimate with Perl's guts.
	11	Caveat lector.
	12
	13	=head1 Debugger Internals
	14
	15	Perl has special debugging hooks at compile-time and run-time used
	16	to create debugging environments. These hooks are not to be confused
	17	with the I<perl -Dxxx> command described in L<perlrun>, which is
	18	usable only if a special Perl is built per the instructions in the
	19	F<INSTALL> podpage in the Perl source tree.
	20
	21	For example, whenever you call Perl's built-in C<caller> function
	22	from the package C<DB>, the arguments that the corresponding stack
	23	frame was called with are copied to the C<@DB::args> array. These
	24	mechanisms are enabled by calling Perl with the B<-d> switch.
	25	Specifically, the following additional features are enabled
	26	(cf. L<perlvar/$^P>):
	27
	28	=over 4
	29
	30	=item *
	31
	32	Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
	33	'perl5db.pl'}> if not present) before the first line of your program.
	34
	35	=item *
	36
	37	Each array C<@{"_<$filename"}> holds the lines of $filename for a
	38	file compiled by Perl. The same is also true for C<eval>ed strings
	39	that contain subroutines, or which are currently being executed.
	40	The $filename for C<eval>ed strings looks like C<(eval 34)>.
	41
	42	Values in this array are magical in numeric context: they compare
	43	equal to zero only if the line is not breakable.
	44
	45	=item *
	46
	47	Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed
	48	by line number. Individual entries (as opposed to the whole hash)
	49	are settable. Perl only cares about Boolean true here, although
	50	the values used by F<perl5db.pl> have the form
	51	C<"$break_condition\0$action">.
	52
	53	The same holds for evaluated strings that contain subroutines, or
	54	which are currently being executed. The $filename for C<eval>ed strings
	55	looks like C<(eval 34)>.
	56
	57	=item *
	58
	59	Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
	60	also the case for evaluated strings that contain subroutines, or
	61	which are currently being executed. The $filename for C<eval>ed
	62	strings looks like C<(eval 34)>.
	63
	64	=item *
	65
	66	After each C<require>d file is compiled, but before it is executed,
	67	C<DB::postponed(*{"_<$filename"})> is called if the subroutine
	68	C<DB::postponed> exists. Here, the $filename is the expanded name of
	69	the C<require>d file, as found in the values of %INC.
	70
	71	=item *
	72
	73	After each subroutine C<subname> is compiled, the existence of
	74	C<$DB::postponed{subname}> is checked. If this key exists,
	75	C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine
	76	also exists.
	77
	78	=item *
	79
	80	A hash C<%DB::sub> is maintained, whose keys are subroutine names
	81	and whose values have the form C<filename:startline-endline>.
	82	C<filename> has the form C<(eval 34)> for subroutines defined inside
	83	C<eval>s.
	84
	85	=item *
	86
	87	When the execution of your program reaches a point that can hold a
	88	breakpoint, the C<DB::DB()> subroutine is called if any of the variables
	89	C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables
	90	are not C<local>izable. This feature is disabled when executing
	91	inside C<DB::DB()>, including functions called from it
	92	unless C<< $^D & (1<<30) >> is true.
	93
	94	=item *
	95
	96	When execution of the program reaches a subroutine call, a call to
	97	C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> set to identify
	98	the called subroutine. (This doesn't happen if the calling subroutine
	99	was compiled in the C<DB> package.) C<$DB::sub> normally holds the name
	100	of the called subroutine, if it has a name by which it can be looked up.
	101	Failing that, C<$DB::sub> will hold a reference to the called subroutine.
	102	Either way, the C<&DB::sub> subroutine can use C<$DB::sub> as a reference
	103	by which to call the called subroutine, which it will normally want to do.
	104
	105	X<&DB::lsub>If the call is to an lvalue subroutine and C<&DB::lsub>
	106	is defined, C<&DB::lsub>(I<args>) is called instead; otherwise the
	107	call falls back to C<&DB::sub>(I<args>).
	108
	109	=item *
	110
	111	When execution of the program uses C<goto> to enter a non-XS subroutine
	112	and the 0x80 bit is set in C<$^P>, a call to C<&DB::goto> is made, with
	113	C<$DB::sub> set to identify the subroutine being entered. The call to
	114	C<&DB::goto> does not replace the C<goto>; the requested subroutine will
	115	still be entered once C<&DB::goto> has returned. C<$DB::sub> normally
	116	holds the name of the subroutine being entered, if it has one. Failing
	117	that, C<$DB::sub> will hold a reference to the subroutine being entered.
	118	Unlike when C<&DB::sub> is called, it is not guaranteed that C<$DB::sub>
	119	can be used as a reference to operate on the subroutine being entered.
	120
	121	=back
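
The C<filename:startline-endline> values kept in C<%DB::sub> are easy to
take apart. The following is a minimal sketch, not code from F<perl5db.pl>;
the helper name C<parse_sub_location> and the sample values are invented
for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a %DB::sub value of the form
# "filename:startline-endline" into its three parts.
sub parse_sub_location {
    my ($location) = @_;
    # Greedy (.*) keeps any colons in the file name; only the final
    # ":start-end" suffix is treated as the line range.
    my ($file, $start, $end) = $location =~ /\A(.*):(\d+)-(\d+)\z/s
        or return;
    return { file => $file, start => $start, end => $end };
}

# Sample values (invented): one from a file, one from a string eval.
for my $value ('lib/Foo.pm:10-25', '(eval 34):2-4') {
    my $loc = parse_sub_location($value);
    printf "%s lines %d..%d\n", @{$loc}{qw(file start end)};
}
```

Anchoring the pattern at the end of the string means that a file name
which itself contains a colon is still handled.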
	122
	123	Note that if C<&DB::sub> needs external data for it to work, no
	124	subroutine call is possible without it. As an example, the standard
	125	debugger's C<&DB::sub> depends on the C<$DB::deep> variable
	126	(it defines how many levels of recursion deep into the debugger you can go
	127	before a mandatory break). If C<$DB::deep> is not defined, subroutine
	128	calls are not possible, even though C<&DB::sub> exists.
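
The C<@DB::args> behavior mentioned at the start of this section can be
tried in an ordinary script, because C<caller> fills C<@DB::args> whenever
it is called with an argument, in list context, from package C<DB>. A
minimal sketch follows; C<caller_args> and C<greet> are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical helper: return the arguments that the *calling*
# subroutine was invoked with.
sub caller_args {
    package DB;              # caller() must be compiled in package DB
    my @frame = caller(1);   # list context, with an argument
    return @DB::args;        # filled in as a side effect of caller()
}

sub greet {
    # Ignores @_ on purpose and recovers its arguments via the helper.
    return join ',', caller_args();
}

print greet('a', 'b'), "\n";
```

Frame 1 is used because frame 0 describes the call to C<caller_args>
itself, which is made with no arguments.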
	129
	130	=head2 Writing Your Own Debugger
	131
	132	=head3 Environment Variables
	133
	134	The C<PERL5DB> environment variable can be used to define a debugger.
	135	For example, the minimal "working" debugger (it actually doesn't do anything)
	136	consists of one line:
	137
	138	sub DB::DB {}
	139
	140	It can easily be defined like this:
	141
	142	$ PERL5DB="sub DB::DB {}" perl -d your-script
	143
	144	Another brief debugger, slightly more useful, can be created
	145	with only the line:
	146
	147	sub DB::DB {print ++$i; scalar <STDIN>}
	148
	149	This debugger prints a number which increments for each statement
	150	encountered and waits for you to hit a newline before continuing
	151	to the next statement.
	152
	153	The following debugger is actually useful:
	154
	155	{
	156	package DB;
	157	sub DB {}
	158	sub sub {print ++$i, " $sub\n"; &$sub}
	159	}
	160
	161	It prints the sequence number of each subroutine call and the name of the
	162	called subroutine. Note that C<&DB::sub> is being compiled into the
	163	package C<DB> through the use of the C<package> directive.
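
One way to try this debugger without typing it at a shell prompt is to let
a small driver script set C<PERL5DB> and start a child perl with B<-d>.
The sketch below assumes a perl that supports list-form pipe C<open> on
your platform; C<run_under_debugger> and the traced program are
hypothetical.

```perl
use strict;
use warnings;

# The debugger from the text, collapsed onto one line for the environment.
# $i and $sub are package variables in DB, exactly as in the original.
my $debugger =
    '{ package DB; sub DB {} sub sub { print ++$i, " $sub\n"; &$sub } }';

# Hypothetical program to trace: two calls to a named subroutine.
my $program = 'sub hello { return 1 } hello(); hello();';

sub run_under_debugger {
    my ($db_source, $code) = @_;
    local $ENV{PERL5DB} = $db_source;
    # List-form pipe open avoids shell quoting; $^X is the running perl.
    open(my $child, '-|', $^X, '-d', '-e', $code)
        or die "cannot start $^X: $!";
    my @lines = <$child>;
    close $child;
    return @lines;
}

my @trace = run_under_debugger($debugger, $program);
print @trace;
```

Each traced call should show up as a sequence number followed by the
subroutine name held in C<$DB::sub>.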
	164
	165	When it starts, the debugger reads your rc file (F<./.perldb> or
	166	F<~/.perldb> under Unix), which can set important options.
	167	(A subroutine (C<&afterinit>) can be defined here as well; it is executed
	168	after the debugger completes its own initialization.)
	169
	170	After the rc file is read, the debugger reads the PERLDB_OPTS
	171	environment variable and uses it to set debugger options. The
	172	contents of this variable are treated as if they were the argument
	173	of an C<o ...> debugger command (q.v. in L<perldebug/"Configurable Options">).
	174
	175	=head3 Debugger Internal Variables
	176
	177	In addition to the file and subroutine-related variables mentioned above,
	178	the debugger also maintains various magical internal variables.
	179
	180	=over 4
	181
	182	=item *
	183
	184	C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which
	185	holds the lines of the currently-selected file (compiled by Perl), either
	186	explicitly chosen with the debugger's C<f> command, or implicitly by flow
	187	of execution.
	188
	189	Values in this array are magical in numeric context: they compare
	190	equal to zero only if the line is not breakable.
	191
	192	=item *
	193
	194	C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which
	195	contains breakpoints and actions keyed by line number in
	196	the currently-selected file, either explicitly chosen with the
	197	debugger's C<f> command, or implicitly by flow of execution.
	198
	199	As previously noted, individual entries (as opposed to the whole hash)
	200	are settable. Perl only cares about Boolean true here, although
	201	the values used by F<perl5db.pl> have the form
	202	C<"$break_condition\0$action">.
	203
	204	=back
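
The C<"$break_condition\0$action"> layout used by F<perl5db.pl> can be
split on the NUL byte. This is a minimal sketch with an invented entry;
C<split_breakpoint> is a hypothetical helper, not a debugger function.

```perl
use strict;
use warnings;

# Hypothetical helper: split a perl5db.pl-style breakpoint entry,
# "$break_condition\0$action", into its two halves.
sub split_breakpoint {
    my ($entry) = @_;
    # Limit of 2 so that any further NULs stay in the action part.
    my ($condition, $action) = split /\0/, $entry, 2;
    return ($condition, $action);
}

# Sample entry (invented): break when $x > 3, and print $x as the action.
my ($condition, $action) = split_breakpoint("\$x > 3\0print \$x");
print "condition: $condition\naction: $action\n";
```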
	205
	206	=head3 Debugger Customization Functions
	207
	208	Some functions are provided to simplify customization.
	209
	210	=over 4
	211
	212	=item *
	213
	214	See L<perldebug/"Configurable Options"> for a description of options parsed by
	215	C<DB::parse_options(string)>.
	216
	217	=item *
	218
	219	C<DB::dump_trace(skip[,count])> skips the specified number of frames
	220	and returns a list containing information about the calling frames (all
	221	of them, if C<count> is missing). Each entry is a reference to a hash
	222	with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
	223	name, or info about C<eval>), C<args> (C<undef> or a reference to
	224	an array), C<file>, and C<line>.
	225
	226	=item *
	227
	228	C<DB::print_trace(FH, skip[, count[, short]])> prints
	229	formatted info about caller frames. The last two functions may be
	230	convenient as arguments to the C<< < >> and C<< << >> commands.
	231
	232	=back
	233
	234	Note that any variables and functions that are not documented in
	235	this manpage (or in L<perldebug>) are considered for internal
	236	use only, and as such are subject to change without notice.
	237
	238	=head1 Frame Listing Output Examples
	239
	240	The C<frame> option can be used to control the output of frame
	241	information. For example, contrast this expression trace:
	242
	243	$ perl -de 42
	244	Stack dump during die enabled outside of evals.
	245
	246	Loading DB routines from perl5db.pl patch level 0.94
	247	Emacs support available.
	248
	249	Enter h or 'h h' for help.
	250
	251	main::(-e:1): 0
	252	DB<1> sub foo { 14 }
	253
	254	DB<2> sub bar { 3 }
	255
	256	DB<3> t print foo() * bar()
	257	main::((eval 172):3): print foo() * bar();
	258	main::foo((eval 168):2):
	259	main::bar((eval 170):2):
	260	42
	261
	262	with this one, once the C<o>ption C<frame=2> has been set:
	263
	264	DB<4> o f=2
	265	frame = '2'
	266	DB<5> t print foo() * bar()
	267	3: foo() * bar()
	268	entering main::foo
	269	2: sub foo { 14 };
	270	exited main::foo
	271	entering main::bar
	272	2: sub bar { 3 };
	273	exited main::bar
	274	42
	275
	276	By way of demonstration, we present below a laborious listing
	277	resulting from setting your C<PERLDB_OPTS> environment variable to
	278	the value C<f=n N>, and running I<perl -d -V> from the command line.
	279	Examples using various values of C<n> are shown to give you a feel
	280	for the difference between settings. Long though it may be, this
	281	is not a complete listing, but only excerpts.
	282
	283	=over 4
	284
	285	=item 1
	286
	287	entering main::BEGIN
	288	entering Config::BEGIN
	289	Package lib/Exporter.pm.
	290	Package lib/Carp.pm.
	291	Package lib/Config.pm.
	292	entering Config::TIEHASH
	293	entering Exporter::import
	294	entering Exporter::export
	295	entering Config::myconfig
	296	entering Config::FETCH
	297	entering Config::FETCH
	298	entering Config::FETCH
	299	entering Config::FETCH
	300
	301	=item 2
	302
	303	entering main::BEGIN
	304	entering Config::BEGIN
	305	Package lib/Exporter.pm.
	306	Package lib/Carp.pm.
	307	exited Config::BEGIN
	308	Package lib/Config.pm.
	309	entering Config::TIEHASH
	310	exited Config::TIEHASH
	311	entering Exporter::import
	312	entering Exporter::export
	313	exited Exporter::export
	314	exited Exporter::import
	315	exited main::BEGIN
	316	entering Config::myconfig
	317	entering Config::FETCH
	318	exited Config::FETCH
	319	entering Config::FETCH
	320	exited Config::FETCH
	321	entering Config::FETCH
	322
	323	=item 3
	324
	325	in $=main::BEGIN() from /dev/null:0
	326	in $=Config::BEGIN() from lib/Config.pm:2
	327	Package lib/Exporter.pm.
	328	Package lib/Carp.pm.
	329	Package lib/Config.pm.
	330	in $=Config::TIEHASH('Config') from lib/Config.pm:644
	331	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
	332	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
	333	in @=Config::myconfig() from /dev/null:0
	334	in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
	335	in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
	336	in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
	337	in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
	338	in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
	339	in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574
	340
	341	=item 4
	342
	343	in $=main::BEGIN() from /dev/null:0
	344	in $=Config::BEGIN() from lib/Config.pm:2
	345	Package lib/Exporter.pm.
	346	Package lib/Carp.pm.
	347	out $=Config::BEGIN() from lib/Config.pm:0
	348	Package lib/Config.pm.
	349	in $=Config::TIEHASH('Config') from lib/Config.pm:644
	350	out $=Config::TIEHASH('Config') from lib/Config.pm:644
	351	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
	352	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
	353	out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
	354	out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
	355	out $=main::BEGIN() from /dev/null:0
	356	in @=Config::myconfig() from /dev/null:0
	357	in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
	358	out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
	359	in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
	360	out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
	361	in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
	362	out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
	363	in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
	364
	365	=item 5
	366
	367	in $=main::BEGIN() from /dev/null:0
	368	in $=Config::BEGIN() from lib/Config.pm:2
	369	Package lib/Exporter.pm.
	370	Package lib/Carp.pm.
	371	out $=Config::BEGIN() from lib/Config.pm:0
	372	Package lib/Config.pm.
	373	in $=Config::TIEHASH('Config') from lib/Config.pm:644
	374	out $=Config::TIEHASH('Config') from lib/Config.pm:644
	375	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
	376	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
	377	out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
	378	out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
	379	out $=main::BEGIN() from /dev/null:0
	380	in @=Config::myconfig() from /dev/null:0
	381	in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
	382	out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
	383	in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
	384	out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
	385
	386	=item 6
	387
	388	in $=CODE(0x15eca4)() from /dev/null:0
	389	in $=CODE(0x182528)() from lib/Config.pm:2
	390	Package lib/Exporter.pm.
	391	out $=CODE(0x182528)() from lib/Config.pm:0
	392	scalar context return from CODE(0x182528): undef
	393	Package lib/Config.pm.
	394	in $=Config::TIEHASH('Config') from lib/Config.pm:628
	395	out $=Config::TIEHASH('Config') from lib/Config.pm:628
	396	scalar context return from Config::TIEHASH: empty hash
	397	in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
	398	in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
	399	out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
	400	scalar context return from Exporter::export: ''
	401	out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
	402	scalar context return from Exporter::import: ''
	403
	404	=back
	405
	406	In all cases shown above, the line indentation shows the call tree.
	407	If the bit with value 2 is set in C<frame>, a line is printed on exit
	408	from a subroutine as well. If the 4 bit is set, the arguments are
	409	printed along with the caller info. If the 8 bit is set, the arguments
	410	are printed even if they are tied or references. If the 16 bit is set,
	411	the return value is printed, too.
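
Because C<frame> is a bit mask, a given value can be decoded with bitwise
tests. The sketch below covers only the flag values named in the paragraph
above; C<describe_frame> is a hypothetical helper and the descriptions are
paraphrases.

```perl
use strict;
use warnings;

# Flag values named in the text above, in the order they are described.
my @FRAME_FLAGS = (
    [  2 => 'print a line on subroutine exit' ],
    [  4 => 'print arguments with the caller info' ],
    [  8 => 'print arguments even if tied or references' ],
    [ 16 => 'print the return value' ],
);

# Hypothetical helper: list the behaviors a given frame value turns on.
sub describe_frame {
    my ($frame) = @_;
    return map { $_->[1] } grep { $frame & $_->[0] } @FRAME_FLAGS;
}

for my $frame (2, 6, 30) {
    print "frame=$frame:\n";
    print "  $_\n" for describe_frame($frame);
}
```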
	412
	413	When a package is compiled, a line like this
	414
	415	Package lib/Carp.pm.
	416
	417	is printed with proper indentation.
	418
	419	=head1 Debugging Regular Expressions
	420
	421	There are two ways to enable debugging output for regular expressions.
	422
	423	If your perl is compiled with C<-DDEBUGGING>, you may use the
	424	B<-Dr> flag on the command line.
	425
	426	Otherwise, one can C<use re 'debug'>, which has effects at
	427	compile time and run time. Since Perl 5.9.5, this pragma is lexically
	428	scoped.
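
A minimal sketch of the pragma form follows, using the pattern and target
string from the examples in this section. It assumes a perl recent enough
for the pragma to be lexically scoped; the debugging output goes to
standard error.

```perl
use strict;
use warnings;

my $matched;
{
    # Lexically scoped: only patterns compiled in this block are traced.
    use re 'debug';
    $matched = 'abcdefg__gh__' =~ /[bc]d(ef*g)+h[ij]k$/;
}

# This pattern is compiled outside the block, so it is not traced.
my $plain = 'abc' =~ /b/;

print $matched ? "matched\n" : "no match\n";
```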
	429
	430	=head2 Compile-time Output
	431
	432	The debugging output at compile time looks like this:
	433
	434	Compiling REx '[bc]d(ef*g)+h[ij]k$'
	435	size 45 Got 364 bytes for offset annotations.
	436	first at 1
	437	rarest char g at 0
	438	rarest char d at 0
	439	1: ANYOF[bc](12)
	440	12: EXACT <d>(14)
	441	14: CURLYX[0] {1,32767}(28)
	442	16: OPEN1(18)
	443	18: EXACT <e>(20)
	444	20: STAR(23)
	445	21: EXACT <f>(0)
	446	23: EXACT <g>(25)
	447	25: CLOSE1(27)
	448	27: WHILEM[1/1](0)
	449	28: NOTHING(29)
	450	29: EXACT <h>(31)
	451	31: ANYOF[ij](42)
	452	42: EXACT <k>(44)
	453	44: EOL(45)
	454	45: END(0)
	455	anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating)
	456	stclass 'ANYOF[bc]' minlen 7
	457	Offsets: [45]
	458	1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
	459	0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
	460	11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
	461	0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
	462	Omitting $` $& $' support.
	463
	464	The first line shows the pre-compiled form of the regex. The second
	465	shows the size of the compiled form (in arbitrary units, usually
	466	4-byte words) and the total number of bytes allocated for the
	467	offset/length table, usually 4+C<size>*8. The next line shows the
	468	label I<id> of the first node that does a match.
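
As a quick check of the C<4+size*8> relationship, the figures from the
sample output can be plugged in directly. This is only arithmetic on the
numbers shown above, not a call into the regex engine.

```perl
use strict;
use warnings;

# Values taken from the sample output: "size 45 Got 364 bytes ...".
my $size     = 45;
my $reported = 364;

# The usual relationship given in the text: 4 + size * 8.
my $expected = 4 + $size * 8;

printf "expected %d bytes, dump reported %d\n", $expected, $reported;
```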
	469
	470	The
	471
	472	anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating)
	473	stclass 'ANYOF[bc]' minlen 7
	474
	475	line (split into two lines above) contains optimizer
	476	information. In the example shown, the optimizer found that the match
	477	should contain a substring C<de> at offset 1, plus substring C<gh>
	478	at some offset between 3 and infinity. Moreover, when checking for
	479	these substrings (to abandon impossible matches quickly), Perl will check
	480	for the substring C<gh> before checking for the substring C<de>. The
	481	optimizer may also use the knowledge that the match starts (at the
	482	C<first> I<id>) with a character class, and no string
	483	shorter than 7 characters can possibly match.
	484
	485	The fields of interest which may appear in this line are
	486
	487	=over 4
	488
	489	=item C<anchored> I<STRING> C<at> I<POS>
	490
	491	=item C<floating> I<STRING> C<at> I<POS1..POS2>
	492
	493	See above.
	494
	495	=item C<matching floating/anchored>
	496
	497	Which substring to check first.
	498
	499	=item C<minlen>
	500
	501	The minimal length of the match.
	502
	503	=item C<stclass> I<TYPE>
	504
	505	Type of first matching node.
	506
	507	=item C<noscan>
	508
	509	Don't scan for the found substrings.
	510
	511	=item C<isall>
	512
	513	Means that the optimizer information is all that the regular
	514	expression contains, and thus one does not need to enter the regex engine at
	515	all.
	516
	517	=item C<GPOS>
	518
	519	Set if the pattern contains C<\G>.
	520
	521	=item C<plus>
	522
	523	Set if the pattern starts with a repeated char (as in C<x+y>).
	524
	525	=item C<implicit>
	526
	527	Set if the pattern starts with C<.*>.
	528
	529	=item C<with eval>
	530
	531	Set if the pattern contains eval-groups, such as C<(?{ code })> and
	532	C<(??{ code })>.
	533
	534	=item C<anchored(TYPE)>
	535
	536	If the pattern may match only at a handful of places, with C<TYPE>
	537	being C<SBOL>, C<MBOL>, or C<GPOS>. See the table below.
	538
	539	=back
	540
	541	If a substring is known to match at end-of-line only, it may be
	542	followed by C<$>, as in C<floating 'k'$>.
	543
	544	The optimizer-specific information is used to avoid entering the (slow) regex
	545	engine on strings that will definitely not match. If the C<isall> flag
	546	is set, a call to the regex engine may be avoided even when the optimizer
	547	found an appropriate place for the match.
	548
	549	Above the optimizer section is the list of I<nodes> of the compiled
	550	form of the regex. Each line has format
	551
	552	C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)
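
For post-processing a dump, a node line in this format can be split with a
single pattern. This is a rough sketch that assumes node types consist of
uppercase letters, digits, and underscores, as in the sample dump;
C<parse_node_line> is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: split one node line of the compile-time dump
# into id, type, optional info, and next-id.
sub parse_node_line {
    my ($line) = @_;
    my ($id, $type, $info, $next) =
        $line =~ /\A\s*(\d+):\s*([A-Z][A-Z0-9_]*)\s*(.*?)\s*\((\d+)\)\s*\z/
        or return;
    return { id => $id, type => $type, info => $info, next => $next };
}

# Lines copied from the sample dump earlier in this section.
for my $line ('1: ANYOF[bc](12)', '14: CURLYX[0] {1,32767}(28)', '45: END(0)') {
    my $node = parse_node_line($line);
    printf "%-3d %-7s %-14s -> %d\n", @{$node}{qw(id type info next)};
}
```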
	553
	554	=head2 Types of Nodes
	555
	556	Here are the current possible types, with short descriptions:
	557
	558	=for comment
	559	This table is generated by regen/regcomp.pl. Any changes made here
	560	will be lost.
	561
	562	=for regcomp.pl begin
	563
	564	# TYPE arg-description [num-args] [longjump-len] DESCRIPTION
	565
	566	# Exit points
	567
	568	END no End of program.
	569	SUCCEED no Return from a subroutine, basically.
	570
	571	# Line Start Anchors:
	572	SBOL no Match "" at beginning of line: /^/, /\A/
	573	MBOL no Same, assuming multiline: /^/m
	574
	575	# Line End Anchors:
	576	SEOL no Match "" at end of line: /$/
	577	MEOL no Same, assuming multiline: /$/m
	578	EOS no Match "" at end of string: /\z/
	579
	580	# Match Start Anchors:
	581	GPOS no Matches where last m//g left off.
	582
	583	# Word Boundary Opcodes:
	584	BOUND no Like BOUNDA for non-utf8, otherwise match
	585	"" between any Unicode \w\W or \W\w
	586	BOUNDL no Like BOUND/BOUNDU, but \w and \W are
	587	defined by current locale
	588	BOUNDU no Match "" at any boundary of a given type
	589	using Unicode rules
	590	BOUNDA no Match "" at any boundary between \w\W or
	591	\W\w, where \w is [_a-zA-Z0-9]
	592	NBOUND no Like NBOUNDA for non-utf8, otherwise match
	593	"" between any Unicode \w\w or \W\W
	594	NBOUNDL no Like NBOUND/NBOUNDU, but \w and \W are
	595	defined by current locale
	596	NBOUNDU no Match "" at any non-boundary of a given
	597	type using Unicode rules
	598	NBOUNDA no Match "" between any \w\w or \W\W, where
	599	\w is [_a-zA-Z0-9]
	600
	601	# [Special] alternatives:
	602	REG_ANY no Match any one character (except newline).
	603	SANY no Match any one character.
	604	ANYOF sv 1 Match character in (or not in) this class,
	605	single char match only
	606	ANYOFD sv 1 Like ANYOF, but /d is in effect
	607	ANYOFL sv 1 Like ANYOF, but /l is in effect
	608	ANYOFM byte 1 Like ANYOF, but matches an invariant byte
	609	as determined by the mask and arg
	610
	611	# POSIX Character Classes:
	612	POSIXD none Some [[:class:]] under /d; the FLAGS field
	613	gives which one
	614	POSIXL none Some [[:class:]] under /l; the FLAGS field
	615	gives which one
	616	POSIXU none Some [[:class:]] under /u; the FLAGS field
	617	gives which one
	618	POSIXA none Some [[:class:]] under /a; the FLAGS field
	619	gives which one
	620	NPOSIXD none complement of POSIXD, [[:^class:]]
	621	NPOSIXL none complement of POSIXL, [[:^class:]]
	622	NPOSIXU none complement of POSIXU, [[:^class:]]
	623	NPOSIXA none complement of POSIXA, [[:^class:]]
	624
	625	ASCII none [[:ascii:]]
	626	NASCII none [[:^ascii:]]
	627
	628	CLUMP no Match any extended grapheme cluster
	629	sequence
	630
	631	# Alternation
	632
	633	# BRANCH The set of branches constituting a single choice are
	634	# hooked together with their "next" pointers, since
	635	# precedence prevents anything being concatenated to
	636	# any individual branch. The "next" pointer of the last
	637	# BRANCH in a choice points to the thing following the
	638	# whole choice. This is also where the final "next"
	639	# pointer of each individual branch points; each branch
	640	# starts with the operand node of a BRANCH node.
	641	#
	642	BRANCH node Match this alternative, or the next...
	643
	644	# Literals
	645
	646	EXACT str Match this string (preceded by length).
	647	EXACTL str Like EXACT, but /l is in effect (used so
	648	locale-related warnings can be checked
	649	for).
	650	EXACTF str Match this non-UTF-8 string (not guaranteed
	651	to be folded) using /id rules (w/len).
	652	EXACTFL str Match this string (not guaranteed to be
	653	folded) using /il rules (w/len).
	654	EXACTFU str Match this string (folded iff in UTF-8,
	655	length in folding doesn't change if not in
	656	UTF-8) using /iu rules (w/len).
	657	EXACTFAA str Match this string (not guaranteed to be
	658	folded) using /iaa rules (w/len).
	659
	660	EXACTFU_SS str Match this string (folded iff in UTF-8,
	661	length in folding may change even if not in
	662	UTF-8) using /iu rules (w/len).
	663	EXACTFLU8 str Rare circumstances: like EXACTFU, but is
	664	under /l, UTF-8, folded, and everything in
	665	it is above 255.
	666	EXACTFAA_NO_TRIE str Match this string (which is not trie-able;
	667	not guaranteed to be folded) using /iaa
	668	rules (w/len).
	669
	670	# Do nothing types
	671
	672	NOTHING no Match empty string.
	673	# A variant of above which delimits a group, thus stops optimizations
	674	TAIL no Match empty string. Can jump here from
	675	outside.
	676
	677	# Loops
	678
	679	# STAR,PLUS '?', and complex '*' and '+', are implemented as
	680	# circular BRANCH structures. Simple cases
	681	# (one character per match) are implemented with STAR
	682	# and PLUS for speed and to minimize recursive plunges.
	683	#
	684	STAR node Match this (simple) thing 0 or more times.
	685	PLUS node Match this (simple) thing 1 or more times.
	686
	687	CURLY sv 2 Match this simple thing {n,m} times.
	688	CURLYN no 2 Capture next-after-this simple thing
	689	CURLYM no 2 Capture this medium-complex thing {n,m}
	690	times.
	691	CURLYX sv 2 Match this complex thing {n,m} times.
	692
	693	# This terminator creates a loop structure for CURLYX
	694	WHILEM no Do curly processing and see if rest
	695	matches.
	696
	697	# Buffer related
	698
	699	# OPEN,CLOSE,GROUPP ...are numbered at compile time.
	700	OPEN num 1 Mark this point in input as start of #n.
	701	CLOSE num 1 Close corresponding OPEN of #n.
	702	SROPEN none Same as OPEN, but for script run
	703	SRCLOSE none Close preceding SROPEN
	704
	705	REF num 1 Match some already matched string
	706	REFF num 1 Match already matched string, folded using
	707	native charset rules for non-utf8
	708	REFFL num 1 Match already matched string, folded in
	709	loc.
	710	REFFU num 1 Match already matched string, folded using
	711	unicode rules for non-utf8
	712	REFFA num 1 Match already matched string, folded using
	713	unicode rules for non-utf8, no mixing
	714	ASCII, non-ASCII
	715
	716	# Named references. Code in regcomp.c assumes that these all are after
	717	# the numbered references
	718	NREF no-sv 1 Match some already matched string
	719	NREFF no-sv 1 Match already matched string, folded using
	720	native charset rules for non-utf8
	721	NREFFL no-sv 1 Match already matched string, folded in
	722	loc.
	723	NREFFU num 1 Match already matched string, folded using
	724	unicode rules for non-utf8
	725	NREFFA num 1 Match already matched string, folded using
	726	unicode rules for non-utf8, no mixing
	727	ASCII, non-ASCII
	728
	729	# Support for long RE
	730	LONGJMP off 1 1 Jump far away.
	731	BRANCHJ off 1 1 BRANCH with long offset.
	732
	733	# Special Case Regops
	734	IFMATCH off 1 2 Succeeds if the following matches.
	735	UNLESSM off 1 2 Fails if the following matches.
	736	SUSPEND off 1 1 "Independent" sub-RE.
	737	IFTHEN off 1 1 Switch, should be preceded by switcher.
	738	GROUPP num 1 Whether the group matched.
	739
	740	# The heavy worker
	741
	742	EVAL evl/flags Execute some Perl code.
	743	2L
	744
	745	# Modifiers
	746
	747	MINMOD no Next operator is not greedy.
	748	LOGICAL no Next opcode should set the flag only.
	749
	750	# This is not used yet
	751	RENUM off 1 1 Group with independently numbered parens.
	752
	753	# Trie Related
	754
	755	# Behave the same as A|LIST|OF|WORDS would. The '..C' variants
	756	# have inline charclass data (ascii only), the 'C' store it in the
	757	# structure.
	758
	759	TRIE trie 1 Match many EXACT(F[ALU]?)? at once.
	760	flags==type
	761	TRIEC trie Same as TRIE, but with embedded charclass
	762	charclass data
	763
	764	AHOCORASICK trie 1 Aho Corasick stclass. flags==type
	765	AHOCORASICKC trie Same as AHOCORASICK, but with embedded
	766	charclass charclass data
	767
	768	# Regex Subroutines
	769	GOSUB num/ofs 2L recurse to paren arg1 at (signed) ofs arg2
	770
	771	# Special conditionals
	772	NGROUPP no-sv 1 Whether the group matched.
	773	INSUBP num 1 Whether we are in a specific recurse.
	774	DEFINEP none 1 Never execute directly.
	775
	776	# Backtracking Verbs
	777	ENDLIKE none Used only for the type field of verbs
	778	OPFAIL no-sv 1 Same as (?!), but with verb arg
	779	ACCEPT no-sv/num Accepts the current matched string, with
	780	2L verbar
	781
	782	# Verbs With Arguments
	783	VERB no-sv 1 Used only for the type field of verbs
	784	PRUNE no-sv 1 Pattern fails at this startpoint if no-
	785	backtracking through this
	786	MARKPOINT no-sv 1 Push the current location for rollback by
	787	cut.
	788	SKIP no-sv 1 On failure skip forward (to the mark)
	789	before retrying
	790	COMMIT no-sv 1 Pattern fails outright if backtracking
	791	through this
	792	CUTGROUP no-sv 1 On failure go to the next alternation in
	793	the group
	794
	795	# Control what to keep in $&.
	796	KEEPS no $& begins here.
	797
	798	# New charclass like patterns
	799	LNBREAK none generic newline pattern
	800
	801	# SPECIAL REGOPS
	802
	803	# This is not really a node, but an optimized away piece of a "long"
	804	# node. To simplify debugging output, we mark it as if it were a node
	805	OPTIMIZED off Placeholder for dump.
	806
	807	# Special opcode with the property that no opcode in a compiled program
	808	# will ever be of this type. Thus it can be used as a flag value that
	809	# no other opcode has been seen. END is used similarly, in that an END
	810	# node can't be optimized. So END implies "unoptimizable" and PSEUDO
	811	# means "not seen anything to optimize yet".
	812	PSEUDO off Pseudo opcode for internal use.
	813
	814	=for regcomp.pl end
	815
	816	=for unprinted-credits
	817	Next section M-J. Dominus (mjd-perl-patch+@plover.com) 20010421
	818
	819	Following the optimizer information is a dump of the offset/length
	820	table, here split across several lines:
	821
	822	Offsets: [45]
	823	1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
	824	0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
	825	11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
	826	0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]
	827
	828	The first line here indicates that the offset/length table contains 45
	829	entries. Each entry is a pair of integers, denoted by C<offset[length]>.
	830	Entries are numbered starting with 1, so entry #1 here is C<1[4]> and
	831	entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:>
	832	(the C<1: ANYOF[bc]>) begins at character position 1 in the
	833	pre-compiled form of the regex, and has a length of 4 characters.
	834	C<5[1]> in position 12
	835	indicates that the node labeled C<12:>
	836	(the C<< 12: EXACT <d> >>) begins at character position 5 in the
	837	pre-compiled form of the regex, and has a length of 1 character.
	838	C<12[1]> in position 14
	839	indicates that the node labeled C<14:>
	840	(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
	841	pre-compiled form of the regex, and has a length of 1 character---that
	842	is, it corresponds to the C<+> symbol in the precompiled regex.
	843
	844	C<0[0]> items indicate that there is no corresponding node.
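
As a minimal sketch of how the table can be read, the following snippet
decodes the first 14 entries of the dump above into a map from node label
to offset and length. The variable names (C<$dump>, C<%node>) are
illustrative assumptions, not part of any Perl API.

```perl
use strict;
use warnings;

# First 14 of the 45 entries from the dump above:
# 1[4], ten 0[0] placeholders, 5[1], 0[0], 12[1].
my $dump = join ' ', '1[4]', ('0[0]') x 10, '5[1]', '0[0]', '12[1]';

# Entries are numbered from 1; entry N describes the node labeled "N:".
my %node;
my $n = 0;
while ($dump =~ /(\d+)\[(\d+)\]/g) {
    my ($offset, $length) = ($1, $2);
    $n++;
    next if $offset == 0 && $length == 0;    # 0[0]: no corresponding node
    $node{$n} = { offset => $offset, length => $length };
}

for my $id (sort { $a <=> $b } keys %node) {
    printf "node %2d: offset %2d, length %d\n",
        $id, $node{$id}{offset}, $node{$id}{length};
}
```

Only entries 1, 12 and 14 survive the filter, matching the three nodes
discussed above.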
	845
	846	=head2 Run-time Output
	847
	848	First of all, when doing a match, one may get no run-time output even
	849	if debugging is enabled. This means that the regex engine was never
	850	entered and that all of the job was therefore done by the optimizer.
	851
	852	If the regex engine was entered, the output may look like this:
	853
	854	Matching '[bc]d(ef*g)+h[ij]k$' against 'abcdefg__gh__'
	855	Setting an EVAL scope, savestack=3
	856	2 <ab> <cdefg__gh_> | 1: ANYOF
	857	3 <abc> <defg__gh_> | 11: EXACT <d>
	858	4 <abcd> <efg__gh_> | 13: CURLYX {1,32767}
	859	4 <abcd> <efg__gh_> | 26: WHILEM
	860	0 out of 1..32767 cc=effff31c
	861	4 <abcd> <efg__gh_> | 15: OPEN1
	862	4 <abcd> <efg__gh_> | 17: EXACT <e>
	863	5 <abcde> <fg__gh_> | 19: STAR
	864	EXACT <f> can match 1 times out of 32767...
	865	Setting an EVAL scope, savestack=3
	866	6 <bcdef> <g__gh__> | 22: EXACT <g>
	867	7 <bcdefg> <__gh__> | 24: CLOSE1
	868	7 <bcdefg> <__gh__> | 26: WHILEM
	869	1 out of 1..32767 cc=effff31c
	870	Setting an EVAL scope, savestack=12
	871	7 <bcdefg> <__gh__> | 15: OPEN1
	872	7 <bcdefg> <__gh__> | 17: EXACT <e>
	873	restoring \1 to 4(4)..7
	874	failed, try continuation...
	875	7 <bcdefg> <__gh__> | 27: NOTHING
	876	7 <bcdefg> <__gh__> | 28: EXACT <h>
	877	failed...
	878	failed...
	879
	880	The most significant information in the output is about the particular I<node>
	881	of the compiled regex that is currently being tested against the target string.
	882	The format of these lines is
	883
	884	C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE>
	885
	886	The I<TYPE> info is indented with respect to the backtracking level.
	887	Other incidental information appears interspersed within.
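
The line format can be picked apart with a regular expression. This is
only a sketch: it assumes a plain vertical bar as the separator, assumes
that neither string contains a C<< > >> character, and uses made-up
indentation in the sample lines. The field names in C<@parsed> are
illustrative.

```perl
use strict;
use warnings;

# Two lines in the shape described above; the spacing is illustrative.
my @trace = (
    '   4 <abcd> <efg__gh_>    |  13:  CURLYX {1,32767}',
    '   5 <abcde> <fg__gh_>    |  19:      STAR',
);

my @parsed;
for my $line (@trace) {
    # STRING-OFFSET <PRE-STRING> <POST-STRING> |ID: (indentation) TYPE
    next unless $line =~ /^\s*(\d+)\s+<([^>]*)>\s+<([^>]*)>\s*\|\s*(\d+):(\s*)(.*)$/;
    push @parsed, {
        offset => $1,
        pre    => $2,
        post   => $3,
        id     => $4,
        indent => length($5),    # grows with the backtracking level
        type   => $6,
    };
}

printf "offset=%d pre=%s post=%s node=%d indent=%d type=%s\n",
    @{$_}{qw(offset pre post id indent type)} for @parsed;
```

Lines that do not match the pattern, such as the interspersed
C<Setting an EVAL scope> messages, are skipped.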
	888
	889	=head1 Debugging Perl Memory Usage
	890
	891	Perl is a profligate wastrel when it comes to memory use. There
	892	is a saying that to estimate memory usage of Perl, assume a reasonable
	893	algorithm for memory allocation, multiply that estimate by 10, and
	894	while you still may miss the mark, at least you won't be quite so
	895	astonished. This is not absolutely true, but may provide a good
	896	grasp of what happens.
	897
	898	Assume that an integer cannot take less than 20 bytes of memory, a
	899	float cannot take less than 24 bytes, a string cannot take less
	900	than 32 bytes (all these examples assume 32-bit architectures; the
	901	results are quite a bit worse on 64-bit architectures). If a variable
	902	is accessed in two of three different ways (which require an integer,
	903	a float, or a string), the memory footprint may increase yet another
	904	20 bytes. A sloppy malloc(3) implementation can inflate these
	905	numbers dramatically.
	906
	907	On the opposite end of the scale, a declaration like
	908
	909	sub foo;
	910
	911	may take up to 500 bytes of memory, depending on which release of Perl
	912	you're running.
	913
	914	Anecdotal estimates of source-to-compiled code bloat suggest an
	915	eightfold increase. This means that the compiled form of reasonable
	916	(normally commented, properly indented etc.) code will take
	917	about eight times more space in memory than the code took
	918	on disk.
	919
	920	The B<-DL> command-line switch has been obsolete since about Perl 5.6.0
	921	(it was available only if Perl was built with C<-DDEBUGGING>).
	922	The switch was used to track Perl's memory allocations and possible
	923	memory leaks. These days the use of malloc debugging tools like
	924	F<Purify> or F<valgrind> is suggested instead. See also
	925	L<perlhacktips/PERL_MEM_LOG>.
	926
	927	One way to find out how much memory is being used by Perl data
	928	structures is to install the Devel::Size module from CPAN: it gives
	929	you the minimum number of bytes required to store a particular data
	930	structure. Be mindful of the difference between size(), which does not
	931	follow references, and total_size(), which does.
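
A minimal sketch of the two calls follows. It assumes Devel::Size has
been installed from CPAN and falls back to a message otherwise; the
sample data structure is made up, and the byte counts depend on the Perl
build, so none are shown.

```perl
use strict;
use warnings;

# Hypothetical structure: a hash holding a string and a nested array.
my $data = { name => 'example', values => [ 1 .. 100 ] };

my ($shallow, $deep);
if (eval { require Devel::Size; 1 }) {
    $shallow = Devel::Size::size($data);          # the hash itself only
    $deep    = Devel::Size::total_size($data);    # hash plus everything it references
    print "size: $shallow bytes, total_size: $deep bytes\n";
}
else {
    print "Devel::Size is not installed\n";
}
```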
	932
	933	If Perl has been compiled using Perl's malloc, you can analyze Perl
	934	memory usage by setting C<$ENV{PERL_DEBUG_MSTATS}>.
	935
	936	=head2 Using C<$ENV{PERL_DEBUG_MSTATS}>
	937
	938	If your perl is using Perl's malloc() and was compiled with the
	939	necessary switches (this is the default), then it will print memory
	940	usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS}
	941	> 1 >>, and before termination of the program when C<<
	942	$ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to
	943	the following example:
	944
	945	$ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
	946	Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
	947	14216 free: 130 117 28 7 9 0 2 2 1 0 0
	948	437 61 36 0 5
	949	60924 used: 125 137 161 55 7 8 6 16 2 0 1
	950	74 109 304 84 20
	951	Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
	952	Memory allocation statistics after execution: (buckets 4(4)..8188(8192)
	953	30888 free: 245 78 85 13 6 2 1 3 2 0 1
	954	315 162 39 42 11
	955	175816 used: 265 176 1112 111 26 22 11 27 2 1 1
	956	196 178 1066 798 39
	957	Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.
	958
	959	It is possible to ask for such a statistic at arbitrary points in
	960	your execution using the mstat() function out of the standard
	961	Devel::Peek module.
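
For instance, a call might look like the sketch below. The label string
and the array are arbitrary. On a perl that was not built with Perl's
malloc(), mstat() is expected to print only a short notice instead of
statistics.

```perl
use strict;
use warnings;
use Devel::Peek qw(mstat);

# Allocate something so that there is memory worth reporting on.
my @big = (1 .. 10_000);

# The report is written to STDERR, prefixed by the label given here.
mstat("after building the list");
```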
	962
	963	Here is some explanation of that format:
	964
	965	=over 4
	966
	967	=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>
	968
	969	Perl's malloc() uses bucketed allocations. Every request is rounded
	970	up to the closest bucket size available, and a bucket is taken from
	971	the pool of buckets of that size.
	972
	973	The line above describes the limits of buckets currently in use.
	974	Each bucket has two sizes: memory footprint and the maximal size
	975	of user data that can fit into this bucket. In the example above,
	976	the smallest bucket has usable size 4, and the biggest bucket has
	977	usable size 8188 with a memory footprint of 8192.
	978
	979	In a Perl built for debugging, some buckets may have negative usable
	980	size. This means that these buckets cannot (and will not) be used.
	981	For larger buckets, the memory footprint may be one page greater
	982	than a power of 2. If so, the corresponding power of two is
	983	printed in the C<APPROX> field above.
	984
	985	=item Free/Used
	986
	987	The 1 or 2 rows of numbers following that correspond to the number
	988	of buckets of each size between C<SMALLEST> and C<GREATEST>. In
	989	the first row, the sizes (memory footprints) of buckets are powers
	990	of two--or possibly one page greater. In the second row, if present,
	991	the memory footprints of the buckets are between the memory footprints
	992	of two buckets "above".
	993
	994	For example, suppose under the previous example, the memory footprints
	995	were
	996
	997	free: 8 16 32 64 128 256 512 1024 2048 4096 8192
	998	4 12 24 48 80
	999
	1000	With a non-C<DEBUGGING> perl, the buckets starting from C<128> have
	1001	a 4-byte overhead, and thus an 8192-long bucket may take up to
	1002	8188-byte allocations.
	1003
	1004	=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>
	1005
	1006	The first two fields give the total amount of memory perl sbrk(2)ed
	1007	(ess-broken? :-) and the number of sbrk(2)s used. The third number is
	1008	what perl thinks about continuity of returned chunks. So long as
	1009	this number is positive, malloc() will assume that it is probable
	1010	that sbrk(2) will provide continuous memory.
	1011
	1012	Memory allocated by external libraries is not counted.
	1013
	1014	=item C<pad: 0>
	1015
	1016	The amount of sbrk(2)ed memory needed to keep buckets aligned.
	1017
	1018	=item C<heads: 2192>
	1019
	1020	Although memory overhead of bigger buckets is kept inside the bucket, for
	1021	smaller buckets, it is kept in separate areas. This field gives the
	1022	total size of these areas.
	1023
	1024	=item C<chain: 0>
	1025
	1026	malloc() may want to subdivide a bigger bucket into smaller buckets.
	1027	If only a part of the deceased bucket is left unsubdivided, the rest
	1028	is kept as an element of a linked list. This field gives the total
	1029	size of these chunks.
	1030
	1031	=item C<tail: 6144>
	1032
	1033	To minimize the number of sbrk(2)s, malloc() asks for more memory. This
	1034	field gives the size of the yet unused part, which is sbrk(2)ed, but
	1035	never touched.
	1036
	1037	=back
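
To tie the field descriptions together, here is a sketch that splits the
summary line from the "after execution" report into the fields just
described. The hash keys are illustrative names chosen for this example,
and the pattern assumes the line has exactly the shape shown in the
sample report.

```perl
use strict;
use warnings;

# Summary line copied from the "after execution" report above.
my $line = 'Total sbrk(): 215040/47:145. '
         . 'Odd ends: pad+heads+chain+tail: 0+2192+0+6144.';

my @fields = qw(sbrked sbrks continuous pad heads chain tail);
my %stat;
if ($line =~ m{Total sbrk\(\): (\d+)/(\d+):(-?\d+)\. Odd ends: pad\+heads\+chain\+tail: (\d+)\+(\d+)\+(\d+)\+(\d+)\.}) {
    @stat{@fields} = ($1, $2, $3, $4, $5, $6, $7);
}

printf "%-10s %d\n", $_, $stat{$_} for @fields;
```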
	1038
	1039	=head1 SEE ALSO
	1040
	1041	L<perldebug>,
	1042	L<perlguts>,
	1043	L<perlrun>,
	1044	L<re>,
	1045	and
	1046	L<Devel::DProf>.