1#!/usr/bin/perl
2#
3# A tool for analysing the performance of the code snippets found in
4# t/perf/benchmarks or similar
5
6
7=head1 NAME
8
9bench.pl - Compare the performance of perl code snippets across multiple
10perls.
11
12=head1 SYNOPSIS
13
14 # Basic: run the tests in t/perf/benchmarks against two or
15 # more perls
16
17 bench.pl [options] -- perlA[=labelA] perlB[=labelB] ...
18
19 # run the tests against the same perlA 2x, with and without extra
20 # options
21
22 bench.pl [options] -- perlA=fast PerlA=slow -Mstrict -Dpsltoc
23
24 # Run bench.pl's own built-in sanity tests
25
26 bench.pl --action=selftest
27
28 # Run bench on blead, which is then modified and timed again
29
30 bench.pl [options] --write=blead.time -- ./perl=blead
31 # hack hack hack
32 bench.pl --read=blead.time -- ./perl=hacked
33
34 # You can also combine --read with --write
c3bb902a 35 bench.pl --read=blead.time --write=last.time -- ./perl=hacked
36
37=head1 DESCRIPTION
38
39By default, F<bench.pl> will run code snippets found in
40F<t/perf/benchmarks> (or similar) under cachegrind, in order to calculate
41how many instruction reads, data writes, branches, cache misses, etc.
42one execution of the snippet uses. It will run them against two or more
43perl executables and show how much each test has gotten better or worse.
44
45It is modelled on the F<perlbench> tool, but since it measures instruction
46reads etc., rather than timings, it is much more precise and reproducible.
47It is also considerably faster, and is capable of running tests in
48parallel (with C<-j>). Rather than displaying a single relative
49percentage per test/perl combination, it displays values for 13 different
50measurements, such as instruction reads, conditional branch misses etc.
51
52There are options to write the raw data to a file, and to read it back.
53This means that you can view the same run data in different views with
54different selection and sort options. You can also use this mechanism
55to save the results of timing one perl, and then read it back while timing
56a modification, so that you don't have to rerun the same tests on the same
57perl over and over, or have two perls built at the same time.
58
59The optional C<=label> after each perl executable is used in the display
60output. If you are doing a two-step benchmark then you should provide
61a label for at least the "base" perl.
62
63=head1 OPTIONS
64
65=over 4
66
67=item *
68
69--action=I<foo>
70
71What action to perform. The default is I<grind>, which runs the benchmarks
72using I<cachegrind> as the back end. The only other action at the moment is
73I<selftest>, which runs some basic sanity checks and produces TAP output.
74
75=item *
76
77--average
78
79Only display the overall average, rather than the results for each
80individual test.
81
82=item *
83
84--benchfile=I<foo>
85
86The path of the file which contains the benchmarks (F<t/perf/benchmarks>
87by default).
88
89=item *
90
91--bisect=I<field,minval,maxval>
92
93Run a single test against one perl and exit with a zero status if the
94named field is in the specified range; exit 1 otherwise. It will complain
95if more than one test or perl has been specified. It is intended to be
96called as part of a bisect run, to determine when something changed.
97For example,
98
99 bench.pl -j 8 --tests=foo --bisect=Ir,100,105 --perlargs=-Ilib \
100 ./miniperl
101
102might be called from bisect to find when the number of instruction reads
103for test I<foo> falls outside the range 100..105.
104
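A bisect driver might look something like this (the configure/make step is
illustrative only; build the perl under test however you normally do):

    git bisect start
    git bisect bad HEAD
    git bisect good v5.20.0
    git bisect run sh -c './Configure -des -Dusedevel && make miniperl && \
        perl Porting/bench.pl -j 8 --tests=foo --bisect=Ir,100,105 \
            --perlargs=-Ilib -- ./miniperl'
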
105=item *
106
107--compact=I<perl>
108
109Display the results for a single perl executable in a compact form.
110Which perl to display is specified in the same manner as C<--norm>.
111
112=item *
113
114--debug
115
116Enable verbose debugging output.
117
118=item *
119
120--fields=I<a,b,c>
121
122Display only the specified fields; for example,
123
124 --fields=Ir,Ir_m,Ir_mm
125
126If only one field is selected, the output is in more compact form.
127
128=item *
129
130--grindargs=I<foo>
131
132Optional command-line arguments to pass to all cachegrind invocations.
133
134This option is appended to those which bench.pl uses for its own
135purposes; so it can be used to override them (see --debug output
136below), and can also be 'abused' to add redirects into the valgrind
137command invocation.
138
139For example, this writes PERL_MEM_LOG activity to foobar.$$, because
1403>foobar.$$ redirects fd 3, then perl under PERL_MEM_LOG writes to fd 3.
141
142 $ perl Porting/bench.pl --jobs=2 --verbose --debug \
143 --tests=call::sub::amp_empty \
144 \
145 --grindargs='--cachegrind-out-file=junk.$$ 3>foobar.$$' \
146 -- \
147 perl5.24.0 perl5.24.0:+memlog:PERL_MEM_LOG=3mst
148
149For the +memlog tests, this executes as follows (shown via --debug, then prettified):
150
151 Command: PERL_HASH_SEED=0 PERL_MEM_LOG=3mst
152 valgrind --tool=cachegrind --branch-sim=yes
153 --cachegrind-out-file=/dev/null --cachegrind-out-file=junk.$$
154 3>foobar.$$ perl5.24.0 - 10 2>&1
155
156The result is that a set of junk.$$ files containing raw cachegrind
157output is written, and foobar.$$ contains the expected memlog output.
158
159Notes:
160
161There's no obvious utility for those junk.$$ and foobar.$$ files, but
162you can have them anyway.
163
164The 3 in PERL_MEM_LOG=3mst is needed because the output would
165otherwise go to STDERR, and cause parse_cachegrind() to reject the
166test and die.
167
168The --grindargs redirect is needed to capture the memlog output;
169without it, the memlog output is written to fd 3, bypassing
170parse_cachegrind(), and effectively ends up in /dev/null.
171
172PERL_MEM_LOG is expensive when used.
173
174call::sub::amp_empty
175&foo function call with no args or body
176
177 perl5.24.0 perl5.24.0+memlog
178 ---------- -----------------
179 Ir 394.0 543477.5
180 Dr 161.0 146814.1
181 Dw 72.0 122304.6
182 COND 58.0 66796.4
183 IND 5.0 5537.7
184
185 COND_m 0.0 6743.1
186 IND_m 5.0 1490.2
187
188 Ir_m1 0.0 683.7
189 Dr_m1 0.0 65.9
190 Dw_m1 0.0 8.5
191
192 Ir_mm 0.0 11.6
193 Dr_mm 0.0 10.6
194 Dw_mm 0.0 4.7
195
196
197=item *
198
199--help
200
201Display basic usage information.
202
203=item *
204
205-j I<N>
206--jobs=I<N>
207
208Run I<N> jobs in parallel (default 1). This determines how many cachegrind
209processes will be running at a time, and should generally be set to the number
210of CPUs available.
211
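For example, on a Linux box you might let the shell supply the CPU count
(the perl names here are placeholders):

    perl Porting/bench.pl -j $(nproc) -- ./perl-old ./perl-new
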
212=item *
213
214--norm=I<foo>
215
216Specify which perl column in the output to treat as the 100% norm.
217It may be a column number (0..N-1) or a perl executable name or label.
218It defaults to the leftmost column.
219
220=item *
221
222--perlargs=I<foo>
223
224Optional command-line arguments to pass to each perl-under-test
225(perlA, perlB in the synopsis). For example, C<--perlargs=-Ilib>.
226
227=item *
228
229--raw
230
231Display raw data counts rather than percentages in the outputs. This
232allows you to see the exact number of instruction reads, branch misses etc.
233for each test/perl combination. It also causes the C<AVERAGE> display
234per field to be calculated based on the average of each test's count
235rather than the average of each percentage. This means that tests with very
236high counts will dominate.
237
238=item *
239
240--sort=I<field:perl>
241
242Order the tests in the output based on the value of I<field> in the
243column I<perl>. The I<perl> value is as per C<--norm>. For example
244
245 bench.pl --sort=Dw:perl-5.20.0 \
246 perl-5.16.0 perl-5.18.0 perl-5.20.0
247
248=item *
249
250-r I<file>
251--read=I<file>
252
253Read in saved data from a previous C<--write> run from the specified file.
254
255Requires C<JSON::PP> to be available.
256
257=item *
258
259--tests=I<FOO>
260
261Specify a subset of tests to run (or in the case of C<--read>, to display).
262It may be either a comma-separated list of test names, or a regular
263expression. For example
264
265 --tests=expr::assign::scalar_lex,expr::assign::2list_lex
266 --tests=/^expr::/
267
268=item *
269
270--verbose
271
272Display progress information.
273
274=item *
275
276-w I<file>
277--write=I<file>
278
279Save the raw data to the specified file. It can be read back later with
280C<--read>. If combined with C<--read> then the output file will be
281the merge of the file read and any additional perls added on the command
282line.
283
284Requires C<JSON::PP> to be available.
285
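The saved file is JSON (via C<JSON::PP>) whose top-level keys mirror the data
bench.pl keeps in memory. Roughly (a sketch only; the exact layout is internal
and may change):

    {
        "version"     : 1.0,
        "loop_counts" : [10, 20],
        "perls"       : [ ["./perl", "blead"], ... ],
        "tests"       : { "call::sub::amp_empty" : { ... }, ... },
        "order"       : [ "call::sub::amp_empty", ... ],
        "results"     : { ... raw cachegrind counts per test/perl/loop ... }
    }
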
286=back
287
288=cut
289
290
291
292use 5.010000;
293use warnings;
294use strict;
295use Getopt::Long qw(:config no_auto_abbrev require_order);
296use IPC::Open2 ();
297use IO::Select;
298use IO::File;
299use POSIX ":sys_wait_h";
300
301# The version of the file format used to save data. We refuse to process
302# the file if the integer component differs.
303
304my $FORMAT_VERSION = 1.0;
305
306# The fields we know about
307
308my %VALID_FIELDS = map { $_ => 1 }
309 qw(Ir Ir_m1 Ir_mm Dr Dr_m1 Dr_mm Dw Dw_m1 Dw_mm COND COND_m IND IND_m);
310
311sub usage {
312 die <<EOF;
313usage: $0 [options] -- perl[=label] ...
314 --action=foo What action to perform [default: grind].
315 --average Only display average, not individual test results.
316 --benchfile=foo File containing the benchmarks;
317 [default: t/perf/benchmarks].
318 --bisect=f,min,max run a single test against one perl and exit with a
319 zero status if the named field is in the specified
320 range; exit 1 otherwise.
321 --compact=perl Display the results of a single perl in compact form.
322 Which perl is specified as for --norm.
323 --debug Enable verbose debugging output.
324 --fields=a,b,c Display only the specified fields (e.g. Ir,Ir_m,Ir_mm).
325 --grindargs=foo Optional command-line args to pass to cachegrind.
326 --help Display this help.
327 -j|--jobs=N Run N jobs in parallel [default 1].
328 --norm=perl Which perl column to treat as 100%; may be a column
329 number (0..N-1) or a perl executable name or label;
330 [default: 0].
331 --perlargs=foo Optional command-line args to pass to each perl to run.
332 --raw Display raw data counts rather than percentages.
333 --show            Show results even when writing them with --write.
334 --sort=field:perl Sort the tests based on the value of 'field' in the
335 column 'perl'. The perl value is as per --norm.
336 -r|--read=file Read in previously saved data from the specified file.
337 --tests=FOO Select only the specified tests from the benchmarks file;
338 FOO may be either of the form 'foo,bar' or '/regex/';
339 [default: all tests].
340 --verbose Display progress information.
341 -w|--write=file Save the raw data to the specified file.
342
343--action is one of:
344 grind run the code under cachegrind
345 selftest perform a selftest; produce TAP output
346
347The command line ends with one or more specified perl executables,
348which will be searched for in the current \$PATH. Each binary name may
349have an optional =LABEL appended, which will be used rather than the
350executable name in output. E.g.
351
352 perl-5.20.1=PRE-BUGFIX perl-5.20.1-new=POST-BUGFIX
353EOF
354}
355
356my %OPTS = (
357 action => 'grind',
358 average => 0,
359 benchfile => 't/perf/benchmarks',
360 bisect => undef,
361 compact => undef,
362 debug => 0,
363 grindargs => '',
364 fields => undef,
365 jobs => 1,
366 norm => 0,
367 perlargs => '',
368 raw => 0,
369 read => undef,
370 show => 0,
371 sort => undef,
372 tests => undef,
373 verbose => 0,
374 write => undef,
375);
376
377
378# process command-line args and call top-level action
379
380{
381 GetOptions(
382 'action=s' => \$OPTS{action},
383 'average' => \$OPTS{average},
384 'benchfile=s' => \$OPTS{benchfile},
385 'bisect=s' => \$OPTS{bisect},
386 'compact=s' => \$OPTS{compact},
387 'debug' => \$OPTS{debug},
388 'grindargs=s' => \$OPTS{grindargs},
389 'help' => \$OPTS{help},
390 'fields=s' => \$OPTS{fields},
391 'jobs|j=i' => \$OPTS{jobs},
392 'norm=s' => \$OPTS{norm},
393 'perlargs=s' => \$OPTS{perlargs},
394 'raw' => \$OPTS{raw},
395 'read|r=s' => \$OPTS{read},
396 'show!' => \$OPTS{show},
397 'sort=s' => \$OPTS{sort},
398 'tests=s' => \$OPTS{tests},
399 'verbose' => \$OPTS{verbose},
400 'write|w=s' => \$OPTS{write},
401 ) or usage;
402
403 usage if $OPTS{help};
404
405
406 if (defined $OPTS{read} or defined $OPTS{write}) {
407 # fail early if it's not present
408 require JSON::PP;
409 }
410
411 if (defined $OPTS{fields}) {
412 my @f = split /,/, $OPTS{fields};
413 for (@f) {
414 die "Error: --fields: unknown field '$_'\n"
415 unless $VALID_FIELDS{$_};
416 }
417 my %f = map { $_ => 1 } @f;
418 $OPTS{fields} = \%f;
419 }
420
421 my %valid_actions = qw(grind 1 selftest 1);
422 unless ($valid_actions{$OPTS{action}}) {
423 die "Error: unrecognised action '$OPTS{action}'\n"
424 . "must be one of: " . join(', ', sort keys %valid_actions)."\n";
425 }
426
427 if (defined $OPTS{sort}) {
428 my @s = split /:/, $OPTS{sort};
429 if (@s != 2) {
430 die "Error: --sort argument should be of the form field:perl: "
431 . "'$OPTS{sort}'\n";
432 }
433 my ($field, $perl) = @s;
434 die "Error: --sort: unknown field '$field'\n"
435 unless $VALID_FIELDS{$field};
436 # the 'perl' value will be validated later, after we have processed
437 # the perls
438 $OPTS{'sort-field'} = $field;
439 $OPTS{'sort-perl'} = $perl;
440 }
441
442 if ($OPTS{action} eq 'selftest') {
443 if (@ARGV) {
444 die "Error: no perl executables may be specified with --action=selftest\n"
445 }
446 }
447 elsif (defined $OPTS{bisect}) {
448 die "Error: exactly one perl executable must be specified for bisect\n"
449 unless @ARGV == 1;
450 die "Error: Can't specify both --bisect and --read\n"
451 if defined $OPTS{read};
452 die "Error: Can't specify both --bisect and --write\n"
453 if defined $OPTS{write};
454 }
455
456 if ($OPTS{action} eq 'grind') {
457 do_grind(\@ARGV);
458 }
459 elsif ($OPTS{action} eq 'selftest') {
460 do_selftest();
461 }
462}
463exit 0;
464
465
466# Given a hash ref keyed by test names, filter it by deleting unwanted
467# tests, based on $OPTS{tests}.
468
469sub filter_tests {
470 my ($tests) = @_;
471
472 my $opt = $OPTS{tests};
473 return unless defined $opt;
474
475 my @tests;
476
477 if ($opt =~ m{^/}) {
478 $opt =~ s{^/(.+)/$}{$1}
479 or die "Error: --tests regex must be of the form /.../\n";
480 for (keys %$tests) {
481 delete $tests->{$_} unless /$opt/;
482 }
483 }
484 else {
485 my %t;
486 for (split /,/, $opt) {
487 die "Error: no such test found: '$_'\n"
488 . ($OPTS{verbose} ? " have: @{[ sort keys %$tests ]}\n" : "")
489 unless exists $tests->{$_};
490 $t{$_} = 1;
491 }
492 for (keys %$tests) {
493 delete $tests->{$_} unless exists $t{$_};
494 }
495 }
496 die "Error: no tests to run\n" unless %$tests;
497}
498
499
500# Read in the test file, and filter out any tests excluded by $OPTS{tests}
501# return a hash ref { testname => { test }, ... }
502# and an array ref of the test names in their original order.
503
504sub read_tests_file {
505 my ($file) = @_;
506
507 my $ta = do $file;
508 unless ($ta) {
509 die "Error: can't parse '$file': $@\n" if $@;
510 die "Error: can't read '$file': $!\n";
511 }
512
513 my @orig_order;
514 for (my $i=0; $i < @$ta; $i += 2) {
515 push @orig_order, $ta->[$i];
516 }
517
518 my $t = { @$ta };
519 filter_tests($t);
520 return $t, \@orig_order;
521}
522
523
524# Process the perl/column argument of options like --norm and --sort.
525# Return the index of the matching perl.
526
527sub select_a_perl {
528 my ($perl, $perls, $who) = @_;
529 $perls||=[];
530 if ($perl =~ /^[0-9]$/) {
531 die "Error: $who value $perl outside range 0.." . $#$perls . "\n"
532 unless $perl < @$perls;
533 return $perl;
534 }
535 else {
536 my @perl = grep $perls->[$_][0] eq $perl
537 || $perls->[$_][1] eq $perl,
538 0..$#$perls;
539 die "Error: $who: unrecognised perl '$perl'\n"
540 unless @perl;
541 die "Error: $who: ambiguous perl '$perl'\n"
542 if @perl > 1;
543 return $perl[0];
544 }
545}
546
547
548# Validate the list of perl=label (+ cmdline options) on the command line.
549# Return a list of [ exe, label, cmdline-options ] tuples, i.e.
550# 'perl-under-test's (PUTs)
551
552sub process_puts {
553 my $read_perls= shift;
554 my @res_puts; # returned, each item is [ perlexe, label, @putargs ]
555 my %seen= map { $_->[1] => 1 } @$read_perls;
556 my @putargs; # collect not-perls into args per PUT
557
558 for my $p (reverse @_) {
559 push @putargs, $p and next if $p =~ /^-/; # not-perl, don't send to qx//
560
561 my ($perl, $label, $env) = split /[=:,]/, $p, 3;
562 $label //= $perl;
563 $label = $perl.$label if $label =~ /^\+/;
564 die "$label cannot be used on 2 different perls under test\n" if $seen{$label}++;
565
566 my %env;
567 if ($env) {
568 %env = split /[=,]/, $env;
569 }
570 my $r = qx($perl -e 'print qq(ok\n)' 2>&1);
571 if ($r eq "ok\n") {
572 push @res_puts, [ $perl, $label, \%env, reverse @putargs ];
573 @putargs = ();
574 warn "Added Perl-Under-Test: [ @{[@{$res_puts[-1]}]} ]\n"
575 if $OPTS{verbose};
576 } else {
577 warn "perl-under-test args: @putargs + a not-perl: $p $r\n"
578 if $OPTS{verbose};
579 push @putargs, $p; # not-perl
580 }
581 }
582 return reverse @res_puts;
583}
584
585
586
587# Return a string containing perl test code wrapped in a loop
588# that runs $ARGV[0] times
589
590sub make_perl_prog {
591 my ($test, $desc, $setup, $code) = @_;
592
593 return <<EOF;
594# $desc
595package $test;
596BEGIN { srand(0) }
597$setup;
598for my \$__loop__ (1..\$ARGV[0]) {
599 $code;
600}
601EOF
602}
603
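# For illustration only (hypothetical test spec): a call like
#   make_perl_prog('foo::bar', 'lexical assignment', 'my $x', '$x = 1')
# returns a program along the lines of:
#
#   # lexical assignment
#   package foo::bar;
#   BEGIN { srand(0) }
#   my $x;
#   for my $__loop__ (1..$ARGV[0]) {
#       $x = 1;
#   }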
604
605# Parse the output from cachegrind. Return a hash ref.
606# See do_selftest() for examples of the output format.
607
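# The returned hash (assuming every field was found) has one integer count per
# entry in %VALID_FIELDS, e.g. roughly:
#   { Ir => 1124055, Ir_m1 => 5573, Ir_mm => 3338, Dr => 259191, ...,
#     COND => 197050, COND_m => 17742, IND => 5322, IND_m => 1411 }
# (values here are taken from the sample output in do_selftest()).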
608sub parse_cachegrind {
609 my ($output, $id, $perl) = @_;
610
611 my %res;
612
613 my @lines = split /\n/, $output;
614 for (@lines) {
615 unless (s/(==\d+==)|(--\d+--) //) {
616 die "Error: while executing $id:\n"
617 . "unexpected code or cachegrind output:\n$_\n";
618 }
619 if (/I refs:\s+([\d,]+)/) {
620 $res{Ir} = $1;
621 }
622 elsif (/I1 misses:\s+([\d,]+)/) {
623 $res{Ir_m1} = $1;
624 }
625 elsif (/LLi misses:\s+([\d,]+)/) {
626 $res{Ir_mm} = $1;
627 }
628 elsif (/D refs:\s+.*?([\d,]+) rd .*?([\d,]+) wr/) {
629 @res{qw(Dr Dw)} = ($1,$2);
630 }
631 elsif (/D1 misses:\s+.*?([\d,]+) rd .*?([\d,]+) wr/) {
632 @res{qw(Dr_m1 Dw_m1)} = ($1,$2);
633 }
634 elsif (/LLd misses:\s+.*?([\d,]+) rd .*?([\d,]+) wr/) {
635 @res{qw(Dr_mm Dw_mm)} = ($1,$2);
636 }
637 elsif (/Branches:\s+.*?([\d,]+) cond .*?([\d,]+) ind/) {
638 @res{qw(COND IND)} = ($1,$2);
639 }
640 elsif (/Mispredicts:\s+.*?([\d,]+) cond .*?([\d,]+) ind/) {
641 @res{qw(COND_m IND_m)} = ($1,$2);
642 }
643 }
644
645 for my $field (keys %VALID_FIELDS) {
646 die "Error: can't parse '$field' field from cachegrind output:\n$output"
647 unless exists $res{$field};
648 $res{$field} =~ s/,//g;
649 }
650
651 return \%res;
652}
653
654
655# Handle the 'grind' action
656
657sub do_grind {
658 my ($perl_args) = @_; # the residue of @ARGV after option processing
659
660 my ($loop_counts, $perls, $results, $tests, $order);
661 my ($bisect_field, $bisect_min, $bisect_max);
662
663 if (defined $OPTS{bisect}) {
664 ($bisect_field, $bisect_min, $bisect_max) = split /,/, $OPTS{bisect}, 3;
665 die "Error: --bisect option must be of form 'field,integer,integer'\n"
666 unless
667 defined $bisect_max
668 and $bisect_min =~ /^[0-9]+$/
669 and $bisect_max =~ /^[0-9]+$/;
670
671 die "Error: unrecognised field '$bisect_field' in --bisect option\n"
672 unless $VALID_FIELDS{$bisect_field};
673
674 die "Error: --bisect min ($bisect_min) must be <= max ($bisect_max)\n"
675 if $bisect_min > $bisect_max;
676 }
677
678 if ($OPTS{read}) {
679 open my $in, '<:encoding(UTF-8)', $OPTS{read}
680 or die "Error: can't open '$OPTS{read}' for reading: $!\n";
681 my $data = do { local $/; <$in> };
682 close $in;
683
684 my $hash = JSON::PP::decode_json($data);
685 if (int($FORMAT_VERSION) < int($hash->{version})) {
686 die "Error: unsupported version $hash->{version} in file"
687 . "'$OPTS{read}' (too new)\n";
688 }
689 ($loop_counts, $perls, $results, $tests, $order) =
690 @$hash{qw(loop_counts perls results tests order)};
691
692 filter_tests($results);
693 filter_tests($tests);
694
695 if (!$order) {
696 $order = [ sort keys %$tests ];
697 }
698 }
699
700 if (@$perl_args) {
701 unless ($loop_counts) {
702 # How many times to execute the loop for the two trials. The lower
703 # value is intended to do the loop enough times that branch
704 # prediction has taken hold; the higher loop allows us to see the
705 # branch misses after that
706 $loop_counts = [10, 20];
707
708 ($tests, $order) = read_tests_file($OPTS{benchfile});
709 die "Error: only a single test may be specified with --bisect\n"
710 if defined $OPTS{bisect} and keys %$tests != 1;
711 }
712
713 my @run_perls= process_puts($perls, @$perl_args);
714 push @$perls, @run_perls;
5825b6d4
YO
715 die "Error: Not enough perls to run a report, and --write not specified.\n"
716 if @$perls < 2 and !$OPTS{write};
4044748b 717 $results = grind_run($tests, $order, \@run_perls, $loop_counts, $results);
9e7973fa
DM
718 }
719
720 if (!$perls or !@$perls) {
721 die "Error: nothing to do: no perls to run, no data to read.\n";
722 }
723 # now that we have a list of perls, use it to process the
724 # 'perl' component of the --norm and --sort args
725
726 $OPTS{norm} = select_a_perl($OPTS{norm}, $perls, "--norm");
727 if (defined $OPTS{'sort-perl'}) {
728 $OPTS{'sort-perl'} =
729 select_a_perl($OPTS{'sort-perl'}, $perls, "--sort");
730 }
731
732 if (defined $OPTS{'compact'}) {
733 $OPTS{'compact'} =
734 select_a_perl($OPTS{'compact'}, $perls, "--compact");
735 }
736 if (defined $OPTS{write}) {
737 my $json = JSON::PP::encode_json({
738 version => $FORMAT_VERSION,
739 loop_counts => $loop_counts,
740 perls => $perls,
741 results => $results,
742 tests => $tests,
743 order => $order,
744 });
745
746 open my $out, '>:encoding(UTF-8)', $OPTS{write}
747 or die "Error: can't open '$OPTS{write}' for writing: $!\n";
748 print $out $json or die "Error: writing to file '$OPTS{write}': $!\n";
749 close $out or die "Error: closing file '$OPTS{write}': $!\n";
750 }
751 if (!$OPTS{write} or $OPTS{show}) {
752 if (@$perls < 2) {
753 die "Error: need more than one perl to do a report.\n";
754 }
755 my ($processed, $averages) =
756 grind_process($results, $perls, $loop_counts);
757
758 if (defined $OPTS{bisect}) {
759 my @r = values %$results;
760 die "Panic: expected exactly one test result in bisect\n"
761 if @r != 1;
762 @r = values %{$r[0]};
763 die "Panic: expected exactly one perl result in bisect\n"
764 if @r != 1;
765 my $c = $r[0]{$bisect_field};
766 die "Panic: no result in bisect for field '$bisect_field'\n"
767 unless defined $c;
768 exit 0 if $bisect_min <= $c and $c <= $bisect_max;
769 exit 1;
770 }
771 elsif (defined $OPTS{compact}) {
772 grind_print_compact($processed, $averages, $OPTS{compact},
773 $perls, $tests, $order);
774 }
9e7973fa 775 else {
776 grind_print($processed, $averages, $perls, $tests, $order);
777 }
778 }
779}
780
781
782# Run cachegrind for every test/perl combo.
783# It may run several processes in parallel when -j is specified.
784# Return a hash ref suitable for input to grind_process()
785
786sub grind_run {
787 my ($tests, $order, $perls, $counts, $results) = @_;
788
789 # Build a list of all the jobs to run
790
791 my @jobs;
792
793 for my $test (grep $tests->{$_}, @$order) {
794
795 # Create two test progs: one with an empty loop and one with code.
796 # Note that the empty loop is actually '{1;}' rather than '{}';
797 # this causes the loop to have a single nextstate rather than a
798 # stub op, so more closely matches the active loop; e.g.:
799 # {1;} => nextstate; unstack
800 # {$x=1;} => nextstate; const; gvsv; sassign; unstack
801 my @prog = (
802 make_perl_prog($test, @{$tests->{$test}}{qw(desc setup)}, '1'),
803 make_perl_prog($test, @{$tests->{$test}}{qw(desc setup code)}),
804 );
805
806 for my $p (@$perls) {
807 my ($perl, $label, $env, @putargs) = @$p;
808
809 # Run both the empty loop and the active loop
810 # $counts->[0] and $counts->[1] times.
811
812 for my $i (0,1) {
813 for my $j (0,1) {
814 my $envstr = '';
815 if (ref $env) {
816 $envstr .= "$_=$env->{$_} " for sort keys %$env;
817 }
818 my $cmd = "PERL_HASH_SEED=0 $envstr"
819 . "valgrind --tool=cachegrind --branch-sim=yes "
820 . "--cachegrind-out-file=/dev/null "
821 . "$OPTS{grindargs} "
822 . "$perl $OPTS{perlargs} @putargs - $counts->[$j] 2>&1";
823 # for debugging and error messages
824 my $id = "$test/$label "
825 . ($i ? "active" : "empty") . "/"
826 . ($j ? "long" : "short") . " loop";
827
828 push @jobs, {
829 test => $test,
830 perl => $perl,
831 plabel => $label,
832 cmd => $cmd,
833 prog => $prog[$i],
834 active => $i,
835 loopix => $j,
836 id => $id,
837 };
838 }
839 }
840 }
841 }
842
843 # Execute each cachegrind and store the results in %results.
844
845 local $SIG{PIPE} = 'IGNORE';
846
847 my $max_jobs = $OPTS{jobs};
848 my $running = 0; # count of executing jobs
849 my %pids; # map pids to jobs
850 my %fds; # map fds to jobs
851 my $select = IO::Select->new();
852
853 while (@jobs or $running) {
854
855 if ($OPTS{debug}) {
856 printf "Main loop: pending=%d running=%d\n",
857 scalar(@jobs), $running;
858 }
859
860 # Start new jobs
861
862 while (@jobs && $running < $max_jobs) {
863 my $job = shift @jobs;
864 my ($id, $cmd) =@$job{qw(id cmd)};
865
866 my ($in, $out, $pid);
867 warn "Starting $id\n" if $OPTS{verbose};
868 eval { $pid = IPC::Open2::open2($out, $in, $cmd); 1; }
869 or die "Error: while starting cachegrind subprocess"
870 ." for $id:\n$@";
871 $running++;
872 $pids{$pid} = $job;
873 $fds{"$out"} = $job;
874 $job->{out_fd} = $out;
875 $job->{output} = '';
876 $job->{pid} = $pid;
877
878 $out->blocking(0);
879 $select->add($out);
880
881 if ($OPTS{debug}) {
882 print "Started pid $pid for $id\n";
883 }
884
885 # Note:
886 # In principle we should write to $in in the main select loop,
887 # since it may block. In reality,
888 # a) the code we write to the perl process's stdin is likely
889 # to be less than the OS's pipe buffer size;
890 # b) by the time the perl process has read in all its stdin,
891 # the only output it should have generated is a few lines
892 # of cachegrind output preamble.
893 # If these assumptions change, then perform the following print
894 # in the select loop instead.
895
896 print $in $job->{prog};
897 close $in;
898 }
899
900 # Get output of running jobs
901
902 if ($OPTS{debug}) {
903 printf "Select: waiting on (%s)\n",
904 join ', ', sort { $a <=> $b } map $fds{$_}{pid},
905 $select->handles;
906 }
907
908 my @ready = $select->can_read;
909
910 if ($OPTS{debug}) {
911 printf "Select: pids (%s) ready\n",
912 join ', ', sort { $a <=> $b } map $fds{$_}{pid}, @ready;
913 }
914
915 unless (@ready) {
916 die "Panic: select returned no file handles\n";
917 }
918
919 for my $fd (@ready) {
920 my $j = $fds{"$fd"};
921 my $r = sysread $fd, $j->{output}, 8192, length($j->{output});
922 unless (defined $r) {
923 die "Panic: Read from process running $j->{id} gave:\n$!";
924 }
925 next if $r;
926
927 # EOF
928
929 if ($OPTS{debug}) {
930 print "Got eof for pid $fds{$fd}{pid} ($j->{id})\n";
931 }
932
933 $select->remove($j->{out_fd});
934 close($j->{out_fd})
935 or die "Panic: closing output fh on $j->{id} gave:\n$!\n";
936 $running--;
937 delete $fds{"$j->{out_fd}"};
938 my $output = $j->{output};
939
940 if ($OPTS{debug}) {
941 my $p = $j->{prog};
942 $p =~ s/^/ : /mg;
943 my $o = $output;
944 $o =~ s/^/ : /mg;
945
946 print "\n$j->{id}/\nCommand: $j->{cmd}\n"
947 . "Input:\n$p"
948 . "Output\n$o";
949 }
950
951 $results->{$j->{test}}{$j->{plabel}}[$j->{active}][$j->{loopix}]
952 = parse_cachegrind($output, $j->{id}, $j->{perl});
953 }
954
955 # Reap finished jobs
956
957 while (1) {
958 my $kid = waitpid(-1, WNOHANG);
959 my $ret = $?;
960 last if $kid <= 0;
961
962 unless (exists $pids{$kid}) {
963 die "Panic: reaped unexpected child $kid";
964 }
965 my $j = $pids{$kid};
966 if ($ret) {
967 die sprintf("Error: $j->{id} gave return status 0x%04x\n", $ret)
968 . "with the following output:\n$j->{output}\n";
969 }
970 delete $pids{$kid};
971 }
972 }
973
974 return $results;
975}
976
977
978
979
980# grind_process(): process the data that has been extracted from
981# cachegrind's output.
982#
983# $res is of the form ->{benchmark_name}{perl_name}[active][count]{field_name},
984# where active is 0 or 1 indicating an empty or active loop,
985# count is 0 or 1 indicating a short or long loop. E.g.
986#
987# $res->{'expr::assign::scalar_lex'}{perl-5.21.1}[0][1]{Dw_mm}
988#
989# The $res data structure is modified in-place by this sub.
990#
991# $perls is [ [ perl-exe, perl-label], .... ].
992#
993# $counts is [ N, M ] indicating the counts for the short and long loops.
994#
995#
996# return \%output, \%averages, where
997#
998# $output{benchmark_name}{perl_name}{field_name} = N
999# $averages{perl_name}{field_name} = M
1000#
1001# where N is the raw count ($OPTS{raw}), or count_perl0/count_perlI otherwise;
1002# M is the average raw count over all tests ($OPTS{raw}), or
1003# 1/(sum(count_perlI/count_perl0)/num_tests) otherwise.
1004
1005sub grind_process {
1006 my ($res, $perls, $counts) = @_;
1007
1008 # Process the four results for each test/perf combo:
1009 # Convert
1010 # $res->{benchmark_name}{perl_name}[active][count]{field_name} = n
1011 # to
1012 # $res->{benchmark_name}{perl_name}{field_name} = averaged_n
1013 #
1014 # $r[0][1] - $r[0][0] is the time to do ($counts->[1]-$counts->[0])
1015 # empty loops, eliminating startup time
1016 # $r[1][1] - $r[1][0] is the time to do ($counts->[1]-$counts->[0])
1017 # active loops, eliminating startup time
1018 # (the two startup times may be different because different code
1019 # is being compiled); the difference of the two results above
1020 # divided by the count difference is the time to execute the
1021 # active code once, eliminating both startup and loop overhead.
1022
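    # A worked example with made-up Ir counts and $counts = [10, 20]:
    #   empty loop:  short run = 1000, long run = 1200 => 200 for 10 extra iterations
    #   active loop: short run = 3000, long run = 5000 => 2000 for 10 extra iterations
    # so one pass of the body costs ((5000-3000) - (1200-1000)) / (20-10) = 180,
    # with both startup cost and per-iteration loop overhead cancelled out.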
1023 for my $tests (values %$res) {
1024 for my $r (values %$tests) {
1025 my $r2;
1026 for (keys %{$r->[0][0]}) {
1027 my $n = ( ($r->[1][1]{$_} - $r->[1][0]{$_})
1028 - ($r->[0][1]{$_} - $r->[0][0]{$_})
1029 ) / ($counts->[1] - $counts->[0]);
1030 $r2->{$_} = $n;
1031 }
1032 $r = $r2;
1033 }
1034 }
1035
1036 my %totals;
1037 my %counts;
1038 my %data;
1039
1040 my $perl_norm = $perls->[$OPTS{norm}][1]; # the label of the reference perl
1041
1042 for my $test_name (keys %$res) {
1043 my $res1 = $res->{$test_name};
1044 my $res2_norm = $res1->{$perl_norm};
1045 for my $perl (keys %$res1) {
1046 my $res2 = $res1->{$perl};
1047 for my $field (keys %$res2) {
1048 my ($p, $q) = ($res2_norm->{$field}, $res2->{$field});
1049
1050 if ($OPTS{raw}) {
1051 # Avoid annoying '-0.0' displays. Ideally this number
1052 # should never be negative, but fluctuations in
1053 # startup etc can theoretically make this happen
1054 $q = 0 if ($q <= 0 && $q > -0.1);
1055 $totals{$perl}{$field} += $q;
1056 $counts{$perl}{$field}++;
1057 $data{$test_name}{$perl}{$field} = $q;
1058 next;
1059 }
1060
1061 # $p and $q are notionally integer counts, but
1062 # due to variations in startup etc, it's possible for a
1063 # count which is supposedly zero to be calculated as a
1064 # small positive or negative value.
1065 # In this case, set it to zero. Further below we
1066 # special-case zeros to avoid division by zero errors etc.
1067
1068 $p = 0.0 if $p < 0.01;
1069 $q = 0.0 if $q < 0.01;
1070
1071 if ($p == 0.0 && $q == 0.0) {
1072 # Both perls gave a count of zero, so no change:
1073 # treat as 100%
1074 $totals{$perl}{$field} += 1;
1075 $counts{$perl}{$field}++;
1076 $data{$test_name}{$perl}{$field} = 1;
1077 }
1078 elsif ($p == 0.0 || $q == 0.0) {
1079 # If either count is zero, there were too few events
1080 # to give a meaningful ratio (and we will end up with
1081 # division by zero if we try). Mark the result undef,
1082 # indicating that it shouldn't be displayed; and skip
1083 # adding to the average
1084 $data{$test_name}{$perl}{$field} = undef;
1085 }
1086 else {
1087 # For averages, we record q/p rather than p/q.
1088 # Consider a test where perl_norm took 1000 cycles
1089 # and perlN took 800 cycles. For the individual
1090 # results we display p/q, or 1.25; i.e. a quarter
1091 # quicker. For the averages, we instead sum all
1092 # the 0.8's, which gives the total cycles required to
1093 # execute all tests, with all tests given equal
1094 # weight. Later we reciprocate the final result,
1095 # i.e. 1/(sum(qi/pi)/n)
1096
1097 $totals{$perl}{$field} += $q/$p;
1098 $counts{$perl}{$field}++;
1099 $data{$test_name}{$perl}{$field} = $p/$q;
1100 }
1101 }
1102 }
1103 }
1104
1105 # Calculate averages based on %totals and %counts accumulated earlier.
1106
1107 my %averages;
1108 for my $perl (keys %totals) {
1109 my $t = $totals{$perl};
1110 for my $field (keys %$t) {
1111 $averages{$perl}{$field} = $OPTS{raw}
1112 ? $t->{$field} / $counts{$perl}{$field}
1113 # reciprocal - see comments above
1114 : $counts{$perl}{$field} / $t->{$field};
1115 }
1116 }
1117
1118 return \%data, \%averages;
1119}
1120
1121
1122
1123# print a standard blurb at the start of the grind display
1124
1125sub grind_blurb {
1126 my ($perls) = @_;
1127
1128 print <<EOF;
1129Key:
1130 Ir Instruction read
1131 Dr Data read
1132 Dw Data write
1133 COND conditional branches
1134 IND indirect branches
1135 _m branch predict miss
1136 _m1 level 1 cache miss
1137 _mm last cache (e.g. L3) miss
1138 - indeterminate percentage (e.g. 1/0)
1139
1140EOF
1141
1142 if ($OPTS{raw}) {
1143 print "The numbers represent raw counts per loop iteration.\n";
1144 }
1145 else {
1146 print <<EOF;
1147The numbers represent relative counts per loop iteration, compared to
1148$perls->[$OPTS{norm}][1] at 100.0%.
1149Higher is better: for example, using half as many instructions gives 200%,
1150while using twice as many gives 50%.
1151EOF
1152 }
1153}
1154
1155
1156# return a sorted list of the test names, plus 'AVERAGE'
1157
1158sub sorted_test_names {
1159 my ($results, $order, $perls) = @_;
1160
1161 my @names;
1162 unless ($OPTS{average}) {
1163 if (defined $OPTS{'sort-field'}) {
1164 my ($field, $perlix) = @OPTS{'sort-field', 'sort-perl'};
1165 my $perl = $perls->[$perlix][1];
1166 @names = sort
1167 {
1168 $results->{$a}{$perl}{$field}
1169 <=> $results->{$b}{$perl}{$field}
1170 }
1171 keys %$results;
1172 }
1173 else {
1174 @names = grep $results->{$_}, @$order;
1175 }
1176 }
1177
1178 # No point in displaying average for only one test.
1179 push @names, 'AVERAGE' unless @names == 1;
1180 @names;
1181}
1182
1183
1184# grind_print(): display the tabulated results of all the cachegrinds.
1185#
1186# Arguments are of the form:
1187# $results->{benchmark_name}{perl_name}{field_name} = N
1188# $averages->{perl_name}{field_name} = M
1189# $perls = [ [ perl-exe, perl-label ], ... ]
1190# $tests->{test_name}{desc => ..., ...}
1191
1192sub grind_print {
1193 my ($results, $averages, $perls, $tests, $order) = @_;
1194
1195 my @perl_names = map $_->[0], @$perls;
1196 my @perl_labels = map $_->[1], @$perls;
1197 my %perl_labels;
1198 $perl_labels{$_->[0]} = $_->[1] for @$perls;
1199
1200 my $field_label_width = 6;
1201 # Calculate the width to display for each column.
1202 my $min_width = $OPTS{raw} ? 8 : 6;
1203 my @widths = map { length($_) < $min_width ? $min_width : length($_) }
1204 @perl_labels;
1205
1206 # Print standard header.
1207 grind_blurb($perls);
1208
1209 my @test_names = sorted_test_names($results, $order, $perls);
1210
1211 # If only a single field is to be displayed, use a more compact
1212 # format with only a single line of output per test.
1213
1214 my $one_field = defined $OPTS{fields} && keys(%{$OPTS{fields}}) == 1;
1215
1216 if ($one_field) {
1217 print "Results for field " . (keys(%{$OPTS{fields}}))[0] . ".\n";
1218
1219 # The first column will now contain test names rather than
1220 # field names; Calculate the max width.
1221
1222 $field_label_width = 0;
1223 for (@test_names) {
1224 $field_label_width = length if length > $field_label_width;
1225 }
1226
1227 # Print the perl executables header.
1228
1229 print "\n";
1230 for my $i (0,1) {
1231 print " " x $field_label_width;
1232 for (0..$#widths) {
1233 printf " %*s", $widths[$_],
1234 $i ? ('-' x$widths[$_]) : $perl_labels[$_];
1235 }
1236 print "\n";
1237 }
1238 }
1239
1240 # Dump the results for each test.
1241
1242 for my $test_name (@test_names) {
1243 my $doing_ave = ($test_name eq 'AVERAGE');
1244 my $res1 = $doing_ave ? $averages : $results->{$test_name};
1245
1246 unless ($one_field) {
1247 print "\n$test_name";
1248 print "\n$tests->{$test_name}{desc}" unless $doing_ave;
1249 print "\n\n";
1250
1251 # Print the perl executables header.
1252 for my $i (0,1) {
1253 print " " x $field_label_width;
1254 for (0..$#widths) {
1255 printf " %*s", $widths[$_],
1256 $i ? ('-' x$widths[$_]) : $perl_labels[$_];
1257 }
1258 print "\n";
1259 }
1260 }
1261
1262 for my $field (qw(Ir Dr Dw COND IND
1263 N
1264 COND_m IND_m
1265 N
1266 Ir_m1 Dr_m1 Dw_m1
1267 N
1268 Ir_mm Dr_mm Dw_mm
1269 ))
1270 {
1271 next if $OPTS{fields} and ! exists $OPTS{fields}{$field};
1272
1273 if ($field eq 'N') {
1274 print "\n";
1275 next;
1276 }
1277
1278 if ($one_field) {
1279 printf "%-*s", $field_label_width, $test_name;
1280 }
1281 else {
1282 printf "%*s", $field_label_width, $field;
1283 }
1284
1285 for my $i (0..$#widths) {
1286 my $res2 = $res1->{$perl_labels[$i]};
1287 my $p = $res2->{$field};
1288 if (!defined $p) {
1289 printf " %*s", $widths[$i], '-';
1290 }
1291 elsif ($OPTS{raw}) {
1292 printf " %*.1f", $widths[$i], $p;
1293 }
1294 else {
1295 printf " %*.2f", $widths[$i], $p * 100;
1296 }
1297 }
1298 print "\n";
1299 }
1300 }
1301}
1302
1303
1304
1305# grind_print_compact(): like grind_print(), but display a single perl
1306# in a compact form. Has an additional arg, $which_perl, which specifies
1307# which perl to display.
1308#
1309# Arguments are of the form:
1310# $results->{benchmark_name}{perl_name}{field_name} = N
1311# $averages->{perl_name}{field_name} = M
1312# $perls = [ [ perl-exe, perl-label ], ... ]
1313# $tests->{test_name}{desc => ..., ...}
1314
1315sub grind_print_compact {
1316 my ($results, $averages, $which_perl, $perls, $tests, $order) = @_;
1317
1318
1319 # the width to display for each column.
1320 my $width = $OPTS{raw} ? 7 : 6;
1321
1322 # Print standard header.
1323 grind_blurb($perls);
1324
1325 print "\nResults for $perls->[$which_perl][1]\n\n";
1326
1327 my @test_names = sorted_test_names($results, $order, $perls);
1328
1329 # Dump the results for each test.
1330
1331 my @fields = qw( Ir Dr Dw
1332 COND IND
1333 COND_m IND_m
1334 Ir_m1 Dr_m1 Dw_m1
1335 Ir_mm Dr_mm Dw_mm
1336 );
1337 if ($OPTS{fields}) {
1338 @fields = grep exists $OPTS{fields}{$_}, @fields;
1339 }
1340
1341 printf " %*s", $width, $_ for @fields;
1342 print "\n";
1343 printf " %*s", $width, '------' for @fields;
1344 print "\n";
1345
1346 for my $test_name (@test_names) {
1347 my $doing_ave = ($test_name eq 'AVERAGE');
1348 my $res = $doing_ave ? $averages : $results->{$test_name};
1349 $res = $res->{$perls->[$which_perl][1]};
1350
1351 for my $field (@fields) {
1352 my $p = $res->{$field};
1353 if (!defined $p) {
1354 printf " %*s", $width, '-';
1355 }
1356 elsif ($OPTS{raw}) {
1357 printf " %*.1f", $width, $p;
1358 }
1359 else {
1360 printf " %*.2f", $width, $p * 100;
1361 }
1362
1363 }
1364
1365 print " $test_name\n";
1366 }
1367}
1368
1369
1370# do_selftest(): check that we can parse known cachegrind()
1371# output formats. If the output of cachegrind changes, add a *new*
1372# test here; keep the old tests to make sure we continue to parse
1373# old cachegrinds
1374
1375sub do_selftest {
1376
1377 my @tests = (
1378 'standard',
1379 <<'EOF',
1380==32350== Cachegrind, a cache and branch-prediction profiler
1381==32350== Copyright (C) 2002-2013, and GNU GPL'd, by Nicholas Nethercote et al.
1382==32350== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
1383==32350== Command: perl5211o /tmp/uiS2gjdqe5 1
1384==32350==
1385--32350-- warning: L3 cache found, using its data for the LL simulation.
1386==32350==
1387==32350== I refs: 1,124,055
1388==32350== I1 misses: 5,573
1389==32350== LLi misses: 3,338
1390==32350== I1 miss rate: 0.49%
1391==32350== LLi miss rate: 0.29%
1392==32350==
1393==32350== D refs: 404,275 (259,191 rd + 145,084 wr)
1394==32350== D1 misses: 9,608 ( 6,098 rd + 3,510 wr)
1395==32350== LLd misses: 5,794 ( 2,781 rd + 3,013 wr)
1396==32350== D1 miss rate: 2.3% ( 2.3% + 2.4% )
1397==32350== LLd miss rate: 1.4% ( 1.0% + 2.0% )
1398==32350==
1399==32350== LL refs: 15,181 ( 11,671 rd + 3,510 wr)
1400==32350== LL misses: 9,132 ( 6,119 rd + 3,013 wr)
1401==32350== LL miss rate: 0.5% ( 0.4% + 2.0% )
1402==32350==
1403==32350== Branches: 202,372 (197,050 cond + 5,322 ind)
1404==32350== Mispredicts: 19,153 ( 17,742 cond + 1,411 ind)
1405==32350== Mispred rate: 9.4% ( 9.0% + 26.5% )
1406EOF
1407 {
1408 COND => 197050,
1409 COND_m => 17742,
1410 Dr => 259191,
1411 Dr_m1 => 6098,
1412 Dr_mm => 2781,
1413 Dw => 145084,
1414 Dw_m1 => 3510,
1415 Dw_mm => 3013,
1416 IND => 5322,
1417 IND_m => 1411,
1418 Ir => 1124055,
1419 Ir_m1 => 5573,
1420 Ir_mm => 3338,
1421 },
1422 );
1423
1424 for ('t', '.') {
1425 last if require "$_/test.pl";
1426 }
1427 plan(@tests / 3 * keys %VALID_FIELDS);
1428
1429 while (@tests) {
1430 my $desc = shift @tests;
1431 my $output = shift @tests;
1432 my $expected = shift @tests;
1433 my $p = parse_cachegrind($output);
1434 for (sort keys %VALID_FIELDS) {
1435 is($p->{$_}, $expected->{$_}, "$desc, $_");
1436 }
1437 }
1438}