This is a live mirror of the Perl 5 development currently hosted at https://github.com/perl/perl5
prevent mro.pm from loading DynaLoader
[perl5.git] / Porting / bench.pl
CommitLineData
9e7973fa
DM
1#!/usr/bin/perl
2#
3# A tool for analysing the performance of the code snippets found in
4# t/perf/benchmarks or similar
5
6
7=head1 NAME
8
9bench.pl - Compare the performance of perl code snippets across multiple
10perls.
11
12=head1 SYNOPSIS
13
4a1358c2
FC
14 # Basic: run the tests in t/perf/benchmarks against two or
15 # more perls
9e7973fa 16
32dfbb33
JC
17 bench.pl [options] -- perlA[=labelA] perlB[=labelB] ...
18
19 # run the tests against same perlA 2x, with and without extra
20 # options
21
22 bench.pl [options] -- perlA=fast PerlA=slow -Mstrict -Dpsltoc
9e7973fa
DM
23
24 # Run bench.pl's own built-in sanity tests
25
26 bench.pl --action=selftest
27
4044748b
YO
28 # Run bench on blead, which is then modified and timed again
29
30 bench.pl [options] --write=blead.time -- ./perl=blead
31 # hack hack hack
c3bb902a 32 bench.pl --read=blead.time -- ./perl=hacked
4044748b
YO
33
34 # You can also combine --read with --write
c3bb902a 35 bench.pl --read=blead.time --write=last.time -- ./perl=hacked
4044748b 36
9e7973fa
DM
37=head1 DESCRIPTION
38
39By default, F<bench.pl> will run code snippets found in
40F<t/perf/benchmarks> (or similar) under cachegrind, in order to calculate
41how many instruction reads, data writes, branches, cache misses, etc. that
42one execution of the snippet uses. It will run them against two or more
43perl executables and show how much each test has gotten better or worse.
44
45It is modelled on the F<perlbench> tool, but since it measures instruction
46reads etc., rather than timings, it is much more precise and reproducible.
e34630bf 47It is also considerably faster, and is capable of running tests in
9e7973fa
DM
48parallel (with C<-j>). Rather than displaying a single relative
49percentage per test/perl combination, it displays values for 13 different
50measurements, such as instruction reads, conditional branch misses etc.
51
52There are options to write the raw data to a file, and to read it back.
53This means that you can view the same run data in different views with
4044748b
YO
54different selection and sort options. You can also use this mechanism
55to save the results of timing one perl, and then read it back while timing
56a modification, so that you dont have rerun the same tests on the same
57perl over and over, or have two perls built at the same time.
9e7973fa
DM
58
59The optional C<=label> after each perl executable is used in the display
4044748b
YO
60output. If you are doing a two step benchmark then you should provide
61a label for at least the "base" perl.
9e7973fa
DM
62
63=head1 OPTIONS
64
65=over 4
66
67=item *
68
69--action=I<foo>
70
71What action to perform. The default is I<grind>, which runs the benchmarks
72using I<cachegrind> as the back end. The only other action at the moment is
73I<selftest>, which runs some basic sanity checks and produces TAP output.
74
75=item *
76
77--average
78
79Only display the overall average, rather than the results for each
80individual test.
81
82=item *
83
84--benchfile=I<foo>
85
86The path of the file which contains the benchmarks (F<t/perf/benchmarks>
87by default).
88
89=item *
90
91--bisect=I<field,minval,maxval>
92
93Run a single test against one perl and exit with a zero status if the
94named field is in the specified range; exit 1 otherwise. It will complain
95if more than one test or perl has been specified. It is intended to be
96called as part of a bisect run, to determine when something changed.
97For example,
98
4a1358c2
FC
99 bench.pl -j 8 --tests=foo --bisect=Ir,100,105 --perlargs=-Ilib \
100 ./miniperl
9e7973fa
DM
101
102might be called from bisect to find when the number of instruction reads
103for test I<foo> falls outside the range 100..105.
104
105=item *
106
df3d7b3a
DM
107--compact=<Iperl>
108
109Display the results for a single perl executable in a compact form.
110Which perl to display is specified in the same manner as C<--norm>.
111
112=item *
113
9e7973fa
DM
114--debug
115
116Enable verbose debugging output.
117
118=item *
119
120--fields=I<a,b,c>
121
122Display only the specified fields; for example,
123
124 --fields=Ir,Ir_m,Ir_mm
125
126If only one field is selected, the output is in more compact form.
127
128=item *
129
130--grindargs=I<foo>
131
8a094fee
JC
132Optional command-line arguments to pass to all cachegrind invocations.
133
134This option is appended to those which bench.pl uses for its own
135purposes; so it can be used to override them (see --debug output
136below), and can also be 'abused' to add redirects into the valgrind
137command invocation.
138
139For example, this writes PERL_MEM_LOG activity to foobar.$$, because
1403>foobar.$$ redirects fd 3, then perl under PERL_MEM_LOG writes to fd 3.
141
142 $ perl Porting/bench.pl --jobs=2 --verbose --debug \
143 --tests=call::sub::amp_empty \
144 \
145 --grindargs='--cachegrind-out-file=junk.$$ 3>foobar.$$' \
146 -- \
147 perl5.24.0 perl5.24.0:+memlog:PERL_MEM_LOG=3mst
148
149for the +memlog tests, this executes as: (shown via --debug, then prettyfied)
150
151 Command: PERL_HASH_SEED=0 PERL_MEM_LOG=3mst
152 valgrind --tool=cachegrind --branch-sim=yes
153 --cachegrind-out-file=/dev/null --cachegrind-out-file=junk.$$
154 3>foobar.$$ perl5.24.0 - 10 2>&1
155
156The result is that a set of junk.$$ files containing raw cachegrind
157output are written, and foobar.$$ contains the expected memlog output.
158
159Notes:
160
161Theres no obvious utility for those junk.$$ and foobar.$$ files, but
162you can have them anyway.
163
164The 3 in PERL_MEM_LOG=3mst is needed because the output would
165otherwize go to STDERR, and cause parse_cachegrind() to reject the
166test and die.
167
168The --grindargs redirect is needed to capture the memlog output;
169without it, the memlog output is written to fd3, around
170parse_cachegrind and effectively into /dev/null
171
172PERL_MEM_LOG is expensive when used.
173
174call::sub::amp_empty
175&foo function call with no args or body
176
177 perl5.24.0 perl5.24.0+memlog
178 ---------- -----------------
179 Ir 394.0 543477.5
180 Dr 161.0 146814.1
181 Dw 72.0 122304.6
182 COND 58.0 66796.4
183 IND 5.0 5537.7
184
185COND_m 0.0 6743.1
186 IND_m 5.0 1490.2
187
188 Ir_m1 0.0 683.7
189 Dr_m1 0.0 65.9
190 Dw_m1 0.0 8.5
191
192 Ir_mm 0.0 11.6
193 Dr_mm 0.0 10.6
194 Dw_mm 0.0 4.7
195
9e7973fa
DM
196
197=item *
198
199---help
200
201Display basic usage information.
202
203=item *
204
205-j I<N>
206--jobs=I<N>
207
208Run I<N> jobs in parallel (default 1). This determines how many cachegrind
209process will running at a time, and should generally be set to the number
210of CPUs available.
211
212=item *
213
214--norm=I<foo>
215
216Specify which perl column in the output to treat as the 100% norm.
217It may be a column number (0..N-1) or a perl executable name or label.
218It defaults to the leftmost column.
219
220=item *
221
222--perlargs=I<foo>
223
32dfbb33
JC
224Optional command-line arguments to pass to each perl-under-test
225(perlA, perlB in synopsis) For example, C<--perlargs=-Ilib>.
9e7973fa
DM
226
227=item *
228
229--raw
230
231Display raw data counts rather than percentages in the outputs. This
232allows you to see the exact number of intruction reads, branch misses etc.
233for each test/perl combination. It also causes the C<AVERAGE> display
234per field to be calculated based on the average of each tests's count
235rather than average of each percentage. This means that tests with very
236high counts will dominate.
237
238=item *
239
65f0b1c9 240--sort=I<field:perl>
9e7973fa
DM
241
242Order the tests in the output based on the value of I<field> in the
243column I<perl>. The I<perl> value is as per C<--norm>. For example
244
4a1358c2
FC
245 bench.pl --sort=Dw:perl-5.20.0 \
246 perl-5.16.0 perl-5.18.0 perl-5.20.0
9e7973fa
DM
247
248=item *
249
250-r I<file>
251--read=I<file>
252
253Read in saved data from a previous C<--write> run from the specified file.
254
255Requires C<JSON::PP> to be available.
256
257=item *
258
259--tests=I<FOO>
260
261Specify a subset of tests to run (or in the case of C<--read>, to display).
262It may be either a comma-separated list of test names, or a regular
263expression. For example
264
265 --tests=expr::assign::scalar_lex,expr::assign::2list_lex
266 --tests=/^expr::/
267
268=item *
269
270--verbose
271
272Display progress information.
273
274=item *
275
276-w I<file>
277--write=I<file>
278
279Save the raw data to the specified file. It can be read back later with
4044748b
YO
280C<--read>. If combined with C<--read> then the output file will be
281the merge of the file read and any additional perls added on the command
282line.
9e7973fa
DM
283
284Requires C<JSON::PP> to be available.
285
286=back
287
288=cut
289
290
291
292use 5.010000;
293use warnings;
294use strict;
d54523c4 295use Getopt::Long qw(:config no_auto_abbrev require_order);
9e7973fa
DM
296use IPC::Open2 ();
297use IO::Select;
c2d21e7a 298use IO::File;
9e7973fa
DM
299use POSIX ":sys_wait_h";
300
301# The version of the file format used to save data. We refuse to process
302# the file if the integer component differs.
303
304my $FORMAT_VERSION = 1.0;
305
306# The fields we know about
307
308my %VALID_FIELDS = map { $_ => 1 }
309 qw(Ir Ir_m1 Ir_mm Dr Dr_m1 Dr_mm Dw Dw_m1 Dw_mm COND COND_m IND IND_m);
310
311sub usage {
312 die <<EOF;
d54523c4 313usage: $0 [options] -- perl[=label] ...
9e7973fa
DM
314 --action=foo What action to perform [default: grind].
315 --average Only display average, not individual test results.
316 --benchfile=foo File containing the benchmarks;
317 [default: t/perf/benchmarks].
318 --bisect=f,min,max run a single test against one perl and exit with a
319 zero status if the named field is in the specified
320 range; exit 1 otherwise.
df3d7b3a
DM
321 --compact=perl Display the results of a single perl in compact form.
322 Which perl specified like --norm
9e7973fa
DM
323 --debug Enable verbose debugging output.
324 --fields=a,b,c Display only the specified fields (e.g. Ir,Ir_m,Ir_mm).
325 --grindargs=foo Optional command-line args to pass to cachegrind.
326 --help Display this help.
327 -j|--jobs=N Run N jobs in parallel [default 1].
328 --norm=perl Which perl column to treat as 100%; may be a column
329 number (0..N-1) or a perl executable name or label;
330 [default: 0].
331 --perlargs=foo Optional command-line args to pass to each perl to run.
332 --raw Display raw data counts rather than percentages.
5825b6d4 333 --show Show results even though we are going to write results.
65f0b1c9 334 --sort=field:perl Sort the tests based on the value of 'field' in the
9e7973fa
DM
335 column 'perl'. The perl value is as per --norm.
336 -r|--read=file Read in previously saved data from the specified file.
337 --tests=FOO Select only the specified tests from the benchmarks file;
338 FOO may be either of the form 'foo,bar' or '/regex/';
339 [default: all tests].
340 --verbose Display progress information.
341 -w|--write=file Save the raw data to the specified file.
342
343--action is one of:
344 grind run the code under cachegrind
345 selftest perform a selftest; produce TAP output
346
347The command line ends with one or more specified perl executables,
348which will be searched for in the current \$PATH. Each binary name may
349have an optional =LABEL appended, which will be used rather than the
350executable name in output. E.g.
351
352 perl-5.20.1=PRE-BUGFIX perl-5.20.1-new=POST-BUGFIX
353EOF
354}
355
356my %OPTS = (
357 action => 'grind',
358 average => 0,
359 benchfile => 't/perf/benchmarks',
360 bisect => undef,
df3d7b3a 361 compact => undef,
9e7973fa
DM
362 debug => 0,
363 grindargs => '',
364 fields => undef,
365 jobs => 1,
366 norm => 0,
367 perlargs => '',
368 raw => 0,
369 read => undef,
5825b6d4 370 show => 0,
9e7973fa
DM
371 sort => undef,
372 tests => undef,
373 verbose => 0,
374 write => undef,
375);
376
377
378# process command-line args and call top-level action
379
380{
381 GetOptions(
382 'action=s' => \$OPTS{action},
383 'average' => \$OPTS{average},
384 'benchfile=s' => \$OPTS{benchfile},
385 'bisect=s' => \$OPTS{bisect},
df3d7b3a 386 'compact=s' => \$OPTS{compact},
9e7973fa
DM
387 'debug' => \$OPTS{debug},
388 'grindargs=s' => \$OPTS{grindargs},
389 'help' => \$OPTS{help},
390 'fields=s' => \$OPTS{fields},
391 'jobs|j=i' => \$OPTS{jobs},
392 'norm=s' => \$OPTS{norm},
393 'perlargs=s' => \$OPTS{perlargs},
394 'raw' => \$OPTS{raw},
395 'read|r=s' => \$OPTS{read},
5825b6d4 396 'show!' => \$OPTS{show},
9e7973fa
DM
397 'sort=s' => \$OPTS{sort},
398 'tests=s' => \$OPTS{tests},
399 'verbose' => \$OPTS{verbose},
400 'write|w=s' => \$OPTS{write},
401 ) or usage;
402
403 usage if $OPTS{help};
404
405
9e7973fa
DM
406 if (defined $OPTS{read} or defined $OPTS{write}) {
407 # fail early if it's not present
408 require JSON::PP;
409 }
410
411 if (defined $OPTS{fields}) {
412 my @f = split /,/, $OPTS{fields};
413 for (@f) {
414 die "Error: --fields: unknown field '$_'\n"
415 unless $VALID_FIELDS{$_};
416 }
417 my %f = map { $_ => 1 } @f;
418 $OPTS{fields} = \%f;
419 }
420
421 my %valid_actions = qw(grind 1 selftest 1);
422 unless ($valid_actions{$OPTS{action}}) {
423 die "Error: unrecognised action '$OPTS{action}'\n"
424 . "must be one of: " . join(', ', sort keys %valid_actions)."\n";
425 }
426
427 if (defined $OPTS{sort}) {
428 my @s = split /:/, $OPTS{sort};
429 if (@s != 2) {
430 die "Error: --sort argument should be of the form field:perl: "
431 . "'$OPTS{sort}'\n";
432 }
433 my ($field, $perl) = @s;
434 die "Error: --sort: unknown field '$field\n"
435 unless $VALID_FIELDS{$field};
436 # the 'perl' value will be validated later, after we have processed
437 # the perls
438 $OPTS{'sort-field'} = $field;
439 $OPTS{'sort-perl'} = $perl;
440 }
441
442 if ($OPTS{action} eq 'selftest') {
443 if (@ARGV) {
444 die "Error: no perl executables may be specified with --read\n"
445 }
446 }
447 elsif (defined $OPTS{bisect}) {
448 die "Error: exactly one perl executable must be specified for bisect\n"
449 unless @ARGV == 1;
450 die "Error: Can't specify both --bisect and --read\n"
451 if defined $OPTS{read};
452 die "Error: Can't specify both --bisect and --write\n"
453 if defined $OPTS{write};
454 }
9e7973fa
DM
455
456 if ($OPTS{action} eq 'grind') {
457 do_grind(\@ARGV);
458 }
459 elsif ($OPTS{action} eq 'selftest') {
460 do_selftest();
461 }
462}
463exit 0;
464
465
466# Given a hash ref keyed by test names, filter it by deleting unwanted
467# tests, based on $OPTS{tests}.
468
469sub filter_tests {
470 my ($tests) = @_;
471
472 my $opt = $OPTS{tests};
473 return unless defined $opt;
474
475 my @tests;
476
477 if ($opt =~ m{^/}) {
478 $opt =~ s{^/(.+)/$}{$1}
479 or die "Error: --tests regex must be of the form /.../\n";
480 for (keys %$tests) {
481 delete $tests->{$_} unless /$opt/;
482 }
483 }
484 else {
485 my %t;
486 for (split /,/, $opt) {
24293753
JC
487 die "Error: no such test found: '$_'\n"
488 . ($OPTS{verbose} ? " have: @{[ sort keys %$tests ]}\n" : "")
489 unless exists $tests->{$_};
9e7973fa
DM
490 $t{$_} = 1;
491 }
492 for (keys %$tests) {
493 delete $tests->{$_} unless exists $t{$_};
494 }
495 }
4044748b 496 die "Error: no tests to run\n" unless %$tests;
9e7973fa
DM
497}
498
499
500# Read in the test file, and filter out any tests excluded by $OPTS{tests}
957d8930
DM
501# return a hash ref { testname => { test }, ... }
502# and an array ref of the original test names order,
9e7973fa
DM
503
504sub read_tests_file {
505 my ($file) = @_;
506
ea572010
DM
507 my $ta;
508 {
509 local @INC = ('.');
510 $ta = do $file;
511 }
9e7973fa
DM
512 unless ($ta) {
513 die "Error: can't parse '$file': $@\n" if $@;
514 die "Error: can't read '$file': $!\n";
515 }
516
957d8930
DM
517 my @orig_order;
518 for (my $i=0; $i < @$ta; $i += 2) {
519 push @orig_order, $ta->[$i];
520 }
521
9e7973fa
DM
522 my $t = { @$ta };
523 filter_tests($t);
957d8930 524 return $t, \@orig_order;
9e7973fa
DM
525}
526
527
528# Process the perl/column argument of options like --norm and --sort.
529# Return the index of the matching perl.
530
531sub select_a_perl {
532 my ($perl, $perls, $who) = @_;
5825b6d4 533 $perls||=[];
9e7973fa
DM
534 if ($perl =~ /^[0-9]$/) {
535 die "Error: $who value $perl outside range 0.." . $#$perls . "\n"
536 unless $perl < @$perls;
537 return $perl;
538 }
539 else {
540 my @perl = grep $perls->[$_][0] eq $perl
541 || $perls->[$_][1] eq $perl,
542 0..$#$perls;
543 die "Error: $who: unrecognised perl '$perl'\n"
544 unless @perl;
545 die "Error: $who: ambiguous perl '$perl'\n"
546 if @perl > 1;
547 return $perl[0];
548 }
549}
550
551
44b18357 552# Validate the list of perl=label (+ cmdline options) on the command line.
92683b8b
DM
553# Return a list of [ exe, label, cmdline-options ] tuples, i.e.
554# 'perl-under-test's (PUTs)
9e7973fa 555
d54523c4 556sub process_puts {
4044748b 557 my $read_perls= shift;
44b18357 558 my @res_puts; # returned, each item is [ perlexe, label, @putargs ]
4044748b 559 my %seen= map { $_->[1] => 1 } @$read_perls;
d54523c4
JC
560 my @putargs; # collect not-perls into args per PUT
561
562 for my $p (reverse @_) {
563 push @putargs, $p and next if $p =~ /^-/; # not-perl, dont send to qx//
564
60858fe8 565 my ($perl, $label, $env) = split /[=:,]/, $p, 3;
9e7973fa 566 $label //= $perl;
857716c9 567 $label = $perl.$label if $label =~ /^\+/;
92683b8b 568 die "$label cannot be used on 2 different perls under test\n" if $seen{$label}++;
955a736c 569
60858fe8
JC
570 my %env;
571 if ($env) {
572 %env = split /[=,]/, $env;
573 }
9e7973fa 574 my $r = qx($perl -e 'print qq(ok\n)' 2>&1);
d54523c4 575 if ($r eq "ok\n") {
60858fe8 576 push @res_puts, [ $perl, $label, \%env, reverse @putargs ];
d54523c4 577 @putargs = ();
44b18357
JC
578 warn "Added Perl-Under-Test: [ @{[@{$res_puts[-1]}]} ]\n"
579 if $OPTS{verbose};
d54523c4 580 } else {
92683b8b 581 warn "perl-under-test args: @putargs + a not-perl: $p $r\n"
44b18357 582 if $OPTS{verbose};
d54523c4
JC
583 push @putargs, $p; # not-perl
584 }
9e7973fa 585 }
44b18357 586 return reverse @res_puts;
9e7973fa
DM
587}
588
589
8fbd1c2c 590
9e7973fa
DM
591# Return a string containing perl test code wrapped in a loop
592# that runs $ARGV[0] times
593
594sub make_perl_prog {
595 my ($test, $desc, $setup, $code) = @_;
596
597 return <<EOF;
598# $desc
599package $test;
600BEGIN { srand(0) }
601$setup;
602for my \$__loop__ (1..\$ARGV[0]) {
603 $code;
604}
605EOF
606}
607
608
609# Parse the output from cachegrind. Return a hash ref.
610# See do_selftest() for examples of the output format.
611
612sub parse_cachegrind {
613 my ($output, $id, $perl) = @_;
614
615 my %res;
616
617 my @lines = split /\n/, $output;
618 for (@lines) {
619 unless (s/(==\d+==)|(--\d+--) //) {
620 die "Error: while executing $id:\n"
621 . "unexpected code or cachegrind output:\n$_\n";
622 }
623 if (/I refs:\s+([\d,]+)/) {
624 $res{Ir} = $1;
625 }
626 elsif (/I1 misses:\s+([\d,]+)/) {
627 $res{Ir_m1} = $1;
628 }
629 elsif (/LLi misses:\s+([\d,]+)/) {
630 $res{Ir_mm} = $1;
631 }
632 elsif (/D refs:\s+.*?([\d,]+) rd .*?([\d,]+) wr/) {
633 @res{qw(Dr Dw)} = ($1,$2);
634 }
635 elsif (/D1 misses:\s+.*?([\d,]+) rd .*?([\d,]+) wr/) {
636 @res{qw(Dr_m1 Dw_m1)} = ($1,$2);
637 }
638 elsif (/LLd misses:\s+.*?([\d,]+) rd .*?([\d,]+) wr/) {
639 @res{qw(Dr_mm Dw_mm)} = ($1,$2);
640 }
641 elsif (/Branches:\s+.*?([\d,]+) cond .*?([\d,]+) ind/) {
642 @res{qw(COND IND)} = ($1,$2);
643 }
644 elsif (/Mispredicts:\s+.*?([\d,]+) cond .*?([\d,]+) ind/) {
645 @res{qw(COND_m IND_m)} = ($1,$2);
646 }
647 }
648
649 for my $field (keys %VALID_FIELDS) {
650 die "Error: can't parse '$field' field from cachegrind output:\n$output"
651 unless exists $res{$field};
652 $res{$field} =~ s/,//g;
653 }
654
655 return \%res;
656}
657
658
659# Handle the 'grind' action
660
661sub do_grind {
662 my ($perl_args) = @_; # the residue of @ARGV after option processing
663
957d8930 664 my ($loop_counts, $perls, $results, $tests, $order);
9e7973fa
DM
665 my ($bisect_field, $bisect_min, $bisect_max);
666
667 if (defined $OPTS{bisect}) {
668 ($bisect_field, $bisect_min, $bisect_max) = split /,/, $OPTS{bisect}, 3;
669 die "Error: --bisect option must be of form 'field,integer,integer'\n"
670 unless
671 defined $bisect_max
672 and $bisect_min =~ /^[0-9]+$/
673 and $bisect_max =~ /^[0-9]+$/;
674
675 die "Error: unrecognised field '$bisect_field' in --bisect option\n"
676 unless $VALID_FIELDS{$bisect_field};
677
678 die "Error: --bisect min ($bisect_min) must be <= max ($bisect_max)\n"
679 if $bisect_min > $bisect_max;
680 }
681
4044748b 682 if ($OPTS{read}) {
9e7973fa 683 open my $in, '<:encoding(UTF-8)', $OPTS{read}
4044748b 684 or die " Error: can't open '$OPTS{read}' for reading: $!\n";
9e7973fa
DM
685 my $data = do { local $/; <$in> };
686 close $in;
687
688 my $hash = JSON::PP::decode_json($data);
689 if (int($FORMAT_VERSION) < int($hash->{version})) {
690 die "Error: unsupported version $hash->{version} in file"
691 . "'$OPTS{read}' (too new)\n";
692 }
957d8930
DM
693 ($loop_counts, $perls, $results, $tests, $order) =
694 @$hash{qw(loop_counts perls results tests order)};
9e7973fa
DM
695
696 filter_tests($results);
697 filter_tests($tests);
957d8930
DM
698
699 if (!$order) {
700 $order = [ sort keys %$tests ];
701 }
9e7973fa 702 }
9e7973fa 703
4044748b
YO
704 if (@$perl_args) {
705 unless ($loop_counts) {
706 # How many times to execute the loop for the two trials. The lower
707 # value is intended to do the loop enough times that branch
708 # prediction has taken hold; the higher loop allows us to see the
709 # branch misses after that
710 $loop_counts = [10, 20];
8fbd1c2c 711
4044748b
YO
712 ($tests, $order) = read_tests_file($OPTS{benchfile});
713 die "Error: only a single test may be specified with --bisect\n"
714 if defined $OPTS{bisect} and keys %$tests != 1;
715 }
8fbd1c2c 716
4044748b
YO
717 my @run_perls= process_puts($perls, @$perl_args);
718 push @$perls, @run_perls;
5825b6d4
YO
719 die "Error: Not enough perls to run a report, and --write not specified.\n"
720 if @$perls < 2 and !$OPTS{write};
4044748b 721 $results = grind_run($tests, $order, \@run_perls, $loop_counts, $results);
9e7973fa
DM
722 }
723
5825b6d4
YO
724 if (!$perls or !@$perls) {
725 die "Error: nothing to do: no perls to run, no data to read.\n";
726 }
8fbd1c2c
DM
727 # now that we have a list of perls, use it to process the
728 # 'perl' component of the --norm and --sort args
729
730 $OPTS{norm} = select_a_perl($OPTS{norm}, $perls, "--norm");
731 if (defined $OPTS{'sort-perl'}) {
732 $OPTS{'sort-perl'} =
733 select_a_perl($OPTS{'sort-perl'}, $perls, "--sort");
734 }
735
df3d7b3a
DM
736 if (defined $OPTS{'compact'}) {
737 $OPTS{'compact'} =
738 select_a_perl($OPTS{'compact'}, $perls, "--compact");
739 }
9e7973fa
DM
740 if (defined $OPTS{write}) {
741 my $json = JSON::PP::encode_json({
742 version => $FORMAT_VERSION,
743 loop_counts => $loop_counts,
744 perls => $perls,
745 results => $results,
746 tests => $tests,
957d8930 747 order => $order,
9e7973fa
DM
748 });
749
750 open my $out, '>:encoding(UTF-8)', $OPTS{write}
5825b6d4 751 or die "Error: can't open '$OPTS{write}' for writing: $!\n";
9e7973fa
DM
752 print $out $json or die "Error: writing to file '$OPTS{write}': $!\n";
753 close $out or die "Error: closing file '$OPTS{write}': $!\n";
754 }
5825b6d4
YO
755 if (!$OPTS{write} or $OPTS{show}) {
756 if (@$perls < 2) {
757 die "Error: need more than one perl to do a report.\n";
758 }
9e7973fa
DM
759 my ($processed, $averages) =
760 grind_process($results, $perls, $loop_counts);
761
762 if (defined $OPTS{bisect}) {
763 my @r = values %$results;
764 die "Panic: expected exactly one test result in bisect\n"
765 if @r != 1;
766 @r = values %{$r[0]};
767 die "Panic: expected exactly one perl result in bisect\n"
768 if @r != 1;
769 my $c = $r[0]{$bisect_field};
770 die "Panic: no result in bisect for field '$bisect_field'\n"
771 unless defined $c;
772 exit 0 if $bisect_min <= $c and $c <= $bisect_max;
773 exit 1;
774 }
df3d7b3a
DM
775 elsif (defined $OPTS{compact}) {
776 grind_print_compact($processed, $averages, $OPTS{compact},
777 $perls, $tests, $order);
778 }
9e7973fa 779 else {
957d8930 780 grind_print($processed, $averages, $perls, $tests, $order);
9e7973fa
DM
781 }
782 }
783}
784
785
786# Run cachegrind for every test/perl combo.
787# It may run several processes in parallel when -j is specified.
788# Return a hash ref suitable for input to grind_process()
789
790sub grind_run {
4044748b 791 my ($tests, $order, $perls, $counts, $results) = @_;
9e7973fa
DM
792
793 # Build a list of all the jobs to run
794
795 my @jobs;
796
957d8930 797 for my $test (grep $tests->{$_}, @$order) {
9e7973fa
DM
798
799 # Create two test progs: one with an empty loop and one with code.
800 # Note that the empty loop is actually '{1;}' rather than '{}';
801 # this causes the loop to have a single nextstate rather than a
802 # stub op, so more closely matches the active loop; e.g.:
803 # {1;} => nextstate; unstack
804 # {$x=1;} => nextstate; const; gvsv; sassign; unstack
805 my @prog = (
806 make_perl_prog($test, @{$tests->{$test}}{qw(desc setup)}, '1'),
807 make_perl_prog($test, @{$tests->{$test}}{qw(desc setup code)}),
808 );
809
810 for my $p (@$perls) {
60858fe8 811 my ($perl, $label, $env, @putargs) = @$p;
9e7973fa
DM
812
813 # Run both the empty loop and the active loop
814 # $counts->[0] and $counts->[1] times.
815
816 for my $i (0,1) {
817 for my $j (0,1) {
60858fe8
JC
818 my $envstr = '';
819 if (ref $env) {
820 $envstr .= "$_=$env->{$_} " for sort keys %$env;
821 }
822 my $cmd = "PERL_HASH_SEED=0 $envstr"
9e7973fa
DM
823 . "valgrind --tool=cachegrind --branch-sim=yes "
824 . "--cachegrind-out-file=/dev/null "
825 . "$OPTS{grindargs} "
a9eceb2d 826 . "$perl $OPTS{perlargs} @putargs - $counts->[$j] 2>&1";
9e7973fa 827 # for debugging and error messages
c385646f 828 my $id = "$test/$label "
9e7973fa
DM
829 . ($i ? "active" : "empty") . "/"
830 . ($j ? "long" : "short") . " loop";
831
832 push @jobs, {
833 test => $test,
834 perl => $perl,
835 plabel => $label,
836 cmd => $cmd,
837 prog => $prog[$i],
838 active => $i,
839 loopix => $j,
840 id => $id,
841 };
842 }
843 }
844 }
845 }
846
847 # Execute each cachegrind and store the results in %results.
848
849 local $SIG{PIPE} = 'IGNORE';
850
851 my $max_jobs = $OPTS{jobs};
852 my $running = 0; # count of executing jobs
853 my %pids; # map pids to jobs
854 my %fds; # map fds to jobs
9e7973fa
DM
855 my $select = IO::Select->new();
856
857 while (@jobs or $running) {
858
859 if ($OPTS{debug}) {
860 printf "Main loop: pending=%d running=%d\n",
861 scalar(@jobs), $running;
862 }
863
864 # Start new jobs
865
866 while (@jobs && $running < $max_jobs) {
867 my $job = shift @jobs;
868 my ($id, $cmd) =@$job{qw(id cmd)};
869
870 my ($in, $out, $pid);
871 warn "Starting $id\n" if $OPTS{verbose};
872 eval { $pid = IPC::Open2::open2($out, $in, $cmd); 1; }
873 or die "Error: while starting cachegrind subprocess"
874 ." for $id:\n$@";
875 $running++;
876 $pids{$pid} = $job;
877 $fds{"$out"} = $job;
878 $job->{out_fd} = $out;
879 $job->{output} = '';
880 $job->{pid} = $pid;
881
882 $out->blocking(0);
883 $select->add($out);
884
885 if ($OPTS{debug}) {
886 print "Started pid $pid for $id\n";
887 }
888
889 # Note:
890 # In principle we should write to $in in the main select loop,
891 # since it may block. In reality,
892 # a) the code we write to the perl process's stdin is likely
893 # to be less than the OS's pipe buffer size;
894 # b) by the time the perl process has read in all its stdin,
895 # the only output it should have generated is a few lines
896 # of cachegrind output preamble.
897 # If these assumptions change, then perform the following print
898 # in the select loop instead.
899
900 print $in $job->{prog};
901 close $in;
902 }
903
904 # Get output of running jobs
905
906 if ($OPTS{debug}) {
907 printf "Select: waiting on (%s)\n",
908 join ', ', sort { $a <=> $b } map $fds{$_}{pid},
909 $select->handles;
910 }
911
912 my @ready = $select->can_read;
913
914 if ($OPTS{debug}) {
915 printf "Select: pids (%s) ready\n",
916 join ', ', sort { $a <=> $b } map $fds{$_}{pid}, @ready;
917 }
918
919 unless (@ready) {
920 die "Panic: select returned no file handles\n";
921 }
922
923 for my $fd (@ready) {
924 my $j = $fds{"$fd"};
925 my $r = sysread $fd, $j->{output}, 8192, length($j->{output});
926 unless (defined $r) {
927 die "Panic: Read from process running $j->{id} gave:\n$!";
928 }
929 next if $r;
930
931 # EOF
932
933 if ($OPTS{debug}) {
934 print "Got eof for pid $fds{$fd}{pid} ($j->{id})\n";
935 }
936
937 $select->remove($j->{out_fd});
938 close($j->{out_fd})
939 or die "Panic: closing output fh on $j->{id} gave:\n$!\n";
940 $running--;
941 delete $fds{"$j->{out_fd}"};
942 my $output = $j->{output};
943
944 if ($OPTS{debug}) {
945 my $p = $j->{prog};
946 $p =~ s/^/ : /mg;
947 my $o = $output;
948 $o =~ s/^/ : /mg;
949
950 print "\n$j->{id}/\nCommand: $j->{cmd}\n"
951 . "Input:\n$p"
952 . "Output\n$o";
953 }
954
4044748b 955 $results->{$j->{test}}{$j->{plabel}}[$j->{active}][$j->{loopix}]
9e7973fa
DM
956 = parse_cachegrind($output, $j->{id}, $j->{perl});
957 }
958
959 # Reap finished jobs
960
961 while (1) {
962 my $kid = waitpid(-1, WNOHANG);
963 my $ret = $?;
964 last if $kid <= 0;
965
966 unless (exists $pids{$kid}) {
967 die "Panic: reaped unexpected child $kid";
968 }
969 my $j = $pids{$kid};
970 if ($ret) {
971 die sprintf("Error: $j->{id} gave return status 0x%04x\n", $ret)
972 . "with the following output\n:$j->{output}\n";
973 }
974 delete $pids{$kid};
975 }
976 }
977
4044748b 978 return $results;
9e7973fa
DM
979}
980
981
982
983
984# grind_process(): process the data that has been extracted from
985# cachgegrind's output.
986#
987# $res is of the form ->{benchmark_name}{perl_name}[active][count]{field_name},
988# where active is 0 or 1 indicating an empty or active loop,
989# count is 0 or 1 indicating a short or long loop. E.g.
990#
991# $res->{'expr::assign::scalar_lex'}{perl-5.21.1}[0][10]{Dw_mm}
992#
993# The $res data structure is modified in-place by this sub.
994#
995# $perls is [ [ perl-exe, perl-label], .... ].
996#
997# $counts is [ N, M ] indicating the counts for the short and long loops.
998#
999#
1000# return \%output, \%averages, where
1001#
1002# $output{benchmark_name}{perl_name}{field_name} = N
1003# $averages{perl_name}{field_name} = M
1004#
1005# where N is the raw count ($OPTS{raw}), or count_perl0/count_perlI otherwise;
1006# M is the average raw count over all tests ($OPTS{raw}), or
1007# 1/(sum(count_perlI/count_perl0)/num_tests) otherwise.
1008
1009sub grind_process {
1010 my ($res, $perls, $counts) = @_;
1011
1012 # Process the four results for each test/perf combo:
1013 # Convert
1014 # $res->{benchmark_name}{perl_name}[active][count]{field_name} = n
1015 # to
1016 # $res->{benchmark_name}{perl_name}{field_name} = averaged_n
1017 #
1018 # $r[0][1] - $r[0][0] is the time to do ($counts->[1]-$counts->[0])
1019 # empty loops, eliminating startup time
1020 # $r[1][1] - $r[1][0] is the time to do ($counts->[1]-$counts->[0])
1021 # active loops, eliminating startup time
1022 # (the two startup times may be different because different code
1023 # is being compiled); the difference of the two results above
1024 # divided by the count difference is the time to execute the
1025 # active code once, eliminating both startup and loop overhead.
1026
1027 for my $tests (values %$res) {
1028 for my $r (values %$tests) {
1029 my $r2;
1030 for (keys %{$r->[0][0]}) {
1031 my $n = ( ($r->[1][1]{$_} - $r->[1][0]{$_})
1032 - ($r->[0][1]{$_} - $r->[0][0]{$_})
1033 ) / ($counts->[1] - $counts->[0]);
1034 $r2->{$_} = $n;
1035 }
1036 $r = $r2;
1037 }
1038 }
1039
1040 my %totals;
1041 my %counts;
1042 my %data;
1043
1a961f9f 1044 my $perl_norm = $perls->[$OPTS{norm}][1]; # the label of the reference perl
9e7973fa
DM
1045
1046 for my $test_name (keys %$res) {
1047 my $res1 = $res->{$test_name};
1048 my $res2_norm = $res1->{$perl_norm};
1049 for my $perl (keys %$res1) {
1050 my $res2 = $res1->{$perl};
1051 for my $field (keys %$res2) {
1052 my ($p, $q) = ($res2_norm->{$field}, $res2->{$field});
1053
1054 if ($OPTS{raw}) {
1055 # Avoid annoying '-0.0' displays. Ideally this number
1056 # should never be negative, but fluctuations in
1057 # startup etc can theoretically make this happen
1058 $q = 0 if ($q <= 0 && $q > -0.1);
1059 $totals{$perl}{$field} += $q;
1060 $counts{$perl}{$field}++;
1061 $data{$test_name}{$perl}{$field} = $q;
1062 next;
1063 }
1064
1065 # $p and $q are notionally integer counts, but
1066 # due to variations in startup etc, it's possible for a
1067 # count which is supposedly zero to be calculated as a
1068 # small positive or negative value.
1069 # In this case, set it to zero. Further below we
1070 # special-case zeros to avoid division by zero errors etc.
1071
1072 $p = 0.0 if $p < 0.01;
1073 $q = 0.0 if $q < 0.01;
1074
1075 if ($p == 0.0 && $q == 0.0) {
1076 # Both perls gave a count of zero, so no change:
1077 # treat as 100%
1078 $totals{$perl}{$field} += 1;
1079 $counts{$perl}{$field}++;
1080 $data{$test_name}{$perl}{$field} = 1;
1081 }
1082 elsif ($p == 0.0 || $q == 0.0) {
1083 # If either count is zero, there were too few events
1084 # to give a meaningful ratio (and we will end up with
1085 # division by zero if we try). Mark the result undef,
1086 # indicating that it shouldn't be displayed; and skip
1087 # adding to the average
1088 $data{$test_name}{$perl}{$field} = undef;
1089 }
1090 else {
1091 # For averages, we record q/p rather than p/q.
1092 # Consider a test where perl_norm took 1000 cycles
1093 # and perlN took 800 cycles. For the individual
1094 # results we display p/q, or 1.25; i.e. a quarter
1095 # quicker. For the averages, we instead sum all
1096 # the 0.8's, which gives the total cycles required to
1097 # execute all tests, with all tests given equal
1098 # weight. Later we reciprocate the final result,
1099 # i.e. 1/(sum(qi/pi)/n)
1100
1101 $totals{$perl}{$field} += $q/$p;
1102 $counts{$perl}{$field}++;
1103 $data{$test_name}{$perl}{$field} = $p/$q;
1104 }
1105 }
1106 }
1107 }
1108
1109 # Calculate averages based on %totals and %counts accumulated earlier.
1110
1111 my %averages;
1112 for my $perl (keys %totals) {
1113 my $t = $totals{$perl};
1114 for my $field (keys %$t) {
1115 $averages{$perl}{$field} = $OPTS{raw}
1116 ? $t->{$field} / $counts{$perl}{$field}
1117 # reciprocal - see comments above
1118 : $counts{$perl}{$field} / $t->{$field};
1119 }
1120 }
1121
1122 return \%data, \%averages;
1123}
1124
1125
9e7973fa 1126
df3d7b3a 1127# print a standard blurb at the start of the grind display
9e7973fa 1128
df3d7b3a
DM
1129sub grind_blurb {
1130 my ($perls) = @_;
9e7973fa
DM
1131
1132 print <<EOF;
1133Key:
1134 Ir Instruction read
1135 Dr Data read
1136 Dw Data write
1137 COND conditional branches
1138 IND indirect branches
1139 _m branch predict miss
1140 _m1 level 1 cache miss
1141 _mm last cache (e.g. L3) miss
1142 - indeterminate percentage (e.g. 1/0)
1143
1144EOF
1145
1146 if ($OPTS{raw}) {
1147 print "The numbers represent raw counts per loop iteration.\n";
1148 }
1149 else {
1150 print <<EOF;
1151The numbers represent relative counts per loop iteration, compared to
df3d7b3a 1152$perls->[$OPTS{norm}][1] at 100.0%.
9e7973fa
DM
1153Higher is better: for example, using half as many instructions gives 200%,
1154while using twice as many gives 50%.
1155EOF
1156 }
df3d7b3a
DM
1157}
1158
1159
1160# return a sorted list of the test names, plus 'AVERAGE'
9e7973fa 1161
df3d7b3a
DM
1162sub sorted_test_names {
1163 my ($results, $order, $perls) = @_;
9e7973fa 1164
df3d7b3a 1165 my @names;
9e7973fa
DM
1166 unless ($OPTS{average}) {
1167 if (defined $OPTS{'sort-field'}) {
1168 my ($field, $perlix) = @OPTS{'sort-field', 'sort-perl'};
beb8db25 1169 my $perl = $perls->[$perlix][1];
df3d7b3a 1170 @names = sort
9e7973fa
DM
1171 {
1172 $results->{$a}{$perl}{$field}
1173 <=> $results->{$b}{$perl}{$field}
1174 }
1175 keys %$results;
1176 }
1177 else {
df3d7b3a 1178 @names = grep $results->{$_}, @$order;
9e7973fa
DM
1179 }
1180 }
1181
1182 # No point in displaying average for only one test.
df3d7b3a
DM
1183 push @names, 'AVERAGE' unless @names == 1;
1184 @names;
1185}
1186
1187
1188# grind_print(): display the tabulated results of all the cachegrinds.
1189#
1190# Arguments are of the form:
1191# $results->{benchmark_name}{perl_name}{field_name} = N
1192# $averages->{perl_name}{field_name} = M
1193# $perls = [ [ perl-exe, perl-label ], ... ]
1194# $tests->{test_name}{desc => ..., ...}
1195
1196sub grind_print {
1197 my ($results, $averages, $perls, $tests, $order) = @_;
1198
1199 my @perl_names = map $_->[0], @$perls;
1a961f9f 1200 my @perl_labels = map $_->[1], @$perls;
df3d7b3a
DM
1201 my %perl_labels;
1202 $perl_labels{$_->[0]} = $_->[1] for @$perls;
1203
1204 my $field_label_width = 6;
1205 # Calculate the width to display for each column.
1206 my $min_width = $OPTS{raw} ? 8 : 6;
1207 my @widths = map { length($_) < $min_width ? $min_width : length($_) }
1a961f9f 1208 @perl_labels;
df3d7b3a
DM
1209
1210 # Print standard header.
1211 grind_blurb($perls);
1212
1213 my @test_names = sorted_test_names($results, $order, $perls);
9e7973fa
DM
1214
1215 # If only a single field is to be displayed, use a more compact
1216 # format with only a single line of output per test.
1217
1218 my $one_field = defined $OPTS{fields} && keys(%{$OPTS{fields}}) == 1;
1219
1220 if ($one_field) {
91cde97c 1221 print "Results for field " . (keys(%{$OPTS{fields}}))[0] . ".\n";
9e7973fa
DM
1222
1223 # The first column will now contain test names rather than
1224 # field names; Calculate the max width.
1225
1226 $field_label_width = 0;
1227 for (@test_names) {
1228 $field_label_width = length if length > $field_label_width;
1229 }
1230
1231 # Print the perl executables header.
1232
1233 print "\n";
1234 for my $i (0,1) {
1235 print " " x $field_label_width;
1236 for (0..$#widths) {
1237 printf " %*s", $widths[$_],
1a961f9f 1238 $i ? ('-' x$widths[$_]) : $perl_labels[$_];
9e7973fa
DM
1239 }
1240 print "\n";
1241 }
1242 }
1243
1244 # Dump the results for each test.
1245
1246 for my $test_name (@test_names) {
1247 my $doing_ave = ($test_name eq 'AVERAGE');
1248 my $res1 = $doing_ave ? $averages : $results->{$test_name};
1249
1250 unless ($one_field) {
1251 print "\n$test_name";
1252 print "\n$tests->{$test_name}{desc}" unless $doing_ave;
1253 print "\n\n";
1254
1255 # Print the perl executables header.
1256 for my $i (0,1) {
1257 print " " x $field_label_width;
1258 for (0..$#widths) {
1259 printf " %*s", $widths[$_],
1a961f9f 1260 $i ? ('-' x$widths[$_]) : $perl_labels[$_];
9e7973fa
DM
1261 }
1262 print "\n";
1263 }
1264 }
1265
1266 for my $field (qw(Ir Dr Dw COND IND
1267 N
1268 COND_m IND_m
1269 N
1270 Ir_m1 Dr_m1 Dw_m1
1271 N
1272 Ir_mm Dr_mm Dw_mm
1273 ))
1274 {
1275 next if $OPTS{fields} and ! exists $OPTS{fields}{$field};
1276
1277 if ($field eq 'N') {
1278 print "\n";
1279 next;
1280 }
1281
91cde97c
DM
1282 if ($one_field) {
1283 printf "%-*s", $field_label_width, $test_name;
1284 }
1285 else {
1286 printf "%*s", $field_label_width, $field;
1287 }
9e7973fa
DM
1288
1289 for my $i (0..$#widths) {
1a961f9f 1290 my $res2 = $res1->{$perl_labels[$i]};
9e7973fa
DM
1291 my $p = $res2->{$field};
1292 if (!defined $p) {
1293 printf " %*s", $widths[$i], '-';
1294 }
1295 elsif ($OPTS{raw}) {
1296 printf " %*.1f", $widths[$i], $p;
1297 }
1298 else {
1299 printf " %*.2f", $widths[$i], $p * 100;
1300 }
1301 }
1302 print "\n";
1303 }
1304 }
1305}
1306
1307
df3d7b3a
DM
1308
1309# grind_print_compact(): like grind_print(), but display a single perl
1310# in a compact form. Has an additional arg, $which_perl, which specifies
1311# which perl to display.
1312#
1313# Arguments are of the form:
1314# $results->{benchmark_name}{perl_name}{field_name} = N
1315# $averages->{perl_name}{field_name} = M
1316# $perls = [ [ perl-exe, perl-label ], ... ]
1317# $tests->{test_name}{desc => ..., ...}
1318
1319sub grind_print_compact {
1320 my ($results, $averages, $which_perl, $perls, $tests, $order) = @_;
1321
1322
1323 # the width to display for each column.
1324 my $width = $OPTS{raw} ? 7 : 6;
1325
1326 # Print standard header.
1327 grind_blurb($perls);
1328
1329 print "\nResults for $perls->[$which_perl][1]\n\n";
1330
1331 my @test_names = sorted_test_names($results, $order, $perls);
1332
1333 # Dump the results for each test.
1334
1335 my @fields = qw( Ir Dr Dw
1336 COND IND
1337 COND_m IND_m
1338 Ir_m1 Dr_m1 Dw_m1
1339 Ir_mm Dr_mm Dw_mm
1340 );
1341 if ($OPTS{fields}) {
1342 @fields = grep exists $OPTS{fields}{$_}, @fields;
1343 }
1344
1345 printf " %*s", $width, $_ for @fields;
1346 print "\n";
1347 printf " %*s", $width, '------' for @fields;
1348 print "\n";
1349
1350 for my $test_name (@test_names) {
1351 my $doing_ave = ($test_name eq 'AVERAGE');
1352 my $res = $doing_ave ? $averages : $results->{$test_name};
beb8db25 1353 $res = $res->{$perls->[$which_perl][1]};
df3d7b3a
DM
1354
1355 for my $field (@fields) {
1356 my $p = $res->{$field};
1357 if (!defined $p) {
1358 printf " %*s", $width, '-';
1359 }
1360 elsif ($OPTS{raw}) {
1361 printf " %*.1f", $width, $p;
1362 }
1363 else {
1364 printf " %*.2f", $width, $p * 100;
1365 }
1366
1367 }
1368
1369 print " $test_name\n";
1370 }
1371}
1372
1373
9e7973fa
DM
1374# do_selftest(): check that we can parse known cachegrind()
1375# output formats. If the output of cachegrind changes, add a *new*
1376# test here; keep the old tests to make sure we continue to parse
1377# old cachegrinds
1378
1379sub do_selftest {
1380
1381 my @tests = (
1382 'standard',
1383 <<'EOF',
1384==32350== Cachegrind, a cache and branch-prediction profiler
1385==32350== Copyright (C) 2002-2013, and GNU GPL'd, by Nicholas Nethercote et al.
1386==32350== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
1387==32350== Command: perl5211o /tmp/uiS2gjdqe5 1
1388==32350==
1389--32350-- warning: L3 cache found, using its data for the LL simulation.
1390==32350==
1391==32350== I refs: 1,124,055
1392==32350== I1 misses: 5,573
1393==32350== LLi misses: 3,338
1394==32350== I1 miss rate: 0.49%
1395==32350== LLi miss rate: 0.29%
1396==32350==
1397==32350== D refs: 404,275 (259,191 rd + 145,084 wr)
1398==32350== D1 misses: 9,608 ( 6,098 rd + 3,510 wr)
1399==32350== LLd misses: 5,794 ( 2,781 rd + 3,013 wr)
1400==32350== D1 miss rate: 2.3% ( 2.3% + 2.4% )
1401==32350== LLd miss rate: 1.4% ( 1.0% + 2.0% )
1402==32350==
1403==32350== LL refs: 15,181 ( 11,671 rd + 3,510 wr)
1404==32350== LL misses: 9,132 ( 6,119 rd + 3,013 wr)
1405==32350== LL miss rate: 0.5% ( 0.4% + 2.0% )
1406==32350==
1407==32350== Branches: 202,372 (197,050 cond + 5,322 ind)
1408==32350== Mispredicts: 19,153 ( 17,742 cond + 1,411 ind)
1409==32350== Mispred rate: 9.4% ( 9.0% + 26.5% )
1410EOF
1411 {
1412 COND => 197050,
1413 COND_m => 17742,
1414 Dr => 259191,
1415 Dr_m1 => 6098,
1416 Dr_mm => 2781,
1417 Dw => 145084,
1418 Dw_m1 => 3510,
1419 Dw_mm => 3013,
1420 IND => 5322,
1421 IND_m => 1411,
1422 Ir => 1124055,
1423 Ir_m1 => 5573,
1424 Ir_mm => 3338,
1425 },
1426 );
1427
5051ccfe
DM
1428 for ('./t', '.') {
1429 my $t = "$_/test.pl";
1430 next unless -f $t;
1431 require $t;
9e7973fa
DM
1432 }
1433 plan(@tests / 3 * keys %VALID_FIELDS);
1434
1435 while (@tests) {
1436 my $desc = shift @tests;
1437 my $output = shift @tests;
1438 my $expected = shift @tests;
1439 my $p = parse_cachegrind($output);
1440 for (sort keys %VALID_FIELDS) {
1441 is($p->{$_}, $expected->{$_}, "$desc, $_");
1442 }
1443 }
1444}