FIXME this can't go into blead as is
Optimization to make the Schwartzian transform faster
This patch consists of two conceptually separate but technically
entwined changes. In a nutshell, the idea is to detect the use of an
array dereference for sorting on precomputed data. This means sort
invocations like the following:
    sort { $a->[0] <=> $b->[0] }
and
    sort { $a->[0] cmp $b->[0] }
as well as the reversed cases.
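For context, a minimal sketch of the idiom in question. The input list
and the choice of length() as the sort key are illustrative assumptions,
not taken from the patch or its tests:

```perl
use strict;
use warnings;

# Hypothetical input: words to be sorted by length (all lengths distinct).
my @words = qw(pear fig banana apple);

# Schwartzian transform: compute the sort key once per element,
# sort on element 0 of each [key, value] pair, then unwrap the value.
my @sorted =
    map  { $_->[1] }                 # unwrap the original value
    sort { $a->[0] <=> $b->[0] }     # sort block of the kind described above
    map  { [ length($_), $_ ] }      # wrap as [key, value]
    @words;

print "@sorted\n";    # fig pear apple banana
```

The middle stage is the only part this optimization is concerned with;
the two map stages are ordinary Perl.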
The two major changes are:
1) Detection of basic Schwartzian transform-style sort blocks
and removal of the corresponding bits of the OP tree. This sets a new
flag on the sort OP, which indicates that the sort needs to be performed
by one of the special-case optimized C functions in pp_sort.
Right now, we only match array dereferencing using the constant 0.
Technically, it is trivial to match any other constant, and feasible to
match the use of a variable. But I could not think of a good way to pass
the array index to pp_sort.
2) Implementation of special-purpose sort functions that do
array-dereferencing. This change implements those functions for
numeric, integer, and string sorts, as well as for overloaded values
in the arrays and for locale-string sorts.
It supports overload magic on the inner SVs (see the tests added in the
next commit). I believe it also handles the locale cmp case in the same
way as the normal optimized sort functions do, but since I don't properly
understand locales, the tests don't cover that.
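To illustrate the overloaded case, here is a sketch of a sort whose
inner keys are objects with an overloaded <=>. The Weight class and the
data are hypothetical and are not the tests from the next commit. The
result should be the same with or without the optimization:

```perl
use strict;
use warnings;

# Hypothetical key class; its objects compare via overloaded <=>.
package Weight {
    use overload
        '<=>' => sub {
            my ($x, $y, $swapped) = @_;
            my $r = $x->{grams} <=> (ref $y ? $y->{grams} : $y);
            return $swapped ? -$r : $r;
        },
        '""' => sub { $_[0]{grams} . 'g' };

    sub new { my ($class, $grams) = @_; return bless { grams => $grams }, $class }
}

# [key, value] pairs where the key in slot 0 is an overloaded object.
my @items = (
    [ Weight->new(300), 'melon' ],
    [ Weight->new(5),   'grape' ],
    [ Weight->new(120), 'apple' ],
);

# The comparison must go through Weight's <=> rather than comparing
# the references numerically.
my @sorted = map { $_->[1] } sort { $a->[0] <=> $b->[0] } @items;

print "@sorted\n";    # grape apple melon
```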
The net effect is a consistent speed-up of over 2x on my machine when
doing a Schwartzian transform or a similar technique that sorts on the
first element of an array. I tried a variety of array sizes, both with
and without the two map stages of the idiom.
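One way to get a rough measurement is sketched below. It is not the
benchmark behind the numbers above. It assumes that, since only the
constant index 0 is matched, a sort on index 1 takes the ordinary code
path and can serve as a baseline on a patched perl. The data size and
iteration count are arbitrary:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: 10_000 pairs carrying the same random key in both
# slots, so that sorting on index 0 and index 1 does the same work.
my @pairs = map { my $k = int(rand(1_000_000)); [ $k, $k ] } 1 .. 10_000;

# Sanity check: both sort blocks must yield the same key sequence.
my @by0 = map { $_->[0] } sort { $a->[0] <=> $b->[0] } @pairs;
my @by1 = map { $_->[1] } sort { $a->[1] <=> $b->[1] } @pairs;
die "sort orders differ" unless "@by0" eq "@by1";

# Run each variant 100 times and print a comparison table.
cmpthese(100, {
    index_0 => sub { my @s = sort { $a->[0] <=> $b->[0] } @pairs },
    index_1 => sub { my @s = sort { $a->[1] <=> $b->[1] } @pairs },
});
```

On an unpatched perl the two rows should come out roughly equal.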