[MERGE] various tr/// fixups, esp for /c and /d
author David Mitchell <davem@iabyn.com>
Fri, 19 Jan 2018 14:08:28 +0000 (14:08 +0000)
committer David Mitchell <davem@iabyn.com>
Fri, 19 Jan 2018 14:08:28 +0000 (14:08 +0000)
commit b1f1599cf3d4ae0cb1f1f3c9d379d89dca1873a0
tree 22f6c8767e5e5bc35a947864d39accb1b586eb45
parent 840d136c2c4af2a91d93c448456a52a1dda730b2
parent d503fd62fb1d3844fa51a15604fdceab850bc688

This branch does the following:

Fixes an issue with tr/non_utf8/long_non_utf8/c, where
length(long_non_utf8) > 0x7fff.

Fixes an issue with tr/non_utf8/non_utf8/cd: basically, the
implicit \x{100}-\x{7fffffff} added to the searchlist by /c wasn't being
added.
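
To illustrate the intended /cd behaviour, here is a minimal C sketch that
models tr/SEARCH//cd (empty replacement list) over an array of code points.
It is not the perl implementation: the function name, the in_search[] flag
array and the use of uint32_t code points are all assumptions made for this
example. The point it shows is that any code point >= 0x100 cannot appear in
a non-utf8 searchlist, so under /c it always belongs to the complement and,
with /d, is deleted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of tr/SEARCH//cd, not perl's own code.
 * in_search[c] is true when byte value c appears in SEARCH.
 * Code points >= 256 are never in a non-utf8 SEARCH, so they are
 * always part of the complement and are therefore deleted.
 * Compacts cp[] in place and returns the new length. */
static size_t complement_delete(uint32_t *cp, size_t len,
                                const bool in_search[256])
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        bool keep = cp[i] < 256 && in_search[cp[i]];
        if (keep)
            cp[out++] = cp[i];
    }
    return out;
}
```

With SEARCH set to a-z, the input sequence 'a', 0x100, 'B', 'c', 0x263A
is reduced to 'a', 'c'.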

Adds a lot of code comments to the various tr/// functions.

Adds tr///c tests - basically /c was almost completely untested.

Changes the layout of the op_pv transliteration table: it used to be roughly

      256 x short  - basic table
        1 x short  - length of extended table (n)
        n x short  - extended table

where the 2nd and 3rd items were only present under /c. It's now

        1 x Size_t - length of table (256+n)
  (256+n) x short  - table - both basic and extended

where n == 0 apart from under /c.
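
As a rough sketch of the new layout, the following C fragment puts a length
field in front of a single array holding both the basic and the extended
entries. Only the field order and element types follow the description
above; the names tr_table, tr_table_new, tr_table_lookup and the
TR_UNMAPPED sentinel are hypothetical, and size_t stands in for perl's
Size_t.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed sentinel meaning "no mapping for this character". */
#define TR_UNMAPPED (-1)

/* Hypothetical type mirroring the described layout:
 *         1 x size_t - length of table (256+n)
 *   (256+n) x short  - basic entries followed by extended entries */
typedef struct {
    size_t size;
    short  map[];   /* flexible array member (C99) */
} tr_table;

/* Allocate a table with n_extended extra entries, all unmapped.
 * n_extended is 0 unless the table models tr///c. */
static tr_table *tr_table_new(size_t n_extended)
{
    size_t size = 256 + n_extended;
    tr_table *t = malloc(sizeof *t + size * sizeof t->map[0]);
    if (!t)
        return NULL;
    t->size = size;
    for (size_t i = 0; i < size; i++)
        t->map[i] = TR_UNMAPPED;
    return t;
}

/* One bounds check covers both the basic and the extended range. */
static short tr_table_lookup(const tr_table *t, size_t ch)
{
    return ch < t->size ? t->map[ch] : TR_UNMAPPED;
}
```

Because the length is stored in a size_t-sized field and the entries are
contiguous, a lookup needs no separate case for the extended part.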

The new table format also allowed the tr/non_utf8/non_utf8/ code branches
to be considerably simplified.

op_dump() now dumps the contents of the (non-utf8 variant) transliteration
table.

Removes I32's from the tr/non_utf8/non_utf8/ code paths, making it fully
64-bit clean.

Improves the pod for tr///.