[RT #111842] prevent TRIE overwriting EXACT following NOTHING at start
author Yves Orton <demerphq@gmail.com>
Mon, 19 Mar 2012 23:46:45 +0000 (00:46 +0100)
committer Yves Orton <demerphq@gmail.com>
Tue, 20 Mar 2012 10:20:06 +0000 (11:20 +0100)

Fixes RT #111842. Example:

    "x" =~ /\A(?>(?:(?:)A|B|C?x))\z/

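This should match, but did not, because a NOTHING node was allowed to
start a TRIE-able sequence, so the TRIE regop later overwrote the regop
following the NOTHING. See the comment in the patch for details.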

This also changes a test to no longer be TODO, and improves the test
name to explain its purpose.
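
For illustration only, the old and new forms of the clause changed in
regcomp.c can be compared in isolation. This is a sketch, not code from
perl: the enum values and the function names joins_before and joins_after
are hypothetical stand-ins (the real regop numbers are defined elsewhere
in the perl source), and it assumes that a trietype of 0 means no sequence
type has been recorded yet.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the regop classes used by the clause.
 * The real values differ; 0 is assumed to mean "no trie type yet". */
enum { NONE = 0, NOTHING = 1, EXACT = 2, EXACTFU = 3 };

/* Clause before the patch: a NOTHING on either side always passes,
 * so a NOTHING could become the first element of a sequence. */
static bool joins_before(int trietype, int noper_trietype)
{
    return noper_trietype
        && (noper_trietype == NOTHING
            || trietype == NOTHING
            || trietype == noper_trietype);
}

/* Clause after the patch: a NOTHING passes only when a trie type is
 * already recorded; otherwise the two types must be equal. */
static bool joins_after(int trietype, int noper_trietype)
{
    return noper_trietype
        && ((noper_trietype == NOTHING && trietype)
            || trietype == noper_trietype);
}
```

Under these assumptions, joins_before(NONE, NOTHING) is true while
joins_after(NONE, NOTHING) is false, which is the case the patch comment
describes. A NOTHING that follows an EXACT still passes in both versions.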

regcomp.c
t/re/pat_advanced.t

index 8c287bf..70d6c3b 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -3325,11 +3325,23 @@ S_study_chunk(pTHX_ RExC_state_t *pRExC_state, regnode **scanp,
                             if ( noper_trietype
                                   &&
                                   (
-                                        ( noper_trietype == NOTHING )
-                                        ||
-                                        ( trietype == NOTHING )
-                                        ||
-                                        ( trietype == noper_trietype )
+                                        /* XXX: Currently we cannot allow a NOTHING node to be the first element
+                                         * of a TRIEABLE sequence; otherwise we will overwrite the regop following
+                                         * the NOTHING with the TRIE regop later on. This is because a NOTHING node
+                                         * is only one regnode wide, and a TRIE is two regnodes. An example of a
+                                         * problematic pattern is: "x" =~ /\A(?>(?:(?:)A|B|C?x))\z/
+                                         * At a later point in time we can partly work around this by handling
+                                         * NOTHING -> EXACT sequences as generated by /(?:)A|(?:)B/ type patterns,
+                                         * as we can effectively ignore the NOTHING regop in that case.
+                                         * This clause, which allows NOTHING to start a sequence is left commented
+                                         * out as a reference.
+                                         * - Yves
+
+                                           ( noper_trietype == NOTHING)
+                                           || ( trietype == NOTHING )
+                                        */
+                                        ( noper_trietype == NOTHING && trietype )
+                                        || ( trietype == noper_trietype )
                                   )
 #ifdef NOJUMPTRIE
                                   && noper_next == tail
index 775f663..15f25b5 100644
--- a/t/re/pat_advanced.t
+++ b/t/re/pat_advanced.t
@@ -2069,10 +2069,8 @@ EOP
         like("\xC0", $p, "Verify \"\\xC0\" =~ /[\\xE0_]/i; pattern in utf8");
     }
 
-    {
-        local $::TODO = 'RT #111842';
-        ok "x" =~ /\A(?>(?:(?:)A|B|C?x))\z/, "EXACT nodetypes";
-    }
+    ok "x" =~ /\A(?>(?:(?:)A|B|C?x))\z/,
+        "Check TRIE does not overwrite EXACT following NOTHING at start - RT #111842";
 
     #
     # Keep the following tests last -- they may crash perl