
Find and replace in HTML with a regular expression fails

  • pixeline  · 14 years ago

    this thread .)

    $find = '/(?![^<]+>)(?<!\w)(' . preg_quote($t['label']) . ')\b/s';
    $text = preg_replace_callback($find, 'replaceCallback', $text);
    
    function replaceCallback($match) {
        if (is_array($match)) {
            $htmlVersion = $match[1];
            $urlVersion = urlencode($htmlVersion);
            return '<a class="tag" rel="tag-definition" title="Click to know more about ' . $htmlVersion . '" href="?tag=' . $urlVersion . '">' . $htmlVersion . '</a>';
        }
        return $match;
    }
    

    The error message points to the preg_replace_callback call and reads:

    Warning: preg_replace_callback() [function.preg-replace-callback]: Unknown modifier 't' in /frontend.functions.php  on line 43
    
    1 answer  |  active until 7 years ago
  •   Mike    14 years ago

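    Please note: this is not an attempt to provide a fix for the regex. It is only here to show how difficult (I dare say impossible) it is to create a regex that parses HTML successfully. Even well-structured XHTML would be very hard, and poorly structured HTML is a no-go area for regular expressions.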

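    I agree 100% that trying to parse HTML with regular expressions is a very bad idea. The code below uses the supplied function to process some simple HTML. It falls over on the second attempt, when it meets the nested HTML tag <em>Test</em>: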

    $t['label'] = 'Test';
    $text = '<p>Test</p>';
    
    $find = '/(?![^<]+>)(?<!\w)(' . preg_quote($t['label']) . ')\b/s';
    $text = preg_replace_callback($find, 'replaceCallback', $text);
    
    echo "Find:   $find\n";
    echo 'Quote:  ' . preg_quote($t['label']) . "\n";
    echo "Result: $text\n";
    
    /* Returns:
    
    Find:   /(?![^<]+>)(?<!\w)(Test)\b/s
    Quote:  Test
    Result: <p><a class="tag" rel="tag-definition" title="Click to know more about Test" href="?tag=Test">Test</a></p>
    
    */
    
    $t['label'] = '<em>Test</em>';
    $text = '<p>Test</p>';
    
    $find = '/(?![^<]+>)(?<!\w)(' . preg_quote($t['label']) . ')\b/s';
    $text = preg_replace_callback($find, 'replaceCallback', $text);
    
    echo "Find:   $find\n";
    echo 'Quote:  ' . preg_quote($t['label']) . "\n";
    echo "Result: $text\n";
    
    /* Returns:
    
    Find:   /(?![^<]+>)(?<!\w)(Test)\b/s
    Quote:  Test
    Result: <p><a class="tag" rel="tag-definition" title="Click to know more about Test" href="?tag=Test">Test</a></p>
    Warning: preg_replace_callback() [function.preg-replace-callback]: Unknown modifier '\' in /test.php  on line 25
    Find:   /(?![^<]+>)(?<!\w)(\<em\>Test\</em\>)\b/s
    Quote:  \<em\>Test\</em\>
    
    Result: 
    
    */
    
    function replaceCallback($match) {
        if (is_array($match)) {
            $htmlVersion = $match[1];
            $urlVersion = urlencode($htmlVersion);
            return '<a class="tag" rel="tag-definition" title="Click to know more about ' . $htmlVersion . '" href="?tag=' . $urlVersion . '">' . $htmlVersion . '</a>';
        }
        return $match;
    }
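
    For what it is worth, the "Unknown modifier" warning itself appears to come from preg_quote() being called without its second argument: by default it does not escape the "/" delimiter, so the "/" in "</em>" ends the pattern early and the rest is read as modifiers. The sketch below only illustrates that point. It assumes "/" stays the delimiter, and buildFindPattern() is a hypothetical helper name, not something from the question. It silences the warning; it does not make the regex a sound way to process HTML.

```php
<?php
// Hypothetical helper (not from the original post): builds the same pattern,
// but tells preg_quote() which delimiter is in use so "/" is escaped as well.
function buildFindPattern(string $label): string
{
    return '/(?![^<]+>)(?<!\w)(' . preg_quote($label, '/') . ')\b/s';
}

$label   = '<em>Test</em>';
$pattern = buildFindPattern($label);

// preg_match() returns false when the pattern cannot be compiled,
// and 0 or 1 otherwise; "@" only suppresses the warning text here.
$compiled = @preg_match($pattern, '') !== false;

echo "Find:     $pattern\n";
echo 'Compiles: ' . ($compiled ? 'yes' : 'no') . "\n";

/* Prints:

Find:     /(?![^<]+>)(?<!\w)(\<em\>Test\<\/em\>)\b/s
Compiles: yes

*/
```

    Note that the trailing \b still requires a word character right after the closing ">", so a label that ends in a tag will rarely match in practice, which is one more reason to reach for a real HTML parser instead.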