
mb_str_replace()… is slow. Any alternatives?

  •  7
  • onassar  · 15 years ago

    Any suggestions? I'm considering preg_replace, since it is native and compiled in, so it may be faster. Any thoughts would be appreciated.
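
    For reference, here is a minimal sketch of what the preg_replace idea could look like for a literal, UTF-8-aware replacement. The function name `preg_str_replace` is hypothetical (it is not a PHP built-in), the sketch assumes all inputs are UTF-8, and it makes no claim about being faster than the alternatives; that would need to be measured.

```php
<?php
// Hypothetical helper (name is illustrative): literal search/replace through PCRE in UTF-8 mode.
function preg_str_replace(string $search, string $replace, string $subject): string
{
    if ($search === '') {
        return $subject; // nothing to search for
    }
    // preg_quote() escapes regex metacharacters and the '/' delimiter; 'u' enables UTF-8 mode.
    $pattern = '/' . preg_quote($search, '/') . '/u';
    // A callback returns the replacement verbatim, so '$1' or '\1' are not read as back-references.
    $result = preg_replace_callback($pattern, function () use ($replace) {
        return $replace;
    }, $subject);
    if ($result === null) {
        // PCRE reports an error, for example when the subject is not valid UTF-8.
        throw new InvalidArgumentException('preg_replace_callback() failed: ' . preg_last_error());
    }
    return $result;
}
```

    Called as `preg_str_replace('a.b', 'x', $text)`, the dot is matched literally because of `preg_quote()`.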

    4 answers  |  up to 15 years ago
        1
  •  16
  •   Áxel Costas Pena    8 years ago

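    As said there, str_replace is safe to use in UTF-8 contexts as long as all parameters are valid UTF-8, because there can be no ambiguous match between two multibyte-encoded strings. If you check the validity of your input, you have no need to look for a different function.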
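
    A minimal sketch of that approach, assuming the mbstring extension is available: validate every argument with `mb_check_encoding()`, then hand off to the native `str_replace()`. The wrapper name `utf8_str_replace` is hypothetical and only handles string arguments.

```php
<?php
// Hypothetical wrapper (name is illustrative): validate the inputs, then delegate to str_replace().
function utf8_str_replace(string $search, string $replace, string $subject): string
{
    foreach ([$search, $replace, $subject] as $value) {
        // Reject anything that is not well-formed UTF-8 before doing byte-wise replacement.
        if (!mb_check_encoding($value, 'UTF-8')) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }
    }
    return str_replace($search, $replace, $subject);
}
```

    If the input is already validated elsewhere (for example at the request boundary), the check can be dropped and `str_replace()` called directly.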

        2
  •  3
  •   Alain    11 years ago

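    Since encoding is a real challenge when input comes from everywhere (UTF-8 or otherwise), I prefer to use only multibyte-safe functions. For str_replace , I am using this one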

    if (!function_exists('mb_str_replace'))
    {
       function mb_str_replace($search, $replace, $subject, &$count = 0)
       {
          if (!is_array($subject))
          {
             $searches = is_array($search) ? array_values($search) : array($search);
             $replacements = is_array($replace) ? array_values($replace) : array($replace);
             $replacements = array_pad($replacements, count($searches), '');
             foreach ($searches as $key => $search)
             {
                // mb_split() treats its first argument as a regex, so escape metacharacters first
                $parts = mb_split(preg_quote($search), $subject);
                $count += count($parts) - 1;
                $subject = implode($replacements[$key], $parts);
             }
          }
          else
          {
             foreach ($subject as $key => $value)
             {
                $subject[$key] = mb_str_replace($search, $replace, $value, $count);
             }
          }
          return $subject;
       }
    }
    
        3
  •  2
  •   Community CDub    8 years ago

    Here is my implementation, based on Alain's answer :

    /**
     * Replace all occurrences of the search string with the replacement string. Multibyte safe.
     *
     * @param string|array $search The value being searched for, otherwise known as the needle. An array may be used to designate multiple needles.
     * @param string|array $replace The replacement value that replaces found search values. An array may be used to designate multiple replacements.
     * @param string|array $subject The string or array being searched and replaced on, otherwise known as the haystack.
     *                              If subject is an array, then the search and replace is performed with every entry of subject, and the return value is an array as well.
     * @param string $encoding The encoding parameter is the character encoding. If it is omitted, the internal character encoding value will be used.
     * @param int $count If passed, this will be set to the number of replacements performed.
     * @return array|string
     */
    public static function mbReplace($search, $replace, $subject, $encoding = 'auto', &$count=0) {
        if(!is_array($subject)) {
            $searches = is_array($search) ? array_values($search) : [$search];
            $replacements = is_array($replace) ? array_values($replace) : [$replace];
            $replacements = array_pad($replacements, count($searches), '');
            foreach($searches as $key => $search) {
                $replace = $replacements[$key];
                $search_len = mb_strlen($search, $encoding);
                // An empty needle would never advance the loop below, so skip it
                if($search_len === 0) continue;
    
                $sb = [];
                while(($offset = mb_strpos($subject, $search, 0, $encoding)) !== false) {
                    $sb[] = mb_substr($subject, 0, $offset, $encoding);
                    $subject = mb_substr($subject, $offset + $search_len, null, $encoding);
                    ++$count;
                }
                $sb[] = $subject;
                $subject = implode($replace, $sb);
            }
        } else {
            foreach($subject as $key => $value) {
                $subject[$key] = self::mbReplace($search, $replace, $value, $encoding, $count);
            }
        }
        return $subject;
    }
    

    His version doesn't accept a character encoding, although I suppose you could set it via mb_regex_encoding .
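
    A rough sketch of that idea, assuming mbstring was built with regex support: set the regex encoding before calling `mb_split()` and restore the previous value afterwards. The helper name `mb_split_with_encoding` is hypothetical.

```php
<?php
// Hypothetical helper (name is illustrative): run mb_split() under a given regex encoding.
function mb_split_with_encoding(string $pattern, string $subject, string $encoding): array
{
    $previous = mb_regex_encoding();   // with no argument, returns the current regex encoding
    mb_regex_encoding($encoding);      // with an argument, sets it
    try {
        $parts = mb_split($pattern, $subject);
    } finally {
        mb_regex_encoding($previous);  // always restore the global setting
    }
    // mb_split() may return false on failure; fall back to the unsplit subject.
    return $parts === false ? [$subject] : $parts;
}
```

    Because `mb_regex_encoding()` is a process-wide setting, restoring it in `finally` keeps the change from leaking into unrelated code.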

    My unit tests pass:

    function testMbReplace() {
        $this->assertSame('bbb',Str::mbReplace('a','b','aaa','auto',$count1));
        $this->assertSame(3,$count1);
        $this->assertSame('ccc',Str::mbReplace(['a','b'],['b','c'],'aaa','auto',$count2));
        $this->assertSame(6,$count2);
        $this->assertSame("\xbf\x5c\x27",Str::mbReplace("\x27","\x5c\x27","\xbf\x27",'iso-8859-1'));
        $this->assertSame("\xbf\x27",Str::mbReplace("\x27","\x5c\x27","\xbf\x27",'gbk'));
    }
    
        4
  •  1
  •   RobC Manuel Gonzalez    6 years ago

    The top-rated note on http://php.net/manual/en/ref.mbstring.php#109937 says that str_replace works for multibyte strings.
