
What is a sensible way to convert a string into a list of characters in Perl?

split string perl

hillu · 15 years ago

I have been wondering whether there is a better, more concise way to split a string into characters than

@characters = split //, $string

This is not hard to understand, but somehow using a regex here feels like overkill to me.

I came up with this:

@characters = map { substr $string, $_, 1 } 0 .. length($string) - 1

But I find it uglier and less readable. What is the preferred way to split a string into characters?

7 answers

itub · 15 years ago

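Why would using a regex be "overkill"? Many people worry that regexes in Perl are overkill because they assume that running them requires a highly complex and slow regex algorithm. That is not always true: the implementation is highly optimized, and many simple cases are special-cased, so something that looks like a regex may actually run as fast as a simple substring search. I would not be surprised at all if this use of split were optimized as well. In some tests I ran, split was faster than your map. I also compared unpack with split.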

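I recommend split because it is the "idiomatic" way. You will find it in perldoc and in many books, and any good Perl programmer should know it. (If you are not sure your readers will understand it, you can add a comment to the code, as someone suggested.)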

OTOH, if regexes are "overkill" only because the syntax looks ugly, that is too subjective for me to comment on.
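
To make the comparison concrete, here is a minimal sketch showing that the idiomatic split and the substr-based map from the question produce the same list. The sample string is an assumption chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input, chosen only for illustration.
my $string = "hello";

# Idiomatic form: an empty pattern splits between every character.
my @via_split = split //, $string;

# The substr-based form from the question: one substr call per index.
my @via_map = map { substr $string, $_, 1 } 0 .. length($string) - 1;

print join(',', @via_split), "\n";   # h,e,l,l,o
print join(',', @via_map),   "\n";   # h,e,l,l,o
```

Both forms return an empty list for an empty string, so they can be swapped freely.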

Brad Gilbert · 15 years ago

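Various examples and speed comparisons.

I thought it might be a good idea to see how fast some of the methods are at splitting a string on every character.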

use 5.010;
use Benchmark qw(:all) ;
my %bench = (
   'split' => sub{
     state $string = 'x' x 1000;
     my @chars = split //, $string;
     \@chars;
   },
   'split-string' => sub{
     state $string = 'x' x 1000;
     my @chars = split '', $string;
     \@chars;
   },
   'split-capture' => sub{
     state $string = 'x' x 1000;
     my @chars = split /(.)/, $string;
     \@chars;
   },
   'unpack' => sub{
     state $string = 'x' x 1000;
     my @chars = unpack( '(a)*', $string );
     \@chars;
   },
   'match' => sub{
     state $string = 'x' x 1000;
     my @chars = $string =~ /./gs;
     \@chars;
   },
   'match-capture' => sub{
     state $string = 'x' x 1000;
     my @chars = $string =~ /(.)/gs;
     \@chars;
   },
   'map-substr' => sub{
     state $string = 'x' x 1000;
     my @chars = map { substr $string, $_, 1 } 0 .. length($string) - 1;
     \@chars;
   },
);
# set the initial state of $string
$_->() for values %bench;
cmpthese( -10, \%bench );

for perl in /usr/bin/perl /opt/perl-5.10.1/bin/perl /opt/perl-5.11.2/bin/perl;
do
  $perl -v | perl -nlE'if( /(v5\.\d+\.\d+)/ ){
    say "## Perl $1";
    say "<pre>";
    last;
  }';
  $perl test.pl;
  echo -e '</pre>\n';
done

Perl v5.10.0

               Rate split-capture match-capture map-substr match unpack split split-string
split-capture 296/s            --          -20%       -20%  -23%   -58%  -63%         -63%
match-capture 368/s           24%            --        -0%   -4%   -48%  -54%         -54%
map-substr    370/s           25%            0%         --   -3%   -48%  -53%         -54%
match         382/s           29%            4%         3%    --   -46%  -52%         -52%
unpack        709/s          140%           93%        92%   86%     --  -11%         -11%
split         793/s          168%          115%       114%  107%    12%    --          -0%
split-string  795/s          169%          116%       115%  108%    12%    0%           --

Perl v5.10.1

               Rate split-capture map-substr match-capture match unpack split split-string
split-capture 301/s            --       -31%          -41%  -47%   -60%  -65%         -66%
map-substr    435/s           45%         --          -14%  -23%   -42%  -50%         -50%
match-capture 506/s           68%        16%            --  -10%   -32%  -42%         -42%
match         565/s           88%        30%           12%    --   -24%  -35%         -35%
unpack        743/s          147%        71%           47%   32%     --  -15%         -15%
split         869/s          189%       100%           72%   54%    17%    --          -1%
split-string  875/s          191%       101%           73%   55%    18%    1%           --

Perl v5.11.2

               Rate split-capture match-capture match map-substr unpack split-string split
split-capture 300/s            --          -28%  -32%       -38%   -59%         -63%  -63%
match-capture 420/s           40%            --   -5%       -13%   -42%         -48%  -49%
match         441/s           47%            5%    --        -9%   -39%         -46%  -46%
map-substr    482/s           60%           15%    9%         --   -34%         -41%  -41%
unpack        727/s          142%           73%   65%        51%     --         -10%  -11%
split-string  811/s          170%           93%   84%        68%    12%           --   -1%
split         816/s          171%           94%   85%        69%    12%           1%    --

As you can see, split is the fastest.

split-capture is the slowest, probably because it has to set $1.

So I would recommend sticking with the good old split //, ... or the equivalent split '', ... .
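
As a rough illustration of the benchmark variants, the sketch below checks that the regex and string forms of the empty pattern give the same list. It also shows that the capturing variant is not a drop-in replacement, because split returns the captured separators together with the empty fields around them. The sample string and variable names are assumptions.

```perl
use strict;
use warnings;

my $string = "abc";   # hypothetical sample input

# The two recommended spellings of the empty pattern.
my @by_regex  = split //, $string;
my @by_string = split '', $string;

# With a capturing group every character is a separator, so the
# result also contains empty fields and must be filtered.
my @captured = split /(.)/, $string;
my @nonempty = grep { length } @captured;

print "regex:  @by_regex\n";    # regex:  a b c
print "string: @by_string\n";   # string: a b c
print "raw capture count: ", scalar(@captured), "\n";
print "filtered: @nonempty\n";  # filtered: a b c
```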

Michael Carman · 15 years ago

Nothing beats using the split function for splitting. You can wrap it in a small function:

my @characters = chars($string);
sub chars { split //, $_[0] }

mob · 15 years ago

For something less readable and more concise (and still using regex overkill):

@characters = $string =~ /./g;

(I picked up this idiom from playing golf.)

Ivan Nevostruev · 15 years ago

You are right. The standard way to do this is split //, $string . To make the code more readable, you can create a simple function:

sub get_characters {
    my ($string) = @_;
    return ( split //, $string );
}

@characters = get_characters($string);

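toolic · 15 years ago

I would go with the split technique. It is well known and documented. An alternative is a global match in list context: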

@characters = $string =~ /./gs;
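
The /s modifier in the match above matters as soon as the string contains a newline, because without it "." does not match "\n". A small sketch, using a sample string that is an assumption for illustration:

```perl
use strict;
use warnings;

my $string = "a\nb";   # hypothetical input containing a newline

my @without_s = $string =~ /./g;    # "." skips the newline
my @with_s    = $string =~ /./gs;   # "." matches the newline as well
my @via_split = split //, $string;  # split keeps every character

printf "%d %d %d\n",
    scalar @without_s, scalar @with_s, scalar @via_split;   # 2 3 3
```

So the plain /./g form silently drops newlines, while /./gs and split // agree.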

Eugene Yarmash · 15 years ago

Use split with an empty pattern to break a string into individual characters:

@characters = split //, $string;

If you only need the character codes, use unpack:

@values = unpack("C*", $string);

You may need to include use utf8 . You can also get the characters with unpack + chr:

@characters = map chr, unpack("C*", $string);
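
Whether split yields characters or bytes depends on whether the string holds decoded text. The sketch below assumes the input arrives as UTF-8 encoded bytes and decodes it with the core Encode module. The byte string is a made-up example. For literals written directly in the source file, use utf8 plays the analogous role.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: the UTF-8 bytes for "é" (C3 A9) followed by "a".
my $bytes = "\xC3\xA9a";

# Splitting the raw bytes yields one element per byte.
my @from_bytes = split //, $bytes;

# After decoding, each element is one character.
my $text      = decode('UTF-8', $bytes);
my @from_text = split //, $text;

printf "%d bytes, %d characters\n",
    scalar @from_bytes, scalar @from_text;   # 3 bytes, 2 characters
printf "first code point: U+%04X\n", ord $from_text[0];   # U+00E9
```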