代码之家  ›  专栏  ›  技术社区  ›  sid_com

查找列表中最长元素的最快方法(执行时间)

  •  13
  • sid_com  · 技术社区  · 14 年前

    这是查找列表中最长元素的最快(执行时间)方法吗?

    #!/usr/bin/env perl
    use warnings;
    use 5.012;
    use List::Util qw(reduce);
    use List::Util::XS;
    
    my @array = qw( one two three four five six seven eight nine ten eleven );
    
    my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
    
    say $l;
    
    10 回复  |  直到 10 年前
        1
  •  17
  •   Eric Strom    14 年前

    O(N) O(1)

    sub longest {
        my $max = -1;
        my $max_i = 0;
        for (0 .. $#_) {              # for each index
            my $len = length $_[$_];  # only get length once per item
            if ($len > $max) {        # save index and update max if larger
                $max = $len;
                $max_i = $_;
            }
        }
        $_[$max_i]   # return the largest item
    }
    

    reduce

    sub fastest {
        my $max = -1;
        my $max_ref;
        for (@_) {
            if (length > $max) {  # no temp variable, length() twice is faster
                $max = length;
                $max_ref = \$_;   # avoid any copying
            }
        }
        $$max_ref
    }
    

               Rate longest   drewk  reduce fastest
    longest 44245/s      --    -21%    -30%    -47%
    drewk   55854/s     26%      --    -11%    -33%
    reduce  63014/s     42%     13%      --    -25%
    fastest 83638/s     89%     50%     33%      --
    
        2
  •  6
  •   bvr    14 年前

    my $len = length $array[0];
    my $longest = 0;
    for my $i (1 .. $#array) {
        my $i_len = length $array[$i];
        if($i_len > $len) {
            $longest = $i;
            $len = $i_len;
        }
    }
    my $l = $array[$longest];
    

    我玩了一点基准测试,得到了这个小数字(原始数组)

               Rate REDUCE TMPVAR TMPFOR
    REDUCE 234862/s     --    -0%    -7%
    TMPVAR 235643/s     0%     --    -6%
    TMPFOR 251326/s     7%     7%     --
    

    对于较大的数量或项目(原始数组 x 100 )

             Rate TMPVAR TMPFOR REDUCE
    TMPVAR 3242/s     --   -28%   -32%
    TMPFOR 4503/s    39%     --    -5%
    REDUCE 4750/s    47%     5%     --
    

    注意,由于数据的特殊性,算法的适用性会有很大的变化(我猜更长的字符串可能会增加 length 算法中的函数)。

    编辑 :这里是基准的完整代码(长数组版本,缺少短数组版本) X 100 在数组定义中)

    use Benchmark  qw(:all);
    use List::Util qw(reduce);
    
    my @array = qw( one two three four five six seven eight nine ten eleven ) x 100;
    
    cmpthese(-2, {
        REDUCE => sub {
            my $l = reduce{ length($a) gt length($b) ? $a : $b } @array;
        },
        TMPVAR => sub {
            my $idx = 1;
            my $lastLength = length $array[0];
            my $lastElt = $array[0];
            my $listLength = scalar @array;
            while ($idx < $listLength) {
              my $tmpLength = length $array[$idx];
              if ($tmpLength > $lastLength) {
                $lastElt = $array[$idx];
                $lastLength = $tmpLength
              }
              $idx++
            }
            my $l = $lastElt;
        },
        TMPFOR => sub {
            my $len = length $array[0];
            my $longest = 0;
            for my $i (1 .. $#array) {
                my $i_len = length $array[$i];
                if($i_len > $len) {
                    $longest = $i;
                    $len = $i_len;
                }
            }
            my $l = $array[$longest];
        },
    });
    
        3
  •  3
  •   Community CDub    8 年前

    我最快的是:

    sub drewk {
        my $len = -1;
        for (@_) {
            my $tmp=length($_);
            if ( $tmp > $len ) {
                $longest = $_;
                $len = $tmp;
            }
        }
        return $longest;
    }
    

    但是基准测试针对:

    sub strom {
        my $max = -1;
        my $max_i = 0;
        for (0 .. $#_) {              # for each index
            my $len = length $_[$_];  # only get length once per item
            if ($len > $max) {        # save index and update max if larger
                $max = $len;
                $max_i = $_;
            }
        }
        $_[$max_i]   # return the largest item
    }
    
    sub red {
        return reduce{ length($a) > length($b) ? $a : $b } @_;
    }
    

    表明 reduce 最快:

                Rate  strom  drewk reduce
    strom  1323455/s     --   -38%   -45%
    drewk  2144549/s    62%     --   -10%
    reduce 2390707/s    81%    11%     --
    

    另一个基准是 Eric Strom's sub

        4
  •  2
  •   Don Jones    14 年前

    一个小高尔夫球:

    my @unsorted = qw( one two three four five six seven eight nine ten eleven );
    my $longest =  (
        map { $_->[0] } 
        sort { $b->[1] <=> $a->[1] } 
        map { [ $_, length $_ ] } @unsorted 
    )[0];
    
    say $longest;
    

    编辑:映射/排序/映射是 Schwartzian transform 对于不熟悉这种技术和好奇的人。

        5
  •  2
  •   ysth    14 年前

    假设目标只是找到最长的字符串,而不是其索引:

    my $longest = $array[0];
    my $len = length $longest;
    for my $str (@array) {
        if ( length($str) > $len ) {
            $longest = $str;
            $len = length($str);
        }
    }
    
        6
  •  1
  •   MBO    14 年前

    如果你真的想切断计算的数量 length 是的,那就看看 Schwartian transform 把它应用到你的问题上。

    编辑:

    我看到没有人发布我所指的完整示例,所以这里是(我还没有对它进行基准测试,因为我不在我的个人电脑中):

    my @array = qw( one two three four five six seven eight nine ten eleven );
    my $longest = (
                    reduce { $a->[1] > $b->[1] ? $a : $b } 
                    map { [ $_, length $_ ] }
                    @array
                  )[0];
    
    say $longest;
    
        7
  •  1
  •   mpapec    10 年前

    这似乎比其他解决方案(基于 最快的风暴 )

    use warnings;
    use 5.012;
    use Benchmark qw(:all) ;
    use List::Util qw(reduce);
    
    my @array = map { ($_) x 50 } qw( one two three four five six seven eight nine ten eleven );
    
    sub list_util_xs {
        my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
        return $l;
    }
    
    sub fastest_Eric_Strom {
        my $max = -1; my $max_ref;
        for (@array) {
            if (length > $max) {
                $max = length;
                $max_ref = \$_;
            }
        }
        return $$max_ref;
    }
    
    sub ysth {
        my $longest = $array[0];
        my $len = length $longest;
        for my $str (@array) {
        if ( length($str) > $len ) {
            $longest = $str;
            $len = length($str);
        }
        }
        return $longest;
    }
    
    sub mpapec {
        my $max = -1;
        my $max_ref;
        length > $max and ($max, $max_ref) = (length, \$_) for @array;
        return $$max_ref;
    }
    
    cmpthese( -10, {
        'list_util_xs'  => sub{ list_util_xs() },
        'fastest_Eric_Storm' => sub{ fastest_Eric_Strom() },
        'ysth'      => sub{ ysth() },
        'mpapec'    => sub{ mpapec() },
    });
    

    输出

                          Rate list_util_xs fastest_Eric_Storm       ysth     mpapec
    list_util_xs       13479/s           --               -24%       -24%       -29%
    fastest_Eric_Storm 17662/s          31%                 --        -0%        -6%
    ysth               17680/s          31%                 0%         --        -6%
    mpapec             18885/s          40%                 7%         7%         --
    
        8
  •  0
  •   OMG_peanuts    14 年前

    您可以使用一些临时变量来避免反复计算长度:

    my @unsorted = qw( one two three four five six seven eight nine ten eleven ); 
    my $idx = 1;
    my $lastLength = length $unsorted[0];
    my $lastElt = $unsorted[0];
    my $listLength = scalar @unsorted;
    while ($idx < $listLength) {
      my $tmpLength = length $unsorted[$idx];
      if ($tmpLength > $lastLength) {
        $lastElt = $unsorted[$idx];
        $lastLength = $tmpLength
      }
      $idx++
    }
    print "Longest element:$lastElt";
    

    基准结果:

               Rate REDUCE TMPVAR
    REDUCE 169297/s     --   -29%
    TMPVAR 237926/s    41%     --
    
        9
  •  0
  •   Community CDub    8 年前

    首先,我测试了所有的程序是否都给了我正确的结果。MBO的程序没有通过第一个测试(它返回一个数组引用);为了给他第二个更改,我修改了程序以获得正确的结果。

    我做了几次基准测试,但并不是总能得到相同的顺序。 所以我想说(正如这里已经提到的)Yth和Fasset-Eric-Strom是最快的,但是List-Utils的减少速度几乎和他们一样快; 从结果中很容易看出,DavidPrecious Sort版本是最慢的,MBO修改后的Reduce版本是第二慢的。

    我的结论是:清单减少是最佳性价比的赢家。
    编辑: 我在颁奖仪式上太快了: List::Util - reduce - length - encoding - question

    David_Precious      64147/s             -- -36%        -73%               -79%  -80% -81%         -85%               -86% -87%
    MBO                100195/s            56%   --        -58%               -67%  -69% -70%         -77%               -79% -80%
    OMG_peanuts        237772/s           271% 137%          --               -21%  -27% -30%         -45%               -50% -52%
    longest_Eric_Strom 300466/s           368% 200%         26%                 --   -8% -11%         -31%               -36% -40%
    drewk              325883/s           408% 225%         37%                 8%    --  -4%         -25%               -31% -34%
    bvr                338156/s           427% 237%         42%                13%    4%   --         -22%               -28% -32%
    list_util_xs       434114/s           577% 333%         83%                44%   33%  28%           --                -8% -13%
    fastest_Eric_Strom 471812/s           636% 371%         98%                57%   45%  40%           9%                 --  -5%
    ysth               497198/s           675% 396%        109%                65%   53%  47%          15%                 5%   --
    

    .

    #!/usr/bin/env perl
    use warnings;
    use 5.012;
    use Benchmark qw(:all) ;
    use List::Util qw(reduce);
    
    my @array = qw( one two three four five six seven eight nine very_long_long ten eleven );
    
    sub list_util_xs {
        my $l = reduce{ length($a) > length($b) ? $a : $b } @array;
        return $l;
    }
    
    sub longest_Eric_Strom {
        my $max = -1; my $max_i = 0;
        for (0 .. $#array) {
            my $len = length $array[$_]; 
            if ($len > $max) { 
                $max = $len;
                $max_i = $_;
            }
        }
        return $array[$max_i];
    }
    
    sub fastest_Eric_Strom {
        my $max = -1; my $max_ref;
        for (@array) {
            if (length > $max) {
                $max = length;
                $max_ref = \$_;
            }
        }
        return $$max_ref;
    }
    
    sub David_Precious {
        my $longest =  ( map { $_->[0] } sort { $b->[1] <=> $a->[1] } map { [ $_, length $_ ] } @array )[0];
        return $longest;
    }
    
    sub MBO {
        my $longest = ( reduce { $a->[1] > $b->[1] ? $a : $b } map { [ $_, length $_ ] } @array )[0];
        return $longest->[0];
    }
    
    sub drewk {
        my $len = -1; my $longest;
        for (@array) {
            my $tmp=length($_);
            if ( $tmp > $len ) {
                $longest = $_;
                $len = $tmp;
            }
        }
        return $longest;
    }
    
    sub ysth {
        my $longest = $array[0];
        my $len = length $longest;
        for my $str (@array) {
        if ( length($str) > $len ) {
            $longest = $str;
            $len = length($str);
        }
        }
        return $longest;
    }
    
    sub bvr {
        my $len = length $array[0];
        my $longest = 0;
        for my $i (1 .. $#array) {
        my $i_len = length $array[$i];
        if($i_len > $len) {
            $longest = $i;
            $len = $i_len;
        }
        }
        return $array[$longest];
    }
    
    sub OMG_peanuts {
        my $idx = 1;
        my $lastLength = length $array[0];
        my $lastElt = $array[0];
        my $listLength = scalar @array;
        while ($idx < $listLength) {
        my $tmpLength = length $array[$idx];
        if ($tmpLength > $lastLength) {
        $lastElt = $array[$idx];
        $lastLength = $tmpLength
        }
        $idx++
        }
        return $lastElt;
    }
    
    cmpthese( -10, {
        'list_util_xs'  => sub{ list_util_xs() },
        'longest_Eric_Storm' => sub{ longest_Eric_Strom() },
        'fastest_Eric_Storm' => sub{ fastest_Eric_Strom() },
        'David_Precious'    => sub{ David_Precious() },
        'MBO'       => sub{ MBO() },
        'drewk'         => sub{ drewk() },
        'ysth'      => sub{ ysth() },
        'OMG_peanuts'   => sub{ OMG_peanuts() },
        'bvr'       => sub{ bvr() },
    });
    
        10
  •  -1
  •   xtofl Adam Rosenfield    14 年前

    您可以通过减少到包含字符串本身旁边的长度的结构或数组来减少计算字符串长度的次数。

    此外,迭代由 reduce 算法 length 调用很难优化。