
Logically sorting a list of lists (partially ordered set -> topological sort)

  •  8
  • CoffeeBasedLifeform  · 6 years ago

    Edit: the accepted answer works for inputs that form a strict partially ordered set, so that a directed acyclic graph can be constructed:

    • Irreflexivity, not a < a: the list does not contain ['a','a']
    • Transitivity, if a < b and b < c then a < c: the list does not contain ['a','b'],['b','c'],['c','a']
    • Asymmetry, if a < b then not b < a: the list does not contain ['a','b'],['b','a']
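
    A minimal sketch of how the three conditions above could be checked before sorting, assuming each sublist only constrains its neighbouring items. The name `is_strict_partial_order` is hypothetical and not part of the original post. It builds the transitive closure of the adjacent pairs and reports whether any item ends up before itself:

```python
from itertools import product


def is_strict_partial_order(sublists):
    """Return True if the relation generated by the sublists contains no cycle.

    Hypothetical helper for illustration only.
    """
    # Every pair of neighbours (earlier, later) is one "earlier < later" constraint.
    closure = {(s[i], s[i + 1]) for s in sublists for i in range(len(s) - 1)}
    nodes = {x for pair in closure for x in pair}

    # Warshall-style transitive closure; the intermediate node k is the
    # slowest-varying (outermost) element produced by product().
    for k, i, j in product(nodes, repeat=3):
        if (i, k) in closure and (k, j) in closure:
            closure.add((i, j))

    # Once the relation is closed under transitivity, only irreflexivity is
    # left to check: a < b together with b < a would already imply a < a.
    return all(a != b for a, b in closure)
```

    The closure is cubic in the number of distinct items, so this is only meant for small inputs such as the examples in this question.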

    Take this list:
    [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']]
    and flatten it into a single list, ordered according to each value's neighbours:

    • the first sublist tells you that b comes before c
    • then a comes before c
    • b comes before a
    • and finally d comes after c

    The overall order is consistent across sublists, meaning there will be no sublists like ['b','c'],['c','b']. So the result should be: ['b', 'a', 'c', 'd']
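
    One possible way to read those neighbour constraints off the sublists is to build a successor map, where each value points to the values that must come after it. This is only an illustrative sketch; `precedence_graph` is a made-up name:

```python
from collections import defaultdict


def precedence_graph(sublists):
    """Map each value to the set of values that directly follow it in some sublist.

    Hypothetical helper for illustration only.
    """
    graph = defaultdict(set)
    for sublist in sublists:
        for value in sublist:
            graph[value]  # touching the key registers values with no successors
        for earlier, later in zip(sublist, sublist[1:]):
            graph[earlier].add(later)
    return dict(graph)


graph = precedence_graph([['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']])
# 'b' points to {'a', 'c'}, 'a' to {'c'}, 'c' to {'d'}, and 'd' to an empty set.
```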

    After a while, I came up with this ugly mess:

    def get_order(_list):
        order = _list[0]
        for sublist in _list[1:]:
            if not sublist:
                continue
            if len(sublist) == 1:
                if sublist[0] not in order:
                    order.append(sublist[0])
                continue
            new_order = order.copy()
            for index, value in enumerate(sublist):
                inserted = False
                new_order_index = None
                if value in new_order:
                    new_order_index = new_order.index(value)
                    new_order.remove(value)
                for previous_value in sublist[:index][::-1]:
                    if previous_value in new_order:
                        insert_index = new_order.index(previous_value) + 1
                        print('inserting', value, 'at position', insert_index, 'after', previous_value)
                        new_order.insert(insert_index, value)
                        inserted = True
                        break
                if inserted:
                    continue
                for next_value in sublist[index:]:
                    if next_value in new_order:
                        insert_index = new_order.index(next_value)
                        print('inserting', value, 'at position', insert_index, 'before', next_value)
                        new_order.insert(insert_index, value)
                        inserted = True
                        break
                if inserted:
                    continue
                if new_order_index is None:
                    print('appending', value)
                    new_order.append(value)
                else:
                    print('leaving', value, 'at position', new_order_index)
                    new_order.insert(new_order_index, value)
            order = new_order
        return order
    
    if __name__ == '__main__':
        test_list = [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd'], ]
        order = get_order(test_list)
        #>>> inserting a at position 1 before c
        #>>> inserting c at position 2 after a
        #>>> inserting d at position 3 after c
        #>>> inserting a at position 1 before c
        #>>> inserting c at position 2 after a
        #>>> inserting b at position 0 before a
        #>>> inserting a at position 1 after b
        print(order)
        #>>> ['b', 'a', 'c', 'd']
    

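    It appears to do exactly what it is supposed to, but it is far from efficient (or elegant).
    Is there an algorithm that can sort like this?
    Or are there some Pythonic tricks that would make this more efficient?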


    2 answers  |  latest 6 years ago
        1
  •  1
  •   user2357112    6 years ago

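    The existing answers by nosklo and Ajax1234 both fail on the input [[1, 3], [3, 5], [5, 2], [2, 4]]. The attempt in your question fails on the input [[1, 4], [2, 3], [3, 4], [1, 2]].

    The correct approach is the one described by Bowlinghawk95: perform a topological sort on the directed acyclic graph induced by the input lists.

    We could implement our own topological sort, but it is safer to let an existing graph library handle it. For example, NetworkX: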

    from itertools import chain, tee
    
    import networkx
    import networkx.algorithms
    
    # pairwise recipe from the itertools docs.
    def pairwise(iterable):
        "s -> (s0,s1), (s1,s2), (s2, s3), ..."
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)
    
    def merge_ordering(sublists):
        # Make an iterator of graph edges for the new graph. Some edges may be repeated.
        # That's fine. NetworkX will ignore duplicates.
        edges = chain.from_iterable(map(pairwise, sublists))
    
        graph = networkx.DiGraph(edges)
        return list(networkx.algorithms.topological_sort(graph))
    

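    This produces correct output for the input in the question, for the input [[1, 3], [3, 5], [5, 2], [2, 4]] that the other answers fail on, and for the input [[1, 4], [2, 3], [3, 4], [1, 2]] that your attempt fails on: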

    >>> merge_ordering([[1, 3], [3, 5], [5, 2], [2, 4]])
    [1, 3, 5, 2, 4]
    >>> merge_ordering([['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']])
    ['b', 'a', 'c', 'd']
    >>> merge_ordering([[1, 4], [2, 3], [3, 4], [1, 2]])
    [1, 2, 3, 4]
    

    We can also write a version that raises an error if the input lists do not uniquely determine the flattened form:

    def merge_ordering_unique(sublists):
        # Make an iterator of graph edges for the new graph. Some edges may be repeated.
        # That's fine. NetworkX will ignore duplicates.
        edges = chain.from_iterable(map(pairwise, sublists))
    
        graph = networkx.DiGraph(edges)
        merged = list(networkx.algorithms.topological_sort(graph))
    
        for a, b in pairwise(merged):
            if not graph.has_edge(a, b):
                raise ValueError('Input has multiple possible topological orderings.')
    
        return merged
    

    Demo:

    >>> merge_ordering_unique([['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']])
    ['b', 'a', 'c', 'd']
    >>> merge_ordering_unique([[1, 3, 4], [1, 2, 4]])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<string>", line 11, in merge_ordering_unique
    ValueError: Input has multiple possible topological orderings.
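
    For reference, the "implement our own topological sort" option mentioned earlier could look roughly like the following sketch of Kahn's algorithm, using only the standard library. It is not code from this answer: `merge_ordering_kahn` is a hypothetical name. It also registers items from single-element sublists, which produce no edges in the pairwise approach:

```python
from collections import defaultdict, deque


def merge_ordering_kahn(sublists):
    """Topologically sort the values in sublists with Kahn's algorithm.

    Hypothetical stand-alone sketch; raises ValueError if the constraints
    contain a cycle.
    """
    successors = defaultdict(set)
    indegree = {}
    for sublist in sublists:
        for value in sublist:
            indegree.setdefault(value, 0)
        for earlier, later in zip(sublist, sublist[1:]):
            # Count each distinct edge once, even if it appears in several sublists.
            if later not in successors[earlier]:
                successors[earlier].add(later)
                indegree[later] += 1

    # Start from the values that nothing has to precede.
    ready = deque(value for value, degree in indegree.items() if degree == 0)
    merged = []
    while ready:
        value = ready.popleft()
        merged.append(value)
        for later in successors[value]:
            indegree[later] -= 1
            if indegree[later] == 0:
                ready.append(later)

    # Any value left unprocessed is part of, or downstream of, a cycle.
    if len(merged) != len(indegree):
        raise ValueError('Input contains a cycle.')
    return merged
```

    When the input admits several valid orderings, which one this returns depends on iteration order, so it is only a drop-in for merge_ordering, not for merge_ordering_unique.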
    
        2
  •  4
  •   Ajax1234    6 years ago

    You can create a lookup function that determines whether a particular value should be placed before or after another value:

    d = [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']]
    flattened = {i for b in d for i in b}
    def _lookup(a, b):
      _loc = [i for i in d if a in i and b in i]
      return True if not _loc else _loc[0].index(a) < _loc[0].index(b)
    
    class T: 
      def __init__(self, _val):
        self.v = _val
      def __lt__(self, _n):
        return _lookup(self.v, _n.v)
    
    final_result = [i.v for i in sorted(map(T, flattened))]
    

    Output:

    ['b', 'a', 'c', 'd']
    

    With [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd'], ['a', 'e']]:

    ['b', 'a', 'c', 'e', 'd']
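
    Whichever approach is used, a small checker can confirm that a proposed result agrees with every sublist. A sketch, assuming the result contains every value that appears in the input; `respects_sublists` is a hypothetical helper, not part of any answer here:

```python
def respects_sublists(order, sublists):
    """Return True if every pair of neighbours in every sublist keeps its order.

    Hypothetical helper; assumes `order` contains each input value exactly once.
    """
    position = {value: index for index, value in enumerate(order)}
    return all(
        position[earlier] < position[later]
        for sublist in sublists
        for earlier, later in zip(sublist, sublist[1:])
    )


d = [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd'], ['a', 'e']]
print(respects_sublists(['b', 'a', 'c', 'e', 'd'], d))  # True
print(respects_sublists(['a', 'b', 'c', 'd', 'e'], d))  # False: b must precede a
```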