代码之家 › 专栏 › 技术社区 › dfundako

大熊猫的递归运算

dataframe pandas python

dfundako · 技术社区 · 5 年前

vals = {"operator": [1, 1, 1, 2, 3, 5], "nextval": [2, 3, 6, 4, 5, 6]}
df = pd.DataFrame(vals)

   operator  nextval
0         1        2
1         1        3
2         1        6
3         2        4
4         3        5
5         5        6

我要做的是从一个起点,比如1,到一个终点,比如6,得到所有可能的路径的列表,使用操作符和nextvals,而不是严格意义上的最短路径。

1 -> 6
1 -> 2 -> 4 
1 -> 3 -> 5 -> 6

我可以接近它,但不确定如何正确地进行递归,因为dict不能处理2个相同的键:

import pandas as pd

vals = {"operator": [1, 1, 1, 2, 3, 5], "nextval": [2, 3, 6, 4, 5, 6]}
df = pd.DataFrame(vals)

df1 = df.set_index("operator")

dictvals = {}
for x in df1.index.unique():
    dictvals[x] = []
    df2 = df1.loc[x]
    if isinstance(df2, pd.DataFrame):
        for idx, rowdata in df2.iterrows():
            dictvals[x].append(rowdata["nextval"])
    else:
        dictvals[x] = df2[0]

print(dictvals) 

{1: [2, 3, 6], 2: 4, 3: 5, 5: 6}

0 回复 | 直到 5 年前

BENY 5 年前

检查 networkx ,你需要一个方向图 'root' 到 'leaf' 路径

import networkx as nx
G=nx.from_pandas_edgelist(df,source='operator',target='nextval', edge_attr=None, create_using=nx.DiGraph())
road=[]
for n in G:
       if G.out_degree(n)==0: #leaf
           road.append(nx.shortest_path(G, 1, n))
           
road
Out[82]: [[1, 2, 4], [1, 3, 5, 6]]

更新

import networkx as nx
G=nx.from_pandas_edgelist(df,source='operator',target='nextval', edge_attr=None, create_using=nx.DiGraph())
road=[]
for n in G:
       if G.out_degree(n)==0: #leaf
           road.append(list(nx.all_simple_paths(G, 1, n)))
           
road
Out[509]: [[[1, 3, 5, 6], [1, 6]], [[1, 2, 4]]]

Karl Knechtel 5 年前

让我们尝试手动生成一个解决方案,因为考虑这种递归算法是很有教育意义的。(当然,在现实世界中只使用现有的库是合适的;它的容错性可能要高得多。)

graph = {1: {2, 3}, 2: {4}, 3: {5}, 5: {6}}

每个路径来自每个

当然,当我们编写递归代码时,我们喜欢避免副作用,因为它们使我们更难推理。所以让我们说:所有路径的累加,定义为(从节点到后续节点的链接)+(从后续节点到结束的路径),对于每个后续节点,对于该后续节点的每个pat。当然,我们表示“从节点到后续节点的链接”的方式只是当前节点名和一个箭头;我们从递归中获得路径的其余部分,包括后继节点的名称。

然后我们需要一个基本情况:如果没有后继者,那么我们就只有一条从这里到终点的路径(因为我们在终点),这就是节点名本身。如果图中的死胡同用空集表示,对我们的代码来说会更简单;但是显然更容易 dict.get 而不是在我们检查的时候索引。

for 条款`。对于基本情况,为了与之匹配,我们需要一个包含一个路径的列表。这给了我们:

def paths(graph, root):
    successors = graph.get(root, set())
    if not successors:
        return [str(root)] # a single path starting here and going nowhere.
    return [
        f'{root} -> {path}'
        for successor in successors
        for path in paths(graph, successor)
    ]

让我们试试:

>>> paths({1: {2, 3}, 2: {4}, 3: {5}, 5: {6}}, 1)
['1 -> 2 -> 4', '1 -> 3 -> 5 -> 6']

或者,可以使用生成器表达式而不是列表理解,甚至可以将其编写为递归生成器(使用 yield 和 yield from

(如果我们觉得不够厚颜无耻,我们可以使用条件表达式继续函数式编程主题:)

def paths(graph, root):
    successors = graph.get(root, set())
    return [
        f'{root} -> {path}'
        for successor in successors
        for path in paths(graph, successor)
    ] if successors else [str(root)]