代码之家 › 专栏 › 技术社区 › ddbeck

如何处理嵌套列表?

list parsing python

ddbeck · 技术社区 · 15 年前

假设我有这样一个项目符号列表:

* list item 1
* list item 2 (a parent)
** list item 3 (a child of list item 2)
** list item 4 (a child of list item 2 as well)
*** list item 5 (a child of list item 4 and a grand-child of list item 2)
* list item 6

编辑

[('list item 1',), ('list item 2', [('list item 3',), ('list item 4', [('list item 5',)])] ('list item 6',)]

[('list item 1',), ('list item 2', [('list item 3',), ('list item 4', [('list item 5',)])]), ('list item 6',)]

我已经尝试用普通Python和Pyparsing做了一些实验,但是没有进展。我还有两个主要问题:

我需要用什么策略来实现这个目标?我知道递归是解决方案的一部分,但是我很难把它和斐波那契序列联系起来。

3 回复 | 直到 15 年前

Alex Martelli 15 年前

要使树结构显式化,例如:

data = '''* list item 1
* list item 2
** list item 3
** list item 4
*** list item 5
* list item 6'''.splitlines()

class Node(object):
  def __init__(self, payload):
    self.payload = payload
    self.children = []
  def show(self, indent):
    print ' '*indent, self.payload
    for c in self.children:
      c.show(indent+2)

def makenest(linelist):
  rootnode = Node(None)
  stack = [(rootnode, 0)]
  for line in linelist:
    for i, c in enumerate(line):
      if c != '*': break
    stars, payload = line[:i], line[i:].strip()
    curlev = len(stars)
    curnod = Node(payload)
    while True:
      parent, level = stack[-1]
      if curlev > level: break
      del stack[-1]
    # a child node of the current top-of-stack
    parent.children.append(curnod)
    stack.append((curnod, curlev))
  rootnode.show(0)

makenest(data)

这个 show Node 合适的(可能是递归的)方法——那么,您能给出这个缺少的规范吗。。。?

编辑 class Node

  def emit(self):
    if self.children:
      return (self.payload,
              [c.emit() for c in self.children])
    else:
      return (self.payload,)

并更改代码的最后三行 makenest 制造 )收件人:

  return [c.emit() for c in rootnode.children]

print(makenest(data))

(后面的括号) print 在Python2中是多余的,但是在Python3中是必需的,所以我把它们放在那里以防万一;-)。

通过这些微小的更改,我的代码可以按要求运行,现在

[('list item 1',), ('list item 2', [('list item 3',), ('list item 4', [('list item 5',)])]), ('list item 6',)]

xiao å¸ 15 年前

从搜索算法的角度看,您给出的项目符号实际上是深度优先搜索生成的序列。所以我的策略就是用dfs序列重建树结构。

下面是python代码:

from collections import deque
def dfsBullet(bullet,depth):
    """
       parse the subtree with depth and the startnode of bullet[0]
    """
    li = []
    if depth != 0:
            item = bullet.popleft()
            li.append(item.split(' ',1)[1])
    while (len(bullet) != 0):
            item = bullet[0]
            #apply same algo to the child node
            if len(item.split(' ',1)[0]) > depth:
                    sublist = dfsBullet(bullet, len(item.split(' ')[0]))
            #we have traverse all childnode, so go back 
            else:
                    return li
            #add child tree to the list
            li.append(sublist)
    return li

Amber 15 年前

如果下一行的深度小于当前深度,则返回当前列表。