
Split text at the first match of any item in a list

parsing list python

Nico Müller · 7 years ago

The function below works as long as the text does not contain more than one preposition.

"my keys are behind the window" gives before: "my keys are", after: "behind the window"

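"my keys are under the table in the kitchen" gives before: "my keys are under the table", after: "in the kitchen"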

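"my keys are in the box under the table in the kitchen" gives before: "my keys are", after: "in the box under the table in the kitchen"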

In the second example, the result should be ["my keys are", "under the table in the kitchen"].

What would be an elegant way to find the first match of any word in the list?

def get_text_after_preposition_of_place(text):
    """Returns the texts before[0] and after[1] <preposition of place>"""

    prepositions_of_place = ["in front of","behind","in","on","under","near","next to","between","below","above","close to","beside"]
    textres = ["",""]

    for key in prepositions_of_place:
        if textres[0] == "":
            if key in text:
                textres[0] = text.split(key, 1)[0].strip()
                textres[1] = key + " " + text.split(key, 1)[1].strip()
    return textres

1 answer

Thierry Lathuille · 7 years ago

You can use re.split:

import re

def get_text_after_preposition_of_place(text):
    """Returns the texts before[0] and after[1] <preposition of place>"""

    prepositions_of_place = ["in front of","behind","in","on","under","near","next to","between","below","above","close to","beside"]
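    # The capturing group keeps the matched preposition in the split result
    preps_re = re.compile(r'\b(' + '|'.join(prepositions_of_place) + r')\b')

    # Split at most once, at the leftmost preposition in the text
    split = preps_re.split(text, maxsplit=1)
    return split[0], split[1]+split[2]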

print(get_text_after_preposition_of_place('The cat in the box on the table'))  
# ('The cat ', 'in the box on the table')

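The regex built here has the form (in|on|under), wrapped in \b word boundaries so that "in" does not match inside a word such as "behind". Note the parentheses: they form a capturing group, which makes re.split keep the string it split on in the output.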
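To make the effect of the parentheses concrete, here is a small sketch comparing a non-capturing group with a capturing one. The sample sentence and the shortened three-word pattern are illustrative assumptions, not part of the answer's code.

```python
import re

text = 'The cat in the box'

# Non-capturing group (?:...): the separator is dropped from the result.
without_group = re.split(r'\b(?:in|on|under)\b', text)

# Capturing group (...): the matched separator is kept as its own element.
with_group = re.split(r'\b(in|on|under)\b', text)

print(without_group)  # ['The cat ', ' the box']
print(with_group)     # ['The cat ', 'in', ' the box']
```

With the capturing group, the preposition sits between the two halves, which is what allows it to be glued back onto the second part.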

Then we split, allowing at most one split, and concatenate the last two parts: the preposition and the rest of the string.
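The same idea can be hardened a little. The following is only a sketch under a few assumptions of mine: the helper name split_at_first_preposition is hypothetical, returning an empty second element when no preposition is found is an arbitrary convention, and surrounding whitespace is stripped from the first part as in the question's code. The alternatives are sorted longest first and passed through re.escape, so the result does not depend on the order of the list.

```python
import re

PREPOSITIONS_OF_PLACE = ["in front of", "behind", "in", "on", "under", "near",
                         "next to", "between", "below", "above", "close to", "beside"]

# Longest first, so "in front of" is tried before "in" at the same position.
_alternatives = sorted(PREPOSITIONS_OF_PLACE, key=len, reverse=True)
PREPS_RE = re.compile(r'\b(' + '|'.join(map(re.escape, _alternatives)) + r')\b')

def split_at_first_preposition(text):
    """Hypothetical helper: return (before, after) around the leftmost preposition."""
    parts = PREPS_RE.split(text, maxsplit=1)
    if len(parts) == 1:
        # No preposition found: assumed convention is an empty second element.
        return text.strip(), ""
    before, prep, rest = parts
    return before.strip(), prep + rest

print(split_at_first_preposition("my keys are under the table in the kitchen"))
# ('my keys are', 'under the table in the kitchen')
print(split_at_first_preposition("nothing to split"))
# ('nothing to split', '')
```

Matching is case-sensitive here; passing re.IGNORECASE to re.compile would be one way to also catch a preposition at the start of a sentence.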
