
Split text at the first match of any item in a list

  •  2
  • Nico Müller  · 7 years ago

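    The function below works as long as the text does not contain more than one preposition.

    "my keys are behind the window" gives before: "my keys", after: "behind the window"

    before: "my keys are under the table"

    "my keys are in the box under the table in the kitchen" gives before: "my keys", after: "in the box under the table in the kitchen"

    In the second example, the result should be ["my keys", "under the table in the kitchen"].

    What is an elegant way to find the first match of any word from a list?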

    def get_text_after_preposition_of_place(text):
        """Returns the texts before[0] and after[1] <preposition of place>"""

        prepositions_of_place = ["in front of","behind","in","on","under","near","next to","between","below","above","close to","beside"]
        textres = ["",""]

        for key in prepositions_of_place:
            if textres[0] == "":
                if key in text:
                    textres[0] = text.split(key, 1)[0].strip()
                    textres[1] = key + " " + text.split(key, 1)[1].strip()
        return textres
    
    1 answer  |  7 years ago
  •  3
  •   Thierry Lathuille    7 years ago

    You can use re.split:

    import re
    
    def get_text_after_preposition_of_place(text):
        """Returns the texts before[0] and after[1] <preposition of place>"""
    
        prepositions_of_place = ["in front of","behind","in","on","under","near","next to","between","below","above","close to","beside"]
        # \b keeps matches on word boundaries; the parentheses capture the matched preposition
        preps_re = re.compile(r'\b(' + '|'.join(prepositions_of_place) + r')\b')
    
        split = preps_re.split(text, maxsplit=1)
        return split[0], split[1]+split[2]
    
    print(get_text_after_preposition_of_place('The cat in the box on the table'))  
    # ('The cat ', 'in the box on the table')
    

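    The regex built from the list has the form \b(in|on|under)\b. Note the parentheses: they capture the string we split on, so that it is kept in the output.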
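    To make the effect of the parentheses concrete, here is a small sketch comparing a non-capturing group with a capturing one. The shortened pattern (in|on|under) is only for illustration and is not the full list from the answer.

```python
import re

text = "The cat in the box on the table"

# Non-capturing group: the separator is dropped from the result.
without_group = re.split(r"\b(?:in|on|under)\b", text, maxsplit=1)

# Capturing group: the matched separator is kept as its own list item.
with_group = re.split(r"\b(in|on|under)\b", text, maxsplit=1)

print(without_group)  # ['The cat ', ' the box on the table']
print(with_group)     # ['The cat ', 'in', ' the box on the table']
```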

    Then we split, allowing at most 1 split, and concatenate the last two parts: the preposition and the rest of the string.
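    One case the function above does not cover is a text with no preposition at all: the split then returns a single item, and split[1] raises an IndexError. The sketch below is one possible variant, not part of the original answer. The name split_at_first_preposition and the choice to return an empty second string are assumptions made for illustration. It also sorts the alternatives longest first and escapes them, so the result does not depend on the order of the list.

```python
import re

PREPOSITIONS_OF_PLACE = [
    "in front of", "behind", "in", "on", "under", "near",
    "next to", "between", "below", "above", "close to", "beside",
]

# Longest first, so "in front of" is tried before "in" whatever the list order.
_alternatives = sorted(PREPOSITIONS_OF_PLACE, key=len, reverse=True)
PREPS_RE = re.compile(r"\b(" + "|".join(map(re.escape, _alternatives)) + r")\b")


def split_at_first_preposition(text):
    """Hypothetical helper: return (before, preposition + rest).

    If no preposition is found, return (text, "") instead of raising.
    """
    parts = PREPS_RE.split(text, maxsplit=1)
    if len(parts) == 1:
        # No match: re.split returns the whole text as a single item.
        return text.strip(), ""
    before, preposition, after = parts
    return before.strip(), preposition + after


print(split_at_first_preposition("my keys are under the table in the kitchen"))
# ('my keys are', 'under the table in the kitchen')
```

    Because the pattern uses \b, "in" is matched only as a whole word, so the "in" inside "window" or "behind" is ignored.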