
How to extract strings between multiple nested parentheses in R [duplicate]

  • Adrian  · Tech Community  · 2 years ago
    mystring <- c("code IS (k(384333)\n   AND parse = TURE \n ) \n 
                  code IS (\n FROM (43343344)\n ) some information code IS \n
                  code IS (  ( \n (data)(23423422 \n)) ) ) and more information)")
    

    I want to extract every instance of code IS (...). Because the parentheses are nested, however, my regex stops at the first closing parenthesis.

    library(stringr)
    > str_extract_all(pattern = 'code IS \\([\\s\\S]+?\\)', mystring)
    [[1]]
    [1] "code IS (k(384333)"          "code IS (\n FROM (43343344)" "code IS (  ( \n (data)"    
    

    The desired output is

    [[1]]
    [1] "code IS (k(384333)\n   AND parse = TURE \n )"          "code IS (\n FROM (43343344)\n )" "code IS (  ( \n (data)(23423422 \n)) )" 
    

    Edit: potential regex solutions are covered here :

    The question now is how to adapt those solutions to work with str_extract_all in R.

    I tried using a PCRE pattern:

    > str_extract_all(pattern = 'code IS \((?:[^)(]+|(?R))*+\)', mystring)
    Error: '\(' is an unrecognized escape in character string starting "'code IS \(" 
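
    One possible way to adapt the recursive pattern, sketched under a few assumptions. The error above arises because a backslash inside an R string literal has to be doubled (`\\(`). Separately, stringr's ICU regex engine does not support recursion such as `(?R)`, so this sketch falls back to base R's `gregexpr()` with `perl = TRUE`. It recurses into the parenthesised group with `(?1)` rather than into the whole pattern, since `(?R)` would require `code IS ` to reappear at every nesting level. The sample string `x` and the variable names are made up for illustration.

```r
# Hypothetical sample input (not the original mystring), with nested parentheses.
x <- "code IS (a(1) b) junk code IS (c) code IS ((d)(e)) )"

# Group 1 matches one balanced pair of parentheses:
#   [^()]++  consumes a run of non-parenthesis characters
#   (?1)     recurses into group 1 to match a nested balanced pair
pattern <- "code IS (\\((?:[^()]++|(?1))*+\\))"

# perl = TRUE selects the PCRE engine, which supports recursion.
result <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
print(result)
```

    Because `[^()]` also matches newlines, the same pattern should carry over to a multi-line input such as `mystring`.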
    