代码之家  ›  专栏  ›  技术社区  ›  Nebula Tech

在SQL Server中搜索字符串中单词的多次出现

  •  -1
  • Nebula Tech  · 技术社区  · 4 月前

    我在SQL Server表的一列中有以下文本。

    "55-9988, Version 1.0 dated 07/20/2009
    3684 for 66-0022
    IB from Microsoft , for Monovalent A/king Influenza Subvirion Vaccine, Version 1.0 dated 06/27/2009
    Package Insert from Microsoft , for Fluzone, dated 06/2008
    Package Insert from Microsoft , for H5N1, dated 04/2007
    IB from google, for AS93 as an Adjuvant for use with a  king Vaccine, Version 1.0 dated 07/2009
    Package Insert from Microsoft , for Fluzone, Version 37 dated 06/18/2008
    55-9988 MIA, Version 1.0 dated 07/20/2009"
    

    这是一个单元格中的示例,我有多行文本相似。

    在这种情况下,我需要获得模式 xx-xxxx ( 55-9988 , 66-0022 )从上面的字符串中,我需要所有的出现。

    我能够获取事件的第一个实例,但无法从文本中获取其余事件。

    1 回复  |  直到 4 月前
        1
  •  0
  •   Yitzhak Khabinsky    4 月前

    请根据以下内容尝试以下解决方案 STRING_SPLIT() 功能使用。

    此外,它还使用了增强的 TRIM() 功能。因此,它将从SQL Server 2017开始工作。

    LIKE '[0-9][0-9]-[0-9][0-9][0-9][0-9]%' 子句正在寻找与特定模式匹配的标记。

    我补充道 方法#2 基于XML的实现,以支持SQL Server 2008。

    SQL

    -- DDL and sample data population, start
    DECLARE @tbl TABLE (id INT IDENTITY PRIMARY KEY, tokens VARCHAR(1024));
    INSERT INTO @tbl (tokens) VALUES
    ('55-9988, Version 1.0 dated 07/20/2009
    3684 for 66-0022
    IB from Microsoft , for Monovalent A/king Influenza Subvirion Vaccine, Version 1.0 dated 06/27/2009
    Package Insert from Microsoft , for Fluzone, dated 06/2008
    Package Insert from Microsoft , for H5N1, dated 04/2007
    IB from google, for AS93 as an Adjuvant for use with a  king Vaccine, Version 1.0 dated 07/2009
    Package Insert from Microsoft , for Fluzone, Version 37 dated 06/18/2008
    55-9988 MIA, Version 1.0 dated 07/20/2009'),
    ('fafa');
    -- DDL and sample data population, end
    
    -- Method #1
    -- SQL Server 2017 onwards
    DECLARE @CrLf CHAR(2) = CHAR(13) + CHAR(10);
    
    SELECT id, token
    FROM @tbl
    CROSS APPLY STRING_SPLIT(REPLACE(tokens, @CrLf,SPACE(1)), SPACE(1))
    CROSS APPLY (SELECT TRIM(',' FROM value)) AS t(token)
    WHERE token LIKE '[0-9][0-9]-[0-9][0-9][0-9][0-9]%';
    
    -- Method #2
    -- SQL Server 2008 onwards
    DECLARE @separator CHAR(1) = SPACE(1);
    
    SELECT id, REPLACE(token, ',', '') AS token
    FROM @tbl
    CROSS APPLY (SELECT CAST('<root><r><![CDATA[' + 
          REPLACE(REPLACE(tokens, @CrLf,@separator), @separator, ']]></r><r><![CDATA[') + 
          ']]></r></root>' AS XML)) AS t1(c)
    CROSS APPLY c.nodes('/root/r/text()') AS t2(x)
    CROSS APPLY (SELECT x.value('.', 'VARCHAR(50)')) AS t3(token)
    WHERE token LIKE '[0-9][0-9]-[0-9][0-9][0-9][0-9]%';
    

    输出

    身份证件 代币
    1. 55-9988
    1. 66-0022
    1. 55-9988