代码之家  ›  专栏  ›  技术社区  ›  pOrinG

REGEXP捕获由一组分隔符分隔的值

  •  0
  • pOrinG  · 技术社区  · 7 年前

    我的列值如下所示:[只是我创建的一个示例]

    {BASICINFOxxxFyyy100x} {CONTACTxxx12345yyy20202x}

    它可以包含0个或多个数据块。。。我创建了下面的查询来分割块

    with x as
     (select 
     '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' a from dual)
     select REGEXP_SUBSTR(a,'({.*?x})',1,rownum,null,1)
     from x
     connect by rownum <= REGEXP_COUNT(a,'x}')
    

    然而,我想进一步将输出分成如下3列:

    ColumnA    | ColumnB | ColumnC
    ------------------------------
    BASICINFO  | F       |100
    CONTACT    | 12345   |20202
    

    分隔符始终是标准的。我未能创建一个提供所需输出的漂亮查询。

    提前谢谢。

    3 回复  |  直到 7 年前
        1
  •  1
  •   MT0    7 年前

    SQL Fiddle

    Oracle 11g R2架构设置 :

    CREATE TABLE your_table ( str ) AS
    SELECT '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' from dual
    /
    

    查询1 :

    select REGEXP_SUBSTR(
             t.str,
             '\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}',
             1,
             l.COLUMN_VALUE,
             NULL,
             1
           ) AS col1,
           REGEXP_SUBSTR(
             str,
             '\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}',
             1,
             l.COLUMN_VALUE,
             NULL,
             2
           ) AS col2,
           REGEXP_SUBSTR(
             str,
             '\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}',
             1,
             l.COLUMN_VALUE,
             NULL,
             3
           ) AS col3
    FROM   your_table t
           CROSS JOIN
           TABLE(
             CAST(
               MULTISET(
                 SELECT LEVEL
                 FROM   DUAL
                 CONNECT BY LEVEL <= REGEXP_COUNT( t.str,'\{([^}]*?)xxx([^}]*?)yyy([^}]*?)x\}')
               ) AS SYS.ODCINUMBERLIST
             )
           ) l
    

    Results :

    |      COL1 |  COL2 |  COL3 |
    |-----------|-------|-------|
    | BASICINFO |     F |   100 |
    |   CONTACT | 12345 | 20202 |
    

    注:

    您的查询:

    select REGEXP_SUBSTR(a,'({.*?x})',1,rownum,null,1)
    from x
    connect by rownum <= REGEXP_COUNT(a,'x}')
    

    当您有多行输入时将不起作用-在 CONNECT BY 子句中,分层查询没有任何限制它将行1-Level2连接到行1-Level1或行2-Level1,因此它将同时将其连接到这两个行,并且随着层次的深度越来越大,它将创建输出行的成倍增加的副本。有一些黑客可以用来阻止这种情况,但将行生成器放入相关子查询中效率更高,然后可以 CROSS JOIN 如果要使用分层查询,请返回原始表(它是相关的,因此不会连接到错误的行)。

        2
  •  1
  •   Connor McDonald    7 年前
    SQL> with x as
      2   (select  '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' a from dual
      3   ),
      4  y as (
      5   select REGEXP_SUBSTR(a,'({.*?x})',1,rownum,null,1) c1
      6   from x
      7   connect by rownum <= REGEXP_COUNT(a,'x}')
      8  )
      9  select
     10    substr(c1,2,instr(c1,'xxx')-2) z1,
     11    substr(c1,instr(c1,'xxx')+3,instr(c1,'yyy')-instr(c1,'xxx')-3) z2,
     12    rtrim(substr(c1,instr(c1,'yyy')+3),'x}') z3
     13  from y;
    
    Z1              Z2              Z3
    --------------- --------------- ---------------
    BASICINFO       F               100
    CONTACT         12345           20202
    
        3
  •  1
  •   XING    7 年前

    这是另一个解决方案,它是从您离开的地方导出的。您的查询已导致将一行拆分为两行。下面将分为3列:

    WITH x
         AS (SELECT '{BASICINFOxxxFyyy100x}{CONTACTxxx12345yyy20202x}' a
               FROM DUAL),
    -- Your query result here
         tbl
         AS (    SELECT REGEXP_SUBSTR (a,
                                       '({.*?x})',
                                       1,
                                       ROWNUM,
                                       NULL,
                                       1)
                           Col
                   FROM x
             CONNECT BY ROWNUM <= REGEXP_COUNT (a, 'x}'))
    --- Actual Query
    SELECT col,
           REGEXP_SUBSTR (col,
                          '(.*?{)([^x]+)',
                          1,
                          1,
                          '',
                          2)
              AS COL1,
           REGEXP_SUBSTR (REGEXP_SUBSTR (col,
                                         '(.*?)([^x]+)',
                                         1,
                                         2,
                                         '',
                                         2),
                          '[^y]+',
                          1,
                          1)
              AS COL2,
           REGEXP_SUBSTR (REGEXP_SUBSTR (col,
                                         '[^y]+x',
                                         1,
                                         2),
                          '[^x]+',
                          1,
                          1)
              AS COL3
      FROM tbl;
    

    输出:

    SQL> /
    
    COL                                              COL1                                             COL2                                             COL3
    ------------------------------------------------ ------------------------------------------------ ------------------------------------------------ ------------------------------------------------
    {BASICINFOxxxFyyy100x}                           BASICINFO                                        F                                                100
    {CONTACTxxx12345yyy20202x}                       CONTACT                                          12345                                            20202