代码之家  ›  专栏  ›  技术社区  ›  Chris Medlin

SQL-将具有编码文本值的行转换/转置为列

  •  2
  • Chris Medlin  · 技术社区  · 7 年前

    我正在寻找帮助,将数据存储在行中的表转换为存储在表中的数据。

    出身背景我正在处理一个包含住院数据的表。我们把这张表叫做“住院病人”。

    数据当前格式为一个包含3列和 N 行。这3列包含以下数据:

  • “患者ID” = 唯一的患者/个人标识符 .将此视为患者的姓名;
  • “Event\u ID” = 唯一接纳事件标识符 .确定医院护理的独特事件;
  • “Diagnosis\u代码” = ICD-10 code 用于记录患者住院的原因。

    对于单个患者(患者ID), 每个 住院(Event\u ID)表示为 一个或多个 表中的行,其中一行用于为给定住院时间记录的每个诊断。

    因此,表格中的一行(一个记录的诊断)可以记录任何给定的住院时间 表中有多行(与多个诊断相关)。

    下面给出了当前“住院患者”表的一个示例。。。

    -------------------------------------------
    Patient_ID |  Event_ID   |  Diagnosis_Code
    -------------------------------------------
    Pers001    | HospStay001 |     C139
    Pers001    | HospStay001 |     I245
    Pers001    | HospStay001 |     D456
    Pers001    | HospStay002 |     C139
    Pers001    | HospStay002 |     J123
    Pers555    | HospStay001 |     D312
    Pers999    | HospStay001 |     C120
    Pers999    | HospStay001 |     E101
    

    这是我真正想做的 :我想转换数据,以便 每名患者每次住院仅一排 ,因此上表的格式如下:

    ----------------------------------------------------------------------------------------------------
    Patient_ID |  Event_ID   | Diagnosis_Code_1 | Diagnosis_Code_2 | Diagnosis_Code_3 | Diagnosis_Code_n
    ----------------------------------------------------------------------------------------------------
    Pers001    | HospStay001 |       C139       |       I245       |       D456       |
    Pers001    | HospStay002 |       C139       |       J123       |                  |
    Pers555    | HospStay001 |       D312       |                  |                  |
    Pers999    | HospStay001 |       C120       |       E101       |                  |
    

    我怀疑解决方案需要一些动态sql。。。恐怕这不是我的强项。

    谢谢

  • 3 回复  |  直到 7 年前
        1
  •  1
  •   Rajat Jaiswal    7 年前
    CREATE  table #source (Patient_ID varchar(100), Event_ID varchar (100) ,Diagnosis_Code VARCHAR(100),Dig_Number INT)
    insert into #source (Patient_ID, Event_ID,Diagnosis_Code,Dig_Number) values
    ('Pers001','HospStay001','I245',2),
    ('Pers001','HospStay001','D456',3),
    ('Pers001','HospStay002','C139',1),
    ('Pers001','HospStay002','J123',2),
    ('Pers555','HospStay001','D312',1),
    ('Pers999','HospStay001','C120',1),
    ('Pers999','HospStay001','E101',2),
    ('Pers001','HospStay001','C139',1)
    
    
    --DROP TABLE tempdb..#source
    
    
    DECLARE @cols AS NVARCHAR(MAX),
            @query AS NVARCHAR(MAX)
    
    SELECT @cols = STUFF
            (
              (
                SELECT ',' + QUOTENAME( CONVERT(VARCHAR(10),Dig_Number))
                FROM #source
                GROUP BY Dig_Number
    
                ORDER BY Dig_Number
                FOR XML PATH(''), TYPE
              ).value('.', 'NVARCHAR(MAX)'),
              1,1,''
            );
    
    SET @query = 'SELECT Patient_ID,Event_ID,' + @cols + ' 
                  FROM
                  (
                    SELECT Patient_ID,Event_ID,Diagnosis_Code,dig_number
                    FROM #source
                 ) x
                 PIVOT
                 (
                    MAX(Diagnosis_Code)
                    FOR Dig_Number IN (' + @cols + ')
                 ) p ';
    
    EXECUTE(@query);
    

    如果再增加一列,即dianosis数,则会起作用。

        2
  •  1
  •   Chris Medlin    7 年前

    感谢Rajat Jaiswal、LeasMaps和Tim Biegeleisen的贡献。非常感谢。

    建议添加 原始表的附加列 在转换后的表中用作列标题是关键。事实证明,这相对容易做到(我在MS Excel中做到了这一点)。

    所以我的原始表格被编辑成这样。。。

    --------------------------------------------------------------
    Patient_ID |  Event_ID   | Diagnosis_Code | DiagCode_Counter |
    --------------------------------------------------------------
    Pers001    | HospStay001 |     C139       | Diagnosis_Code_1 |
    Pers001    | HospStay001 |     I245       | Diagnosis_Code_2 |
    Pers001    | HospStay001 |     D456       | Diagnosis_Code_3 |
    Pers001    | HospStay002 |     C139       | Diagnosis_Code_1 |
    Pers001    | HospStay002 |     J123       | Diagnosis_Code_2 |
    Pers555    | HospStay001 |     D312       | Diagnosis_Code_1 |
    Pers999    | HospStay001 |     C120       | Diagnosis_Code_1 |
    Pers999    | HospStay001 |     E101       | Diagnosis_Code_2 |
    --------------------------------------------------------------
    

    在he中新增 “DiagCode\u计数器” 字段中,每次新的 Diagnosis\u代码 根据唯一 “Event\u ID”

    然后我就可以创建一个 交叉表 在MS Access中使用 “患者ID” “Event\u ID” 字段为 一行 标题;的“DiagCode\u计数器”字段 标题;以及 “Diagnosis\u代码” 条目为 价值观

        3
  •  0
  •   LeasMaps    7 年前

    Rajat是正确的-您需要某种列来创建diagnosis\u column\u 1、diagnosis\u column\u 2。。。。等

    要在ms access中执行此操作,我将: 1、创建一个虚拟列来统计诊断列 2、使用VBA填充它(大型数据库更快),如下所示

    Sub Update_Diagnosis_Code_ID()
    
    Dim db As DAO.Database
    'Dim qdf As DAO.QueryDef
    Dim rs As DAO.Recordset
    
    Dim TmpRecord As String
    
    Dim dummyId As Integer
    Dim patientID As String
    Dim eventID As String
    
    Dim lastDummyId As Integer
    Dim lastpatientID As String
    Dim lasteventID As String
    
    Dim i As Integer
    
    pstrSQL = "SELECT Inpat.Dummy_id, Inpat.Patient_id, Inpat.Event_ID, Inpat.Diagnosis_Code FROM Inpat ORDER BY Inpat.Patient_id, Inpat.Event_ID;"
    Set db = CurrentDb
    
    Set rs = db.OpenRecordset(pstrSQL)
    
    dummyId = 0
    
    
    With rs
        If Not .EOF Then
        'first record
          .MoveFirst
    
            patientID = rs.Fields(1) '
            eventID = rs.Fields(2) '
            .Edit
            rs.Fields(0) = dummyId + 1
            .Update
            .MoveNext
    
            Do While Not .EOF
              'store the values from the last record
              lastpatientID = patientID
              lasteventID = eventID
    
              'get the new values
    
              patientID = rs.Fields(1) '
              eventID = rs.Fields(2) '
    
              'new patient or new hospital stay
              If patientID <> lastpatientID Or eventID <> lasteventID Then
                dummyId = 0 'reset back to 1
              Else
                dummyId = dummyId + 1
              End If
    
              .Edit
              rs.Fields(0) = dummyId + 1
              .Update
              .MoveNext
    
            Loop
          End If
        End With
    
    rs.Close
    
    Set rs = Nothing
    Set dbs = Nothing
    
    MsgBox "Finished", vbExclamation
    
    End Sub
    

    如果使用交叉表显示数据:

    TRANSFORM First(Inpat.[Diagnosis_Code]) AS FirstOfDiagnosis_Code
    SELECT Inpat.[Patient_id], Inpat.[Event_ID], Count(Inpat.[Diagnosis_Code]) 
    AS [Total Of Diagnosis_Code]
    FROM Inpat
    GROUP BY Inpat.[Patient_id], Inpat.[Event_ID]
    PIVOT Inpat.[Dummy_id];