Column values change after changing the date format in Scala Spark

  • Atharv Thakur  ·  7 years ago

    This is my DataFrame before changing the date format:

    +---------------------+---------------+-------------------------+----------------+------------+-----+-----------+-------------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    |Source_organizationId|Source_sourceId|FilingDateTime_1         |SourceTypeCode_1|DocumentId_1|Dcn_1|DocFormat_1|StatementDate_1          |IsFilingDateTimeEstimated_1|ContainsPreliminaryData_1|CapitalChangeAdjustmentDate_1|CumulativeAdjustmentFactor_1|ContainsRestatement_1|FilingDateTimeUTCOffset_1|ThirdPartySourceCode_1|ThirdPartySourcePriority_1|SourceTypeId_1|ThirdPartySourceCodeId_1|FFAction|!|_1|DataPartition_1|TimeStamp                |
    +---------------------+---------------+-------------------------+----------------+------------+-----+-----------+-------------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    |4295876589           |1              |1977-02-14T03:00:00+00:00|YUH             |null        |null |null       |1976-12-31T00:00:00+00:00|true                       |false                    |1976-12-31T00:00:00+00:00    |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T07:03:27+00:00|
    |4295876589           |8              |1984-02-14T03:00:00+00:00|YUH             |null        |null |null       |1983-12-31T00:00:00+00:00|true                       |false                    |1983-12-31T00:00:00+00:00    |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T09:46:58+00:00|
    |4295876589           |1              |1977-02-14T03:00:00+00:00|YUH             |null        |null |null       |1976-12-31T00:00:00+00:00|true                       |false                    |1976-12-31T00:00:00+00:00    |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T07:30:16+00:00|
    +---------------------+---------------+-------------------------+----------------+------------+-----+-----------+-------------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    

    This is how I change the date format:

    val df2resultTimestamp = finalXmlDf.withColumn("FilingDateTime_1", date_format(col("FilingDateTime_1"), "yyyy-MM-dd'T'HH:mm:ss'Z'"))
          .withColumn("StatementDate_1", date_format(col("StatementDate_1"), "yyyy-MM-dd'T'HH:mm:ss'Z'"))
          .withColumn("CapitalChangeAdjustmentDate_1", date_format(col("CapitalChangeAdjustmentDate_1"), "yyyy-MM-dd'T'HH:mm:ss'Z'"))
          .withColumn("CumulativeAdjustmentFactor_1", regexp_replace(format_number($"CumulativeAdjustmentFactor_1".cast(DoubleType), 5), ",", ""))
    

    This is the output I get. The values in the FilingDateTime_1 column (and the other date columns) have changed:

    +---------------------+---------------+--------------------+----------------+------------+-----+-----------+--------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    |Source_organizationId|Source_sourceId|FilingDateTime_1    |SourceTypeCode_1|DocumentId_1|Dcn_1|DocFormat_1|StatementDate_1     |IsFilingDateTimeEstimated_1|ContainsPreliminaryData_1|CapitalChangeAdjustmentDate_1|CumulativeAdjustmentFactor_1|ContainsRestatement_1|FilingDateTimeUTCOffset_1|ThirdPartySourceCode_1|ThirdPartySourcePriority_1|SourceTypeId_1|ThirdPartySourceCodeId_1|FFAction|!|_1|DataPartition_1|TimeStamp                |
    +---------------------+---------------+--------------------+----------------+------------+-----+-----------+--------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    |4295876589           |1              |1977-02-14T08:30:00Z|YUH             |null        |null |null       |1976-12-31T05:30:00Z|true                       |false                    |1976-12-31T05:30:00Z         |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T07:03:27+00:00|
    |4295876589           |8              |1984-02-14T08:30:00Z|YUH             |null        |null |null       |1983-12-31T05:30:00Z|true                       |false                    |1983-12-31T05:30:00Z         |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T09:46:58+00:00|
    |4295876589           |1              |1977-02-14T08:30:00Z|YUH             |null        |null |null       |1976-12-31T05:30:00Z|true                       |false                    |1976-12-31T05:30:00Z         |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T07:30:16+00:00|
    +---------------------+---------------+--------------------+----------------+------------+-----+-----------+--------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    

    The value should be 1984-02-14T03:00:00Z, not 1984-02-14T08:30:00Z.

    I don't know what I am missing here.

    1 answer
  • Ramesh Maharjan  ·  7 years ago

    You just need to add the built-in to_timestamp function, as follows:

    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.types.DoubleType

    // Parse each date column with an explicit pattern that omits the offset, then re-format it.
    val df2resultTimestamp = df.withColumn("FilingDateTime_1", date_format(to_timestamp(col("FilingDateTime_1"), "yyyy-MM-dd'T'HH:mm:ss"), "yyyy-MM-dd'T'HH:mm:ss'Z'"))
      .withColumn("StatementDate_1", date_format(to_timestamp(col("StatementDate_1"), "yyyy-MM-dd'T'HH:mm:ss"), "yyyy-MM-dd'T'HH:mm:ss'Z'"))
      .withColumn("CapitalChangeAdjustmentDate_1", date_format(to_timestamp(col("CapitalChangeAdjustmentDate_1"), "yyyy-MM-dd'T'HH:mm:ss"), "yyyy-MM-dd'T'HH:mm:ss'Z'"))
      .withColumn("CumulativeAdjustmentFactor_1", regexp_replace(format_number(col("CumulativeAdjustmentFactor_1").cast(DoubleType), 5), ",", ""))
    

    This gives you the correct output:

    +---------------------+---------------+--------------------+----------------+------------+-----+-----------+--------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    |Source_organizationId|Source_sourceId|FilingDateTime_1    |SourceTypeCode_1|DocumentId_1|Dcn_1|DocFormat_1|StatementDate_1     |IsFilingDateTimeEstimated_1|ContainsPreliminaryData_1|CapitalChangeAdjustmentDate_1|CumulativeAdjustmentFactor_1|ContainsRestatement_1|FilingDateTimeUTCOffset_1|ThirdPartySourceCode_1|ThirdPartySourcePriority_1|SourceTypeId_1|ThirdPartySourceCodeId_1|FFAction|!|_1|DataPartition_1|TimeStamp                |
    +---------------------+---------------+--------------------+----------------+------------+-----+-----------+--------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
    |4295876589           |1              |1977-02-14T03:00:00Z|YUH             |null        |null |null       |1976-12-31T00:00:00Z|true                       |false                    |1976-12-31T00:00:00Z         |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T07:03:27+00:00|
    |4295876589           |8              |1984-02-14T03:00:00Z|YUH             |null        |null |null       |1983-12-31T00:00:00Z|true                       |false                    |1983-12-31T00:00:00Z         |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T09:46:58+00:00|
    |4295876589           |1              |1977-02-14T03:00:00Z|YUH             |null        |null |null       |1976-12-31T00:00:00Z|true                       |false                    |1976-12-31T00:00:00Z         |0.82457                     |false                |540                      |SS                    |1                         |3013057       |1000716240              |I|!|         |Japan          |2018-05-03T07:30:16+00:00|
    +---------------------+---------------+--------------------+----------------+------------+-----+-----------+--------------------+---------------------------+-------------------------+-----------------------------+----------------------------+---------------------+-------------------------+----------------------+--------------------------+--------------+------------------------+-------------+---------------+-------------------------+
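
    A likely explanation for the shift seen in the question, offered as a sketch rather than a confirmed diagnosis: when date_format receives the raw string, Spark first casts it to a timestamp, honoring the +00:00 offset, and then renders that instant in the session time zone. The 'Z' in the output pattern is only a quoted literal, so it does not force UTC. The 5 h 30 min difference suggests a +05:30 session zone, so the example below assumes Asia/Kolkata. It uses plain java.time rather than Spark to show the mechanism in isolation; OffsetShiftDemo and render are hypothetical names, not part of the original answer.

```scala
import java.time.{OffsetDateTime, ZoneId}
import java.time.format.DateTimeFormatter

object OffsetShiftDemo {
  // Same output pattern as in the question; 'Z' is a quoted literal, not a zone designator.
  private val outPattern = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'")

  /** Parse an ISO-8601 string with an offset, then render the same instant in `zone`. */
  def render(input: String, zone: ZoneId): String =
    OffsetDateTime.parse(input).atZoneSameInstant(zone).format(outPattern)

  def main(args: Array[String]): Unit = {
    val input = "1984-02-14T03:00:00+00:00"
    // Assumed session zone (+05:30): the wall-clock time moves forward by 5 h 30 min.
    println(render(input, ZoneId.of("Asia/Kolkata"))) // 1984-02-14T08:30:00Z
    // Rendering in UTC keeps the original wall-clock time.
    println(render(input, ZoneId.of("UTC")))          // 1984-02-14T03:00:00Z
  }
}
```

    If this is indeed the cause, setting spark.sql.session.timeZone to UTC before formatting may be an alternative to re-parsing with to_timestamp. That option is not part of the original answer and should be checked against the Spark version in use.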