You can convert this column to a JSON string with to_json, for example:
from pyspark.sql.functions import to_json
df_pyspark = df_pyspark.withColumn("occurred", to_json("occurred"))
This gives the following value in the occurred column:
[{"occurredTimes":3,"sys":[{"varTyp":"Conf Param","varCode":"P33"}],"userAssignments":[]}]
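After the conversion the column holds a plain string rather than an array of structs. As a quick sanity check outside Spark, the value shown above can be parsed with the standard json module. This is only an illustrative sketch; the variable names are made up for the example.

```python
import json

# The string produced by to_json for the sample row shown above.
occurred_json = '[{"occurredTimes":3,"sys":[{"varTyp":"Conf Param","varCode":"P33"}],"userAssignments":[]}]'

# Parse it back into ordinary Python lists and dicts.
parsed = json.loads(occurred_json)
first = parsed[0]

print(first["occurredTimes"])
print(first["sys"][0]["varCode"])
print(first["userAssignments"])
```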
If your code fails, try creating the PySpark DataFrame with an explicit schema before the step above:
from pyspark.sql.types import (
    ArrayType,
    IntegerType,
    StringType,
    StructField,
    StructType,
)

# Explicit schema: "occurred" is an array of structs, each holding a nested "sys" array of structs.
json_schema = StructType(
    [
        StructField("fd", StringType(), True),
        StructField("appVar", ArrayType(StringType()), True),
        StructField("varMode", StringType(), True),
        StructField(
            "occurred",
            ArrayType(
                StructType(
                    [
                        StructField("occurredTimes", IntegerType(), True),
                        StructField(
                            "sys",
                            ArrayType(
                                StructType(
                                    [
                                        StructField("varTyp", StringType(), True),
                                        StructField("varCode", StringType(), True),
                                    ]
                                )
                            ),
                            True,
                        ),
                        StructField("userAssignments", ArrayType(StringType()), True),
                    ]
                )
            ),
            True,
        ),
    ]
)