I'm new to this. I've been trying to explode an `array` nested inside a `struct`. The JSON structure is a bit complex, as shown below.
    {
      "id": 1,
      "firstfield": "abc",
      "secondfield": "zxc",
      "firststruct": {
        "secondstruct": {
          "firstarray": [
            {
              "firstarrayfirstfield": "asd",
              "firstarraysecondfield": "dasd",
              "secondarray": [{ "score": " 7 " }]
            }
          ]
        }
      }
    }
I'm trying to access the `score` field inside `secondarray` so that I can compute a few metrics and get the average score per `id`.
If you are using AWS Glue, convert the DynamicFrame to a Spark DataFrame and then use the `explode` function, once for each array level:
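    from awsglue.dynamicframe import DynamicFrame
    from pyspark.sql.functions import col, explode

    scoresDf = (
        dynamicFrame.toDF()
        # One row per element of firstarray
        .withColumn("firstExplode", explode(col("firststruct.secondstruct.firstarray")))
        # One row per element of the nested secondarray
        .withColumn("secondExplode", explode(col("firstExplode.secondarray")))
        .select("secondExplode.score")
    )
    # Convert back to a DynamicFrame for the rest of the Glue job
    scoresDyf = DynamicFrame.fromDF(scoresDf, glueContext, "scoresDyf")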
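To sanity-check what the two explode steps should produce, and what "average score per id" means on this data, here is a minimal plain-Python sketch that needs neither Spark nor Glue. The helper names `explode_scores` and `average_score_per_id` are made up for illustration, and it assumes that `score` is a numeric string that may carry padding, as in the sample record.

```python
import json
from collections import defaultdict
from statistics import mean

# Sample record copied from the question.
record = json.loads("""
{ "id": 1, "firstfield": "abc", "secondfield": "zxc",
  "firststruct": { "secondstruct": { "firstarray": [
    { "firstarrayfirstfield": "asd", "firstarraysecondfield": "dasd",
      "secondarray": [ { "score": " 7 " } ] } ] } } }
""")


def explode_scores(rec):
    """Yield (id, score) pairs, one per element of every secondarray.

    Hypothetical helper: the two nested loops mirror the two explode calls.
    """
    first_array = rec["firststruct"]["secondstruct"]["firstarray"]
    for first_item in first_array:                      # first explode
        for second_item in first_item["secondarray"]:   # second explode
            # Assumption: score is a padded numeric string, so strip and cast.
            yield rec["id"], float(second_item["score"].strip())


def average_score_per_id(records):
    """Group the exploded scores by id and average each group."""
    scores_by_id = defaultdict(list)
    for rec in records:
        for rec_id, score in explode_scores(rec):
            scores_by_id[rec_id].append(score)
    return {rec_id: mean(scores) for rec_id, scores in scores_by_id.items()}


averages = average_score_per_id([record])
print(averages)
```

Note that the Spark snippet above selects only `secondExplode.score`, so `id` is no longer available afterwards. To get the per-`id` average in Spark you would probably want to keep `id` in the `select`, cast `score` to a numeric type, and then aggregate with something like `groupBy("id").agg(avg(...))`.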