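I hit the error below while trying to write data to HDFS. The job had been working fine before this error appeared, so there is apparently a problem with the data.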
18/09/15 04:13:43 ERROR JobScheduler: Error running job streaming job 1536977640000 ms.0
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:347)
at scala.None$.get(Option.scala:345)
at org.apache.spark.sql.execution.command.DataWritingCommand$class.metrics(DataWritingCommand.scala:49)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.metrics$lzycompute(InsertIntoHadoopFsRelationCommand.scala:46)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.metrics(InsertIntoHadoopFsRelationCommand.scala:46)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.metrics$lzycompute(commands.scala:100)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.metrics(commands.scala:100)
at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:58)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:654)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:273)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:267)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:225)
Does this mean my output DStream has nothing in it? Below is the code I use to write the DStream to HDFS:
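outputDStream.repartition(100).foreachRDD((rdd: RDD[Transaction], time: SparkTime) => {
  // Convert each micro-batch RDD to a DataFrame
  val df = rdd.toDF
  // Add a column holding the time of the write
  val dfWithTimestamp = df.select("*").withColumn("current_timestamp", current_timestamp())
  // SaveMode.Overwrite replaces whatever is already at outputPath on every batch
  dfWithTimestamp.write
    .mode(SaveMode.Overwrite)
    .save(s"${outputPath}")
})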
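
For reference, the top two frames of the trace (scala.None$.get) mean that something called .get on an empty Option. The frames right above them are in DataWritingCommand.metrics, so the empty value may be looked up while the write command is being set up rather than while rows are processed; that is only my reading of the trace, not a confirmed diagnosis. Below is a minimal plain-Scala sketch of that failure mode, independent of Spark. The names NoneGetDemo, lookup, unsafeMessage and safe are hypothetical and exist only for illustration.

```scala
import scala.util.Try

object NoneGetDemo {
  // Hypothetical stand-in for any lookup that may come back empty.
  def lookup(present: Boolean): Option[String] =
    if (present) Some("context") else None

  // Calling .get on None throws java.util.NoSuchElementException;
  // this captures the exception and returns its message.
  def unsafeMessage(): String =
    Try(lookup(present = false).get).failed.get.getMessage

  // Handling the empty case explicitly avoids the exception.
  def safe(present: Boolean): String =
    lookup(present).getOrElse("missing")

  def main(args: Array[String]): Unit = {
    println(unsafeMessage())
    println(safe(present = false))
  }
}
```

The message captured by unsafeMessage matches the one in the log above, which only confirms that the exception comes from an empty Option; it does not show which Option was empty or why.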