
Scio: all saveAs* text file methods output text files with a "part" prefix

  •  0
  • Andrew Cassidy  ·  6 years ago

    private[scio] def pathWithShards(path: String) = path.replaceAll("\\/+$", "") + "/part" 
    

    This helper forces output file names to start with "part". Is saveAsCustomOutput the only way to write sharded files with a custom name?
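To make the effect of the helper concrete, here is a minimal sketch that copies the quoted expression into a standalone object. The object name `PathWithShardsDemo` and the sample paths are illustrative assumptions, not part of Scio.

```scala
object PathWithShardsDemo {
  // Same expression as the Scio helper quoted above, copied here for illustration only.
  def pathWithShards(path: String): String =
    path.replaceAll("\\/+$", "") + "/part"

  def main(args: Array[String]): Unit = {
    // Trailing slashes are stripped, then "/part" is appended,
    // so the shard file prefix is "part" for any output path.
    println(pathWithShards("gs://foo/bar"))
    println(pathWithShards("gs://foo/bar///"))
  }
}
```

Both calls yield the same string, which is why the caller cannot pick a different file prefix through this code path.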

    2 replies  |  as of 6 years ago
        1
  •  3
  •   Andrew Cassidy    6 years ago

    I had to write this with the Beam API directly, via saveAsCustomOutput:

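    import com.google.api.client.json.JsonFactory
    import com.spotify.scio.bigquery.types.BigQueryType
    import com.spotify.scio.values.SCollection
    import org.apache.beam.sdk.io.{FileBasedSink, TextIO}
    import org.apache.beam.sdk.util.Transport

    val jsonFactory: JsonFactory = Transport.getJsonFactory
    val outputPath = "gs://foo/bar_" // file prefix will be bar_
    @BigQueryType.toTable()
    case class Clazz(foo: String, bar: String)
    val collection: SCollection[Clazz] = ....
    collection
      .map(Clazz.toTableRow)
      .map(row => jsonFactory.toString(row)) // serialize each TableRow to a JSON string
      .saveAsCustomOutput(name = "CustomWrite", TextIO.write()
        .to(outputPath)
        .withSuffix("")
        .withWritableByteChannelFactory(FileBasedSink.CompressionType.GZIP)) // gzip each shard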
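As a rough illustration of the file names a write like the one above would produce, the sketch below expands a prefix the way Beam's default unwindowed shard template `-SSSSS-of-NNNNN` does. The helper `shardName` is hypothetical and calls no Beam or Scio API. The shard count of 3 is an assumption, and any extension added for compression is not modelled.

```scala
object ShardNameSketch {
  // Hypothetical helper, not part of Scio or Beam: formats a shard file name
  // as prefix + "-SSSSS-of-NNNNN" + suffix, with zero-padded shard numbers.
  def shardName(prefix: String, shard: Int, numShards: Int, suffix: String = ""): String =
    f"$prefix-$shard%05d-of-$numShards%05d$suffix"

  def main(args: Array[String]): Unit =
    // Assumes 3 shards purely for illustration.
    (0 until 3).foreach(i => println(shardName("gs://foo/bar_", i, 3)))
}
```

The point of the sketch is only that the prefix passed to `to(...)` appears verbatim at the start of every shard name, instead of the fixed "part".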
    
        2
  •  0
  •   Neville Li    6 years ago

    SCollection#saveAs* SCollection#saveAsCustomOutput