I am running a Spark application in Java from IntelliJ. I added the Spark, Hadoop and AWS dependencies in my pom.xml, but somehow the AWS credentials are not being picked up:
Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
Below are my .java and pom.xml files.
SparkSession spark = SparkSession
.builder()
.master("local") .config("spark.hadoop.fs.s3a.impl","org.apache.hadoop.fs.s3a.S3AFileSystem") .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
.config("spark.hadoop.fs.s3a.awsAccessKeyId", AWS_KEY)
.config("spark.hadoop.fs.s3a.awsSecretAccessKey", AWS_SECRET_KEY)
.getOrCreate();
JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
Dataset<Row> dF = spark.read().load("s3a://bucket/abc.parquet");
Here is my pom.xml:
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.3.2</version>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk</artifactId>
    <version>1.11.417</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>3.1.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>3.1.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.1.1</version>
  </dependency>
</dependencies>
I have been stuck on this for a while and have tried every solution I could find. I also exported the AWS keys in my environment.
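To rule out the obvious, a small check like the one below could confirm whether those exported variables are actually visible to the JVM that IntelliJ launches (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the names EnvironmentVariableCredentialsProvider is supposed to read; I am not sure IntelliJ inherits them from my shell):

// Sanity check: are the exported AWS variables visible to this JVM?
// Run with the same IntelliJ run configuration as the Spark app.
public class EnvCheck {
    public static void main(String[] args) {
        System.out.println("AWS_ACCESS_KEY_ID     = " + System.getenv("AWS_ACCESS_KEY_ID"));
        System.out.println("AWS_SECRET_ACCESS_KEY = "
                + (System.getenv("AWS_SECRET_ACCESS_KEY") == null ? "null" : "<set>"));
    }
}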
Given that there is no Java Spark shell the way there is for Python or Scala, and pom.xml seems to be the only way to pull these in, is there another way to specify the jars or the keys for Java?
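For example, is something along these lines the right direction? This is just a sketch of what I have in mind: setting the keys on the Hadoop configuration after the context is created, using the fs.s3a.access.key / fs.s3a.secret.key property names from the S3A documentation (I am not certain they match the versions in my pom; AWS_KEY and AWS_SECRET_KEY are the same constants used above).

// Alternative sketch: set the S3A keys on the Hadoop configuration directly
// instead of passing them through SparkSession.builder().config(...).
JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
sc.hadoopConfiguration().set("fs.s3a.access.key", AWS_KEY);
sc.hadoopConfiguration().set("fs.s3a.secret.key", AWS_SECRET_KEY);
Dataset<Row> df = spark.read().load("s3a://bucket/abc.parquet");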