Fixing org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:...
When I run
val peopleDF = spark.sparkContext.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(para=>Person(para(0).trim,para(1).trim.toInt)).toDF
peopleDF: org.apache.spark.sql.DataFrame = [name: string, age: bigint]
the DataFrame is created, but calling peopleDF.show afterwards fails with this error:
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/opt/module/hadoop-3.1.3/etc/hadoop/spark-2.1.1-bin-hadoop2.7/bin/examples/src/main/resources/people.txt
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
...
But I have confirmed that the file does exist at that path.
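The path in the exception (file:/opt/module/hadoop-3.1.3/etc/hadoop/spark-2.1.1-bin-hadoop2.7/bin/examples/...) hints at the cause: when a local path has no scheme, textFile resolves it relative to the JVM's working directory, i.e. the directory spark-shell was launched from (here, apparently the bin directory). A quick way to check where a relative path actually points, as a plain-Scala sketch (the file name is the one from the post, used for illustration):

```scala
// The JVM's working directory is where relative local paths are
// resolved from -- the directory spark-shell was started in.
val cwd = System.getProperty("user.dir")
println(s"working directory: $cwd")

// See what a relative path expands to, and whether it exists there.
val f = new java.io.File("examples/src/main/resources/people.txt")
println(s"resolved to: ${f.getAbsolutePath}")
println(s"exists: ${f.exists}")
```

If `exists` prints `false`, the relative path needs to be adjusted (or made absolute) to match the shell's launch directory.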
The eventual fix:
scala> val peopleDF = spark.sparkContext.textFile("../examples/src/main/resources/people.txt").map(_.split(",")).map(para=>Person(para(0).trim,para(1).trim.toInt)).toDF
peopleDF: org.apache.spark.sql.DataFrame = [name: string, age: bigint]
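Alternatively, an absolute file:// URI removes any dependence on where spark-shell was started. A sketch of the same pipeline, assuming it runs inside spark-shell (which provides `spark` and the implicits for `toDF`); the install path is a placeholder to adjust, and the `Person` case class matches the one the post's session must have defined earlier:

```scala
// Assumed to match the case class defined earlier in the shell session.
case class Person(name: String, age: Long)

// An absolute file:// URI is resolved the same way regardless of the
// shell's working directory. Replace /path/to/spark with your install.
val peopleDF = spark.sparkContext
  .textFile("file:///path/to/spark/examples/src/main/resources/people.txt")
  .map(_.split(","))                                    // CSV line -> fields
  .map(para => Person(para(0).trim, para(1).trim.toInt)) // fields -> Person
  .toDF()
```

A path starting with hdfs:// would likewise pin the lookup to HDFS instead of the local filesystem.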
Final result:
scala> peopleDF.show
+-------+---+
| name|age|
+-------+---+
|Michael| 29|
| Andy| 30|
| Justin| 19|
+-------+---+
It runs successfully.