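【Posted】: 2016-12-21 17:51:17
【Problem description】:
Hello everyone, I am new to Hadoop. This is my first program, and I need help resolving the errors below.
When I put the file into HDFS directly, without using hdfs://localhost:9000/, I got the error message "dir not exist".
So I put the file into HDFS the following way: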
hadoop fs -put file.txt hdfs://localhost:9000/sawai.txt
After this, the file was loaded into HDFS at hdfs://localhost:9000/sawai.txt.
Then I tried to run the program from the wordcount jar file like this:
hadoop jar wordcount.jar hdp.WordCount sawai.txt outputdir
I got the following error message:
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:9000/user/hadoop_usr/sawai.txt
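The path in this message is consistent with HDFS resolving a relative path against the user's home directory, /user/<username>, while the file was uploaded to the root. A minimal sketch of that resolution, using plain java.net.URI rather than Hadoop's own Path class; the home directory is taken from the error message above, and RelativePathDemo, HOME, and resolve are hypothetical names used only for illustration:

```java
import java.net.URI;

public class RelativePathDemo {
    // Assumed home directory, copied from the path in the error message.
    static final URI HOME = URI.create("hdfs://localhost:9000/user/hadoop_usr/");

    // Resolve a job argument against the home directory:
    // relative names land under HOME, absolute paths and full URIs do not.
    static String resolve(String arg) {
        return HOME.resolve(arg).toString();
    }

    public static void main(String[] args) {
        // Relative name, as passed in the first command.
        System.out.println(resolve("sawai.txt"));
        // Absolute path on the same file system.
        System.out.println(resolve("/sawai.txt"));
        // Fully qualified URI, as passed in the second command.
        System.out.println(resolve("hdfs://localhost:9000/sawai.txt"));
    }
}
```

Under this reading, the relative name sawai.txt points at /user/hadoop_usr/sawai.txt, which was never created, whereas /sawai.txt and the full URI both point at the uploaded file.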
Then I tried another way and specified the full HDFS paths:
hadoop jar wordcount.jar hdp.WordCount hdfs://localhost:9000/sawai.txt hdfs://localhost:9000/outputdir
I got the following error message:
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:9000/sawai.txt already exists
    at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:131)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:268)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:870)
    at hdp.WordCount.run(WordCount.java:40)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at hdp.WordCount.main(WordCount.java:17)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
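I have read many articles suggesting that I change the output directory name on every run. I did that, but it does not help in my case; the problem seems to lie in how the source file (the input the job operates on) is specified.
What is causing the exception, and how can I fix it?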
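One possible reading of the trace, offered as a guess because WordCount.java is not shown: the exception names the input file, hdfs://localhost:9000/sawai.txt, as the output directory, which is what would happen if the driver took its output path from the first argument instead of the second. The sketch below only imitates the "output must not exist yet" check in plain Java. OutputCheckDemo, EXISTING, and outputAlreadyExists are hypothetical names and are not Hadoop APIs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class OutputCheckDemo {
    // Hypothetical stand-in for the paths that already exist in HDFS.
    static final Set<String> EXISTING =
            new HashSet<>(Arrays.asList("hdfs://localhost:9000/sawai.txt"));

    // Simplified imitation of the output check: a job is rejected
    // when its output path already exists.
    static boolean outputAlreadyExists(String outputPath) {
        return EXISTING.contains(outputPath);
    }

    public static void main(String[] args) {
        // The two arguments from the second command in the question.
        String[] jobArgs = {
            "hdfs://localhost:9000/sawai.txt",
            "hdfs://localhost:9000/outputdir"
        };
        // Suspected mistake: output path taken from jobArgs[0] (the input file).
        System.out.println("args[0] as output rejected: " + outputAlreadyExists(jobArgs[0]));
        // Intended mapping: output path taken from jobArgs[1].
        System.out.println("args[1] as output rejected: " + outputAlreadyExists(jobArgs[1]));
    }
}
```

If this reading is right, renaming the output directory cannot help, because the path being checked is the input file. The place to look would be the lines in WordCount.run where the input and output paths are set from the arguments.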
【Discussion】:
Tags: java hadoop mapreduce hdfs