【Posted】: 2016-03-19 15:44:44
【Question】:
I am trying to query my HDFS file system from Apache Drill. I can successfully query Hive tables and CSV files, but part files do not work.
hadoop fs -cat BANK_FINAL/2015-11-02/part-r-00000 | head -1
gives the result:
028|S80306432|2015-11-02|BRN-CLG-CHQ PAID TO SILVER ROCK BANDRA CO-OP|485|ZONE SERIAL [ 485]|L|I|MAHARASHTRA STATE CO-OP BANK LTD|3320.0|INWARD CLG|D11528|SBPRM
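The record above is pipe-delimited rather than comma-delimited, which matters when choosing a Drill text format. As a quick check outside Drill, a minimal Python sketch can split a line on `|` and list the fields. The helper name `split_record` is hypothetical and only used for illustration.

```python
def split_record(line, delimiter="|"):
    """Split one text record into its fields, dropping the trailing newline."""
    return line.rstrip("\r\n").split(delimiter)


# First three fields of the sample record, used here as a short demo input.
fields = split_record("028|S80306432|2015-11-02\n")
for index, value in enumerate(fields):
    print(index, value)
```

Applied to the full sample line, which contains 12 `|` characters, the same helper yields 13 fields.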
select * from dfs.`/user/ituser1/e.csv` limit 10
works fine and returns results.
But when I try the query
select * from dfs.`/user/ituser1/BANK_FINAL/2015-11-02/part-r-00000` limit 10
it gives the error:
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Table 'dfs./user/ituser1/BANK_FINAL/2015-11-02/part-r-00000' not found [Error Id: 6f80392a-51af-4b61-94d8-335b33b0048c on genome-dev13.axs:31010]
The Apache Drill dfs storage plugin JSON is as follows:
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://10.9.1.33:8020/",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": true,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "psv"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json"
    },
    "avro": {
      "type": "avro"
    },
    "sequencefile": {
      "type": "sequencefile",
      "extensions": [
        "seq"
      ]
    },
    "csvh": {
      "type": "text",
      "extensions": [
        "csvh"
      ],
      "extractHeader": true,
      "delimiter": ","
    }
  }
}
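One way to reason about this config is to compare the two file names against the `extensions` lists. The sketch below is a simplified model, not Drill's actual code. It assumes Drill picks a reader by matching the file extension against each format's `extensions` list, and otherwise falls back to the workspace's `defaultInputFormat`, which is `null` here. The function name `guess_format` is hypothetical.

```python
import posixpath

# Extension lists copied from the "formats" section of the plugin config.
# Formats that declare no "extensions" key there (parquet, json, avro) are omitted.
FORMAT_EXTENSIONS = {
    "psv": ["psv"],
    "csv": ["csv"],
    "tsv": ["tsv"],
    "sequencefile": ["seq"],
    "csvh": ["csvh"],
}


def guess_format(path, default_input_format=None):
    """Return the format whose extension list matches the file name.

    Assumption: when nothing matches, the workspace's defaultInputFormat
    is used, and None means no format could be chosen.
    """
    extension = posixpath.splitext(path)[1].lstrip(".")
    for name, extensions in FORMAT_EXTENSIONS.items():
        if extension in extensions:
            return name
    return default_input_format


print(guess_format("/user/ituser1/e.csv"))
print(guess_format("/user/ituser1/BANK_FINAL/2015-11-02/part-r-00000"))
print(guess_format("/user/ituser1/BANK_FINAL/2015-11-02/part-r-00000", "psv"))
```

Under this model, `e.csv` resolves to `csv`, while `part-r-00000` has no extension and resolves to nothing unless a default such as `psv` is supplied. That would be consistent with the "not found" error, though it remains an assumption about the cause.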
【Discussion】:
Tags: hadoop hdfs apache-drill