Sqoop export to HANA failed

【Question title】: sqoop export to hana failed
【Posted】: 2016-10-25 06:54:14
【Question】:

When I try to export data to a HANA database with Sqoop, I get an error message.

Here is the Sqoop call:

sqoop export \
 --connect "jdbc:sap://saphana:30115" \
 --username username \
 --password password \
 --table "INTERFACE_BO.WR_CUSTOMER_DATA" \
 --columns "WEEK_DAY_START,SALES_ORG,REG_ORDERS_CNT,REG_EMAILS_CNT,UNIQUE_EMAILS" \
 --driver "com.sap.db.jdbc.Driver" \
 --export-dir "/user/hive/warehouse/export.db/weekly_holding_report" \
 --input-fields-terminated-by '\t'

This is how I created the table:

CREATE TABLE weekly_holding_report
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t'
    stored as textfile
AS
SELECT
...

I also tried several storage formats, for example STORED AS PARQUET, but nothing changed.

This is the output:

Warning: /opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/10/25 06:43:04 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.8.0
16/10/25 06:43:04 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/10/25 06:43:05 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
16/10/25 06:43:05 INFO manager.SqlManager: Using default fetchSize of 1000
16/10/25 06:43:05 INFO tool.CodeGenTool: Beginning code generation
16/10/25 06:43:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM INTERFACE_BO.WR_CUSTOMER_DATA AS t WHERE 1=0
16/10/25 06:43:05 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/515ace5e3ad6ad117ba5f5fa61bf279c/INTERFACE_BO_WR_CUSTOMER_DATA.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/10/25 06:43:07 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/515ace5e3ad6ad117ba5f5fa61bf279c/INTERFACE_BO.WR_CUSTOMER_DATA.jar
16/10/25 06:43:07 INFO mapreduce.ExportJobBase: Beginning export of INTERFACE_BO.WR_CUSTOMER_DATA
16/10/25 06:43:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/10/25 06:43:07 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
16/10/25 06:43:08 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/10/25 06:43:08 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/10/25 06:43:08 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/10/25 06:43:08 INFO client.RMProxy: Connecting to ResourceManager at cz-dc-v-564.mall.local/10.200.58.21:8032
16/10/25 06:43:10 INFO input.FileInputFormat: Total input paths to process : 10
16/10/25 06:43:10 INFO input.FileInputFormat: Total input paths to process : 10
16/10/25 06:43:10 INFO mapreduce.JobSubmitter: number of splits:4
16/10/25 06:43:10 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/10/25 06:43:10 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1476740951886_1458
16/10/25 06:43:10 INFO impl.YarnClientImpl: Submitted application application_1476740951886_1458
16/10/25 06:43:10 INFO mapreduce.Job: The url to track the job: http://cz-dc-v-564.mall.local:8088/proxy/application_1476740951886_1458/
16/10/25 06:43:10 INFO mapreduce.Job: Running job: job_1476740951886_1458
16/10/25 06:43:18 INFO mapreduce.Job: Job job_1476740951886_1458 running in uber mode : false
16/10/25 06:43:18 INFO mapreduce.Job:  map 0% reduce 0%
16/10/25 06:43:25 INFO mapreduce.Job:  map 100% reduce 0%
16/10/25 06:43:25 INFO mapreduce.Job: Job job_1476740951886_1458 failed with state FAILED due to: Task failed task_1476740951886_1458_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

16/10/25 06:43:26 INFO mapreduce.Job: Counters: 13
    Job Counters 
        Failed map tasks=1
        Killed map tasks=3
        Launched map tasks=4
        Data-local map tasks=1
        Rack-local map tasks=3
        Total time spent by all maps in occupied slots (ms)=39746
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=19873
        Total vcore-seconds taken by all map tasks=19873
        Total megabyte-seconds taken by all map tasks=30524928
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
16/10/25 06:43:26 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/10/25 06:43:26 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 17,9084 seconds (0 bytes/sec)
16/10/25 06:43:26 INFO mapreduce.ExportJobBase: Exported 0 records.
16/10/25 06:43:26 ERROR tool.ExportTool: Error during export: Export job failed!

Does anyone know what I am doing wrong?

Thanks for your help!

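【Comments】:

Try adding --verbose at the end of the command (to see extended logs).
Can you read from the HANA database? Try simply selecting 1 row and storing it as a text file. If that fails, it is probably a connection problem; if it works, try running the actual INSERT query directly (which you can do without Sqoop), since you may get a more useful error message.
cz-dc-v-564.mall.local:8088/proxy/… Navigate to this job URL; you should be able to see the detailed log of the failed task, task_1476740951886_1458_m_000000. There must be an exception in it.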
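
The read test suggested in the comments could be sketched roughly as follows. This is only an illustration: it uses `sqoop eval` (which runs a query and prints the result) rather than the import-to-text-file the commenter describes, the probe query and the `QUERY` variable are assumptions, and the connection details are simply copied from the question.

```shell
#!/bin/sh
# Hypothetical probe query: fetch a single row from the export target table.
QUERY='SELECT * FROM INTERFACE_BO.WR_CUSTOMER_DATA LIMIT 1'

if command -v sqoop >/dev/null 2>&1; then
  # sqoop eval runs the query over JDBC and prints the result to the console.
  # --verbose turns on the extended logging mentioned in the first comment.
  sqoop eval \
    --connect "jdbc:sap://saphana:30115" \
    --driver "com.sap.db.jdbc.Driver" \
    --username username \
    --password password \
    --query "$QUERY" \
    --verbose
else
  echo "sqoop not found on PATH; run this on a cluster gateway node" >&2
fi
```

If this already fails, the problem is more likely the JDBC connection or credentials than the export itself.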

Tags: hadoop mapreduce hive sqoop

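【Solution 1】:

Have you tried adding the Sqoop parameter -m 1 (long form: --num-mappers 1)?

It sets the number of mappers, so in this case try running with a single mapper.

If your data is large (more than 1 GB), increase the number of mappers instead.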
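
A minimal sketch of this suggestion, assuming the same connection details and paths as in the question. `--num-mappers 1` is the long form of `-m 1`; the `NUM_MAPPERS` variable and the `command -v` guard are illustrative additions, not part of the original command.

```shell
#!/bin/sh
# Re-run the export from the question with a single mapper.
# NUM_MAPPERS is a hypothetical helper variable for this sketch.
NUM_MAPPERS=1

if command -v sqoop >/dev/null 2>&1; then
  sqoop export \
    --connect "jdbc:sap://saphana:30115" \
    --driver "com.sap.db.jdbc.Driver" \
    --username username \
    --password password \
    --table "INTERFACE_BO.WR_CUSTOMER_DATA" \
    --columns "WEEK_DAY_START,SALES_ORG,REG_ORDERS_CNT,REG_EMAILS_CNT,UNIQUE_EMAILS" \
    --export-dir "/user/hive/warehouse/export.db/weekly_holding_report" \
    --input-fields-terminated-by '\t' \
    --num-mappers "$NUM_MAPPERS"
else
  echo "sqoop not found on PATH; run this on a cluster gateway node" >&2
fi
```

With one mapper there is only a single map task to inspect, which can make the failing task's log easier to find. It does not by itself explain why the task failed.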

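【Comments】:

Data > 1 GB is not the only criterion for using multiple mappers.
@devツ: I agree, but just to get the job running he can try this parameter. And yes, there are more parameters for setting the number of mappers.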

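【Solution 2】:

Check the logs for the corresponding job ID, in your case job_1476740951886_1458. There must be some permission or connection issue (with the given saphana URL), which is why it transferred zero bytes.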
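
One way to pull those logs from the command line, sketched under the assumption that YARN log aggregation is enabled on the cluster. The application ID is derived from the job ID shown in the question's output; the `JOB_ID` and `APP_ID` variable names and the `grep` filter are illustrative choices.

```shell
#!/bin/sh
# Job ID copied from the Sqoop output in the question.
JOB_ID="job_1476740951886_1458"
# The YARN application ID shares the numeric suffix of the MapReduce job ID.
APP_ID="application_${JOB_ID#job_}"

if command -v yarn >/dev/null 2>&1; then
  # Fetch the aggregated container logs and show context around errors.
  yarn logs -applicationId "$APP_ID" | grep -E -B 2 -A 20 'Exception|ERROR'
else
  echo "yarn not found; would run: yarn logs -applicationId $APP_ID"
fi
```

The stack trace of the failed map task usually names the underlying JDBC or SQL error, which the Sqoop client output above does not show.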

【Comments】: