【Question title】: Sqoop Oracle import does not create table
【Posted】: 2016-11-25 05:57:44
【Question】:

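I want to import data from an Oracle database into Hive using Sqoop, and I want Sqoop to create the table in the target Hive database.

I put the Oracle JDBC driver (ojdbc6.jar) in the Sqoop lib directory.

I tried the two commands below, and neither of them works.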

sqoop import \
    --connect jdbc:oracle:thin:@${DB_HOST}:${DB_PORT}:${DB_NAME} \
    --username ${DB_USER} \
    --password ${DB_PWD} \
    --table ${INPUT_TABLE} \
    --hcatalog-home /usr/hdp/current/hive-webhcat \
    --hcatalog-database ${OUTPUT_DB} \
    --hcatalog-table ${OUTPUT_TABLE} \
    --create-hcatalog-table \
    --num-mappers 1


sqoop import  \
    --connect jdbc:oracle:thin:@${DB_HOST}:${DB_PORT}:${DB_NAME} \
    --username ${DB_USER} \
    --password ${DB_PWD} \
    --hive-home /usr/hdp/current/hive \
    --hive-import \
    --create-hive-table \
    --hive-table "${OUTPUT_DB}.${OUTPUT_TABLE}" \
    --table ${INPUT_TABLE}

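I get this error message:

ERROR tool.ImportTool: Imported Failed: There is no column found in the target table input_table. Please ensure that your table name is correct.

It looks as if Sqoop ignores --create-hcatalog-table and --create-hive-table.

However, when I import data from PostgreSQL with Sqoop, table creation works fine. Any ideas? Thanks.

For information, Sqoop can read the Oracle table without problems. I ran this command and it returned the expected result: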

sqoop eval \
    --connect jdbc:oracle:thin:@${DB_HOST}:${DB_PORT}:${DB_NAME} \
    --username ${DB_USER} \
    --password ${DB_PWD} \
    --query "select count(1) from input_table"
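
A related check, sketched here as an assumption rather than something tried in this thread: the same sqoop eval call can be pointed at Oracle's ALL_TAB_COLUMNS dictionary view to see under which owner and with which exact spelling the table is stored. The helper name dictionary_query is hypothetical; only the view and its OWNER and TABLE_NAME columns come from Oracle itself.

```shell
# Hypothetical helper: build a data-dictionary query that lists the owner and
# the exact stored name of any table matching the given name case-insensitively.
dictionary_query() {
    table_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf "SELECT DISTINCT owner, table_name FROM all_tab_columns WHERE UPPER(table_name) = '%s'" "$table_upper"
}

# Usage (commented out so that sourcing this file does not contact the database):
# sqoop eval \
#     --connect "jdbc:oracle:thin:@${DB_HOST}:${DB_PORT}:${DB_NAME}" \
#     --username "${DB_USER}" \
#     --password "${DB_PWD}" \
#     --query "$(dictionary_query "${INPUT_TABLE}")"
```

If the table shows up under a different owner than the connecting user, or with different letter case than the value passed to --table, that may be worth ruling out before anything else.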

The full log of the error:

16/07/21 18:08:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.4.0.0-169
16/07/21 18:08:29 DEBUG tool.BaseSqoopTool: Enabled debug logging.
16/07/21 18:08:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/07/21 18:08:29 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/07/21 18:08:29 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
16/07/21 18:08:29 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
16/07/21 18:08:29 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
16/07/21 18:08:29 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:oracle:thin:@host:1521:sid
16/07/21 18:08:29 DEBUG manager.OracleManager$ConnCache: Instantiated new connection cache.
16/07/21 18:08:29 INFO manager.SqlManager: Using default fetchSize of 1000
16/07/21 18:08:29 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.OracleManager@7d8704ef
16/07/21 18:08:29 INFO tool.CodeGenTool: Beginning code generation
16/07/21 18:08:29 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM input_table t WHERE 1=0
16/07/21 18:08:29 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM input_table t WHERE 1=0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/07/21 18:08:30 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@host:1521:sid/user, using username: user
16/07/21 18:08:30 DEBUG manager.OracleManager: No connection paramenters specified. Using regular API for making connection.
16/07/21 18:08:30 INFO manager.OracleManager: Time zone has been set to GMT
16/07/21 18:08:30 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
16/07/21 18:08:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM input_table t WHERE 1=0
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c1 of type [2, 10, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c2 of type [2, 10, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c3 of type [12, 20, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c4 of type [2, 0, -127]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c5 of type [2, 0, -127]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c6 of type [12, 80, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c7 of type [93, 0, 0]
16/07/21 18:08:30 DEBUG manager.SqlManager: Found column c8 of type [12, 20, 0]
16/07/21 18:08:30 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@host:1521:sid/user
16/07/21 18:08:30 ERROR tool.ImportTool: Imported Failed: There is no column found in the target table intput_table. Please ensure that your table name is correct.

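【Comments】:

  • --hive-import will automatically create the table in Hive under the name given by the --hive-table option. Add --verbose to the end of the import command (to get extended logs) and share the full log.
  • I edited my post with the full log produced with the verbose option.
  • The error in the log shows target table input_table, but you referenced output_table: make sure the ${OUTPUT_TABLE} and ${INPUT_TABLE} values are correct. Are all the columns of type int? First try running without variables (using the actual values).
  • I fixed the typo; the error is "target table input_table". I tried without variables and the result is the same: I still get the same error. The columns are not all int (they are int, varchar, and date).

Tags: oracle hadoop sqoop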


【Solution 1】:

I found a workaround: the --table parameter does not seem to work for me, so I used the --query parameter instead.

    sqoop import \
        --connect ${DB_CNX_STR} \
        --username ${DB_USER} \
        --password ${DB_PWD} \
        --query "SELECT * FROM ${INPUT_TABLE} WHERE \$CONDITIONS" \
        --target-dir ${TARGET_DIR}/${INPUT_TABLE} \
        --hive-import \
        --hive-home "/usr/hdp/current/hive" \
        --create-hive-table \
        --hive-table "${OUTPUT_DB}.${OUTPUT_TABLE}" \
        --num-mappers 1 
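
A possible explanation, which this thread does not confirm, is a commonly reported one: Oracle stores unquoted table names in upper case, and Sqoop's lookup of column names for --table can be case-sensitive, so a lower-case name finds no columns. Under that assumption, a minimal sketch of an alternative to --query is to pass the name in upper case, optionally qualified with the owning schema. The helper oracle_table_arg and the owner value are hypothetical.

```shell
# Hypothetical helper: upper-case a table name and, if an owner/schema is
# given as the second argument, prefix it as OWNER.TABLE.
oracle_table_arg() {
    name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    if [ -n "${2:-}" ]; then
        owner=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
        printf '%s.%s' "$owner" "$name"
    else
        printf '%s' "$name"
    fi
}

# Usage (commented out so that sourcing this file does not start an import):
# sqoop import \
#     --connect "${DB_CNX_STR}" \
#     --username "${DB_USER}" \
#     --password "${DB_PWD}" \
#     --table "$(oracle_table_arg "${INPUT_TABLE}")" \
#     --hive-import \
#     --create-hive-table \
#     --hive-table "${OUTPUT_DB}.${OUTPUT_TABLE}" \
#     --num-mappers 1
```

If this does not change the outcome, the --query form above remains the approach that was reported to work.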
