Hadoop生态圈-Sqoop部署以及基本使用方法
作者:尹正杰
版权声明:原创作品,谢绝转载!否则将追究法律责任。
Sqoop(发音:skup)是一款开源的工具,主要用于在Hadoop(Hive)与传统的数据库(mysql、postgresql...)间进行数据的传递,可以将一个关系型数据库(例如 : MySQL ,Oracle ,Postgres等)中的数据导进到Hadoop的HDFS中,也可以将HDFS的数据导进到关系型数据库中。
Sqoop项目开始于2009年,最早是作为Hadoop的一个第三方模块存在,后来为了让使用者能够快速部署,也为了让开发人员能够更快速的迭代开发,Sqoop独立成为一个Apache项目。详情请参考:http://sqoop.apache.org/)
注意,本篇博客部署方式是建立在高可用集群的基础上部署的Sqoop,关于高可用集群部署请参考:https://www.cnblogs.com/yinzhengjie/p/9154265.html。
一.部署Sqoop工具
1>.下载Sqoop软件(下载地址:http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/,建议下载最新版本,截止2018-06-14时,最新版本为1.4.7。)
2>.解压并创建符号链接
[yinzhengjie@s101 data]$ tar zxf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C /soft/
[yinzhengjie@s101 data]$ ln -s /soft/sqoop-1.4.7.bin__hadoop-2.6.0/ /soft/sqoop
[yinzhengjie@s101 data]$
3>.配置环境变量并使之生效
[yinzhengjie@s101 ~]$ sudo vi /etc/profile
[sudo] password for yinzhengjie:
[yinzhengjie@s101 ~]$ tail -3 /etc/profile
#ADD SQOOP
SQOOP_HOME=/soft/sqoop
PATH=$PATH:$SQOOP_HOME/bin
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ source /etc/profile
[yinzhengjie@s101 ~]$
4>.创建sqoop-env.sh配置文件
[yinzhengjie@s101 ~]$ cp /soft/sqoop/conf/sqoop-env-template.sh /soft/sqoop/conf/sqoop-env.sh [yinzhengjie@s101 ~]$ [yinzhengjie@s101 ~]$ more /soft/sqoop/conf/sqoop-env.sh | grep -v ^# | grep -v ^$ export HADOOP_COMMON_HOME=/soft/hadoop export HADOOP_MAPRED_HOME=/soft/hadoop export HBASE_HOME=/soft/hbase export HIVE_HOME=/soft/hive export ZOOCFGDIR=/soft/zk/conf [yinzhengjie@s101 ~]$
5>.将mysql驱动放置在sqoop/lib下
[yinzhengjie@s101 ~]$ cp /soft/hive/lib/mysql-connector-java-5.1.41.jar /soft/sqoop/lib/ [yinzhengjie@s101 ~]$
6>.sqoop version验证安装
[yinzhengjie@s101 ~]$ sqoop version Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation. 18/06/14 00:30:34 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 Sqoop 1.4.7 git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8 Compiled by maugli on Thu Dec 21 15:59:58 STD 2017 [yinzhengjie@s101 ~]$
二.基本使用
1>.使用sqoop命令行链接MySQL数据库
[yinzhengjie@s101 ~]$ sqoop list-databases --connect jdbc:mysql://s101 --username root -P Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation. 18/06/14 00:33:02 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 Enter password: 18/06/14 00:33:07 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/soft/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/phoenix-4.10.0-HBase-1.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/soft/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] information_schema hive mysql performance_schema [yinzhengjie@s101 ~]$
2>.sqoop查看帮助
[yinzhengjie@s101 ~]$ sqoop help Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../hcatalog does not exist! HCatalog jobs will fail. Please set $HCAT_HOME to the root of your HCatalog installation. Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../accumulo does not exist! Accumulo imports will fail. Please set $ACCUMULO_HOME to the root of your Accumulo installation. Warning: /soft/sqoop-1.4.7.bin__hadoop-2.6.0/bin/../../zookeeper does not exist! Accumulo imports will fail. Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation. 18/06/14 01:50:37 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7 usage: sqoop COMMAND [ARGS] Available commands: codegen Generate code to interact with database records create-hive-table Import a table definition into Hive eval Evaluate a SQL statement and display the results export Export an HDFS directory to a database table help List available commands import Import a table from a database to HDFS import-all-tables Import tables from a database to HDFS import-mainframe Import datasets from a mainframe server to HDFS job Work with saved jobs list-databases List available databases on a server list-tables List available tables in a database merge Merge results of incremental imports metastore Run a standalone Sqoop metastore version Display version information See 'sqoop help COMMAND' for information on a specific command. [yinzhengjie@s101 ~]$