【Question title】: Unable to install Spark 2.2 in Cloudera Quickstart VM (5.10)
【Posted】: 2018-01-26 21:22:52
【Question description】:

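I followed the documentation linked below, downloaded the parcel, and placed it where the instructions say. If anyone has managed to install it, please tell me the steps.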

(https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html)

/opt/cloudera/csd/SPARK2-2.1.0.cloudera2-1.cdh5.7.0.p0.171658-el5.parcel

However, service cloudera-scm-server restart does not execute. To use Cloudera Express (free), run:

sudo /home/cloudera/cloudera-manager --express

This requires at least 8 GB of RAM and at least 2 virtual CPUs.
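
Before launching Cloudera Express, it can help to check what the VM actually has. The sketch below reads MemTotal from /proc/meminfo and the CPU count from nproc. The helper name check_vm_resources and the 7.5 GiB floor are assumptions for illustration (MemTotal usually reports slightly less than the RAM assigned to the VM); they are not the check that cloudera-manager itself performs.

```shell
#!/usr/bin/env bash
# Hypothetical thresholds: 7.5 GiB (in kB) leaves slack for kernel-reserved memory.
MIN_MEM_KB=$((15 * 1024 * 1024 / 2))
MIN_CPUS=2

# Hypothetical helper: prints "ok" if both minimums are met, else "insufficient".
check_vm_resources() {
  local mem_kb=$1 cpus=$2
  if [ "$mem_kb" -ge "$MIN_MEM_KB" ] && [ "$cpus" -ge "$MIN_CPUS" ]; then
    echo "ok"
  else
    echo "insufficient"
  fi
}

# Report the real values when running on Linux.
if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
  mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  cpus=$(nproc)
  echo "MemTotal: ${mem_kb} kB, CPUs: ${cpus} -> $(check_vm_resources "$mem_kb" "$cpus")"
fi
```

If this prints "insufficient", either raise the VM's memory and CPU allocation in the hypervisor settings or expect the Express launcher to complain.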

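【Comments】:

  • I ran into a lot of problems with Cloudera Express. I think you will have to wait until they add 2.2 to the VM. I could not get it to work just now. It is best used on your own clean machine, but there are many issues to consider.
  • I added it and then ran into Hive metastore problems. I suspect it has to do with 1.6 being present. Hopeless.
  • I have installed the MapR Sandbox, which has Spark 2.2 but no Impala. The interface is better and it worked effortlessly. That is my recommendation.

Tags: cloudera apache-spark-2.0 cloudera-quickstart-vm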


【Solution 1】:
Spark 2.2 Installation Setup on the Cloudera VM

Step 1: Download a quickstart_vm from the link:
The VMware image is the easiest to use, but all of the platform options work.
The full tar file is around 5.4 GB. The download form asks for a business email address; it does not accept personal ones.


Step 2: The VM needs around 8 GB of RAM. Allocate enough memory to avoid performance problems.


Step 3: Open a terminal and switch to the root user:
         su root
         password: cloudera

Step 4: The Cloudera VM ships with Java 1.7.0_67, which is too old: Spark 2.2 requires Java 8 or later. To avoid Java-related exceptions, install Java 8 as follows:
(a). Download Java:
wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz

(b). Switch to the /usr/java/ directory:
cd /usr/java/

(c). Copy the downloaded Java tar file into the /usr/java/ directory.

(d). Extract the archive:
tar -zxvf jdk-8u131-linux-x64.tar.gz

(e). Open the profile file:
vi ~/.bash_profile

(f). Point JAVA_HOME at the new Java directory by adding this line:
       export JAVA_HOME=/usr/java/jdk1.8.0_131

       Save and exit.


(g). To apply the change in the current shell, run:
       source ~/.bash_profile
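
To confirm that JAVA_HOME now points at a Java 8 install, something like the following sketch can be run in the same shell. The helper parse_java_major is a hypothetical name; it only assumes that "java -version" prints a quoted version string such as "1.8.0_131" on stderr.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a version string into its major version.
# "1.8.0_131" -> 8 (legacy 1.x scheme), "11.0.2" -> 11 (newer scheme).
parse_java_major() {
  local v=$1
  case "$v" in
    1.*) v=${v#1.}; echo "${v%%.*}" ;;
    *)   echo "${v%%[.-]*}" ;;
  esac
}

# Only query the JVM if JAVA_HOME is set and contains an executable java.
if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
  # java -version writes to stderr; the version is the first quoted field.
  ver=$("$JAVA_HOME/bin/java" -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "JAVA_HOME reports Java $(parse_java_major "$ver")"
else
  echo "JAVA_HOME is unset or has no bin/java"
fi
```

A reported major version below 8 means the export did not take effect, for example because the profile was not re-sourced.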

Step 5: The Cloudera VM provides Spark 1.6 by default. The 1.6 APIs are old and do not match what production environments run, so we download and install Spark 2.2 manually.

(a). Switch to the /opt/ directory:
cd /opt/

(b). Download Spark:
wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz

(c). Extract the Spark archive:
tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz

(d). Define some environment variables as default settings.
Open the file with the following command:
vi /opt/spark-2.2.0-bin-hadoop2.7/conf/spark-env.sh
Paste the following configuration into the file:
SPARK_MASTER_IP=192.168.50.1
SPARK_EXECUTOR_MEMORY=512m
SPARK_DRIVER_MEMORY=512m
SPARK_WORKER_MEMORY=512m
SPARK_DAEMON_MEMORY=512m
Save and exit.
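
If you would rather script this than edit the file in vi, the settings can be written with a here-document. This is only a sketch: write_spark_env is a hypothetical helper, it overwrites any existing spark-env.sh in the target directory, and it writes to a scratch directory by default so nothing under /opt is touched until you choose to.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the settings from step (d) into <conf_dir>/spark-env.sh.
write_spark_env() {
  local conf_dir=$1
  mkdir -p "$conf_dir"
  # Quoted 'EOF' keeps the contents literal (no variable expansion).
  cat > "$conf_dir/spark-env.sh" <<'EOF'
# Standalone-mode settings sized for a small VM
SPARK_MASTER_IP=192.168.50.1
SPARK_EXECUTOR_MEMORY=512m
SPARK_DRIVER_MEMORY=512m
SPARK_WORKER_MEMORY=512m
SPARK_DAEMON_MEMORY=512m
EOF
}

# Dry run into a scratch directory (assumption); pass
# /opt/spark-2.2.0-bin-hadoop2.7/conf instead once the output looks right.
scratch_conf=$(mktemp -d)
write_spark_env "$scratch_conf"
cat "$scratch_conf/spark-env.sh"
```

I believe Spark 2.x prefers SPARK_MASTER_HOST over SPARK_MASTER_IP, so it is worth checking the master's startup log for a deprecation warning.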

(e). Start Spark with the following command:
/opt/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh
Export SPARK_HOME:
export SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.7/

(f). Change the permissions of the /tmp/hive directory:
chmod 777 -R /tmp/hive

(g). Run spark-shell (or $SPARK_HOME/bin/spark-shell if it is not on the PATH); it should now start.
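
The steps above export SPARK_HOME but never add its bin directory to the PATH, so a bare "spark-shell" may still resolve to the VM's bundled Spark 1.6 or not be found at all. A minimal sketch of fixing that and checking which version answers, assuming the install path used above (add_to_path is a hypothetical helper):

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to PATH unless it is already listed.
add_to_path() {
  local dir=${1%/}                 # drop a trailing slash for a clean comparison
  case ":$PATH:" in
    *":$dir:"*) ;;                 # already present, nothing to do
    *) PATH="$PATH:$dir" ;;
  esac
}

# Assumes the install location from step 5; override SPARK_HOME if yours differs.
export SPARK_HOME=${SPARK_HOME:-/opt/spark-2.2.0-bin-hadoop2.7}
add_to_path "${SPARK_HOME%/}/bin"
export PATH

# spark-submit --version prints the Spark version banner without starting a job.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit --version
else
  echo "spark-submit not found; check that $SPARK_HOME exists"
fi
```

Because the directory is appended rather than prepended, an older spark-submit earlier on the PATH still wins; calling $SPARK_HOME/bin/spark-shell by its full path avoids that ambiguity.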

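【Comments】:

【Solution 2】:

Please watch the video below; it covers all the steps needed to install Spark2 in the Cloudera VM.

YouTube link - https://www.youtube.com/watch?v=lQxlO3coMxM

To start Cloudera Express (free), your VM should have at least 8 GB of RAM allocated. If it only has the default 4 GB, you can force it to start with the command below and then follow the video above.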

sudo /home/cloudera/cloudera-manager --force --express

【Comments】:

    【Solution 3】:

    Try this command:

    sudo /home/cloudera/cloudera-manager --express --force
    

    【Comments】:

    • Could you explain?
    • It warns that Cloudera Express should run in a VM with at least 8 GB of RAM. By adding --force, we tell it to run Cloudera Express anyway with the RAM currently available.
    【Solution 4】:

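    I gave up on this; neither the parcel nor the non-parcel install worked well.

    Once Cloudera Express started there were a lot of errors, and it was running Java 7 rather than Java 8.

    I installed the MapR VM with Spark 2.x. No problems; it worked the first time.

    It works very well. This is my recommendation #1.

    If you want Kudu, then I would install CentOS and set things up yourself. That is recommendation #2. You may miss Impala, but for pure research and development that is not a big problem.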

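    【Comments】:

      【Solution 5】:

      With the following two commands, my Spark 2.2 was automatically updated to Spark 2.4:

      (i) sudo yum update

      Your Java home path may be wrong; in that case, export the Java home path in your bash profile.

      (a) vi ~/.bash_profile (b) (c) source ~/.bash_profile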

      【Comments】:

        【Solution 6】:

        Just download the Spark version you need, say "spark-2.2.0-bin-hadoop2.6".

        Open ~/.bash_profile in the vi editor (vi ~/.bash_profile) and paste the two lines below:

        SPARK_HOME=/home/cloudera/Downloads/spark-2.2.0-bin-hadoop2.6
        PATH=$PATH:$HOME/bin:$SPARK_HOME/bin

        Save, then run the command: source ~/.bash_profile

        Now start spark-shell. Note: make sure you have JDK 1.8 installed.
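
If you re-run a setup like this more than once, pasting by hand tends to leave duplicate lines in the profile. A small sketch of adding the two lines idempotently follows; append_once is a hypothetical helper, the export keyword is my addition, and PROFILE defaults to a scratch file so the sketch can be tried safely.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if that exact line is not already there.
append_once() {
  local file=$1 line=$2
  touch "$file"
  # -x: match the whole line, -F: treat the pattern as a fixed string
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# PROFILE defaults to a scratch file (assumption);
# set PROFILE=~/.bash_profile to apply the change for real.
PROFILE=${PROFILE:-$(mktemp)}
append_once "$PROFILE" 'export SPARK_HOME=/home/cloudera/Downloads/spark-2.2.0-bin-hadoop2.6'
append_once "$PROFILE" 'export PATH=$PATH:$HOME/bin:$SPARK_HOME/bin'
cat "$PROFILE"
```

The single quotes keep $PATH and $SPARK_HOME unexpanded in the file, so they are resolved each time the profile is sourced.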

        【Comments】:

          【Solution 7】:
            Spark 2.2 Installation Setup on the Cloudera VM

              Step 1: Download a quickstart_vm from the link:
              The VMware image is the easiest to use, but all of the platform options work.
              The full tar file is around 5.4 GB. The download form asks for a business email address; it does not accept personal ones.
          
          
              Step 2: The VM needs around 8 GB of RAM. Allocate enough memory to avoid performance problems.
          
          
              Step 3: Open a terminal and switch to the root user:
                       su root
                       password: cloudera
          
              Step 4: The Cloudera VM ships with Java 1.7.0_67, which is too old: Spark 2.2 requires Java 8 or later. To avoid Java-related exceptions, install Java 8 as follows:
              (a). Download Java:
              wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz

              (b). Switch to the /usr/java/ directory:
              cd /usr/java/

              (c). Copy the downloaded Java tar file into the /usr/java/ directory.

              (d). Extract the archive:
              tar -xvzf jdk-8u131-linux-x64.tar.gz

              (e). Open the profile file:
              vi ~/.bash_profile

              (f). Point JAVA_HOME at the new Java directory by adding this line:
                     export JAVA_HOME=/usr/java/jdk1.8.0_131

                     Save and exit.


              (g). To apply the change in the current shell, run:
                     source ~/.bash_profile
          
              Step 5: The Cloudera VM provides Spark 1.6 by default. The 1.6 APIs are old and do not match what production environments run, so we download and install Spark 2.2 manually.

              (a). Switch to the /opt/ directory:
              cd /opt/

              (b). Download Spark:
              wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz

              (c). Extract the Spark archive:
              tar -xvzf spark-2.2.0-bin-hadoop2.7.tgz

              (d). Define some environment variables as default settings.
              Open the file with the following command:
              vi /opt/spark-2.2.0-bin-hadoop2.7/conf/spark-env.sh
              Paste the following configuration into the file:
              SPARK_MASTER_IP=192.168.50.1
              SPARK_EXECUTOR_MEMORY=512m
              SPARK_DRIVER_MEMORY=512m
              SPARK_WORKER_MEMORY=512m
              SPARK_DAEMON_MEMORY=512m
              SPARK_LOCAL_IP=127.0.0.1
              Save and exit.
          
              (e). Start Spark with the following command:
              /opt/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh
              Export SPARK_HOME:
              export SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.7/

              (f). Change the permissions of the /tmp/hive directory:
              chmod 777 -R /tmp/hive

              (g). Run spark-shell (or $SPARK_HOME/bin/spark-shell if it is not on the PATH); it should now start.
          

          This is the same as swapnil shashank's answer, with the small modifications below:

          SPARK_LOCAL_IP=127.0.0.1
          tar -xvzf spark-2.2.0-bin-hadoop2.7.tgz
          

          【Comments】:
