
Installing Zeppelin and a Spark HA Cluster

Installing Zeppelin

1. Assumes the Spark cluster is already installed.
2. Install Zeppelin
    1. Unpack the archive
        tar zxvf zeppelin-0.5.5-incubating-bin-all.tgz
    2. Configure environment variables
        vim /etc/profile

        #zeppelin
        export ZEPPELIN_HOME=/opt/zeppelin-0.5.5

        #CLASSPATH
        export CLASSPATH=$CLASSPATH:$ZEPPELIN_HOME/lib

        #PATH
        export PATH=$PATH:$ZEPPELIN_HOME/bin

        Save and exit

         source /etc/profile

Our /etc/profile is configured as follows:

export LANG=zh_CN.GBK

export LC_ALL=zh_CN.GBK

export PATH=$PATH:/usr/local/python2.7.6/bin/

export JAVA_HOME=/usr/java/jdk1.8.0_101

export CLASSPATH=$JAVA_HOME/lib/tools.jar

export PATH=$JAVA_HOME/bin:$PATH

export PATH="/usr/local/MySQL/bin:$PATH"

#set for nodejs

export NODE_HOME=/usr/local/nodejs

export PATH=$NODE_HOME/bin:$PATH

PATH="/usr/local/mysql/bin:$PATH"

        

    3. Edit the configuration files
        cd /opt/zeppelin-0.5.5

        1. Copy the config files from their templates
            cp zeppelin-env.sh.template zeppelin-env.sh
            cp zeppelin-site.xml.template zeppelin-site.xml
        2. Create the required directories
            mkdir /opt/zeppelin-0.5.5/logs
            mkdir /opt/zeppelin-0.5.5/tmp
        3. Adjust the configuration parameters

            ####zeppelin-env.sh####
            export JAVA_HOME=/usr/java/jdk1.8.0_65
            export HADOOP_CONF_DIR=/opt/hadoop-2.5.2/etc/hadoop   # required if Zeppelin needs to read files from HDFS
            export MASTER=spark://hadoop.master:7077              # use Spark cluster mode
            export SPARK_HOME=/opt/spark-1.6.0
            export SPARK_SUBMIT_OPTIONS="--driver-memory 500M --executor-memory 500m"      # adjust to the memory actually available
            export ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=500m -Dspark.cores.max=1"   # adjust to the memory actually available
            export ZEPPELIN_MEM="-Xmx500m -XX:MaxPermSize=500m"                            # adjust to the memory actually available
            export ZEPPELIN_LOG_DIR=/opt/zeppelin-0.5.5/logs

             

Our zeppelin-env.sh is configured as follows:

export SPARK_MASTER_IP=127.0.0.1

export  SPARK_LOCAL_IP=127.0.0.1

export ZEPPELIN_MEM="-Xms1024m -Xmx16384m -XX:MaxPermSize=16384m"
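Before starting the daemon, the merged settings can be sanity-checked by sourcing the file in a throwaway shell (a minimal sketch; assumes the working directory is /opt/zeppelin-0.5.5):

    bash -c 'source conf/zeppelin-env.sh && echo "MASTER=$MASTER  SPARK_HOME=$SPARK_HOME  ZEPPELIN_MEM=$ZEPPELIN_MEM"'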

            ####zeppelin-site.xml####
            <property>
              <name>zeppelin.server.addr</name>
              <value>hadoop.slaver3</value>      <!-- hostname or IP of the current host -->
              <description>Server address</description>
            </property>

            <property>
              <name>zeppelin.server.port</name>
              <value>8084</value>                <!-- does not have to be this exact value, but it is the port used for web access -->
              <description>Server port.</description>
            </property>
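Since 8084 only has to be a port that is free on this host, a quick pre-flight check that nothing is already listening on it avoids a confusing startup failure (a minimal sketch):

    netstat -tln 2>/dev/null | grep 8084 || echo "port 8084 is free"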

             

        4. Start
            zeppelin-daemon.sh start
        5. Verify
            hostname:port    # if the web UI comes up, the installation succeeded
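The same check can be scripted instead of opening a browser (a minimal sketch; hostname and port follow the zeppelin-site.xml values above):

    curl -s -o /dev/null -w '%{http_code}\n' http://hadoop.slaver3:8084
    # an HTTP 200 response means the Zeppelin web UI is reachable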

Installing the Spark HA Cluster

1. Assumes Hadoop and ZooKeeper are already installed.
2. Install Scala
    1. Unpack the archive
        tar zxvf scala-2.11.7.tgz
    2. Configure environment variables
        vim /etc/profile

        #scala
        export SCALA_HOME=/opt/scala-2.11.7

        #CLASSPATH
        export CLASSPATH=$CLASSPATH:$SCALA_HOME/lib

        #PATH
        export PATH=$PATH:$SCALA_HOME/bin

        Save and exit

        source /etc/profile
    3. Verify
        scala -version

3. Install Spark
    1. Unpack the archive
        tar zxvf spark-1.6.0-bin-hadoop2.4.tgz
    2. Configure environment variables
        vim /etc/profile

        #spark
        export SPARK_HOME=/opt/spark-1.6.0

        #CLASSPATH
        export CLASSPATH=$CLASSPATH:$SPARK_HOME/lib

        #PATH
        export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin

        Save and exit

        source /etc/profile

Our /etc/profile is configured as follows:

export LANG=zh_CN.GBK

export LC_ALL=zh_CN.GBK

export PATH=$PATH:/usr/local/python2.7.6/bin/

export JAVA_HOME=/usr/java/jdk1.8.0_101

export CLASSPATH=$JAVA_HOME/lib/tools.jar

export PATH=$JAVA_HOME/bin:$PATH

export PATH="/usr/local/mysql/bin:$PATH"

#set for nodejs

export NODE_HOME=/usr/local/nodejs

export PATH=$NODE_HOME/bin:$PATH

PATH="/usr/local/mysql/bin:$PATH"

         

    3. Edit the configuration files
        1. Copy the config files from their templates
            cp spark-env.sh.template spark-env.sh
            cp slaves.template slaves
            cp log4j.properties.template log4j.properties
            cp spark-defaults.conf.template spark-defaults.conf
        2. Create the required directories
            mkdir /opt/spark-1.6.0/logs
            mkdir /opt/spark-1.6.0/tmp
            hadoop fs -mkdir /spark    # directory on HDFS for Spark job event logs
        3. Adjust the configuration parameters
            ####spark-env.sh#### append at the end; hadoop.master is the master node, hadoop.slaver1 is the standby master
            export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop.master:2181,hadoop.slaver1:2181 -Dspark.deploy.zookeeper.dir=/spark"
            export JAVA_HOME=/usr/java/jdk1.8.0_65
            export SPARK_WORKER_CORES=1
            export SPARK_WORKER_INSTANCES=1

            export SPARK_WORKER_MEMORY=1g

Our spark-env.sh is configured as follows:

export SPARK_DIST_CLASSPATH=$(/home/haoren/soft/hadoop/bin/hadoop classpath)

#export SPARK_CLASSPATH=$SPARK_CLASSPATH:/home/ztgame/soft/mysql-connector-java-5.1.40.jar

#export SPARK_MASTER_PORT=17077

#export SPARK_MASTER_HOST=222.192.205.26

#export SPARK_MASTER_WEBUI_PORT=18082
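Because either master can be active at any time under this recovery mode, clients are best pointed at both masters rather than only hadoop.master; the standalone master URL accepts a comma-separated list (a minimal sketch using the hostnames and default port from this setup):

    spark-shell --master spark://hadoop.master:7077,hadoop.slaver1:7077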

             

            ####slaves#### add the hostnames of all worker nodes
            hadoop.slaver1
            hadoop.slaver2
            hadoop.slaver3

            ####log4j.properties####
            No changes needed

            ####spark-defaults.conf####
            spark.eventLog.enabled  true
            spark.eventLog.dir      hdfs://ns1:8020/spark
            spark.history.fs.logDirectory      hdfs://ns1:8020/spark
            spark.eventLog.compress true

    4. Distribute to the other nodes
        scp -r /opt/spark-1.6.0 hadoop@hadoop.slaver1:/opt
        scp -r /opt/spark-1.6.0 hadoop@hadoop.slaver2:/opt
        scp -r /opt/spark-1.6.0 hadoop@hadoop.slaver3:/opt
    5. Start
        # start ZooKeeper and HDFS first
        sbin/start-all.sh    # mind the working directory, otherwise this clashes with Hadoop's start-all

        spark-shell --master spark://hadoop.master:7077    # start a client against the cluster
        spark-shell                                        # start a client in local mode
    6. Verify
        1. jps
        2. Web UIs
            node hostname:8080     # master web UI, if the default port is kept
            node hostname:18080    # master node, history server web UI
            node hostname:4040     # web UI of a job currently running on a worker node
        3. HA
            Run start-master.sh on the standby master node.
            Then kill the master process on the primary node; the standby takes over by itself after a few seconds.
    7. Common commands
        1. Startup
            start-all.sh    # mind the working directory
            start-master.sh
            stop-master.sh
            start-slave.sh master-hostname:7077    # default port, unless changed
            start-history-server.sh                # start the job history server
        2. Usage
            1. Local mode
                run spark-shell
            2. Submit a packaged jar
                spark-submit \
                    --master spark://spark113:7077 \
                    --class org.apache.spark.examples.SparkPi \
                    --name Spark-Pi --executor-memory 400M \
                    --driver-memory 512M \
                    /opt/spark-1.6.0/lib/spark-examples-1.6.0-hadoop2.4.0.jar
            3. Word count
                val file = sc.textFile("hdfs://ns1:8020/huangzhijian/test.dat")
                val count = file.flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey(_+_)
                count.saveAsTextFile("hdfs://ns1:8020/output")    // the target directory must not already exist on HDFS

Adapted from http://www.CUOXin.com/ciade/p/5141264.html
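To see which master is currently active during the failover test in step 6.3, each master's web UI on port 8080 reports its status (ALIVE or STANDBY), and the election data can also be inspected directly under the ZooKeeper znode configured in spark.deploy.zookeeper.dir (a minimal sketch; the zkCli.sh path is an assumption and depends on where ZooKeeper is installed):

    /opt/zookeeper/bin/zkCli.sh -server hadoop.master:2181    # path to zkCli.sh is assumed
    ls /spark    # run inside the ZooKeeper CLI; lists the znodes Spark keeps for master recovery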