Azkaban Two-Server Environment Setup

::: hljs-center

# Azkaban Two-Server Environment Setup

:::

Two server mode (two-process mode): the metadata is stored in MySQL, and MySQL should run in a master-slave setup for backup and fault tolerance. In this mode the webServer and the executorServer run in separate processes (on the same server). This mode suits production environments, because updates and upgrades have less impact on users.

>i # 1. Prerequisites

You need the gz packages produced by building Azkaban (azkaban-web-server-0.1.0-SNAPSHOT.tar.gz, azkaban-exec-server-0.1.0-SNAPSHOT.tar.gz, azkaban-db-0.1.0-SNAPSHOT.tar.gz). For a guide to building Azkaban from source, see: [Azkaban Solo-Server Environment Setup](doc:olUfwBd7)

>i # 2. Two Server Mode Deployment

## 2.1 Extract the packages

Create the directory `azkaban-two` under `/opt/software`, then extract the three packages into it.

```shell
[liulike@hadoop software]$ mkdir azkaban-two
[liulike@hadoop ~]$ tar -zxvf azkaban-web-server-0.1.0-SNAPSHOT.tar.gz -C /opt/software/azkaban-two/
[liulike@hadoop ~]$ tar -zxvf azkaban-exec-server-0.1.0-SNAPSHOT.tar.gz -C /opt/software/azkaban-two/
[liulike@hadoop ~]$ tar -zxvf azkaban-db-0.1.0-SNAPSHOT.tar.gz -C /opt/software/azkaban-two/
# Rename (optional)
[liulike@hadoop azkaban-two]$ mv azkaban-web-server-0.1.0-SNAPSHOT/ web-server
[liulike@hadoop azkaban-two]$ mv azkaban-exec-server-0.1.0-SNAPSHOT/ executor-server
[liulike@hadoop azkaban-two]$ mv azkaban-db-0.1.0-SNAPSHOT/ sql-db
```

## 2.2 Create the tables Azkaban needs in MySQL

Log in to MySQL, create the database `azkaban_two`, and create the required tables:

```shell
mysql> create database azkaban_two;
Query OK, 1 row affected (0.01 sec)

mysql> use azkaban_two;
Database changed

mysql> source E:\software\azkaban-db-0.1.0-SNAPSHOT\create-all-sql-0.1.0-SNAPSHOT.sql
Query OK, 0 rows affected (0.09 sec)
Query OK, 0 rows affected (0.02 sec)
Query OK, 0 rows affected (0.02 sec)
...
```

![image.png](https://cos.easydoc.net/52087651/files/l6c865q3.png)

![image.png](https://cos.easydoc.net/52087651/files/l6c86cmj.png)

Both tables that failed define an index on a varchar column, one varchar(512) and one varchar(640). The error refers to a limit of 767 bytes, whereas a varchar length is counted in characters. I took the character set here to use up to 4 bytes per character (strictly, that is utf8mb4; MySQL's legacy utf8, i.e. utf8mb3, uses up to 3, which still exceeds the limit), so 4 × 512 = 2048 and 4 × 640 = 2560 are both well above 767. I changed both columns to varchar(128) (4 × 128 = 512 bytes), and the problem was solved.
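The arithmetic behind the 767-byte error can be checked with a short Python sketch. It is only an illustration: the helper names are made up for this example, the 767-byte limit is the one quoted in the error above, and the per-character widths are the usual MySQL maximums (3 bytes for utf8/utf8mb3, 4 bytes for utf8mb4).

```python
# Index size limit in bytes, as quoted in the MySQL error above.
INDEX_LIMIT_BYTES = 767

# Maximum bytes per character for two common MySQL character sets.
MAX_BYTES_PER_CHAR = {"utf8mb3": 3, "utf8mb4": 4}


def index_bytes(varchar_length, charset):
    """Worst-case bytes needed to index a full varchar(varchar_length) column."""
    return varchar_length * MAX_BYTES_PER_CHAR[charset]


def fits_in_index(varchar_length, charset):
    """True if the whole column fits under the index limit."""
    return index_bytes(varchar_length, charset) <= INDEX_LIMIT_BYTES


def max_indexable_length(charset):
    """Largest varchar length that still fits under the index limit."""
    return INDEX_LIMIT_BYTES // MAX_BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    # The two failing columns, plus the varchar(128) used as the fix.
    for length in (512, 640, 128):
        size = index_bytes(length, "utf8mb4")
        print(length, size, fits_in_index(length, "utf8mb4"))
```

Under these assumptions, varchar(128) is a conservative choice: it fits with either character set, and `max_indexable_length` shows how much headroom remains.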
## 2.3 Generate the key and certificate (optional)

```
# The certificate is used by the web server, so generating it under the web-server directory is recommended
[liulike@hadoop azkaban-two]$ keytool -keystore /opt/software/azkaban-two/web-server/keystore -alias liulike -genkey -keyalg rsa

# keytool is Java's certificate management tool; it lets users manage their own public/private key pairs and the related certificates.
# -keystore  name and location of the keystore (all generated data is stored in this keystore file)
# -genkey    (or -genkeypair) generate a key pair
# -alias     alias for the generated key pair; defaults to mykey if omitted
# -keyalg    key algorithm, RSA or DSA; the default is DSA
```

![image.png](https://cos.easydoc.net/52087651/files/l6c896kx.png)

View the keystore contents:

```
[liulike@hadoop azkaban-two]$ keytool -list -keystore web-server/keystore
```

![image.png](https://cos.easydoc.net/52087651/files/l6c8a6eg.png)

## 2.4 Web server configuration

Create the nested directory `plugins/jobtypes` under the web server directory:

```
[liulike@hadoop web-server]$ mkdir -p /opt/software/azkaban-two/web-server/plugins/jobtypes
```

In the `conf` directory of the Azkaban web server installation, edit `azkaban.properties`:

```shell
# Directory from which the web server serves its web files
web.resource.dir=/opt/software/azkaban-two/web-server/web
# The default timezone is a US one; change it to Asia/Shanghai
default.timezone.id=Asia/Shanghai
# User permission management file
user.manager.xml.file=/opt/software/azkaban-two/web-server/conf/azkaban-users.xml
# Executor global properties file
executor.global.properties=/opt/software/azkaban-two/web-server/conf/global.properties
# Jetty settings
#jetty.use.ssl=false
jetty.ssl.port=8443
jetty.port=8081
jetty.keystore=/opt/software/azkaban-two/web-server/keystore
jetty.password=liulike
jetty.keypassword=liulike
jetty.truststore=/opt/software/azkaban-two/web-server/keystore
jetty.trustpassword=liulike
jetty.maxThreads=25
# Azkaban Executor settings
executor.port=12321
# Azkaban plugin settings
azkaban.jobtype.plugin.dir=/opt/software/azkaban-two/web-server/plugins/jobtypes
# Database settings
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.106
mysql.database=azkaban_two
mysql.user=root
mysql.password=liulike
mysql.numconnections=100
```

The complete configuration file is as follows:

```shell
# Azkaban Personalization Settings
azkaban.name=liulike
azkaban.label=liulike-Azkaban
azkaban.color=#FF3601
azkaban.default.servlet.path=/index
web.resource.dir=/opt/software/azkaban-two/web-server/web
default.timezone.id=Asia/Shanghai
# Azkaban UserManager class
user.manager.class=azkaban.user.XmlUserManager
user.manager.xml.file=/opt/software/azkaban-two/web-server/conf/azkaban-users.xml
# Loader for projects
executor.global.properties=/opt/software/azkaban-two/web-server/conf/global.properties
azkaban.project.dir=projects
# Velocity dev mode
velocity.dev.mode=false
# Azkaban Jetty server properties.
jetty.ssl.port=8443
jetty.port=8081
jetty.keystore=/opt/software/azkaban-two/web-server/keystore
jetty.password=liulike
jetty.keypassword=liulike
jetty.truststore=/opt/software/azkaban-two/web-server/keystore
jetty.trustpassword=liulike
jetty.maxThreads=25
# Azkaban Executor settings
executor.port=12321
# mail settings
mail.sender=
mail.host=
# User facing web server configurations used to construct the user facing server URLs. They are useful when there is a reverse proxy between Azkaban web servers and users.
# enduser -> myazkabanhost:443 -> proxy -> localhost:8081
# when this parameters set then these parameters are used to generate email links.
# if these parameters are not set then jetty.hostname, and jetty.port(if ssl configured jetty.ssl.port) are used.
# azkaban.webserver.external_hostname=myazkabanhost.com
# azkaban.webserver.external_ssl_port=443
# azkaban.webserver.external_port=8081
job.failure.email=
job.success.email=
lockdown.create.projects=false
cache.directory=cache
# JMX stats
jetty.connector.stats=true
executor.connector.stats=true
# Azkaban plugin settings
azkaban.jobtype.plugin.dir=/opt/software/azkaban-two/web-server/plugins/jobtypes
# Azkaban mysql settings by default. Users should configure their own username and password.
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.106
mysql.database=azkaban_two
mysql.user=root
mysql.password=liulike
mysql.numconnections=100
#Multiple Executor
azkaban.use.multiple.executors=true
azkaban.executorselector.filters=StaticRemainingFlowSize,CpuStatus
azkaban.executorselector.comparator.NumberOfAssignedFlowComparator=1
azkaban.executorselector.comparator.Memory=1
azkaban.executorselector.comparator.LastDispatched=1
azkaban.executorselector.comparator.CpuUsage=1
```

In `log4j.properties`, change the log file path:

`log4j.appender.server.File=/opt/software/azkaban-two/web-server/logs/azkaban-webserver.log`

In the `conf` directory of the Azkaban web server installation, edit `azkaban-users.xml` as follows to add a custom administrator user.

```xml
<azkaban-users>
  <user groups="azkaban" password="azkaban" roles="admin" username="azkaban"/>
  <user groups="azkaban" password="liulike" roles="admin" username="liulike"/>
  <user password="metrics" roles="metrics" username="metrics"/>
  <role name="admin" permissions="ADMIN"/>
  <role name="metrics" permissions="METRICS"/>
</azkaban-users>
```

## 2.5 Executor server configuration

In the `conf` directory of the Azkaban executor server installation, edit `azkaban.properties`:

```shell
# The default timezone is a US one; change it to Asia/Shanghai
default.timezone.id=Asia/Shanghai
# Executor global properties file
executor.global.properties=/opt/software/azkaban-two/executor-server/conf/global.properties
# Web server URL
#azkaban.webserver.url=http://hadoop:8081
azkaban.webserver.url=https://hadoop:8443
# Azkaban plugin settings
azkaban.jobtype.plugin.dir=/opt/software/azkaban-two/executor-server/plugins/jobtypes
# Database settings
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.106
mysql.database=azkaban_two
mysql.user=root
mysql.password=liulike
mysql.numconnections=100
# Azkaban Executor settings
executor.port=12321
```

The complete configuration file is as follows:

```shell
# Azkaban Personalization Settings
default.timezone.id=Asia/Shanghai
# Azkaban UserManager class
# Loader for projects
executor.global.properties=/opt/software/azkaban-two/executor-server/conf/global.properties
azkaban.project.dir=projects
azkaban.webserver.url=https://hadoop:8443
# Azkaban plugin settings
azkaban.jobtype.plugin.dir=/opt/software/azkaban-two/executor-server/plugins/jobtypes
# Azkaban mysql settings by default. Users should configure their own username and password.
database.type=mysql
mysql.port=3306
mysql.host=192.168.1.106
mysql.database=azkaban_two
mysql.user=root
mysql.password=liulike
mysql.numconnections=100
# Azkaban Executor settings
executor.maxThreads=50
executor.flow.threads=30
executor.port=12321
```

In `log4j.properties`, change the log file path:

`log4j.appender.server.File=/opt/software/azkaban-two/executor-server/logs/azkaban-execserver.log`

## 2.6 Startup

```shell
# Run the start command in the executor server's bin directory
[liulike@hadoop bin]$ ./start-exec.sh
# Manually activate the executor server
[liulike@hadoop ~]$ curl "http://hadoop:12321/executor?action=activate"
# Run the start command in the web server's bin directory
[liulike@hadoop bin]$ ./start-web.sh
```

## 2.7 Possible errors

Starting the web server produced the following error:

![image.png](https://cos.easydoc.net/52087651/files/l6c8errg.png)

No active executors were found. In MySQL, set `active` to 1 in the `executors` table for the row whose port is 35496 (the port differs after every restart).

![image.png](https://cos.easydoc.net/52087651/files/l6c8f32d.png)

## 2.8 Verification

Method 1: use the `jps` command to check that the `AzkabanExecutorServer` and `AzkabanWebServer` processes are running:

![image.png](https://cos.easydoc.net/52087651/files/l6c8fnxe.png)

Method 2: visit port 8081 (SSL not configured) to see the Web UI:

![image.png](https://cos.easydoc.net/52087651/files/l6c8fudl.png)

Or visit port 8443 (SSL configured) to see the Web UI:

![image.png](https://cos.easydoc.net/52087651/files/l6c8g2cm.png)

![image.png](https://cos.easydoc.net/52087651/files/l6c8g87f.png)

## 2.9 Pitfalls

- The value of `web.resource.dir` must be an absolute path; otherwise the web pages render without their styling.
- The value of `user.manager.xml.file` must be an absolute path; otherwise startup fails with a file-not-found error.

Every restart of the web server reports the error described in 2.7. You can either repeat the steps in 2.7, or fix the executor port in the `azkaban.properties` file under the `conf` directory of both the web server and the executor server:

```
# Web server: fix the executor server port at 12321
# Azkaban Executor settings
executor.port=12321

# Executor server: fix the executor server port at 12321
# Azkaban Executor settings
executor.port=12321
```

Then activate the executor on that port:

```shell
# Syntax: curl http://${executorHost}:${executorPort}/executor?action=activate
[liulike@hadoop bin]$ curl "http://hadoop:12321/executor?action=activate"
{"status":"success"}
```

>i # 3. Basic task scheduling

## 3.1 Create a project

Create a new project from the Azkaban home page:

![image.png](https://cos.easydoc.net/52087651/files/l6c8i718.png)

## 3.2 Task configuration

Create a `liulike.flow` configuration file with the content below. The task is very simple: it prints the line `Hello Azkaban,Flow-2.0!`:

```flow
nodes:
  - name: firstJob-liulike
    type: command
    config:
      command: echo "Hello Azkaban,Flow-2.0! "
```

To run it as Flow 2.0, you also need to create a project file (`liulike.project`) that declares the Flow 2.0 version:

`azkaban-flow-version: 2.0`

## 3.3 Package and upload

Package `liulike.flow` and `liulike.project` into a zip file:

![image.png](https://cos.easydoc.net/52087651/files/l6c8jlx8.png)

Upload it through the Web UI:

![image.png](https://cos.easydoc.net/52087651/files/l6c8jtuc.png)

After a successful upload, the corresponding flows are listed:

![image.png](https://cos.easydoc.net/52087651/files/l6c8k1by.png)

## 3.4 Run the task

Click Execute Flow on the page to run the task:

![image.png](https://cos.easydoc.net/52087651/files/l6c8k7pd.png)

## 3.5 Execution result

Click Log to view the task's execution log:

![image.png](https://cos.easydoc.net/52087651/files/l6c8kgvy.png)![image.png](https://cos.easydoc.net/52087651/files/l6c8kgko.png)

![image.png](https://cos.easydoc.net/52087651/files/l6c8knuc.png)

## 3.6 Pitfalls

The `azkaban.jobtype.plugin.dir=plugins/jobtypes` entry must be configured for the web server, and the directory must be created; otherwise the following error is reported:

![image.png](https://cos.easydoc.net/52087651/files/l6c8kypn.png)

`[liulike@hadoop conf]$ mkdir -p /opt/software/azkaban-two/web-server/plugins/jobtypes`

A job stays in the preparing state when executed:

![image.png](https://cos.easydoc.net/52087651/files/l6c8ld1s.png)

Solution: in the web server's `conf/azkaban.properties`, remove `MinimumFreeMemory` from `azkaban.executorselector.filters` (the complete configuration in 2.4 already omits it).

![image.png](https://cos.easydoc.net/52087651/files/l6c8li1l.png)

The task fails immediately on execution with the following error:

![image.png](https://cos.easydoc.net/52087651/files/l6c8lyqq.png)

Solution: in the executor server's `plugins/jobtypes` directory, edit `commonprivate.properties` and append `azkaban.native.lib=false`. Then copy the file to the web server as well, and restart both the web server and the executor.

```
# set execute-as-user
execute.as.user=false
azkaban.native.lib=false

[liulike@hadoop jobtypes]$ cp /opt/software/azkaban-two/executor-server/plugins/jobtypes/commonprivate.properties /opt/software/azkaban-two/web-server/plugins/jobtypes/
```
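For the packaging step in 3.3, the two files from 3.2 can also be generated and zipped with a script instead of by hand. The following Python sketch is only an illustration: the function name and the archive name `liulike.zip` are assumptions, and the YAML indentation follows the usual Flow 2.0 layout rather than anything confirmed by this guide.

```python
import tempfile
import zipfile
from pathlib import Path

# Flow definition from section 3.2 (indentation is the assumed Flow 2.0 layout).
FLOW_YAML = """\
nodes:
  - name: firstJob-liulike
    type: command
    config:
      command: echo "Hello Azkaban,Flow-2.0! "
"""

# Project file that declares the Flow 2.0 version.
PROJECT_YAML = "azkaban-flow-version: 2.0\n"


def build_flow_zip(target_dir, zip_name="liulike.zip"):
    """Write liulike.flow and liulike.project into target_dir and zip them."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    files = {"liulike.flow": FLOW_YAML, "liulike.project": PROJECT_YAML}
    zip_path = target / zip_name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            file_path = target / name
            file_path.write_text(content, encoding="utf-8")
            # Store each file at the archive root, without directory prefixes.
            archive.write(file_path, arcname=name)
    return zip_path


if __name__ == "__main__":
    # Build the archive in a throwaway directory and list its contents.
    with tempfile.TemporaryDirectory() as tmp:
        path = build_flow_zip(tmp)
        with zipfile.ZipFile(path) as archive:
            print(archive.namelist())
```

The resulting zip is what would be uploaded through the Web UI in 3.3. Keeping both files at the archive root mirrors the manual packaging shown there.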