Support FATE 1.3 and FATE-Serving 1.2.2
* Add delete method in docker_deploy script
* Updated README
Signed-off-by: Lu Peng <penglu@hydsoft.com>
pengluhyd committed Mar 5, 2020
1 parent 3d70926 commit b043b9d
Showing 20 changed files with 668 additions and 147 deletions.
3 changes: 2 additions & 1 deletion .env
@@ -1,6 +1,7 @@
PREFIX=federatedai
RegistryURI=
TAG=1.2.0-release
TAG=1.3.0-release
SERVING_TAG=1.2.2-release

# PREFIX: namespace on the registry's server.
# RegistryURI: address of the local registry
47 changes: 34 additions & 13 deletions docker-deploy/README.md
@@ -42,15 +42,19 @@ The following steps illustrate how to generate necessary configuration files and

Before deploying the FATE system, multiple parties should be defined in the configuration file: `docker-deploy/parties.conf`.

In the following sample of `docker-deploy/parties.conf`, two parties are specified by ID as `10000` and `9999`. They will be deployed on hosts with IP addresses *192.168.7.1* and *192.168.7.2*, respectively.
In the following sample of `docker-deploy/parties.conf`, two parties are specified by ID as `10000` and `9999`. Their training clusters will be deployed on hosts with IP addresses *192.168.7.1* and *192.168.7.2*, respectively, and their serving clusters on hosts with IP addresses *192.168.7.3* and *192.168.7.4*, respectively.

```bash
user=root
dir=/data/projects/fate
partylist=(10000 9999)
partyiplist=(192.168.7.1 192.168.7.2)
servingiplist=(192.168.7.3 192.168.7.4)
exchangeip=192.168.7.1
```

To deploy the training cluster and the serving cluster of a party on the same machine, set `servingiplist` to the same values as `partyiplist`.
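
To make the relationship between the three lists concrete, the sketch below prints which hosts each party would use. It only mirrors the sample values above. `print_party_hosts` is a hypothetical helper that is not part of the deployment scripts, and pairing the lists by index is an assumption based on the sample.

```bash
# Illustrative only: values copied from the sample parties.conf above.
partylist=(10000 9999)
partyiplist=(192.168.7.1 192.168.7.2)
servingiplist=(192.168.7.3 192.168.7.4)

# Hypothetical helper: pair each party ID with the training and serving
# hosts found at the same index in the two IP lists (assumed mapping).
print_party_hosts() {
  local i
  for i in "${!partylist[@]}"; do
    echo "party ${partylist[$i]}: training=${partyiplist[$i]} serving=${servingiplist[$i]}"
  done
}

print_party_hosts
```

If both IP lists hold the same addresses, the training and serving columns printed for a party are identical, which corresponds to the single-machine layout.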

By default, the exchange node is co-located on the same host as the first party, and the exchange service runs on port 9371. The IP address of the exchange node should therefore be the same as that of the first party. If a standalone exchange node is needed, update the value of `exchangeip` to the IP address of the desired host.

After completing the above configuration file, use the following commands to generate the configuration for the target hosts.
@@ -59,7 +63,7 @@ $ cd docker-deploy
$ bash generate_config.sh
```

Now, tar files have been generated for each party including the exchange node (party). They are named as ```<party-id>-confs.tar ```.
Now, tar files have been generated for each party, including the exchange node (party). They are named `confs-<party-id>.tar` and `serving-<party-id>.tar`.
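
As a rough illustration of the naming scheme, the loop below lists the archive names one would expect for the two sample parties. It does not run `generate_config.sh`. `expected_archives` is a hypothetical helper, and the names are derived only from the patterns stated above; the exchange archive is left out.

```bash
# Illustrative only: party IDs copied from the sample parties.conf.
partylist=(10000 9999)

# Hypothetical helper: print the training and serving archive names
# expected for each party, following the patterns named above.
expected_archives() {
  local party
  for party in "${partylist[@]}"; do
    echo "confs-${party}.tar"
    echo "serving-${party}.tar"
  done
}

expected_archives
```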

### Deploying FATE to target hosts

@@ -73,13 +77,30 @@ To deploy FATE to all configured target hosts, use the below command:
$ bash docker_deploy.sh all
```

The script copies tar files (e.g. `10000-confs.tar` or `9999-confs.tar`) to corresponding target hosts. It then launches a FATE cluster on each host using `docker-compose` commands.
The script copies tar files (e.g. `confs-<party-id>.tar` or `serving-<party-id>.tar`) to corresponding target hosts. It then launches a FATE cluster on each host using `docker-compose` commands.

To deploy the training clusters of all parties, use the below command:
```bash
$ bash docker_deploy.sh all --training
```

To deploy the serving clusters of all parties, use the below command:
```bash
$ bash docker_deploy.sh all --serving
```

To deploy FATE to a single target host, use the below command with the party's id (10000 in the below example):
```bash
$ bash docker_deploy.sh 10000
```
To deploy a single party's training cluster, use the below command:
```bash
$ bash docker_deploy.sh 10000 --training
```
To deploy a single party's serving cluster, use the below command:
```bash
$ bash docker_deploy.sh 10000 --serving
```
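
The command forms shown so far follow one pattern: a target (`all` or a party ID) plus an optional `--training` or `--serving` flag. The sketch below only prints the command line for a given combination so it can be reviewed before running it. `deploy_cmd` is a hypothetical helper, not part of `docker_deploy.sh`, and it covers only the flags listed above.

```bash
# Hypothetical helper: print (do not run) the docker_deploy.sh command line
# for a target ("all" or a party ID) and an optional cluster type.
deploy_cmd() {
  local target="$1" cluster="${2:-}"
  case "$cluster" in
    "")
      echo "bash docker_deploy.sh ${target}" ;;
    training|serving)
      echo "bash docker_deploy.sh ${target} --${cluster}" ;;
    *)
      echo "unknown cluster type: ${cluster}" >&2
      return 1 ;;
  esac
}

# Preview the serving-cluster deployment for each sample party.
for party in 10000 9999; do
  deploy_cmd "$party" serving
done
```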
To deploy the exchange node to a target host, use the below command:
```bash
$ bash docker_deploy.sh exchange
@@ -89,16 +110,16 @@ $ bash docker_deploy.sh exchange
Once the commands finish, log in to any host and use `docker ps` to verify the status of the cluster. A sample output is as follows:

```
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d4686d616965 federatedai/python:1.2.0-release "/bin/bash -c 'sourc…" About a minute ago Up 52 seconds 9360/tcp, 9380/tcp confs-10000_python_1
4086ef0dc2de federatedai/fateboard:1.2.0-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 0.0.0.0:8080->8080/tcp confs-10000_fateboard_1
5cf3e1f1731a federatedai/roll:1.2.0-release "/bin/sh -c 'cd roll…" About a minute ago Up About a minute 8011/tcp confs-10000_roll_1
11c01143540b federatedai/meta-service:1.2.0-release "/bin/sh -c 'java -c…" About a minute ago Up About a minute 8590/tcp confs-10000_meta-service_1
f0976f48f0f7 federatedai/proxy:1.2.0-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 0.0.0.0:9370->9370/tcp confs-10000_proxy_1
7354af787036 redis:5 "docker-entrypoint.s…" About a minute ago Up About a minute 6379/tcp confs-10000_redis_1
ed11ce8eb20d federatedai/egg:1.2.0-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 7778/tcp, 7888/tcp, 50001-50004/tcp confs-10000_egg_1
6802d1e2bd21 mysql:8 "docker-entrypoint.s…" About a minute ago Up About a minute 3306/tcp, 33060/tcp confs-10000_mysql_1
5386bcb7565f federatedai/federation:1.2.0-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 9394/tcp confs-10000_federation_1
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d4686d616965 federatedai/python:<version>-release "/bin/bash -c 'sourc…" About a minute ago Up 52 seconds 9360/tcp, 9380/tcp confs-10000_python_1
4086ef0dc2de federatedai/fateboard:<version>-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 0.0.0.0:8080->8080/tcp confs-10000_fateboard_1
5cf3e1f1731a federatedai/roll:<version>-release "/bin/sh -c 'cd roll…" About a minute ago Up About a minute 8011/tcp confs-10000_roll_1
11c01143540b federatedai/meta-service:<version>-release "/bin/sh -c 'java -c…" About a minute ago Up About a minute 8590/tcp confs-10000_meta-service_1
f0976f48f0f7 federatedai/proxy:<version>-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 0.0.0.0:9370->9370/tcp confs-10000_proxy_1
7354af787036 redis:5 "docker-entrypoint.s…" About a minute ago Up About a minute 6379/tcp confs-10000_redis_1
ed11ce8eb20d federatedai/egg:<version>-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 7778/tcp, 7888/tcp, 50001-50004/tcp confs-10000_egg_1
6802d1e2bd21 mysql:8 "docker-entrypoint.s…" About a minute ago Up About a minute 3306/tcp, 33060/tcp confs-10000_mysql_1
5386bcb7565f federatedai/federation:<version>-release "/bin/sh -c 'cd /dat…" About a minute ago Up About a minute 9394/tcp confs-10000_federation_1
```
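
Instead of reading the table by eye, the status column can be checked with a small filter. The following is a minimal sketch, assuming each input line has the form `<name> <status...>`, which `docker ps -a --format '{{.Names}} {{.Status}}'` produces. `report_not_up` is a hypothetical helper, not part of the deployment scripts.

```bash
# Hypothetical helper: read "<name> <status...>" lines on stdin and report
# every container whose status does not start with "Up".
# Returns 1 if at least one such container was found, 0 otherwise.
report_not_up() {
  local name status rc=0
  while read -r name status; do
    [ -z "$name" ] && continue
    case "$status" in
      Up*) ;;
      *) echo "not running: ${name} (${status})"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Usage on a target host (requires Docker):
#   docker ps -a --format '{{.Names}} {{.Status}}' | report_not_up
```

The `-a` flag is used so that stopped containers, which plain `docker ps` hides, also appear in the input.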

### Verifying the deployment
68 changes: 36 additions & 32 deletions docker-deploy/README_zh.md
@@ -27,13 +27,15 @@ Compose is a tool for defining and running multi-container Docker applications. With Comp
If the target machine does not have the FATE component images, pull them from Docker Hub with the following commands:

```bash
$ docker pull federatedai/egg:1.2.0-release
$ docker pull federatedai/fateboard:1.2.0-release
$ docker pull federatedai/meta-service:1.2.0-release
$ docker pull federatedai/python:1.2.0-release
$ docker pull federatedai/roll:1.2.0-release
$ docker pull federatedai/proxy:1.2.0-release
$ docker pull federatedai/federation:1.2.0-release
$ docker pull federatedai/egg:<version>-release
$ docker pull federatedai/fateboard:<version>-release
$ docker pull federatedai/meta-service:<version>-release
$ docker pull federatedai/python:<version>-release
$ docker pull federatedai/roll:<version>-release
$ docker pull federatedai/proxy:<version>-release
$ docker pull federatedai/federation:<version>-release
$ docker pull federatedai/serving-server:<version>-release
$ docker pull federatedai/serving-proxy:<version>-release
$ docker pull redis:5
$ docker pull mysql:8
```
@@ -42,14 +44,15 @@ $ docker pull mysql:8
```bash
$ docker images
REPOSITORY TAG
federatedai/egg 1.2.0-release
federatedai/fateboard 1.2.0-release
federatedai/serving-server 1.2.0-release
federatedai/meta-service 1.2.0-release
federatedai/python 1.2.0-release
federatedai/roll 1.2.0-release
federatedai/proxy 1.2.0-release
federatedai/federation 1.2.0-release
federatedai/egg <version>-release
federatedai/fateboard <version>-release
federatedai/meta-service <version>-release
federatedai/python <version>-release
federatedai/roll <version>-release
federatedai/proxy <version>-release
federatedai/federation <version>-release
federatedai/serving-server <version>-release
federatedai/serving-proxy <version>-release
redis 5
mysql 8
```
@@ -64,7 +67,7 @@ mysql 8

```bash
PREFIX=federatedai
TAG=1.2.0-release
TAG=1.3.0-release
```
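Here we download the images from Docker Hub. If the required images have already been downloaded or imported on the target machines, deployment becomes very easy.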
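
As an illustration of how these values turn into image references, the sketch below builds the full image names from `PREFIX`, `TAG`, and `SERVING_TAG`. The component names come from the pull commands earlier in this README, and the use of `SERVING_TAG` for the two serving images follows `docker-compose-serving.yml` in this commit. `image_list` is a hypothetical helper, not part of the deployment scripts.

```bash
# Illustrative only: values as set in the .env file of this commit.
PREFIX=federatedai
TAG=1.3.0-release
SERVING_TAG=1.2.2-release

# Hypothetical helper: print one image reference per FATE component.
image_list() {
  local component
  # Training-side components use TAG.
  for component in egg fateboard meta-service python roll proxy federation; do
    echo "${PREFIX}/${component}:${TAG}"
  done
  # Serving-side components use SERVING_TAG.
  for component in serving-server serving-proxy; do
    echo "${PREFIX}/${component}:${SERVING_TAG}"
  done
}

image_list
```

On a machine with Docker installed, the list could be fed to `docker pull` one image at a time, for example with `image_list | xargs -n 1 docker pull`.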

@@ -86,13 +89,14 @@ RegistryURI=192.168.10.1/federatedai

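The deployment script can deploy multiple FATE instances. In the example below, we deploy to two machines, each running one FATE instance. Modify the configuration file `kubeFATE\docker-deploy\parties.conf` as needed.

Below is the modified file, with party `10000` on host *192.168.7.1* and party `9999` on host *192.168.7.2*.
Below is the modified file. The training cluster of party `10000` is on *192.168.7.1* and its online serving cluster is on *192.168.7.3*. The training cluster of party `9999` is on *192.168.7.2* and its online serving cluster is on *192.168.7.4*.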

```
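user=root # user that runs the FATE instance on the target machine
dir=/data/projects/fate # docker-compose deployment directory
partylist=(10000 9999) # party IDs
partyiplist=(192.168.7.1 192.168.7.2) # node IP for each party ID
user=root # user that runs the FATE instance on the target machine
dir=/data/projects/fate # docker-compose deployment directory
partylist=(10000 9999) # party IDs
partyiplist=(192.168.7.1 192.168.7.2) # training cluster IP for each party ID
servingiplist=(192.168.7.3 192.168.7.4) # online serving cluster IP for each party ID
exchangeip=192.168.7.1 # communication component (exchange) identifier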
```

@@ -108,7 +112,7 @@ exchangeip=192.168.7.1 # communication component (exchange) identifier
$ bash generate_config.sh # generate the deployment files
$ bash docker_deploy.sh all # deploy FATE on every party
```
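The script generates deployment files for the two parties, 10000 and 9999, and for the exchange, then packs them into tar files. It then copies the tar files `confs-10000.tar`, `confs-9999.tar`, and `confs-exchange.tar` to the corresponding hosts and unpacks them; by default the unpacked files are placed under `/data/projects/fate`. Finally, the script logs in to these hosts remotely and starts the FATE instances with docker compose commands.
The script generates deployment files for the two parties, 10000 and 9999, and for the exchange, then packs them into tar files. It then copies the tar files `confs-<party-id>.tar`, `serving-<party-id>.tar`, and `confs-exchange.tar` to the corresponding hosts and unpacks them; by default the unpacked files are placed under `/data/projects/fate`. Finally, the script logs in to these hosts remotely and starts the FATE instances with docker compose commands.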

After the commands return successfully, log in to any one of the hosts:

@@ -124,16 +128,16 @@ $ docker ps
The output looks like the following. If every component is in the running (up) state, the deployment succeeded.

```
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f8ae11a882ba fatetest/fateboard:1.2.0-release "/bin/sh -c 'cd /dat…" 5 days ago Up 5 days 0.0.0.0:8080->8080/tcp confs-10000_fateboard_1
d72995355962 fatetest/python:1.2.0-release "/bin/bash -c 'sourc…" 5 days ago Up 5 days 9360/tcp, 9380/tcp confs-10000_python_1
dffc70fc68ac fatetest/egg:1.2.0-release "/bin/sh -c 'cd /dat…" 7 days ago Up 7 days 7778/tcp, 7888/tcp, 50001-50004/tcp confs-10000_egg_1
dc23d75692b0 fatetest/roll:1.2.0-release "/bin/sh -c 'cd roll…" 7 days ago Up 7 days 8011/tcp confs-10000_roll_1
7e52b1b06d1a fatetest/meta-service:1.2.0-release "/bin/sh -c 'java -c…" 7 days ago Up 7 days 8590/tcp confs-10000_meta-service_1
50a6323f5cb8 fatetest/proxy:1.2.0-release "/bin/sh -c 'cd /dat…" 7 days ago Up 7 days 0.0.0.0:9370->9370/tcp confs-10000_proxy_1
4526f8e57004 redis:5 "docker-entrypoint.s…" 7 days ago Up 7 days 6379/tcp confs-10000_redis_1
586f3f2fe191 fatetest/federation:1.2.0-release "/bin/sh -c 'cd /dat…" 7 days ago Up 7 days 9394/tcp confs-10000_federation_1
ec434dcbbff1 mysql:8 "docker-entrypoint.s…" 7 days ago Up 7 days 3306/tcp, 33060/tcp confs-10000_mysql_1
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f8ae11a882ba fatetest/fateboard:<version>-release "/bin/sh -c 'cd /dat…" 5 days ago Up 5 days 0.0.0.0:8080->8080/tcp confs-10000_fateboard_1
d72995355962 fatetest/python:<version>-release "/bin/bash -c 'sourc…" 5 days ago Up 5 days 9360/tcp, 9380/tcp confs-10000_python_1
dffc70fc68ac fatetest/egg:<version>-release "/bin/sh -c 'cd /dat…" 7 days ago Up 7 days 7778/tcp, 7888/tcp, 50001-50004/tcp confs-10000_egg_1
dc23d75692b0 fatetest/roll:<version>-release "/bin/sh -c 'cd roll…" 7 days ago Up 7 days 8011/tcp confs-10000_roll_1
7e52b1b06d1a fatetest/meta-service:<version>-release "/bin/sh -c 'java -c…" 7 days ago Up 7 days 8590/tcp confs-10000_meta-service_1
50a6323f5cb8 fatetest/proxy:<version>-release "/bin/sh -c 'cd /dat…" 7 days ago Up 7 days 0.0.0.0:9370->9370/tcp confs-10000_proxy_1
4526f8e57004 redis:5 "docker-entrypoint.s…" 7 days ago Up 7 days 6379/tcp confs-10000_redis_1
586f3f2fe191 fatetest/federation:<version>-release "/bin/sh -c 'cd /dat…" 7 days ago Up 7 days 9394/tcp confs-10000_federation_1
ec434dcbbff1 mysql:8 "docker-entrypoint.s…" 7 days ago Up 7 days 3306/tcp, 33060/tcp confs-10000_mysql_1
```
#### Verifying the deployment
43 changes: 43 additions & 0 deletions docker-deploy/docker-compose-serving.yml
@@ -0,0 +1,43 @@
########################################################
# Copyright 2019-2020 program was created VMware, Inc. #
# SPDX-License-Identifier: Apache-2.0 #
########################################################

version: '3'

networks:
  fate-network:
services:
  serving-server:
    image: "${PREFIX}/serving-server:${SERVING_TAG}"
    ports:
      - "8000:8000"
    volumes:
      - ./confs/serving-server/conf/serving-server.properties:/data/projects/fate/serving-server/conf/serving-server.properties
      - ./confs/serving-server/.fate:/root/.fate
    networks:
      - fate-network

  serving-proxy:
    image: "${PREFIX}/serving-proxy:${SERVING_TAG}"
    ports:
      - "8059:8059"
      - "8869:8869"
    expose:
      - 8879
    volumes:
      - ./confs/serving-proxy/conf/application.properties:/data/projects/fate/serving-proxy/conf/application.properties
      - ./confs/serving-proxy/conf/route_table.json:/data/projects/fate/serving-proxy/conf/route_table.json
    networks:
      - fate-network

  redis:
    image: "redis:5"
    expose:
      - 6379
    command: redis-server --requirepass fate_dev
    volumes:
      - ./confs/redis/data:/data
    networks:
      - fate-network

1 change: 1 addition & 0 deletions docker-deploy/docker-compose.yml
@@ -100,6 +100,7 @@ services:
- ./confs/federatedml/conf:/data/projects/fate/python/arch/conf
- ./confs/federatedml/conf:/data/projects/fate/python/eggroll/conf
- ./confs/fate_flow/conf/settings.py:/data/projects/fate/python/fate_flow/settings.py
- ./confs/fate_flow/conf/server_conf.json:/data/projects/fate/python/arch/conf/server_conf.json
- fate_flow_logs:/data/projects/fate/python/logs
depends_on:
- redis
@@ -0,0 +1,30 @@
{
  "servers": {
    "proxy": {
      "host": "proxy",
      "port": 9370
    },
    "fateboard": {
      "host": "fateboard",
      "port": 8080
    },
    "roll": {
      "host": "roll",
      "port": 8011
    },
    "fateflow": {
      "host": "python",
      "grpc.port": 9360,
      "http.port": 9380
    },
    "federation": {
      "host": "federation",
      "port": 9394
    },
    "clustercomm": {
      "host": "federation",
      "port": 9394
    },
    "servings": ["serving:8000"]
  }
}