Experimental installation¶
Tip
A high-speed network connection is recommended, because the installation downloads several gigabytes of data.
Review the hardware and software prerequisites first to learn about deployment options and hardware requirements.
The installation requires two types of nodes:
- Control Node: The control node is where you install Ansible to manage other machines. It runs the playbooks and coordinates operations.
- Target Nodes: These are the machines Ansible manages. They receive and execute tasks sent by the control node.
The Control node must be able to reach every Target node over SSH.
Info
One machine can serve as both the control and a target node. You'll install Ansible on this machine and include it in the inventory for task management.
Prerequisites on the target node(s)¶
The system can be installed on any OS that supports Docker, but the recommended OS is Ubuntu 22.04. The steps below assume an Ubuntu 22.04 installation.
Info
The installation requires sudo rights on the target node(s).
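Because the steps below assume Ubuntu 22.04, it can help to confirm the OS before starting. The following is a minimal sketch, not part of the installer: the function name `is_ubuntu_2204` and its optional file argument are illustrative, and it assumes the standard `/etc/os-release` file is present.

```shell
# Sketch: succeed only if an os-release file describes Ubuntu 22.04.
# Usage: is_ubuntu_2204 [path-to-os-release]   (defaults to /etc/os-release)
is_ubuntu_2204() (
    file="${1:-/etc/os-release}"
    # Exit status 2 means the file could not be read at all.
    [ -r "$file" ] || exit 2
    # os-release is a shell-compatible KEY=value file. Sourcing it inside
    # this subshell sets ID and VERSION_ID without leaking them to the caller.
    . "$file"
    [ "${ID:-}" = "ubuntu" ] && [ "${VERSION_ID:-}" = "22.04" ]
)
```

Running `is_ubuntu_2204 && echo "Ubuntu 22.04 detected"` on a target node gives a quick yes/no answer. On another OS the package names and repository steps below may need adjusting.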
1. System update¶
Make sure your system is up-to-date.
sudo apt update && \
sudo apt upgrade -y && \
sudo reboot
2. Add Docker GPG key and repository¶
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg && \
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
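The repository line above hard-codes `arch=amd64`. On a target node with a different CPU architecture, the same line can be built from the values the system reports. The sketch below is an illustration under that assumption: the helper name `docker_repo_line` is hypothetical, and the keyring path is the one used in the command above.

```shell
# Sketch: print the Docker apt source line for a given architecture
# and Ubuntu codename. Usage: docker_repo_line ARCH CODENAME
docker_repo_line() (
    arch="$1"
    codename="$2"
    # Same keyring path as in the step above.
    keyring="/usr/share/keyrings/docker-archive-keyring.gpg"
    printf 'deb [arch=%s signed-by=%s] https://download.docker.com/linux/ubuntu %s stable\n' \
        "$arch" "$keyring" "$codename"
)
```

For example, `docker_repo_line "$(dpkg --print-architecture)" "$(lsb_release -cs)" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null` writes the line for the local machine. On an amd64 Ubuntu 22.04 host this should match the line produced by the original command.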
3. Add the NVIDIA container toolkit GPG key¶
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
4. Add the NVIDIA container toolkit repository¶
echo 'deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /' \
| sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
5. Install required dependencies¶
sudo apt update && \
sudo apt -y install ca-certificates curl software-properties-common docker-ce docker-ce-cli containerd.io python3-pip python3-venv nvidia-driver-550 nvidia-docker2 && \
sudo reboot
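After the reboot, a quick check that the expected tools are on the PATH can catch a failed package installation early. This is a minimal sketch: the helper name `missing_commands` is hypothetical, and the command names in the usage note are assumptions about which binaries the packages above provide.

```shell
# Sketch: print every given command name that cannot be found on the PATH.
# Prints nothing when all of them are available.
missing_commands() (
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
)
```

Running `missing_commands docker nvidia-smi python3 pip3` on the target node should print nothing if the installation succeeded. If the GPU runtime is configured, a command such as `sudo docker run --rm --gpus all ubuntu:22.04 nvidia-smi` is a common way to confirm that containers can see the GPU, although this guide does not require it.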
Installation steps on the control node¶
1. Install Ansible¶
The installer requires Ansible on the Control node.
sudo apt -y install sshpass && \
pip3 install ansible && \
source ~/.profile
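The `source ~/.profile` step is presumably there because `pip3`, when run without sudo on Ubuntu, places the Ansible executables in `~/.local/bin`, and Ubuntu's default `~/.profile` adds that directory to the PATH only if it exists at login. The sketch below checks for that situation; the helper name `path_contains` is hypothetical.

```shell
# Sketch: succeed if a directory is listed in a colon-separated path string.
# Usage: path_contains DIR [PATH_STRING]   (PATH_STRING defaults to $PATH)
path_contains() (
    dir="$1"
    path="${2:-$PATH}"
    # Wrapping both sides in colons makes the match exact per entry.
    case ":$path:" in
        *":$dir:"*) exit 0 ;;
        *) exit 1 ;;
    esac
)
```

If `path_contains "$HOME/.local/bin"` fails after the commands above, opening a new login shell or adding the directory to the PATH manually should make `ansible --version` work.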
2. Download and extract Ansible installer¶
wget https://u-query.ultinous.com/docs/r9/ansible_playbook.tar -P /tmp && \
mkdir -p ~/ansible-installer && \
tar -xvf /tmp/ansible_playbook.tar -C ~/ansible-installer && \
rm /tmp/ansible_playbook.tar && \
cd ~/ansible-installer
3. Set up the Ansible inventory¶
Info
If your Control node and Target node are the same machine, you can use 127.0.0.1 for ansible_host.
Set up the Ansible inventory file at inventories/all.yaml:
- Replace YOUR_TARGET_NODE_NAME_HERE with the name you want to give your target node. This name is used as a reference during the installation.
- Set a value for each of the following:
  - ansible_host: the target node's IP address or domain name.
  - ansible_port: the target node's SSH port number.
all:
  hosts:
    YOUR_TARGET_NODE_NAME_HERE:
      ansible_host: # TARGET_NODE_IP_HERE
      ansible_port: # TARGET_NODE_PORT_HERE
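To illustrate what a filled-in inventory looks like, the sketch below prints one from three values. The helper name `write_inventory` is hypothetical, and the node name `edge-node-1`, the address `192.0.2.10` (a documentation-only IP range), and port `22` are placeholder values, not defaults of the installer.

```shell
# Sketch: print an inventory in the layout shown above.
# Usage: write_inventory NODE_NAME HOST PORT
write_inventory() (
    name="$1"
    host="$2"
    port="$3"
    # The heredoc body starts at column 0 so the YAML indentation is exact.
    cat <<EOF
all:
  hosts:
    ${name}:
      ansible_host: ${host}
      ansible_port: ${port}
EOF
)
```

For example, `write_inventory edge-node-1 192.0.2.10 22 > inventories/all.yaml` writes the file from inside the installer directory. Afterwards, `ansible-inventory -i inventories/all.yaml --list` should show the host if the file parses correctly.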
Tip
You can check connectivity with the nc command:
nc -z TARGET_NODE_IP_HERE TARGET_NODE_PORT_HERE && echo "Connectivity succeeded!" || echo "Connectivity failed!"
4. Set up the Target node specific variables¶
Create a directory for host-specific variables in the root directory of the installer and copy the example variables into it:
mkdir -p host_vars/YOUR_TARGET_NODE_NAME_HERE && \
cp examples/vars.yaml host_vars/YOUR_TARGET_NODE_NAME_HERE/
Then review the copied file, host_vars/YOUR_TARGET_NODE_NAME_HERE/vars.yaml, and adjust its values for your target node.
5. Start the installation¶
Start the Ansible playbook:
ansible-playbook start.yaml
The playbook prompts for the SSH password and the sudo password used to authenticate on the target node:
SSH password:
BECOME password[defaults to SSH password]:
Tip
If you can authenticate via an SSH private key, you can leave the 'SSH password' blank. If there is no password requirement for sudo commands, you can leave the 'BECOME password' blank as well.
The installation can take up to 30 minutes.