Ubuntu 20 Helm Chart
In this guide we will go from a basic Ubuntu 20 install to a fully functioning Kubernetes cluster running Layar. This guide is intended as an example installation only and makes configuration decisions that may not be best suited for your environment.
Prerequisites
Hardware
- 256 GB RAM
- 32 CPUs
- 1 TB SSD disk
- 2x A100 or later generation GPUs (A100 strongly preferred)
Additional Requirements
- The system must have an external DNS entry (not relying on /etc/hosts) with the corresponding IP assigned to the host.
- Internet access from the installation system is required.
- Swap must be disabled.
- SELinux must be disabled or set to Permissive.
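The swap and SELinux requirements above can be checked before starting. The following is a minimal sketch rather than part of the official procedure: the helper names swap_is_off and selinux_ok are made up for illustration, and it assumes getenforce is only present when SELinux tooling is installed (a stock Ubuntu system usually ships AppArmor instead).

```shell
#!/usr/bin/env bash
# Pre-flight sketch (helper names are illustrative, not part of Layar).

# Succeeds when the swaps table (default: /proc/swaps) lists no active
# devices. The file is a header line followed by one line per device.
swap_is_off() {
    local table="${1:-/proc/swaps}"
    [ "$(tail -n +2 "$table" 2>/dev/null | grep -c .)" -eq 0 ]
}

# Succeeds when the given SELinux mode is acceptable for this guide.
# An empty mode means SELinux tooling is not installed at all.
selinux_ok() {
    case "$1" in
        ""|Disabled|Permissive) return 0 ;;
        *) return 1 ;;
    esac
}

if swap_is_off; then
    echo "swap: disabled"
else
    echo "swap: still active (see 'swapoff -a' and /etc/fstab)"
fi

mode=""
if command -v getenforce >/dev/null 2>&1; then
    mode="$(getenforce 2>/dev/null || true)"
fi
if selinux_ok "$mode"; then
    echo "selinux: ok (${mode:-not installed})"
else
    echo "selinux: $mode (must be Disabled or Permissive)"
fi
```

The script only reports; it does not change any system setting.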
Installation
Commands should be run as root or prefixed with sudo.
Ensure you have current Docker installed
docker version
Both the Client and Server sections of the output should show version 24.0.5 or later. If they do not, follow the official instructions for installing Docker on Ubuntu before proceeding.
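The version comparison can also be scripted. This is a sketch only: version_ge is a hypothetical helper, it assumes GNU sort (for the -V option), and it treats a missing or unreachable Docker daemon as a failed check.

```shell
#!/usr/bin/env bash
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="24.0.5"

if command -v docker >/dev/null 2>&1; then
    for side in Client Server; do
        # The query fails, leaving this empty, if the daemon is unreachable.
        found="$(docker version --format "{{.${side}.Version}}" 2>/dev/null || true)"
        if [ -n "$found" ] && version_ge "$found" "$required"; then
            echo "$side $found: ok"
        else
            echo "$side ${found:-unknown}: needs $required or later"
        fi
    done
else
    echo "docker: not installed"
fi
```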
Increase operating system vm.max_map_count
echo "vm.max_map_count=262144" > /etc/sysctl.d/99-vyasa.conf
sysctl -p /etc/sysctl.d/99-vyasa.conf
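To confirm the new value is live, the kernel setting can be read back. A small sketch follows; map_count_ok is an illustrative helper name, and reading /proc/sys/vm/max_map_count is equivalent to running sysctl -n vm.max_map_count.

```shell
#!/usr/bin/env bash
# Succeeds when value $1 is at least the required minimum $2.
map_count_ok() {
    [ "$1" -ge "$2" ]
}

required=262144
# Live kernel value; falls back to 0 if the file cannot be read.
current="$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)"

if map_count_ok "$current" "$required"; then
    echo "vm.max_map_count=$current: ok"
else
    echo "vm.max_map_count=$current: below $required"
fi
```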
Configure NetworkManager to ignore Calico interfaces
If using NetworkManager, edit the file /etc/NetworkManager/conf.d/calico.conf and set the following:
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:wireguard.cali
Restart NetworkManager
systemctl restart NetworkManager
Install CUDA 11
Install CUDA per NVIDIA's instructions at the link below. Version 11.8 or later is required, and DKMS must also be installed.
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Install NVIDIA Docker toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
apt update
apt-get install -y nvidia-docker2
Set docker default runtime
Edit the file /etc/docker/daemon.json and replace its contents with:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
Restart Docker
systemctl restart docker
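After the restart it is worth confirming that Docker picked up the new default runtime. The sketch below makes two assumptions: declares_nvidia_default is a hypothetical helper that does a plain text match (not a real JSON parse), and the running daemon is queried with the DefaultRuntime field of docker info.

```shell
#!/usr/bin/env bash
# Succeeds when the given daemon.json text sets "default-runtime" to
# "nvidia". This is a plain text match, not a real JSON parse.
declares_nvidia_default() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

config="/etc/docker/daemon.json"
if [ -r "$config" ] && declares_nvidia_default "$config"; then
    echo "$config: default runtime set to nvidia"
else
    echo "$config: nvidia default runtime not found"
fi

# Ask the running daemon which runtime it actually uses by default.
if command -v docker >/dev/null 2>&1; then
    active="$(docker info --format '{{.DefaultRuntime}}' 2>/dev/null || true)"
    echo "active default runtime: ${active:-unknown}"
fi
```

If the file check passes but the active runtime is not nvidia, the daemon most likely has not been restarted since the edit.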
Install Kubernetes
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" >> /etc/apt/sources.list && apt update
apt install -y kubeadm=1.23.15-00 kubelet=1.23.15-00 kubectl=1.23.15-00
/usr/bin/kubeadm init --kubernetes-version=1.23.15 --token-ttl 0 --pod-network-cidr=10.17.0.0/16 -v 5
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install NGINX Ingress controller
kubectl apply -f https://vyasa-static-assets.s3.amazonaws.com/layar/nginx-ingress.yaml
Install Calico CNI
kubectl create -f https://vyasa-static-assets.s3.amazonaws.com/layar/tigera-operator.yaml
kubectl create -f https://vyasa-static-assets.s3.amazonaws.com/layar/custom-resources.yaml
Allow the head node to run pods
kubectl taint nodes --all node-role.kubernetes.io/master:NoSchedule-
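At this point the node should move to Ready once the Calico pods are running. The following sketch checks that; count_not_ready is an illustrative helper, and the calico-system namespace is an assumption about what the operator manifests above create.

```shell
#!/usr/bin/env bash
# Counts nodes whose STATUS column is not exactly "Ready" in the output of
# `kubectl get nodes --no-headers`, read from standard input.
count_not_ready() {
    awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

if command -v kubectl >/dev/null 2>&1; then
    nodes="$(kubectl get nodes --no-headers 2>/dev/null || true)"
    if [ -z "$nodes" ]; then
        echo "no nodes reported (is the API server reachable?)"
    else
        echo "nodes not ready: $(printf '%s\n' "$nodes" | count_not_ready)"
    fi
    # Namespace name is an assumption; adjust if the pods live elsewhere.
    kubectl get pods -n calico-system 2>/dev/null || true
fi
```

A count of 0 means every node reports Ready. It can take a few minutes after the CNI install for the status to change.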
Install local volume provisioner
/usr/bin/kubectl apply -f https://vyasa-static-assets.s3.amazonaws.com/layar/local-storage-provisioner.yml
Set default storage class
kubectl patch storageclass 'local-path' -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Install Helm
wget -P /usr/local/bin/ https://vyasa-static-assets.s3.amazonaws.com/layar/helm
chmod +x /usr/local/bin/helm
Add the Layar helm repository
helm repo add vyasa https://helm.vyasa.com/charts/ --username vyasahelm --password "sail#away()"
helm repo update
Install Layar
Replace MY_URL with the IP address or DNS name of the system. Set MY_GPU_COUNT to the number of GPUs available to Layar.
helm install layar vyasa/layar --set APPURL=MY_URL --set TRITON_GPU_COUNT=MY_GPU_COUNT
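If every GPU on the host should be given to Layar, the count can be derived from nvidia-smi rather than typed by hand. This is a sketch under that assumption: count_gpus is a hypothetical helper, app_url still holds the MY_URL placeholder, and the script only prints the install command for review.

```shell
#!/usr/bin/env bash
# Counts GPUs from `nvidia-smi -L` output on standard input. Each GPU is
# listed on its own line beginning with "GPU <index>:".
count_gpus() {
    grep -c '^GPU [0-9][0-9]*:' || true
}

# MY_URL is the placeholder from the command above; replace it before use.
app_url="MY_URL"

if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_count="$(nvidia-smi -L 2>/dev/null | count_gpus)"
else
    gpu_count=0
fi

# Print the resulting command for review instead of running it.
echo "helm install layar vyasa/layar --set APPURL=$app_url --set TRITON_GPU_COUNT=$gpu_count"
```

Once the chart is installed, helm status layar and kubectl get pods show whether the release deployed and its pods are starting.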
Uninstall Layar
To uninstall the Layar helm chart, execute the following command:
helm uninstall layar