RHEL 8 Helm Chart
In this guide we will go from a basic RHEL 8 operating system to a fully functioning Layar installation. This guide is intended as an example installation only and makes configuration decisions that may not be best suited for your environment.
Prerequisites
Hardware
- 256 GB RAM
- 16+ CPUs
- 1TB SSD
- 4x NVIDIA A100 generation GPUs (or later)
Software
- The system must have an external DNS entry (not relying on
/etc/hosts
) with the corresponding IP assigned to the host. - Internet access from the installation system is a requirement.
- Swap must be disabled.
- SELinux must be disabled or set to
Permissive
.
Installation
Commands should be run as root
or prefixed with sudo
.
Install docker
yum install -y yum-utils
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
yum -y install docker-ce
systemctl --now enable docker
Increase operating system vm.max_map_count
vm.max_map_count
echo "vm.max_map_count=262144" > /etc/sysctl.d/99-vyasa.conf
sysctl -p /etc/sysctl.d/99-vyasa.conf
Configure NetworkManager to ignore Calico interfaces
If using NetworkManager, edit the file /etc/NetworkManager/conf.d/calico.conf
and set the following:
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:wireguard.cali
Restart NetworkManager
systemctl restart NetworkManager
Install kernel development packages and compiler
dnf -y install kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc
Install and enable the EPEL and CUDA repository
dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
dnf-config-manager --enable epel
Install CUDA 12
dnf module install nvidia-driver:latest-dkms
dnf install -y cuda-12
nvidia-smi
Install NVIDIA Docker toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | tee /etc/yum.repos.d/nvidia-docker.repo
yum install -y nvidia-docker2 nvidia-container-runtime
Set docker default runtime
Edit the file /etc/docker/daemon.json
and replace its contents with:
{
"default-runtime": "nvidia",
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
}
}
Restart docker
systemctl restart docker
Add kubernetes repo
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF
Install Kubernetes and initialize
dnf install -y kubeadm-1.23.15-00 kubelet-1.23.15-00 kubectl-1.23.15-00
/usr/bin/kubeadm init --kubernetes-version=1.23.15 --token-ttl 0 --pod-network-cidr=10.17.0.0/16 -v 5
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
Install NGINX Ingress controller
kubectl apply -f https://vyasa-static-assets.s3.amazonaws.com/layar/nginx-ingress.yaml
Install Calico CNI
kubectl create -f https://vyasa-static-assets.s3.amazonaws.com/layar/tigera-operator.yaml
kubectl create -f https://vyasa-static-assets.s3.amazonaws.com/layar/custom-resources.yaml
Let head node run pods
/usr/bin/kubectl taint nodes --all node-role.kubernetes.io/master:NoSchedule-
Install local volume provisioner
/usr/bin/kubectl apply -f https://vyasa-static-assets.s3.amazonaws.com/layar/local-storage-provisioner.yml
Set default storage class
kubectl patch storageclass 'local-path' -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
Install Helm
wget -P /usr/local/bin/ https://vyasa-static-assets.s3.amazonaws.com/helm/helm
chmod +x /usr/local/bin/helm
Add the Layar helm repository
helm repo add vyasa https://helm.vyasa.com/charts/ --username vyasahelm --password "sail#away()"
helm repo update
Install Layar
Replace MY_URL
with the IP address or DNS name of your system and setting TRITON_GPU_COUNT to n-1 of available GPUs.
helm install layar vyasa/layar --set APPURL=MY_URL --set TRITON_GPU_COUNT=MY_GPU_COUNT
Updated 10 months ago