Setting Generative System Model
Learn how to change the system model for Composer to any supported LLM
Models & Requirements
Default Model: Mistral 7B V0.1
| Model | Name | Quantization | Minimum Requirements | Minimum Layar Version |
|---|---|---|---|---|
| Mistral 7B V0.1 | mistralai/Mistral-7B-Instruct-v0.1 | None | A10 24 GB x1 | 1.7 |
| Mixtral 8x7B | casperhansen/mixtral-instruct-awq | AWQ | A100 40 GB x1 | 1.7 |
| Llama 3 70B | casperhansen/llama-3-70b-instruct-awq | AWQ | A10 10 GB x4 | 1.8 |
| Llama 3 70B | meta-llama/llama-3-70b-instruct | None | A100 40 GB x2 | 1.8 |
| Llama 3.1 70B | meta-llama/Meta-Llama-3.1-70B-Instruct | None | A100 80 GB x2 | 1.9 |
| Llama 3.1 70B | hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4 | GPTQ | A100 80 GB x1 | 1.9 |
| Llama 3.1 70B | hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 | AWQ | A100 80 GB x1 | 1.9 |
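For example, to run the AWQ-quantized build of Mixtral 8x7B, you would use the value from its Name column when setting `TGI_MODEL` later in this guide (the surrounding contents of `layar.config` will vary by installation):

```
TGI_MODEL: casperhansen/mixtral-instruct-awq
```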
Llama 3.1 VRAM Limitations
To run the Llama 3.1 models on Layar 1.9, you will need GPUs with 80 GB of VRAM. If you have further questions about this, please e-mail [email protected]
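If you are unsure what GPUs are attached to your instance, a quick check with the standard nvidia-smi query flags (assuming the NVIDIA drivers are already installed on the host) will show each card's total VRAM:

```bash
# List each GPU's model name and total memory
nvidia-smi --query-gpu=name,memory.total --format=csv
```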
GPU Configuration Considerations
If you are interested in running the models on smaller GPU partitions, please review GPU Considerations.
Setting Your System to a New Model
To change the model from the default, you will need to ssh into the instance in question.
Once you are logged in, follow these steps (a consolidated example session follows the list):
- Become root by running `sudo su -`.
- Once you have root access, open the Layar configuration file by running `vi /data/layar/layar.config`.
- Enter edit mode by typing `i`.
- Add the following line to the file, where `model name` is the value from the Name column above for the model you'd like to use: `TGI_MODEL: model name`
- Once you've made the edit, save and exit by first pressing `esc` to exit edit mode, then typing `:wq` followed by `enter` to write the changes and exit the file.

⚠️ Be careful not to change anything other than the new line containing `TGI_MODEL: model name`.

- Now that you've edited the Layar configuration file, you'll need to restart the pods that consume `TGI_MODEL` so that they are started with the correct model in place.
- Run `k delete deployment certara-llm certara-tgi`.
- Run `kps` every 10 seconds until you no longer see the llm and tgi pods in the stack.
- Run `/deployLayar.sh llm_model_swap` to redeploy the pods with the new model.
- You should be able to use `kps` and `k logs <pod name>` to watch as the pods start up. The model server (the TGI pod) should log that the model you specified has been loaded successfully.
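Putting the steps together, a complete model swap looks roughly like the session below. This is a sketch that assumes the Mixtral 8x7B model from the table above and that `k`, `kps`, and `/deployLayar.sh` behave as described on your Layar instance; substitute the model name you actually want.

```bash
sudo su -                                    # become root
vi /data/layar/layar.config                  # add the line: TGI_MODEL: casperhansen/mixtral-instruct-awq
k delete deployment certara-llm certara-tgi  # remove the pods that consume TGI_MODEL
kps                                          # repeat until the llm and tgi pods are gone
/deployLayar.sh llm_model_swap               # redeploy the pods with the new model
kps                                          # watch the new pods come up
k logs <pod name>                            # check that the TGI pod reports the new model loaded
```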
Once the pods have finished loading, you should be able to use Composer and other apps that rely on the /generate endpoint.
If you have issues, please contact Certara Support at [email protected]