Loading Models Using The API
Introduction
Layar version 1.15 allows users to swap in models that are stored on the server but not currently served on the application stack.
Model Requirements
It's important to understand that attempting to load a model without enough available GPU resources will lead to either an unstable environment or a failed model load. Please review the model requirements documentation.
Adding Models To The Server
The model files you wish to use will need to be placed on the server at /data/jars/hfmodels. The model files must be in .zip, .tar, or .tar.gz format. Once the model files have been added to the directory, we can use the API to check which models are located in that folder.
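If your model exists as a directory of files, it will need to be packaged into one of the supported archive formats before being copied to the server. A minimal sketch using Python's standard tarfile module, assuming a hypothetical local model directory and output path:

```python
import os
import tarfile

def package_model(model_dir: str, output_path: str) -> str:
    """Package a local model directory (hypothetical path) into a .tar.gz
    archive suitable for placing in /data/jars/hfmodels on the server."""
    with tarfile.open(output_path, "w:gz") as tar:
        # Archive the directory under its own name so it unpacks cleanly.
        tar.add(model_dir, arcname=os.path.basename(model_dir))
    return output_path
```

You would then copy the resulting archive to /data/jars/hfmodels by whatever transfer method your environment uses (e.g. scp).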
Checking Available Models
To check which models are available, send a GET request to /layar/hfModels. No request body is needed to query this endpoint.
import requests

header = {'Accept': 'application/json',
          'Content-Type': 'application/json',
          'Authorization': f"Bearer {token}"}
response = requests.get("https://YOUR_ENVIRONMENT_URL/layar/hfModels",
                        headers=header).json()
The response will look like this:
[{
"modelName": "Llama-3.2-3B-Instruct",
"fileName": "Llama-3.2-3B-Instruct.tar",
"size": "4.7 GB",
"lastModified": "Oct 10,2025 at 02:13AM GMT"
}]
The list may have multiple results in it depending on how many models you have.
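Since the response is a list, you may want to pick out a specific entry before serving it. A small sketch of a lookup helper (the `find_model` function is a hypothetical name, not part of the Layar API), using the sample response above:

```python
def find_model(models: list, name: str):
    """Return the entry from GET /layar/hfModels whose modelName matches
    `name`, or None if it is not present in the list."""
    return next((m for m in models if m.get("modelName") == name), None)

# Sample response body from GET /layar/hfModels
models = [{
    "modelName": "Llama-3.2-3B-Instruct",
    "fileName": "Llama-3.2-3B-Instruct.tar",
    "size": "4.7 GB",
    "lastModified": "Oct 10, 2025 at 02:13 AM GMT"
}]

entry = find_model(models, "Llama-3.2-3B-Instruct")
```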
Serving The Model
Once we have the modelName, we can serve the model by sending a GET request to /layar/hfModels/<modelName>/download.
import requests

header = {'Accept': 'application/json',
          'Content-Type': 'application/json',
          'Authorization': f"Bearer {token}"}
response = requests.get("https://YOUR_ENVIRONMENT_URL/layar/hfModels/<modelName>/download",
                        headers=header).json()
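When substituting the modelName into the path, it is safest to URL-encode it in case it contains characters that are not path-safe. A minimal sketch, assuming a placeholder base URL (the `download_url` helper is hypothetical, not part of the Layar API):

```python
from urllib.parse import quote

def download_url(base_url: str, model_name: str) -> str:
    """Build the /layar/hfModels/<modelName>/download URL, URL-encoding
    the model name so special characters do not break the path."""
    return f"{base_url}/layar/hfModels/{quote(model_name, safe='')}/download"

url = download_url("https://YOUR_ENVIRONMENT_URL", "Llama-3.2-3B-Instruct")
```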