Loading Models Using The API
Introduction
Layar version 1.15 allows users to swap in models that are stored on the server but not currently served on the application stack.
Model Requirements
It's important to understand that attempting to load a model without enough available GPU resources will lead to either an unstable environment or a failed model load. Please review the model requirements documentation.
Adding Models To The Server
The model files you wish to use will need to be placed on the server at /data/jars/hfmodels. The model files must be in .zip, .tar, or .tar.gz format. Once the model files have been added to the directory, we can use the API to check which models are located in that folder.
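If your model exists as a directory of files, it will need to be packaged into one of the supported archive formats before being copied to the server. A minimal sketch using Python's standard tarfile module, assuming a hypothetical local model directory and output path:

```python
import os
import tarfile

def package_model(model_dir: str, output_path: str) -> str:
    """Package a local model directory (hypothetical path) into a .tar.gz
    archive suitable for placing in /data/jars/hfmodels on the server."""
    with tarfile.open(output_path, "w:gz") as tar:
        # Archive the directory under its own name so it unpacks cleanly.
        tar.add(model_dir, arcname=os.path.basename(model_dir))
    return output_path
```

You would then copy the resulting archive to /data/jars/hfmodels by whatever transfer method your environment uses (e.g. scp).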
Checking Available Models
To check which models are available, send a GET request to /layar/hfModels. No request body is needed to query this endpoint.
import requests

header = {'Accept': 'application/json',
          'Content-Type': 'application/json',
          'Authorization': f"Bearer {token}"}
response = requests.get("https://YOUR_ENVIRONMENT_URL/layar/hfModels",
                        headers=header).json()
The response will look like this:
[{
"modelName": "Llama-3.2-3B-Instruct",
"fileName": "Llama-3.2-3B-Instruct.tar",
"size": "4.7 GB",
"lastModified": "Oct 10,2025 at 02:13AM GMT"
}]
The list may have multiple results in it depending on how many models you have.
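Since the response is a list, you may want to pick out a specific entry before serving it. A small sketch of a lookup helper (the `find_model` function is a hypothetical name, not part of the Layar API), using the sample response above:

```python
def find_model(models: list, name: str):
    """Return the entry from GET /layar/hfModels whose modelName matches
    `name`, or None if it is not present in the list."""
    return next((m for m in models if m.get("modelName") == name), None)

# Sample response body from GET /layar/hfModels
models = [{
    "modelName": "Llama-3.2-3B-Instruct",
    "fileName": "Llama-3.2-3B-Instruct.tar",
    "size": "4.7 GB",
    "lastModified": "Oct 10, 2025 at 02:13 AM GMT"
}]

entry = find_model(models, "Llama-3.2-3B-Instruct")
```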
Serving The Model
Once we have the modelName, we can serve the model by sending a GET request to /layar/hfModels/<modelName>/download.
import requests

header = {'Accept': 'application/json',
          'Content-Type': 'application/json',
          'Authorization': f"Bearer {token}"}
response = requests.get("https://YOUR_ENVIRONMENT_URL/layar/hfModels/<modelName>/download",
                        headers=header).json()
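When substituting the modelName into the path, it is safest to URL-encode it in case it contains characters that are not path-safe. A minimal sketch, assuming a placeholder base URL (the `download_url` helper is hypothetical, not part of the Layar API):

```python
from urllib.parse import quote

def download_url(base_url: str, model_name: str) -> str:
    """Build the /layar/hfModels/<modelName>/download URL, URL-encoding
    the model name so special characters do not break the path."""
    return f"{base_url}/layar/hfModels/{quote(model_name, safe='')}/download"

url = download_url("https://YOUR_ENVIRONMENT_URL", "Llama-3.2-3B-Instruct")
```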