Understanding the Async Generate Endpoint

Introduction

This guide goes over how and why you would want use the async generate endpoint, /layar/gpt/generate/async.

When to Use

Both the synchronous and asynchronous allow you to prompt a model, however the asynchronous approach is a good method to ensure performance issues do not become an issue on the server. Too many people using the synchronous endpoint, /layar/gpt/generate, may result in a resource bottleneck. If users have a workflow that asks a lot of questions programmatically, it's best to use the asynchronous endpoint. This allows the requests to be queued, resulting in less requests hitting the GPUs simultaneously.

Using the Endpoint

The endpoint accepts a payload exactly like the one you would use in the synchronous endpoint.

📘
Understanding the Generate Endpoint
If you want to learn more about the payload that can be sent to the Async and Sync endpoint, please review Understanding the Generate Endpoint

Lets look at an example request to the async endpoint.

import requests
from requests.auth import HTTPBasicAuth

envUrl = 'YOUR_ENVIRONMENT_HERE'
fabricId = 8
gptGenAsyncUrl = f'{envUrl}/layar/gpt/generate/async'

header = {'Accept': 'application/json',
          'Content-Type': 'application/json',
          'Authorization': f'Bearer {token}',
          'X-Vyasa-Client': 'composer',
					'X-Vyasa-Data-Fabric': fabricId}

requestId = requests.post(gptGenAsyncUrl,
                              headers = header,
                              json = {
                               'content' : 'What is a placebo?',
                               'sources' : [],
                               'task' : 'generate'
                             }).json().get('requestId')

The response will look like this.

{'requestId': '9718eef5-5747-4f23-b1c6-444fada8ab72'}

We need to take that ID and use it to query if the answer to the prompt is ready. We use the /layar/gpt/generate/async/<requestId>endpoint. We can use a while loop to check when the answer is ready.

content = None
while content == None:
    if status == 'RUNNING':
        status = requests.get(f'{envUrl}/layar/gpt/generate/async/{requestId}',
                                 headers=header).json().get('status')
        else:
          content = requests.get(f'{envUrl}/layar/gpt/generate/async/{requestId}',
                                               headers=header).json().get('result')

Once the async request is done running it will return a JSON dictionary containing the answer to the prompt.

{'chunksUsed': [],
 'content': 'A placebo is a treatment or substance that has no actual '
            'therapeutic effect, but is designed to mimic the appearance and '
            "feel of a real treatment. It's often used in medical research and "
            'clinical trials to compare the effectiveness of a new treatment '
            'against a dummy or inactive treatment.\n'
            '\n'
            'In other words, a placebo is a "sugar pill" or a fake treatment '
            "that looks and feels like the real thing, but doesn't actually "
            'contain any active ingredients. The idea behind using placebos is '
            'to see if the perceived benefits of a treatment are due to the '
            "actual effects of the treatment itself, or if they're due to the "
            'psychological expectation of getting better.\n'
            '\n'
            'For example, in a study on pain relief, one group might receive '
            'an actual pain medication (the active treatment), while another '
            'group receives a sugar pill that looks and tastes like the real '
            'medication (the placebo). If people who receive the sugar pill '
            "report feeling better than those who don't receive any treatment "
            'at all, it suggests that their improvement is due to their '
            'expectation of getting better, rather than any actual effect of '
            'the medication.\n'
            '\n'
            'Placebos can be used in various forms, such as pills, injections, '
            "creams, or even sham surgeries. They're an important tool in "
            "medical research because they help scientists understand what's "
            'happening when people get better with a new treatment – is it '
            'really due to the treatment itself, or just their expectations?',
 'documentsUsed': [],
 'perplexity': 1.0075626,
 'sourcesUsed': [],
 'type': 'assistant'}

📘
Recipe to Utilize Async Endpoint
If you would like to see a full script utilizing this endpoint please review this recipe, Using the Async Endpoint