Understanding the Certara Generative AI Parameters
Introduction
The Certara Generative AI endpoint offers a number of parameters you can use to tailor responses to your needs. This guide goes over each of those parameters.
Swagger UI
All Certara AI endpoints and their parameters can be found at
https://YOUR_LAYAR_ENVIRONMENT/layar/swagger-ui.html
Parameters
The parameters are as follows:
{
  "content": "string",
  "task": "string",
  "messageHistory": [
    {
      "type": "string",
      "content": "string"
    }
  ],
  "sources": {
    "rawText": "string",
    "documentId": "string",
    "savedListId": "string",
    "provider": "string"
  },
  "max_tokens": 0,
  "temperature": 0,
  "top_k": 0,
  "top_p": 0,
  "conversation": {
    "chunk_size": 0
  },
  "summarize": {
    "chunk_size": 0
  },
  "prompts": {
    "system": "string",
    "summarization": "string"
  },
  "transientData": true
}
content
The question you are asking Certara Generative AI to answer.
Content Word Limit
The prompt you submit can't be infinitely long. Keeping the content to a few paragraphs helps ensure accurate responses.
task
Two possible strings, either "generate" or "summarize". This indicates to Certara Generative AI what kind of response you want.
messageHistory
A list of dictionaries that can be used to provide context from previous prompts, which will help dictate the answer given to the current prompt. Each entry requires two additional values, type and content.
type
Two possible strings, either "user" or "system". This allows you to categorize the content
as given by the user or Certara Generative AI.
content
The type will determine if the string is a prompt previously given to Certara Generative AI or an answer to a previous prompt.
messageHistory Example
Here is an example using messageHistory.
{
  "content": "What was the frequency of the dosing of Mesalamine?",
  "task": "generate",
  "messageHistory": [
    {
      "type": "user",
      "content": "What is the drug or drugs being studied?"
    },
    {
      "type": "system",
      "content": "The drug being studied is Mesalamine."
    }
  ]
}
sources
Lets you dictate which document or set is used to generate an answer to your prompt. There are multiple sub-values that can be used.
rawText
Takes any text as a string; Certara Generative AI will use this raw text to generate a response.
documentId
A string that lets you define a specific document you want Certara Generative AI to use. You can use the guide Document Search to find specific document IDs.
savedListId
A string that lets you define a specific set you want Certara Generative AI to use.
provider
A string that lets you specify which provider you want Certara Generative AI to use. This can be your whole environment or a specific provider like PubMed.
Sources Word Limit
Using documents or raw text that are very long may result in inconsistent responses.
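For example, the illustrative request body below points Certara Generative AI at a single document; the documentId value is a placeholder, not a real ID:
{
  "content": "What was the primary endpoint of the trial?",
  "task": "generate",
  "sources": {
    "documentId": "EXAMPLE_DOCUMENT_ID"
  }
}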
max_tokens
An integer that caps the length of responses. This can be used to make responses shorter or more verbose.
temperature
A floating-point number between 0 and 1. You can use this field to give Certara Generative AI more leeway in how it responds. Running the same prompt repeatedly with a temperature of 0 will cause Certara Generative AI to return the same response each time, while a temperature of 1 allows the response to vary.
top_k
An integer that restricts Certara Generative AI to choosing among the top K candidates when generating a response. A higher value will allow Certara Generative AI to use more diverse phrases in the response.
Nonsensical Responses
A higher top_k can result in responses that are nonsensical. Experimenting with different values for top_k will help curate your responses.
top_p
A floating-point number. A higher top_p tells Certara Generative AI to provide a more diverse response, while a lower value results in a safer response.
Top_P Suggestions
Adding top_p isn't necessary if you are just looking for a yes or no response. If you are looking for more verbose or detailed responses, a higher top_p will help achieve this.
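To illustrate how these generation parameters fit together, here is a sketch of a request body; the numeric values are illustrative starting points rather than recommendations:
{
  "content": "Describe the safety profile of the study drug.",
  "task": "generate",
  "max_tokens": 500,
  "temperature": 0.2,
  "top_k": 40,
  "top_p": 0.9
}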
conversation
A dictionary value that includes one key, chunk_size.
chunk_size
An integer that determines the size of the text chunk fed to Certara Generative AI. In order to generate a response, Certara Generative AI will break up text given to it. Making these chunks too large or too small can result in Certara Generative AI giving unwanted responses.
summarize
A dictionary value that includes one key, chunk_size. This parameter works exactly like conversation but for summarization.
chunk_size
An integer that determines the size of the text chunk fed to Certara Generative AI. In order to generate a summary, Certara Generative AI will break up text given to it. Making these chunks too large or too small can result in Certara Generative AI giving incorrect summaries.
Chunk_size Considerations
You do not need to supply chunk_size if embeddings have already been created for the document or set. For more information on how to create embeddings, review Create Embedding.
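For example, here is an illustrative summarization request that sets a chunk size; the documentId and chunk_size values are placeholders you would tune for your own documents:
{
  "task": "summarize",
  "sources": {
    "documentId": "EXAMPLE_DOCUMENT_ID"
  },
  "summarize": {
    "chunk_size": 500
  }
}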
prompts
A dictionary that allows you to provide additional context for Certara Generative AI without inflating the contents of your initial prompt.
system
A string that gives Certara Generative AI further direction on how to respond to your prompt, e.g. "Use a scientific writer's tone." or "Use yes or no responses."
summarization
Two possible strings, either "verbose" or "brief". This will result in a longer or shorter summary.
transientData
A boolean value. If set to true, any provided documents will be discarded after a response is returned. Defaults to false if the value is not provided.
Increased Response Time
If you are setting transientData to true, be prepared for increased Certara Generative AI response times.
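For example, the illustrative request body below supplies raw text and discards it once the response is returned; the rawText shown is made-up study text:
{
  "content": "How often was the study drug administered?",
  "task": "generate",
  "sources": {
    "rawText": "Participants received the study drug once daily for eight weeks."
  },
  "transientData": true
}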