Infuzu Documentation Help

Making Requests

This page describes how to make requests to the Infuzu API. Below you'll find an explanation of the basic request structure, payload parameters, and detailed examples using both cURL and Python.

Overview

All requests to the Infuzu API are made via HTTP POST to the endpoint:

https://chat.infuzu.com/api/v1/chat/completions

Each request must include the appropriate authentication header (such as Infuzu-API-Key or Authorization: Bearer) and a JSON payload that contains at least a list of messages. The API forwards your request to the most appropriate language model using Infuzu’s Intelligent Model Selection (IMS) technology, and returns an answer in a structured JSON format.

Request Structure and Workflow

The request JSON supports the following fields; only messages is required:

  • messages: An array of message objects. Each message must specify a role (e.g., "user", "assistant", or "system") and its content.

  • model (optional): Either a string indicating a specific model (e.g., "deepinfra.deepseek-r1") or an object with advanced configuration options (such as cost limits and weighting factors).

Example of a minimal JSON payload:

{
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ]
}

When the request is made, the API validates headers, authenticates the request, and processes the payload. On success, you will receive a response containing a ChatCompletionsObject with details like a unique response ID, a list of choices (each containing the message content), creation timestamp, and usage metadata.
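The payload rules above can be sketched with the Python standard library alone. This is a minimal illustration, not part of the Infuzu library: the helper name build_payload and its checks are assumptions made for this example, and the API may accept or reject inputs differently.

```python
import json


def build_payload(messages, model=None):
    """Return the request body as a JSON string.

    Hypothetical helper for illustration: it only checks what this page
    states, namely that each message carries a role and its content.
    """
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if "role" not in message or "content" not in message:
            raise ValueError("each message needs a role and content")
    payload = {"messages": messages}
    if model is not None:
        # model is optional; leave it out to let the API choose
        payload["model"] = model
    return json.dumps(payload)


# A multi-turn conversation using the three roles named above
conversation = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "Doing well. How can I help?"},
    {"role": "user", "content": "Summarize our chat in one line."},
]

body = build_payload(conversation)
print(body)
```

Validating locally before sending keeps obviously malformed payloads from costing a round trip; the server remains the authority on what is valid.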

Making a Request Using cURL

Below is an example of how to send a basic chat request using cURL:

curl -X POST https://chat.infuzu.com/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Infuzu-API-Key: your_api_key_here" \
  -d '{
    "messages": [
      { "role": "user", "content": "Hello, how are you?" }
    ]
  }'

Key Points:

  • Headers:

    • Content-Type: application/json: Specifies that the request payload is in JSON format.

    • Infuzu-API-Key: Replace your_api_key_here with your actual API key.

  • Payload:

    • The messages array contains your conversation history. For now, only text is supported.

    • The optional model parameter selects which model or configuration to use. The example above omits it, so the API chooses the model.
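For readers who prefer not to shell out to cURL, the same request can be assembled with Python's standard urllib module. This is a sketch under stated assumptions: build_request is a hypothetical helper written for this example, and it sets the optional model field to the model name used as an example earlier on this page. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://chat.infuzu.com/api/v1/chat/completions"


def build_request(api_key, messages, model=None):
    """Assemble (but do not send) a POST request mirroring the cURL call.

    Hypothetical helper for illustration only.
    """
    payload = {"messages": messages}
    if model is not None:
        payload["model"] = model
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Infuzu-API-Key": api_key,  # send exactly one authentication header
        },
        method="POST",
    )


request = build_request(
    "your_api_key_here",
    [{"role": "user", "content": "Hello, how are you?"}],
    model="deepinfra.deepseek-r1",
)
print(request.get_method(), request.full_url)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```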

Making a Request Using Python

The Infuzu Python library simplifies the process of communicating with the API. Here’s how you can create a chat completion request in Python:

import os

from infuzu import create_chat_completion, ChatCompletionsHandlerRequestMessage, InfuzuAPIError

# Retrieve the API key from your environment variable
api_key = os.getenv("INFUZU_API_KEY")
if not api_key:
    raise ValueError("API key not provided. Set the INFUZU_API_KEY environment variable.")

# Define the conversation messages
messages = [
    ChatCompletionsHandlerRequestMessage(
        role="user",
        content="Hello, how are you?"
    )
]

try:
    # Make the API request using the default model ("infuzu-ims")
    response = create_chat_completion(
        messages=messages,
        api_key=api_key,
    )

    # Process the response
    print("Response ID:", response.id)
    if response.choices and response.choices[0].message:
        print("Assistant Reply:", response.choices[0].message.content)
    else:
        print("No message content returned.")
except InfuzuAPIError as e:
    # Handle API errors gracefully
    print("API Error:", str(e))

Key Points:

  • Library Setup:

    • The create_chat_completion function consolidates authentication (including reading INFUZU_API_KEY from the environment) and request creation.

  • Messages:

    • Use the ChatCompletionsHandlerRequestMessage class to define each message.

  • Response Handling:

    • The returned object is a ChatCompletionsObject containing an ID, message choices, and usage data.

Payload Parameters and Advanced Configuration

While the basic payload includes just the messages array and an optional model parameter, advanced usage lets you customize model selection according to cost, error rates, or response latency. For these settings, set model to an object instead of a string, with parameters such as:

  • InfuzuModelParams

    • llms: Array of model names to use.

    • exclude_llms: Array of model names you want to ignore.

    • weights: Custom weighting configuration for price, error rate, start latency, and end latency.

    • max_input_cost & max_output_cost: Filter models based on character cost.

When you need to pass advanced options, simply instantiate an object or provide a JSON structure that includes these parameters.
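As a rough sketch of the JSON-structure route, the snippet below assembles a model object from the parameter names listed above. The top-level names (llms, exclude_llms, weights, max_input_cost, max_output_cost) come from this page; the keys inside weights, all numeric values, and the helper build_model_params are assumptions made for illustration, so check the reference documentation for the exact schema.

```python
import json


def build_model_params(llms=None, exclude_llms=None, weights=None,
                       max_input_cost=None, max_output_cost=None):
    """Collect only the options that were actually provided.

    Hypothetical helper; it does not validate values against the API.
    """
    options = {
        "llms": llms,
        "exclude_llms": exclude_llms,
        "weights": weights,
        "max_input_cost": max_input_cost,
        "max_output_cost": max_output_cost,
    }
    return {name: value for name, value in options.items() if value is not None}


model = build_model_params(
    exclude_llms=["deepinfra.deepseek-r1"],
    weights={"price": 2.0, "error": 1.0},  # assumed key names, placeholder values
    max_output_cost=0.01,                  # placeholder value
)

payload = {
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
    "model": model,
}
print(json.dumps(payload, indent=2))
```

Omitting unset options, rather than sending nulls, leaves the API free to apply its own defaults.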

Handling Responses and Errors

On a successful API call, you receive a ChatCompletionsObject containing:

  • id: Unique identifier for the chat completion.

  • choices: An array of response choices (the best response is typically the first).

  • created: Timestamp of the response.

  • usage: Information about token usage for billing and monitoring purposes.
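To make the field list concrete, the sketch below reads those four fields from a parsed response. The sample body is invented for illustration: the nesting of choices[0].message.content follows the Python example on this page, while the id value, the timestamp value, and the empty usage object are placeholders rather than real API output.

```python
import json

# Hypothetical response body; all values are placeholders.
sample_response_text = """
{
  "id": "example-response-id",
  "choices": [
    {"message": {"role": "assistant", "content": "I'm doing well, thank you!"}}
  ],
  "created": 1740441600,
  "usage": {}
}
"""


def first_reply(response):
    """Return the content of the first choice, or None if there is none."""
    choices = response.get("choices") or []
    if not choices:
        return None
    message = choices[0].get("message") or {}
    return message.get("content")


response = json.loads(sample_response_text)
print("Response ID:", response.get("id"))
print("Created:", response.get("created"))
print("Usage:", response.get("usage"))
print("Assistant reply:", first_reply(response))
```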

If errors occur during the request, the API will return an error response with one or more error messages. Common error statuses include:

  • 401 Unauthorized: Missing or invalid API key.

  • 400 Bad Request: Errors in the request itself (e.g., a malformed payload or multiple authentication headers).

Example error response structure:

{
  "errors": [
    {
      "code": "invalid_api_key",
      "message": "The provided API key is invalid"
    }
  ]
}

Ensure your application gracefully handles such errors by checking the response status code and processing error messages accordingly.
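One way to do this when working with raw HTTP responses is sketched below. The function describe_errors is a hypothetical helper, not part of the Infuzu library; it assumes only the error structure shown above and falls back to a generic message when the body is missing or not valid JSON.

```python
import json


def describe_errors(status_code, body_text):
    """Turn a status code and response body into readable error strings.

    Returns an empty list for 2xx responses. Hypothetical helper.
    """
    if 200 <= status_code < 300:
        return []
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        return [f"HTTP {status_code}: error body was not valid JSON"]
    errors = body.get("errors") if isinstance(body, dict) else None
    if not errors:
        return [f"HTTP {status_code}: no error details provided"]
    return [
        f"HTTP {status_code} [{error.get('code', 'unknown')}]: {error.get('message', '')}"
        for error in errors
    ]


example_body = (
    '{ "errors": [ { "code": "invalid_api_key", '
    '"message": "The provided API key is invalid" } ] }'
)
for line in describe_errors(401, example_body):
    print(line)
```

Checking the status code first means a successful response is never mistaken for an error just because its body happens to contain unexpected fields.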

Summary

  1. Prepare the Request:

    • Use the appropriate endpoint, headers, and JSON payload.

    • Ensure the payload contains valid message objects and optional configuration.

  2. Make the Request:

    • Use cURL or the Infuzu Python library to send the POST request.

  3. Process the Response:

    • Parse the JSON response to access the returned message and related metadata.

    • Handle any API errors and plan for retries or error logging as required.

By following these guidelines and examples, you can integrate the Infuzu API into your applications: authenticate each request, send a valid messages payload, and handle both successful responses and errors.

For further details on advanced configurations, error handling, or monitoring API usage, please refer to the related sections of our documentation. Happy coding!

Last modified: 25 February 2025