Overview

The Infuzu API is a powerful tool designed to enhance your interactions with large language models (LLMs) by leveraging Infuzu's Intelligent Model Selection (IMS) technology. This API serves as a proxy, forwarding your requests to the most suitable LLM based on your query and returning the response to you. What distinguishes the Infuzu API is its ability to predict and select the best model for your specific needs, ensuring high-quality, accurate, and contextually relevant responses.

How It Works

When you send a request to the Infuzu API, it does not simply relay your query to any LLM provider. Instead, it uses IMS to analyze your request and predict which model will provide the most beneficial response. This prediction is based on human-centered benchmarks, ensuring that the results align with what users typically find clear, accurate, and helpful.

Customization & Control

The Infuzu API offers extensive customization options to tailor the model selection and response evaluation to your specific requirements:

Model Selection Preferences: You can specify a list of models for the API to choose from or exclude certain models if they do not meet your needs.
Cost Filtering: Set limits on the maximum cost per character for both input and output, allowing the API to select models within your budget.
Weighting Factors: In addition to quality, you can adjust the importance of other factors such as:
- Price: Prioritize models based on a 3:1 ratio of input to output token prices.
- Error Rate: Consider the reliability of models based on their performance in the past hour.
- Start Latency: Evaluate how quickly each model begins responding.
- End Latency: Measure the time taken for the full response to be completed.

By default, IMS assigns a weight of 100 to "quality," with other factors ignored. You can modify these weights to align with your priorities.

Requesting Multiple Responses

While the default behavior of the Infuzu API is to return the single best response as determined by IMS, you have the option to request multiple responses. This feature is useful if you need to consider different perspectives or alternatives.

Privacy and Security

Infuzu takes your privacy and security seriously:

In-Transit Security: All communication uses TLS and HTTPS, with data encrypted using RSA-2048 or stronger protocols.
No Data Retention: Infuzu does not store any request or response data. All data is temporarily held in memory during processing and is deleted immediately after the response is sent.
Basic Analytics Only: For operational purposes, Infuzu logs only basic metrics like request counts, timing, and character counts, without retaining any input information.
No Retention by Providers: Infuzu collaborates with LLM providers that also adhere to zero data retention policies, ensuring your data's privacy across the entire processing chain.
HIPAA Compliance: The Infuzu API is fully compliant with HIPAA standards, making it suitable for healthcare applications. Business Associate Agreements (BAAs) can be signed upon request for added compliance.

Compatibility & Features

The Infuzu API is designed to be compatible with the OpenAI API, offering a seamless transition for developers familiar with similar tools. Key features include:

Conversational History: You can provide context by sending a list of messages.
Streaming Responses: Although not supported in the current Python library, the API allows for streaming responses, enabling real-time interaction.
Customizable Model Behavior: Easily control inputs, outputs, and model selection preferences across various use cases.

Why Choose the Infuzu API?

Whether you're developing chatbots, automating content generation, or leveraging AI for business solutions, the Infuzu API offers:

Highest-Quality Results: IMS ensures responses align with human preferences for clarity, accuracy, and context.
Flexibility & Cost-Efficiency: Balance quality with cost, speed, and reliability as needed.
Comprehensive Security: Your data remains private and secure, with adherence to HIPAA standards.
Developer-Friendly: Easy integration with familiar API structures means you can start using the Infuzu API with minimal effort.

With the Infuzu API, you can harness the power of advanced AI while maintaining control over quality, cost, and security.

For more information on managing API credits, rate limits, and usage patterns, please refer to the following links:

For further details on any topic, feel free to explore the rest of our documentation, powered by Infuzu Chat's Intelligent Model Selection technology.

Last modified: 25 February 2025