Write agentic apps once, run with any LLM provider ✨

Chat Comparison App

Compare responses from multiple LLM providers side-by-side to evaluate performance, accuracy, and response styles for your specific use cases.

Real-time Model Comparison

Send the same prompt to multiple models and compare their responses instantly

Use Cases

πŸ“ Prompt Engineering

Test how different models interpret your prompts and refine them for optimal results

βš–οΈ Model Selection

Choose the best model for your specific task based on actual performance comparisons

πŸ’° Cost Optimization

Find the most cost-effective model that meets your quality requirements

Key Features

  • Side-by-Side Comparison: View responses from multiple models simultaneously
  • Conversation History: Maintain context across multiple interactions (see the session-state sketch after this list)
  • Dynamic Model Selection: Switch between models on the fly
  • Response Time Tracking: Monitor and compare model latencies
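
The Conversation History feature can be sketched with Streamlit's session state (a hedged sketch, not part of the sample app below; the "history" key and message format are assumptions):

# Assumed pattern: keep chat context across Streamlit reruns
if "history" not in st.session_state:
    st.session_state.history = []  # list of {"role": ..., "content": ...} dicts

st.session_state.history.append({"role": "user", "content": user_prompt})
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=st.session_state.history,  # send the full context, not just the last turn
)
st.session_state.history.append(
    {"role": "assistant", "content": response.choices[0].message.content}
)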

Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   User UI   │────▢│   AISuite    │────▢│  Provider 1 β”‚
β”‚ (Streamlit) β”‚     β”‚    Client    β”‚     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β”‚              β”‚     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   Unified    │────▢│  Provider 2 β”‚
                    β”‚   Interface  β”‚     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                    β”‚              β”‚     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚              │────▢│  Provider N β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

The app uses AISuite's unified interface to communicate with multiple providers through a single API, making it trivial to add new models or switch between them.
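
A minimal sketch of that unified call pattern (the model ids are illustrative; any provider configured with an API key works the same way):

import aisuite as ai

client = ai.Client()

# The call shape is identical for every provider; only the
# "provider:model" string changes.
for model_id in ["openai:gpt-4", "anthropic:claude-3-sonnet-20240229"]:
    response = client.chat.completions.create(
        model=model_id,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(model_id, "->", response.choices[0].message.content)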

Implementation

chat_comparison.py
import streamlit as st
import aisuite as ai
from dotenv import load_dotenv

# Initialize AISuite client
load_dotenv()
client = ai.Client()

# Configure available models
models = [
    {"name": "GPT-4", "provider": "openai", "model": "gpt-4"},
    {"name": "Claude 3", "provider": "anthropic", "model": "claude-3-sonnet"},
    {"name": "Gemini Pro", "provider": "google", "model": "gemini-pro"},
]

# Streamlit UI
st.set_page_config(layout="wide")
st.title("πŸ€– LLM Response Comparison")

# Model selection
col1, col2 = st.columns(2)
with col1:
    model1 = st.selectbox("Model 1", [m["name"] for m in models], index=0)
with col2:
    model2 = st.selectbox("Model 2", [m["name"] for m in models], index=1)

# User input
user_prompt = st.text_area("Enter your prompt:", height=100)

if st.button("Compare Responses"):
    if user_prompt:
        col1, col2 = st.columns(2)
        
        # Get response from Model 1
        with col1:
            st.subheader(f"{model1} Response")
            with st.spinner("Generating..."):
                model1_config = next(m for m in models if m["name"] == model1)
                response1 = client.chat.completions.create(
                    model=f"{model1_config['provider']}:{model1_config['model']}",
                    messages=[{"role": "user", "content": user_prompt}]
                )
                st.write(response1.choices[0].message.content)
        
        # Get response from Model 2
        with col2:
            st.subheader(f"{model2} Response")
            with st.spinner("Generating..."):
                model2_config = next(m for m in models if m["name"] == model2)
                response2 = client.chat.completions.create(
                    model=f"{model2_config['provider']}:{model2_config['model']}",
                    messages=[{"role": "user", "content": user_prompt}]
                )
                st.write(response2.choices[0].message.content)

✨ AISuite Features Highlighted

  • Unified Client: Single client instance works with all providers
  • Provider Format: Simple "provider:model" string format
  • Consistent API: Same method signature for all providers
  • Parallel Execution: Compare multiple models simultaneously (see the threading sketch after this list)
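
The implementation above queries the two models one after the other; a hedged sketch of true parallel execution using Python's standard concurrent.futures (the ask() helper is illustrative, not part of the app):

import concurrent.futures

def ask(model_id, prompt):
    # One unified call per provider; model_id uses the "provider:model" format
    response = client.chat.completions.create(
        model=model_id,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Fan the same prompt out to both selected models at once
with concurrent.futures.ThreadPoolExecutor() as pool:
    futures = {
        name: pool.submit(ask, model_id, user_prompt)
        for name, model_id in [("GPT-4", "openai:gpt-4"),
                               ("Claude 3", "anthropic:claude-3-sonnet-20240229")]
    }
    answers = {name: future.result() for name, future in futures.items()}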

Try It Out

Quick Start

  1. Clone the repository:
    git clone https://github.com/andrewyng/aisuite.git
    cd aisuite/examples/chat-ui
  2. Install dependencies:
    pip install aisuite streamlit python-dotenv
  3. Configure providers:
    # Create a .env file with your API keys
    OPENAI_API_KEY=your-key
    ANTHROPIC_API_KEY=your-key
    GOOGLE_API_KEY=your-key
  4. Run the app:
    streamlit run chat_comparison.py

Extend It

πŸ’Ύ Add Conversation Export

Save comparison results to JSON or CSV for analysis

# Export the comparison to a JSON file for later analysis
import json

results = {
    "prompt": user_prompt,
    "responses": {model1: response1.choices[0].message.content,
                  model2: response2.choices[0].message.content},
}
with open("comparison.json", "w") as f:
    json.dump(results, f, indent=2)

πŸ“Š Add Response Metrics

Track response time, token usage, and costs

# Track latency and token usage per call
import time

start = time.time()
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[{"role": "user", "content": user_prompt}],
)
latency = time.time() - start  # wall-clock response time in seconds
tokens = response.usage        # token counts, when the provider reports them

🎯 Add Evaluation Scoring

Let users rate responses to build preference datasets

# User feedback
rating = st.slider(
  "Rate this response",
  min_value=1, max_value=5
)

πŸ”„ Add Batch Processing

Process multiple prompts across models automatically

# Batch comparison: compare_all() is a helper (not shown) that
# sends one prompt to every configured model
for prompt in prompts:
    responses = compare_all(prompt, models)