# Shark AI — Full API Documentation for LLMs/Agents

> This file contains API documentation and a model catalog optimized for AI agents and LLMs.
> For a concise overview, see: https://shark.ai/llms.txt
> OpenAPI spec: https://shark.ai/openapi.json
> For detailed per-model docs (parameters, examples), fetch: https://shark.ai/docs/{category}/{model-id}/llms.txt

## Overview

Shark AI is a unified, agent-friendly AI model gateway for text, image, video, audio, and embedding models. Text models support OpenAI Format by default, with Anthropic Format and Gemini Format available on selected models.

- API origin: https://shark.ai
- SDK/client base URL for OpenAI, Anthropic, and custom HTTP clients: `https://shark.ai/api/v1`
- Do not use the API origin as an OpenAI/Anthropic SDK `base_url`; SDK clients should include `/api/v1`.
- Authentication: `Authorization: Bearer <api-key>` or `x-api-key: <api-key>`
- User-facing category names: Text, Image, Video, Audio, Embedding. Text models use the internal `language` category in docs/model URLs.
- Model identity in the database and docs is `category + model ID`; API calls pass only the model ID, while the endpoint/category disambiguates routing.
- Available models: 21

## Agent Decision Flow

1. Call or inspect `GET /api/v1/models` and read each model's `category` and `supported_protocols`.
2. For language models, choose the public format:
   - `openai_chat_completions`: default broad SDK compatibility; use `POST /api/v1/chat/completions`.
   - `anthropic_messages`: prefer for Anthropic/Claude-family models when available; use `POST /api/v1/messages` for native content blocks, tool use, thinking, and signatures.
   - `gemini_generate_content`: prefer for Gemini-family models when available; use `POST /api/v1/models/{model}:generateContent` for native `contents`, `parts`, and `generationConfig`.
3. For image models, use `POST /api/v1/images/generations`. Default is synchronous; set `async: true` for async mode with task polling. Async mode returns 202 with a task ID — poll `GET /api/v1/images/generations/:id` for the result. Cancel pending tasks with `POST /api/v1/images/generations/:id/cancel`.
4. For video models, create `POST /api/v1/videos` then poll `GET /api/v1/videos/:id`. Cancel pending tasks with `POST /api/v1/videos/:id/cancel`.
5. For model-specific parameters, fetch per-model docs: `https://shark.ai/docs/{category}/{model-id}/llms.txt`.
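
The format choice in step 2 can be sketched as a small lookup. This is a minimal illustration, not the gateway's own logic: the helper name `choose_language_endpoint` is hypothetical, and the shape of a model entry (a dict with `id` and `supported_protocols`) is an assumption based on step 1.

```python
from urllib.parse import quote

# Protocol IDs and endpoint paths as listed in the decision flow.
ENDPOINTS = {
    "openai_chat_completions": "/api/v1/chat/completions",
    "anthropic_messages": "/api/v1/messages",
    "gemini_generate_content": "/api/v1/models/{model}:generateContent",
}

# Provider-native formats are tried first; OpenAI Format is the fallback.
PREFERENCE = ["anthropic_messages", "gemini_generate_content", "openai_chat_completions"]


def choose_language_endpoint(model: dict) -> tuple[str, str]:
    """Return (protocol_id, path) for one language model entry.

    Assumes `model` carries `id` and `supported_protocols` keys; the exact
    response shape of GET /api/v1/models is not confirmed here.
    """
    supported = model.get("supported_protocols", [])
    for protocol in PREFERENCE:
        if protocol in supported:
            # Only the Gemini path has a {model} placeholder; "/" becomes %2F.
            path = ENDPOINTS[protocol].replace("{model}", quote(model["id"], safe=""))
            return protocol, path
    raise ValueError(f"no known language protocol for {model.get('id')!r}")
```

An agent that needs one interface across providers could instead always pick `openai_chat_completions` when it is listed.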

## Language Model Format Selection

| Format | Protocol ID | Endpoint | Prefer when |
|--------|-------------|----------|-------------|
| OpenAI Format | `openai_chat_completions` | `POST /api/v1/chat/completions` | You need OpenAI SDK compatibility or one common chat interface across providers |
| Anthropic Format | `anthropic_messages` | `POST /api/v1/messages` | Claude/Anthropic model supports it; you need native tool use, thinking blocks, or thinking signatures |
| Gemini Format | `gemini_generate_content` | `POST /api/v1/models/{model}:generateContent` | Gemini model supports it; you need native Gemini `contents`, `parts`, multimodal behavior, or `generationConfig` |

## Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | /api/v1/chat/completions | Text generation (streaming supported) |
| POST | /api/v1/messages | Anthropic Messages for selected text models |
| POST | /api/v1/models/{model}:generateContent | Gemini Generate Content for selected text models |
| POST | /api/v1/images/generations | Image generation (sync or async) |
| GET | /api/v1/images/generations/:id | Poll async image task status |
| POST | /api/v1/images/generations/:id/cancel | Cancel pending image task |
| POST | /api/v1/videos | Create async video task |
| GET | /api/v1/videos/:id | Poll video task status |
| POST | /api/v1/videos/:id/cancel | Cancel pending video task |
| GET | /api/v1/models | List all available models |
| GET | /api/v1/models/:model?category=language | Retrieve one model; category disambiguates duplicate model IDs |

## Authentication

All requests accept Bearer auth; Anthropic SDK clients may send `x-api-key` instead. Use one of the two headers:
```
Authorization: Bearer sk-your-api-key
x-api-key: sk-your-api-key
```

Optional headers:
- `X-App-UID`: User identifier for per-user tracking
- `X-App-User-Credits`: Credit balance for pre-check (returns 402 if insufficient)
- `X-Tracing-ID`: Custom tracing ID
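
For custom HTTP clients, the headers above might be assembled as follows. The function name `build_headers` and its keyword arguments are hypothetical, and sending `X-App-User-Credits` as a plain numeric string is an assumption.

```python
def build_headers(api_key, *, use_x_api_key=False, app_uid=None,
                  user_credits=None, tracing_id=None):
    """Build request headers for a hand-rolled HTTP client."""
    headers = {"Content-Type": "application/json"}
    # Exactly one auth header is sent.
    if use_x_api_key:
        headers["x-api-key"] = api_key
    else:
        headers["Authorization"] = f"Bearer {api_key}"
    # Optional headers are added only when a value is supplied.
    if app_uid is not None:
        headers["X-App-UID"] = str(app_uid)
    if user_credits is not None:
        # Assumption: the balance is passed as a plain number.
        headers["X-App-User-Credits"] = str(user_credits)
    if tracing_id is not None:
        headers["X-Tracing-ID"] = str(tracing_id)
    return headers
```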

## Credit System

- 1 USD = 10,000 credits
- Successful generation responses include a `credit` field when pricing is configured
- Chat: billed per input/output token
- Image: billed per generation
- Video: billed per second at given resolution
- Streaming: credit sent as SSE event `data: {"credit": 4.5}` before `data: [DONE]`
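
As a worked example of the exchange rate, the `credit` value from a response can be converted to USD. The helper names are hypothetical; `Decimal` is used here only to avoid floating-point rounding on small amounts.

```python
from decimal import Decimal

CREDITS_PER_USD = Decimal(10_000)  # 1 USD = 10,000 credits


def credits_to_usd(credits) -> Decimal:
    """Convert a `credit` value from a response into USD."""
    return Decimal(str(credits)) / CREDITS_PER_USD


def usd_to_credits(usd) -> Decimal:
    """Convert a USD amount into credits."""
    return Decimal(str(usd)) * CREDITS_PER_USD
```

With this rate, an image generation reporting `"credit": 400` costs 0.04 USD, and a video task reporting `"credit": 4000` costs 0.40 USD.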

## Error Codes

All errors return: `{"error": {"message": "...", "type": "error_type"}}`

| HTTP | Type | Description | Action |
|------|------|-------------|--------|
| 400 | invalid_request_error | Invalid parameters | Check request body |
| 400 | content_moderation | Content blocked | Adjust prompt |
| 400 | input_too_large | Input exceeds limit | Reduce input |
| 401 | auth_error | Invalid API key | Check API key |
| 402 | insufficient_credits | Not enough credits | Top up |
| 404 | model_not_found | Model not found | Check model ID |
| 429 | rate_limit_error | Too many requests | Retry with backoff |
| 502 | provider_error | Upstream error | Retry later |
| 504 | timeout_error | Timed out | Simplify or retry |
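
A client might classify failures from this envelope as sketched below. The retryable set follows the Action column (429, 502, 504); the function names and the backoff constants are assumptions, not values prescribed by the API.

```python
# Error types whose Action column suggests retrying.
RETRYABLE_TYPES = {"rate_limit_error", "provider_error", "timeout_error"}


def parse_error(status: int, body: dict) -> dict:
    """Flatten the error envelope and flag whether a retry is reasonable."""
    err = body.get("error")
    if not isinstance(err, dict):
        err = {}
    error_type = err.get("type", "unknown")
    return {
        "status": status,
        "type": error_type,
        "message": err.get("message", ""),
        "retryable": error_type in RETRYABLE_TYPES,
    }


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff in seconds; base and cap are illustrative values."""
    return min(cap, base * 2 ** attempt)
```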

## Quick Start

### OpenAI SDK Python
```python
from openai import OpenAI

client = OpenAI(base_url="https://shark.ai/api/v1", api_key="sk-...")
response = client.chat.completions.create(
    model="anthropic/claude-opus-4.7",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```

### OpenAI SDK JavaScript
```javascript
import OpenAI from 'openai';

const client = new OpenAI({ baseURL: 'https://shark.ai/api/v1', apiKey: 'sk-...' });
const response = await client.chat.completions.create({
  model: 'anthropic/claude-opus-4.7',
  messages: [{ role: 'user', content: 'Hello' }],
});
console.log(response.choices[0].message.content);
```

### Anthropic SDK Python
```python
from anthropic import Anthropic

client = Anthropic(base_url="https://shark.ai/api/v1", api_key="sk-...")
response = client.messages.create(
    model="anthropic/claude-opus-4.7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.content[0].text)
```

### Anthropic SDK JavaScript
```javascript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({ baseURL: 'https://shark.ai/api/v1', apiKey: 'sk-...' });
const response = await client.messages.create({
  model: 'anthropic/claude-opus-4.7',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello' }],
});
console.log(response.content[0].text);
```

### Gemini Format JavaScript
```javascript
const model = encodeURIComponent('google/gemini-3.1-flash-image-preview');
const response = await fetch(`https://shark.ai/api/v1/models/${model}:generateContent`, {
  method: 'POST',
  headers: { Authorization: 'Bearer sk-...', 'Content-Type': 'application/json' },
  body: JSON.stringify({
    contents: [{ role: "user", parts: [{ text: "Generate a short product caption." }] }],
    generationConfig: { maxOutputTokens: 512 },
  }),
});
const data = await response.json();
console.log(data.candidates?.[0]?.content?.parts?.[0]?.text);
```

---

## POST /api/v1/messages

Anthropic Messages-compatible endpoint for selected text models. Use this endpoint when you want Anthropic SDK-style messages, `tool_use`/`tool_result` blocks, or thinking blocks.

### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID supporting Anthropic Messages |
| max_tokens | integer | Yes | Maximum output tokens |
| system | string/object array | No | System prompt |
| messages | array | Yes | Anthropic user/assistant messages |
| stream | boolean | No | Enable Anthropic SSE streaming |
| tools | array | No | Anthropic tool definitions |
| thinking | object | No | Extended thinking config for supported models |

### Example
```bash
curl -X POST https://shark.ai/api/v1/messages \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "anthropic/claude-opus-4.7",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}'
```

---

## POST /api/v1/models/{model}:generateContent

Gemini Generate Content-compatible endpoint for selected text models. Prefer this endpoint for Gemini-family models when you need native `contents`, `parts`, multimodal file parts, or `generationConfig`.

Path parameter `model` must be URL-encoded when it contains `/`, for example `google%2Fgemini-3.1-flash-image-preview`.

### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| contents | array | Yes | Gemini content turns with `role` and `parts` |
| systemInstruction | object | No | Gemini system instruction content |
| generationConfig | object | No | Gemini generation parameters such as `temperature`, `topP`, `maxOutputTokens` |
| tools | array | No | Gemini tool declarations for supported models |
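
The URL encoding and the body parameters can be combined in one request builder. This is a sketch only: `build_generate_content_request` is a hypothetical helper, and the `systemInstruction` shape (an object with `parts`) is assumed to mirror a content turn.

```python
from urllib.parse import quote

BASE_URL = "https://shark.ai/api/v1"


def build_generate_content_request(model, user_text, *, system_text=None,
                                   generation_config=None):
    """Return (url, body) for a single-turn Gemini Format request."""
    # safe="" makes quote() encode "/" as %2F, as the path parameter requires.
    url = f"{BASE_URL}/models/{quote(model, safe='')}:generateContent"
    body = {"contents": [{"role": "user", "parts": [{"text": user_text}]}]}
    if system_text is not None:
        # Assumption: systemInstruction uses the same parts shape as a turn.
        body["systemInstruction"] = {"parts": [{"text": system_text}]}
    if generation_config:
        body["generationConfig"] = dict(generation_config)
    return url, body
```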

### Example
```bash
curl -X POST https://shark.ai/api/v1/models/google%2Fgemini-3.1-flash-image-preview:generateContent \
  -H "Authorization: Bearer <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "Hello!"
        }
      ]
    }
  ],
  "generationConfig": {
    "maxOutputTokens": 512
  }
}'
```

---

## POST /api/v1/chat/completions

OpenAI-compatible chat completions. Supports streaming, function calling, and multimodal I/O.

### Common Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID |
| messages | array | Yes | Array of {role, content} objects |
| stream | boolean | No | Enable SSE streaming (default: false) |
| temperature | number | No | 0.0 – 2.0 |
| max_tokens | integer | No | Maximum output tokens |
| top_p | number | No | Nucleus sampling, 0.0 – 1.0 |
| stop | string/array | No | Stop sequences |
| modalities | array | No | Output modalities, e.g. ["text", "image"] |

### Multimodal Input

For image, video, or audio input, pass `content` as an array of typed parts:
```json
{
  "role": "user",
  "content": [
    {
      "type": "text",
      "text": "Describe this image"
    },
    {
      "type": "image_url",
      "image_url": {
        "url": "https://example.com/photo.jpg"
      }
    }
  ]
}
```

Content types: `text`, `image_url`, `video_url`, `audio_url`. Each URL can be an HTTP(S) URL or a base64 data URI.
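
When the image is local rather than hosted, it can be sent as a base64 data URI. The helpers below are hypothetical and show one way to build such a message; the default MIME type is an assumption.

```python
import base64


def image_part_from_bytes(data: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as an `image_url` part with a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url",
            "image_url": {"url": f"data:{mime_type};base64,{encoded}"}}


def user_message(text: str, *parts: dict) -> dict:
    """Build a user message whose content is a text part plus media parts."""
    return {"role": "user", "content": [{"type": "text", "text": text}, *parts]}
```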

### Compatibility Notes

These notes matter in multi-turn conversations when OpenAI Format routes to providers with native signed state. Agents must preserve returned state fields exactly in later turns:

- Gemini image editing: generated image URLs may look like `data:image/png;thought_signature=...;base64,...`. Pass `image_url.url` back unchanged in the next turn.
- Anthropic thinking: assistant messages may include `reasoning_details`. Preserve this field unchanged in later turns so the gateway can restore signed thinking blocks.

Gemini image-edit turn 2 example:
```json
{
  "model": "google/gemini-3.1-flash-image-preview",
  "messages": [
    {
      "role": "user",
      "content": "Generate an image of a dress model."
    },
    {
      "role": "assistant",
      "content": "",
      "images": [
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;thought_signature=abc123;base64,iVBOR..."
          }
        }
      ]
    },
    {
      "role": "user",
      "content": "Make her raise one hand."
    }
  ],
  "modalities": [
    "text",
    "image"
  ]
}
```

Anthropic thinking turn 2 example:
```json
{
  "model": "anthropic/claude-opus-4.7",
  "messages": [
    {
      "role": "user",
      "content": "Think carefully, then answer."
    },
    {
      "role": "assistant",
      "content": "Final answer text.",
      "reasoning_details": [
        {
          "type": "thinking",
          "thinking": "Internal reasoning summary...",
          "signature": "sig_abc123"
        }
      ]
    },
    {
      "role": "user",
      "content": "Continue from your previous reasoning."
    }
  ],
  "thinking": {
    "type": "enabled",
    "budget_tokens": 1024
  },
  "max_tokens": 2048
}
```
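
One way to follow the preservation rule is to append the assistant message exactly as received instead of rebuilding it from selected fields. The helper `next_turn` is hypothetical.

```python
import copy


def next_turn(history: list, assistant_message: dict, user_text: str) -> list:
    """Return a new message list for the following request.

    The assistant message is deep-copied whole, so fields such as
    `reasoning_details` or `images` pass through without modification.
    """
    return [
        *history,
        copy.deepcopy(assistant_message),
        {"role": "user", "content": user_text},
    ]
```

Copying the whole message avoids silently dropping signed fields that a hand-written `{"role": ..., "content": ...}` reconstruction would lose.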

### Response (non-streaming)
```json
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "vendor/model",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 5,
    "total_tokens": 15
  },
  "credit": 1.5
}
```

### Streaming SSE Format
```
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","choices":[],"usage":{"prompt_tokens":10,"completion_tokens":5}}
data: {"credit": 1.5}
data: [DONE]
```
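
A consumer of this stream has to separate content deltas from the usage and credit events. The sketch below works on an iterable of already-decoded lines; `consume_stream` is a hypothetical name, and treating an event with `credit` but no `choices` as the credit event is an assumption drawn from the sample above.

```python
import json


def consume_stream(lines):
    """Collect text, usage, and credit from chat completion SSE lines."""
    text_parts = []
    usage = None
    credit = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        event = json.loads(payload)
        if "credit" in event and "choices" not in event:
            credit = event["credit"]
            continue
        if event.get("usage"):
            usage = event["usage"]
        for choice in event.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
    return {"text": "".join(text_parts), "usage": usage, "credit": credit}
```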

---

## POST /api/v1/images/generations

Image generation with sync (default) and async modes. Responses return Shark AI signed image URLs; these are short-lived, so download them as soon as possible.
This URL normalization applies only to the Image category endpoint. Text models that output images through chat/messages keep their protocol-native response shape.

### Common Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID |
| prompt | string | Yes | Text description of the image |
| async | boolean | No | Set true for async mode (returns 202 with task ID). Default: false |
| timeout | integer | No | Async mode: max wait time in seconds (30-1800). Default: 300 |
| ... | | | Model-specific parameters — see per-model docs |
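
The common parameters can be validated client-side before sending. `build_image_request` is a hypothetical helper; rejecting `timeout` in sync mode is a design choice of this sketch, not documented API behavior.

```python
def build_image_request(model, prompt, *, use_async=False, timeout=None,
                        **model_params):
    """Build the JSON body for POST /api/v1/images/generations."""
    # Model-specific parameters pass through untouched.
    payload = {"model": model, "prompt": prompt, **model_params}
    if use_async:
        payload["async"] = True
        if timeout is not None:
            # Documented range for async mode is 30-1800 seconds.
            if not 30 <= timeout <= 1800:
                raise ValueError("timeout must be between 30 and 1800 seconds")
            payload["timeout"] = timeout
    elif timeout is not None:
        raise ValueError("timeout applies to async mode only")
    return payload
```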

### Sync Response (200)
```json
{
  "created": 1700000000,
  "data": [
    {
      "url": "https://cos.example.com/images/...signed..."
    }
  ],
  "usage": {
    "input_tokens": 14,
    "output_tokens": 1542,
    "total_tokens": 1556
  },
  "credit": 400
}
```

### Async Response (202)
```json
{
  "id": "task-uuid",
  "object": "image.generation.task",
  "status": "pending",
  "created": 1700000000,
  "timeout": 300,
  "polling_url": "/api/v1/images/generations/task-uuid",
  "cancel_url": "/api/v1/images/generations/task-uuid/cancel"
}
```

### GET /api/v1/images/generations/:id — Poll Async Task

Poll every 2-5 seconds. Status: pending → running → completed | failed | cancelled

Completed:
```json
{
  "id": "task-uuid",
  "object": "image.generation.task",
  "status": "completed",
  "data": [
    {
      "url": "https://cos.example.com/images/...signed..."
    }
  ],
  "usage": {
    "input_tokens": 14,
    "output_tokens": 1542,
    "total_tokens": 1556
  },
  "credit": 400
}
```
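
The polling loop can be sketched independently of any HTTP library by injecting the status fetcher. `poll_task` and its parameters are hypothetical; the default interval sits inside the recommended 2-5 second range, and `max_wait` is a client-side limit assumed for illustration.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def poll_task(fetch_status, *, interval=3.0, max_wait=300.0,
              sleep=time.sleep, clock=time.monotonic):
    """Call `fetch_status()` until the task reaches a terminal status.

    `fetch_status` is any callable returning the parsed JSON of
    GET /api/v1/images/generations/:id.
    """
    deadline = clock() + max_wait
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError("task did not finish within max_wait seconds")
        sleep(interval)
```

Because the fetcher is injected, the same loop could poll video tasks by passing a callable that requests `GET /api/v1/videos/:id`.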

### POST /api/v1/images/generations/:id/cancel

Cancel a pending image task. Only `pending` tasks can be cancelled. Returns 409 if already started.

---

## POST /api/v1/videos (async)

Asynchronous video generation. Create a task, then poll for the result. Supports timeout and cancellation.

### Common Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID |
| prompt | string | Yes | Video description |
| duration | integer | No | Duration in seconds |
| resolution | string | No | e.g. 480p, 720p, 1080p |
| aspect_ratio | string | No | e.g. 16:9, 9:16, 1:1 |
| frame_images | array | No | First/last frame for image-to-video |
| timeout | integer | No | Max wait time in seconds (60-3600). Default: 600 |
| seed | integer | No | Random seed (-1 for random) |

### Create Response (202)
```json
{
  "id": "task-uuid",
  "status": "pending",
  "polling_url": "/api/v1/videos/task-uuid"
}
```

### GET /api/v1/videos/:id — Poll Status

Poll every 3-5 seconds. Status: pending → running → completed | failed | cancelled

Completed:
```json
{
  "id": "task-uuid",
  "status": "completed",
  "output": {
    "urls": [
      "https://cdn.example.com/video.mp4"
    ]
  },
  "credit": 4000
}
```

Failed:
```json
{
  "id": "task-uuid",
  "status": "failed",
  "error": "Content was blocked by the safety system."
}
```
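
Once polling returns, the two payload shapes above can be handled in one place. `video_urls` is a hypothetical helper; raising `RuntimeError` on failure or cancellation is a choice made for this sketch.

```python
def video_urls(task: dict) -> list[str]:
    """Return output URLs for a completed video task, or raise otherwise."""
    status = task.get("status")
    if status == "completed":
        return list(task.get("output", {}).get("urls", []))
    if status == "failed":
        # The failed payload carries a human-readable `error` string.
        raise RuntimeError(task.get("error", "video task failed"))
    if status == "cancelled":
        raise RuntimeError("video task was cancelled")
    raise ValueError(f"task is not finished yet: {status!r}")
```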

### POST /api/v1/videos/:id/cancel

Cancel a pending video task. Only `pending` tasks can be cancelled. Returns 409 if already started.

---

## Available Models

> For detailed per-model documentation (full parameter schema, code examples), fetch:
> `https://shark.ai/docs/{category}/{model-id}/llms.txt`
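
The per-model docs URL can be derived from a category and model ID, with `/` in the model ID encoded as `%2F` as in the catalog links. `model_docs_url` is a hypothetical helper; only the Text to `language` mapping is stated in this document, so other category names are passed through lowercased as an assumption.

```python
from urllib.parse import quote

# User-facing "Text" maps to the internal `language` category in docs URLs.
DOCS_CATEGORY = {"text": "language"}


def model_docs_url(category: str, model_id: str) -> str:
    """Build the llms.txt URL for one model."""
    key = category.lower()
    slug = DOCS_CATEGORY.get(key, key)
    return f"https://shark.ai/docs/{slug}/{quote(model_id, safe='')}/llms.txt"
```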

### Image Models

Endpoint: `POST /api/v1/images/generations`

| Model ID | Display Name | Vendor | Supported API Formats | Details |
|----------|-------------|--------|-----------------------|---------|
| `bytedance/seedream-v4.5` | bytedance/seedream-v4.5 | bytedance | /api/v1/images/generations | [details](https://shark.ai/docs/image/bytedance%2Fseedream-v4.5/llms.txt) |
| `google/gemini-3.1-flash-image-preview` | gemini-3.1-flash-image-preview | google | /api/v1/images/generations | in: image, out: image — [details](https://shark.ai/docs/image/google%2Fgemini-3.1-flash-image-preview/llms.txt) |
| `midjourney/text-to-image` | midjourney/text-to-image | midjourney | /api/v1/images/generations | [details](https://shark.ai/docs/image/midjourney%2Ftext-to-image/llms.txt) |
| `openai/gpt-image-2` | gpt-image-2 | openai | /api/v1/images/generations | [details](https://shark.ai/docs/image/openai%2Fgpt-image-2/llms.txt) |
| `x-ai/grok-imagine-image/edit` | x-ai/grok-imagine-image/edit | x-ai | /api/v1/images/generations | [details](https://shark.ai/docs/image/x-ai%2Fgrok-imagine-image%2Fedit/llms.txt) |

### Text Models

Available language endpoints depend on each model's `supported_protocols`. Prefer the provider-native format when available, then fall back to OpenAI Format for broad SDK compatibility:
- OpenAI Format (`openai_chat_completions`): `POST /api/v1/chat/completions`
- Anthropic Format (`anthropic_messages`): `POST /api/v1/messages`
- Gemini Format (`gemini_generate_content`): `POST /api/v1/models/{model}:generateContent`

| Model ID | Display Name | Vendor | Supported API Formats | Details |
|----------|-------------|--------|-----------------------|---------|
| `anthropic/claude-opus-4.6` | claude-opus-4.6 | anthropic | OpenAI Format (`/chat/completions`), Anthropic Format (`/messages`) | in: image — [details](https://shark.ai/docs/language/anthropic%2Fclaude-opus-4.6/llms.txt) |
| `anthropic/claude-opus-4.7` | claude-opus-4.7 | anthropic | OpenAI Format (`/chat/completions`), Anthropic Format (`/messages`) | 1000K ctx, 128K out, in: image — [details](https://shark.ai/docs/language/anthropic%2Fclaude-opus-4.7/llms.txt) |
| `anthropic/claude-sonnet-4.6` | claude-sonnet-4.6 | anthropic | OpenAI Format (`/chat/completions`), Anthropic Format (`/messages`) | in: image — [details](https://shark.ai/docs/language/anthropic%2Fclaude-sonnet-4.6/llms.txt) |
| `deepseek/deepseek-v4-flash` | deepseek-v4-flash | deepseek | OpenAI Format (`/chat/completions`) | [details](https://shark.ai/docs/language/deepseek%2Fdeepseek-v4-flash/llms.txt) |
| `deepseek/deepseek-v4-pro` | deepseek-v4-pro | deepseek | OpenAI Format (`/chat/completions`) | [details](https://shark.ai/docs/language/deepseek%2Fdeepseek-v4-pro/llms.txt) |
| `google/gemini-3.1-flash-image-preview` | gemini-3.1-flash-image-preview | google | OpenAI Format (`/chat/completions`), Gemini Format (`/models/{model}:generateContent`) | in: image, out: image — [details](https://shark.ai/docs/language/google%2Fgemini-3.1-flash-image-preview/llms.txt) |
| `google/gemini-3.1-pro-preview` | gemini-3.1-pro-preview | google | OpenAI Format (`/chat/completions`), Gemini Format (`/models/{model}:generateContent`) | in: image+audio — [details](https://shark.ai/docs/language/google%2Fgemini-3.1-pro-preview/llms.txt) |
| `minimax/minimax-m2.7` | minimax-m2.7 | minimax | OpenAI Format (`/chat/completions`) | [details](https://shark.ai/docs/language/minimax%2Fminimax-m2.7/llms.txt) |
| `openai/gpt-4o` | gpt-4o | openai | OpenAI Format (`/chat/completions`) | in: image — [details](https://shark.ai/docs/language/openai%2Fgpt-4o/llms.txt) |
| `openai/gpt-4o-mini` | gpt-4o-mini | openai | OpenAI Format (`/chat/completions`) | in: image — [details](https://shark.ai/docs/language/openai%2Fgpt-4o-mini/llms.txt) |
| `openai/gpt-5.4` | gpt-5.4 | openai | OpenAI Format (`/chat/completions`) | [details](https://shark.ai/docs/language/openai%2Fgpt-5.4/llms.txt) |
| `x-ai/grok-4-1-fast-non-reasoning` | grok-4-1-fast-non-reasoning | x-ai | OpenAI Format (`/chat/completions`) | [details](https://shark.ai/docs/language/x-ai%2Fgrok-4-1-fast-non-reasoning/llms.txt) |
| `x-ai/grok-4-1-fast-reasoning` | grok-4-1-fast-reasoning | x-ai | OpenAI Format (`/chat/completions`) | [details](https://shark.ai/docs/language/x-ai%2Fgrok-4-1-fast-reasoning/llms.txt) |

### Video Models

Endpoint: `POST /api/v1/videos`

| Model ID | Display Name | Vendor | Supported API Formats | Details |
|----------|-------------|--------|-----------------------|---------|
| `alibaba/wan-2.6` | Wan 2.6 | alibaba | /api/v1/videos | [details](https://shark.ai/docs/video/alibaba%2Fwan-2.6/llms.txt) |
| `alibaba/wan-2.7` | Wan 2.7 | alibaba | /api/v1/videos | [details](https://shark.ai/docs/video/alibaba%2Fwan-2.7/llms.txt) |
| `bytedance/seedance-2.0` | seedance-2.0 | bytedance | /api/v1/videos | [details](https://shark.ai/docs/video/bytedance%2Fseedance-2.0/llms.txt) |