API Documentation
Welcome to the Voice Clone & Text-to-Speech API documentation. Our API allows you to clone voices and generate natural-sounding speech programmatically.
Quick Start
Base URL
https://your-domain.com/api/v1
Authentication
All API requests require authentication using an API key. Include your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
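A minimal Python sketch of building this header. The auth_headers helper and the VOICE_API_KEY environment variable are illustrative names, not part of the API; reading the key from the environment is just one way to keep it out of source code.

```python
import os
from typing import Dict, Optional


def auth_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return the Authorization header for a request.

    Falls back to the VOICE_API_KEY environment variable, which is a
    hypothetical name chosen for this example.
    """
    key = api_key or os.environ.get("VOICE_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.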
Getting Your API Key
- Sign up for an account
- Navigate to your Dashboard
- Go to API Keys section
- Create a new API key
- Copy the API key for use in your requests
Credits System
Our API uses a credits-based billing system:
- Voice Clone: 1000 credits per voice (fixed cost)
- Text-to-Speech: 1 credit per UTF-8 byte
  - English text: ~1 credit per character
  - Chinese text: ~3 credits per character (CJK characters use 3 bytes in UTF-8)
  - Maximum text length: 5000 bytes (5000 credits)
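The byte-based pricing above can be estimated locally before sending a request. The sketch below assumes the server counts bytes exactly as Python's UTF-8 encoder does; the function names are illustrative.

```python
MAX_TTS_BYTES = 5000  # documented maximum text length


def tts_credits(text: str) -> int:
    """Estimate the TTS cost: one credit per UTF-8 byte."""
    return len(text.encode("utf-8"))


def fits_tts_limit(text: str) -> bool:
    """Return True if the text is within the documented 5000-byte limit."""
    return tts_credits(text) <= MAX_TTS_BYTES


print(tts_credits("Hello"))  # 5 (ASCII: 1 byte per character)
print(tts_credits("你好"))    # 6 (CJK: 3 bytes per character)
```

Note that the limit is in bytes, not characters, so a Chinese text reaches it after roughly 1666 characters.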
Rate Limits
Rate limit headers are included in all responses:
- X-RateLimit-Limit: Maximum requests allowed
- X-RateLimit-Remaining: Remaining requests in current window
- X-RateLimit-Reset: Unix timestamp when the rate limit resets
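A sketch of reading these headers from a response, assuming the reset value is a Unix timestamp in seconds. RateLimit, parse_rate_limit and seconds_until_reset are illustrative names, not part of the API.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimit:
    limit: int      # maximum requests allowed
    remaining: int  # requests left in the current window
    reset_at: int   # Unix timestamp (assumed to be in seconds)


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimit]:
    """Parse the X-RateLimit-* headers; return None if any is missing or malformed."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimit(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None


def seconds_until_reset(rate: RateLimit, now: Optional[float] = None) -> float:
    """Return how long to wait until the window resets (never negative)."""
    current = time.time() if now is None else now
    return max(0.0, rate.reset_at - current)
```

A client could check remaining after each response and pause for seconds_until_reset when it reaches zero, rather than waiting for a 429.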
Complete API System
Endpoint | Method | Credits | Function |
---|---|---|---|
/api/v1/voice-clone | POST | 1000 | Clone voice |
/api/v1/voices | GET | 0 | Get voices list |
/api/v1/tts | POST | Dynamic | Generate speech |
/api/v1/tts/{speechId} | GET | 0 | Query status |
/api/v1/credits | GET | 0 | Query balance |
Typical Usage Flow
- Clone Voice → /api/v1/voice-clone
- Get List → /api/v1/voices?status=COMPLETED
- Select Voice → Select voiceId from the list, or use voiceId from the voice-clone response
- Generate Speech → /api/v1/tts using the selected voiceId
- Query Status → /api/v1/tts/{speechId} (poll until completion)
APIs
Voice Clone API
POST /api/v1/voice-clone
Clone a voice model from an audio file.
Credits Cost
Fixed: 1000 credits per voice clone
Content Type
Content-Type: multipart/form-data
Request Parameters
Parameter | Type | Required | Description |
---|---|---|---|
voice | File | Yes | Audio file (MP3, WAV, WebM, max 20MB) |
title | string | No | Voice name (auto-generated if not provided) |
description | string | No | Voice description (max 500 chars) |
coverImage | string (URL) | No | Cover image URL |
tags | string or JSON | No | Tags as JSON array or comma-separated string |
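The constraints in this table can be checked on the client before uploading, which avoids a round trip for requests that would be rejected. This is a sketch under two assumptions: that "20MB" means 20 * 1024 * 1024 bytes, and that the file extension is a reasonable proxy for the format (the server may inspect the actual content). All names are illustrative.

```python
import json
import os
from typing import List, Union

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".webm"}
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: "20MB" is 20 MiB
MAX_DESCRIPTION_CHARS = 500


def validate_clone_request(filename: str, size_bytes: int, description: str = "") -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 20MB")
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description exceeds 500 characters")
    return problems


def encode_tags(tags: Union[str, List[str]]) -> str:
    """Serialise tags as a JSON array string, one of the two accepted forms."""
    if isinstance(tags, str):
        # Accept a comma-separated string and normalise it to a list.
        tags = [tag.strip() for tag in tags.split(",") if tag.strip()]
    return json.dumps(tags)
```

For example, validate_clone_request("voice.mp3", os.path.getsize(path)) can run just before the multipart upload.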
Example Requests
cURL - Basic (voice file only)
curl -X POST https://your-domain.com/api/v1/voice-clone \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "voice=@/path/to/voice.mp3"
cURL - With Metadata
curl -X POST https://your-domain.com/api/v1/voice-clone \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "voice=@/path/to/voice.mp3" \
-F "title=My Voice Clone" \
-F "description=A custom voice description" \
-F "coverImage=https://example.com/cover.jpg" \
-F 'tags=["male","young","energetic"]'
JavaScript (Fetch)
async function cloneVoice(audioFile) {
  const formData = new FormData();
  formData.append('voice', audioFile); // File object from <input type="file">
  formData.append('title', 'My Voice Clone');
  formData.append('description', 'A custom voice');
  formData.append('tags', JSON.stringify(['male', 'energetic']));
  try {
    const response = await fetch('https://your-domain.com/api/v1/voice-clone', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY'
      },
      body: formData
    });
    const data = await response.json();
    if (response.ok) {
      console.log('Voice cloned successfully!');
      console.log('Voice ID:', data.voiceId);
      console.log('Audio URL:', data.voice.audioUrl);
    } else {
      console.error('Error:', data.error);
      if (data.code === 'USAGE_LIMIT_EXCEEDED') {
        console.log(`Need ${data.needCredits} credits, have ${data.credits}`);
      }
    }
  } catch (error) {
    console.error('Network error:', error);
  }
}
JavaScript (Node.js with FormData)
const FormData = require('form-data');
const fs = require('fs');
const axios = require('axios');

async function cloneVoice() {
  const formData = new FormData();
  formData.append('voice', fs.createReadStream('/path/to/voice.mp3'));
  formData.append('title', 'My Voice Clone');
  formData.append('description', 'A custom voice');
  formData.append('tags', JSON.stringify(['male', 'energetic']));
  try {
    const response = await axios.post(
      'https://your-domain.com/api/v1/voice-clone',
      formData,
      {
        headers: {
          'Authorization': 'Bearer YOUR_API_KEY',
          ...formData.getHeaders()
        }
      }
    );
    console.log('Voice cloned successfully!');
    console.log('Voice ID:', response.data.voiceId);
  } catch (error) {
    console.error('Error:', error.response?.data || error.message);
  }
}

cloneVoice();
Python
import requests

def clone_voice():
    url = "https://your-domain.com/api/v1/voice-clone"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY"
    }
    data = {
        'title': 'My Voice Clone',
        'description': 'A custom voice',
        'tags': '["male","energetic"]'  # JSON string
    }
    try:
        # Open the audio file; the with-block closes it once the upload finishes
        with open('/path/to/voice.mp3', 'rb') as audio:
            response = requests.post(url, headers=headers, files={'voice': audio}, data=data)
        response.raise_for_status()
        result = response.json()
        print("Voice cloned successfully!")
        print(f"Voice ID: {result['voiceId']}")
        print(f"Audio URL: {result['voice']['audioUrl']}")
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        if hasattr(e, 'response') and e.response is not None:
            print(f"Details: {e.response.json()}")

clone_voice()
Success Response
{
"success": true,
"voiceId": "clxxxx",
"voice": {
"id": "clxxxx",
"title": "My Voice Clone",
"description": "A custom voice description",
"coverImage": "https://example.com/cover.jpg",
"audioUrl": "https://your-storage.com/voices/user123/1234567890-voice.mp3",
"duration": 125.3,
"tags": ["male", "young"],
"createdAt": "2024-01-01T00:00:00.000Z"
}
}
Error Responses
Insufficient Credits
{
"error": "Insufficient credits",
"code": "USAGE_LIMIT_EXCEEDED",
"credits": 500,
"needCredits": 1000
}
Invalid File Type
{
"code": "INVALID_PARAMETERS",
"error": "Invalid file type. Supported formats: MP3, WAV, WebM"
}
File Too Large
{
"code": "INVALID_PARAMETERS",
"error": "File size exceeds 20MB limit"
}
Get Voices List API
GET /api/v1/voices
Query the list of the user's cloned voices, with pagination and filtering.
Credits Cost
Free - No credits consumed
Query Parameters
Parameter | Type | Required | Description |
---|---|---|---|
page | number | No | Page number (default: 1, min: 1) |
limit | number | No | Items per page (default: 20, min: 1, max: 100) |
status | string | No | Filter by status: PROCESSING, COMPLETED, FAILED |
type | string | No | Filter by type: USER, SYSTEM |
sortBy | string | No | Sort field: createdAt, updatedAt, title, duration, popularity (default: createdAt) |
sortOrder | string | No | Sort order: asc, desc (default: desc) |
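One way to assemble these query parameters is to validate them against the documented values and let the standard library handle URL encoding. The voices_url helper is illustrative, and BASE_URL is the placeholder domain used throughout this document.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://your-domain.com/api/v1"  # placeholder from this document
STATUSES = {"PROCESSING", "COMPLETED", "FAILED"}
TYPES = {"USER", "SYSTEM"}
SORT_FIELDS = {"createdAt", "updatedAt", "title", "duration", "popularity"}


def voices_url(page: int = 1, limit: int = 20, status: Optional[str] = None,
               type_: Optional[str] = None, sort_by: str = "createdAt",
               sort_order: str = "desc") -> str:
    """Build the GET /voices URL, rejecting values outside the documented ranges."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if status is not None and status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    if type_ is not None and type_ not in TYPES:
        raise ValueError(f"unknown type: {type_}")
    if sort_by not in SORT_FIELDS:
        raise ValueError(f"unknown sort field: {sort_by}")
    if sort_order not in ("asc", "desc"):
        raise ValueError(f"unknown sort order: {sort_order}")

    params = {"page": page, "limit": limit, "sortBy": sort_by, "sortOrder": sort_order}
    if status is not None:
        params["status"] = status
    if type_ is not None:
        params["type"] = type_
    return f"{BASE_URL}/voices?{urlencode(params)}"
```

Failing early on an out-of-range limit or a misspelled status gives a clearer error than an INVALID_PARAMETERS response from the server.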
Example Requests
JavaScript (Fetch)
async function getVoices(page = 1, limit = 20) {
  try {
    const response = await fetch(
      `https://your-domain.com/api/v1/voices?page=${page}&limit=${limit}&status=COMPLETED&sortBy=createdAt&sortOrder=desc`,
      {
        method: 'GET',
        headers: {
          'Authorization': 'Bearer YOUR_API_KEY'
        }
      }
    );
    const data = await response.json();
    if (response.ok) {
      console.log('Total voices:', data.pagination.total);
      console.log('Voices:', data.voices);
      data.voices.forEach(voice => {
        console.log(`- ${voice.title} (${voice.status}), Usage: ${voice.usageCount} times`);
      });
    } else {
      console.error('Error:', data.error);
    }
  } catch (error) {
    console.error('Network error:', error);
  }
}
cURL
curl -X GET "https://your-domain.com/api/v1/voices?page=1&limit=20&status=COMPLETED" \
-H "Authorization: Bearer YOUR_API_KEY"
Python
import requests

def get_voices(page=1, limit=20, status=None):
    url = "https://your-domain.com/api/v1/voices"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY"
    }
    params = {
        "page": page,
        "limit": limit
    }
    if status:
        params["status"] = status
    try:
        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()
        result = response.json()
        print(f"Total voices: {result['pagination']['total']}")
        for voice in result['voices']:
            print(f"- {voice['title']} ({voice['status']}), Usage: {voice['usageCount']} times")
        return result
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")

get_voices(page=1, limit=20, status="COMPLETED")
Success Response
{
"success": true,
"voices": [
{
"id": "clxxxx",
"title": "My Voice Clone",
"description": "A custom voice description",
"coverImage": "https://example.com/cover.jpg",
"audioUrl": "https://your-storage.com/voices/user123/voice.mp3",
"duration": 125,
"status": "COMPLETED",
"type": "USER",
"isPublic": false,
"popularity": 0,
"tags": ["male", "young"],
"usageCount": 15,
"createdAt": "2024-01-01T00:00:00.000Z",
"updatedAt": "2024-01-01T00:00:00.000Z"
}
],
"pagination": {
"page": 1,
"limit": 20,
"total": 1,
"totalPages": 1,
"hasNext": false,
"hasPrev": false
}
}
Response Fields
Field | Type | Description |
---|---|---|
voices | array | Array of voice objects |
voices[].id | string | Voice ID (use this for TTS requests) |
voices[].title | string | Voice name |
voices[].status | string | Processing status (PROCESSING, COMPLETED, FAILED) |
voices[].usageCount | number | Number of times this voice has been used for TTS |
pagination.total | number | Total number of voices |
pagination.hasNext | boolean | Whether there are more pages |
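The pagination.hasNext field makes it straightforward to walk through every page. In the sketch below, fetch_page is a hypothetical callable that takes a page number and returns the parsed JSON body of GET /api/v1/voices for that page; injecting it keeps the loop independent of any particular HTTP client.

```python
from typing import Callable, Dict, Iterator


def iter_voices(fetch_page: Callable[[int], Dict]) -> Iterator[Dict]:
    """Yield every voice across all pages, following pagination.hasNext."""
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["voices"]
        if not body["pagination"]["hasNext"]:
            break
        page += 1
```

Because this is a generator, pages are requested lazily: stopping the iteration early (for example, once a voice with a given title is found) avoids fetching the remaining pages.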
Text-to-Speech API
POST /api/v1/tts
Generate speech from text using a cloned or system voice.
Credits Cost
Dynamic: 1 credit per UTF-8 byte
- English text: ~1 credit per character
- Chinese text: ~3 credits per character (CJK characters use 3 bytes)
- Maximum: 5000 bytes (5000 credits max)
Content Type
Content-Type: application/json
Request Parameters
Parameter | Type | Required | Description |
---|---|---|---|
text | string | Yes | Text to convert to speech (max 5000 UTF-8 bytes) |
voiceId | string | Yes | Voice ID (user's cloned voice or public system voice) |
format | string | No | Output format: mp3, wav, opus, pcm (default: mp3) |
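A sketch of building the request body while enforcing the documented constraints locally. The build_tts_body helper is illustrative; it assumes the server measures the text limit in UTF-8 bytes exactly as Python does.

```python
import json

MAX_TTS_BYTES = 5000
FORMATS = ("mp3", "wav", "opus", "pcm")


def build_tts_body(text: str, voice_id: str, fmt: str = "mp3") -> bytes:
    """Validate the parameters and return the UTF-8 encoded JSON request body."""
    if not text:
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voiceId is required")
    size = len(text.encode("utf-8"))
    if size > MAX_TTS_BYTES:
        raise ValueError(f"text is {size} bytes; the maximum is {MAX_TTS_BYTES}")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    payload = {"text": text, "voiceId": voice_id, "format": fmt}
    # ensure_ascii=False keeps non-ASCII text as-is instead of \u escapes.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")
```

The limit applies to the text field only, not to the whole JSON body, which is why the size check runs before serialisation.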
Example Requests
JavaScript (Fetch)
async function generateSpeech() {
  try {
    const response = await fetch('https://your-domain.com/api/v1/tts', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        text: 'Hello, this is a test speech.',
        voiceId: 'clxxxx',
        format: 'mp3'
      })
    });
    const data = await response.json();
    if (response.ok) {
      console.log('Speech generated!');
      console.log('Speech ID:', data.speechId);
      console.log('Credits used:', data.creditsUsed);
      console.log('Status:', data.speech.status); // PROCESSING
    } else {
      console.error('Error:', data.error);
    }
  } catch (error) {
    console.error('Network error:', error);
  }
}

generateSpeech();
cURL
curl -X POST https://your-domain.com/api/v1/tts \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"text": "Hello, this is a test speech.",
"voiceId": "clxxxx",
"format": "mp3"
}'
Python
import requests

def generate_speech():
    url = "https://your-domain.com/api/v1/tts"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    }
    data = {
        "text": "Hello, this is a test speech.",
        "voiceId": "clxxxx",
        "format": "mp3"
    }
    try:
        response = requests.post(url, headers=headers, json=data)
        response.raise_for_status()
        result = response.json()
        print("Speech generated!")
        print(f"Speech ID: {result['speechId']}")
        print(f"Credits used: {result['creditsUsed']}")
        print(f"Status: {result['speech']['status']}")
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        if hasattr(e, 'response') and e.response is not None:
            print(f"Details: {e.response.json()}")

generate_speech()
Success Response
{
"success": true,
"speechId": "clxxxx",
"creditsUsed": 29,
"speech": {
"id": "clxxxx",
"text": "Hello, this is a test speech.",
"format": "mp3",
"status": "PROCESSING",
"creditsUsed": 29,
"createdAt": "2024-01-01T00:00:00.000Z"
}
}
Note: TTS generation is asynchronous. The response returns immediately with status: "PROCESSING". You need to poll or use webhooks to get the final audio URL when status becomes "COMPLETED".
Error Responses
Invalid Voice
{
"code": "NOT_FOUND",
"error": "Voice not found or not available"
}
Text Too Long
{
"code": "INVALID_PARAMETERS",
"error": "Text exceeds maximum length of 5000 bytes"
}
Insufficient Credits
{
"error": "Insufficient credits",
"code": "USAGE_LIMIT_EXCEEDED",
"credits": 10,
"needCredits": 29
}
Get Speech Status API
GET /api/v1/tts/{speechId}
Query the status and details of a TTS generation task.
URL Parameters
Parameter | Type | Required | Description |
---|---|---|---|
speechId | string | Yes | The speech ID returned from TTS API |
Example Requests
JavaScript (Fetch)
async function checkSpeechStatus(speechId) {
  try {
    const response = await fetch(`https://your-domain.com/api/v1/tts/${speechId}`, {
      method: 'GET',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY'
      }
    });
    const data = await response.json();
    if (response.ok) {
      console.log('Status:', data.speech.status);
      if (data.speech.status === 'COMPLETED') {
        console.log('Audio URL:', data.speech.audioUrl);
        console.log('Duration:', data.speech.duration);
      }
    } else {
      console.error('Error:', data.error);
    }
  } catch (error) {
    console.error('Network error:', error);
  }
}
cURL
curl -X GET https://your-domain.com/api/v1/tts/clxxxx \
-H "Authorization: Bearer YOUR_API_KEY"
Python
import requests

def check_speech_status(speech_id):
    url = f"https://your-domain.com/api/v1/tts/{speech_id}"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY"
    }
    try:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        result = response.json()
        print(f"Status: {result['speech']['status']}")
        if result['speech']['status'] == 'COMPLETED':
            print(f"Audio URL: {result['speech']['audioUrl']}")
            print(f"Duration: {result['speech']['duration']}")
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")

check_speech_status("clxxxx")
Success Response
{
"success": true,
"speech": {
"id": "clxxxx",
"text": "Hello, this is a test speech.",
"audioUrl": "https://your-storage.com/speeches/user123/speech.mp3",
"format": "mp3",
"duration": 3.5,
"status": "COMPLETED",
"creditsUsed": 29,
"createdAt": "2024-01-01T00:00:00.000Z",
"updatedAt": "2024-01-01T00:00:15.000Z",
"voice": {
"id": "clxxxx",
"title": "My Voice",
"coverImage": "https://example.com/cover.jpg",
"type": "USER"
}
}
}
Status Values
Status | Description |
---|---|
PROCESSING | Speech is being generated (audioUrl will be null) |
COMPLETED | Speech generation completed (audioUrl available) |
FAILED | Speech generation failed |
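These three states map naturally onto a polling loop. The sketch below is one possible shape, not the API's prescribed client: get_status is a hypothetical callable returning the parsed JSON of GET /api/v1/tts/{speechId}, and the 3-second interval and 30-attempt cap are arbitrary defaults mirroring the JavaScript polling example in Best Practices.

```python
import time
from typing import Callable, Dict


def wait_for_speech(get_status: Callable[[], Dict], interval: float = 3.0,
                    max_attempts: int = 30,
                    sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll until the speech is COMPLETED and return the speech object.

    Raises RuntimeError on FAILED and TimeoutError if the attempts run out.
    """
    for attempt in range(max_attempts):
        speech = get_status()["speech"]
        if speech["status"] == "COMPLETED":
            return speech
        if speech["status"] == "FAILED":
            raise RuntimeError("Speech generation failed")
        # Still PROCESSING: wait before the next check, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError("Timed out waiting for speech generation")
```

Passing sleep as a parameter lets tests run instantly and lets callers substitute their own scheduling.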
Error Responses
Speech Not Found
{
"code": "NOT_FOUND",
"error": "Speech not found or access denied"
}
Get Credits Balance API
GET /api/v1/credits
Query the current credits balance for the authenticated user.
Example Requests
JavaScript (Fetch)
async function getCreditsBalance() {
  try {
    const response = await fetch('https://your-domain.com/api/v1/credits', {
      method: 'GET',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY'
      }
    });
    const data = await response.json();
    if (response.ok) {
      console.log('Total credits:', data.credits);
      console.log('Regular credits:', data.breakdown.regular);
      console.log('Permanent credits:', data.breakdown.permanent);
      console.log('Reset at:', data.breakdown.resetAt);
    } else {
      console.error('Error:', data.error);
    }
  } catch (error) {
    console.error('Network error:', error);
  }
}
cURL
curl -X GET https://your-domain.com/api/v1/credits \
-H "Authorization: Bearer YOUR_API_KEY"
Python
import requests

def get_credits_balance():
    url = "https://your-domain.com/api/v1/credits"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY"
    }
    try:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        result = response.json()
        print(f"Total credits: {result['credits']}")
        print(f"Regular credits: {result['breakdown']['regular']}")
        print(f"Permanent credits: {result['breakdown']['permanent']}")
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")

get_credits_balance()
Success Response
{
"success": true,
"credits": 5000,
"breakdown": {
"regular": 3000,
"permanent": 2000,
"resetAt": "2024-02-01T00:00:00.000Z"
}
}
Response Fields
Field | Type | Description |
---|---|---|
credits | number | Total available credits (regular + permanent) |
breakdown.regular | number | Monthly credits that reset (from subscription plan) |
breakdown.permanent | number | Purchased credits that never expire |
breakdown.resetAt | string | ISO date when regular credits will reset |
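Given a parsed balance response, a client can decide whether a request is affordable before sending it. The helpers below are illustrative and rely only on the total credits field; the document does not say whether regular or permanent credits are spent first, so the sketch makes no assumption about that.

```python
VOICE_CLONE_CREDITS = 1000  # documented fixed cost per voice clone


def credits_missing(balance: dict, cost: int) -> int:
    """Return how many credits are missing for a request (0 if affordable)."""
    return max(0, cost - balance["credits"])


def affordable_voice_clones(balance: dict) -> int:
    """Return how many voice clones the current total balance covers."""
    return balance["credits"] // VOICE_CLONE_CREDITS
```

With the sample response above (5000 total credits), credits_missing(balance, 1000) is 0 and affordable_voice_clones(balance) is 5.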
Error Responses
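All errors share a base format with code and error fields; some codes add extra fields (for example, USAGE_LIMIT_EXCEEDED also returns credits and needCredits):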
{
"code": "ERROR_CODE",
"error": "Error message"
}
Common Error Codes
Code | Status | Description |
---|---|---|
INVALID_PARAMETERS | 400 | One or more parameters are invalid |
NOT_FOUND | 404 | Resource not found (e.g., voice not available) |
UNAUTHORIZED | 401 | Invalid or missing API key |
USAGE_LIMIT_EXCEEDED | 429 | Insufficient credits to complete the request |
RATE_LIMITED | 429 | Rate limit exceeded |
INTERNAL_ERROR | 500 | Internal server error |
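Because USAGE_LIMIT_EXCEEDED and RATE_LIMITED share HTTP status 429, a client should branch on the code field rather than the status alone. The sketch below shows one way to do that; the returned action labels are hypothetical, and treating 5xx responses as retryable is an assumption rather than documented behavior.

```python
def classify_error(status: int, body: dict) -> str:
    """Map an error response to a suggested client action (labels are illustrative)."""
    code = body.get("code")
    if code == "RATE_LIMITED":
        return "retry_later"      # wait for the rate-limit window to reset
    if code == "USAGE_LIMIT_EXCEEDED":
        return "add_credits"      # retrying will not help until credits are added
    if code == "UNAUTHORIZED":
        return "check_api_key"
    if code in ("INVALID_PARAMETERS", "NOT_FOUND"):
        return "fix_request"
    # Assumption: server-side failures are transient and worth retrying.
    return "retry_later" if status >= 500 else "unknown"
```

Retrying a 429 blindly would loop forever on an insufficient-credits response, which is the main case this distinction guards against.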
Error Response Examples
Insufficient Credits
{
"error": "Insufficient credits",
"code": "USAGE_LIMIT_EXCEEDED",
"credits": 100,
"needCredits": 1000
}
Invalid Parameters
{
"code": "INVALID_PARAMETERS",
"error": "Text exceeds maximum length of 5000 bytes"
}
Unauthorized
{
"code": "UNAUTHORIZED",
"error": "Authentication required"
}
Best Practices
1. Check Credits Before API Calls
Always check your credits balance before making requests, especially for voice cloning:
async function checkCreditsAndCloneVoice(audioFile) {
  // Check if you have enough credits (1000 for voice clone)
  const balance = await fetch('/api/v1/credits', {
    headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
  });
  const { credits } = await balance.json();
  if (credits < 1000) {
    console.error('Insufficient credits. Need 1000, have', credits);
    return;
  }
  // Proceed with voice cloning
  const formData = new FormData();
  formData.append('voice', audioFile);
  const response = await fetch('/api/v1/voice-clone', {
    method: 'POST',
    headers: { 'Authorization': 'Bearer YOUR_API_KEY' },
    body: formData
  });
  return response.json();
}
2. Calculate TTS Credits in Advance
For TTS requests, calculate the required credits before making the API call:
function calculateTTSCredits(text) {
  // 1 credit per UTF-8 byte
  return new TextEncoder().encode(text).length;
}

async function generateSpeechSafely(text, voiceId) {
  const requiredCredits = calculateTTSCredits(text);
  // Reject oversized text locally instead of waiting for INVALID_PARAMETERS
  if (requiredCredits > 5000) {
    throw new Error(`Text is ${requiredCredits} bytes; the maximum is 5000`);
  }
  console.log(`This request will cost ${requiredCredits} credits`);
  // Make the request
  const response = await fetch('/api/v1/tts', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ text, voiceId })
  });
  return response.json();
}
3. Handle Asynchronous TTS Processing
TTS generation is asynchronous. Implement polling or use webhooks:
async function generateAndWaitForSpeech(text, voiceId) {
  // Create TTS request
  const createResponse = await fetch('/api/v1/tts', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ text, voiceId })
  });
  const created = await createResponse.json();
  // Stop here on failure; otherwise speechId would be undefined while polling
  if (!createResponse.ok) {
    throw new Error(`TTS request failed: ${created.error}`);
  }
  const { speechId } = created;
  // Poll for completion
  let attempts = 0;
  const maxAttempts = 30;
  while (attempts < maxAttempts) {
    const statusResponse = await fetch(`/api/v1/tts/${speechId}`, {
      headers: { 'Authorization': 'Bearer YOUR_API_KEY' }
    });
    const data = await statusResponse.json();
    if (!statusResponse.ok) {
      throw new Error(`Status check failed: ${data.error}`);
    }
    if (data.speech.status === 'COMPLETED') {
      console.log('Audio ready:', data.speech.audioUrl);
      return data.speech;
    } else if (data.speech.status === 'FAILED') {
      throw new Error('Speech generation failed');
    }
    // Wait 3 seconds before next check
    await new Promise(resolve => setTimeout(resolve, 3000));
    attempts++;
  }
  throw new Error('Timeout waiting for speech generation');
}
4. Handle Rate Limits
Implement exponential backoff for rate limit errors:
async function makeRequestWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options);
    if (response.status === 429) {
      const resetTime = response.headers.get('X-RateLimit-Reset');
      const waitTime = resetTime
        ? Math.max(1000, (parseInt(resetTime, 10) * 1000) - Date.now())
        : 1000 * Math.pow(2, i); // Exponential backoff
      console.log(`Rate limited. Waiting ${waitTime}ms...`);
      await new Promise(resolve => setTimeout(resolve, waitTime));
      continue;
    }
    return response;
  }
  throw new Error('Max retries exceeded');
}
Support
- Dashboard: Access your account and manage API keys
- Contact: Reach out through your account dashboard for technical support