
# Changelog

> Follow along with updates across ModelsLab API and Developer Console.

<Update label="2026-04-07" description="Voice Cloning API: 48 Languages, Faster Inference">
  <Frame>
    <img className="block" src="https://assets.modelslab.ai/generations/040d8cfe-1831-497e-93ee-78933315598b.png" alt="Voice Cloning API" />
  </Frame>

  ## Voice Cloning API: Expanded Language Support & Performance Improvements

  The [Voice Cloning](/voice-cloning/voice-cloning) endpoint has been upgraded with broader language coverage, faster inference, and more natural-sounding output.

  #### API Endpoint

  * **Standard API**: `POST /api/v6/voice/text_to_audio`

  #### What's New

  * **48 languages supported** — up from 19, now covering major South Asian, Southeast Asian, Middle Eastern, and European languages
  * **Faster inference** — reduced latency for quicker audio generation
  * **More natural output** — improved prosody and pronunciation across all supported languages
  * **Fully backward compatible** — all existing language values continue to work

  #### Newly Added Languages

  Assamese, Bengali, Finnish, Gujarati, Hebrew, Indonesian, Kannada, Maithili, Malay, Malayalam, Marathi, Min Nan Chinese, Nepali, Odia, Punjabi, Sindhi, Sinhala, Slovak, Swahili, Tamil, Telugu, Ukrainian, Urdu, Vietnamese, Welsh, Yue Chinese
</Update>

<Update label="2026-03-27" description="MusicGen API: New Parameters">
  ## MusicGen API: New Parameters

  Added `duration`, `output_format`, and `bitrate` parameters to the [MusicGen API](/voice-cloning/music-gen).

  #### API Endpoint

  * **Standard API**: `POST /api/v6/voice/music_gen`

  #### New Parameters

  * **`duration`**: Length of the generated music in seconds. Accepts any value from 30 to 480. Default: `30`.
  * **`output_format`**: Choose output format — `wav`, `mp3`, or `flac`. Default: `wav`.
  * **`bitrate`**: Set audio bitrate — `128k`, `192k`, or `320k`. Default: `320k`.
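
The ranges and defaults listed above can be checked on the client before a request is sent. The following is a minimal sketch only: the `prompt` field and the helper name `build_music_gen_payload` are assumptions made for illustration and are not confirmed by this changelog. It builds a request body and does not call the API.

```python
# Allowed values as listed in this changelog entry.
ALLOWED_FORMATS = {"wav", "mp3", "flac"}
ALLOWED_BITRATES = {"128k", "192k", "320k"}


def build_music_gen_payload(prompt, duration=30, output_format="wav", bitrate="320k"):
    """Validate the new MusicGen parameters and return a request body dict.

    `prompt` is a hypothetical field name used for illustration only.
    """
    if not 30 <= duration <= 480:
        raise ValueError("duration must be between 30 and 480 seconds")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(ALLOWED_FORMATS)}")
    if bitrate not in ALLOWED_BITRATES:
        raise ValueError(f"bitrate must be one of {sorted(ALLOWED_BITRATES)}")
    return {
        "prompt": prompt,
        "duration": duration,
        "output_format": output_format,
        "bitrate": bitrate,
    }


# Omitted parameters fall back to the documented defaults.
payload = build_music_gen_payload("lo-fi piano loop", duration=90, output_format="mp3")
```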
</Update>

<Update label="2026-01-26" description="Video API: Scene Maker Deprecated">
  ## Video API: Scene Maker Endpoint Deprecated

  The Scene Maker endpoint has been deprecated and is no longer supported as of January 26, 2026.

  #### Deprecated Endpoint

  * **Standard API**: `POST /api/v6/video/scene_maker`
</Update>

<Update label="2026-02-09" description="Enterprise API: Speech-to-Text Released">
  ## New Enterprise API Endpoint: [Speech-to-Text](/enterprise-api/speech-to-text/speech-to-text)

  Enterprise Speech-to-Text is now available for transcribing audio to text.

  #### API Endpoint

  * **Enterprise API**: `POST /api/v1/enterprise/speech_to_text/transcribe`

  #### Key Features

  * Transcribe speech audio to text
  * Multi-language transcription support
  * Optional timestamp controls with `timestamp_level` (`null`, `word`, `sentence`)
  * Webhook support via `webhook` and `track_id` for async tracking
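
As a rough illustration of how `timestamp_level`, `webhook`, and `track_id` might be combined, the sketch below assembles a JSON request body. It assumes that `null` means JSON `null`, and it uses a hypothetical `audio_url` field and placeholder URLs. None of these details are confirmed by this entry, and nothing is sent over the network.

```python
import json

# Values listed in this changelog entry; None serializes to JSON null.
TIMESTAMP_LEVELS = {None, "word", "sentence"}


def build_transcribe_payload(audio_url, timestamp_level=None, webhook=None, track_id=None):
    """Return a request body dict for a transcription job.

    `audio_url` is a hypothetical field name used for illustration only.
    """
    if timestamp_level not in TIMESTAMP_LEVELS:
        raise ValueError("timestamp_level must be None, 'word', or 'sentence'")
    payload = {"audio_url": audio_url, "timestamp_level": timestamp_level}
    # Only include the async-tracking fields when the caller supplies them.
    if webhook is not None:
        payload["webhook"] = webhook
    if track_id is not None:
        payload["track_id"] = track_id
    return payload


body = json.dumps(
    build_transcribe_payload(
        "https://example.com/meeting.wav",
        timestamp_level="word",
        webhook="https://example.com/hooks/stt",
        track_id="job-42",
    )
)
```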
</Update>

<Update label="2026-02-05" description="Song Generator API: ACE-Step v1.5 Model">
  <Frame>
    <img className="block" src="https://assets.modelslab.ai/generations/5441a55a-b930-4e15-b14d-61fc8b08c3d8.png" alt="Song Generator API" />
  </Frame>

  ## Song Generator API powered by ACE-Step v1.5

  Major upgrade to the Song Generator API with the new **ACE-Step v1.5** model for professional-grade song creation.

  #### API Endpoint

  * **Standard API**: `POST /api/v6/voice/song_generator`

  #### What's New

  * **ACE-Step v1.5 Model**: State-of-the-art AI model for high-quality song generation with vocal synthesis
  * **50+ Languages**: Generate songs with vocals in more than 50 languages, including Arabic, Chinese, Cantonese, and Spanish
  * **Flexible Duration**: Create songs from 30 seconds to 8 minutes (30-480 seconds)
  * **Instrumental Mode**: Generate instrumental versions without vocals using the `instrumental` parameter
  * **Smart Lyrics Generation**: Automatic lyrics generation based on prompt and caption, or use your own lyrics
  * **Advanced Control**:
    * `caption` parameter for music style, instruments, atmosphere, and production style
    * `lyrics` parameter for song structure, vocal styles, and energy control
    * `prompt` parameter for automatic lyrics generation
    * Language-specific vocal synthesis with proper pronunciation

  #### New Documentation

  * **[Song Generator API Reference](/voice-cloning/song-generator)**: Complete API documentation with examples
  * **[Song Generation Guide](/guides/song-generation-guide)**: Professional guide with best practices, caption writing tips, lyrics structure guidance, duration calculation, and real-world examples

  #### Key Features

  * Professional music structure tags: `[Intro]`, `[Verse]`, `[Chorus]`, `[Bridge]`, `[Outro]`
  * Vocal control tags: `[raspy vocal]`, `[whispered]`, `[falsetto]`, `[powerful belting]`
  * Energy control: `[high energy]`, `[building energy]`, `[explosive]`, `[melancholic]`
  * Consistent caption-lyrics matching for optimal results
  * Duration calculation guidelines based on lyrics length and structure
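
To make the structure tags concrete, here is a small sketch that assembles tagged lyrics and pairs them with a `caption`. The layout (each tag on its own line above its section) and the boolean type of `instrumental` are assumptions; the Song Generation Guide is the authoritative source for both. The helper names are hypothetical.

```python
# Structure tags listed in this changelog entry.
STRUCTURE_TAGS = ("Intro", "Verse", "Chorus", "Bridge", "Outro")


def format_lyrics(sections):
    """Join (tag, text) pairs into one lyrics string with bracketed tags."""
    blocks = []
    for tag, text in sections:
        if tag not in STRUCTURE_TAGS:
            raise ValueError(f"unknown structure tag: {tag}")
        # A section with no text (for example an instrumental intro) keeps only its tag.
        blocks.append(f"[{tag}]\n{text}".rstrip())
    return "\n\n".join(blocks)


def build_song_payload(caption, sections, instrumental=False):
    """Return a request body dict; lyrics are omitted for instrumental tracks."""
    payload = {"caption": caption, "instrumental": instrumental}
    if not instrumental:
        payload["lyrics"] = format_lyrics(sections or [])
    return payload


lyrics = format_lyrics([
    ("Intro", ""),
    ("Verse", "City lights are fading"),
    ("Chorus", "We run until the morning"),
])
song = build_song_payload(
    "warm synth-pop, female vocal",
    [("Verse", "City lights are fading")],
)
```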

  #### Getting Started

  Check out the [Song Generation Guide](/guides/song-generation-guide) for detailed examples and best practices for creating professional songs with the ACE-Step v1.5 model.
</Update>

<Update label="2025-11-05" description="Song Generator API and Lyrics Generator API: New Parameters">
  ## New Parameter in Song Generator API Endpoint: [Song Generator API](/voice-cloning/song-generator)

  Added the `model_id` parameter to select between the `diffrhythm-short` and `diffrhythm-long` models for song generation.

  * `diffrhythm-short`: Generates shorter songs with a maximum duration of 1 minute 35 seconds.
  * `diffrhythm-long`: Generates longer songs with a maximum duration of 4 minutes 45 seconds.
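
The two limits work out to 95 and 285 seconds. One way a client could pick a `model_id` from a target song length is sketched below; the helper name `pick_model_id` is hypothetical, and choosing the smallest model that fits is a design assumption rather than documented behavior.

```python
# Maximum durations from this changelog entry, converted to seconds.
MODEL_MAX_SECONDS = {
    "diffrhythm-short": 1 * 60 + 35,  # 95 seconds
    "diffrhythm-long": 4 * 60 + 45,   # 285 seconds
}


def pick_model_id(target_seconds):
    """Return the model with the smallest maximum that still fits the target length."""
    for model_id, max_seconds in sorted(MODEL_MAX_SECONDS.items(), key=lambda kv: kv[1]):
        if target_seconds <= max_seconds:
            return model_id
    raise ValueError(f"no model supports {target_seconds} seconds")


short_choice = pick_model_id(90)
long_choice = pick_model_id(180)
```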

  ## New Parameter in Lyrics Generator API Endpoint: [Lyrics Generator API](/voice-cloning/lyrics-generator)

  Added the `length` parameter to specify the desired length of the generated lyrics.

  * `short`: Generates shorter lyrics, suited to songs of up to 1 minute 35 seconds.
  * `long`: Generates longer lyrics, suited to songs of up to 4 minutes 45 seconds.

  #### API Endpoints

  * **Standard API**: `POST /api/v6/voice/song_generator`
  * **Standard API**: `POST /api/v6/voice/lyrics_generator`

  #### Key Features

  * Select between short and long models for song generation
  * Specify desired length of generated lyrics
</Update>

<Update label="2025-10-30" description="Enterprise API: Qwen Text to Image">
  ## New Enterprise API Endpoint: [Qwen Text to Image](/enterprise-api/qwen/text-to-img)

  Generate high-definition images from text using the Qwen model.

  #### API Endpoint

  * **Enterprise API**: `POST /api/v1/enterprise/qwen/text2img`

  #### Key Features

  * Generate high-definition images from text using the Qwen model
  * Supports various image styles and attributes
  * Resolution up to 1024x1024 pixels
</Update>

<Update label="2025-10-30" description="Video API: Watermark Remover">
  ## New Video API Endpoint: [Watermark Remover](/video-api/watermark-remover)

  Remove watermarks from Sora videos.

  #### API Endpoint

  * **Standard API**: `POST /api/v6/video/watermark_remover`

  #### Key Features

  * Sora watermark detection and removal
  * Preserves video quality
</Update>

<Update label="2025-10-29" description="Image Editing API: Caption">
  ## New Image Editing Endpoint: [Caption](/image-editing/caption)

  Simple and powerful image captioning endpoint to generate descriptive text from images.

  #### API Endpoint

  * **Standard API**: `POST /api/v6/image_editing/caption`

  #### Key Features

  * Automatic image caption generation
  * Customizable caption length (short, normal, long)
  * Supports multiple image formats: `png`, `jpeg`, `jpg`
</Update>

<Update label="2025-10-28" description="Documentation: Flux Kontext Moved">
  ## Flux Kontext Dev Moved to Image Editing API

  The Flux Kontext Image to Image endpoint has moved from the Image Generation API section to the Image Editing API section for better organization.

  * **New Location**: [Image Editing API → Flux Kontext Image to Image](/image-editing/flux-kontext-img-to-image)
  * **Endpoint**: `POST /api/v6/images/img2img`
  * Fixed OpenAPI playground display
</Update>

<Update label="2025-10-27" description="Image Editing API: Qwen Edit">
  ## New Image Editing Endpoint: [Qwen Edit](/image-editing/qwen-edit)

  Added Qwen Edit endpoint for AI-powered image editing using the Qwen model.

  #### API Endpoints

  * **Standard API**: `POST /api/v6/image_editing/qwen_edit`
  * **Enterprise API**: `POST /api/v1/enterprise/image_editing/qwen_edit`

  #### Key Features

  * Prompt-based image editing and manipulation
  * Support for single or multiple images (up to 4 images)
</Update>

<Update label="2025-10-17" description="Interior API: Object Removal and Interior Mixer">
  ## New Interior API Endpoints

  Added two new endpoints to the Interior API for enhanced object manipulation capabilities:

  #### Object Removal

  * **Endpoint**: `POST /api/v6/interior/object_removal`
  * Remove unwanted objects from interior images using AI
  * Parameters: `init_image`, `object_name`, `base64`, `webhook`, `track_id`
  * Simple text-based object identification

  #### Interior Mixer

  * **Endpoint**: `POST /api/v6/interior/interior_mixer`
  * Add objects from one image into another room image
  * Parameters: `init_image`, `object_image`, `prompt`, `width`, `height`, `guidance_scale`, `num_inference_steps`
  * Intelligent object placement with prompt-based positioning
  * Configurable inference steps (default: 8) and guidance scale
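
The Interior Mixer parameters listed above could be assembled as follows. This is a sketch under stated assumptions: only the default of 8 inference steps comes from this entry, so `width`, `height`, and `guidance_scale` are left out of the body unless the caller sets them. The helper name and the example URLs are hypothetical.

```python
# Optional parameters whose defaults are not stated in this changelog entry.
OPTIONAL_FIELDS = {"width", "height", "guidance_scale"}


def build_interior_mixer_payload(init_image, object_image, prompt,
                                 num_inference_steps=8, **optional):
    """Return a request body dict for the Interior Mixer endpoint."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    payload = {
        "init_image": init_image,
        "object_image": object_image,
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
    }
    # Add only the optional fields the caller chose to set.
    payload.update(optional)
    return payload


mixer_payload = build_interior_mixer_payload(
    "https://example.com/room.png",
    "https://example.com/armchair.png",
    "place the armchair next to the window",
    guidance_scale=7.5,
)
```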

  #### Documentation Updates

  * Added complete API reference documentation for both endpoints
  * Updated OpenAPI specification with new schemas
  * Added visual indicators for new endpoints in the overview
</Update>

<Update label="2025-10-02" description="Rate Limits Documentation">
  ## Rate Limits Documentation

  Added comprehensive rate limits documentation with plan-specific queue limits:

  * **Pay-as-you-go plan**: 5 queued API requests
  * **Standard plan**: 10 queued API requests
  * **Unlimited Premium plan**: 15 queued API requests

  #### Key Features

  * **Sequential Processing**: Requests are processed one after another in queue order
  * **Queue Management**: New requests are added to the queue and processed when previous ones complete
  * **Real-time Enforcement**: Limits are enforced in real-time as requests come in
  * **FIFO Processing**: Requests are processed in First-In-First-Out order
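
The queue behavior described above can be modeled locally to reason about how many requests a client may have in flight. This is only an illustrative model: the plan keys and the `RequestQueue` class are hypothetical, and this entry does not say what the API returns when a queue is full.

```python
from collections import deque

# Queue limits per plan, taken from this changelog entry.
QUEUE_LIMITS = {"pay_as_you_go": 5, "standard": 10, "unlimited_premium": 15}


class RequestQueue:
    """Local model of a per-plan FIFO queue with a fixed capacity."""

    def __init__(self, plan):
        self.limit = QUEUE_LIMITS[plan]
        self._pending = deque()

    def submit(self, request_id):
        """Queue a request; return False if the plan's limit is already reached."""
        if len(self._pending) >= self.limit:
            return False
        self._pending.append(request_id)
        return True

    def complete_next(self):
        """Finish the oldest queued request (first in, first out) and return its id."""
        return self._pending.popleft()


queue = RequestQueue("pay_as_you_go")
# Six submissions against a limit of five: the sixth is rejected.
accepted = [queue.submit(f"req-{i}") for i in range(6)]
# Completing the oldest request frees one slot.
first_done = queue.complete_next()
accepted_after_free = queue.submit("req-6")
```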

  #### Enterprise API Updates

  * Added Reset S3 endpoint to Enterprise API General section
  * Updated S3 management capabilities for dedicated servers
</Update>

<Update label="2025-09-30" description="Wan 2.5 Release">
  ## New Model: Wan 2.5

  Added Wan 2.5 to ModelsLab with enhanced video generation capabilities:

  * **Text to Video**: Generate videos from text prompts with audio support
  * **Image to Video**: Transform static images into dynamic videos with sound
  * **Audio Integration**: Built-in audio support for complete multimedia experiences
  * **Enhanced Quality**: Improved motion smoothness and visual realism

  #### Available Models

  * [Wan 2.5 Text to Video](https://modelslab.com/models/alibaba_cloud/wan25-text-to-video-audioSupport)
  * [Wan 2.5 Image to Video](https://modelslab.com/models/alibaba_cloud/wan25-image-to-video)
</Update>

<Update label="2025-09-18" description="NSFW Image Check: threshold parameter">
  ## Added threshold parameter to NSFW Image Check

  * Added `threshold` parameter to `POST /nsfw_image_check`.
    * Type: number; range: 0–1; default: 0.5.
    * Controls sensitivity for NSFW detection in images/videos.
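
A client can validate `threshold` against the documented range before sending a request. In this minimal sketch the `init_image` field name and the helper name are assumptions for illustration; only the type, range, and default of `threshold` come from this entry.

```python
def build_nsfw_check_payload(init_image, threshold=0.5):
    """Return a request body dict, rejecting thresholds outside 0 to 1.

    `init_image` is a hypothetical field name used for illustration only.
    """
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    return {"init_image": init_image, "threshold": threshold}


default_check = build_nsfw_check_payload("https://example.com/photo.jpg")
custom_check = build_nsfw_check_payload("https://example.com/photo.jpg", threshold=0.8)
```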
</Update>
