Mosaic

The Infinite Model of Humanity

Changelog

Track the latest features and improvements of Mosaic

Mosaic M1 2025-10-22

Video Avatar Generation: Added a 'Talking Head' API that animates a photo from audio or text input to generate a speaking video.
Long-term Memory: The API now supports session context and a vector memory store, allowing the AI to 'remember' past conversations for coherent multi-turn dialogue.
Deep Personality Fine-tuning: Added a fine-tuning interface for deep customization of the AI's language style using private datasets, so customization is no longer limited to prompts.
Performance Optimization: API response speed increased by 40% and memory usage decreased by 25%.
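
As a rough illustration of how a vector memory store can support multi-turn recall, the sketch below ranks stored utterances by cosine similarity to a new query. It is not Mosaic's implementation or API: `MemoryStore`, `embed`, and `cosine` are hypothetical names, and the bag-of-words `embed` function is a toy stand-in for a real embedding model.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryStore:
    """Hypothetical in-memory store: keeps past utterances with their vectors."""

    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def recall(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = MemoryStore()
store.add("my dog is named Biscuit")
store.add("I work as a nurse in Lisbon")
top = store.recall("what is my dog called", k=1)
print(top)
```

In a real deployment the recalled text would typically be prepended to the model's context for the next turn, so that earlier facts stay available without resending the whole conversation.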

Mosaic M1 2025-01-02

Core Interaction API: Provided a complete RESTful API chain for Speech-to-Text (STT), Large Language Model (LLM), and Text-to-Speech (TTS).
High-Fidelity Voice Cloning: Supports 'Few-shot' voice cloning, generating a target person's voice from just a few minutes of audio.
Multimodal Emotion Perception: The API analyzes emotion from user text, speech tone, and images, so the AI can 'understand' user sentiment and respond accordingly.
Personality & Knowledge Base: Supports setting AI personality via System Prompts and attaching private knowledge bases through RAG.
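
The STT, LLM, and TTS stages described above form a chain in which each stage's output feeds the next. A minimal sketch of that flow follows, assuming each stage is exposed as a plain callable. `VoicePipeline` and the `fake_*` functions are hypothetical placeholders, not Mosaic endpoints; in practice each would wrap a call to the corresponding REST API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical wrapper chaining speech-to-text, a language model, and text-to-speech."""

    stt: Callable[[bytes], str]        # audio -> transcript
    llm: Callable[[str, str], str]     # (system prompt, user text) -> reply
    tts: Callable[[str], bytes]        # reply text -> audio
    system_prompt: str = "You are a friendly companion."

    def respond(self, audio: bytes) -> bytes:
        transcript = self.stt(audio)
        reply = self.llm(self.system_prompt, transcript)
        return self.tts(reply)


# Stub stages so the sketch runs without any network access (assumptions).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(system_prompt: str, user_text: str) -> str:
    return f"[{system_prompt}] You said: {user_text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_stt, fake_llm, fake_tts, system_prompt="Be brief.")
out = pipeline.respond(b"hello")
print(out)
```

Keeping the stages as injectable callables makes it straightforward to swap in a cloned voice for the TTS step or to prepend retrieved knowledge-base passages to the user text before the LLM call.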

Success Stories

Integrated into AI companion products, enabling emotional and conversational experiences.

Powers intelligent toys with real-time speech, emotion, and vision understanding.

Used in AI-driven digital humans and character cloning, enabling realistic voices and personalities.