mirror of
https://github.com/Soul-AILab/SoulX-Podcast.git
synced 2026-05-06 13:40:25 +08:00
Major updates:
- Add vLLM engine support with automatic fallback to HuggingFace
- Complete REST API implementation with sync/async modes
- Add comprehensive API documentation
- Organize scripts into dedicated directory

API Features:
- Support both HuggingFace and vLLM inference engines
- Sync and async generation endpoints
- Task queue management with concurrency control
- Health check with engine information
- Automatic file cleanup

Configuration:
- Environment variable based configuration
- Engine validation and auto-fallback
- Configurable concurrency limits

Documentation:
- README_API.md: Complete API usage guide
- CHANGELOG_API.md: API version history
- VLLM_UPGRADE_SUMMARY.md: Detailed upgrade guide
- scripts/README.md: Scripts documentation

Scripts Organization:
- Move all test and utility scripts to scripts/
- Add configuration test script
- Add singleton pattern test

Performance:
- vLLM engine provides 2-3x speedup
- Better GPU memory utilization
- Support for prefix caching

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
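
Note on engine validation and auto-fallback: the commit message describes environment-variable-based engine selection that falls back to HuggingFace when vLLM cannot be used. A minimal sketch of how such a check might look is shown here. The variable name `SOULX_ENGINE`, the engine labels `"hf"` and `"vllm"`, and the function `resolve_engine` are hypothetical and are not taken from the repository; see README_API.md for the actual settings.

```python
import importlib.util
import os

# Hypothetical engine labels; the real API may use different names.
SUPPORTED_ENGINES = ("hf", "vllm")


def resolve_engine(requested=None, vllm_available=None):
    """Pick an inference engine, falling back to HuggingFace ("hf").

    requested: engine name; when None it is read from the (assumed)
        SOULX_ENGINE environment variable, defaulting to "hf".
    vllm_available: override for the vLLM availability check, mainly
        useful in tests; when None the check looks for an importable
        `vllm` package.
    """
    if requested is None:
        requested = os.environ.get("SOULX_ENGINE", "hf")
    engine = requested.strip().lower()

    # Validation: unknown engine names fall back to the default.
    if engine not in SUPPORTED_ENGINES:
        return "hf"

    # Auto-fallback: use vLLM only if the package can be imported.
    if engine == "vllm":
        if vllm_available is None:
            vllm_available = importlib.util.find_spec("vllm") is not None
        if not vllm_available:
            return "hf"

    return engine
```

With this sketch, a service could call `resolve_engine()` once at startup and report the result from its health check, so clients can see which engine is actually in use after any fallback.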