News

This project is a multimodal emotional interaction AI application that integrates three interaction modes: voice, text, and gesture. Built with C# and WPF, the application incorporates Baidu Speech ...
Abstract: With the rise of large language models (LLMs), numerous studies have incorporated LLMs into the speech domain, yielding substantial improvements in sentence-level speech-to-text translation ...