New AI Speech Framework Offers Reliable Suicide Risk Detection Across Multiple Tasks
Researchers from the Shanghai Artificial Intelligence Laboratory and Tsinghua University have developed a speech-based AI model, MoDE, that unifies ten different assessment tasks to improve suicide risk detection among adolescents. The system not only achieves higher accuracy than task-specific models but also offers more reliable confidence calibration, making it a promising aid for clinical use.
Suicide continues to be one of the leading causes of death among adolescents, and the challenge of identifying those at risk in time remains a pressing public health issue. Traditional detection methods often depend on direct self-disclosure, yet many young people struggle to articulate their distress. In this context, researchers from the Shanghai Artificial Intelligence Laboratory and Tsinghua University’s Department of Electronic Engineering and Vanke School of Public Health have presented a breakthrough study that leverages artificial intelligence to detect suicide risk through speech. Their innovation unifies diverse speech-based assessment tasks into a single large language model (LLM)-powered framework, showing not only improved accuracy but also enhanced reliability. This step marks a significant shift from fragmented approaches to a more practical, scalable solution that could support clinicians in the future.
Voices of Adolescents as a Window to Risk
The study draws on one of the most comprehensive adolescent speech datasets assembled to date. It involves 1,223 Chinese adolescents aged 10 to 18, more than half of whom were clinically assessed as being at suicide risk using the well-established MINI-KID suicidality module. Participants were guided through ten structured speech tasks designed to draw out a range of cognitive, emotional, and linguistic features. These included the Animal Fluency Test, self-introduction, recalling a happy memory, explaining strategies for managing distress, reading passages and word lists, describing faces with various emotions, and creatively suggesting possible uses for an empty box. Each task offered unique insights into mental states, with acoustic cues such as pitch, jitter, and fluency complementing linguistic markers. Collectively, these tasks provided a nuanced dataset, ethically collected under strict oversight, to test how effectively speech could be used for suicide risk detection.
Building a Smarter AI: DoRA and MoDE
At the technological core of the research lies Qwen2.5-Omni-7B, a multimodal large language model capable of processing both voice and text. To make it efficient and adaptable, the team used DoRA (Weight-Decomposed Low-Rank Adaptation), which allows selective fine-tuning of model weights without the computational burden of retraining everything. Building on this, the researchers introduced MoDE (Mixture of DoRA Experts)—an innovative architecture where multiple specialised experts are dynamically activated based on the nature of the task. A routing mechanism selects the relevant experts for each speech input, enabling the system to balance task-specific knowledge with cross-task synergies. To stabilise training and avoid overreliance on any single expert, the model integrated load balancing and temperature scaling mechanisms. This design allowed MoDE to automatically discover optimal task–expert alignments, creating a more flexible and robust detection system.
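The routing idea described above can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the authors' implementation: it simplifies DoRA's magnitude-and-direction weight decomposition to a plain low-rank residual, and the class names (`DoRAExpert`, `MoDELayer`), the squared-load balancing penalty, and all dimensions are illustrative choices rather than details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DoRAExpert(nn.Module):
    """Illustrative expert: a low-rank residual update (simplified stand-in
    for DoRA's full magnitude/direction decomposition)."""
    def __init__(self, dim, rank=8):
        super().__init__()
        self.A = nn.Linear(dim, rank, bias=False)  # down-projection
        self.B = nn.Linear(rank, dim, bias=False)  # up-projection
        nn.init.zeros_(self.B.weight)              # start as a zero update

    def forward(self, x):
        return self.B(self.A(x))                   # low-rank delta

class MoDELayer(nn.Module):
    """Mixture of DoRA Experts: a learned router produces a soft blend of
    experts per input, with temperature scaling on the gate logits and a
    load-balancing penalty to discourage collapse onto one expert."""
    def __init__(self, dim, num_experts=4, rank=8, temperature=1.0):
        super().__init__()
        self.experts = nn.ModuleList(
            DoRAExpert(dim, rank) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)
        self.temperature = temperature

    def forward(self, x):
        # x: (batch, dim) pooled speech/text features
        gates = F.softmax(self.router(x) / self.temperature, dim=-1)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (B, E, D)
        mixed = (gates.unsqueeze(-1) * expert_out).sum(dim=1)          # (B, D)
        # Auxiliary loss: E * sum(load^2) is minimised when usage is uniform
        load = gates.mean(dim=0)
        aux_loss = (load * load).sum() * len(self.experts)
        return x + mixed, aux_loss
```

In this sketch the router sees each input and weights the experts softly; in a trained model, inputs from different speech tasks would drive different gate patterns, which is how task–expert alignments can emerge without manual assignment.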
Accuracy, Trust, and Insights from the Results
The results demonstrated clear advantages of the MoDE approach. Compared with baseline models that were either trained separately on each task or jointly tuned across all tasks, MoDE achieved a 4.5 percent average improvement in accuracy. Beyond raw performance, the system also excelled in confidence calibration, meaning its probability estimates were more closely aligned with actual outcomes. This reliability is critical in medical contexts, where false positives and false negatives can carry heavy consequences. The study also explored how experts within the system specialised across tasks: reading-based exercises strongly activated one expert, facial description tasks engaged another, and the creative empty-box exercise drew on a unique blend of experts. When researchers attempted manual supervision by pre-assigning experts to tasks, results actually worsened, suggesting that automatic routing captured hidden cross-task synergies better than human-guided assignments. Moreover, rejection analysis showed that when low-confidence predictions were withheld, accuracy on the retained subset increased significantly, underscoring MoDE's practical value for decision support in clinical practice.
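The two reliability measures mentioned here, confidence calibration and rejection analysis, are standard and easy to illustrate. The sketch below is not taken from the study: the function names, the ten-bin expected-calibration-error formulation, and the synthetic numbers are assumptions chosen only to show the mechanics.

```python
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    """ECE: weighted average gap between accuracy and confidence,
    computed over equal-width confidence bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece

def accuracy_with_rejection(conf, correct, reject_frac):
    """Withhold the lowest-confidence fraction of predictions and
    score accuracy on the retained, higher-confidence subset."""
    n_keep = int(len(conf) * (1.0 - reject_frac))
    keep = np.argsort(conf)[::-1][:n_keep]  # most confident first
    return correct[keep].mean()
```

For example, with confidences `[0.9, 0.8, 0.6, 0.55]` and correctness `[1, 1, 0, 1]`, overall accuracy is 0.75, but rejecting the bottom half raises accuracy on the retained subset to 1.0, which is the pattern the study reports: withheld low-confidence cases leave a more trustworthy remainder for clinicians.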
Ethical Considerations and Future Promise
While the findings are promising, the researchers emphasised important limitations and ethical considerations. The conclusions are based on the MINI-KID suicidality module, which assesses current risk through self-reporting. Although reliable, it cannot capture the full complexity of suicidal behaviour or predict future outcomes with certainty. As such, the system should be regarded as an aid to clinicians rather than a replacement for professional judgment. The authors also stressed the importance of privacy, careful deployment, and ensuring human oversight when integrating artificial intelligence into mental health care.
Despite these caveats, the study signals a major advancement in suicide risk detection. By consolidating multiple assessment tasks into one flexible model, MoDE enhances both accuracy and trustworthiness while reducing the need for fragmented systems. Its ability to reveal inter-task relationships and self-organise expert specialisation demonstrates the power of cross-task learning. For practitioners, this means a single tool could potentially provide richer and more reliable insights into adolescent mental health. For policymakers and researchers, it underscores the value of developing scalable, ethically grounded technologies that could one day help save young lives.
The collaboration between the Shanghai Artificial Intelligence Laboratory and Tsinghua University showcases how artificial intelligence can be responsibly harnessed in sensitive domains. By unifying diverse speech-based assessments within a single architecture, their model not only raises the technical bar for suicide risk detection but also opens the door to practical deployment in clinics, schools and community programs. In an era where adolescent mental health challenges are intensifying, this research represents a hopeful stride toward early, accurate, and trustworthy interventions that place human wellbeing at the centre of technological progress.
- FIRST PUBLISHED IN:
- Devdiscourse