Classifying depression severity in online chats through human-coded psycholinguistic analysis using DSM-5 and Beck Depression Inventory
DOI: https://doi.org/10.33736/jcshd.10058.2025
Keywords: online chats, depression, mental health, social media, psycholinguistic
Abstract
Despite advances in artificial intelligence, accurately detecting the severity of depression in online communications remains a challenge, underscoring the need for expert-led psycholinguistic analysis. This study employs such an approach to examine depression and other mental health issues in online chat data. Five mental health professionals classified depression severity using the DSM-5 and the Beck Depression Inventory, and their ratings were tested for inter-rater reliability. Human-coded psycholinguistic analysis adds clinical nuance to the classification. A random sample of 4,000 chat entries was analysed, with each professional independently assigning every entry to one of five categories: no depression, mild, moderate, severe, or unknown. The analysis showed a high average inter-rater reliability, indicating substantial agreement among raters. Results revealed that 7% of chats exhibited some level of depression (2% mild, 2% moderate, 3% severe), 19% indicated other mental health issues such as anxiety, and 58% were ambiguous. These findings suggest that psycholinguistic analysis of online communication has strong potential for early detection of mental health issues. Integrating such features into digital tools could enhance early identification on online platforms, enabling timely intervention and better mental health support within communities.
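The abstract reports a high average inter-rater reliability but does not name the statistic used. As an illustration only, the sketch below assumes the average is a mean of pairwise Cohen's kappa values across the five raters; the function names, category labels, and toy ratings are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labelled the same items."""
    n = len(a)
    # Proportion of items on which the two raters agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(ratings):
    """Average Cohen's kappa over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: 5 raters x 6 chat entries (not study data).
toy_ratings = [
    ["none", "mild", "severe", "unknown", "none", "moderate"],
    ["none", "mild", "severe", "unknown", "none", "moderate"],
    ["none", "mild", "severe", "none", "none", "moderate"],
    ["none", "moderate", "severe", "unknown", "none", "moderate"],
    ["none", "mild", "severe", "unknown", "none", "mild"],
]

print(round(mean_pairwise_kappa(toy_ratings), 3))
```

A multi-rater statistic such as Fleiss' kappa would be an equally plausible reading of "average inter-rater reliability"; the pairwise form is shown here only because it is the simplest to follow.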
License
Copyright (c) 2025 UNIMAS Publisher

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.