
Distinct but correct: generating diversified and entity-revised medical response

  • Research Paper
  • Published in: Science China Information Sciences

Abstract

Medical dialogue generation (MDG) is used to build medical dialogue systems for intelligent consultation. Such systems can communicate with patients in real time, thereby improving the efficiency of clinical diagnosis. However, predicting the correct entities and generating responses that are both distinct and correct remain great challenges. Inspired by how doctors actually respond to patients, we treat MDG as a two-stage task: entity prediction and dialogue generation. For entity prediction, we design an ent-mac post pre-training strategy that leverages external medical entity knowledge to enhance the pre-trained model. For dialogue generation, we propose an entity-aware fusion MDG method in which the predicted entities are integrated into the dialogue generation model through different encoding fusion mechanisms, drawing on information from different sources. Because the diverse beam search algorithm can produce responses whose entities deviate from the predicted ones, we propose an entity-revised diverse beam search that corrects the entities in the generated responses while making the responses more distinct. Experimental results on the China Conference on Knowledge Graph and Semantic Computing 2021 (A/B tests) and the International Conference on Learning Representations 2021 (online test) datasets show that the proposed method outperforms several state-of-the-art methods, demonstrating its practicability and effectiveness.
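The abstract compresses several mechanisms, and the entity-revision step lends itself to a small illustration. The Python sketch below is only one plausible reading of that step, not the authors' implementation: candidate responses (e.g., from diverse beam search) are post-edited so that any mentioned medical entity outside the predicted entity set is swapped for the most similar predicted entity, and the most distinct revised candidate is kept. The function names (revise_entities, distinct_n), the similarity threshold, and the toy lexicon are all hypothetical.

```python
# Illustrative sketch only (not the authors' code): post-edit candidate
# responses so their entities agree with the predicted entity set, then
# keep the most distinct candidate.
from difflib import SequenceMatcher

def revise_entities(candidate, predicted_entities, entity_lexicon, threshold=0.5):
    """Replace entities mentioned in `candidate` that are not among the
    predicted entities with the most similar predicted entity."""
    for ent in entity_lexicon:
        if ent in candidate and ent not in predicted_entities:
            # Pick the predicted entity with the highest string similarity.
            best = max(predicted_entities,
                       key=lambda p: SequenceMatcher(None, ent, p).ratio(),
                       default=None)
            if best and SequenceMatcher(None, ent, best).ratio() >= threshold:
                candidate = candidate.replace(ent, best)
    return candidate

def distinct_n(text, n=2):
    """Fraction of unique n-grams: a simple proxy for response distinctness."""
    tokens = text.split()
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / len(ngrams)

# Toy usage with hypothetical diverse-beam-search candidates.
predicted = {"gastritis", "omeprazole"}
lexicon = {"gastritis", "gastric ulcer", "omeprazole", "amoxicillin"}
candidates = [
    "It may be a gastric ulcer ; you could try omeprazole .",
    "This looks like gastritis ; omeprazole may help .",
]
revised = [revise_entities(c, predicted, lexicon) for c in candidates]
print(max(revised, key=distinct_n))  # keep the most distinct revised response
```

The abstract does not specify whether the revision is applied inside the decoding loop or as post-editing of finished candidates; the post-editing variant above is simply the most compact interpretation for illustration.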



Acknowledgements

This work was supported by the National Key Research and Development Project (Grant No. 2018YFB1305200), the National Natural Science Foundation of China (Grant No. 62171183), and the Project of Hunan Provincial Health Commission (Grant No. 202114010841).

Author information


Corresponding author

Correspondence to Shutao Li.


About this article


Cite this article

Li, B., Sun, B., Li, S. et al. Distinct but correct: generating diversified and entity-revised medical response. Sci. China Inf. Sci. 67, 132106 (2024). https://doi.org/10.1007/s11432-021-3534-9

