REVIEW

Natural Language Processing for Radiation Oncology: Personalizing Treatment Pathways

Pages 65-76 | Received 23 Aug 2023, Accepted 29 Jan 2024, Published online: 12 Feb 2024
