Natural Language Processing and Automated Speech Recognition to Identify Older Adults with Cognitive Impairment Funded Grant uri icon

description

  • Project Summary The purpose of this proposal is to develop two strategies, natural language processing (NLP) and automated speech analysis (ASA), to enable automated identification of patients with cognitive impairment (CI), from mild cognitive impairment (MCI) to Alzheimer’s Disease Related Dementias (ADRD) in clinical settings. The number of older adults in the United States with MCI and ADRD is increasing and yet the ability of clinicians and researchers to identify them at scale has advanced little over recent decades and screening with clinical assessments is done inconsistently. Alternative strategies using available data, like analysis of diagnostic codes in the clinical record or insurance claims, have very low sensitivity. NLP and ASA used with machine learning are technologies that could greatly increase ability to detect MCI and ADRD in clinical contexts. NLP automatically converts text in the electronic health record (EHR) into structured concepts suitable for analysis. Thus, clinicians’ documentation of signs and symptoms or orders of tests and services that reflect or address cognitive limitations can be efficiently captured, possibly long before the clinician uses an ADRD-related diagnostic code. ASA directly measures cognition by recognizing different features of cognition captured in speech. Extracting features through both NLP and ASA could thus provide a unique measure of cognition and its impact on the individual and their caregivers. Early detection of MCI and ADRD can help researchers identify appropriate patients for research and help clinicians and health systems target patients for preventive care and care coordination. For these reasons, more efficient, highly scalable strategies are needed to identify people with MCI and ADRD. The Specific Aims of this proposal are to (1) Develop and validate a ML algorithm using features extracted from the EHR with NLP to identify patients with CI, (2) Develop and validate a ML algorithm using features extracted from ASA of audio recordings of patient-provider encounters during routine primary care visits to identify patients with CI, (3) Develop and validate a ML algorithm using both NLP and ASA extracted features to create an integrated CI diagnostic algorithm. We will develop machine learning algorithms using NLP and ASA extracted features trained against neurocognitive assessment data on 800 primary care patients in New York City and validate them in an independent sample of 200 patients in Chicago. In secondary analyses we will train ML algorithms to identify MCI and its subtypes. This project will be the most rigorous development of NLP, ASA, and ML algorithms for CI yet performed, the first to test ASA in primary care settings, and the first to test NLP and ASA feature extraction strategies in combination. The multi-disciplinary team of clinicians, health services researchers, and neurocognitive and data scientists will apply machine learning to develop these highly scalable, automated technologies for identification of MCI and ADRD. 1

date/time interval

  • 2020 - 2025