BACKGROUND
During the COVID-19 pandemic, dementia patients were identified as a vulnerable population. Twitter became an important source of information for people seeking updates on COVID-19, so identifying tweets relevant to dementia can support dementia patients and their caregivers. However, mining and coding relevant tweets is daunting because of the sheer volume of tweets and the high percentage of irrelevant ones.
OBJECTIVE
The objective of this study was to automate the identification of tweets relevant to dementia and COVID-19 using natural language processing (NLP) and machine learning (ML) algorithms.
METHODS
We employed a combination of NLP and ML algorithms with manually annotated tweets to identify tweets relevant to dementia and COVID-19. We utilized three datasets containing more than 100,000 tweets and assessed the capability of various ML algorithms in correctly identifying relevant tweets.
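As an illustration of what a traditional ML baseline for this kind of relevance classification can look like, the following is a minimal sketch of a bag-of-words naive Bayes classifier trained on labeled tweets. It is not the pipeline used in this study: the tokenizer, the `NaiveBayes` class, and the four example tweets are all hypothetical and were written only for this sketch.

```python
import math
import re
from collections import Counter


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and split into word tokens."""
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z0-9']+", tweet)


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, tweets, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tweet, label in zip(tweets, labels):
            self.word_counts[label].update(tokenize(tweet))
        # Vocabulary is the union of all tokens seen in training.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, tweet):
        total_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for token in tokenize(tweet):
                if token in self.vocab:  # ignore out-of-vocabulary tokens
                    score += math.log((self.word_counts[c][token] + 1) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Hypothetical annotated tweets: 1 = relevant to dementia and COVID-19, 0 = not.
train_tweets = [
    "Caring for my mum with dementia during COVID-19 lockdown is so hard",
    "Care homes need more support for dementia patients in the pandemic",
    "New phone launch today, the camera is great",
    "Great football match last night",
]
train_labels = [1, 1, 0, 0]

model = NaiveBayes().fit(train_tweets, train_labels)
relevant = model.predict("Dementia caregivers need support during COVID-19")
irrelevant = model.predict("The camera on this phone is great")
```

A transfer learning model such as ALBERT would replace the word-count features with representations from a pre-trained language model, but the surrounding workflow (annotate, train, predict, evaluate) has the same shape.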
RESULTS
Pre-trained transfer learning algorithms outperformed traditional ML algorithms in identifying tweets relevant to dementia and COVID-19. Among the algorithms tested, the transfer learning algorithm ALBERT performed best, achieving an accuracy of 0.8292 and an area under the curve (AUC) of 0.8353, and substantially outperformed the other algorithms.
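For readers less familiar with the two reported metrics, the sketch below shows how accuracy and AUC can be computed from a classifier's outputs. The labels, scores, and the 0.5 decision threshold are made-up values chosen for illustration; they are not data from this study, and the abstract does not state which threshold was used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive example scores higher than a randomly chosen negative
    one (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only: 1 = relevant tweet, 0 = irrelevant tweet.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]  # hypothetical model probabilities

# Assumed decision threshold of 0.5 for turning scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)  # 4 of 6 correct
auc = roc_auc(y_true, scores)   # 8 of 9 positive/negative pairs ranked correctly
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are commonly reported together.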
CONCLUSIONS
Transfer learning algorithms like ALBERT are highly effective in identifying topic-specific tweets, even when trained with limited or adjacent data, highlighting their superiority over other ML algorithms. Such an automated approach reduces the workload of manual coding of tweets and facilitates their analysis for researchers and policymakers to support dementia patients and their caregivers.