Interdisciplinary Approach to Identify and Characterize COVID-19 Misinformation on Twitter: Mixed Methods Study

Author:

Isip Tan Iris Thiele, Cleofas Jerome, Solano Geoffrey, Pillejera Jeanne Genevive, Catapang Jasper Kyle

Abstract

Background: Studying COVID-19 misinformation on Twitter presents methodological challenges. A computational approach can analyze large data sets, but it is limited when interpreting context. A qualitative approach allows for a deeper analysis of content, but it is labor-intensive and feasible only for smaller data sets.

Objective: We aimed to identify and characterize tweets containing COVID-19 misinformation.

Methods: Tweets geolocated to the Philippines (January 1 to March 21, 2020) containing the words coronavirus, covid, and ncov were mined using the GetOldTweets3 Python library. This primary corpus (N=12,631) was subjected to biterm topic modeling. Key informant interviews were conducted to elicit examples of COVID-19 misinformation and determine keywords. Using NVivo (QSR International) and a combination of word frequency and text search using key informant interview keywords, subcorpus A (n=5881) was constituted and manually coded to identify misinformation. Constant comparative, iterative, and consensual analyses were used to further characterize these tweets. Tweets containing key informant interview keywords were extracted from the primary corpus and processed to constitute subcorpus B (n=4634), of which 506 tweets were manually labeled as misinformation. This training set was subjected to natural language processing to identify tweets with misinformation in the primary corpus. These tweets were further manually coded to confirm labeling.

Results: Biterm topic modeling of the primary corpus revealed the following topics: uncertainty, lawmaker’s response, safety measures, testing, loved ones, health standards, panic buying, tragedies other than COVID-19, economy, COVID-19 statistics, precautions, health measures, international issues, adherence to guidelines, and frontliners. These were categorized into 4 major topics: nature of COVID-19, contexts and consequences, people and agents of COVID-19, and COVID-19 prevention and management. Manual coding of subcorpus A identified 398 tweets with misinformation in the following formats: misleading content (n=179), satire and/or parody (n=77), false connection (n=53), conspiracy (n=47), and false context (n=42). The discursive strategies identified were humor (n=109), fear mongering (n=67), anger and disgust (n=59), political commentary (n=59), performing credibility (n=45), overpositivity (n=32), and marketing (n=27). Natural language processing identified 165 tweets with misinformation. However, a manual review showed that 69.7% (115/165) of tweets did not contain misinformation.

Conclusions: An interdisciplinary approach was used to identify tweets with COVID-19 misinformation. Natural language processing mislabeled tweets, likely due to tweets written in Filipino or a combination of the Filipino and English languages. Identifying the formats and discursive strategies of tweets with misinformation required iterative, manual, and emergent coding by human coders with experiential and cultural knowledge of Twitter. An interdisciplinary team composed of experts in health, health informatics, social science, and computer science combined computational and qualitative methods to gain a better understanding of COVID-19 misinformation on Twitter.
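The counts reported in the Results can be cross-checked with a few lines of arithmetic. The sketch below is only a reader's sanity check, not part of the study's method: the dictionaries copy the figures from the abstract, the variable and function names are illustrative, and the number of tweets confirmed on manual review is an assumption derived by subtraction (165 minus 115) rather than a value stated in the abstract.

```python
# Counts copied from the Results of the abstract.
formats = {
    "misleading content": 179,
    "satire and/or parody": 77,
    "false connection": 53,
    "conspiracy": 47,
    "false context": 42,
}
strategies = {
    "humor": 109,
    "fear mongering": 67,
    "anger and disgust": 59,
    "political commentary": 59,
    "performing credibility": 45,
    "overpositivity": 32,
    "marketing": 27,
}

nlp_flagged = 165         # tweets labeled as misinformation by NLP
not_misinformation = 115  # of those, rejected on manual review


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


format_total = sum(formats.values())
strategy_total = sum(strategies.values())
rejected_share = share(not_misinformation, nlp_flagged)

# Assumption: the remaining flagged tweets were confirmed as misinformation.
# The abstract does not state this figure directly.
confirmed = nlp_flagged - not_misinformation
confirmed_share = share(confirmed, nlp_flagged)

print("format total:", format_total)
print("strategy total:", strategy_total)
print("rejected on review (%):", rejected_share)
print("implied confirmed:", confirmed, "share (%):", confirmed_share)
```

Both the format counts and the discursive strategy counts sum to the 398 tweets identified in subcorpus A, and 115/165 matches the reported 69.7%.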

Publisher

JMIR Publications Inc.

Subject

Health Informatics, Medicine (miscellaneous)
