Assessing Data Quality in the Age of Digital Social Research: A Systematic Review

Author:

Daikeler Jessica (1), Fröhling Leon (1), Sen Indira (2), Birkenmaier Lukas (1), Gummer Tobias (1, 3), Schwalbach Jan (1), Silber Henning (1, 3), Weiß Bernd (1), Weller Katrin (1), Lechner Clemens (1)

Affiliation:

1. GESIS - Leibniz Institute for the Social Sciences, Germany

2. University of Konstanz, Germany

3. University of Mannheim, Mannheim, Germany

Abstract

While survey data have long been the focus of quantitative social science analyses, observational and content data, although long-established, are gaining renewed attention, especially when they are obtained by and for observing digital content and behavior. Today, digital technologies allow social scientists to track “everyday behavior” and to extract opinions from public discussions on online platforms. These new types of digital traces of human behavior, together with computational methods for analyzing them, have opened new avenues for understanding and addressing social science research questions. However, even the most innovative and extensive data are hollow if they are not of high quality. But what does data quality mean for modern social science data? To investigate this rather abstract question, the present study pursues four objectives. First, we provide researchers with a decision tree to identify appropriate data quality frameworks for a given use case. Second, we determine which data types and quality dimensions are already addressed in the existing frameworks. Third, we identify gaps in the existing frameworks, with respect to data types and data quality dimensions, that need to be filled. Fourth, we provide a detailed literature overview of the intrinsic and extrinsic perspectives on data quality. By conducting a systematic literature review based on text mining methods, we identified and reviewed 58 data quality frameworks. In our decision tree, three categories help researchers find appropriate data quality frameworks: the data type a framework covers, the perspective it takes, and its level of granularity. Furthermore, we discovered gaps in the available frameworks with respect to visual and especially linked data, and we point out in our review that even well-known frameworks might miss important aspects. The article ends with a critical discussion of the current state of the literature and potential future research avenues.

Publisher

SAGE Publications

