Abstract
Collaborative R&D has attracted growing interest among scholars and policy-makers, and collaboration is now regarded as a pivotal determinant of innovation. Reliable data are a necessary condition for obtaining valid results. In a collaborative setting in particular, organizations must not be mistaken for one another. In many datasets, however, the same organization appears under several different names, so that its information is split across multiple entities. In this work, we propose a novel methodology to disambiguate organization names. We combine supervised and unsupervised techniques into a “hybrid” methodology that is neither fully automated nor completely manual and that is easy to adapt to many different datasets. Its flexibility and potential scalability make it relevant to several research fields. We apply the methodology to the dataset of participants in projects funded by the first three European Framework Programmes. We chose this dataset because it lets us test the quality of our procedure by comparing the refined dataset it returns with a well-recognized benchmark (the EUPRO database) in terms of the connection structure of the collaborative networks. Our results show the advantages of our approach in terms of both the quality of the resulting dataset and the efficiency of the methodology, and they leave room for integrating affiliation hierarchies in the future.
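As a rough illustration of what a workflow that is "neither fully automated nor completely manual" can look like, the sketch below normalizes organization names, scores candidate pairs with a generic string-similarity measure, merges near-certain pairs automatically, and queues borderline pairs for human review. This is not the procedure described in the paper: the abstract does not specify the techniques used, so the similarity measure (Python's `difflib.SequenceMatcher`), the thresholds, the `LEGAL_FORMS` list, and all function names are assumptions made purely for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal-form tokens to ignore; not taken from the paper.
LEGAL_FORMS = {"ltd", "inc", "gmbh", "ag", "spa", "sa", "plc"}


def normalize(name: str) -> str:
    """Lowercase, drop periods and other punctuation, remove legal-form tokens."""
    text = name.lower().replace(".", "")
    tokens = re.sub(r"[^\w ]", " ", text).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized organization names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def triage(names, auto=0.95, review=0.80):
    """Split candidate pairs into automatic merges and pairs needing manual review.

    The thresholds `auto` and `review` are illustrative assumptions.
    """
    merged, to_review = [], []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = similarity(a, b)
            if score >= auto:
                merged.append((a, b))      # confident match: merge automatically
            elif score >= review:
                to_review.append((a, b))   # borderline: hand to a human
    return merged, to_review
```

Calling `triage(["Siemens AG", "SIEMENS A.G.", "Philips", "Phillips Ltd"])` would merge the two Siemens variants automatically and send the Philips/Phillips pair to manual review. The all-pairs comparison is quadratic in the number of names, so a dataset of realistic size would need some form of blocking before this step.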
Funder
Università degli Studi di Roma La Sapienza
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences, Computer Science Applications, General Social Sciences