Numerous initiatives are underway worldwide to disambiguate the entities recorded in research databases. In this paper, we propose a methodology for disambiguating research entities using big data techniques, working from local to global databases. Our objective is to improve data quality in the OpenAlex database by leveraging information from Brazilian databases, particularly the Lattes Platform and the Brazilian Federal Agency for Support and Evaluation of Graduate Education (CAPES). We use Digital Object Identifiers to link records across databases and then compare similar author and institution names with an adaptation of the Levenshtein distance algorithm. The proposed method is straightforward to implement in tabular databases, which contributes to open science practices and offers a practical solution for research information systems. The findings indicate that integrating local and global databases can help address ambiguous names and incomplete metadata.
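As an illustration of the linking step described above, the following is a minimal sketch rather than the authors' implementation. It assumes that records from both sources are available as simple (DOI, author name, author identifier) tuples, that names are compared after accent and case normalization, and that a match is accepted above a hypothetical similarity threshold of 0.8. The abstract does not specify how the Levenshtein distance was adapted, so the length-normalized similarity used here is an assumption, as are all function names.

```python
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace (assumed preprocessing)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance computed row by row with dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # deletion
                current[j - 1] + 1,            # insertion
                previous[j - 1] + (ca != cb),  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Length-normalized similarity in [0, 1]; an assumed adaptation."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def link_authors(local_rows, global_rows, threshold=0.8):
    """Link local author ids to global author ids that share a DOI.

    Both inputs are iterables of (doi, author_name, author_id) tuples.
    The threshold value is hypothetical and would need tuning on real data.
    """
    by_doi = {}
    for doi, name, global_id in global_rows:
        by_doi.setdefault(doi.lower(), []).append((name, global_id))

    links = []
    for doi, name, local_id in local_rows:
        candidates = by_doi.get(doi.lower(), [])
        if not candidates:
            continue
        # Keep only the most similar name among authors of the same work.
        best_name, best_id = max(candidates, key=lambda c: similarity(name, c[0]))
        score = similarity(name, best_name)
        if score >= threshold:
            links.append((local_id, best_id, round(score, 3)))
    return links
```

Restricting the name comparison to authors who share a DOI keeps the number of pairwise comparisons small, which is one reason an approach of this kind is practical on tabular data.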
Rafols, Ismael; Costas, Rodrigo; Bezuidenhout, Louise; Brasil, André
For science and technology to contribute to social justice, scientometric analyses need to produce descriptions that are appropriate to specific contexts and values. Because stakeholder participation in the analyses is crucial for these perspectives to be realized, this requires research information analyses that are “open”. In this paper, we propose that this “openness” in research information concerns several dimensions. First, research information should be open in the sense of being transparent and accessible. Second, research information should be open in the sense of being inclusive and diverse, which includes two dimensions: that stakeholders can actually use it, and that the contents include different types of knowledge. Third, research information should be provided in forms that empower interrogation by stakeholders, so that they can retrieve and construct the descriptions of science and technology that are most appropriate for their context. This last step implies efforts to make visible the many scientific contributions from the Global South that are currently invisible. We call this vision “the multiversatory”: an approach to the observation of science that facilitates pluralistic analyses of the knowledge created across a variety of places and contexts, the pluriverse.