Dataset for paper: " Knowledge Graph Embeddings based Approach for Author Name Disambiguation using Literals" (Q6709459)
From MaRDI portal
Dataset published at Zenodo repository.
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Dataset for paper: " Knowledge Graph Embeddings based Approach for Author Name Disambiguation using Literals" | Dataset published at Zenodo repository. | |
Statements
This dataset consists of two distinct scholarly knowledge graphs (KGs) created from two publicly available bibliographic datasets: 1) a triplestore covering information about the journal Scientometrics, provided by OpenCitations (available here), and 2) the AMiner AND benchmark from 2018 (available here). These KGs were extracted for a research project on knowledge graph embeddings (KGEs) for author name disambiguation (AND). The structural triples of each knowledge graph are split into training, testing and validation sets for applying representation learning methods. Textual and numeric literals are stored separately so that multimodal approaches to KGEs can be implemented (see arXiv:1802.00934). For the same reason, and to simplify the representation learning task, the textual and numeric literals are already encoded as sentence embeddings and as a numeric matrix, in the files textual_literals.npy and numeric_literals.npy respectively. The file and_eval.json of each KG contains the evaluation dataset used to evaluate our AND architecture. For the scripts used to gather this dataset, see https://github.com/sntcristian/and-kge/tree/main/src/AMiner-534K and https://github.com/sntcristian/and-kge/tree/main/src/OC-782K.
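As a minimal sketch of how the precomputed literal files named above might be read for one KG, the Python snippet below loads textual_literals.npy, numeric_literals.npy and and_eval.json from a single directory. The function name `load_kg_literals` and the assumption that all three files sit together in one folder per KG are hypothetical and are not confirmed by the dataset description. The array shapes and the structure of the JSON file are not assumed either.

```python
import json
from pathlib import Path

import numpy as np


def load_kg_literals(kg_dir):
    """Load the literal matrices and evaluation data of one KG.

    Assumption: the three files described in the dataset text are
    located directly inside `kg_dir` (hypothetical layout).
    """
    kg_dir = Path(kg_dir)
    # Sentence embeddings of the textual literals.
    textual = np.load(kg_dir / "textual_literals.npy")
    # Matrix holding the numeric literals.
    numeric = np.load(kg_dir / "numeric_literals.npy")
    # Evaluation dataset; parsed as generic JSON, structure not assumed.
    with open(kg_dir / "and_eval.json", encoding="utf-8") as f:
        and_eval = json.load(f)
    return textual, numeric, and_eval
```

A call such as `textual, numeric, and_eval = load_kg_literals("path/to/kg")` would then return two NumPy arrays and the parsed evaluation data. The path is a placeholder for wherever the Zenodo archive has been unpacked.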
11 November 2021