Gitome: A curated dataset for GitHub README-related tasks
DOI: 10.5281/zenodo.10311456, Zenodo: 10311456, MaRDI QID: Q6706838
Dataset published at Zenodo repository.
Author name not available
Publication date: 8 December 2023
Copyright license: No records found.
About

This repository contains the source code used to replicate the experimental results of the paper submitted to the 21st International Conference on Mining Software Repositories (MSR 2024):

"Gitome: A curated dataset for GitHub README-related tasks"

authored by: Claudio Di Sipio, Juri Di Rocco, Riccardo Rubei, Phuong Than Nguyen, and Davide Di Ruscio, Università degli Studi dell'Aquila, Italy

Data description

The dataset is structured as follows:

emf_metamodel.zip: the Ecore project with the Gitome data model
existing_dumps.zip: the existing datasets used to build Gitome
lang_aggr_stats.csv: the language data used to compute the statistics presented in the paper
langs.csv: all the languages and their frequency
output_dataset.zip: the benchmarking dataset obtained by parsing the README files
repository_lists.zip: the list of repositories for each considered dataset (with possible duplicates)
topics.csv: all the topics and their frequency
topics_aggr_stats.csv: the topics data used to compute the statistics presented in the paper
gitome_repo.txt: the list of the URLs of the considered GitHub repositories

How to collect Gitome

To collect all the data stored in this archive, please refer to the supporting GitHub repository: https://github.com/MDEGroup/Gitome-MSR2024
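As a starting point for working with the archive, the repository list in gitome_repo.txt could be checked for repeated entries with a short Python script. This is only a minimal sketch: it assumes the file holds one GitHub repository URL per line, which the description does not state explicitly, and the helper names parse_repo_url and summarize_repos are hypothetical and not part of the Gitome tooling.

```python
from collections import Counter
from urllib.parse import urlparse


def parse_repo_url(line):
    """Return 'owner/name' for a GitHub repository URL, or None otherwise."""
    url = line.strip()
    if not url:
        return None
    parsed = urlparse(url)
    # Only accept URLs that point at github.com.
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    owner, name = parts[0], parts[1]
    # Drop a trailing ".git" so clone URLs and page URLs compare equal.
    if name.endswith(".git"):
        name = name[:-4]
    # GitHub treats owner and repository names case-insensitively.
    return f"{owner}/{name}".lower()


def summarize_repos(lines):
    """Count how often each repository appears in an iterable of URL lines."""
    counts = Counter()
    for line in lines:
        slug = parse_repo_url(line)
        if slug is not None:
            counts[slug] += 1
    return counts
```

Under the one-URL-per-line assumption, passing an open handle on gitome_repo.txt to summarize_repos yields a counter whose length is the number of distinct repositories; entries with a count above one are listed more than once.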