A multi-species benchmark for training and validating large scale mass spectrometry proteomics machine learning models (Q6708688)
Dataset published at Zenodo repository.
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | A multi-species benchmark for training and validating large scale mass spectrometry proteomics machine learning models | Dataset published at Zenodo repository. | |
Statements
This is a de novo sequencing benchmark dataset derived from nine publicly available mass spectrometry datasets. There are two versions of the benchmark: main and balanced. The balanced version randomly eliminates some spectra associated with some species in order to create a smaller, more evenly balanced dataset. Also provided are two zip files containing the raw data as well as intermediate results. Details about how the benchmark was created are provided in an associated Zenodo release, which contains the source code as well as a manuscript describing the benchmark. This release fixes a bug that incorrectly detected shared peptides between different species. It also includes the annotated spectra in mzSpecLib format.
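The balancing step described above (randomly eliminating spectra from over-represented species) could be sketched roughly as follows. This is only an illustration and not the procedure used to build the benchmark: the function name `balance_by_species`, the per-species `cap` parameter, and the `(species, spectrum_id)` input layout are all assumptions made for this example. The actual method is documented in the associated Zenodo release.

```python
import random
from collections import defaultdict


def balance_by_species(spectra, cap, seed=0):
    """Randomly drop spectra so that no species keeps more than `cap` entries.

    `spectra` is an iterable of (species, spectrum_id) pairs. The fixed
    per-species cap is a hypothetical balancing rule, not the benchmark's own.
    """
    rng = random.Random(seed)  # seeded so the subsample is reproducible

    # Group spectrum identifiers by the species they belong to.
    by_species = defaultdict(list)
    for species, spectrum_id in spectra:
        by_species[species].append(spectrum_id)

    balanced = []
    for species, ids in by_species.items():
        # Species at or under the cap are kept whole; larger ones are
        # subsampled uniformly at random without replacement.
        kept = ids if len(ids) <= cap else rng.sample(ids, cap)
        balanced.extend((species, sid) for sid in kept)
    return balanced
```

With a cap of 4, a species with 10 spectra would be reduced to 4 while a species with 3 spectra would be left untouched, which yields a smaller and more even distribution across species.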
4 September 2024