Test datasets for Hi-C scaffolding - MaRDI portal

From MaRDI portal



DOI: 10.5281/zenodo.7079219
Zenodo: 7079219
MaRDI QID: Q6682178

Dataset published at Zenodo repository.

Author name not available

Publication date: 14 September 2022

Copyright license: No records found.



We provided two datasets for testing Hi-C scaffolding tools.

For the CHM13 test dataset, we randomly chunked the first 10 Mb of chr1, chr2 and chr3 of the T2T-CHM13v1.1 human genome assembly (Nurk et al. 2022) into 57 contigs. The Hi-C data downloaded from the telomere-to-telomere consortium GitHub repository (https://github.com/marbl/CHM13) were mapped to the reference genome, and the reads mapped to these regions were extracted to generate Hi-C alignment files.

For the LYZE01 test dataset, the Saccharomyces cerevisiae strain W303 genome assembly (Matheson et al. 2017) was split at positions with gaps (N) to recover the original contigs. An independent Hi-C data library was downloaded from the NCBI repository (GEO Accession GSM2417297) and downsampled to approximately 20X. The downsampled Hi-C data were mapped to the contigs to generate Hi-C alignment files.

We provided five files for each test dataset: the contig file in FASTA format, the FASTA index file generated with the SAMtools faidx command, and the Hi-C alignment file in three forms: BAM format sorted by coordinate, BAM format sorted by query name (with the identifier qn in the file name), and BED format.
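The gap-splitting step described for the LYZE01 dataset can be illustrated with a minimal Python sketch. This is not the authors' script: the function name `split_at_gaps`, the `min_gap` parameter, the contig naming scheme, and the toy sequence are all assumptions made for illustration.

```python
import re


def split_at_gaps(name, sequence, min_gap=1):
    """Split a scaffold at runs of N and return (contig_name, contig_seq) pairs.

    Hypothetical helper: `min_gap` is the shortest run of N treated as a gap,
    and contigs are named `<name>_<index>` purely for illustration.
    """
    gap = re.compile(r"[Nn]{%d,}" % min_gap)
    # Drop empty pieces produced by leading or trailing gaps.
    pieces = [piece for piece in gap.split(sequence) if piece]
    return [(f"{name}_{i}", piece) for i, piece in enumerate(pieces, start=1)]


# Toy scaffold with two gaps (not real W303 sequence).
for contig_name, contig_seq in split_at_gaps("scaffold1", "ACGTNNNNGGCCNTTAA"):
    print(f">{contig_name}\n{contig_seq}")
```

The resulting contig FASTA file would then typically be indexed with `samtools faidx`, which is the command the dataset description names for the index file.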





