Gossypium tomentosum (AD3) genome HAU_v1
Overview
About the assembly
Here, we sequenced the allotetraploid wild species G. tomentosum (AD3), germplasm 'HAZU-2020-AD3', by generating 229.32 gigabases (Gb) of PacBio single-molecule long reads (read N50 of 21.44 kb), representing 102.90-fold genome coverage (Table 1). The final assembly captured 2,229 megabases (Mb; 2.23 Gb) of genome sequence in 1,286 contigs (contig N50 = 11.98 Mb; Table 1). Using Hi-C data, 98.40% of the assembled sequence was anchored and oriented onto 26 chromosomes, and 96.06% of the remaining scaffolds were shorter than 0.1 Mb. The assembly covers approximately 94.64% of the genome size of 2.36 Gb estimated by k-mer genome survey analysis. Assessments of assembly completeness, sequence consistency, accuracy and heterozygosity indicated a high-quality G. tomentosum genome assembly (Table 1). The integrity of the genic regions was supported by the recovery of 98.68% (1,421) of the highly conserved core proteins in the BUSCO dataset, which further confirmed the completeness and quality of the current assembly.
Table 1: Summary of G. tomentosum genome assembly and annotation.
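The headline figures above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers quoted in the description; the variable names are illustrative, and the BUSCO set size is inferred from the count and percentage rather than stated in the source.

```python
# Figures quoted in the assembly description above.
read_bases_gb = 229.32   # total PacBio read bases (Gb)
assembly_mb = 2229       # assembled sequence (Mb)
estimated_gb = 2.36      # k-mer estimated genome size (Gb), rounded in the text
busco_found = 1421       # conserved core proteins recovered
busco_pct = 98.68        # reported BUSCO completeness (%)

assembly_gb = assembly_mb / 1000

# Sequencing depth = total read bases / assembly size.
coverage = read_bases_gb / assembly_gb

# Share of the k-mer estimate captured by the assembly.
captured_pct = 100 * assembly_gb / estimated_gb

# BUSCO set size implied by the count and percentage
# (an inference, not a value given in the source).
implied_busco_total = round(busco_found / (busco_pct / 100))

print(f"coverage ~ {coverage:.1f}x")
print(f"captured ~ {captured_pct:.1f}% of the k-mer estimate")
print(f"implied BUSCO set size ~ {implied_busco_total}")
```

Small differences from the reported values (102.90-fold coverage, 94.64%) are expected, because the inputs quoted in the text are themselves rounded.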
Publication: Shen et al. (2021). Gossypium tomentosum genome and interspecific ultra-dense genetic maps reveal genomic structures, recombination landscape and flowering depression in cotton. Genomics 113, 1999–2009. doi: 10.1016/j.ygeno.2021.04.036
Assembly
The chromosomes (pseudomolecules) of the Gossypium tomentosum genome. These files belong to the HAU-AD3 Assembly v1 and NCBI BioProject PRJNA629964.
Functional Analysis
Functional annotation files for the Gossypium tomentosum HAU Genome v1.0 are available for download below. The Gossypium tomentosum HAU Genome v1.0 proteins were analyzed using InterProScan in order to assign InterPro domains and Gene Ontology (GO) terms. Pathway analysis was performed using the KEGG Automatic Annotation Server (KAAS).
Downloads
Genes
The predicted gene models, their alignments and proteins for the G. tomentosum genome. These files belong to the HAU-AD3 Assembly v1.
Homology
Homology of the Gossypium tomentosum HAU Genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value (E-value) cutoff of less than 1e-9 was used for the NCBI nr database (Release 2021-09), and a cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11), UniProtKB/SwissProt (Release 2021-09), and UniProtKB/TrEMBL (Release 2021-09) databases. The best hit reports are available for download in Excel format.
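The cutoffs above can be applied to raw blastp results as follows. This is a minimal sketch, assuming the hits are available as standard 12-column BLAST tabular output (outfmt 6) rather than the Excel reports; the database labels, the function name `best_hits`, and the sequence IDs in the sample are hypothetical, and choosing the best hit by bit score is an assumption, not a documented CottonGen rule.

```python
import csv
import io

# E-value cutoffs stated in the text; the dictionary keys are illustrative labels.
EVALUE_CUTOFFS = {
    "nr": 1e-9,
    "araport11": 1e-6,
    "swissprot": 1e-6,
    "trembl": 1e-6,
}


def best_hits(tabular_text, database):
    """Return {query: (subject, evalue, bitscore)} from BLAST outfmt 6 text.

    Hits whose E-value is not below the database cutoff are dropped; among
    the remaining hits, the one with the highest bit score is kept per query.
    """
    cutoff = EVALUE_CUTOFFS[database]
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= cutoff:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits: two for geneA, one weak hit for geneB.
sample = "\n".join([
    "\t".join(["geneA", "subj1", "95.0", "200", "10", "0", "1", "200", "1", "200", "1e-50", "300"]),
    "\t".join(["geneA", "subj2", "80.0", "150", "30", "0", "1", "150", "1", "150", "1e-20", "120"]),
    "\t".join(["geneB", "subj3", "60.0", "50", "20", "0", "1", "50", "1", "50", "1e-7", "45"]),
])
print(best_hits(sample, "nr"))         # geneB fails the 1e-9 cutoff
print(best_hits(sample, "swissprot"))  # geneB passes the 1e-6 cutoff
```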
Protein Homologs
Markers
Marker alignments were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool BLAT was used to map marker sequences from CottonGen to the Gossypium tomentosum HAU genome assembly. Markers were required to align with 90% identity over 97% of their length. For SSRs and RFLPs, gap size was restricted to 1000 bp or less, with fewer than 2 gaps. For dbSNPs and indels, gap size was restricted to 2 bp, with fewer than 2 gaps. The available files are in GFF3 format. Markers available in CottonGen are linked to JBrowse.
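The marker filtering rules above can be expressed as a single predicate. This is a minimal sketch, not the CottonGen pipeline: the `MarkerAlignment` record and its field names are hypothetical, and it assumes identity is matches divided by aligned bases, coverage is aligned bases divided by marker length, and the thresholds are inclusive.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the dictionary keys are illustrative labels.
MIN_IDENTITY = 0.90
MIN_COVERAGE = 0.97
MAX_GAPS = 2  # exclusive bound: "fewer than 2 gaps"
MAX_GAP_BP = {"SSR": 1000, "RFLP": 1000, "dbSNP": 2, "Indel": 2}


@dataclass
class MarkerAlignment:
    """Hypothetical summary of one marker-to-genome alignment."""
    marker_type: str
    marker_length: int
    matches: int
    mismatches: int
    gap_count: int
    largest_gap_bp: int


def passes_filter(aln):
    """Apply the identity, coverage and gap rules described in the text."""
    aligned = aln.matches + aln.mismatches
    if aligned == 0:
        return False
    identity = aln.matches / aligned          # assumed definition
    coverage = aligned / aln.marker_length    # assumed definition
    return (
        identity >= MIN_IDENTITY
        and coverage >= MIN_COVERAGE
        and aln.gap_count < MAX_GAPS
        and aln.largest_gap_bp <= MAX_GAP_BP[aln.marker_type]
    )


# The same alignment is acceptable for an SSR but not for a dbSNP,
# because the 500 bp gap exceeds the 2 bp limit for SNP markers.
ssr = MarkerAlignment("SSR", 200, 190, 6, 1, 500)
snp = MarkerAlignment("dbSNP", 200, 190, 6, 1, 500)
print(passes_filter(ssr), passes_filter(snp))
```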
Publications
Shen, C., Wang, N., Zhu, D., Wang, P., Wang, M., Wen, T., et al. (2021). Gossypium tomentosum genome and interspecific ultra-dense genetic maps reveal genomic structures, recombination landscape and flowering depression in cotton. Genomics 113, 1999–2009. doi: 10.1016/j.ygeno.2021.04.036
Repeats
Total repeats of the chromosomes (pseudomolecules) of the Gossypium tomentosum genome. This file belongs to the HAU-AD3 Assembly v1.
Transcript Alignments
Transcript alignments were performed by the CottonGen Team of the Main Bioinformatics Lab at WSU. The alignment tool BLAT was used to map transcripts to the G. tomentosum genome assembly. Alignments covering 97% of the transcript length with 90% identity were retained. The available files are in GFF3 format.
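Because the alignment files are distributed in GFF3 format, they can be read with a few lines of code. The following is a minimal sketch of a line parser based on the standard nine-column GFF3 layout; the function name `parse_gff3_line` and the sample record (sequence ID, source, feature type, attribute values) are hypothetical and not taken from the actual download files.

```python
from urllib.parse import unquote

# The nine tab-separated columns defined by the GFF3 format.
GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict, or return None for
    blank lines and lines starting with '#' (directives and comments)."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, found {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])  # 1-based, inclusive
    record["end"] = int(record["end"])
    attributes = {}
    for pair in record["attributes"].split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            # Reserved characters in GFF3 attributes are percent-encoded.
            attributes[unquote(key)] = unquote(value)
    record["attributes"] = attributes
    return record


# Hypothetical alignment record for illustration only.
example = "chr1\tBLAT\tmatch\t100\t250\t97.5\t+\t.\tID=aln1;Name=transcript%201"
print(parse_gff3_line(example))
```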
Links