Gossypium hirsutum (AD1) 'YM11' genome XAAS_v1

Overview
Analysis Name: Gossypium hirsutum (AD1) 'YM11' genome XAAS_v1
Method: PacBio HiFi (hifiasm v. 0.16.1)
Source: (v1.0)
Date performed: 2024-02-08

To characterize the genome of an upland cotton with cold tolerance suited to the Xinjiang climate, YM11, an accession with elite cold tolerance and adaptation to the climate of Xinjiang, was selected for de novo genome assembly.

The sprouted seeds of YM11 (pure lines) were uniformly planted in seedling trays and then placed in a greenhouse under a 12 h/12 h day/night regime at 32 °C/27 °C, respectively. The planting medium was a mixture of peat soil, perlite, and vermiculite at a ratio of 3:1:1. At the three-leaf stage, the seedlings were placed in an artificial climate chamber at 4 °C with a light-to-dark cycle of 14 h:10 h and an illumination intensity of 1500 to 2000 lx for cold stress treatment. Samples were taken at 0 h and 6 h.

Transcriptome sequencing (RNA-seq) and metabolome analysis were performed at the two time points with three and six biological replicates, respectively. Approximately 2 g of leaves from each biological replicate were frozen in liquid nitrogen immediately, ground into fine powder, and then transferred to a -80 °C freezer for storage. In particular, two replicated samples (~1 g) at both 0 h and 6 h from the same tissue used in RNA-seq were selected for the CUT&Tag assay.

In addition, approximately 5 g of young leaves of YM11 were sampled to extract genomic DNA for genome assembly, and a collection of mixed YM11 root, stem, leaf, flower, and boll samples from the field was used for RNA-seq to support transcriptome-assisted genome annotation.

Global statistics of the genome assembly of YM11.

Category                               YM11
Total assembly (Mb)                    2343.06
Contig N50 (Mb)                        88.96
Contig number                          725
Scaffold N50 (Mb)                      108.48
Scaffold N90 (Mb)                      61.46
Longest scaffold (Mb)                  128.15
Anchored and oriented contigs (%)      98.01
GC content (%)                         34.63
Repeat sequence (%)                    68.24
Number of gene models                  73,821
Number of exons                        363,314
Number of introns                      289,348
Mean coding sequence length (bp)       1196.75
Mean number of exons per gene          4.92
Mean exon length (bp)                  243.54
Mean intron length (bp)                345.25
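
For readers unfamiliar with the N50 and N90 figures in the table, the following is a minimal sketch of how such values are conventionally computed from a list of sequence lengths. The function name `nx` and the toy lengths are illustrative assumptions, not part of the YM11 pipeline, and the numbers are not YM11 data.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx value: the length L such that sequences of
    length >= L together cover at least `fraction` of the total."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length


# Toy lengths (hypothetical values, not taken from the YM11 assembly).
toy_lengths = [50, 40, 30, 20, 10]
n50 = nx(toy_lengths, 0.5)
n90 = nx(toy_lengths, 0.9)
print(n50, n90)
```

Applied to the contig or scaffold lengths of an assembly FASTA file, the same function would give the contig or scaffold N50 and N90, respectively.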
Assembly

The chromosomes (pseudomolecules) of the Gossypium hirsutum cv. 'Yuan Mian 11' genome. These files belong to the Gossypium hirsutum (AD1) 'YM11' genome XAAS_v1.

Chromosomes (FASTA format) G.hirsutum_XAAS_AD1-YM11.genome.fasta
Genes

The predicted gene models, their alignments, and proteins for the Gossypium hirsutum cv. 'Yuan Mian 11' genome. These files belong to the Gossypium hirsutum (AD1) 'YM11' genome XAAS_v1.

Predicted gene models with exons (GFF3 format) G.hirsutum_XAAS_AD1-YM11.annotation.gff3.gz
Coding sequences, mRNA (FASTA format) G.hirsutum_XAAS_AD1-YM11.cds.fa.gz
Protein sequences (FASTA format) G.hirsutum_XAAS_AD1-YM11.pep.fa.gz
Homology

Homology of the Gossypium hirsutum YM11 genome v1.0 proteins was determined by pairwise sequence comparison using the blastp algorithm against various protein databases. An expectation value cutoff of less than 1e-6 was used for the Arabidopsis proteins (Araport11, 2022-09), UniProtKB/SwissProt (Release 2023-07), and UniProtKB/TrEMBL (Release 2023-07) databases. The best hit reports are available for download in Excel format.
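
As an illustration of the best-hit selection described above, here is a minimal sketch that keeps, for each query protein, the highest-scoring hit below the 1e-6 cutoff. It assumes BLAST tabular output with the standard 12 columns (e-value in column 11, bit score in column 12); the page does not state which output format or tie-breaking rule was actually used, and the identifiers in the example are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-6  # cutoff stated in the text


def best_hits(handle, evalue_cutoff=EVALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} keeping, per query,
    the hit with the highest bit score among hits below the cutoff."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= evalue_cutoff:
            continue  # hit does not pass the e-value cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical three-line report: two hits for query1, one weak hit for query2.
example = io.StringIO(
    "query1\tsubjA\t62.5\t200\t75\t0\t1\t200\t1\t200\t1e-80\t250\n"
    "query1\tsubjB\t40.0\t150\t90\t0\t1\t150\t1\t150\t1e-10\t60\n"
    "query2\tsubjC\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25\n"
)
hits = best_hits(example)
print(hits)
```

Queries that end up with no entry in the result correspond to the "noHit" category of proteins, under the same assumptions.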

Protein Homologs

G.hirsutum YM11 Genome v1.0 proteins with Arabidopsis (Araport11) homologs (EXCEL file) AD1_YM11_v1_vs_tair.xlsx.gz
G.hirsutum YM11 Genome v1.0 proteins with Arabidopsis (Araport11) (FASTA file) AD1_YM11_v1_vs_tair_hit.fasta.gz
G.hirsutum YM11 Genome v1.0 proteins without Arabidopsis (Araport11) (FASTA file) AD1_YM11_v1_vs_tair_noHit.fasta.gz
G.hirsutum YM11 Genome v1.0 proteins with SwissProt homologs (EXCEL file) AD1_YM11_v1_vs_swissprot.xlsx.gz
G.hirsutum YM11 Genome v1.0 proteins with SwissProt (FASTA file) AD1_YM11_v1_vs_swissprot_hit.fasta.gz
G.hirsutum YM11 Genome v1.0 proteins without SwissProt (FASTA file) AD1_YM11_v1_vs_swissprot_noHit.fasta.gz
G.hirsutum YM11 Genome v1.0 proteins with TrEMBL homologs (EXCEL file) AD1_YM11_v1_vs_trembl.xlsx.gz
G.hirsutum YM11 Genome v1.0 proteins with TrEMBL (FASTA file) AD1_YM11_v1_vs_trembl_hit.fasta.gz
G.hirsutum YM11 Genome v1.0 proteins without TrEMBL (FASTA file) AD1_YM11_v1_vs_trembl_noHit.fasta.gz
Markers
Marker alignments were performed by the CottonGen Team of Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map marker sequences from CottonGen to the Gossypium hirsutum YM11 v1.0 assembly. Markers required 90% identity over 97% of their length. For SSRs and RFLPs, gap size was restricted to 1000 bp or less with fewer than 2 gaps. For dbSNPs and InDels, gap size was restricted to 2 bp with fewer than 2 gaps. The available files are in GFF3 format. Markers available in CottonGen are linked to JBrowse.
 
CottonGen SNP markers mapped to genome G.hirsutum_AD1_YM11_SNP
CottonGen RFLP markers mapped to genome G.hirsutum_AD1_YM11_RFLP
CottonGen SSR markers mapped to genome G.hirsutum_AD1_YM11_SSR
CottonGen InDel markers mapped to genome G.hirsutum_AD1_YM11_InDel
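
The marker filtering thresholds described above can be sketched as a simple predicate. This is an illustration only: the `Alignment` record, its field names, and the way identity and coverage are computed (matches over aligned length, aligned length over marker length) are assumptions, not the CottonGen team's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical summary of one marker-to-genome alignment."""
    marker_type: str       # "SSR", "RFLP", "SNP" or "InDel"
    marker_length: int     # length of the marker sequence (bp)
    aligned_length: int    # marker bases covered by the alignment (bp)
    matches: int           # identical bases within the aligned region
    gap_sizes: tuple = ()  # size (bp) of each gap in the alignment


# Maximum gap size per marker type, as stated in the text.
MAX_GAP_BP = {"SSR": 1000, "RFLP": 1000, "SNP": 2, "InDel": 2}


def passes_filter(aln, min_identity=0.90, min_coverage=0.97, max_gaps=1):
    """Apply the identity, coverage, and gap thresholds to one alignment."""
    if aln.aligned_length == 0 or aln.marker_length == 0:
        return False
    coverage = aln.aligned_length / aln.marker_length
    identity = aln.matches / aln.aligned_length
    if coverage < min_coverage or identity < min_identity:
        return False
    if len(aln.gap_sizes) > max_gaps:  # "fewer than 2 gaps" means at most 1
        return False
    return all(gap <= MAX_GAP_BP[aln.marker_type] for gap in aln.gap_sizes)


# Hypothetical alignments, for illustration only.
ok_ssr = Alignment("SSR", 200, 198, 190, (500,))
bad_snp = Alignment("SNP", 100, 100, 99, (5,))
short_rflp = Alignment("RFLP", 300, 280, 280)
print(passes_filter(ok_ssr), passes_filter(bad_snp), passes_filter(short_rflp))
```

Keeping the per-type gap limits in one lookup table makes the difference between the SSR/RFLP rule and the SNP/InDel rule explicit and easy to adjust.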
Publication

Wang J, Liang Y, Gong Z, Zheng J, Li Z, Zhou G, Xu Y, Li X. Genomic and epigenomic insights into the mechanism of cold response in upland cotton (Gossypium hirsutum). Plant physiology and biochemistry: PPB. 2023 Nov 25; 206:108206.

Transcript Alignments
Transcript alignments were performed by the CottonGen Team of Main Bioinformatics Lab at WSU. The alignment tool 'BLAT' was used to map transcripts to the G. hirsutum YM11 genome assembly. Alignments with an alignment length of 97% and 90% identity were preserved. The available files are in GFF3 format.
G. arboreum CottonGen RefTrans v1 AD1_YM11_g.arboreum_cottongen_reftransV1
G. hirsutum CottonGen RefTrans v1 AD1_YM11_g.hirsutum_cottongen_reftransV1
G. barbadense CottonGen RefTrans v1 AD1_YM11_g.barbadense_cottongen_reftransV1
G. raimondii CottonGen RefTrans v1 AD1_YM11_g.raimondii_cottongen_reftransV1