Article: KA-03514

I have sequenced an entire genome or transcriptome. How do I submit it to GenBank?

You can submit genome or transcriptome sequences to GenBank as long as they are assembled. If you have only raw sequencing reads, submit them to the Sequence Read Archive (SRA) database instead. Sequencing data from entire prokaryotic or eukaryotic genomes or transcriptomes are considered large projects, so you will be required to register your project(s) and sample(s) in the BioProject and BioSample databases, respectively. These registrations distinguish large data submissions from standard GenBank submissions, which have no such requirement. Note that the BioProject and BioSample databases contain metadata, not the actual sequence data. If you are submitting both SRA reads and assembled GenBank sequences, you need to register the corresponding project(s) and sample(s) only once*, but you still have to submit your sequence data to each database separately. Depending on your data type, follow the submission guidelines listed in this article; for help, use the email contacts provided in those guidelines.

GenBank:
  • Complete Genomes (for example, complete prokaryotic genomes or complete chromosomes of lower eukaryotes)
  • Whole Genome Shotgun (WGS) projects (genome assemblies of incomplete genomes or incomplete chromosomes of prokaryotes or eukaryotes that are generally being sequenced with a whole genome shotgun strategy)
  • Transcriptome Shotgun Assemblies (TSA) (computationally assembled transcript sequences from primary data such as ESTs, traces, and next-generation sequencing technologies)
SRA:
  • SRA submission guide (genetic data and the associated quality scores produced by next-generation sequencing technologies)

*You can pre-register the BioProject(s) and BioSample(s), or create them during the data submission of WGS or SRA sequences. Since TSA submissions require prior submission to SRA, the TSA submission will use the BioProject/BioSample registered when the reads were submitted.
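
The routing described in this article can be summarized as a small lookup. The Python sketch below is only an illustration of that decision logic: the names `DataType`, `Route`, `ROUTES`, and `plan_submissions` are hypothetical and are not part of any NCBI tool or API. It assumes a single project and sample, so the BioProject and BioSample registrations appear once, while each data type still gets its own submission.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    """Kinds of data covered in this article (labels are illustrative)."""
    RAW_READS = "raw sequencing reads"
    COMPLETE_GENOME = "complete genome or chromosome"
    INCOMPLETE_GENOME = "incomplete genome assembly"
    TRANSCRIPTOME_ASSEMBLY = "assembled transcriptome"


@dataclass(frozen=True)
class Route:
    database: str            # "GenBank" or "SRA"
    submission_type: str     # which guideline in the list above applies
    needs_prior_sra: bool = False


# Hypothetical lookup table mirroring the lists in this article.
ROUTES = {
    DataType.RAW_READS: Route("SRA", "SRA"),
    DataType.COMPLETE_GENOME: Route("GenBank", "Complete Genomes"),
    DataType.INCOMPLETE_GENOME: Route("GenBank", "WGS"),
    # TSA submissions require the reads to be in SRA already.
    DataType.TRANSCRIPTOME_ASSEMBLY: Route("GenBank", "TSA", needs_prior_sra=True),
}


def plan_submissions(data_types):
    """Build a checklist for one project/sample with one or more data types."""
    requested = list(dict.fromkeys(data_types))  # drop duplicates, keep order
    # Put raw reads first, since a TSA submission depends on a prior SRA one.
    ordered = sorted(requested, key=lambda dt: dt is not DataType.RAW_READS)
    return {
        "register_once": ["BioProject", "BioSample"],
        "submit_separately": [ROUTES[dt] for dt in ordered],
    }


if __name__ == "__main__":
    plan = plan_submissions([DataType.INCOMPLETE_GENOME, DataType.RAW_READS])
    print("Register once:", ", ".join(plan["register_once"]))
    for route in plan["submit_separately"]:
        print(f"Submit to {route.database} ({route.submission_type})")
```

The sketch does not check whether reads were already submitted in an earlier session; a caller could inspect `needs_prior_sra` to remind the submitter of that requirement.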