The powerful high-throughput DNA sequencing technologies catalyzed by the Human Genome Project, which have contributed to dramatic advances in biomedicine, are now being directed to characterizing the genomes of plants and microbes. Leading this effort is the US Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility that unites the expertise of five national laboratories to advance genomics in support of the DOE mission areas of bioenergy, carbon cycling, and bioremediation.
10.00–10.45
3. DNA Sequencing – Chris Daum
JGI's future depends on new sequencing technologies and on the applications built on them. With multiple sequencing platforms available, JGI's R&D team aims to develop sequencing applications that play to the strengths of each platform. Our areas of development are de novo whole-genome shotgun sequencing, transcriptome sequencing, and studies of metagenomic sample diversity. Examples of JGI's available sequencing applications in genomic research will be discussed.
10.45–11.00
Break – Discussions Continue
11.00–11.45
4. Genome Assembly Overview – Rob Riley
Genome assembly is the process of inferring the full DNA sequence of a microbial, viral, or eukaryotic chromosome from many shorter pieces, known as reads. While the reads produced by current sequencing technologies are getting longer, they still fall short of the full lengths of chromosomes, or even of annotatable genomic features such as genes, operons, or biosynthetic gene clusters. Computational methods for genome assembly are therefore a crucial part of genome sequencing and analysis. I will give an overview of the theory and practice of genome assembly with applications to both short- and long-read data.
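As a rough illustration of the idea of inferring a longer sequence from overlapping reads, the following toy Python sketch repeatedly merges the two reads with the longest suffix-prefix overlap. It is not the method covered in the session or used by any JGI pipeline: the function names, the `min_len` threshold, and the example reads are all hypothetical, and real assemblers must also handle sequencing errors, repeats, and both DNA strands.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Toy greedy assembler: merge the best-overlapping pair until none is left."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # remaining reads do not overlap; they stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads drawn from one short sequence
contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"])
print(contigs)
```

The greedy strategy is chosen here only because it is short enough to read at a glance; it can produce wrong joins when a genome contains repeats longer than the overlaps.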
11.45–13.00
Lunch & Facility Tour
13.00–13.45
5. Introduction to IMG annotations and functional categories – Natalia Ivanova
Annotation of microbial genomes usually starts with finding the genes coding for stable RNAs (rRNA and tRNA) and the protein-coding genes (CDSs). The principles underlying gene prediction in microbial genomes, different implementations of these algorithms, and the most popular gene-finding tools will be discussed. Genome analysis and gene function prediction depend on comparing sequences to the existing information stored in databases. These databases can be either simple repositories of nucleotide or protein sequences or collections of curated information on the function of genetic elements. Used in combination, bioinformatics databases constitute the most powerful method for gene function prediction. In this presentation, methods commonly used for functional annotation will be discussed.
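To make the notion of finding protein-coding genes concrete, here is a deliberately naive Python sketch that scans the three forward reading frames for open reading frames running from an ATG start codon to the first in-frame stop codon. It is an assumption-laden illustration, not the approach of IMG or of any gene-finding tool: the function name `find_orfs`, the `min_codons` cutoff, and the example sequence are hypothetical, and real gene finders also use statistical models, alternative start codons, and the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    start is the 0-based index of the ATG; end is the index just past the stop codon.
    min_codons counts the codons before the stop codon, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical sequence with one short ORF in frame 2
example = "CCATGAAACCCGGGTAACC"
for start, end, frame in find_orfs(example):
    print(frame, example[start:end])
```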
13.45–14.30
6. Identifying datasets (browse, search, curation and set construction) – Rekha Seshadri
14.30–15.15
7. Compare Datasets (Abundance profiles and Dataset clustering) – Simon Roux
Lunch – Discussions Continue
Group Photo & Users: Exercise Group discussions
13.00–13.30
Continue Discussions for Part II – Rekha Seshadri
13.30–14.00
12. VEGA & NeLLi thrusts – Simon Roux & Frederick Schulz
14.00–14.30
13. Identifying mobile genetic elements with geNomad – Antonio Camargo
14.30–15.00
14. IMG/VR & IMG/PR Data & Content – Simon Roux & Antonio Camargo
15.00–15.15
Break – Discussions Continue
15.15–15.30
IMG/VR & IMG/PR Navigation – Simon Roux & Antonio Camargo
15.30–16.00
IMG/VR & IMG/PR Exercises – Users
16.00–17.00
Working Groups – Dataset selection and curation – Users (Independent Work)
Wednesday
IMG Tutorials
Time
Title
Presenter
09.00–10.30
15. Introduction to Metagenome data & Demo – Natalia Ivanova
The main differences between genomes and metagenomes in terms of data and analysis tools will be reviewed.
A snapshot of microbial community structure can be derived from analysis of metagenomic data. IMG/M methods and tools for establishing the taxonomic identity of community members will be presented along with tools for determining the fine population structure, genetic variation and genome dynamics of the dominant populations. Methods for assessing the diversity and abundance of microbial communities will be discussed.
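As one small example of what assessing diversity and abundance can mean in practice, the sketch below computes relative abundances and the Shannon diversity index from a list of per-read taxon labels. This is a generic textbook calculation, not a description of the IMG/M tools; the function names and the taxon labels are hypothetical.

```python
import math
from collections import Counter


def relative_abundance(assignments):
    """Map each taxon label to its fraction of all assigned reads."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero abundance."""
    return -sum(p * math.log(p) for p in abundances.values() if p > 0)


# Hypothetical taxon assignments for eight reads
labels = ["Bacteroides"] * 4 + ["Prevotella"] * 2 + ["Methanobrevibacter"] * 2
abund = relative_abundance(labels)
print(abund)
print(round(shannon_index(abund), 3))
```

A community dominated by a single taxon gives a value near zero, while an even community of N taxa gives ln N, which is why the index is often reported alongside the raw abundance profile.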
10.30–10.45
16. Group management, sharing data/analysis – Amy Chen
10.45–12.00
Working Groups – Users (Independent Work)
12.00–13.00
Lunch – Discussions Continue
13.00–13.30
17. SIP Metagenomics Overview – Rex Malmstrom
13.30–14.15
18. Fungal / Algal Programs and Systems – Sajeet Haridas & Sara Calhoun
14.15–14.30
Break – Discussions Continue
14.30–17.00
Working Groups – User Groups
Thursday
Working Group Project Discussions
Time
Title
Presenter
09.00–09.30
19. GOLD overview and project registration – Supratim Mukherjee
The Genomes Online Database (GOLD) is a data management system that catalogs sequencing projects and their associated metadata from around the world. Projects in GOLD come from three sources: internal projects from the Department of Energy Joint Genome Institute (DOE JGI), which are entered automatically; external projects entered by GOLD users; and projects from public databases such as NCBI. GOLD serves as the entry point for projects submitted for analysis to the IMG data management system and ensures that projects are correctly defined and accompanied by the necessary metadata. This presentation will provide an overview of commonly used GOLD terminology, a description of its four-level organization system, and a tutorial on how to enter sequencing projects in GOLD.
09.30–10.00
20. NMDC and Submitting reads to NMDC for assembly [tutorial] – Emiley Eloe-Fadrosh