De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis

Research output: Contribution to journal, Research article, Contributed, peer-review

Contributors

  • Brian J. Haas, Broad Institute of Harvard University and MIT (Author)
  • Alexie Papanicolaou, Commonwealth Scientific & Industrial Research Organisation (CSIRO) (Author)
  • Moran Yassour, Broad Institute of Harvard University and MIT (Author)
  • Manfred Grabherr (Author)
  • Philip D. Blood (Author)
  • Joshua Bowden (Author)
  • Matthew Brian Couger (Author)
  • David Eccles (Author)
  • Bo Li (Author)
  • Matthias Lieber, Center for Information Services and High Performance Computing (ZIH) (Author)
  • Matthew D. MacManes (Author)
  • Michael Ott, Commonwealth Scientific & Industrial Research Organisation (CSIRO) (Author)
  • Joshua Orvis (Author)
  • Nathalie Pochet, Broad Institute of Harvard University and MIT (Author)
  • Francesco Strozzi (Author)
  • Nathan Weeks (Author)
  • Rick Westerman (Author)
  • Thomas William, GWT-TUD GmbH (Author)
  • Colin N. Dewey (Author)
  • Robert Henschel, Indiana University Bloomington (Author)
  • Richard D. LeDuc, Indiana University Bloomington (Author)
  • Nir Friedman (Author)
  • Aviv Regev, Broad Institute of Harvard University and MIT (Author)

Abstract

De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.

Details

Original language: English
Pages (from-to): 1494-1512
Number of pages: 19
Journal: Nature Protocols
Issue number: 8
Publication status: Published - 2013
Peer-reviewed: Yes

External IDs

Scopus 84880266648
ORCID /0000-0003-3137-0648/work/142238863
