Dual redundant sequencing strategy: Full-length gene characterisation of 1056 novel and confirmatory HLA alleles

Research output: Contribution to journal, Research article, Contributed, peer-reviewed

Contributors

  • V. Albrecht, DKMS Life Science Lab gGmbH (Author)
  • C. Zweiniger, DKMS Life Science Lab gGmbH (Author)
  • V. Surendranath, DKMS Life Science Lab gGmbH (Author)
  • K. Lang, DKMS Life Science Lab gGmbH (Author)
  • G. Schöfl, DKMS Life Science Lab gGmbH (Author)
  • A. Dahl, DRESDEN-concept Genome Center (CMCB Core Facility) (Author)
  • S. Winkler, Max Planck Institute of Molecular Cell Biology and Genetics (Author)
  • V. Lange, DKMS Life Science Lab gGmbH (Author)
  • I. Böhme, DKMS Life Science Lab gGmbH (Author)
  • A. H. Schmidt, DKMS Life Science Lab gGmbH, DKMS Donor Center gGmbH (Author)

Abstract

The high-throughput department of DKMS Life Science Lab encounters novel human leukocyte antigen (HLA) alleles on a daily basis. To characterise these alleles, we have developed a system to sequence the whole gene from 5′- to 3′-UTR for the HLA loci A, B, C, DQB1 and DPB1 for submission to the European Molecular Biology Laboratory – European Nucleotide Archive (EMBL-ENA) and the IPD-IMGT/HLA Database. Our workflow is based on a dual redundant sequencing strategy. Using shotgun sequencing on an Illumina MiSeq instrument and single molecule real-time (SMRT) sequencing on a PacBio RS II instrument, we are able to achieve highly accurate HLA full-length consensus sequences. Remaining conflicts are resolved using the R package DR2S (Dual Redundant Reference Sequencing). Given the relatively high throughput of this strategy, we have developed the semi-automated web service TypeLoader to aid in the submission of sequences to the EMBL-ENA and the IPD-IMGT/HLA Database. In the IPD-IMGT/HLA Database release 3.24.0 (April 2016; prior to the submission of the sequences described here), only 5.2% of all known HLA alleles had been fully characterised together with intronic and UTR sequences. So far, we have applied our strategy to characterise and submit 1056 HLA alleles, thereby more than doubling the number of fully characterised alleles. Given the increasing application of next-generation sequencing (NGS) for full gene characterisation in clinical practice, extending the HLA database concomitantly is highly desirable. Therefore, we propose this dual redundant sequencing strategy as a workflow for submission of novel full-length alleles and characterisation of sequences that are as yet incomplete. This would help to mitigate the predominance of partially known alleles in the database.

Details

Original language: English
Pages (from-to): 79-87
Number of pages: 9
Journal: HLA
Volume: 90
Issue number: 2
Publication status: Published - Aug 2017
Peer-reviewed: Yes

External IDs

PubMed 28547825

Keywords

  • full-length gene sequencing, HLA typing, NGS, novel HLA alleles, PacBio
