ALPHA: A Novel Algorithm-Hardware Co-design for Accelerating DNA Seed Location Filtering

Research output: Contribution to book/Conference proceedings/Anthology/Report › Conference contribution › Contributed › peer-review

Abstract

Sequence alignment is a fundamental operation in genomic analysis in which DNA fragments, called reads, are mapped to a long reference DNA sequence. A number of exact and inexact alignment algorithms exist, with varying sensitivity for both local and global alignment; however, they are all computationally expensive. With the advent of high-throughput sequencing (HTS) technologies that generate massive amounts of data, there is increased pressure to improve the performance and capacity of analysis algorithms in general and of mapping algorithms in particular. While many works focus on improving the performance of the aligners themselves, it has recently been demonstrated that restricting the mapping space for input reads and filtering out mapping positions that would result in a poor match can significantly improve the performance of the alignment operation. However, this holds only if the filtering operation is guaranteed to be significantly faster than alignment; otherwise, the cost of filtering can easily outweigh its benefit to the aligner. To expedite this pre-alignment filtering, the recently proposed GRIM-Filter, among others, uses highly parallel processing-in-memory operations that benefit from lightweight computational units on the logic-in-memory layer. However, the significant amount of data transferred between the memory and logic-in-memory layers quickly becomes a performance and energy bottleneck for the memory subsystem and, ultimately, for the overall system. By analyzing input genomes, we found that there are unexpected data-reuse opportunities in the filtering operation. We propose an algorithm-hardware co-design that exploits this data reuse in the seed location filtering operation and, compared to GRIM-Filter, cuts the number of memory accesses by 22-54 percent. This reduction in memory accesses improves overall performance and energy consumption by 19-44 and 21-49 percent, respectively.
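
The general idea of pre-alignment filtering described in the abstract can be illustrated with a minimal Python sketch. It is not the ALPHA or GRIM-Filter design: it assumes a simple shared k-mer count as the filtering criterion, and the names `kmers`, `filter_locations`, `k`, and `min_shared`, as well as the toy sequences, are hypothetical choices made only for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def filter_locations(read, reference, candidates, k=4, min_shared=3):
    """Keep candidate positions whose reference window shares at least
    min_shared distinct k-mers with the read (an assumed, simplified test)."""
    read_kmers = kmers(read, k)
    kept = []
    for pos in candidates:
        # Reference window that the read would be aligned against.
        window = reference[pos:pos + len(read)]
        shared = len(read_kmers & kmers(window, k))
        if shared >= min_shared:
            kept.append(pos)
    return kept


# Toy data (hypothetical): the read occurs exactly at position 8.
reference = "ACGTACGTTAGCCGATAGGCTTACGATCGA"
read = "TAGCCGATAG"
kept = filter_locations(read, reference, candidates=[0, 8, 18])
print(kept)
```

Only the positions that survive the filter would be passed on to the expensive aligner, which is why the filter itself has to be cheap for the overall pipeline to benefit.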

Details

Original language: English
Title of host publication: IEEE Transactions on Emerging Topics in Computing
Pages: 1464-1475
Number of pages: 12
Volume: 10
Edition: 3
Publication status: Published - 2022
Peer-reviewed: Yes

Publication series

Series: IEEE Transactions on Emerging Topics in Computing

External IDs

Mendeley 23ace85f-9de0-30c3-9ce9-9b9dd01fdb03
ORCID /0000-0002-5007-445X/work/165453994

Keywords

  • Bioinformatics, DNA, DNA sequence alignment, Genome sequencing, Genomics, Heuristic algorithms, Memory management, Metadata, processing-in-memory, seed location filtering, Sequential analysis