A distributed method for the Minimap2 tool
Abstract
Minimap2 is a popular tool for aligning long reads or draft genome assemblies to a reference sequence, and it is especially well suited to third-generation nanopore sequencing data. However, Minimap2 is designed for single-node execution and does not support distributed computing, which limits its performance and scalability on large-scale long-read datasets. We use the Apache Spark framework to parallelize Minimap2 following the MapReduce model. The results show that our method achieves significant speedup and scalability while maintaining high accuracy and sensitivity.
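To make the MapReduce decomposition concrete, the following is a minimal sketch of the general pattern only, not the implementation described in this work. It assumes reads can be aligned independently of one another, so they can be partitioned, aligned per partition (map), and merged (reduce). A thread pool stands in for Spark executors, and `toy_align` is a hypothetical placeholder that uses exact substring search rather than Minimap2; the names `partition`, `toy_align`, and `map_reduce_align` are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Read = Tuple[str, str]   # (read name, sequence)
Hit = Tuple[str, int]    # (read name, 0-based position in reference, -1 if unmapped)


def partition(reads: List[Read], num_partitions: int) -> List[List[Read]]:
    """Distribute reads round-robin over num_partitions chunks."""
    chunks: List[List[Read]] = [[] for _ in range(num_partitions)]
    for i, read in enumerate(reads):
        chunks[i % num_partitions].append(read)
    return chunks


def toy_align(chunk: List[Read], reference: str) -> List[Hit]:
    """Hypothetical stand-in for an aligner run on one partition.

    Uses exact substring search; a real worker would invoke Minimap2 here.
    """
    return [(name, reference.find(seq)) for name, seq in chunk]


def map_reduce_align(
    reads: List[Read],
    reference: str,
    num_partitions: int = 2,
    align: Callable[[List[Read], str], List[Hit]] = toy_align,
) -> List[Hit]:
    """Map: align each partition independently. Reduce: merge and sort hits."""
    chunks = partition(reads, num_partitions)
    # The thread pool plays the role of the cluster's worker nodes.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_results = list(pool.map(lambda c: align(c, reference), chunks))
    # Reduce step: concatenate per-partition results, sorted by read name
    # so the output does not depend on how reads were partitioned.
    merged = [hit for part in partial_results for hit in part]
    return sorted(merged)


if __name__ == "__main__":
    reference = "ACGTACGTTTGACCA"
    reads = [("r1", "ACGT"), ("r2", "TTGA"), ("r3", "GGGG"), ("r4", "CCA")]
    for name, pos in map_reduce_align(reads, reference, num_partitions=2):
        print(name, pos)
```

Because each partition is processed without reference to the others, the merged result is the same for any number of partitions; this independence is the property that lets the alignment workload scale across nodes.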