elPrep v4.0.0 released


(Pascal Costanza) #1

Hi,

We are happy to announce that elPrep v4.0.0 is released.

elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4, and has been extensively tested with different pipelines for variant analysis with GATK. The key advantage of elPrep is that it performs only a single pass over a .sam/.bam file, regardless of how many processing steps a particular pipeline applies, which greatly improves runtime performance.
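
To illustrate the single-pass idea, here is a minimal sketch in Go. It is not elPrep's actual API: the Record type, the Step type, and the functions Compose, RunSinglePass, DropUnmapped, and SetFlag are all hypothetical names made up for this example, and the real implementation is parallel and far more involved. The point is only that several steps can be fused into one function, so each record is visited once no matter how many steps are configured.

```go
package main

import "fmt"

// Record is a hypothetical, heavily simplified stand-in for a SAM alignment.
type Record struct {
	Name   string
	Flag   uint16
	Mapped bool
}

// Step is one processing step. It may modify the record in place and
// returns false if the record should be dropped from the output.
type Step func(r *Record) bool

// Compose fuses several steps into a single step. Steps run in order and
// stop early as soon as one of them drops the record.
func Compose(steps ...Step) Step {
	return func(r *Record) bool {
		for _, s := range steps {
			if !s(r) {
				return false
			}
		}
		return true
	}
}

// RunSinglePass traverses the input exactly once, applying the fused step
// to a copy of each record so the input slice is left unchanged.
func RunSinglePass(in []Record, steps ...Step) []Record {
	step := Compose(steps...)
	out := make([]Record, 0, len(in))
	for i := range in {
		r := in[i]
		if step(&r) {
			out = append(out, r)
		}
	}
	return out
}

// DropUnmapped is an example filter that removes unmapped records.
func DropUnmapped(r *Record) bool { return r.Mapped }

// SetFlag returns an example step that sets the given bit in the flag field.
func SetFlag(bit uint16) Step {
	return func(r *Record) bool {
		r.Flag |= bit
		return true
	}
}

func main() {
	in := []Record{
		{Name: "read1", Mapped: true},
		{Name: "read2", Mapped: false},
		{Name: "read3", Mapped: true},
	}
	out := RunSinglePass(in, DropUnmapped, SetFlag(0x400))
	fmt.Println("records kept:", len(out))
}
```

Adding a third or fourth step to the RunSinglePass call changes the work done per record, but not the number of traversals over the data.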

For more details, see https://github.com/exascience/elprep

Since version 3, elPrep has been implemented in Go. We are currently working on two publications about elPrep. One covers our performance evaluation of C++, Go, and Java, which we carried out to decide which programming language to use for further development of elPrep. Go performed better than both C++ and Java for our use case, and the paper will describe in more detail how we came to that conclusion. The other publication will be a detailed description of the new features in elPrep v4.0.0.

elPrep v4.0.0 comes with the following new features:

  • added base quality score recalibration (BQSR)
  • added optical duplicate marking
  • added metrics (MultiQC compatible)
  • support for SAM File Format version 1.6
  • support for FASTA and VCF files
  • support for BAM/BGZF files
  • support for elPrep-specific elsites and elfasta formats
  • split/filter/merge (sfm) mode now implemented in Go instead of Python; therefore no more dependency on Python
  • added --log-path option to all tools
  • various API and performance improvements
  • changed license to the GNU Affero General Public License version 3 as published by the Free Software Foundation, with Additional Terms
  • updated demos
  • adopted support for Go modules / semantic versioning

elPrep uses our pargo library for parallel programming in Go, which has now been tagged as v1.0.0 to support semantic versioning as well. See https://github.com/exascience/pargo for the pargo library.

Pascal