We are happy to announce that elPrep v4.0.0 has been released.
elPrep is a high-performance tool for preparing .sam/.bam files for variant calling in sequencing pipelines. It can be used as a drop-in replacement for SAMtools/Picard/GATK4 and has been extensively tested with different pipelines for variant analysis with GATK. The key advantage of elPrep is that it processes a .sam/.bam file in a single pass, regardless of how many processing steps a particular pipeline applies, which greatly improves runtime performance.
For more details see https://github.com/exascience/elprep
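The single-pass idea mentioned above can be illustrated with a minimal Go sketch. This is not elPrep's actual API: the `Record` type, the `Step` signature, and the two example steps are hypothetical and heavily simplified. The sketch only shows why fusing several per-record steps into one function keeps the number of traversals at one, however many steps a pipeline contains.

```go
package main

import (
	"fmt"
	"strings"
)

// Record is a hypothetical, heavily simplified stand-in for a SAM alignment.
type Record struct {
	Name   string
	Ref    string
	Mapped bool
}

// Step is a hypothetical per-record processing step. It may modify the
// record in place and returns false if the record should be dropped.
type Step func(r *Record) bool

// Compose fuses any number of steps into a single step, so the data is
// traversed once no matter how many steps the pipeline contains.
func Compose(steps ...Step) Step {
	return func(r *Record) bool {
		for _, step := range steps {
			if !step(r) {
				return false
			}
		}
		return true
	}
}

// RunSinglePass applies all steps to each record during one traversal.
func RunSinglePass(in []Record, steps ...Step) []Record {
	fused := Compose(steps...)
	out := make([]Record, 0, len(in))
	for i := range in {
		r := in[i] // copy, so the input slice stays untouched
		if fused(&r) {
			out = append(out, r)
		}
	}
	return out
}

// Two illustrative steps (names are made up for this sketch).
func dropUnmapped(r *Record) bool { return r.Mapped }

func upperCaseRef(r *Record) bool {
	r.Ref = strings.ToUpper(r.Ref)
	return true
}

func main() {
	in := []Record{
		{Name: "read1", Ref: "chr1", Mapped: true},
		{Name: "read2", Ref: "*", Mapped: false},
		{Name: "read3", Ref: "chr2", Mapped: true},
	}
	for _, r := range RunSinglePass(in, dropUnmapped, upperCaseRef) {
		fmt.Println(r.Name, r.Ref)
	}
}
```

Adding a further step to the call to `RunSinglePass` changes the work done per record but not the number of passes over the input, which is the property the announcement refers to.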
Since version 3, elPrep has been implemented in Go. We are currently working on two publications about elPrep. One covers the performance evaluation of C++, Go, and Java that we performed to decide which programming language to use for further development of elPrep. Go performed better than both C++ and Java for our use case, and the paper will describe in more detail how we came to that conclusion. The other publication will be a detailed description of the new features in elPrep v4.0.0.
elPrep v4.0.0 comes with the following new features:
- added base quality score recalibration (BQSR)
- added optical duplicate marking
- added metrics (MultiQC compatible)
- support for SAM File Format version 1.6
- support for FASTA and VCF files
- support for BAM/BGZF files
- support for elPrep-specific elsites and elfasta formats
- split/filter/merge (sfm) mode now implemented in Go instead of Python; therefore no more dependency on Python
- added --log-path option to all tools
- various API and performance improvements
- changed license to the GNU Affero General Public License version 3 as published by the Free Software Foundation, with Additional Terms
- updated demos
- adopted support for Go modules / semantic versioning
elPrep uses our pargo library for parallel programming in Go, which has now been tagged as v1.0.0 to support semantic versioning as well. See https://github.com/exascience/pargo for the pargo library.