V2 Pipeline Implementation (Part 1) #16
Can you add the final step of building the numpy matrices? Also, what happened to the step that sorts the gVCFs after recombining them? Is that no longer its own step?
Is the conversion from ps4g to numpy done in seq_sim, though? I think that's more of a grits command, correct?
Yeah, but since I am pulling in the grits project, I can make a process builder extension to call that Python script.
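A minimal sketch of what such a process builder extension could look like. The names `CommandResult`, `runAndCapture`, and `pythonCommand` are hypothetical, and the script path and its arguments are passed in by the caller because the actual grits script interface is not shown in this thread:

```kotlin
import java.nio.file.Path

// Hypothetical result holder: exit code plus combined stdout/stderr.
data class CommandResult(val exitCode: Int, val output: String)

// Hypothetical extension: starts the process, captures all of its output,
// and blocks until it exits.
fun ProcessBuilder.runAndCapture(): CommandResult {
    redirectErrorStream(true) // merge stderr into stdout so one reader sees everything
    val process = start()
    val output = process.inputStream.bufferedReader().use { it.readText() }
    val exitCode = process.waitFor()
    return CommandResult(exitCode, output)
}

// Builds the command line for a Python script. The interpreter name, script
// location, and arguments are all supplied by the caller (assumptions here).
fun pythonCommand(python: String, script: Path, args: List<String>): ProcessBuilder =
    ProcessBuilder(listOf(python, script.toString()) + args)
```

A caller would then check `exitCode` on the result of `pythonCommand(...).runAndCapture()` and fail the pipeline step on a non-zero value.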
    var successCount = 0
    var failureCount = 0

    pairs.forEach { (baseSample, donorSample) ->
You might want to refactor this a bit. The Kotlin coding conventions recommend a regular for loop over forEach unless the receiver is nullable or the call is part of a longer chain. You might also want to extract the inner loop into its own function.
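One way the suggested refactor could look, as a sketch only. It assumes `pairs` is a `List<Pair<String, String>>`; `processPair`, `runBatch`, and `BatchSummary` are hypothetical names, and the body of `processPair` is a placeholder for the real per-pair work:

```kotlin
// Hypothetical summary of a batch run.
data class BatchSummary(val successCount: Int, val failureCount: Int)

// Placeholder for the work done per base/donor pair in the original lambda.
// Here it only reports success when both sample names are non-blank.
fun processPair(baseSample: String, donorSample: String): Boolean =
    baseSample.isNotBlank() && donorSample.isNotBlank()

// Regular for loop with destructuring instead of forEach; the per-pair
// logic lives in its own function so the loop body stays short.
fun runBatch(pairs: List<Pair<String, String>>): BatchSummary {
    var successCount = 0
    var failureCount = 0
    for ((baseSample, donorSample) in pairs) {
        if (processPair(baseSample, donorSample)) successCount++ else failureCount++
    }
    return BatchSummary(successCount, failureCount)
}
```

Keeping the counters local to `runBatch` and returning them as a value also makes the batch logic easy to unit test.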
        Path.of(it).toAbsolutePath().normalize()
    } ?: gvcfOutputDir
    if (gvcfInput == null) {
        throw RuntimeException("Cannot run split-gvcfs: no GVCF input available (specify 'input' in config or run maf-to-gvcf first)")
Will execution ever reach this throw, given the ?: operator on line 236?
Looks like there are a bunch of these lower down as well.
Theoretically, yes. This is if users want to start midway through the pipeline or want to skip a step. It provides some checks to stop cascades of downstream errors.
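To illustrate when the throw is reachable, here is a sketch that assumes `gvcfOutputDir` is a nullable `Path?` that is only set when `maf-to-gvcf` ran earlier in the same invocation. The function name `resolveGvcfInput` and its parameters are hypothetical:

```kotlin
import java.nio.file.Path

// configuredInput: the optional 'input' value from the config (assumed String?).
// gvcfOutputDir: output of a previous step, null if that step was skipped (assumption).
fun resolveGvcfInput(configuredInput: String?, gvcfOutputDir: Path?): Path {
    val gvcfInput = configuredInput?.let {
        Path.of(it).toAbsolutePath().normalize()
    } ?: gvcfOutputDir
    // The Elvis operator only falls back to gvcfOutputDir; if that is also
    // null, gvcfInput is still null and the guard below fires.
    if (gvcfInput == null) {
        throw RuntimeException(
            "Cannot run split-gvcfs: no GVCF input available " +
                "(specify 'input' in config or run maf-to-gvcf first)"
        )
    }
    return gvcfInput
}
```

If `gvcfOutputDir` were declared non-null, the compiler would flag the null check as always false, so the guard only makes sense under the nullable assumption above.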
Summary
Adds steps and user-defined methods for splitting gVCFs into "donor" and "base" sets. The "donor" set gets down-sampled and the "base" set does not. Each "base-donor" pair row (as defined in a tab-delimited file provided by the user) gets fed into the `mutate-assemblies` command.
Features
- `split-gvcfs` command for splitting into "base" and "donor" sub-directories (mainly needed for automation purposes)
- `mutate-assemblies` command with a "batch-mode". Adds the parameters: `--keyfile`, `--base-dir`, and `--mutation-donor-dir`.
- `OrchestrateV2` automation chain
- `split-gvcfs` and `mutate-assemblies` batch mode
Bug Fixes
Breaking Changes
Checklist
- `version` in `build.gradle.kts` (REQUIRED - see below)