Skip copying specified files with an exclusion pattern #11
parser.add_argument(
    "-x",
    "--exclude",
    metavar="REGEX",
    type=str,
    default=None,
    help="skip copying FASTQ files containing the given substring",
)
New -x/--exclude argument in the CLI.
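As a rough illustration of how this option parses, here is a minimal sketch. Only the `--exclude` definition is taken from the diff; the `build_parser` helper, the `ezfastq-sketch` program name, and the two positional arguments are hypothetical stand-ins, not the real ezfastq parser. Note that the help text says "substring" while the metavar is `REGEX` and the core change matches with `re.search`, so the value appears to be treated as a regular expression.

```python
import argparse


def build_parser():
    # Hypothetical stand-in parser; only the --exclude option mirrors the diff.
    parser = argparse.ArgumentParser(prog="ezfastq-sketch")
    parser.add_argument("seq_path")            # assumed positional argument
    parser.add_argument("samples", nargs="+")  # assumed positional argument
    parser.add_argument(
        "-x",
        "--exclude",
        metavar="REGEX",
        type=str,
        default=None,
        help="skip copying FASTQ files containing the given substring",
    )
    return parser


# With the flag, the pattern is stored as a plain string.
args = build_parser().parse_args(["/path/to/fastqs/", "sample1", "-x", "R2"])

# Without the flag, the value falls back to None, which disables exclusion.
args_default = build_parser().parse_args(["/path/to/fastqs/", "sample1"])
```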
subdir="seq",
link=False,
verbose=False,
excl_pattern=None,
New excl_pattern argument in the Python API.
if self.excl_pattern and re.search(self.excl_pattern, source_path.name):
    continue
Core change in this PR. All other changes are supporting updates to the interfaces or tests of the new behavior.
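The check can be tried in isolation with a small sketch like the one below. The condition is copied from the diff; the `select_for_copy` function and the example paths are hypothetical and exist only to make the snippet runnable outside the class. Because the match is against `source_path.name`, only the file name is tested, not the directories above it.

```python
import re
from pathlib import Path


def select_for_copy(source_paths, excl_pattern=None):
    """Yield the paths whose file name does not match excl_pattern.

    Hypothetical helper: it wraps the condition from the diff in a
    standalone generator for illustration.
    """
    for source_path in source_paths:
        if excl_pattern and re.search(excl_pattern, source_path.name):
            continue  # skip files matching the exclusion pattern
        yield source_path


paths = [
    Path("/data/R2_run/sample1_R1.fastq.gz"),
    Path("/data/R2_run/sample1_R2.fastq.gz"),
]

# Only the file name is searched, so "R2_run" in the directory is ignored.
kept = list(select_for_copy(paths, excl_pattern="R2"))

# With no pattern, nothing is skipped.
kept_all = list(select_for_copy(paths))
```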
Calling @RyanBerger98 for the code review, although I would welcome any feedback you might have, @rnmitchell. Thanks!

This looks good to me!
RyanBerger98
left a comment
Updates look good and functionality is as expected. Minor comments/concerns.
I recommend updating the __len__ method of the FastqCopier class. Right now, when the exclude flag is used, ezfastq reports that it's copying 2 files when it's really only copying 1.
def __len__(self):
    return len([fastq for fastq in self])
Simplifies the length attribute and makes it more accurate.
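To see why deriving the length from iteration keeps the reported count honest, here is a minimal sketch. `CopierSketch` is a hypothetical stand-in for `FastqCopier`, and it assumes that iterating over the copier already applies the exclusion check, which the PR excerpt does not show directly.

```python
import re
from pathlib import Path


class CopierSketch:
    """Hypothetical stand-in for FastqCopier, reduced to iteration and length."""

    def __init__(self, source_paths, excl_pattern=None):
        self.source_paths = list(source_paths)
        self.excl_pattern = excl_pattern

    def __iter__(self):
        # Assumption: iteration is where excluded files are skipped.
        for source_path in self.source_paths:
            if self.excl_pattern and re.search(self.excl_pattern, source_path.name):
                continue
            yield source_path

    def __len__(self):
        # The reviewer's suggestion: count what iteration actually yields.
        return len([fastq for fastq in self])


paths = [Path("sample1_R1.fastq.gz"), Path("sample1_R2.fastq.gz")]
filtered = CopierSketch(paths, excl_pattern="R2")
unfiltered = CopierSketch(paths)
```

With this shape, `len(filtered)` counts only the files that would be copied, so a progress message based on it cannot overstate the work.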
[bold cyan]Examples:[/bold cyan]
[dim]ezfastq /path/to/fastqs/ sample1 sample2 sample3
ezfastq /path/to/fastqs/ s1:Sample1 s2:Sample2 s3:Sample3
ezfastq /path/to/fastqs/ samplenames.txt
ezfastq /path/to/fastqs/ samplenames.txt --workdir /path/to/projectdir/ --subdir seq/Run01/
Also might be worth updating the CLI epilogue to include a command using the --exclude flag
ezfastq /path/to/fastqs/ sample1 sample2 --exclude "[r,R]2"
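One detail worth checking in the suggested pattern: inside a regular expression character class, a comma is a literal character and not a separator, so `[r,R]2` also matches `,2`. The sketch below compares it with `[rR]2` on a few made-up file names (the names are hypothetical and chosen only to show the difference).

```python
import re

# Hypothetical file names, including one with a literal comma before "2".
names = [
    "sample1_R1.fastq.gz",
    "sample1_R2.fastq.gz",
    "sample1_r2.fastq.gz",
    "sample1,2.fastq.gz",
]

# The pattern as written in the suggestion: matches "r2", "R2", and ",2".
with_comma = [n for n in names if re.search("[r,R]2", n)]

# The same class without the comma: matches only "r2" and "R2".
without_comma = [n for n in names if re.search("[rR]2", n)]
```

For typical FASTQ file names the two patterns behave the same, so this is a cosmetic point and not a functional problem with the example.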
Great suggestions. I updated the |
This PR adds a new feature that skips copying for user-specified FASTQ files. It doesn't affect the initial FASTQ file discovery stage: if, for instance, ezfastq is expecting paired-end data, it will still fail if more or fewer than two FASTQ files are found for any sample. But after successful discovery, the new feature will skip copying for files matching the user-specified exclusion pattern (typically something like "R2").

Pair programming with @danejo3