Exceeds
npetrill

PROFILE

Npetrill

Worked on the broadinstitute/warp repository to enhance genome data preparation and indexing workflows, focusing on reducing errors and improving downstream analysis reliability. Addressed duplicate contig issues in FASTA files to ensure accurate genome indexing with bwa-mem2, and improved GTF file processing for STAR index generation by filtering out unnecessary lines. Introduced dynamic handling of mitochondrial accessions, enabling conditional removal of duplicate contigs to stabilize data preparation for both indexing and STAR workflows. Utilized WDL and bash scripting alongside Python for data processing and workflow management, resulting in more robust, reproducible pipelines and clearer changelogs to support collaboration and auditability.

Overall Statistics

Feature vs Bugs

67% Features

Repository Contributions

Total: 11
Bugs: 1
Commits: 11
Features: 2
Lines of code: 196
Activity months: 1

Work History

December 2025

11 Commits • 2 Features

Dec 1, 2025

December 2025 monthly summary for broadinstitute/warp. Focused on making genome data preparation and indexing more robust, to reduce errors and improve downstream analyses. Delivered three primary updates:

1) Removed the duplicate contig NC_028718.1 from the FASTA before genome indexing (bwa-mem2) to ensure accurate indexing.
2) Enhanced GTF processing for the STAR index build by removing lines whose third column is 'source', improving index quality.
3) Added dynamic mitochondrial accession handling, with conditional removal of duplicate contigs, to stabilize genome data preparation for the indexing and STAR workflows.

Impact: fewer indexing failures, more reliable and reproducible alignment, and clearer changelogs for audit and collaboration.

Technologies demonstrated: bwa-mem2, STAR, GTF cleaning, dynamic data handling, changelog maintenance, Python scripting.
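The two filtering steps described in this summary can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code committed to broadinstitute/warp: the function names `drop_contig` and `drop_gtf_feature` are hypothetical, the FASTA filter assumes that every record whose header ID matches the given accession should be dropped, and the GTF filter assumes tab-separated columns with `#` comment lines passed through unchanged. The conditional, mitochondrial-accession-driven logic is not modeled here.

```python
from typing import Iterable, Iterator


def drop_contig(fasta_lines: Iterable[str], contig_id: str) -> Iterator[str]:
    """Yield FASTA lines, skipping every record whose header ID equals contig_id.

    Hypothetical helper: assumes the record ID is the first
    whitespace-delimited token after '>' on the header line.
    """
    skipping = False
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # Start (or stop) skipping when a new record header is reached.
            skipping = bool(fields) and fields[0] == contig_id
        if not skipping:
            yield line


def drop_gtf_feature(gtf_lines: Iterable[str], feature: str = "source") -> Iterator[str]:
    """Yield GTF lines, skipping data rows whose third column equals feature.

    Hypothetical helper: comment lines (starting with '#') are kept as-is.
    """
    for line in gtf_lines:
        if not line.startswith("#"):
            cols = line.rstrip("\n").split("\t")
            if len(cols) >= 3 and cols[2] == feature:
                continue
        yield line
```

Both helpers are generators over lines, so they can be applied to open file handles without loading a whole genome FASTA or annotation file into memory; for example, `out.writelines(drop_contig(fasta_handle, "NC_028718.1"))`, where `fasta_handle` and `out` are file objects opened by the caller.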

Quality Metrics

Correctness: 92.8%
Maintainability: 85.4%
Architecture: 85.4%
Performance: 85.4%
AI Usage: 21.8%

Skills & Technologies

Programming Languages

WDL, bash

Technical Skills

bioinformatics, data processing, genome analysis, genome processing, genomic analysis, pipeline development, scripting, workflow development, workflow management

Repositories Contributed To

1 repo

Overview of all repositories you've contributed to across your timeline

broadinstitute/warp

Dec 2025 to Dec 2025
1 Month active

Languages Used

WDL, bash

Technical Skills

bioinformatics, data processing, genome analysis, genome processing, genomic analysis, pipeline development