Skip to content

HTML report/nfcore addons#4

Open
Lebaranto wants to merge 3 commits into
mainfrom
feature/nfcore-adaptation
Open

HTML report/nfcore addons#4
Lebaranto wants to merge 3 commits into
mainfrom
feature/nfcore-adaptation

Conversation

@Lebaranto

@Lebaranto Lebaranto commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Updated pipeline HTML report with fixed nfcore modules issues

Summary by CodeRabbit

  • New Features

    • Added comprehensive QC outputs including FastQC, samtools analyses (flagstat, stats, idxstats, quickcheck), and an aggregated MultiQC report.
    • Introduced a custom TrES HTML report with embedded metrics JSON for quick assessment of sequencing quality and run statistics.
    • Expanded output structure with dedicated qc/, multiqc/, and tres_report/ directories for organized result access.
  • Documentation

    • Updated output documentation to reflect new QC and report file locations and contents.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The PR adds end-to-end QC and reporting infrastructure to TrESFlow. It replaces legacy DNA duplicate-marking and coverage steps with nf-core GATK4_MARKDUPLICATES and DeepTools bamCoverage (plus normalization adapters and a zero-mapped-read guard). It installs nf-core FastQC, samtools sidecar (flagstat/stats/idxstats/quickcheck), and MultiQC modules, and adds a custom Python HTML report renderer invoked by a new TRES_REPORT_HTML module. The DNA and RNA subworkflows and the main TRESEQ workflow are rewired to pass QC artifacts through all new steps.

Changes

TrESFlow QC & Reporting Infrastructure

Layer / File(s) Summary
GATK4 markduplicates module + normalization adapter
modules/nf-core/gatk4/markduplicates/..., modules/local/normalize_dna_markduplicates/main.nf
Installs the nf-core GATK4_MARKDUPLICATES process (conda env, BAM/CRAM handling, BAM→CRAM conversion, Java heap sizing, stub) with test configs and a full test suite, plus a local normalization adapter that renames nf-core outputs to the TrESFlow *_MarkedDup.bam / *_MarkedDup.bam.bai / *.DuplicateMetrics.txt filename contract.
NoDup BAM check + DeepTools bamCoverage chain
modules/local/check_dna_nodup_bam/main.nf, modules/nf-core/deeptools/bamcoverage/..., modules/local/normalize_dna_bamcoverage/main.nf
Adds CHECK_DNA_NODUP_BAM (counts mapped reads via samtools view, emits a *.zero_mapped_nodup_bam.tsv warning and skips coverage when zero-mapped), installs the nf-core DEEPTOOLS_BAMCOVERAGE module (CRAM→BAM conversion, configurable bigWig/bedGraph output, stub), and adds NORMALIZE_DNA_BAMCOVERAGE to publish as *_NoDup.bw.
nf-core FastQC + samtools sidecar QC modules
modules/nf-core/fastqc/..., modules/nf-core/samtools/{flagstat,stats,idxstats,quickcheck}/..., modules/local/samtools_quickcheck_report/main.nf, assets/test_realdata/ligation_barcode_whitelist.txt
Installs nf-core FastQC (conda lock files for amd64/arm64, process, meta.yml, 12 test cases) and all four samtools sidecar processes (each with conda env, process, meta.yml, tests), plus a local SAMTOOLS_QUICKCHECK_REPORT adapter that converts exit codes to pass/fail TSVs; populates the barcode whitelist test asset.
nf-core MultiQC module
modules/nf-core/multiqc/...
Installs the nf-core MULTIQC module (conda env, conda lock files for amd64/arm64, process definition with dynamic CLI construction, meta.yml, five-test suite covering default/custom-prefix/external-config/multiple-config/stub variants).
TrES HTML report renderer and module
bin/render_tres_report.py, modules/local/tres_report_html/main.nf
Adds an 849-line CLI script that ingests pipeline artifacts (STARsolo summaries/logs, samtools flagstat/stats/idxstats, GATK dup-metrics, barcode TSVs) into aggregated per-library metrics, writes a metrics JSON, and renders a self-contained HTML report with metric cards, sequencing-quality table, and client-side CSV/Excel export; the TRES_REPORT_HTML Nextflow module invokes it.
RNA_CORE + DNA_CORE subworkflow rewiring
subworkflows/local/dna_core/main.nf, subworkflows/local/dna_core/meta.yml, subworkflows/local/rna_core/main.nf, modules/local/rna_starsolo_align/main.nf
DNA_CORE replaces legacy includes with nf-core GATK4/DeepTools imports, adds nfcoreDnaMeta/restoreDnaMeta helpers, wires the new markdup and coverage chains, and exposes coverage_warnings, coverage_bigwigs, and barcode_report_files. RNA_CORE exposes aligned_solo_summaries, aligned_star_logs, and barcode_report_files. RNA_STARSOLO_ALIGN gains star_log and solo_summary outputs with mock-mode stubs.
Main workflow wiring + configuration + docs
workflows/treseq.nf, conf/base.config, conf/modules.config, conf/test.config, modules.json, README.md, docs/output.md, docs/usage.md, docs/architecture/implemented_pipeline.md
Wires all new QC/reporting steps in treseq.nf (qcMeta/toFastqcInput helpers, FASTQC + samtools sidecar channels, TRES_REPORT_HTML + MULTIQC aggregation, extended emit). Updates base.config with process_single label and all new process resource blocks; modules.config with publishDir/ext.args/ext.prefix for every new module; test.config with replaced DNA mocks and samtools QC disabled via ext.when = false; modules.json registry. Expands README, output, usage, and architecture docs.

Sequence Diagram

sequenceDiagram
  rect rgba(70, 130, 180, 0.5)
    note right of TRESEQ: DNA pipeline
    TRESEQ->>GATK4_MARKDUPLICATES: splitName BAM (nfcoreDnaMeta)
    GATK4_MARKDUPLICATES-->>NORMALIZE_DNA_MARKDUPLICATES: BAM/BAI/metrics
    NORMALIZE_DNA_MARKDUPLICATES-->>SPLIT_DUPLICATES_DNA: *_MarkedDup.bam
    SPLIT_DUPLICATES_DNA-->>CHECK_DNA_NODUP_BAM: NoDup BAM + BAI
    CHECK_DNA_NODUP_BAM-->>DEEPTOOLS_BAMCOVERAGE: non-zero BAM
    CHECK_DNA_NODUP_BAM-->>coverage_warnings: zero_mapped TSV
    DEEPTOOLS_BAMCOVERAGE-->>NORMALIZE_DNA_BAMCOVERAGE: bigWig
    NORMALIZE_DNA_BAMCOVERAGE-->>TRESEQ: *_NoDup.bw
  end

  rect rgba(60, 179, 113, 0.5)
    note right of TRESEQ: Sidecar QC
    TRESEQ->>FASTQC: raw FASTQs (qcMeta)
    TRESEQ->>SAMTOOLS_FLAGSTAT: BAMs
    TRESEQ->>SAMTOOLS_STATS: BAMs
    TRESEQ->>SAMTOOLS_IDXSTATS: BAMs
    TRESEQ->>SAMTOOLS_QUICKCHECK: BAMs
    SAMTOOLS_QUICKCHECK-->>SAMTOOLS_QUICKCHECK_REPORT: exit code -> TSV
  end

  rect rgba(210, 105, 30, 0.5)
    note right of TRESEQ: End-of-run reports
    TRESEQ->>TRES_REPORT_HTML: barcode TSVs, solo summaries, star logs, dup metrics, flagstat/stats/idxstats, quickcheck TSVs, FASTQC ZIPs
    TRES_REPORT_HTML-->>TRESEQ: tres_report.html + tres_report_metrics.json
    TRESEQ->>MULTIQC: aggregated QC inputs
    MULTIQC-->>TRESEQ: multiqc_report.html
  end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 Hoppy news from the warren today,
New QC sidecar modules have come to stay!
GATK marks dups, DeepTools counts the reads,
FastQC zips and MultiQC leads.
A TrES HTML report shines bright—
The bunny has wired it all up right! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'HTML report/nfcore addons' clearly and concisely summarizes the two main objectives: HTML report enhancements and nf-core module integration, which are the primary changes evident throughout the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/nfcore-adaptation
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch feature/nfcore-adaptation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🧹 Nitpick comments (5)
bin/render_tres_report.py (1)

127-154: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider adding strict=True to zip() for data integrity.

At line 138, zip(header, parts) without strict=True will silently truncate if the header and data row have different column counts. While GATK metrics files have a consistent format, adding strict=True would catch malformed input early rather than producing incorrect metrics silently.

Suggested fix
-        values = dict(zip(header, parts))
+        values = dict(zip(header, parts, strict=True))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bin/render_tres_report.py` around lines 127 - 154, The `zip(header, parts)`
call in the read_duplicate_metrics function will silently truncate if the header
and data row have different column counts, potentially causing incorrect metrics
without any warning. Add `strict=True` as a parameter to the zip call where
`values = dict(zip(header, parts))` is assigned to ensure that mismatched column
counts are detected and caught early, rather than silently producing incomplete
or incorrect data.

Source: Linters/SAST tools

modules/nf-core/multiqc/tests/main.nf.test (1)

106-107: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Avoid network-dependent fixtures in module tests.

Line 106-107 and Line 149-150 fetch config files from GitHub at test time. This makes tests susceptible to network/transient failures. Prefer committed local fixture files under the module test assets to keep CI deterministic.

Also applies to: 149-150

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/multiqc/tests/main.nf.test` around lines 106 - 107, Remove
network dependencies from the MultiQC module tests by replacing the GitHub URL
references in the multiqc_config.yml file access with local fixture files.
Download the multiqc_config.yml file from the GitHub URL and commit it to the
module test assets directory, then update both occurrences (around lines 106-107
and 149-150) to reference the local file path instead of the remote GitHub URL
using the file() function with the local asset path. This ensures tests remain
deterministic and do not depend on network availability or transient GitHub
connectivity issues.
modules/nf-core/samtools/idxstats/meta.yml (1)

29-38: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Align declared input contract with the actual process requirements.

Line 31-37 currently documents SAM/SAI support, but modules/nf-core/samtools/idxstats/main.nf (Line 11) requires an index path in all cases. Please either narrow the documented patterns to indexed BAM/CRAM inputs or make the index optional in the process contract so docs and runtime behavior match.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/samtools/idxstats/meta.yml` around lines 29 - 38, The input
patterns declared in the meta.yml file do not match the actual process
requirements in main.nf, which mandates an index file in all cases. Narrow the
pattern specifications for both the bam and bai input definitions to remove
unsupported formats: remove SAM (sam) from the bam pattern on line 33 to include
only "*.{bam,cram}", and remove SAI (sai) from the bai pattern on line 37 to
include only "*.{bai,crai}". This aligns the documented input contract with the
actual runtime behavior that requires indexed BAM or CRAM files.
modules/nf-core/gatk4/markduplicates/main.nf (2)

29-32: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider adding def keyword for local variables.

While Nextflow allows variable assignments without def, using def for local variables (prefix and prefix_bam) is more idiomatic and makes scope explicit.

♻️ Suggested style improvement
-    prefix = task.ext.prefix ?: "${meta.id}.bam"
+    def prefix = task.ext.prefix ?: "${meta.id}.bam"

     // If the extension is CRAM, then change it to BAM
-    prefix_bam = prefix.tokenize('.')[-1] == 'cram' ? "${prefix.substring(0, prefix.lastIndexOf('.'))}.bam" : prefix
+    def prefix_bam = prefix.tokenize('.')[-1] == 'cram' ? "${prefix.substring(0, prefix.lastIndexOf('.'))}.bam" : prefix
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/gatk4/markduplicates/main.nf` around lines 29 - 32, Add the
`def` keyword before both local variable declarations for `prefix` and
`prefix_bam`. This makes the variable scope explicit and follows Nextflow
idioms. Change the assignments to use `def prefix = ...` and `def prefix_bam =
...` instead of declaring them without the keyword.

65-74: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Stub creates both output formats, while script creates only one.

The stub implementation creates both .bam and .cram outputs (plus both index types), whereas the real script produces only the format specified by prefix. This inconsistency could cause confusion when tests run in stub mode versus real mode, though it won't cause failures since outputs are marked optional.

Consider making the stub match the real behavior more closely by checking the prefix and creating only the expected output format.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/gatk4/markduplicates/main.nf` around lines 65 - 74, The stub
block unconditionally creates both BAM and CRAM output files with their indices,
while the real script creates only the format specified by the prefix variable.
Modify the stub implementation to conditionally create only the expected output
format based on the prefix. Check whether the prefix indicates BAM or CRAM
format (by examining the prefix string or the actual script's logic) and use
conditional statements in the stub block to create only the corresponding output
files and their appropriate index, making the stub behavior consistent with the
actual script behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@conf/modules.config`:
- Around line 47-50: In the GATK4_MARKDUPLICATES configuration block, the
ext.prefix parameter currently includes the .bam file extension. Remove the .bam
extension from the ext.prefix value so that it only contains
"${meta.id}_MarkedDup" instead of "${meta.id}_MarkedDup.bam". The nf-core module
will automatically append the appropriate file extension, and removing the
hardcoded extension will prevent double extensions in the output file names.

In `@conf/test.config`:
- Around line 95-98: The DEEPTOOLS_BAMCOVERAGE process block is missing the
ext.mock = true configuration that other test tasks have. Add the line ext.mock
= true inside the withName: DEEPTOOLS_BAMCOVERAGE block so that the real module
is not executed during test runs, keeping tests fast and reliable.

In `@modules/nf-core/deeptools/bamcoverage/main.nf`:
- Around line 61-64: The stub block uses the args variable without declaring it
first, while the script block properly declares args. Add a declaration of args
(likely from task.ext.args with an empty string as default) at the beginning of
the stub block, before the line where args.contains() is called. This ensures
the variable is defined when checking for bedgraph format options, matching the
pattern used in the script block.
- Around line 33-34: The CRAM detection and output naming logic uses incorrect
Nextflow Path property names with uppercase letters. Replace `input.Extension`
with `input.extension` and replace `input.BaseName` with `input.baseName` in the
lines where `is_cram` and `input_out` variables are defined, as Nextflow Path
API requires lowercase property names or method calls like `getExtension()` and
`getBaseName()`.

In `@modules/nf-core/deeptools/bamcoverage/tests/main.nf.test`:
- Line 43: The test name contains a typo where "homo_sampiens" is misspelled.
Change "homo_sampiens" to "homo_sapiens" in the test function name to correct
the species name spelling in the test method declaration.
- Line 105: Fix the typo in the test name string for the test function at this
location. The word "homo_sampiens" in the test name is misspelled and should be
corrected to "homo_sapiens" to match the correct scientific name of the human
species. Update the string passed to the test function to use the correct
spelling.
- Line 12: The test name contains a typo: "homo_sampiens" should be corrected to
"homo_sapiens" (the correct scientific name for humans). Locate the test
function and update the test name string from "homo_sampiens - bam" to
"homo_sapiens - bam" to fix the spelling error.
- Line 74: Fix the typo in the test name where "homo_sampiens" should be
corrected to "homo_sapiens" in the test function. Change the string in the
test() call from "homo_sampiens - cram - fasta" to "homo_sapiens - cram - fasta"
to match the correct species name spelling.

In `@modules/nf-core/fastqc/main.nf`:
- Around line 24-27: The `old_new_pairs` assignment has a logic error in the
single-element list handling. When `reads.size() == 1` is true, `reads` is a
list, not a Path, so accessing `reads.extension` directly will fail. Fix this by
extracting the first element from the list (using `reads[0]` or `reads.first()`)
before accessing the `.extension` property when handling the single-element case
in the ternary operator.

In `@modules/nf-core/fastqc/tests/main.nf.test`:
- Around line 133-140: The assertions in the test are checking for output
filenames that do not match the input files. The 3rd and 4th outputs in the html
and zip assertions (at indices [0][1][2] and [0][1][3]) expect test_3 and
test_4, but the actual input fixtures are test2_1.fastq.gz and test2_2.fastq.gz.
Update all four assertions in both the html and zip output blocks to expect
test2_1_fastqc and test2_2_fastqc respectively instead of test_3 and test_4 to
align with the input file names.

In `@modules/nf-core/gatk4/markduplicates/main.nf`:
- Around line 58-62: The CRAM conversion block (when prefix ends with .cram)
uses the fasta reference file with the samtools view command but lacks
validation that the fasta file exists. Add a validation check inside the if
block that tests whether the fasta variable points to an existing file, and only
execute the samtools view -Ch -T command and subsequent operations if the fasta
file is present. If fasta is missing, the block should either error out with a
clear message or skip the CRAM conversion.

In `@modules/nf-core/gatk4/markduplicates/meta.yml`:
- Around line 1-99: The metrics output section declares a pattern of
`*.{metrics.txt}` but the actual process outputs files with the pattern
`*.{metrics}` without the `.txt` suffix. Update the pattern field in the metrics
output section to match the actual file output by changing it from
`*.{metrics.txt}` to `*.{metrics}`.

In `@modules/nf-core/multiqc/meta.yml`:
- Around line 64-68: Update the pattern field for the HTML output in the MultiQC
module metadata to use a valid glob pattern. Change the pattern value from
".html" to "*.html" to correctly match actual emitted filenames like
multiqc_report.html. This ensures the pattern field accurately reflects the
output contract of the module.
- Around line 85-88: The type declaration for the `*_plots` output does not
match the actual emission type from the Nextflow process. Change the `type:
file` declaration for the `*_plots` output to `type: directory` to correctly
reflect that the process emits a directory path (via `path("*_plots")` in
main.nf) rather than a single file, ensuring the schema aligns with the actual
process contract.

In `@modules/nf-core/samtools/quickcheck/tests/main.nf.test`:
- Around line 151-152: The remote fixture URLs at lines 151-152, 172-173, and
193-194 use mutable branch references (refs/heads/develop) which can cause
non-deterministic CI behavior. Replace all occurrences of the github.com URLs
pointing to refs/heads/develop with immutable commit-SHA based URLs (e.g.,
refs/heads/develop replaced with a specific commit hash) or vendor the test
fixtures locally to ensure snapshots and expected exit codes remain stable over
time.

---

Nitpick comments:
In `@bin/render_tres_report.py`:
- Around line 127-154: The `zip(header, parts)` call in the
read_duplicate_metrics function will silently truncate if the header and data
row have different column counts, potentially causing incorrect metrics without
any warning. Add `strict=True` as a parameter to the zip call where `values =
dict(zip(header, parts))` is assigned to ensure that mismatched column counts
are detected and caught early, rather than silently producing incomplete or
incorrect data.

In `@modules/nf-core/gatk4/markduplicates/main.nf`:
- Around line 29-32: Add the `def` keyword before both local variable
declarations for `prefix` and `prefix_bam`. This makes the variable scope
explicit and follows Nextflow idioms. Change the assignments to use `def prefix
= ...` and `def prefix_bam = ...` instead of declaring them without the keyword.
- Around line 65-74: The stub block unconditionally creates both BAM and CRAM
output files with their indices, while the real script creates only the format
specified by the prefix variable. Modify the stub implementation to
conditionally create only the expected output format based on the prefix. Check
whether the prefix indicates BAM or CRAM format (by examining the prefix string
or the actual script's logic) and use conditional statements in the stub block
to create only the corresponding output files and their appropriate index,
making the stub behavior consistent with the actual script behavior.

In `@modules/nf-core/multiqc/tests/main.nf.test`:
- Around line 106-107: Remove network dependencies from the MultiQC module tests
by replacing the GitHub URL references in the multiqc_config.yml file access
with local fixture files. Download the multiqc_config.yml file from the GitHub
URL and commit it to the module test assets directory, then update both
occurrences (around lines 106-107 and 149-150) to reference the local file path
instead of the remote GitHub URL using the file() function with the local asset
path. This ensures tests remain deterministic and do not depend on network
availability or transient GitHub connectivity issues.

In `@modules/nf-core/samtools/idxstats/meta.yml`:
- Around line 29-38: The input patterns declared in the meta.yml file do not
match the actual process requirements in main.nf, which mandates an index file
in all cases. Narrow the pattern specifications for both the bam and bai input
definitions to remove unsupported formats: remove SAM (sam) from the bam pattern
on line 33 to include only "*.{bam,cram}", and remove SAI (sai) from the bai
pattern on line 37 to include only "*.{bai,crai}". This aligns the documented
input contract with the actual runtime behavior that requires indexed BAM or
CRAM files.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a845e072-2d59-4e8e-b9b3-881191005d1c

📥 Commits

Reviewing files that changed from the base of the PR and between d89a0b8 and 24b6560.

⛔ Files ignored due to path filters (8)
  • modules/nf-core/deeptools/bamcoverage/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/nf-core/fastqc/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/nf-core/gatk4/markduplicates/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/nf-core/multiqc/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/nf-core/samtools/flagstat/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/nf-core/samtools/idxstats/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/nf-core/samtools/quickcheck/tests/main.nf.test.snap is excluded by !**/*.snap
  • modules/nf-core/samtools/stats/tests/main.nf.test.snap is excluded by !**/*.snap
📒 Files selected for processing (61)
  • README.md
  • assets/test_realdata/ligation_barcode_whitelist.txt
  • bin/render_tres_report.py
  • conf/base.config
  • conf/modules.config
  • conf/test.config
  • docs/architecture/implemented_pipeline.md
  • docs/output.md
  • docs/usage.md
  • modules.json
  • modules/local/check_dna_nodup_bam/main.nf
  • modules/local/normalize_dna_bamcoverage/main.nf
  • modules/local/normalize_dna_markduplicates/main.nf
  • modules/local/rna_starsolo_align/main.nf
  • modules/local/samtools_quickcheck_report/main.nf
  • modules/local/tres_report_html/main.nf
  • modules/nf-core/deeptools/bamcoverage/environment.yml
  • modules/nf-core/deeptools/bamcoverage/main.nf
  • modules/nf-core/deeptools/bamcoverage/meta.yml
  • modules/nf-core/deeptools/bamcoverage/tests/main.nf.test
  • modules/nf-core/fastqc/.conda-lock/linux_amd64-bd-5cb1a2fa2f18c7c2_1.txt
  • modules/nf-core/fastqc/.conda-lock/linux_arm64-bd-e455e32f745abe68_1.txt
  • modules/nf-core/fastqc/environment.yml
  • modules/nf-core/fastqc/main.nf
  • modules/nf-core/fastqc/meta.yml
  • modules/nf-core/fastqc/tests/main.nf.test
  • modules/nf-core/gatk4/markduplicates/environment.yml
  • modules/nf-core/gatk4/markduplicates/main.nf
  • modules/nf-core/gatk4/markduplicates/meta.yml
  • modules/nf-core/gatk4/markduplicates/tests/bam.config
  • modules/nf-core/gatk4/markduplicates/tests/cram.config
  • modules/nf-core/gatk4/markduplicates/tests/main.nf.test
  • modules/nf-core/multiqc/.conda-lock/linux_amd64-bd-c17fb751507e9dfc_1.txt
  • modules/nf-core/multiqc/.conda-lock/linux_arm64-bd-5c84a5000a226ab5_1.txt
  • modules/nf-core/multiqc/environment.yml
  • modules/nf-core/multiqc/main.nf
  • modules/nf-core/multiqc/meta.yml
  • modules/nf-core/multiqc/tests/custom_prefix.config
  • modules/nf-core/multiqc/tests/main.nf.test
  • modules/nf-core/multiqc/tests/nextflow.config
  • modules/nf-core/samtools/flagstat/environment.yml
  • modules/nf-core/samtools/flagstat/main.nf
  • modules/nf-core/samtools/flagstat/meta.yml
  • modules/nf-core/samtools/flagstat/tests/main.nf.test
  • modules/nf-core/samtools/idxstats/environment.yml
  • modules/nf-core/samtools/idxstats/main.nf
  • modules/nf-core/samtools/idxstats/meta.yml
  • modules/nf-core/samtools/idxstats/tests/main.nf.test
  • modules/nf-core/samtools/quickcheck/environment.yml
  • modules/nf-core/samtools/quickcheck/main.nf
  • modules/nf-core/samtools/quickcheck/meta.yml
  • modules/nf-core/samtools/quickcheck/tests/main.nf.test
  • modules/nf-core/samtools/quickcheck/tests/nextflow.config
  • modules/nf-core/samtools/stats/environment.yml
  • modules/nf-core/samtools/stats/main.nf
  • modules/nf-core/samtools/stats/meta.yml
  • modules/nf-core/samtools/stats/tests/main.nf.test
  • subworkflows/local/dna_core/main.nf
  • subworkflows/local/dna_core/meta.yml
  • subworkflows/local/rna_core/main.nf
  • workflows/treseq.nf

Comment thread conf/modules.config
Comment on lines +47 to +50
withName: GATK4_MARKDUPLICATES {
ext.prefix = { "${meta.id}_MarkedDup.bam" }
ext.args = '--REMOVE_DUPLICATES false --BARCODE_TAG CB --CREATE_INDEX true --MAX_RECORDS_IN_RAM 10000000'
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Remove the .bam extension from the ext.prefix.

The ext.prefix for GATK4_MARKDUPLICATES should not include the file extension. nf-core modules append their own extensions to the prefix, so including .bam here will produce output files with double extensions like *_MarkedDup.bam.bam.

🐛 Proposed fix
-    ext.prefix = { "${meta.id}_MarkedDup.bam" }
+    ext.prefix = { "${meta.id}_MarkedDup" }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@conf/modules.config` around lines 47 - 50, In the GATK4_MARKDUPLICATES
configuration block, the ext.prefix parameter currently includes the .bam file
extension. Remove the .bam extension from the ext.prefix value so that it only
contains "${meta.id}_MarkedDup" instead of "${meta.id}_MarkedDup.bam". The
nf-core module will automatically append the appropriate file extension, and
removing the hardcoded extension will prevent double extensions in the output
file names.

Comment thread conf/test.config
Comment on lines +95 to 98
withName: DEEPTOOLS_BAMCOVERAGE {
cpus = 1
memory = '1 GB'
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

DEEPTOOLS_BAMCOVERAGE is not mocked in test profile.

Line 95 configures this process in conf/test.config, but unlike the surrounding test tasks it is missing ext.mock = true. That can execute the real module during tests and make test runs slower/flaky.

Suggested fix
     withName: DEEPTOOLS_BAMCOVERAGE {
+        ext.mock = true
         cpus = 1
         memory = '1 GB'
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@conf/test.config` around lines 95 - 98, The DEEPTOOLS_BAMCOVERAGE process
block is missing the ext.mock = true configuration that other test tasks have.
Add the line ext.mock = true inside the withName: DEEPTOOLS_BAMCOVERAGE block so
that the real module is not executed during test runs, keeping tests fast and
reliable.

Comment on lines +33 to +34
def is_cram = input.Extension == "cram" ? true : false
def input_out = is_cram ? input.BaseName + ".bam" : "${input}"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify path-property usage patterns in this module and nearby nf-core modules.
rg -n -C2 '\b(input\.Extension|input\.BaseName)\b|getExtension\(|getBaseName\(' modules/nf-core

Repository: CSOgroup/TrESFlow

Length of output: 672


🏁 Script executed:

# Read the full context of the main.nf file to understand the input variable type
head -50 modules/nf-core/deeptools/bamcoverage/main.nf

Repository: CSOgroup/TrESFlow

Length of output: 2239


🏁 Script executed:

# Search for how other nf-core modules handle file extension detection and basename extraction
rg -n '\.(getExtension|getBaseName|extension|baseName)\(\)|\bext\b|\bstem\b' modules/nf-core --type groovy -A1 -B1 | head -100

Repository: CSOgroup/TrESFlow

Length of output: 43


🌐 Web query:

Nextflow Path object properties Extension BaseName vs getExtension getBaseName Groovy

💡 Result:

In Nextflow, for Path objects, prefer the Groovy-style properties baseName and extension (e.g., path.baseName, path.extension). These correspond to the underlying methods getBaseName() and getExtension(). 1) Nextflow Path “properties” - path.baseName returns the path name without its extension; for /some/path/file.tar.gz it’s file.tar [1]. - path.extension returns the path extension without the dot; for /some/path/file.txt it’s txt [1]. - These work in Nextflow’s docs/examples as path.baseName, path.extension, and path.name [2]. 2) getBaseName() / getExtension() are the method forms - The Nextflow TaskPath type explicitly defines getBaseName() and getExtension() methods [3]. - Groovy property access like path.extension maps to the getter getExtension() (and similarly path.baseNamegetBaseName()). This is consistent with how Groovy exposes extension-method properties for Path/File (via NioExtensions with getExtension(Path) and analogous base-name support) [4][5] and with Nextflow’s TaskPath.groovy implementing getBaseName() / getExtension() [6]. 3) What about “getExtension getBaseName Groovy”? - If you call the methods directly in Groovy, use parentheses: path.getExtension() and path.getBaseName(). - If you use properties (recommended/readable): path.extension and path.baseName. 4) Practical examples in Nextflow - From Nextflow’s working-with-files docs: - assert path.baseName == 'file' - assert path.extension == 'txt' - assert path.name == 'file.txt' - assert path.parent == '/some/path' [2]. So, “Extension BaseName vs getExtension getBaseName” is simply: extension/baseName are the property-style aliases for getExtension()/getBaseName() on Nextflow Path objects.

Citations:


Use valid path API for CRAM detection and output naming.

input.Extension and input.BaseName are not valid Nextflow Path properties. In Nextflow, Path objects use lowercase extension and baseName properties, or the methods getExtension() and getBaseName(). The capitalized versions will cause script evaluation failure at runtime.

Proposed fix
-    def is_cram = input.Extension == "cram" ? true : false
-    def input_out = is_cram ? input.BaseName + ".bam" : "${input}"
+    def is_cram = input.getExtension() == 'cram'
+    def input_out = is_cram ? "${input.getBaseName()}.bam" : "${input}"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def is_cram = input.Extension == "cram" ? true : false
def input_out = is_cram ? input.BaseName + ".bam" : "${input}"
def is_cram = input.getExtension() == 'cram'
def input_out = is_cram ? "${input.getBaseName()}.bam" : "${input}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/deeptools/bamcoverage/main.nf` around lines 33 - 34, The CRAM
detection and output naming logic uses incorrect Nextflow Path property names
with uppercase letters. Replace `input.Extension` with `input.extension` and
replace `input.BaseName` with `input.baseName` in the lines where `is_cram` and
`input_out` variables are defined, as Nextflow Path API requires lowercase
property names or method calls like `getExtension()` and `getBaseName()`.

Comment on lines +61 to +64
stub:
def prefix = task.ext.prefix ?: "${meta.id}"
def extension = args.contains("--outFileFormat bedgraph") || args.contains("-of bedgraph") ? "bedgraph" : "bigWig"
"""

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify undeclared use of `args` inside stub blocks in this file.
rg -n -C3 'stub:|def args|args\.contains' modules/nf-core/deeptools/bamcoverage/main.nf

Repository: CSOgroup/TrESFlow

Length of output: 832


Define args inside stub block before using it.

The stub block uses args.contains() on line 63 without declaring it, while the parallel script block correctly declares args on line 26. This causes a runtime error in stub execution.

Proposed fix
 stub:
     def prefix    = task.ext.prefix ?: "${meta.id}"
+    def args      = task.ext.args ?: ''
     def extension = args.contains("--outFileFormat bedgraph") || args.contains("-of bedgraph") ? "bedgraph" : "bigWig"
     """
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/deeptools/bamcoverage/main.nf` around lines 61 - 64, The stub
block uses the args variable without declaring it first, while the script block
properly declares args. Add a declaration of args (likely from task.ext.args
with an empty string as default) at the beginning of the stub block, before the
line where args.contains() is called. This ensures the variable is defined when
checking for bedgraph format options, matching the pattern used in the script
block.

tag "deeptools"
tag "deeptools/bamcoverage"

test("homo_sampiens - bam") {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix typo in test name.

Test name has "homo_sampiens" instead of "homo_sapiens".

✏️ Proposed fix
-    test("homo_sampiens - bam") {
+    test("homo_sapiens - bam") {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
test("homo_sampiens - bam") {
test("homo_sapiens - bam") {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/deeptools/bamcoverage/tests/main.nf.test` at line 12, The
test name contains a typo: "homo_sampiens" should be corrected to "homo_sapiens"
(the correct scientific name for humans). Locate the test function and update
the test name string from "homo_sampiens - bam" to "homo_sapiens - bam" to fix
the spelling error.

Comment on lines +58 to +62
if [[ ${prefix} == *.cram ]]; then
samtools view -Ch -T ${fasta} -o ${prefix} ${prefix_bam}
rm ${prefix_bam}
samtools index ${prefix}
fi

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing validation: CRAM output requires fasta reference.

The CRAM conversion logic uses ${fasta} without checking if it exists. If a user configures ext.prefix to end with .cram but doesn't provide a fasta reference, the samtools view -Ch -T ${fasta} command will fail because -T requires a valid reference file argument.

🛡️ Proposed fix to add validation
     # If cram files are wished as output, the run samtools for conversion
     if [[ ${prefix} == *.cram ]]; then
+        if [[ -z "${fasta}" ]]; then
+            echo "ERROR: CRAM output requires a reference fasta file" >&2
+            exit 1
+        fi
         samtools view -Ch -T ${fasta} -o ${prefix} ${prefix_bam}
         rm ${prefix_bam}
         samtools index ${prefix}
     fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/gatk4/markduplicates/main.nf` around lines 58 - 62, The CRAM
conversion block (when prefix ends with .cram) uses the fasta reference file
with the samtools view command but lacks validation that the fasta file exists.
Add a validation check inside the if block that tests whether the fasta variable
points to an existing file, and only execute the samtools view -Ch -T command
and subsequent operations if the fasta file is present. If fasta is missing, the
block should either error out with a clear message or skip the CRAM conversion.

Comment on lines +1 to +99
name: gatk4_markduplicates
description: This tool locates and tags duplicate reads in a BAM or SAM file,
where duplicate reads are defined as originating from a single fragment of
DNA.
keywords:
- bam
- gatk4
- markduplicates
- sort
tools:
- gatk4:
description: Developed in the Data Sciences Platform at the Broad Institute,
the toolkit offers a wide variety of tools with a primary focus on variant
discovery and genotyping. Its powerful processing engine and
high-performance computing features make it capable of taking on projects
of any size.
homepage: https://gatk.broadinstitute.org/hc/en-us
documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-
tool_dev_url: https://github.com/broadinstitute/gatk
doi: 10.1158/1538-7445.AM2017-3590
licence: ["MIT"]
identifier: ""
input:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- bam:
type: file
description: Sorted BAM file
pattern: "*.{bam}"
ontologies: []
- fasta:
type: file
description: Fasta file
pattern: "*.{fasta}"
ontologies: []
- fasta_fai:
type: file
description: Fasta index file
pattern: "*.{fai}"
ontologies: []
output:
cram:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*cram":
type: file
description: Marked duplicates CRAM file
pattern: "*.{cram}"
ontologies: []
bam:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*bam":
type: file
description: Marked duplicates BAM file
pattern: "*.{bam}"
ontologies: []
crai:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.crai":
type: file
description: CRAM index file
pattern: "*.{cram.crai}"
ontologies: []
bai:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.bai":
type: file
description: BAM index file
pattern: "*.{bam.bai}"
ontologies: []
metrics:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.metrics":
type: file
description: Duplicate metrics file generated by GATK
pattern: "*.{metrics.txt}"
ontologies: []

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Pattern mismatch: metrics file extension.

Line 98 specifies the pattern as *.{metrics.txt}, but the actual process (main.nf line 52) outputs ${prefix}.metrics without the .txt suffix. The pattern should match the actual output.

📝 Proposed fix
       - "*.metrics":
           type: file
           description: Duplicate metrics file generated by GATK
-          pattern: "*.{metrics.txt}"
+          pattern: "*.metrics"
           ontologies: []
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
name: gatk4_markduplicates
description: This tool locates and tags duplicate reads in a BAM or SAM file,
where duplicate reads are defined as originating from a single fragment of
DNA.
keywords:
- bam
- gatk4
- markduplicates
- sort
tools:
- gatk4:
description: Developed in the Data Sciences Platform at the Broad Institute,
the toolkit offers a wide variety of tools with a primary focus on variant
discovery and genotyping. Its powerful processing engine and
high-performance computing features make it capable of taking on projects
of any size.
homepage: https://gatk.broadinstitute.org/hc/en-us
documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-
tool_dev_url: https://github.com/broadinstitute/gatk
doi: 10.1158/1538-7445.AM2017-3590
licence: ["MIT"]
identifier: ""
input:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- bam:
type: file
description: Sorted BAM file
pattern: "*.{bam}"
ontologies: []
- fasta:
type: file
description: Fasta file
pattern: "*.{fasta}"
ontologies: []
- fasta_fai:
type: file
description: Fasta index file
pattern: "*.{fai}"
ontologies: []
output:
cram:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*cram":
type: file
description: Marked duplicates CRAM file
pattern: "*.{cram}"
ontologies: []
bam:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*bam":
type: file
description: Marked duplicates BAM file
pattern: "*.{bam}"
ontologies: []
crai:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.crai":
type: file
description: CRAM index file
pattern: "*.{cram.crai}"
ontologies: []
bai:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.bai":
type: file
description: BAM index file
pattern: "*.{bam.bai}"
ontologies: []
metrics:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.metrics":
type: file
description: Duplicate metrics file generated by GATK
pattern: "*.{metrics.txt}"
ontologies: []
name: gatk4_markduplicates
description: This tool locates and tags duplicate reads in a BAM or SAM file,
where duplicate reads are defined as originating from a single fragment of
DNA.
keywords:
- bam
- gatk4
- markduplicates
- sort
tools:
- gatk4:
description: Developed in the Data Sciences Platform at the Broad Institute,
the toolkit offers a wide variety of tools with a primary focus on variant
discovery and genotyping. Its powerful processing engine and
high-performance computing features make it capable of taking on projects
of any size.
homepage: https://gatk.broadinstitute.org/hc/en-us
documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-
tool_dev_url: https://github.com/broadinstitute/gatk
doi: 10.1158/1538-7445.AM2017-3590
licence: ["MIT"]
identifier: ""
input:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- bam:
type: file
description: Sorted BAM file
pattern: "*.{bam}"
ontologies: []
- fasta:
type: file
description: Fasta file
pattern: "*.{fasta}"
ontologies: []
- fasta_fai:
type: file
description: Fasta index file
pattern: "*.{fai}"
ontologies: []
output:
cram:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*cram":
type: file
description: Marked duplicates CRAM file
pattern: "*.{cram}"
ontologies: []
bam:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*bam":
type: file
description: Marked duplicates BAM file
pattern: "*.{bam}"
ontologies: []
crai:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.crai":
type: file
description: CRAM index file
pattern: "*.{cram.crai}"
ontologies: []
bai:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.bai":
type: file
description: BAM index file
pattern: "*.{bam.bai}"
ontologies: []
metrics:
- - meta:
type: map
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- "*.metrics":
type: file
description: Duplicate metrics file generated by GATK
pattern: "*.metrics"
ontologies: []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/gatk4/markduplicates/meta.yml` around lines 1 - 99, The
metrics output section declares a pattern of `*.{metrics.txt}` but the actual
process outputs files with the pattern `*.{metrics}` without the `.txt` suffix.
Update the pattern field in the metrics output section to match the actual file
output by changing it from `*.{metrics.txt}` to `*.{metrics}`.

Comment on lines +64 to +68
- "*.html":
type: file
description: MultiQC report file
pattern: ".html"
ontologies: []

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use a valid glob pattern for report output.

Line 67 uses pattern: ".html", which does not match emitted filenames like multiqc_report.html. Use *.html to reflect the actual output contract.

Proposed fix
-          pattern: ".html"
+          pattern: "*.html"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- "*.html":
type: file
description: MultiQC report file
pattern: ".html"
ontologies: []
- "*.html":
type: file
description: MultiQC report file
pattern: "*.html"
ontologies: []
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/multiqc/meta.yml` around lines 64 - 68, Update the pattern
field for the HTML output in the MultiQC module metadata to use a valid glob
pattern. Change the pattern value from ".html" to "*.html" to correctly match
actual emitted filenames like multiqc_report.html. This ensures the pattern
field accurately reflects the output contract of the module.

Comment on lines +85 to +88
- "*_plots":
type: file
description: Plots created by MultiQC
pattern: "*_plots"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align plots output type with the process contract.

Line 86 declares plots as type: file, but modules/nf-core/multiqc/main.nf emits path("*_plots"), which is a directory. This cross-file mismatch can break schema/lint contract checks.

Proposed fix
   plots:
@@
       - "*_plots":
-          type: file
+          type: directory
           description: Plots created by MultiQC
           pattern: "*_plots"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- "*_plots":
type: file
description: Plots created by MultiQC
pattern: "*_plots"
- "*_plots":
type: directory
description: Plots created by MultiQC
pattern: "*_plots"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/multiqc/meta.yml` around lines 85 - 88, The type declaration
for the `*_plots` output does not match the actual emission type from the
Nextflow process. Change the `type: file` declaration for the `*_plots` output
to `type: directory` to correctly reflect that the process emits a directory
path (via `path("*_plots")` in main.nf) rather than a single file, ensuring the
schema aligns with the actual process contract.

Comment on lines +151 to +152
file('https://github.com/samtools/samtools/raw/refs/heads/develop/test/quickcheck/1.quickcheck.badeof.bam', checkIfExists: true),
]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin remote quickcheck fixtures to immutable revisions

Line 151, Line 172, and Line 193 fetch test data from refs/heads/develop, which is mutable and can make CI nondeterministic. Please switch to immutable commit-SHA URLs (or vendored local fixtures) so snapshots and expected exit codes stay stable over time.

Also applies to: 172-173, 193-194

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/nf-core/samtools/quickcheck/tests/main.nf.test` around lines 151 -
152, The remote fixture URLs at lines 151-152, 172-173, and 193-194 use mutable
branch references (refs/heads/develop) which can cause non-deterministic CI
behavior. Replace all occurrences of the github.com URLs pointing to
refs/heads/develop with immutable commit-SHA based URLs (e.g.,
refs/heads/develop replaced with a specific commit hash) or vendor the test
fixtures locally to ensure snapshots and expected exit codes remain stable over
time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants