Skip to content

Update happy module#311

Draft
georgiakes wants to merge 2 commits into
nf-core:devfrom
georgiakes:update-happy
Draft

Update happy module#311
georgiakes wants to merge 2 commits into
nf-core:devfrom
georgiakes:update-happy

Conversation

@georgiakes

Copy link
Copy Markdown
Member

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/variantbenchmarking branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@georgiakes

Copy link
Copy Markdown
Member Author

I would like to confirm whether regions_bed is used correctly with with the incoming changes.
Up until now, the regions_bed file was passed to -f. With the incoming changes this file will be used with -R.
This seems to agree with both the pipeline documentation and the happy documentation in the sense that that the are making clear that is works like bcftools -R . It this the functionality that we want to achieve?

@georgiakes

Copy link
Copy Markdown
Member Author

However, in another place in the pipeline documentation we say that the regions_bed is for defining the high confidence regions. The happy documentation , though, states that the confidence bed should be passed form -f.
It is a little bit confusing. Should we maybe change the description of the regions_bed param?

@kubranarci

Copy link
Copy Markdown
Contributor

I would like to confirm whether regions_bed is used correctly with with the incoming changes. Up until now, the regions_bed file was passed to -f. With the incoming changes this file will be used with -R. This seems to agree with both the pipeline documentation and the happy documentation in the sense that that the are making clear that is works like bcftools -R . It this the functionality that we want to achieve?

Hello, Passing regions_bed with -f was due to the errorious fix in the previous version. I confirm, we need to use -R in the pipeline to define regions file.

@kubranarci

Copy link
Copy Markdown
Contributor

However, in another place in the pipeline documentation we say that the regions_bed is for defining the high confidence regions. The happy documentation , though, states that the confidence bed should be passed form -f. It is a little bit confusing. Should we maybe change the description of the regions_bed param?

I agree that is really confusing. Can you also please fix the documentation? regions_bed is not really defining confidence regions.. Better to add that note to -f (--false-positives bed)

@kubranarci

Copy link
Copy Markdown
Contributor

@georgiakes You should also update the subworkflow tests involving happy.. :/

@georgiakes

Copy link
Copy Markdown
Member Author

Hello! I have updated the documentation for some of the BED file parameters.

Since we updated the definition of regions_bed and agreed that falsepositive_bed will serve as our high-confidence evaluation region, we will also need to adjust our vcfeval implementation.

Instead of using regions_bed and targets_bed as the primary inputs for vcfeval, we should pass falsepositive_bed to the --evaluation-regions parameter , and then use --bed-regions with whichever subset (regions_bed or targets_bed) the user has provided as input.

Do you agree with this approach?

@kubranarci

Copy link
Copy Markdown
Contributor

Hello! I have updated the documentation for some of the BED file parameters.
Since we updated the definition of regions_bed and agreed that falsepositive_bed will serve as our high-confidence evaluation region, we will also need to adjust our vcfeval implementation.
Instead of using regions_bed and targets_bed as the primary inputs for vcfeval, we should pass falsepositive_bed to the --evaluation-regions parameter , and then use --bed-regions with whichever subset (regions_bed or targets_bed) the user has provided as input.
Do you agree with this approach?

Hello, sorry I took me a while to understand what you mean :D

What you suggests makes me sense, but we should make this very clear to he user. Since, falsepositive_bed doesnt sound like high confidence. What would you think we also change the general input variable from falsepositive_bed to high_confidence_bed ? That will make it more clear?

@georgiakes

Copy link
Copy Markdown
Member Author

I agree! The name as it right now doesn't help. high_confidence_bed is better.

@kubranarci

Copy link
Copy Markdown
Contributor

Okay, lets do that, But we have to be carefull with all the processes using regions and targets files.. For example, cnv part also uses those non intuitively. But this will be a good change to make all more understandable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants