Unable to change type of FORMAT field despite updating header #1321

Open
kjaisingh opened this issue Dec 12, 2024 · 0 comments

I have a VCF with a FORMAT field as follows:

##FORMAT=<ID=EV,Number=1,Type=Integer,Description="Classes of evidence supporting final genotype">

I then update the header line by doing the following:

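import pysam

# vcf_in is an already-open pysam.VariantFile for the input VCF.
header = vcf_in.header.copy()
new_header = pysam.VariantHeader()

# Copy every header line except the original FORMAT/EV definition.
for line in header.records:
    if line.type == 'FORMAT' and line.get('ID') == 'EV':
        continue
    new_header.add_line(str(line))
# Re-add EV with Type=String instead of Type=Integer.
new_header.add_line('##FORMAT=<ID=EV,Number=1,Type=String,Description="Classes of evidence supporting final genotype">')

vcf_out = pysam.VariantFile(args.output_vcf, 'w', header=new_header)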
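The header rewrite above retypes the replacement line by hand. As a minimal sketch of the same step done at the string level, so that Number and Description carry over unchanged, something like the following could be used. The helper name `retype_format_line` is hypothetical and not part of pysam; it assumes the `Type=` attribute appears before the quoted Description, as it does in the EV line above.

```python
import re


def retype_format_line(line: str, field_id: str, new_type: str) -> str:
    """Return `line` with its Type replaced if it is the ##FORMAT definition of `field_id`.

    Hypothetical helper: any other header line is returned unchanged.
    """
    if not line.startswith("##FORMAT=<"):
        return line
    # Only touch the definition whose ID matches exactly (EV, not EVX).
    if not re.search(rf"[<,]ID={re.escape(field_id)}[,>]", line):
        return line
    # Replace the first Type=... attribute; assumed to precede the Description.
    return re.sub(r"(?<=[<,])Type=[^,>]+", f"Type={new_type}", line, count=1)
```

In the loop, each `str(line)` could be passed through `retype_format_line(..., 'EV', 'String')` before `add_line`, instead of skipping the EV line and re-adding it separately.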

If I now set record.samples[X]['EV'] to a string value for a given sample X, I still get the following error:

TypeError: invalid value for Integer format

This prevents me from writing new values to a genotype field whose type I have changed, even though this seems like a reasonable thing to support.

My thinking is that a genotype field's value should be validated when the record is written to the output VCF file. In this case the write would be valid, because the output file is built with new_header, which carries the updated ##FORMAT=<ID=EV line.
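A likely explanation is that each record read from vcf_in stays bound to the input header, so the assignment is checked against the original Integer definition regardless of what new_header says. One possible workaround, sketched below, is to rebuild each record against the output header and convert the value during the copy. The function name `copy_record_to_header` and its `converters` argument are hypothetical; the sketch assumes pysam's `VariantHeader.new_record` keyword arguments and that the output header already lists the same samples (for example via `add_sample`). It has not been run against the file from this report.

```python
def copy_record_to_header(record, out_header, converters=None):
    """Rebuild `record` against `out_header`, converting selected FORMAT values.

    Assumes `record` behaves like pysam.VariantRecord and `out_header` like
    pysam.VariantHeader with the same samples already added.
    `converters` maps a FORMAT key to a callable applied to each sample's value.
    """
    converters = converters or {}
    # INFO/END is derived from `stop`, so it is not copied as an ordinary key.
    info = {k: v for k, v in record.info.items() if k != "END"}
    new_record = out_header.new_record(
        contig=record.contig,
        start=record.start,
        stop=record.stop,
        alleles=record.alleles,
        id=record.id,
        qual=record.qual,
        filter=list(record.filter.keys()),
        info=info,
    )
    for sample_name, source in record.samples.items():
        target = new_record.samples[sample_name]
        for key, value in source.items():
            convert = converters.get(key)
            # Leave missing values (None) untouched.
            if convert is not None and value is not None:
                value = convert(value)
            target[key] = value
        # Phasing is stored separately from the GT alleles.
        if "GT" in source:
            target.phased = source.phased
    return new_record
```

With this, the write step would look like `vcf_out.write(copy_record_to_header(record, new_header, {'EV': str}))`, so the string value is only ever assigned to a record whose header declares EV as String.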
