Per read info for partial ssAAV and partial scAAV #4

Open
dpaudel-tb opened this issue May 31, 2024 · 5 comments
Labels
question Further information is requested

Comments

@dpaudel-tb

Ask away!

I was looking at the _per_read_info.tsv file and it outputs per read subgenome type. For partials it only lists as Partial. However, the report separates them as Partial_scAAV and Partial_ssAAV. Is there a file where we can get access to which reads are separated as which partials?

@dpaudel-tb added the question label May 31, 2024
@nrhorner
Contributor

nrhorner commented Jun 3, 2024

Hi @dpaudel-tb

The genome subtypes in the *per_read_info.tsv should contain more granular assignments than those you see in the report.

For example, the test data results include this entry:

transgene_cassette_7875	Partial ICG - no ITRs	-	sample_1

Please note the file is tab-separated, so there are 4 fields here:

transgene_cassette_7875
Partial ICG - no ITRs
-
sample_1

Do you see something similar? If not could you send me a snippet of the affected file?
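
As a small illustration of the tab-separated layout, the sketch below splits the example entry in Python. Only the example line comes from this thread; the variable names are illustrative. It also shows that splitting on general whitespace instead of tabs breaks the subtype label into several pieces.

```python
# The example entry from the test data, with the tabs written explicitly.
line = "transgene_cassette_7875\tPartial ICG - no ITRs\t-\tsample_1\n"

# Split on tabs only, so labels that contain spaces stay in one field.
fields = line.rstrip("\n").split("\t")
for index, value in enumerate(fields, start=1):
    print(index, value)

# Splitting on any whitespace instead breaks the subtype label apart.
whitespace_tokens = line.split()
print(len(fields), len(whitespace_tokens))
```

Tools that split on whitespace by default would treat `Partial`, `ICG`, `no` and `ITRs` as separate columns, so it is worth setting the delimiter to a tab explicitly.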

@dpaudel-tb
Author

Thank you @nrhorner for the details. I was only looking at column 2 and missed the other parts. Looking at the complete breakdown, I am listing below a count of reads for each subtype listed there:

   2940 3` ICG
  97704 3` SBG
    425 3` SBG (incomplete)
     20 3` SBG (incomplete) asymmetrical
     16 3` SBG (incomplete) symmetrical
   3165 3` SBG asymmetrical
   2377 3` SBG symmetrical
 235986 5` ICG
   2863 5` SBG
     31 5` SBG (incomplete)
    139 5` SBG (incomplete) asymmetrical
     60 5` SBG (incomplete) symmetrical
  13908 5` SBG asymmetrical
  15755 5` SBG symmetrical
      1 Assigned_genome_subtype extra_info sample_id
  12805 Complex
  31981 Full_scAAV
  61969 Full_ssAAV
    586 GDM
    566 ITR single strand
    240 ITR1 concatemer ITR1 concatemer
    384 Partial ICG - incomplete ITRs
   8051 Partial ICG - no ITRs
      2 Partial ICG - part of both ITRs
   9043 SBG (unresolved)
     63 Unknown

The qc-report shows the summary as follows:

Assigned_genome_type count percentage sample_id
Partial ssAAV 247947 49.48 unique_nanofilt_q10_l300
Partial scAAV 145506 29.04 unique_nanofilt_q10_l300
Full_ssAAV 61969 12.37 unique_nanofilt_q10_l300
Full_scAAV 31981 6.38 unique_nanofilt_q10_l300
Complex 12805 2.56 unique_nanofilt_q10_l300
ITR region only 806 0.16 unique_nanofilt_q10_l300
Unknown 65 0.01 unique_nanofilt_q10_l300

For the Full_ssAAV, Full_scAAV, and Complex, the count of reads matches the output from per_read_info.tsv. However, for Partial ssAAV and Partial scAAV, I do not see corresponding counts.

@nrhorner
Contributor

nrhorner commented Jun 3, 2024

The Partial ssAAV number is aggregated from the following sub-categories (note: Partial ICG - part of both ITRs is not included, but should be):

2940	 3` ICG
235986	 5` ICG
586	GDM
384	 Partial ICG - incomplete ITRs
8051	 Partial ICG - no ITRs
---
247947    Partial ssAAV

scAAV is aggregated from all the SBG subcategories.

We will be updating the docs soon with some more detailed explanation of the various AAV genome types.
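
The aggregation described above can be checked against the per-read counts posted earlier in this thread. The sketch below is an illustration only, not the workflow's actual implementation: the helper `report_category` and the rule "any label containing SBG counts as Partial scAAV" are assumptions based on the explanation in this comment.

```python
from collections import Counter

# Per-subtype read counts copied from the breakdown earlier in this thread.
subtype_counts = {
    "3` ICG": 2940,
    "3` SBG": 97704,
    "3` SBG (incomplete)": 425,
    "3` SBG (incomplete) asymmetrical": 20,
    "3` SBG (incomplete) symmetrical": 16,
    "3` SBG asymmetrical": 3165,
    "3` SBG symmetrical": 2377,
    "5` ICG": 235986,
    "5` SBG": 2863,
    "5` SBG (incomplete)": 31,
    "5` SBG (incomplete) asymmetrical": 139,
    "5` SBG (incomplete) symmetrical": 60,
    "5` SBG asymmetrical": 13908,
    "5` SBG symmetrical": 15755,
    "GDM": 586,
    "Partial ICG - incomplete ITRs": 384,
    "Partial ICG - no ITRs": 8051,
    "Partial ICG - part of both ITRs": 2,
    "SBG (unresolved)": 9043,
}

# Sub-categories currently summed into Partial ssAAV, as listed above.
SSAAV_SUBTYPES = {
    "3` ICG",
    "5` ICG",
    "GDM",
    "Partial ICG - incomplete ITRs",
    "Partial ICG - no ITRs",
}


def report_category(subtype):
    """Map a per-read subtype to a report category (hypothetical helper)."""
    if "SBG" in subtype:
        return "Partial scAAV"
    if subtype in SSAAV_SUBTYPES:
        return "Partial ssAAV"
    return None  # not part of either partial category under this rule


totals = Counter()
for subtype, n in subtype_counts.items():
    category = report_category(subtype)
    if category is not None:
        totals[category] += n

print(totals["Partial ssAAV"])  # 247947, matching the report
print(totals["Partial scAAV"])  # 145506, matching the report

# If "Partial ICG - part of both ITRs" were included as intended:
corrected_ssaav = totals["Partial ssAAV"] + subtype_counts["Partial ICG - part of both ITRs"]
print(corrected_ssaav)  # 247949
```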

@dpaudel-tb
Author

Thank you @nrhorner. I was able to map this from the script. The only remaining confusion was Partial ICG - part of both ITRs, which looks like it is incorrectly reported as Unknown.
Thanks for the clarification.
-Dev
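
The numbers posted in this thread are consistent with that reading. The short check below uses only counts quoted above. Grouping the two ITR subtypes under "ITR region only" is an inference from the arithmetic, not something stated in the thread.

```python
# Per-read counts quoted earlier in this thread.
per_read = {
    "ITR single strand": 566,
    "ITR1 concatemer ITR1 concatemer": 240,
    "Unknown": 63,
    "Partial ICG - part of both ITRs": 2,
}

# Counts from the qc-report summary quoted earlier in this thread.
report = {"ITR region only": 806, "Unknown": 65}

# Assumed grouping: both ITR subtypes feed "ITR region only".
itr_total = per_read["ITR single strand"] + per_read["ITR1 concatemer ITR1 concatemer"]

# The two extra "Unknown" reads match the misassigned Partial ICG subtype.
unknown_total = per_read["Unknown"] + per_read["Partial ICG - part of both ITRs"]

print(itr_total == report["ITR region only"])
print(unknown_total == report["Unknown"])
```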

@nrhorner
Contributor

nrhorner commented Jun 3, 2024

No problem. I'll let you know when a fix is out for that incorrect annotation.

Neil
