Fix: update dropped strains file to list accession instead of strain names #26

Merged
merged 5 commits into from
Feb 13, 2024
2 changes: 1 addition & 1 deletion phylogenetic/config/config_dengue.yaml
@@ -2,7 +2,7 @@ strain_id_field: "accession"
 display_strain_field: "strain"

 filter:
-  exclude: "config/dropped_strains.txt"
+  exclude: "config/exclude.txt"
   group_by: "year region"
   min_length: 5000
   sequences_per_group:
69 changes: 0 additions & 69 deletions phylogenetic/config/dropped_strains.txt

This file was deleted.

50 changes: 50 additions & 0 deletions phylogenetic/config/exclude.txt
@@ -0,0 +1,50 @@
# Format: <GenBank accession> [# <strain name> <reason>]
JF260983 # DENV/SPAIN/EEB17/2009 # sylvatic according to https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3149010/
HE795086 # DENV1/FRANCE/00475/2008 # 2008/00475
EF457905 # DENV1/MALAYSIA/P1244/1972 # P72-1244
GU131762 # DENV1/VIETNAM/BIDV3990/2008 # DENV-1/VN/BID-V3990/2008
EU482536 # DENV1/VIETNAM/BIDV992/2006 # DENV-1/VN/BID-V992/2006
KX702403 # DENV2/HAITI/DENGUEVIRUS2HOMOSAPIENS1/2016
MW946564 # DENV2/SENEGAL/0674/1970 # SENDAK-HD-10674
DENV2/TRINIDAD_AND_TOBAGO/NA/1953 # Perhaps from https://pubmed.ncbi.nlm.nih.gov/13351628/, but did not search far for the accession
KY923048 # D2Sab2015 # miscategorized # DENV2/MALAYSIA/SAB/2015
KX274130 # QML22 # miscategorized # DENV2/AUSTRALIA/QML22/2015
EF105383 # DAK_Ar_A1247 # sylvatic # DENV2/COTE_D_IVOIRE/DAKARA1247/1980
EF105382 # Dak_Ar_2039 # sylvatic # DENV2/BURKINA_FASO/DAKAR2039/1980
EF105380 # Dak_Ar_578 # sylvatic # DENV2/COTE_D_IVOIRE/DAKAR578/1980
EF105381 # DAK_Ar_510 # sylvatic # DENV2/COTE_D_IVOIRE/DAKAR510/1980
EF105378 # PM33974 # sylvatic # DENV2/GUINEA/PM33974/1981
EF105386 # Dak_Ar_A2022 # sylvatic # DENV2/BURKINA_FASO/DAKARA2022/1980
EF105389 # Dak_Ar_141069 # sylvatic # DENV2/SENEGAL/DAKAR141069/1999
EF105390 # Dak_Ar_141070 # sylvatic # DENV2/SENEGAL/DAKAR141070/1999
EF457904 # Dak_Ar_D75505 # sylvatic # DENV2/SENEGAL/DAKARD75505/1999
EF105384 # Dak_HD_10674 # sylvatic
EF105385 # Dak_Ar_D20761 # sylvatic # DENV2/SENEGAL/DAKAR0761/1974
EF105388 # IBH11664 # sylvatic # DENV2/NIGERIA/IBH11664/1966
EF105387 # IBH11208 # sylvatic # DENV2/NIGERIA/IBH11208/1966
EU003591 # IBH11234 # sylvatic # DENV2/NIGERIA/IBH11234/1966
EF105379 # P8_1407 # sylvatic # DENV2/MALAYSIA/P81407/1970
JF262779 # P75_514 # sylvatic # DENV4/MALAYSIA/P514/1975
JF262780 # P73_1120 # sylvatic # DENV4/MALAYSIA/P731120/1973
EF457906 # P75_215 # sylvatic # DENV4/MALAYSIA/P215/1975
FJ467493 # DKD811 # sylvatic # DENV2/MALAYSIA/DKD811/2008
EF051521 # ZS01/01 # metadata issue
MT929160 # Vero # cell line
MH048676 # MS13002673 # too divergent
MH048674 # MS11011405 # too divergent
MT597439 # V43257 # too divergent
MN448607 # KDC0574A2_06/02/2011 # too divergent
ON046268 # 00178/03 # too divergent
ON046278 # 00759/12 # too divergent
ON046276 # 00988/11 # too divergent
ON046273 # 01113/10 # too divergent
ON046270 # 01224/04 # too divergent
ON046274 # 01231/10 # too divergent
ON046272 # 01488/09 # too divergent
ON046271 # 01542/04 # too divergent
MZ284953 # dev1 # too divergent
MZ215848 # DKE_121 # too divergent
MW946564 # SENDAK_HD_10674 # sylvatic
OK605757 # DENV2_1_DAK_HD_76395 # sylvatic
MW945427 # DENV3/PUERTORICO/1963/PRS_228762_AC27 # too divergent
OM258630 # PR_6 # too divergent
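
Since the point of this change is that every entry in exclude.txt leads with an accession, a quick check of the file can catch entries that still lead with a strain name or that appear twice. The following is a minimal sketch, not part of the PR: the function name `check_exclude_lines` and the accession pattern are assumptions made for illustration, and it assumes that everything after the first `#` on a line is treated as a comment.

```python
import re

# Assumption: a loose pattern for GenBank nucleotide accessions
# (one or two uppercase letters, then 5 to 8 digits, optional ".version").
ACCESSION_RE = re.compile(r"^[A-Z]{1,2}\d{5,8}(\.\d+)?$")


def check_exclude_lines(lines):
    """Return (accessions, problems) for the lines of an exclude file.

    problems is a list of (line_number, identifier, reason) tuples.
    """
    accessions = []
    problems = []
    seen = set()
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Skip blank lines and full-line comments such as the format header.
        if not line or line.startswith("#"):
            continue
        # Assumed convention: the ID is the text before the first "#".
        identifier = line.split("#", 1)[0].strip()
        if not ACCESSION_RE.match(identifier):
            problems.append((number, identifier, "not an accession"))
        elif identifier in seen:
            problems.append((number, identifier, "duplicate"))
        else:
            seen.add(identifier)
            accessions.append(identifier)
    return accessions, problems


# A few lines copied from the diff above.
sample = [
    "# Format: <GenBank accession> [# <strain name> <reason>]",
    "KX702403 # DENV2/HAITI/DENGUEVIRUS2HOMOSAPIENS1/2016",
    "MW946564 # DENV2/SENEGAL/0674/1970 # SENDAK-HD-10674",
    "DENV2/TRINIDAD_AND_TOBAGO/NA/1953 # Perhaps from https://pubmed.ncbi.nlm.nih.gov/13351628/, but did not search far for the accession",
    "MW946564 # SENDAK_HD_10674 # sylvatic",
]
accessions, problems = check_exclude_lines(sample)
for number, identifier, reason in problems:
    print(f"line {number}: {identifier} ({reason})")
```

On this sample the check flags the DENV2/TRINIDAD_AND_TOBAGO/NA/1953 entry, which has no accession, and the second MW946564 entry, which repeats an earlier one. To check the real file, pass the result of `open("phylogenetic/config/exclude.txt").read().splitlines()` instead of `sample`.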