-
Notifications
You must be signed in to change notification settings - Fork 562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a ksw_global_end function. #92
Open
jkbonfield
wants to merge
2
commits into
attractivechaos:master
Choose a base branch
from
jkbonfield:ksw_global_end
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is as per ksw_global, but permits control over whether gaps at the start and/or end of query and target sequences are penalised. For example: GTTTTTTAC vs GTTTTTTAC || | || GT------C GTC------
I appreciate this isn't easy to study, so if you wish to see what the differences are between the functions, then here is what it looks like with ksw_global replaced by ksw_global_end:
I confess looking at it again this way I'm unsure of the band calculation. It may be should only set best_score when within 'w' of tlen, but it's hard to understand. |
jkbonfield
force-pushed
the
ksw_global_end
branch
from
October 25, 2017 14:19
7993c3e
to
b6af267
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is as per ksw_global, but permits control over whether gaps at the start and/or end of query and target sequences are penalised.
For experimentation I also added another test main() function, although I'm happy for this to be dropped of course. To demonstrate it I compare GTC vs GATTTTTC and GTTTTTAC, in either order. Initially I penalise end gaps ($e == 1 in my command line):
In every case we see a large gap and align the few bases either side. With $e == 0 so that end gaps aren't penalised, the score is higher and we see a mismatch preferred instead:
Note I cannot explain the alignments I get from the existing ksw_global function. Is my test harness processing this correctly? If I call ksw_global instead of ksw_global_end in my main function (see the commented out line) then I get an invalid score and truncated alignments in some of these combinations. I don't know why this is so.
A possible improvement is to express ksw_global in terms of ksw_global_end (it's just 4 extra arguments all set to 1), but I left it as-is for you to decide whether the slight performance difference is worth it.
James