All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.
- Fixed bug in AbstractEpistasis.preferences with returnformat of 'tidy'. Previously the wildtype was set incorrectly for missing values.
- The new AbstractEpistasis.single_mut_effects method.
- Options returnformat and stringency_param to AbstractEpistasis.preferences and utils.scores_to_prefs.
- AbstractEpistasis.preferences and utils.scores_to_prefs return site as integer.
- Errors related to using pandas.query for nan values. Not sure of the cause, but the errors are fixed now.
- Eliminated the default log base for conversion of scores / phenotypes. Base 2 gave excessively flat preferences, and the choice of base is something the user should think about explicitly. Added an explanation of the consequences of this choice to the docs and examples.
- The preferences returned by scores_to_prefs and AbstractEpistasis.preferences are now naturally sorted by site.
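
For readers unfamiliar with the term, natural sorting orders site labels by their numeric value rather than character by character. The following is a minimal standard-library sketch of that idea, not the package's implementation; the helper name `natural_key` and the example site labels are hypothetical.

```python
import re


def natural_key(site):
    """Split a site label into integer and text chunks for natural ordering.

    Hypothetical helper for illustration only.
    """
    parts = re.split(r"(\d+)", str(site))
    # Tag each chunk so integers and strings are never compared directly.
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts if p != ""]


sites = ["10", "2", "52a", "1", "52"]

# Plain string sorting compares character by character, so "10" precedes "2".
lexical = sorted(sites)

# Natural sorting compares the numeric chunks as integers.
ordered = sorted(sites, key=natural_key)
```

Here `lexical` places "10" before "2", while `ordered` places the sites in numeric order with "52a" directly after "52".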
- The new AbstractEpistasis.preferences method gets amino-acid preferences from phenotypes.
- Added utils.scores_to_prefs.
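
As a rough illustration of why the log base matters when converting scores to preferences, the sketch below assumes that each preference is proportional to the base raised to the score and that preferences at a site sum to one. That functional form, the function name `scores_to_prefs_sketch`, and the example scores are assumptions made for illustration; the real utils.scores_to_prefs takes additional options and may differ in detail.

```python
def scores_to_prefs_sketch(scores, base):
    """Convert scores at one site to preferences that sum to 1.

    Assumes preference is proportional to base ** score (an illustrative
    assumption, not necessarily the package's exact definition).
    """
    weights = {aa: base ** score for aa, score in scores.items()}
    total = sum(weights.values())
    return {aa: weight / total for aa, weight in weights.items()}


# Hypothetical scores for three amino acids at a single site.
site_scores = {"A": 0.0, "G": -1.0, "P": -3.0}

prefs_base2 = scores_to_prefs_sketch(site_scores, base=2)
prefs_base10 = scores_to_prefs_sketch(site_scores, base=10)
```

Under this assumption a smaller base spreads the preferences more evenly across amino acids, so the most favored amino acid receives a lower preference with base 2 than with base 10.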
- The ispline module now uses a simple dict-implemented cache rather than methodtools.lru_cache. This fixes excess memory usage and allows objects to be pickled. AbstractEpistasis internally clears the cache via __getstate__ to reduce the size of pickled objects, which avoids pickled models being huge. Also added the clearcache option to AbstractEpistasis.fit to serve a similar purpose of memory savings.
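
The general pattern described in this entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: the class `CachedModelSketch`, its `_cache` attribute, and its methods are hypothetical names.

```python
import pickle


class CachedModelSketch:
    """Hypothetical object with a dict-based cache that is not pickled."""

    def __init__(self):
        self._cache = {}

    def expensive(self, x):
        """Return x squared, storing the result in a plain dict."""
        if x not in self._cache:
            self._cache[x] = x * x
        return self._cache[x]

    def clear_cache(self):
        """Drop cached values to free memory."""
        self._cache.clear()

    def __getstate__(self):
        """Return the state used for pickling, with an empty cache."""
        state = self.__dict__.copy()
        state["_cache"] = {}
        return state


model = CachedModelSketch()
model.expensive(3)

# pickle calls __getstate__ to decide what is serialized.
state = model.__getstate__()

if __name__ == "__main__":
    restored = pickle.loads(pickle.dumps(model))
    print(restored._cache)  # the cache is not carried into the pickle
```

A plain dict keeps the cache as ordinary instance state, so it can be emptied explicitly or excluded when the object is serialized.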
- Added additional forms of likelihood function to the global epistasis models. This involves substantial refactoring of the epistasis models in globalepistasis. In particular, the MonotonicSplineEpistasis and NoEpistasis classes are no longer fully concrete subclasses of AbstractEpistasis. Instead, there are also likelihood calculation subclasses (GaussianLikelihood and CauchyLikelihood), and the concrete subclasses inherit from both an epistasis function and a likelihood calculation subclass. So for instance, what was previously MonotonicSplineEpistasis (with Gaussian likelihood assumed) is now MonotonicSplineEpistasisGaussianLikelihood. Note that this is an API-breaking change.
- Added the narrow_bottleneck.ipynb notebook to demonstrate use of the Cauchy likelihood for analysis of experiments with a lot of noise.
- Added the predict_variants.ipynb notebook to demonstrate prediction of variant phenotypes using global epistasis models.
- Added simulate.codon_muts.
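
The class layout described in the refactoring entry above can be sketched with multiple inheritance. Everything in this block is hypothetical and simplified: the class names carry a `Sketch` suffix to distinguish them from the real globalepistasis classes, and the method names `epistasis_func`, `log_density`, and `loglik` are assumptions chosen for illustration.

```python
import math
from abc import ABC, abstractmethod


class AbstractEpistasisSketch(ABC):
    """Hypothetical base class with shared log-likelihood logic."""

    @abstractmethod
    def epistasis_func(self, latent):
        """Map a latent phenotype to a predicted observed phenotype."""

    @abstractmethod
    def log_density(self, residual, scale):
        """Log density of one residual under the noise model."""

    def loglik(self, latents, observed, scale):
        """Sum log densities of observed minus predicted phenotypes."""
        return sum(
            self.log_density(obs - self.epistasis_func(lat), scale)
            for lat, obs in zip(latents, observed)
        )


class NoEpistasisSketch(AbstractEpistasisSketch):
    """Epistasis function only: observed phenotype equals latent phenotype."""

    def epistasis_func(self, latent):
        return latent


class GaussianLikelihoodSketch(AbstractEpistasisSketch):
    """Likelihood calculation only: Gaussian noise."""

    def log_density(self, residual, scale):
        return (-0.5 * math.log(2 * math.pi * scale ** 2)
                - residual ** 2 / (2 * scale ** 2))


class CauchyLikelihoodSketch(AbstractEpistasisSketch):
    """Likelihood calculation only: heavy-tailed Cauchy noise."""

    def log_density(self, residual, scale):
        return -math.log(math.pi * scale * (1 + (residual / scale) ** 2))


class NoEpistasisGaussianSketch(NoEpistasisSketch, GaussianLikelihoodSketch):
    """Concrete class combining one epistasis function and one likelihood."""


class NoEpistasisCauchySketch(NoEpistasisSketch, CauchyLikelihoodSketch):
    """Concrete class combining one epistasis function and one likelihood."""


latents = [0.0, 1.0, 2.0]
observed = [0.1, 0.9, 12.0]  # the last measurement is an outlier

gaussian_ll = NoEpistasisGaussianSketch().loglik(latents, observed, scale=1.0)
cauchy_ll = NoEpistasisCauchySketch().loglik(latents, observed, scale=1.0)
```

In this sketch neither half is usable alone: instantiating `NoEpistasisSketch` or `GaussianLikelihoodSketch` directly raises `TypeError` because one abstract method is still missing. The outlier lowers the Gaussian log likelihood far more than the Cauchy one, which is the motivation for a heavy-tailed likelihood on noisy data.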
- Some minor fixes to codonvariant_sim_data.ipynb.
- Added utils.tidy_to_corr.
- Added binarymap module.
- Added globalepistasis module.
- Added ispline module.
- Order of rows in data frames from CodonVariantTable.func_scores.
- Updated codonvariant_sim_data.ipynb to be smaller and fit global epistasis models, and moved plot formatting examples to a new dedicated notebook.
- Changed SigmoidPhenotypeSimulator so that the enrichment is a sigmoidal function of the latent phenotype, and the observed phenotype is the log (base 2) of the latent phenotype. This change harmonizes the simulator with the definitions in the new globalepistasis module. Also changed the input to the latentPhenotype and observedPhenotype methods. Note that these are backwards-incompatible changes.
- Removed use of deprecated Bio.Alphabet.
- Capabilities to parse barcodes from Illumina data: FASTQ readers and IlluminaBarcodeParser.
- CodonVariantTable.numCodonMutsByType method to get numerical values for codon mutations per variant.
- Can specify names of columns when initializing a CodonVariantTable.
- CodonVariantTable.func_scores now takes libraries rather than combine_libs argument.
- Added CodonVariantTable.add_sample_counts_df method.
- Added CodonVariantTable.plotVariantSupportHistogram method.
- Added CodonVariantTable.avgCountsPerVariant and CodonVariantTable.plotAvgCountsPerVariant methods.
- Added custom plotnine theme in plotnine_themes and improved formatting of plots from CodonVariantTable.
- Added sample_rename parameter to CodonVariantTable plotting methods.
- Added syn_as_wt to CodonVariantTable.classifyVariants.
- Added random_seq and mutate_seq to simulate module.
- Changed how variant_call_support is set in simulate_CodonVariantTable.
- Better xlimits on CodonVariantTable.plotCumulMutCoverage.
- Docs / formatting in Jupyter notebooks.
- Fixed bugs that arose when pandas updated to 0.25 (related to groupby no longer dropping empty categories).
- Bugs in CodonVariantTable histogram plots when samples set.
Initial release. Ported code from dms_tools2 and made some improvements.