Merge visualization updates (#41) (#42)
* Develop merge into master to sync commits. (#41)

* Rework readme (#6)

* Renamed actions and added badges to readme

* Switched slash for dash

* Fix import structure (#10)

* Fixed import structure

* Removed pull_request target as push also seems to trigger on pull request?

* Fix visualization helpers (#13)

* Fixed bugs

* Added pull_request again as trigger for workflow

* Update build_deploy.yml

* Fix compressing core and installed pre-commit (#15)

* Pre-commit test

* Added local hook for zipping core

* Added .zip and .tar.gz autozips into core_archives folder

* Zip test

* tar.gz test

* Compressing works, only really ugly workaround for moving files

* Referred to the archives in the README

* Fix datasets (#17)

* Release v0.2a1 (#14)

* Rework readme (#6)

* Renamed actions and added badges to readme

* Switched slash for dash

* Fix import structure (#10)

* Fixed import structure

* Removed pull_request target as push also seems to trigger on pull request?

* Fix visualization helpers (#13)

* Fixed bugs

* Added pull_request again as trigger for workflow

* Deleted old log file and changed setup version

* Restructured dataset files and added package_data to setup.py, also changed max-line-length for black to 79

* New setup for loading datasets

* Added dataset inclusion

* Added archiving for datasets

* Test for matrix include (#18)

* Test for matrix include

* Added runOns variable

* Changed to a custom action

* Added inputFile

* Copied literal line

* Changed strategy order

* Check only ubuntu

* Extended matrix with all OS and extra CIBW_BUILD ENV flag

* Added default of only python 3.6 builds for 64-bit if push is not to master

* Added enter to README

* Release v0.2a2 (#20)

* Release v0.2a1 (#14)

* Rework readme (#6)

* Renamed actions and added badges to readme

* Switched slash for dash

* Fix import structure (#10)

* Fixed import structure

* Removed pull_request target as push also seems to trigger on pull request?

* Fix visualization helpers (#13)

* Fixed bugs

* Added pull_request again as trigger for workflow

* Deleted old log file and changed setup version

* Changed master trigger to release trigger

* Moved wildcard

* Moved wildcard of JMESPath

* Added dot

* Trying starts_with

* Switched arguments

* Excluded python 2.7 pypy

* New jmespath filter test

* Changed version number

* Add PyTest to repo and CI (#21)

* Added first pytest script for protein class and a script for performing local tests.

* Added pytest in CI

* Forgot -r for file

* Version of pytest with pytest at end of pipeline

* New pytest CI where pip installs local package

* Fixed tabs and added caching of python/pip environment

* Removed dot and added removal of build dirs to manage clean command

* Added cache ignores for env setup and ids for the caches

* Fixed cache IDs

* Renamed cache because GitHub does not allow for clearing caches...

* Moved python setup to be before cache loading

* Run that will install the dependencies

* Uncommented the cache-hit detection for installing dependencies

* Updated pre-commit version in hope that runner will create new cache

* Added more tests and added flake8 incompatibility

* Other flake8 config try

* Reset to .flake8 file

* Downgrade of pre-commit to force dep. installation in CI

* Different pre-commit version

* Added depth_first tests

* Added depth_first_bnb tests

* Push to try and install all dependencies correctly

* Added back cache check for installing dependencies. New way of calling flake8, added class dependencies on tests

* Removed ls

* Core change test

* Core change test - new correctly

* Core archiving works

* Fixed pytest ordering and upgraded pandas version to trigger new cache

* Upgraded pandas

* Changed pandas version to 1.1.0

* Removed caching of CI and moved code to new PR

* Add CI caching for python environments (#23)

* Added first pytest script for protein class and a script for performing local tests.

* Added pytest in CI

* Forgot -r for file

* Version of pytest with pytest at end of pipeline

* New pytest CI where pip installs local package

* Fixed tabs and added caching of python/pip environment

* Removed dot and added removal of build dirs to manage clean command

* Added cache ignores for env setup and ids for the caches

* Fixed cache IDs

* Renamed cache because GitHub does not allow for clearing caches...

* Moved python setup to be before cache loading

* Run that will install the dependencies

* Uncommented the cache-hit detection for installing dependencies

* Updated pre-commit version in hope that runner will create new cache

* Added more tests and added flake8 incompatibility

* Other flake8 config try

* Reset to .flake8 file

* Downgrade of pre-commit to force dep. installation in CI

* Different pre-commit version

* Added depth_first tests

* Added depth_first_bnb tests

* Push to try and install all dependencies correctly

* Added back cache check for installing dependencies. New way of calling flake8, added class dependencies on tests

* Removed ls

* Core change test

* Core change test - new correctly

* Core archiving works

* Fixed pytest ordering and upgraded pandas version to trigger new cache

* Upgraded pandas

* Changed pandas version to 1.1.0

* Dependency check

* new flake8 installation

* Flake8 action

* Added pip update flag

* Cleaned up the flake8 action usage

* Trying to install new deps

* Removed caching, only using dependencies during pytest

* Add documentation to project (#24)

* Ran the sphinx quickstart, ignored mypy on docs

* First version of docs

* Added installation instructions for python

* Finished installation page and added quickstart info (not done yet)

* Finished v1 of the quickstart guide

* Removed heterogeneous setup page and added todo for creating example

* Added manpages for the datasets and algorithms

* Added helpers and visualize documentation

* Added placeholders for the Protein properties

* Added methods of Protein to the documentation

* Small changes

* Reworked the README

* Added whitespace for enter

* Added github star and filler-logo

* Starting on logo

* Added logo

* Removed github fork banner

* Logo test

* Changed logo loading

* New test

* Trying image tag

* Trying image tag 2.0

* Trying image tag 2.0

* Trying relative link

* New size

* New logo test

* New logo try

* New logo

* Reworked logo

* Downgraded matplotlib to allow CI pipeline

* Added editable logo, fixed small rst things, fixed compression of cores

* Added some figures, added reference to license, added license

* Started on AminoAcid class (#28)

* Started on AminoAcid class

* Added comments in core, still bugs to sort out

* Fixed more bugs in the core when adding AminoAcid class

* Moved part of bind

* Integrated AminoAcid class and fixed Protein tests

* Mid way testing for new depth-first approach

* Added local check script and depth_first works for HPPH

* Fixed depth_first search new version

* Fixed depth_first_bnb algorithm using new system

* Fixed small core bugs, working on new logo

* Mid-way of changing bond_value structure

* Cleaned up bugs from intermediate version. Bumped versions of requirements. Introduced max_weights string for keeping track of possible future scores. Merged bond_symmetry model setup with the else model setup. Fixed bug with cur_len of protein always being 1. Fixed getting the weight of an amino bond

* Bumped python version in github workflow for matplotlib version dependency

* Bumped workflow python version to 3.9 as numpy 1.23 requires so

* Removed h_idxs from prune function. Nothing has been tested

* Fixed Protein signature for pybind build

* Setup for debugging current protein issues

* Introduced core testing code

* Finished test script for amino acids

* Started on Protein core tests

* Fixed protein test compilation

* Fixed first couple protein generation checks

* Added more debug statements for core tests

* Fixed bugs with bond checks

* Fixed generation of weighted amino maps

* Finished all protein generation test

* Added debug options in script to run core tests with gdb

* Added removal of amino tests

* Updated some pytest asserts. Added pytest and core_test asserts for score updates

* Updated reference for black in pre-commit

* Try adding core build in github actions pipeline

* Fixed local algorithm core tests

* Fixed dfs_bnb

* Fixed pybind11 change to providing protein pointers

* Added special compilation case for MacOS

* Changed minimum Python version to 3.9 as 3.11 will release soon

* Changed always build to only build python 3.9 versions

* Changed CIwheel builds to be specific instead of exclusion-based

* Leaving documentation as is and adding issue for the future

* Added more licensing references

* Added config file for rtfd to set python version to 3.9

* Temp commit to switch branch

* Update core merge (#33)

* Started on AminoAcid class

* Added comments in core, still bugs to sort out

* Fixed more bugs in the core when adding AminoAcid class

* Moved part of bind

* Integrated AminoAcid class and fixed Protein tests

* Mid way testing for new depth-first approach

* Added local check script and depth_first works for HPPH

* Fixed depth_first search new version

* Fixed depth_first_bnb algorithm using new system

* Fixed small core bugs, working on new logo

* Mid-way of changing bond_value structure

* Cleaned up bugs from intermediate version. Bumped versions of requirements. Introduced max_weights string for keeping track of possible future scores. Merged bond_symmetry model setup with the else model setup. Fixed bug with cur_len of protein always being 1. Fixed getting the weight of an amino bond

* Bumped python version in github workflow for matplotlib version dependency

* Bumped workflow python version to 3.9 as numpy 1.23 requires so

* Removed h_idxs from prune function. Nothing has been tested

* Fixed Protein signature for pybind build

* Setup for debugging current protein issues

* Introduced core testing code

* Finished test script for amino acids

* Started on Protein core tests

* Fixed protein test compilation

* Fixed first couple protein generation checks

* Added more debug statements for core tests

* Fixed bugs with bond checks

* Fixed generation of weighted amino maps

* Finished all protein generation test

* Added debug options in script to run core tests with gdb

* Fixed the last_pos indexing error from place_amino. Changed the 'changed' variable to 'solutions_found'-like variable

* Solved typing issues

* Finished protein movement tests

* Added removal of amino tests

* Updated some pytest asserts. Added pytest and core_test asserts for score updates

* Updated reference for black in pre-commit

* Fixed score generation of core_test

* Fixed score update through removal

* Try adding core build in github actions pipeline

* Started on algorithm code

* Changed signature of depth_first to use pointers. Added testing code for 2d and 3d depth_first tests

* Fixed local algorithm core tests

* Fixed dfs_bnb

* Fixed pybind11 change to providing protein pointers

* Added dynamic_lookup for linking python in case of undefined symbols

* New way of setting -undefined flag

* Updated the way of passing -undefined setting for macos

* Added special compilation case for MacOS

* Saving Furo update for issue

* Changed minimum Python version to 3.9 as 3.11 will release soon

* Updated versions for wheel deployment

* Changed always build to only build python 3.9 versions

* Changed CIwheel builds to be specific instead of exclusion-based

* Leaving documentation as is and adding issue for the future

* Added more licensing references

* Added config file for rtfd to set python version to 3.9

* Removed commented code

* Split visualization function to allow for multiple styles

* Fixed function for plotting proteins

* Fixed bug where no bonds would form

* Fixed paper style

* V1 paper style plots

* Fixed first paper visualization. Added visualization to test set

* Ran pre-commit
okkevaneck authored Nov 10, 2022
1 parent 8ad32ad commit 0736eff
Showing 9 changed files with 262 additions and 77 deletions.
Binary file modified archives/prospr_core.tar.gz
Binary file not shown.
Binary file modified archives/prospr_core.zip
Binary file not shown.
Binary file modified archives/prospr_data.tar.gz
Binary file not shown.
Binary file modified archives/prospr_data.zip
Binary file not shown.
12 changes: 12 additions & 0 deletions manage.sh
@@ -75,6 +75,18 @@ case "$1" in
         echo "~ Running core tests.."
         ./"$COREDIR/tests/run_tests.sh" "$2"
         ;;
+    # Test visualizations without building the Python interfaces.
+    "test_visualize")
+        echo "~ Running visualize tests.."
+        echo "~ Uninstalling old prospr.."
+        pip uninstall -qy prospr
+        echo "~ Installing new prospr.."
+        pip install -q .
+        python tests/visualize/test_visualization.py
+        echo "~ Uninstalling old prospr.."
+        pip uninstall -qy prospr
+        echo "~ Done running tests!"
+        ;;
     # Test core without building the Python interfaces.
     "debug_core")
         echo "~ Running core tests.."
3 changes: 2 additions & 1 deletion prospr/core/core_module.cpp
@@ -39,7 +39,7 @@ PYBIND11_MODULE(prospr_core, m) {
         .def(py::init<const std::string, int, const std::string,
                       std::map<std::string, int>, bool &>(),
              "Protein constructor", py::arg("sequence"), py::arg("dim")=2,
-             py::arg("model")="", py::arg("bond_values")=bond_values,
+             py::arg("model")="HP", py::arg("bond_values")=bond_values,
              py::arg("bond_symmetry")=true)
         .def_property_readonly("solutions_checked",
                                &Protein::get_solutions_checked)
@@ -52,6 +52,7 @@ PYBIND11_MODULE(prospr_core, m) {
         .def_property_readonly("last_pos", &Protein::get_last_pos)
         .def_property_readonly("score", &Protein::get_score)
         .def_property_readonly("sequence", &Protein::get_sequence)
+        .def_property_readonly("max_weights", &Protein::get_max_weights)

         .def("get_amino", &Protein::get_amino,
              "Get amino index and next direction from amino at given position",
7 changes: 4 additions & 3 deletions prospr/helpers.py
@@ -19,9 +19,10 @@ def get_scoring_aminos(protein):
     amino_acid = protein.get_amino(cur_pos)
     idx = amino_acid.index
     next_dir = amino_acid.next_move
+    max_weights = protein.max_weights

     # Store origin if it may score points.
-    if protein.is_hydro(idx):
+    if max_weights[idx] < 0:
         score_pos[tuple(cur_pos)] = np.array([0, next_dir], dtype=np.int64)

     while next_dir != 0:
@@ -36,7 +37,7 @@ def get_scoring_aminos(protein):
         next_dir = fold

         # Save amino if it may score points.
-        if protein.is_hydro(idx):
+        if max_weights[idx] < 0:
             score_pos[tuple(cur_pos)] = np.array(
                 [prev_dir, next_dir], dtype=np.int64
             )
@@ -52,7 +53,7 @@ def get_scoring_pairs(protein):
     # Get dictionary with the amino's that can possibly score points.
     score_aminos = get_scoring_aminos(protein)

-    # Sort positions from bottom-left to upper-rigth.
+    # Sort positions from bottom-left to upper-right.
     moves = np.array([m for m in range(1, protein.dim + 1)])
     pairs = np.empty((1, 2, protein.dim), dtype=np.int64)
273 changes: 200 additions & 73 deletions prospr/visualize.py
@@ -17,18 +17,13 @@
import pandas as pd


def _plot_protein_2d(protein, ax):
def _plot_aminos_2d_basic(protein, df, ax):
"""
:param protein:
:param ax:
Plot amino acids in basic style in a 2D figure.
:param Protein protein: Protein object to plot the hash of.
:param DataFrame df: DataFrame with all ordered positions.
:param Axes ax: Axis to plot on.
"""
# Setup dataframe containing the data and set types for the coordinates.
df = pd.DataFrame(
get_ordered_positions(protein), columns=["x", "y", "Type"]
)
df = df.astype({"x": "int32", "y": "int32"})

ax.plot(df["x"], df["y"], color="black", alpha=0.65, zorder=1)
sns.scatterplot(
x="x",
@@ -55,44 +50,72 @@ def _plot_protein_2d(protein, ax):
color="indianred",
alpha=0.9,
zorder=1,
lw=1.5,
lw=2,
)

# Set axis labels.
ax.set_title(f"2D conformation with {protein.score} energy")
ax.set_xlabel("x-axis", fontsize=13)
ax.set_ylabel("y-axis", fontsize=13)
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
ax.yaxis.set_major_locator(MaxNLocator(integer=True))

# Remove title from legend and add item for bonds.
handles, labels = ax.get_legend_handles_labels()
score_patch = Line2D(
[],
[],
color="indianred",
linestyle=":",
alpha=0.9,
label="Contact",
lw=1.5,
def _plot_aminos_2d_paper(protein, df, ax):
"""
Plot amino acids in paper style in a 2D figure.
:param Protein protein: Protein object to plot the hash of.
:param DataFrame df: DataFrame with all ordered positions.
:param Axes ax: Axis to plot on.
"""
# Split dataframe on amino acid type.
df_H = df.loc[df["Type"] == "H"]
df_P = df.loc[df["Type"] == "P"]

ax.plot(df["x"], df["y"], color="black", alpha=0.65, zorder=1)
sns.scatterplot(
x="x",
y="y",
data=df_H,
marker="o",
edgecolor="royalblue",
s=80,
zorder=2,
ax=ax,
label="H",
)
sns.scatterplot(
x="x",
y="y",
data=df_P,
marker="o",
facecolor="white",
edgecolor="orange",
linewidth=2,
s=80,
zorder=2,
ax=ax,
label="P",
)
handles.append(score_patch)
labels.append(score_patch.get_label())
ax.legend(handles=handles, labels=labels)

# Plot dotted lines between the aminos that increase the stability.
pairs = get_scoring_pairs(protein)

def _plot_protein_3d(protein, ax):
"""
for pos1, pos2 in pairs:
ax.plot(
[pos1[0], pos2[0]],
[pos1[1], pos2[1]],
linestyle=":",
color="indianred",
alpha=0.9,
zorder=1,
lw=2,
)

# Remove axis, and position legend in the upper right with created space.
ax.axis("off")

:param protein:
:param ax:
"""
# Setup dataframe containing the data and set types for the coordinates.
df = pd.DataFrame(
get_ordered_positions(protein), columns=["x", "y", "z", "Type"]
)
df = df.astype({"x": "int32", "y": "int32", "z": "int32"})

def _plot_aminos_3d_basic(protein, df, ax):
"""
Plot amino acids in basic style in a 3D figure.
:param Protein protein: Protein object to plot the hash of.
:param DataFrame df: DataFrame with all ordered positions.
:param Axes ax: Axis to plot on.
"""
# Split dataframe on amino acid type.
df_H = df.loc[df["Type"] == "H"]
df_P = df.loc[df["Type"] == "P"]
@@ -132,17 +155,129 @@ def _plot_protein_3d(protein, ax):
color="indianred",
alpha=0.9,
zorder=1,
lw=1.5,
lw=2,
)

# Set axis labels and tics.
ax.set_title(f"3D conformation with {protein.score} energy")
ax.set_xlabel("x-axis", fontsize=13)
ax.set_ylabel("y-axis", fontsize=13)
ax.set_zlabel("z-axis", fontsize=13)
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
ax.yaxis.set_major_locator(MaxNLocator(integer=True))
ax.zaxis.set_major_locator(MaxNLocator(integer=True))

def _plot_aminos_3d_paper(protein, df, ax):
"""
Plot amino acids in paper style in a 3D figure.
:param Protein protein: Protein object to plot the hash of.
:param DataFrame df: DataFrame with all ordered positions.
:param Axes ax: Axis to plot on.
"""
# Split dataframe on amino acid type.
df_H = df.loc[df["Type"] == "H"]
df_P = df.loc[df["Type"] == "P"]

ax.plot(df["x"], df["y"], df["z"], color="black", alpha=0.65, zorder=1)

sns.scatterplot(
df_H["x"],
df_H["y"],
df_H["z"],
data=df_H,
marker="o",
edgecolor="royalblue",
s=60,
zorder=2,
ax=ax,
label="H",
)
sns.scatterplot(
df_P["x"],
df_P["y"],
df_P["z"],
data=df_P,
marker="o",
facecolor="white",
edgecolor="orange",
linewidth=2,
s=60,
zorder=2,
ax=ax,
label="P",
)

# Plot dotted lines between the aminos that increase the stability.
pairs = get_scoring_pairs(protein)

for pos1, pos2 in pairs:
ax.plot(
[pos1[0], pos2[0]],
[pos1[1], pos2[1]],
[pos1[2], pos2[2]],
linestyle=":",
color="indianred",
alpha=0.9,
zorder=1,
lw=2,
)

# Remove axis, and position legend in the upper right with created space.
ax.axis("off")


def plot_protein(protein, style="basic", ax=None, show=True):
"""
Plot conformation of a protein.
:param Protein protein: Protein object to plot the hash of.
:param [str] style: What style to plot the proteins in.
:param Axes ax: Axis to plot Protein on.
"""
# Catch unplottable dimensions.
if protein.dim != 2 and protein.dim != 3:
raise RuntimeError(
f"Cannot plot the structure of a protein with "
f"dimension '{protein.dim}'"
)

# Create axis to plot onto if not given.
if ax is None:
if style == "paper":
fig = plt.figure(figsize=(4, 2.5))
else:
fig = plt.figure(figsize=(5, 6))
sns.set_style("whitegrid")

if protein.dim == 2:
ax = fig.gca()
else:
ax = fig.gca(projection="3d")

# Fetch data in right dimension.
if protein.dim == 2:
df = pd.DataFrame(
get_ordered_positions(protein), columns=["x", "y", "Type"]
)
df = df.astype({"x": "int32", "y": "int32"})
else:
df = pd.DataFrame(
get_ordered_positions(protein), columns=["x", "y", "z", "Type"]
)
df = df.astype({"x": "int32", "y": "int32", "z": "int32"})

# Plot the selected style.
if style == "paper":
if protein.dim == 2:
_plot_aminos_2d_paper(protein, df, ax)
else:
_plot_aminos_3d_paper(protein, df, ax)
elif style == "basic":
ax.set_xlabel("x-axis", fontsize=13)
ax.set_ylabel("y-axis", fontsize=13)
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
ax.yaxis.set_major_locator(MaxNLocator(integer=True))

# Plot dimension specific.
if protein.dim == 2:
ax.set_title(f"2D conformation with {protein.score} energy")
_plot_aminos_2d_basic(protein, df, ax)
else:
ax.set_title(f"3D conformation with {protein.score} energy")
ax.set_zlabel("z-axis", fontsize=13)
ax.zaxis.set_major_locator(MaxNLocator(integer=True))
_plot_aminos_3d_basic(protein, df, ax)

# Remove title from legend and add item for bonds.
handles, labels = ax.get_legend_handles_labels()
@@ -152,33 +287,25 @@ def _plot_protein_3d(protein, ax):
color="indianred",
linestyle=":",
alpha=0.9,
label="Contact",
lw=1.5,
label="Bond",
lw=2,
)
handles.append(score_patch)
labels.append(score_patch.get_label())
ax.legend(handles=handles, labels=labels)


def plot_protein(protein):
"""
Plot conformation of a protein.
:param Protein protein: Protein object to plot the hash of.
"""
fig = plt.figure(figsize=(6, 5))
sns.set_style("whitegrid")

# Plot data according to used dimension.
if protein.dim == 2:
ax = fig.gca()
_plot_protein_2d(protein, ax)
elif protein.dim == 3:
ax = fig.gca(projection="3d")
_plot_protein_3d(protein, ax)
else:
raise RuntimeError(
f"Cannot plot the structure of a protein with "
f"dimension '{protein.dim}'"
# Style legend according to plotting style.
if style == "paper":
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.7, box.height])
ax.legend(
handles=handles,
labels=labels,
loc="upper left",
bbox_to_anchor=(1, 1),
)
else:
ax.legend(handles=handles, labels=labels)

plt.show()
# Show plot if specified.
if show:
plt.show()
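
Note on the visualize.py change: `plot_protein` now branches on both `style` and `protein.dim`. As far as the diff shows, a `style` other than `"basic"` or `"paper"` matches neither branch and still reaches the legend code. One possible way to make the selection explicit is a lookup table that rejects unknown combinations. This is a sketch only: the stub plotters and `select_plotter` are hypothetical names standing in for the `_plot_aminos_*` helpers, and the stubs return strings instead of drawing.

```python
def _plot_2d_basic(protein):
    return f"2D basic plot of {protein}"


def _plot_3d_basic(protein):
    return f"3D basic plot of {protein}"


def _plot_2d_paper(protein):
    return f"2D paper plot of {protein}"


def _plot_3d_paper(protein):
    return f"3D paper plot of {protein}"


# Map every supported (style, dimension) pair to its plotting helper.
PLOTTERS = {
    ("basic", 2): _plot_2d_basic,
    ("basic", 3): _plot_3d_basic,
    ("paper", 2): _plot_2d_paper,
    ("paper", 3): _plot_3d_paper,
}


def select_plotter(style, dim):
    """Return the helper for this style and dimension, or fail loudly."""
    try:
        return PLOTTERS[(style, dim)]
    except KeyError:
        raise ValueError(
            f"Cannot plot style '{style}' in dimension '{dim}'"
        ) from None


result = select_plotter("paper", 3)("HPPH")
```

A table like this keeps the dimension check and the style check in one place, so adding a new style means adding entries and not another `if`/`elif` level.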