
Architecture

Jeff Flatten edited this page Mar 28, 2024 · 40 revisions

Overview

The basic concepts of tmol's architecture are:

  • The PoseStack - This is a collection of Poses (structures) that are the focus of the work that tmol will be performing.
  • The ScoreFunction - A function that will evaluate a PoseStack with one or more ScoreTerms.
  • The Minimizer - A gradient-descent algorithm that modifies degrees-of-freedom of a PoseStack to minimize the value of a ScoreFunction.
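The way these three pieces fit together can be sketched with toy stand-ins. This is only an illustration under stated assumptions: the names Pose, PoseStack, harmonic_term, score_function and minimize below are hypothetical and are not tmol's API, each "pose" is reduced to a list of 1-D coordinates, and the minimizer is plain fixed-step gradient descent.

```python
# Toy stand-ins only: none of these names are tmol's API.
from typing import List

Pose = List[float]       # a "pose" here is just a list of 1-D coordinates
PoseStack = List[Pose]   # a "pose stack" is a batch of poses


def harmonic_term(pose: Pose) -> float:
    """Hypothetical score term: penalise coordinates far from the origin."""
    return sum(x * x for x in pose)


def score_function(stack: PoseStack, weight: float = 1.0) -> List[float]:
    """Return one weighted score per pose in the stack."""
    return [weight * harmonic_term(pose) for pose in stack]


def minimize(stack: PoseStack, step: float = 0.1, n_steps: int = 50) -> PoseStack:
    """Fixed-step gradient descent on harmonic_term (d/dx of x^2 is 2x)."""
    current = [list(pose) for pose in stack]  # copy, leave the input untouched
    for _ in range(n_steps):
        current = [[x - step * 2.0 * x for x in pose] for pose in current]
    return current


stack = [[1.0, -2.0], [3.0]]
before = score_function(stack)
after = score_function(minimize(stack))
```

Every pose in the batch is scored and updated in the same pass, which is the point of grouping poses into a stack.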

PoseStack

A PoseStack represents a set (a batch) of molecular systems. PoseStacks may be loaded directly from PDB files, or may come from other sources such as RosettaFold. Each "Pose" within the stack is composed of some number of "blocks".

Blocks

tmol breaks poses down into sub-structures called 'blocks' (a generalization of the concept of 'residue' from Rosetta). Each block is one of several "block types". Block types comprise a set of atoms, the properties of those atoms, the chemical bonds between those atoms, and the inter-block chemical bonds that will join blocks together. The molecular system records the block type for each block, the coordinates of each atom, and how each block is connected (or not connected) to other blocks.
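The description above maps naturally onto two small records, one for a block type and one for a pose. The sketch below is an assumption about how one might model that data, not tmol's actual classes: ToyBlockType, ToyPose, bonded_atom_pairs and the connection labels "up" and "down" are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ToyBlockType:
    """Hypothetical block type: atoms, intra-block bonds, connection points."""
    name: str
    atom_names: Tuple[str, ...]
    bonds: Tuple[Tuple[str, str], ...]  # bonds between atoms inside the block
    connection_atoms: Dict[str, str]    # connection label -> atom name


@dataclass
class ToyPose:
    """Hypothetical pose: block types, coordinates, inter-block connectivity."""
    block_types: List[ToyBlockType]                     # one entry per block
    coords: List[List[Tuple[float, float, float]]]      # per block, per atom
    inter_block_bonds: List[Tuple[int, str, int, str]]  # (block_i, conn_i, block_j, conn_j)


def bonded_atom_pairs(pose: ToyPose):
    """Resolve each inter-block bond to the pair of atoms it joins."""
    pairs = []
    for block_i, conn_i, block_j, conn_j in pose.inter_block_bonds:
        atom_i = pose.block_types[block_i].connection_atoms[conn_i]
        atom_j = pose.block_types[block_j].connection_atoms[conn_j]
        pairs.append(((block_i, atom_i), (block_j, atom_j)))
    return pairs


# A glycine-like backbone block, used twice and joined C -> N.
gly = ToyBlockType(
    name="GLY",
    atom_names=("N", "CA", "C", "O"),
    bonds=(("N", "CA"), ("CA", "C"), ("C", "O")),
    connection_atoms={"down": "N", "up": "C"},
)
pose = ToyPose(
    block_types=[gly, gly],
    coords=[[(0.0, 0.0, 0.0)] * 4, [(3.8, 0.0, 0.0)] * 4],  # placeholder coordinates
    inter_block_bonds=[(0, "up", 1, "down")],
)
```

Because the block type carries the atom and bond definitions, the pose itself only needs to store which type each block is, where its atoms are, and which connection points are joined.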

Each "Pose" in the stack can hold as many residues as desired, and these blocks can be bonded together into one or more chains.

Note

tmol will be more efficient in computing scores for systems that are approximately the same size than for systems that have very different sizes.
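One way to see why similar sizes help is to assume that a batch is stored with every pose padded out to the size of the largest one. That storage layout is an assumption made for this sketch, and padding_fraction is a hypothetical helper, not a tmol function.

```python
def padding_fraction(n_blocks_per_pose):
    """Fraction of slots that are padding if every pose is padded to the largest."""
    max_blocks = max(n_blocks_per_pose)
    total_slots = max_blocks * len(n_blocks_per_pose)
    used_slots = sum(n_blocks_per_pose)
    return 1.0 - used_slots / total_slots


similar = padding_fraction([100, 100, 100])  # no wasted slots
mixed = padding_fraction([10, 100])          # nearly half the slots are padding
```

Under this assumption, a stack mixing a 10-block pose with a 100-block pose spends 45% of its slots on padding, while a stack of equally sized poses spends none.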

Each PoseStack holds an object that stores the residue-type data for its residues. Most users will not interact directly with this object, but it is good to be aware that it is there. Its class, PackedBlockTypes, is built from a collection of RefinedResidueType objects, which are either constructed from tmol's default set of residue types (described in a .yaml file) or created programmatically by enterprising developers. Most users will be content to use the PackedBlockTypes object returned by tmol.default_packed_block_types(), if they want to think about this class at all. The PackedBlockTypes object is also used to cache energy-term-specific data that is needed during energy evaluation. Creating this data can be somewhat slow, so it is most efficient to share a single PackedBlockTypes object between multiple PoseStacks.
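The benefit of sharing can be sketched with a toy cache. The classes below are hypothetical stand-ins (ToyPackedBlockTypes, ToyPoseStack and data_for_term are not tmol names), and the "slow" setup is simulated with a trivial computation; the only point is that term-specific data is built once per shared object rather than once per stack.

```python
class ToyPackedBlockTypes:
    """Hypothetical stand-in that caches per-term derived data."""

    def __init__(self, block_type_names):
        self.block_type_names = tuple(block_type_names)
        self._term_cache = {}
        self.n_builds = 0  # counts how many times the slow setup ran

    def data_for_term(self, term_name):
        if term_name not in self._term_cache:
            self.n_builds += 1
            # Stand-in for slow, term-specific preprocessing.
            self._term_cache[term_name] = {
                name: len(name) for name in self.block_type_names
            }
        return self._term_cache[term_name]


class ToyPoseStack:
    """Hypothetical stack that refers to a (possibly shared) block-type object."""

    def __init__(self, packed_block_types, block_type_inds):
        self.packed_block_types = packed_block_types
        self.block_type_inds = block_type_inds


shared = ToyPackedBlockTypes(["ALA", "GLY"])
stack_a = ToyPoseStack(shared, [[0, 1]])
stack_b = ToyPoseStack(shared, [[1, 1, 0]])

# Both stacks ask for the same term's data; the setup runs only once.
stack_a.packed_block_types.data_for_term("toy_term")
stack_b.packed_block_types.data_for_term("toy_term")
```

Had each stack been given its own ToyPackedBlockTypes, the setup would have run twice.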

The coordinates of a PoseStack can be modified after construction; however, all other data members must be left unaltered. If you want to modify the residue type information for an existing PoseStack, you should construct a new PoseStack object instead.
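This rule can be illustrated with a frozen dataclass whose coordinate list is still mutable in place. The sketch is only an analogy: FrozenToyPoseStack and with_block_types are hypothetical names, and nothing here implies that tmol enforces the rule this way.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FrozenToyPoseStack:
    """Hypothetical stack: attributes cannot be rebound, coords can be edited in place."""
    block_types: Tuple[str, ...]
    coords: List[List[float]]


def with_block_types(stack, new_block_types):
    """Build a new stack rather than editing block types on the old one."""
    return FrozenToyPoseStack(
        block_types=tuple(new_block_types),
        coords=[list(xyz) for xyz in stack.coords],  # copy the coordinates
    )


stack = FrozenToyPoseStack(
    block_types=("ALA", "GLY"),
    coords=[[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]],
)
stack.coords[1][0] = 2.0  # updating coordinates in place is fine
new_stack = with_block_types(stack, ("ALA", "SER"))
```

Attempting `stack.block_types = ...` on the frozen object raises an error, which mirrors the requirement to construct a new PoseStack instead.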

ScoreFunction

The ScoreFunction calculates the energies of a PoseStack as a weighted sum of ScoreTerms. Many ScoreTerms are provided. Some represent physical forces, such as electrostatics and van der Waals interactions, while others are statistical terms, such as the probability of observing a given pair of backbone torsion angles in Ramachandran space. New score terms may also be added.
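The weighted sum itself is simple to sketch. In the example below the two terms, their names and their weights are invented for illustration and are not tmol score terms; poses are again reduced to lists of 1-D coordinates.

```python
def toy_spread(pose):
    """Hypothetical one-body-like term: sum of squared coordinates."""
    return sum(x * x for x in pose)


def toy_contact(pose):
    """Hypothetical two-body-like term: number of pairs closer than 1.0."""
    n = len(pose)
    return float(
        sum(
            1
            for i in range(n)
            for j in range(i + 1, n)
            if abs(pose[i] - pose[j]) < 1.0
        )
    )


def weighted_score(stack, terms, weights):
    """Return one total per pose: the sum over terms of weight * term(pose)."""
    return [
        sum(weights[name] * term(pose) for name, term in terms.items())
        for pose in stack
    ]


terms = {"spread": toy_spread, "contact": toy_contact}
weights = {"spread": 1.0, "contact": 2.0}
totals = weighted_score([[0.0, 0.5, 3.0], [2.0]], terms, weights)
```

Adding a new term in this sketch means adding one entry to terms and one to weights, which is the same shape of extension the text describes.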

When a ScoreFunction evaluates a PoseStack, each ScoreTerm breaks the work down on a per-block or per-block-pair basis, depending on whether the ScoreTerm is a 1-body or 2-body term. All of this per-block and per-block-pair work is dispatched simultaneously, letting the GPU handle the scheduling of the individual block-based calculations.
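The decomposition can be sketched by listing the work items that a 1-body and a 2-body term would produce for a small stack. The helpers below are hypothetical, and the choice of unordered pairs within each pose is an assumption; the text does not say how tmol enumerates or prunes pairs. On a GPU, the items in each list would be launched together rather than looped over.

```python
from itertools import combinations


def one_body_work(n_blocks_per_pose):
    """One work item per (pose, block)."""
    return [
        (pose, block)
        for pose, n_blocks in enumerate(n_blocks_per_pose)
        for block in range(n_blocks)
    ]


def two_body_work(n_blocks_per_pose):
    """One work item per (pose, block_i, block_j) with block_i < block_j."""
    return [
        (pose, i, j)
        for pose, n_blocks in enumerate(n_blocks_per_pose)
        for i, j in combinations(range(n_blocks), 2)
    ]


# A stack with a 2-block pose and a 3-block pose.
singles = one_body_work([2, 3])
pairs = two_body_work([2, 3])
```

Pairs never cross poses in this sketch, so each pose in the stack is scored independently even though all items are issued at once.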
