Analyzing Data
To analyze the data generated by a CellModeller simulation, we work with its pickle files. The pickle files, saved into the cellmodeller/data folder, contain the full state of the simulation at a given point in time. Note that when running in the GUI, you have to ensure that the 'Save Pickles' button is pressed.
To perform the analysis, we run scripts on the pickle files. As a simple example, let's look at the cell lengths as a function of radial position within the colony. The script for this is Scripts/LengthHistogram.py. To have some data to analyze, let's also generate a small colony with the Tutorial 1a model. Once you have a pickle file you want to analyze, open the Terminal and go to the cellmodeller folder.
Then run the script on the data:
$ cmpython Scripts/LengthHistogram.py data/MODEL_PATH/step-01000.pickle
This will run the script on a particular pickle (for step number 1000 in this case). Please replace MODEL_PATH with whatever model you are analyzing, and ensure that the pickle path is correct.
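When a simulation has saved many steps, it can help to list the available pickles before picking one. The following is a minimal sketch that assumes the files follow the step-NNNNN.pickle naming shown in the command above; the function name list_step_pickles is hypothetical and not part of CellModeller.

```python
import glob
import os
import re

def list_step_pickles(model_dir):
    """Return (step, path) pairs for step-NNNNN.pickle files, sorted by step."""
    found = []
    for path in glob.glob(os.path.join(model_dir, "step-*.pickle")):
        # Assumption: the step number is the digit run between "step-" and ".pickle".
        match = re.search(r"step-(\d+)\.pickle$", os.path.basename(path))
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found)
```

Calling list_step_pickles("data/MODEL_PATH") would then give the saved steps in order, so the last entry is the most recent pickle.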
If we take a look at the script in detail, we see several functions.
Let's first look at the main function that deals with the analysis, lengthHist:
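# Excerpt from Scripts/LengthHistogram.py (Python 2: uses cPickle)
def lengthHist(pickle, bins, file=False):
    print('opening '+ pickle)
    # Read the saved simulation state and pull out the per-cell states
    data = cPickle.load(open(pickle,'r'))
    cs = data['cellStates']
    it = iter(cs)
    n = len(cs)
    print('Number of cells = '+str(n))
    lens = []
    r = []
    for it in cs:
        # Radial position and end-to-end length of this cell
        radialPosition = rad_pos(cs[it])
        cellLength = cs[it].length+2*cs[it].radius
        r.append(radialPosition)
        lens.append(cellLength)
        # Write one "radius, length" row per cell if an output file was given
        if file:
            file.write(str(radialPosition)+', '+ str(cellLength)+'\n')
    ...
lengthHist(file,bin,output_file)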
This function takes the pickle file as an argument and, when an output file is supplied, writes a CSV file containing the radial position and cell length of each cell in the colony. It first reads the pickle with data = cPickle.load(open(pickle,'r')). This data is then unpacked to get the states of all the cells in the simulation with cs = data['cellStates']. The variable cs is now a Python dictionary object containing the information of all the cells in the simulation.
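The loading step can be sketched for Python 3 as follows. This is an illustration rather than the code in LengthHistogram.py: it swaps cPickle for the standard pickle module and opens the file in binary mode, and the function name load_cell_states is hypothetical.

```python
import pickle

def load_cell_states(pickle_path):
    """Load one saved step and return its 'cellStates' dictionary."""
    # Binary mode is required when reading pickles under Python 3.
    with open(pickle_path, "rb") as handle:
        data = pickle.load(handle)
    # 'cellStates' is the key used in the tutorial script.
    return data["cellStates"]
```

For example, cs = load_cell_states("data/MODEL_PATH/step-01000.pickle") followed by len(cs) would give the number of cells. If the stored cell states are instances of CellModeller classes, those classes need to be importable in the Python environment doing the unpickling.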
The same process can be used to statistically analyze any value in the cell state, spatially or in any other way that you see fit. In this case, the function is set up to look at the distribution of cell lengths as a function of the radius.
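As a sketch of applying the same idea to an arbitrary value, the helper below collects one attribute from every cell state and summarizes it. The name summarize_attribute is hypothetical, and it assumes the chosen attribute is a plain number on each cell state (as length and radius are in the script above).

```python
import statistics

def summarize_attribute(cell_states, name):
    """Return (count, mean, standard deviation) of one attribute over all cells."""
    # cell_states is the dictionary read from data['cellStates'].
    values = [getattr(state, name) for state in cell_states.values()]
    # The sample standard deviation needs at least two values.
    spread = statistics.stdev(values) if len(values) > 1 else 0.0
    return len(values), statistics.mean(values), spread
```

For instance, summarize_attribute(cs, "length") would report how many cells there are together with the mean and spread of their lengths.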
We then iterate over all the cells in cs and extract the cell length in each case, measured from end to end (this adds 2*radius to account for the radius of the capsule at each end). The radial position of the cell within the colony is also computed (colony growth was initiated from coordinates (0,0)) and added to the data. These values are collected into lists and can be written to a file that can be analyzed with your favourite plotting/data analysis software.
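The iteration described above can be sketched in Python 3 as below. Two things here are assumptions rather than facts from the script: that each cell state stores its coordinates in an attribute called pos, and that the radial position is the in-plane distance from the origin. The function names are hypothetical, and the csv module is used in place of the manual string writes.

```python
import csv
import math

def radial_position(state):
    # Assumption: state.pos holds (x, y, ...) and the colony started at (0, 0).
    return math.hypot(state.pos[0], state.pos[1])

def radius_length_pairs(cell_states):
    """Return a list of (radial position, end-to-end length) per cell."""
    pairs = []
    for state in cell_states.values():
        # End-to-end length: stored length plus one radius at each end.
        end_to_end = state.length + 2 * state.radius
        pairs.append((radial_position(state), end_to_end))
    return pairs

def write_pairs_csv(pairs, csv_path):
    """Write the pairs to a CSV file with a header row."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["radial_position", "cell_length"])
        writer.writerows(pairs)
```

The resulting file has one row per cell and can be opened in a spreadsheet or plotting tool.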
As a simple example, here is the length distribution for all the bacteria in a simulated colony:
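To tabulate such a distribution without a plotting library, a minimal sketch might count the end-to-end lengths into equal-width bins. The function name length_histogram and the equal-width binning are illustrative assumptions and are not taken from LengthHistogram.py.

```python
def length_histogram(lengths, bins):
    """Count lengths into `bins` equal-width bins; return (edges, counts)."""
    low, high = min(lengths), max(lengths)
    # Fall back to a width of 1.0 if every length is identical.
    width = (high - low) / bins or 1.0
    edges = [low + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for value in lengths:
        # The largest value belongs in the last bin, not past it.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return edges, counts
```

The returned edges and counts can be printed as a table or passed to any plotting tool.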