Feature updates and new README #3

Open
wants to merge 1 commit into
base: master
41 changes: 30 additions & 11 deletions README.md
@@ -8,19 +8,32 @@ A web interface to the superman tools.
 Starting from a fresh download of the source files,
 a few steps are required before starting the server for the first time.

-### 1: Install Dependencies
+Easy Guide:

+    mkdir superman
+    cd superman
+    git clone https://github.com/all-umass/superman-web
Review comment (Member):

I think it may be easier to keep the two repositories separate:

    pip3 install git+https://github.com/all-umass/superman.git
    git clone https://github.com/all-umass/superman-web
    cd superman-web
    pip3 install matplotlib tornado==4.4.2 pyyaml h5py pandas pywt sklearn

+    git clone https://github.com/all-umass/superman
+    cd superman
+    pip3 install -e .
+    cd ../superman-web
+    pip3 install matplotlib tornado==4.4.2 pyyaml h5py pandas pywt sklearn
+
+
-Python (2.7 or 3.4+) is the main requirement for running the server.
-Several Python packages are needed, available from PyPI via `pip`:
+### 1: Install Dependencies

-    pip install --only-binary :all: superman matplotlib tornado pyyaml h5py pandas
+While 2.7 and 3.4+ are supported, I've only tested with 3.4+
Review comment (Member):

Our primary deployment is still using 2.7, so both are tested.


-If you're not running Linux, `superman` may require special care to install.
-See [the superman docs](https://github.com/all-umass/superman#installation) for instructions.
+Python (3.4+) is the main requirement for running the server.
+Several Python packages are needed, available from PyPI via `pip`:

-For running tests, you'll want:
+    pip3 install matplotlib tornado==4.4.2 pyyaml h5py pandas pywt sklearn
Review comment (Member):

Ideally we should put these in a requirements.txt file, but that can come later.
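
A minimal sketch of what such a file might contain, assuming the package list is taken verbatim from the `pip3 install` line in this PR. Nothing here is confirmed by the maintainers. In particular, `pywt` and `sklearn` are import names, and the matching PyPI distributions are normally published as `PyWavelets` and `scikit-learn`, so those two entries may need renaming.

```text
# requirements.txt (hypothetical; names copied from the README command in this PR)
matplotlib
tornado==4.4.2
pyyaml
h5py
pandas
pywt      # assumption: may need to be PyWavelets
sklearn   # assumption: may need to be scikit-learn
```

The README step could then shrink to `pip3 install -r requirements.txt`.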


+It will complain about `xylib` and `metakit`, but will only disable the ability to parse specific file types.
+Neither of these packages are available on pip3 at the moment.

-    pip install pytest mock coverage
+Make sure that you have set up your superman repo and installed it.
+[Superman docs](https://github.com/all-umass/superman#installation)


 ### 2: Configure
@@ -33,7 +46,7 @@ In the same way, copy `datasets-template.yml` to `datasets.yml`
 and update the listings to match your local datasets.


-### 3: Add Datasets
+### 3: Add Datasets (Optional)

 Datasets are the basic unit of data in the superman server.
 Add one by modifying the `datasets.yml` configuration file,
@@ -54,16 +67,22 @@ with any currently running server.

 Or simply run the server directly, and handle the details yourself:

-    python superman_server.py
+    python3 superman_server.py

 To stop the server without restarting it, use:

     ./restart_server.sh --kill

+### 5: Testing (Optional)
+
+For running tests, you'll want:
+
+    pip3 install pytest mock coverage
+
 If you want to verify that everything is working as intended,
 try running the test suite (located in the `test/` directory):

-    python -m pytest
+    python3 -m pytest

 To generate a nice code coverage report:

69 changes: 35 additions & 34 deletions backend/dataset_loaders.py
@@ -18,42 +18,43 @@ def load_datasets(config_fh, custom_loaders, public_only=False, user_added=False
   config = yaml.safe_load(config_fh)

   for kind, entries in config.items():
-    for name, info in entries.items():
-      # skip this entry if it shouldn't be included
-      is_public = info.get('public', True)
-      if public_only and not is_public:
-        continue
-
-      if 'files' in info:
-        files = info['files']
-      else:
-        files = [info['file']]
-
-      if 'loader' in info:
-        # look up the loader function from the module namespace
-        loader_fn = getattr(custom_loaders, info['loader'])
-      else:
-        # construct a loader from the meta_mapping and the default template
-        meta_mapping = [(k, getattr(web_datasets, cls), mname)
-                        for k, cls, mname in info.get('metadata', [])]
-        if info.get('vector', False):
-          loader_fn = _generic_vector_loader(meta_mapping)
+    if entries:
Review comment (Member):

To avoid indenting everything here, you can do:

if not entries:
  continue
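
A minimal sketch of the suggested guard clause, assuming `entries` is `None` when a kind is listed in `datasets.yml` with nothing under it (a YAML key without a value loads as `None`). The helper name `iter_entries` and the sample config are hypothetical, and a plain dict stands in for the parsed YAML so the snippet has no dependencies.

```python
def iter_entries(config):
    """Yield (kind, name, info) for every dataset entry, skipping empty kinds."""
    for kind, entries in config.items():
        # Guard clause: an empty or missing section is skipped up front,
        # so the main loop body needs no extra level of indentation.
        if not entries:
            continue
        for name, info in entries.items():
            yield kind, name, info


# Hypothetical config, shaped like the result of parsing datasets.yml.
config = {
    'Raman': {'RRUFF': {'file': 'rruff.hdf5'}},
    'LIBS': None,  # section header present, but no datasets listed
}

for kind, name, info in iter_entries(config):
    print(kind, name, info)
```

Behavior matches wrapping the loop in `if entries:`; the early `continue` only keeps the loop body at its original depth, which would also shrink this hunk of the diff.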

+      for name, info in entries.items():
+        # skip this entry if it shouldn't be included
+        is_public = info.get('public', True)
+        if public_only and not is_public:
+          continue
+
+        if 'files' in info:
+          files = info['files']
         else:
-          loader_fn = _generic_traj_loader(meta_mapping)
+          files = [info['file']]

-      if kind == 'LIBS':
-        ds = WebLIBSDataset(name, loader_fn, *files)
-      elif info.get('vector', False):
-        ds = WebVectorDataset(name, kind, loader_fn, *files)
-      else:
-        ds = WebTrajDataset(name, kind, loader_fn, *files)
-
-      if 'description' in info:
-        ds.description = info['description']
-      if 'urls' in info:
-        ds.urls = info['urls']
-      ds.is_public = is_public
-      ds.user_added = user_added
+        if 'loader' in info:
+          # look up the loader function from the module namespace
+          loader_fn = getattr(custom_loaders, info['loader'])
+        else:
+          # construct a loader from the meta_mapping and the default template
+          meta_mapping = [(k, getattr(web_datasets, cls), mname)
+                          for k, cls, mname in info.get('metadata', [])]
+          if info.get('vector', False):
+            loader_fn = _generic_vector_loader(meta_mapping)
+          else:
+            loader_fn = _generic_traj_loader(meta_mapping)
+
+        if kind == 'LIBS':
+          ds = WebLIBSDataset(name, loader_fn, *files)
+        elif info.get('vector', False):
+          ds = WebVectorDataset(name, kind, loader_fn, *files)
+        else:
+          ds = WebTrajDataset(name, kind, loader_fn, *files)
+
+        if 'description' in info:
+          ds.description = info['description']
+        if 'urls' in info:
+          ds.urls = info['urls']
+        ds.is_public = is_public
+        ds.user_added = user_added


 def try_load(filepath, data_name):
6 changes: 4 additions & 2 deletions restart_server.sh
@@ -2,7 +2,9 @@

 # Parse result of geoiplookup to fit concisely on a line.
 function ip_info() {
-  geoiplookup $1 | sed 1d | cut -d: -f2 | cut -d' ' -f3- | cut -d, -f2,3 | xargs
+  geoiplookup $1 | sed 1d | cut -d: -f2 | \
+    cut -d' ' -f3- | cut -d, -f2,3 | \
+    tr "'" '^' | xargs
Review comment (Member):

What's the replacement of ' with ^ for?

 }
 export -f ip_info

@@ -12,7 +14,7 @@ function find_server_pid() {

 function start_server() {
   echo "Starting new server..."
-  nohup python superman_server.py &>logs/errors.out &
+  nohup python3 superman_server.py &>logs/errors.out &
   $follow_log || echo "Use 'tail -f logs/server.log' to check on it"
   sleep 1
   if [[ -z "$(find_server_pid)" ]]; then