TF serving #16

Open
pstjohn opened this issue Sep 25, 2020 · 2 comments
@pstjohn
Collaborator

pstjohn commented Sep 25, 2020

  • explore tf serving for radical model
  • should we just serve the policy model?
@dmdu
Collaborator

dmdu commented Oct 8, 2020

Started experimenting with tf_serving and batching (with ysi example so far).

  • To enable batching, I updated tf_serving_example/run_tf_serving-gpu.sh. My singularity line looks like this: SINGULARITYENV_MODEL_NAME=ysi_model singularity exec --nv -B ./ysi_model:/models/ysi_model -B ./batch.config:/models/batch.config /projects/rlmolecule/pstjohn/containers/tensorflow-serving-gpu.simg tf_serving_entrypoint.sh --enable_batching --batching_parameters_file=/models/batch.config
    My batch.config looks like this (unoptimized; based on an example found online):

max_batch_size { value: 16 }
batch_timeout_micros { value: 100000 }
max_enqueued_batches { value: 1000000 }
num_batch_threads { value: 4 }
pad_variable_length_inputs: true

  • Still not sure about the last line in this config. Without it, I get an error though ("Tensors with name 'serving_default_atom:0' from different tasks have different shapes and padding is turned off.Set pad_variable_length_inputs to true, or ensure that all tensors with the same name have equal dimensions starting with the first dim.").

  • When I run run_tf_serving-gpu.sh, I see the log message: "Wrapping session to perform batch processing" -- an indication that it is indeed running in the batching mode.

  • I am currently trying to see why the responses I get from tf_serving don't match results from the local model. I'm not sure if this padding is causing it or I'm missing something else. I'd like to take a closer look at it with somebody who has some available cycles.

  • Another observation: when we call "tf.keras.models.save_model()", we will need to keep incrementing the model version so that tf_serving can unload the old model and load the new one. It looks like tf_serving monitors the model's base directory, where each version is a numbered subdirectory ordered from earliest to latest, and loads the latest one. How are we planning to "auto-increment" the version number? Is it going to be literally: check what is in the directory now -> version = highest number + 1 -> save the model under that version?

  • This page has info on optimizing batching: https://github.com/tensorflow/serving/blob/master/tensorflow_serving/batching/README.md

  • Another useful discussion with some examples: Batching parameters tensorflow/serving#344
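
For the auto-increment question above, here is a minimal sketch of the "highest number + 1" idea. The helper names `next_model_version` and `next_export_path` are hypothetical, and the sketch assumes that version directories are named with plain integers and that only one process writes new versions at a time (there is no locking).

```python
import os


def next_model_version(model_dir):
    """Return (highest numeric subdirectory name) + 1, or 1 if none exist."""
    versions = [
        int(name)
        for name in os.listdir(model_dir)
        # Only count subdirectories whose names are plain integers.
        if name.isdecimal() and os.path.isdir(os.path.join(model_dir, name))
    ]
    return max(versions, default=0) + 1


def next_export_path(model_dir):
    """Build the path for the next version directory (does not create it)."""
    return os.path.join(model_dir, str(next_model_version(model_dir)))
```

The result of `next_export_path("ysi_model")` could then be passed as the target path to "tf.keras.models.save_model()". If several workers might save models concurrently, this check-then-write approach would need some form of locking or a timestamp-based version instead.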

@pstjohn
Collaborator Author

pstjohn commented Oct 8, 2020

Looks like we'll need to either patch tf-serving or figure out a way to put a padded atom first in each input: tensorflow/serving#1279

Modifying our code, we could probably add a 'pad' atom, bond, and connectivity row to the beginning of each array. Then we'd increment all the connectivity values by 1.
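
A rough sketch of that preprocessing change, using plain Python lists in place of the real arrays. The function name `prepend_pad`, the pad class of 0, and the choice of a [0, 0] connectivity row for the pad bond are all assumptions for illustration, not what the existing code does.

```python
def prepend_pad(atom, bond, connectivity, pad_value=0):
    """Return copies of the inputs with one pad entry at index 0.

    atom: list of atom class ids
    bond: list of bond class ids
    connectivity: list of [source_atom_index, target_atom_index] pairs
    """
    padded_atom = [pad_value] + list(atom)
    padded_bond = [pad_value] + list(bond)
    # A pad atom now occupies index 0, so every real atom index shifts by 1.
    shifted = [[src + 1, dst + 1] for src, dst in connectivity]
    # Assumption: the pad bond connects the pad atom (index 0) to itself.
    padded_connectivity = [[0, 0]] + shifted
    return padded_atom, padded_bond, padded_connectivity
```

For example, a two-atom molecule with atom classes [6, 8], bond classes [1, 1], and connectivity [[0, 1], [1, 0]] would become atoms [0, 6, 8], bonds [0, 1, 1], and connectivity [[0, 0], [1, 2], [2, 1]]. Any model output indexed by atom would then carry an extra leading pad entry that would have to be dropped.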
