resume training process #4

Open
kilmarnock opened this issue Jan 23, 2016 · 4 comments

Comments

@kilmarnock

Hi!
Is it (currently / in the future / generally speaking) possible to resume the training process of fast-dqn-caffe?

I assumed that passing -model model/dqn_iter_1000000.caffemodel would resume training. I see a dqn_iter_1000000.solverstate file, but there is no option for it in ./src/fast_dqn_main.cpp, and -solver has to point to a .prototxt file.

Thank you for the implementation.

@watts4speed
Owner

Hi Kilmarnock,

It's possible but not implemented. That's a good feature to have. :-)

-Peter


@kilmarnock
Author

I have got it resuming using a piece of code like this:

void Fast_DQN::Initialize() {

  // Initialize dummy input data with 0
  std::fill(dummy_input_data_.begin(), dummy_input_data_.end(), 0.0);

  // Initialize net and solver
  caffe::SolverParameter solver_param;
  caffe::ReadProtoFromTextFileOrDie(solver_param_, &solver_param);

  solver_.reset(caffe::GetSolver<float>(solver_param));

  // Restore the solver state from a snapshot. The path is hard-coded here.
  const std::string resume_file = "./model/dqn_iter_1000000.solverstate";
  solver_->Restore(resume_file.c_str());

  // Solver API reference: http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1Solver.html

I would add an option to "take the highest iteration count in the model directory" or to "parse the solverstate from the command line", but C++ is not my language, and this is unfamiliar territory for me.
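The two ideas above could be sketched roughly as follows. This is only an illustration, not code from the repository: the `-resume` flag, the function names, and the plain `argv` scan are all hypothetical, and the file-name pattern `dqn_iter_<N>.solverstate` is assumed from the snapshot name mentioned in this thread. Directory listing uses POSIX `dirent.h`.

```cpp
#include <dirent.h>  // POSIX directory listing

#include <cstdlib>
#include <string>
#include <vector>

// Returns the iteration number encoded in a name such as
// "dqn_iter_1000000.solverstate", or -1 if the name does not match.
// The prefix and suffix are assumptions based on the file name in this thread.
long SnapshotIteration(const std::string& name) {
  const std::string prefix = "dqn_iter_";
  const std::string suffix = ".solverstate";
  if (name.size() <= prefix.size() + suffix.size()) return -1;
  if (name.compare(0, prefix.size(), prefix) != 0) return -1;
  if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0) {
    return -1;
  }
  const std::string digits = name.substr(
      prefix.size(), name.size() - prefix.size() - suffix.size());
  if (digits.find_first_not_of("0123456789") != std::string::npos) return -1;
  return std::strtol(digits.c_str(), nullptr, 10);
}

// Picks the name with the highest iteration count; returns "" if none match.
std::string LatestSnapshot(const std::vector<std::string>& names) {
  std::string best;
  long best_iter = -1;
  for (const std::string& name : names) {
    const long iter = SnapshotIteration(name);
    if (iter > best_iter) {
      best_iter = iter;
      best = name;
    }
  }
  return best;
}

// Lists the entry names in `dir`; returns an empty vector if it is unreadable.
std::vector<std::string> ListDirectory(const std::string& dir) {
  std::vector<std::string> names;
  if (DIR* handle = opendir(dir.c_str())) {
    while (const dirent* entry = readdir(handle)) {
      names.push_back(entry->d_name);
    }
    closedir(handle);
  }
  return names;
}

// Returns the value following a hypothetical "-resume" flag, or "" if absent.
std::string ResumePathFromArgs(int argc, const char* const* argv) {
  for (int i = 1; i + 1 < argc; ++i) {
    if (std::string(argv[i]) == "-resume") return argv[i + 1];
  }
  return "";
}
```

With helpers like these, `Initialize()` could prefer an explicit `-resume` path, fall back to `LatestSnapshot(ListDirectory("./model"))`, and call `solver_->Restore(path.c_str())` only when the resulting path is non-empty.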

@watts4speed
Owner

Another thing to consider is snapshotting the replay_memory_. The network
trains correctly because of the experience built up in the replay memory.
To resume correctly, restoring that state is also important. I think
github.com/mhauskn/recurrent-caffe has code to save and restore the replay
memory.


@kilmarnock
Author

kilmarnock commented Jan 25, 2016

I had a look at Mr. Hausknecht's repository https://github.com/mhauskn/dqn. In the recurrent branch, loading and storing the replay memory is implemented in full beauty.
