Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFGPS #62

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

RFGPS #62

wants to merge 2 commits into from

Conversation

wmontgomery4
Copy link
Collaborator

Here's the commit for RFGPS, it's pretty large but here's the main takeaways:

  • Restructure algorithm.py and algorithm_mdgps.py to allow for fitting dynamics/policies based on cluster indices rather than condition (I couldn't quite move this all to algorithm_mdgps.py unfortunately)
  • Change CostState to only take a single data_type, rather than a dictionary, and also change the cost to be wp*(A.dot(x) - target). No one seems to really be using cost_state, and this is a more effective use of it in my opinion, but I'd be happy to change this back (although it will require some other changes since we need the A.dot(x) stuff to make it easy to get the difference between the end effector and target).

The one remaining issue is how to initialize the EM. Right now it starts with Kmeans, but I have some code which starts it from random points instead (commented out, near the end of algorithm_mdgps.py in the _cluster_traj_em method).

Copy link
Owner

@cbfinn cbfinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a fairly quick pass, mostly looking at style/clarity, not for technical correctness.
@anuragajay Please also look over the code, especially the technical parts.

@@ -69,7 +69,11 @@
'image_width': 640,
'image_height': 480,
'image_channels': 3,
'meta_include': []
'meta_include': [],
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments for these options would be nice, since they aren't self-explanatory.

@@ -99,6 +94,44 @@ def _setup_world(self, filename):
cam_pos[0], cam_pos[1], cam_pos[2],
cam_pos[3], cam_pos[4], cam_pos[5])

def reset_initial_x0(self, condition):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would a better name for this be "sample_initial_x0"?

Reset initial x0 randomly based on sampling_range_x0
and prohibited_ranges_x0
Args:
condition: Which condition setup to run.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update this comment.

Args:
condition: Which condition setup to run.
"""
tmp_body_pos_offset = self._hyperparams['pos_body_offset'][condition][:]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we are now inconsistent between pos_body and bodypos. Maybe we should change pos_body_offset to bodypos_offset?

self._hyperparams['pos_body_offset'][condition] = sample_params(
self._hyperparams['sampling_range_bodypos'],
self._hyperparams['prohibited_ranges_bodypos'])
# TODO: handle the i/j stuff as above
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this TODO mean?

@@ -231,3 +287,135 @@ def compute_costs(self, m, eta):
fcv[t, :] = (cv[t, :] + PKLv[t, :] * eta) / (eta + multiplier)

return fCm, fcv

def _cluster_samples(self, mode=None):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put the clustering in a different file? This one is getting quite long.

"""
# Initialize starting with random data points as centers.
K = self._hyperparams['num_clusters']
# self.cluster_idx = np.random.choice(K, size=self.M)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove commented out code.

@@ -22,6 +22,8 @@
'cost': None, # A list of Cost objects for each condition.
# Whether or not to sample with neural net policy (only for badmm/mdgps).
'sample_on_policy': False,
# Number of clusters to use if clustering samples.
'num_clusters': 0,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set default to None rather than 0 to be more explicit? Also, say in the comment that this is used in RF-GPS (unless it can also be used for traj opt?)

'step_rule': 'laplace',
# How to cluster samples, 'random', 'kmeans', 'traj_em'.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention RF-GPS here.

COST_STATE = {
'ramp_option': RAMP_CONSTANT, # How target cost ramps over time.
'wp': None, # State weights - Defaults to ones.
'A': None, # A matrix - Defaults to identity.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain what the A matrix is in the comment.

@anuragajay
Copy link

@cbfinn The code looks correct to me. Should I also make the other changes you suggested?

@wmontgomery4
Copy link
Collaborator Author

Don't worry, I'll handle those other changes Anurag. Thanks for the comments Chelsea!

@wmontgomery4
Copy link
Collaborator Author

@cbfinn Now that the PIGPS stuff is merged, it's going to be kind of hard to merge this stuff in a way that isn't overly complicated. I think it would make sense to refactor the current codebase first, in a way that allows

@wmontgomery4
Copy link
Collaborator Author

Whoops, accidentally hit 'close and comment'. I was going to send this in the email thread instead, but I'll just say it here: I think it would make sense to focus on refactoring the current codebase before merging this, so that the rfgps code can be written more succinctly. I've got a few ideas about how that should go:

  • Move as much as possible to the base Algorithm class, or to algorithm_utils, so that each algorithm can be written nearly as tersely as the PI2 algorithms are right now. I think that pretty much everything but the __init__, iteration and _compute_costs functions could be moved out (and even the _compute_costs functions could be rewritten to just take some library calls). In general, I want to move away from using functions which take in the algorithm object, and instead take in the necessary parts of the algorithm, as this should make it easier to include RFGPS, and will make for better parallelism as well.
  • Keep MDGPS/PIGPS separate, and probably make RFGPS a separate algorithm as well. I think the idea of merging the MDGPS/PIGPS is nice, but there are enough differences that it will be kind of challenging to maintain that structure (the traj_opt module isn't totally replaceable, since it affects other things like whether or not we need to fit dynamics). Also, if we abstract things out of those files, writing an individual algorithm file should be pretty short.

I've got my quals exam in ~10 days, so I'm going to put this stuff at a lower priority right now, and make a branch off this for the RWR stuff. In the meantime this branch can serve as the public implementation of rfgps. Maybe we should have a refactoring meeting sometime soon to talk about this more?

@EdsterG
Copy link
Contributor

EdsterG commented Oct 28, 2016

I don't think we should change CostState to only take a single data_type. I found the dictionary of data_type's to be very useful. It may be best to make a new cost function CostStateWeighted?

@wmontgomery4
Copy link
Collaborator Author

That's a fair point, and I agree that changing the API at this point is unnecessarily confusing. The main reason I wanted to do it was to add the A parameter, as this allows you get arbitrary linear functions of the state (i.e. the distance between the end effector and target). I think that cost_linwp does something similar on one of the other branches though, I'll try to just make it look like that.

@cbfinn
Copy link
Owner

cbfinn commented Oct 29, 2016

Having a refactoring meeting sounds like a good idea. I'm pretty busy with ICLR right now (deadline in <7 days), so let's figure it out after your quals.

I agree with your two main points!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants