[Question] Single action episode #507
Unanswered
riccardobussola asked this question in Q&A
Replies: 0 comments
Hi everyone,
A brief introduction: my RL task consists of learning the optimal parameters of a trajectory (e.g. a spline or a Bézier curve) in Cartesian space that a quadruped has to follow for a non-constant time t_follow.
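For concreteness, the kind of parameterization I have in mind is something like a cubic Bézier curve whose control points are the learned parameters. The helper below is only an illustration of that idea, not my actual code:

```python
import numpy as np

def cubic_bezier(control_points: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Evaluate a cubic Bezier curve at times t in [0, 1].

    control_points: (4, 3) array of Cartesian control points --
    in my setting, these would be the policy's single action.
    """
    p0, p1, p2, p3 = control_points
    t = t[:, None]  # broadcast over the 3 Cartesian dimensions
    return ((1 - t) ** 3 * p0
            + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2
            + t ** 3 * p3)
```

The IK controller would then track `cubic_bezier(params, t / t_follow)` for the duration of the episode.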
I'm trying to implement this using Orbit's RLTaskEnv but I'm struggling with a few things.
My episode queries the policy for an action only once: as soon as the parameters are obtained, the trajectory is fully determined. For the rest of the simulation, the robot has to follow the resulting trajectory for an amount of time that varies from episode to episode, so a task-space IK controller has to run in the background.
In this setup, the episode corresponds to a single policy step: the reward (and a possible network update) happens at termination, once I can check the robot's final position/orientation.
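To make the desired semantics explicit, here is a gym-style sketch of the behaviour I'm after, written outside Orbit. All names are hypothetical and the physics/IK parts are stubbed out; the point is only that step() is called once per episode and internally rolls out the whole tracking phase:

```python
import numpy as np

class SingleActionTrajectoryEnv:
    """Gym-style sketch: one policy step per episode.

    The single action holds the trajectory parameters; step() rolls out
    the entire tracking phase internally and returns the terminal reward.
    """

    def __init__(self, seed: int = 0):
        self._rng = np.random.default_rng(seed)
        self.t_follow = 0.0

    def reset(self):
        # t_follow is drawn per episode, hence the variable episode length
        self.t_follow = float(self._rng.uniform(1.0, 3.0))
        return np.zeros(6)  # initial observation (stub)

    def step(self, action):
        # 1) interpret the single action as trajectory parameters
        control_points = np.asarray(action, dtype=float).reshape(4, 3)
        # 2) here the simulation + task-space IK controller would run for
        #    self.t_follow seconds (stubbed: assume perfect tracking)
        final_pose = control_points[-1]
        # 3) the terminal reward is computed only now, from the final pose
        reward = -float(np.linalg.norm(final_pose))
        obs = np.concatenate([final_pose, np.zeros(3)])
        return obs, reward, True, {}  # done=True after the single step
```

This is exactly the pattern I'd like to express with RLTaskEnv without reimplementing its stepping logic.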
Is there a way to implement this with Orbit without rewriting the logic of the provided RLTaskEnv?
Changing only the decimation factor is not enough, since an episode's duration varies with t_follow.
Many thanks for considering my request.