Skip to content

Commit

Permalink
Update README.md
Browse files Browse the repository at this point in the history
  • Loading branch information
daniellawson9999 authored Sep 20, 2022
1 parent ca218ee commit d28f77e
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,8 +30,8 @@ Alternatively, the environment can be setup using Docker with the attached Docke

There are some small implementation detail differences and other assumed implementation detials that may be different.

Originally, the policy is parameterized as:
$$\pi_\theta(a_t|s_{-K,t}, g_{-K,t}) = N(\mu_\theta(s_{-K,t}, g_{-K,t}), \Sigma_{\theta}(s_{-K,t}, g_{-K,t}))$$
<!--- Originally, the policy is parameterized as:
$$\pi_\theta(a_t|s_{-K,t}, g_{-K,t}) = N(\mu_\theta(s_{-K,t}, g_{-K,t}), \Sigma_{\theta}(s_{-K,t}, g_{-K,t}))$$ -->

However, we use $$\mathbf{tanh}(N(\mu_\theta(s_{-K,t}, g_{-K,t}), \Sigma_{\theta}(s_{-K,t}, g_{-K,t})))$$

Expand Down

0 comments on commit d28f77e

Please sign in to comment.