Merge branch 'master' of https://github.com/hari-sikchi/safeRL

hari-sikchi · Mar 31, 2019 · 9c2db47
2 parents 26b9cdd + 35388b8
Showing 1 changed file with 29 additions and 3 deletions.
diff --git a/README.md b/README.md
@@ -1,14 +1,16 @@
 # Safe Reinforcement Learning Algorithms   
-=================
+
+***
 
 <!--ts-->
    * [HCOPE](#hcope)
    * [Safe Exploration](#safe_exploration)
    * [Off Policy Evaluation](#importance_sampling)
+   * [Solving side effects](#side_effects)
 <!--te-->
 
 
-
+***
 
 <a name="hcope"></a>
 ## HCOPE (High-Confidence Off-Policy Evaluation.)
@@ -54,6 +56,8 @@ method. Also, a graph of distribution of Importance sampling ratio is created wh
 Output format: 
 ![Output](https://github.com/hari-sikchi/safeRL/blob/master/results/Result.png)   
 
+***
+
 <a name="safe_exploration"></a>
 ## Safe exploration in continuous action spaces.
 
@@ -98,7 +102,7 @@ This enables agent to learn while following the safety constraints.
 ![Action Correction](https://github.com/hari-sikchi/safeRL/blob/master/results/safety_optimization.png)   
 
 
-
+***
 
 <a name="importance_sampling"></a>
 ## Importance Sampling
@@ -117,3 +121,25 @@ Comparison of different importance sampling estimators:
  Image is taken from phD thesis of P.Thomas:    
  Links: https://people.cs.umass.edu/~pthomas/papers/Thomas2015c.pdf   
 
+
+<a name="side_effects"></a>
+
+
+*** 
+
+## Side Effects
+### Penalizing side effects using relative reachability   
+
+Code - https://github.com/hari-sikchi/safeRL/tree/safe_recovery/side_effects    
+
+
+* Added a simple example of calculating side effects, as given towards the end of the paper
+![Environment](https://github.com/hari-sikchi/safeRL/blob/safe_recovery/side_effects/env.png)   
+
+The relative reachability measure:    
+![Equation relative reachability](https://github.com/hari-sikchi/safeRL/blob/safe_recovery/side_effects/rr.png)   
+
+
+
+
+ Paper: Penalizing side effects using stepwise relative reachability - Krakovna et al.