Remove Sparsity from metrics
ffelten committed Oct 22, 2024
1 parent 8d2f89b commit 71f1694
Showing 3 changed files with 4 additions and 5 deletions.
2 changes: 1 addition & 1 deletion docs/algos/performances.md
@@ -12,7 +12,7 @@ For single-policy algorithms, the metric used will be the scalarized return of t

### Multi-policy algorithms
For multi-policy algorithms, we propose to rely on various metrics to assess the quality of the **discounted** Pareto Fronts (PF) or Convex Coverage Set (CCS). In general, we want to have a metric that is able to assess the convergence of the PF, a metric that is able to assess the diversity of the PF, and a hybrid metric assessing both. The metrics are implemented in `common/performance_indicators`. We propose to use the following metrics:
-* (Diversity) Sparsity: average distance between each consecutive point in the PF. From the PGMORL paper [1]. Keyword: `eval/sparsity`.
+* **[Do not use]** (Diversity) Sparsity: average distance between each consecutive point in the PF. From the PGMORL paper [1]. Keyword: `eval/sparsity`.
* (Diversity) Cardinality: number of points in the PF. Keyword: `eval/cardinality`.
* (Convergence) IGD: a SOTA metric from Multi-Objective Optimization (MOO) literature. It requires a reference PF that we can compute a posteriori. That is, we do a merge of all the PFs found by the method and compute the IGD with respect to this reference PF. Keyword: `eval/igd`.
* (Hybrid) Hypervolume: a SOTA metric from MOO and MORL literature. Keyword: `eval/hypervolume`.
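To make the "[Do not use]" warning concrete, here is a minimal self-contained sketch of the PGMORL-style sparsity definition (the repository's `sparsity` in `common/performance_indicators` follows the same idea; exact details may differ). A degenerate single-point front scores 0, the best possible value, which is why the metric is misleading when comparing algorithms:

```python
import numpy as np

def sparsity(front: list) -> float:
    """PGMORL-style sparsity: mean squared distance between consecutive
    points, computed per objective on the sorted front (lower = denser).
    A single-point front scores 0, the "best" value, which is why this
    metric should not be used in isolation to compare algorithms."""
    if len(front) < 2:
        return 0.0
    points = np.array(front)
    sp = 0.0
    for j in range(points.shape[1]):
        col = np.sort(points[:, j])
        sp += np.sum(np.square(col[1:] - col[:-1]))
    return sp / (len(front) - 1)

# A degenerate one-point "front" gets a perfect sparsity of 0,
# while a genuinely diverse two-point front is penalized:
print(sparsity([np.array([1.0, 1.0])]))                        # 0.0
print(sparsity([np.array([0.0, 1.0]), np.array([1.0, 0.0])]))  # 2.0
```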
4 changes: 0 additions & 4 deletions morl_baselines/common/evaluation.py
@@ -16,7 +16,6 @@
hypervolume,
igd,
maximum_utility_loss,
-    sparsity,
)
from morl_baselines.common.weights import equally_spaced_weights

@@ -156,7 +155,6 @@ def log_all_multi_policy_metrics(
Logged metrics:
- hypervolume
-    - sparsity
- expected utility metric (EUM)
If a reference front is provided, also logs:
- Inverted generational distance (IGD)
@@ -172,14 +170,12 @@
"""
filtered_front = list(filter_pareto_dominated(current_front))
hv = hypervolume(hv_ref_point, filtered_front)
-    sp = sparsity(filtered_front)
eum = expected_utility(filtered_front, weights_set=equally_spaced_weights(reward_dim, n_sample_weights))
card = cardinality(filtered_front)

wandb.log(
{
"eval/hypervolume": hv,
-    "eval/sparsity": sp,
"eval/eum": eum,
"eval/cardinality": card,
"global_step": global_step,
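For context, the `filter_pareto_dominated` call above prunes the candidate points down to the non-dominated set before any metric is computed. A minimal sketch under the assumption of maximization (the repository's actual implementation may differ):

```python
import numpy as np

def filter_pareto_dominated(front):
    """Keep only non-dominated points (maximization): a point is dropped
    if some other point is at least as good on every objective and
    strictly better on at least one."""
    points = np.array(front)
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q >= p) and np.any(q > p)
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            keep.append(p)
    return keep

candidates = [
    np.array([1.0, 0.0]),
    np.array([0.0, 1.0]),
    np.array([0.5, 0.5]),
    np.array([0.2, 0.2]),  # dominated by (0.5, 0.5)
]
print(len(filter_pareto_dominated(candidates)))  # 3
```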
3 changes: 3 additions & 0 deletions morl_baselines/common/performance_indicators.py
@@ -42,6 +42,9 @@ def igd(known_front: List[np.ndarray], current_estimate: List[np.ndarray]) -> fl
def sparsity(front: List[np.ndarray]) -> float:
"""Sparsity metric from PGMORL.
+    (!) This metric only considers the points from the PF identified by the algorithm, not the full objective space.
+    Therefore, it is misleading (e.g. learning only one point is considered good) and we recommend not using it when comparing algorithms.
Basically, the sparsity is the average distance between each point in the front.
Args:
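For comparison with the deprecated sparsity metric, the IGD convergence indicator recommended in the docs can be sketched as follows. This is the textbook definition matching the `igd` signature above; the repository's version may delegate to a library such as pymoo:

```python
import numpy as np

def igd(known_front, current_estimate) -> float:
    """Inverted generational distance: average Euclidean distance from
    each point of the reference front to its nearest neighbour in the
    estimated front (lower is better, 0 = perfect coverage)."""
    ref = np.array(known_front)    # shape (m, d)
    est = np.array(current_estimate)  # shape (k, d)
    # Pairwise distances via broadcasting: (m, k)
    dists = np.linalg.norm(ref[:, None, :] - est[None, :, :], axis=-1)
    return float(np.mean(dists.min(axis=1)))

ref = [np.array([0.0, 1.0]), np.array([1.0, 0.0])]
print(igd(ref, ref))                    # 0.0 — exact match
print(igd(ref, [np.array([0.0, 0.0])])) # 1.0 — each ref point is 1 away
```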
