Proposal: move all probabilities in edges #4

Open
wants to merge 2 commits into
base: master

Conversation

frapac
Member

@frapac frapac commented Dec 1, 2017

After a discussion with @leclere, we concluded that it may be a good idea to store all probability objects inside the edges of the graph. That is, the edges would carry both the probability transitions between the nodes and the optimization model. In this case, stagewise independence would be implemented as a single edge storing a probability distribution.

The idea behind storing the optimization model in the edges is that if we kept the JuMP.Model in memory, these models would depend on the value functions of the child nodes.

The procedure would become:

Forward pass
i) At a given node, sample npath of the node's outgoing edges.
ii) Solve the optimization model in each selected edge, and return a list of Solution objects.
iii) Pass the Solution objects to the child nodes.

Backward pass
i) At a given node, solve all outgoing edges and get back a list of Solution objects.
ii) Pass the list of Solution objects to a cut generator and generate a new cut.
iii) Update the value function in the node, and in the node's incoming edges, so that the optimization problem is updated.
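
The two passes above can be sketched in plain Julia as follows. This is only a minimal illustration under stated assumptions: `ProbaEdge`, `EdgeSolution`, `solve_edge`, `forward_step`, `backward_step` and `expected_value` are hypothetical names that do not appear in the PR, the "solve" step is a placeholder rather than a JuMP model, and the probability-weighted average merely stands in for a real cut generator.

```julia
using Random

# Hypothetical types for illustration only; none of these names come from the PR.
struct ProbaEdge
    dst::Int          # index of the child node
    proba::Float64    # transition probability stored on the edge
end

struct EdgeSolution
    edge::ProbaEdge
    objective::Float64
end

# Placeholder for "solve the optimization model stored in the edge".
solve_edge(e::ProbaEdge, state::Float64) = EdgeSolution(e, state + e.dst)

# Sample one edge according to the probabilities stored on the edges.
function sample_edge(rng::AbstractRNG, edges::Vector{ProbaEdge})
    r, acc = rand(rng), 0.0
    for e in edges
        acc += e.proba
        r <= acc && return e
    end
    return last(edges)  # guard against round-off in the cumulative sum
end

# Forward pass at one node: sample `npath` outgoing edges and solve each one.
forward_step(rng, edges, state, npath) =
    [solve_edge(sample_edge(rng, edges), state) for _ in 1:npath]

# Backward pass at one node: solve all outgoing edges.
backward_step(edges, state) = [solve_edge(e, state) for e in edges]

# Stand-in for the cut generator: probability-weighted average objective.
expected_value(sols) = sum(s.edge.proba * s.objective for s in sols)

edges = [ProbaEdge(2, 0.3), ProbaEdge(3, 0.7)]
sols = forward_step(MersenneTwister(1), edges, 0.0, 4)
all_sols = backward_step(edges, 0.0)
avg = expected_value(all_sols)
```

In this sketch the forward pass returns one solution per sampled path, while the backward pass returns one solution per outgoing edge, which is what the cut generator would consume.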

What do you think?

Questions:

  • We can avoid storing the state inside a Node if we use only Solution objects. Would doing so gain efficiency?
  • How can we wrap LightGraphs effectively?
  • How can we avoid storing each value function three times (in the Node's value function, in the optimization model and in the cut pruner)?

@frapac frapac requested review from blegat, odow and joaquimg December 1, 2017 12:27
# - a noise, or a collection of noise (for stagewise noise)
# - an optimization problem (implemented with JuMP or MOI)

abstract type AbstractEdge end
Member

Isn't this already defined as LightGraphs.AbstractEdge? We need to decide whether we want to implement the LightGraphs interface or create a new one.
In LightGraphs, an edge is just Edge(src, dst). In this case, we need to do e.g.

sync!(sp::AbstractStochasticProgram, edge)

instead of

sync!(edge)

because the edge only contains its source and destination nodes; the rest is stored in the stochastic program.
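
A minimal sketch of that calling convention, in plain Julia without LightGraphs. Everything here is an assumption made for illustration: `ToyProgram` and its fields are hypothetical, an edge is represented as a bare `(src, dst)` tuple, and this `sync!` only records that the edge was visited.

```julia
# Hypothetical container: the edge is only a (src, dst) pair, so everything
# else has to live in the stochastic program.
struct ToyProgram
    proba::Dict{Tuple{Int,Int},Float64}   # data attached to each edge
    synced::Set{Tuple{Int,Int}}           # edges whose model is up to date
end

ToyProgram(proba) = ToyProgram(proba, Set{Tuple{Int,Int}}())

# `sync!` takes the program first, because the edge alone carries no data.
function sync!(sp::ToyProgram, edge::Tuple{Int,Int})
    haskey(sp.proba, edge) || throw(ArgumentError("unknown edge $edge"))
    push!(sp.synced, edge)
    return sp
end

sp = ToyProgram(Dict((1, 2) => 0.4, (1, 3) => 0.6))
sync!(sp, (1, 2))
```

With this layout, `sync!(edge)` could not work at all: the tuple has no way to reach the probabilities or the model, which is the point made above.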

Member

Yes, presumably all these methods should take sp as the first argument.

Member Author

@frapac frapac Dec 4, 2017

OK, it seems that I do not have the same vision as you :p But you are right concerning sp.

Concerning LightGraphs, the arguments would be:

Pros

  • We would be able to use the existing methods of a lightweight package
  • We would also be able to plot our graphs easily!

Cons

  • We have to find a proper (and efficient) way to wrap LightGraphs (for instance, how do we loop easily over a node's outgoing edges?)

What do you think?

Contributor

I might be missing part of the argument, but why not use MetaGraphs? As far as I know, it is a very light package built on LightGraphs that allows attaching any object to an edge or a node.

> We have to find a proper (and efficient) way to wrap LightGraphs (for instance, how do we loop easily over a node's outgoing edges?)

Is the existing out_neighbors function not efficient?

Member Author

out_neighbors loops over the outgoing nodes, not the outgoing edges. I could not find a function that loops over outgoing edges, but I am not an expert on LightGraphs...

Member

@frapac The function out_edges has been removed; I think we now need to do Edge.(u, out_neighbors(g, u)) instead of out_edges(g, u).
@leclere I initially wanted to use MetaGraphs in StructDualDynProg, but when you look at it here and here, it seems inefficient because all properties are stored in a Dict{Symbol, Any}. MetaGraphs cannot really do better at this level of abstraction (or maybe by using macros as we do in MathOptInterfaceUtilities :-P), but we can be faster by storing typed dictionaries such as proba::Dict{SimpleEdge{Int}, Float64}.
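
Both suggestions can be illustrated without depending on LightGraphs. The sketch below is an assumption-laden stand-in: `ToyEdge`, `ToyGraph` and the hand-written `out_neighbors` and `out_edges` only mimic the LightGraphs names mentioned above, and the probabilities are made-up values.

```julia
# Hypothetical stand-ins: `ToyEdge` mimics an edge that is only (src, dst),
# and `out_neighbors` is written by hand here, not taken from LightGraphs.
struct ToyEdge
    src::Int
    dst::Int
end

struct ToyGraph
    adj::Vector{Vector{Int}}   # adj[u] lists the children of node u
end

out_neighbors(g::ToyGraph, u::Int) = g.adj[u]

# Outgoing edges rebuilt by broadcasting the edge constructor over the children.
out_edges(g::ToyGraph, u::Int) = ToyEdge.(u, out_neighbors(g, u))

g = ToyGraph([[2, 3], [4], [4], Int[]])

# Concretely typed storage: a lookup returns a Float64, not an Any.
proba = Dict{ToyEdge,Float64}(
    ToyEdge(1, 2) => 0.5, ToyEdge(1, 3) => 0.5,
    ToyEdge(2, 4) => 1.0, ToyEdge(3, 4) => 1.0,
)

# Sum of the transition probabilities leaving node 1.
total = sum(proba[e] for e in out_edges(g, 1))
```

Because the dictionary's value type is concrete, the compiler can infer the type of `proba[e]`, which is the advantage claimed over a `Dict{Symbol, Any}` property store.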

# - an index in the graph, and a stage
# - an optimization model (with objective, dynamics and constraints)
# - a value function
# - a list of outgoing edges


abstract type AbstractNode end
Member

Same here, in LightGraphs a node is just an Int.


Returns the tuple `(a, α)` representing the feasibility cut ``⟨a, x⟩ ≧ α`` certified by this solution.
"""
function feasibility_cut end
Member

Minor point, but none of the other methods have _ so we should just be consistent with feasibilitycut.

Member

I am ok with renaming it feasibilitycut :)

@odow
Member

odow commented Dec 3, 2017

> we concluded that it may be a good idea to store all probability objects inside the edges of the graph. That is, the edges would carry both the probability transitions between the nodes and the optimization model. In this case, stagewise independence would be implemented as a single edge storing a probability distribution.

What is a node? How does hazard-decision and decision-hazard fit into this framework?

@frapac
Member Author

frapac commented Dec 4, 2017

@odow To me, a node is where we store the model at time t (i.e., an objective, some constraints and a value function). The measurability of the controls is also specified there.

In the edges, we implement the probability law for moving from one node to another. If we have stagewise independence, we have a single outgoing edge storing a probability distribution.

Then, in the hazard-decision setting, SDDP would build a JuMP.Model on every edge, using the model specified in the previous node.

In decision-hazard, the JuMP.Model would be on the nodes, thus storing a bigger problem than in hazard-decision. If we were able to decompose the coupling constraint between the outgoing edges (e.g., with the price of information), we would store the JuMP.Model on the edges again.

@odow
Member

odow commented Dec 4, 2017

Consider a graph like this:

           x - x
         /      \
x - x - x        y - x
         \      /
           x - x 

Each x is a node at which the agent chooses a control u (in either a hazard-decision, or decision-hazard setting).

If we put the noise on the arcs, then the second-to-last node (y) has two incoming arcs. That suggests that there can be two different distributions for the stagewise independent random variable.

To me, this says that the stagewise independent noise has to be associated with a node. The graph and edges just show the linkages between nodes (stages).

As for D-H vs. H-D, I prefer we just have a "and these controls are non-anticipative" constraint. The fundamental model shouldn't change (i.e. storing bigger models, or on edges vs nodes etc).

@frapac
Member Author

frapac commented Dec 4, 2017

As far as I understand, the graph you have presented does not describe a stagewise independent case: the two edges entering y are conditional on the previous edges (let's call them x1 and x2). In this case, I think it makes sense to consider two different conditional distributions (with respect to x1 and x2), as we are not stagewise independent.

However, I have to admit that I view the graph of the sp model just as a way to express conditional dependencies between nodes. I think this is pretty close to the lattices introduced by Wozabal and Löhndorf.

Concerning DH, it seems to me that we cannot decouple the problem atom by atom because of the non-anticipativity constraint. I cannot see how to avoid dealing with a bigger problem when introducing wait-and-see controls, unless we decompose the non-anticipativity constraint with the information price.

What do you think?

@odow
Member

odow commented Dec 4, 2017

> As far as I understand, the graph you have presented does not describe a stagewise independent case

   b
 /  \
a    d
 \  /
   c 

Let's assume this process models rainfall (the stagewise independent random variable). There are also 3 "climate" states: Normal (a, d), Wet (b), and Dry (c). I arrive in a and observe a stagewise independent realization of the rainfall. Then I transition to either the Wet or the Dry climate state. There I observe another stagewise independent realization of the rainfall. However, the values or probabilities of the rainfall distribution can differ between nodes b and c (for example, 0mm w.p. 0.2 and 10mm w.p. 0.8 in node b, and 0mm w.p. 0.8 and 10mm w.p. 0.2 in node c). Next, I transition to node d and observe another stagewise independent realization of the rainfall. That observation is independent of whether I transitioned from b or c.
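
As a rough sketch of this example in plain Julia: the distributions for b and c are the ones quoted above, while `ClimateNode`, `children`, `noise_after` and the distribution given to node d are assumptions introduced only for illustration.

```julia
# Illustrative only: the struct, the helper names and node d's distribution
# are assumptions; only the values for b and c come from the example above.
struct ClimateNode
    rainfall::Vector{Float64}   # support of the noise, in mm
    proba::Vector{Float64}      # probability of each rainfall value
end

expected_rainfall(n::ClimateNode) = sum(n.rainfall .* n.proba)

nodes = Dict(
    :b => ClimateNode([0.0, 10.0], [0.2, 0.8]),   # Wet
    :c => ClimateNode([0.0, 10.0], [0.8, 0.2]),   # Dry
    :d => ClimateNode([0.0, 10.0], [0.5, 0.5]),   # Normal, placeholder values
)

# The arcs only record which node can follow which.
children = Dict(:a => [:b, :c], :b => [:d], :c => [:d])

# The noise is looked up from the node that is reached, so the distribution
# observed in d does not depend on the arc used to get there.
noise_after(path) = nodes[last(path)]

same = noise_after([:a, :b, :d]) === noise_after([:a, :c, :d])
wet, dry = expected_rainfall(nodes[:b]), expected_rainfall(nodes[:c])
```

Here b and c carry different distributions because they are different nodes, whereas both paths into d read the same node-level distribution.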

> I think it makes sense to consider two different conditional distributions

The distributions live inside a node rather than on an arc. It is the branching that gives conditional distributions in a node.

> I cannot see how to avoid dealing with a bigger problem when introducing wait-and-see controls, unless we decompose the non-anticipativity constraint with the information price.

I think we need to be careful distinguishing between the model (which has a "non-anticipative" constraint), and the form that SDDP finds nice for solving (i.e. the decoupled case).

@frapac
Member Author

frapac commented Dec 5, 2017

@odow You have found the right example. You have convinced me!

So, let's put stagewise independence in the nodes.

> I think we need to be careful distinguishing between the model (which has a "non-anticipative" constraint), and the form that SDDP finds nice for solving (i.e. the decoupled case).

Totally agree! Thus, as far as I understand, StochOptInterface would implement only the description of the problem, and SDDP.jl would build the JuMP.Model according to that description.

I think we are starting to gather all the ingredients needed to frame a proper StochOptInterface.

@joaquimg

joaquimg commented Dec 6, 2017

Coming late to the party...

I like distributions living in nodes, as @odow proposed. But are we planning to have separate nodes for distributions and for decisions?

> At a given node, solve all outgoing edges and get back a list of Solution object.

I don't think that solving all child nodes should be enforced. At least in the MultiCut version, one might want to sample which nodes to solve from time to time, or try whatever crazy idea one might have.

> we can avoid the use to store the state inside a Node if we use only Solution objects. While doing so, could we gain in efficiency?

I like Solution objects: algorithms are then free to keep solutions in their own preferred storage, which may or may not be a node of the graph.

@odow
Member

odow commented Dec 6, 2017

> But are we planning to have separate nodes for distributions and for decision?

No. Think of the nodes as decision problems parameterised by a random variable. They may include a "non-anticipative" constraint in the case of Decision-Hazard.

5 participants