Very difficult combinatorial auction and capacitated facility location Ecole instances #272
Replies: 2 comments
-
Hi, indeed the same issue is popping up in another thread, so there does seem to be a discrepancy between the original and the Ecole generator instances, at least for some problem classes. (You can generate the original instances yourself using the original code, if you'd like.) The Ecole generators were written to reproduce the original ones exactly, so this is clearly a problem; we'll look into it. I also notice that the set covering and maximum independent set problems look perhaps a bit too easy as well? Although it might have to do with the underlying switch from SCIP 6 to SCIP 7.
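For anyone who wants to check which SCIP version their setup is actually running, a quick sketch, assuming PySCIPOpt is installed alongside Ecole:

```python
import pyscipopt

# print the SCIP version that PySCIPOpt is linked against
print(pyscipopt.Model().version())
```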
-
Hi @cwfparsonson, so, I finally took the time to look into it. Here are my conclusions. I think there is a discrepancy in difficulty between the combinatorial auction instances from the Ecole generator and those from the original paper's generators. However, I didn't find any convincing evidence of an issue with the capacitated facility location problems: if I generate them with Ecole and with the original code, and solve them with the default SCIP setup for example, I get roughly the same mean number of nodes (see the PySCIPOpt sketch after the script below). I looked into your code, and spent probably a bit too much time on it, in fact. I think the discrepancies can be explained in a few ways. Here is the evaluation script I used:
```python
import time

import ecole
import numpy as np

seed = 0
np.random.seed(seed)
ecole.seed(seed)

class StrongBranchingAgent:
    def __init__(self, name='sb'):
        self.name = name

    def action_select(self, state, action_set, **kwargs):
        # state holds one strong branching score per column; restrict to
        # the current branching candidates and pick the highest-scoring one
        action_idx = state[action_set].argmax()
        return action_set[action_idx], action_idx

# init objects
# (EcoleBranching is the custom environment wrapper defined in the original post)
agent = StrongBranchingAgent()
env = EcoleBranching(observation_function=ecole.observation.StrongBranchingScores(pseudo_candidates=False),
                     information_function='default',
                     reward_function='default',
                     scip_params='gasse_2019')
co_classes = {'capacitated_facility_location': ecole.instance.CapacitatedFacilityLocationGenerator(n_customers=100, n_facilities=100)}

# solve instances
num_tests = 50
co_class_stats = {key: {'num_nodes': [], 'solving_time': []} for key in co_classes.keys()}
for co_class in co_classes.keys():
    num_episodes = 0
    while num_episodes < num_tests:
        # reset env with a freshly generated instance
        instance = next(co_classes[co_class])
        obs, action_set, reward, done, info = env.reset(instance)

        # solve instance with strong branching
        start_t = time.time()
        while not done:
            action, action_idx = agent.action_select(obs, action_set)
            obs, action_set, reward, done, info = env.step(action)

        # record stats
        solving_time = time.time() - start_t
        num_nodes = info['num_nodes']
        num_episodes += 1
        co_class_stats[co_class]['num_nodes'].append(num_nodes)
        co_class_stats[co_class]['solving_time'].append(solving_time)

for co_class in co_class_stats.keys():
    print(f'{co_class} | mean num_nodes: {np.mean(co_class_stats[co_class]["num_nodes"])} | mean solving_time: {np.mean(co_class_stats[co_class]["solving_time"]):.3f} s')
```

This script fits the MDP framework correctly. (Also, it's a good idea to evaluate on many more instances; I took 50 here, as there is a lot of variation.) If you do this, there doesn't seem to be any difference in mean difficulty. I'm not too sure why your code doesn't work, but, for sure, we never thought somebody would try to do something like this.
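As a sanity check that does not depend on any environment wrapper, the capacitated facility location comparison above can also be reproduced with plain default SCIP through PySCIPOpt. A minimal sketch, assuming PySCIPOpt is installed alongside Ecole (the seed and the 50-instance sample are arbitrary choices):

```python
import ecole
import numpy as np

# generate capacitated facility location instances exactly as in the script above
gen = ecole.instance.CapacitatedFacilityLocationGenerator(n_customers=100, n_facilities=100)
gen.seed(0)

num_nodes = []
for _ in range(50):
    instance = next(gen)
    model = instance.as_pyscipopt()  # hand the instance to PySCIPOpt unchanged
    model.hideOutput()
    model.optimize()  # solve with SCIP's default settings
    num_nodes.append(model.getNNodes())

print(f'mean num_nodes: {np.mean(num_nodes):.1f}')
```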
So I would suggest changing how you access the strong branching scores, i.e. passing them in through the environment's observation function parameter. As for the combinatorial auctions, we are still investigating the generators, but I'm sure we will eventually find the bug and fix it!
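To make that suggestion concrete, here is a minimal sketch of the pattern using the plain ecole.environment.Branching environment instead of a custom wrapper (the five-element reset/step tuples assume a recent Ecole release):

```python
import ecole

# request the strong branching scores via the observation function
env = ecole.environment.Branching(
    observation_function=ecole.observation.StrongBranchingScores(pseudo_candidates=False)
)
instance = next(ecole.instance.CombinatorialAuctionGenerator(n_items=100, n_bids=500))

obs, action_set, reward, done, info = env.reset(instance)
while not done:
    # obs holds one strong branching score per column (NaN for non-candidates);
    # restrict to the current branching candidates and take the best
    action = action_set[obs[action_set].argmax()]
    obs, action_set, reward, done, info = env.step(action)
```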
-
Hi,
I am trying to reproduce the results of the Gasse et al. 2019 paper (https://arxiv.org/pdf/1906.01629.pdf) for the 4 classes of CO problems. From Table 2 of the paper, it seems that in general, in terms of difficulty (evaluated by number of nodes) on the easy (training) sets, `set_covering` > `capacitated_facility_location` > `combinatorial_auction` > `maximum_independent_set`. However, I am finding that `capacitated_facility_location` and `combinatorial_auction` take several orders of magnitude more branch-and-bound nodes to solve than `set_covering` and `maximum_independent_set`, which is leading to prohibitively long dataset generation and training times.

I just wanted to check that I have the `ecole.instance.Generator` arguments and SCIP solver parameters correct in order to reproduce the Gasse et al. instances for the 4 CO classes (my understanding of the generator settings is summarised in the sketch at the end of this post). In the code below, I am generating `num_tests=5` instances for each class of CO problem as I understand them to be generated by Gasse et al., and then solving the instances with strong branching. As shown, this results in orders-of-magnitude differences in number of nodes and solving time. Any help would be enormously appreciated!

Output:
For completeness, here is how I define the `agent` and `env` classes in the above:
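For reference, here is a minimal sketch of the Ecole generator settings that I believe correspond to the paper's easy (training) sets; the sizes are my own reading of the paper, so please correct me if any of them are wrong:

```python
import ecole

# easy (training) set sizes as I understand them from Gasse et al. 2019
co_classes = {
    'set_covering': ecole.instance.SetCoverGenerator(n_rows=500, n_cols=1000),
    'combinatorial_auction': ecole.instance.CombinatorialAuctionGenerator(n_items=100, n_bids=500),
    'capacitated_facility_location': ecole.instance.CapacitatedFacilityLocationGenerator(n_customers=100, n_facilities=100),
    'maximum_independent_set': ecole.instance.IndependentSetGenerator(n_nodes=500),
}
```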