Implement the extended Rosenbrock function \[f(x) = \sum_{i=1}^{n/2} \left[a(x_{2i} - x^2_{2i-1})^2 + (1 - x_{2i-1})^2\right]\] where \(a\) is a parameter that you can vary (for example, 1 or 100). The minimum is \(\vec x^* = [1, 1, \ldots, 1]\), \(f(\vec x^*) = 0\). Consider as starting point \([-1, -1, \ldots, -1]\).
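With the 1-based indices above, the odd components \(x_{2i-1}\) pair with the even components \(x_{2i}\). A minimal NumPy sketch (the function name is our own choice):

```python
import numpy as np

def ext_rosenbrock(x, a=100):
    """Extended Rosenbrock: sum over pairs (x_{2i-1}, x_{2i}), 1-based."""
    x = np.asarray(x, dtype=float)
    odd = x[0::2]   # x_1, x_3, ... = x_{2i-1}
    even = x[1::2]  # x_2, x_4, ... = x_{2i}
    return float(np.sum(a * (even - odd**2)**2 + (1 - odd)**2))

print(ext_rosenbrock([1.0] * 6))   # 0.0 at the minimum x* = [1, ..., 1]
print(ext_rosenbrock([-1.0] * 6))  # 1212.0 at the starting point (a = 100)
```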
Solve the minimization problem with scipy.optimize using all methods seen in class that are suitable for this task. Observe the behavior of the solvers for various values of the parameters.
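A sketch of such a comparison loop over some of the `scipy.optimize.minimize` methods suited to smooth unconstrained problems (the method list here is a plausible selection, not necessarily the one from class):

```python
import numpy as np
from scipy.optimize import minimize

def ext_rosenbrock(x, a=100):
    # extended Rosenbrock over pairs (x_{2i-1}, x_{2i}), 1-based indices
    x = np.asarray(x, dtype=float)
    return float(np.sum(a * (x[1::2] - x[0::2]**2)**2 + (1 - x[0::2])**2))

n = 10
x0 = -np.ones(n)  # starting point [-1, ..., -1]
for a in (1, 100):
    for method in ("Nelder-Mead", "Powell", "CG", "BFGS", "L-BFGS-B"):
        res = minimize(ext_rosenbrock, x0, args=(a,), method=method)
        print(f"a={a:3d} {method:12s} f={res.fun:.2e} "
              f"nfev={res.nfev:5d} success={res.success}")
```

Varying `a` changes the conditioning of the valley, so the number of function evaluations each method needs can differ markedly between `a = 1` and `a = 100`.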
Use the COCO test suite (see article) to carry out this exercise. The advantage of the platform is that it provides:
a set of problem instances to use, about 1000 to 5000 problems (number of functions \(\times\) number of dimensions \(\times\) number of instances)
a collection of results from the literature
tools to launch and analyze the experiments
The COCO framework organizes functions into suites. Functions, \(f_i\), within a suite are distinguished by their identifier \(i=1,2,\ldots\). They are further parametrized by the (input) dimension, \(n\), and the instance number, \(j\). We can think of \(j\) as an index to a continuous parameter vector setting. It parametrizes, among other things, search space translations and rotations. In practice, the integer \(j\) identifies a single instantiation of these parameters. We then have: \[f^j_i \equiv f[n,i,j] : {\mathbb{R}}^n \to {\mathbb{R}}\qquad \vec x\mapsto f^j_i (\vec x)= f[n,i,j](\vec x).\] Varying \(n\) or \(j\) leads to a variation of the same function \(i\) of a given suite. Fixing \(n\) and \(j\) of function \(f_i\) defines an optimization problem instance \((n,i,j) \equiv (f_i,n,j)\) that can be presented to the solver. Each problem receives again an index within the suite, mapping the triple \((n,i,j)\) to a single number.
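The size of a suite is simply the product of the three factors; for bbob (24 functions, 6 dimensions, typically 15 instances) a flat enumeration of the triples \((n,i,j)\) can be sketched as follows (an illustrative enumeration, not COCO's internal ordering):

```python
import itertools

dimensions = [2, 3, 5, 10, 20, 40]  # bbob dimensions n
functions = range(1, 25)            # function identifiers i = 1..24
instances = range(1, 16)            # instance numbers j = 1..15

# flat index: each triple (n, i, j) is mapped to a single number
problems = list(itertools.product(dimensions, functions, instances))
index_of = {triple: k for k, triple in enumerate(problems)}

print(len(problems))        # 2160 problem instances in total
print(index_of[(2, 1, 1)])  # the first triple receives index 0
```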
Varying the instance parameter \(j\) represents a natural randomization for experiments in order to:
generate repetitions on a single function for deterministic solvers, making deterministic and non-deterministic solvers directly comparable (both are benchmarked with the same experimental setup)
average away irrelevant aspects of the function definition,
alleviate the problem of overfitting, and
prevent exploitation of artificial function properties.
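Conceptually, an instance number \(j\) fixes a random translation and rotation of a base function. A toy sketch of this idea (illustrative only, not COCO's actual parametrization) on the sphere function:

```python
import numpy as np

def make_instance(base_f, n, j):
    """Toy instance generator: the instance number j seeds a random
    shift and rotation of base_f (not COCO's real transform)."""
    rng = np.random.default_rng(j)
    t = rng.uniform(-4, 4, n)                     # translated optimum
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))  # random rotation
    f = lambda x: base_f(q @ (np.asarray(x, dtype=float) - t))
    return f, t  # also return the optimum location for checking

sphere = lambda x: float(np.sum(np.asarray(x, dtype=float) ** 2))
f_j, xstar = make_instance(sphere, 5, j=3)
print(f_j(xstar))  # 0.0: the optimal value is invariant under the transform
```

Because the optimal value is preserved while the landscape is moved around, repeated runs over instances behave like independent repetitions of the same underlying problem.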
We focus here only on the comparison between different implementations of BFGS. In particular, we compare a freshly run execution of scipy.optimize.minimize(method='BFGS') against the values collected in the suite.
import cocoex  # experimentation module
import cocopp  # post-processing module (not strictly necessary)
import scipy.optimize  # to define the solver to be benchmarked

### input: define suite and solver (see also "input" below where fmin is called)
suite_name = "bbob"
budget_multiplier = 7  # increase to 3, 10, 30, ... x dimension

### prepare
suite = cocoex.Suite(suite_name, "", "")  # see https://numbbo.github.io/coco-doc/C/#suite-parameters
# suite = cocoex.Suite(suite_name, "instances: 1-5", "dimensions: 2,3,5,10,20")
output_folder = '{}_of_{}_{}D_on_{}'.format(
    "bfgs", scipy.optimize.minimize.__module__ or '',
    int(budget_multiplier + 0.499), suite_name)
observer = cocoex.Observer(suite_name, "result_folder: " + output_folder)
repeater = cocoex.ExperimentRepeater(budget_multiplier)  # , min_successes=4)  # x dimension
minimal_print = cocoex.utilities.MiniPrint()

### go
while not repeater.done():  # while budget is left and successes are few
    for problem in suite:  # loop takes 2-3 minutes x budget_multiplier
        if repeater.done(problem):
            continue
        problem.observe_with(observer)  # generate data for cocopp
        problem(problem.dimension * [0])  # for better comparability
        ### input: the next three lines need to be adapted to the specific fmin
        # xopt = fmin(problem, repeater.initial_solution_proposal(problem), disp=False)  # could depend on budget_multiplier
        xopt = scipy.optimize.minimize(problem, repeater.initial_solution_proposal(problem),
                                       method="BFGS", options={"disp": False})  # could depend on budget_multiplier
        problem(xopt.x)  # make sure the returned solution is evaluated
        repeater.track(problem)  # track evaluations and final_target_hit
        minimal_print(problem)  # show progress

### post-process data
dsl = cocopp.main(observer.result_folder + ' bfgs-scipy*')  # re-run folders look like "...-001" etc
An aggregate comparison is carried out by means of empirical cumulative distribution functions (ECDFs), to be explained in class. The analysis is visualized in Figure 1. The plots show that the three algorithms perform very close to each other, as expected. However, for the smaller dimensions the freshly tested version seems to have some trouble. Why?
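As a minimal illustration of the idea behind such plots (the runtimes below are made-up numbers): an ECDF reports, for each evaluation budget, the fraction of (problem, target) pairs solved within that budget.

```python
import numpy as np

# hypothetical runtimes: evaluations needed until a target was hit;
# np.inf marks targets that were never reached
runtimes = np.array([120.0, 340.0, 560.0, 900.0, np.inf, np.inf])
budgets = np.logspace(1, 4, 50)  # evaluation budgets from 10 to 10**4
ecdf = np.array([(runtimes <= b).mean() for b in budgets])

print(ecdf[0], ecdf[-1])  # 0.0 at budget 10, 2/3 at budget 10**4
```

Plotting `ecdf` against `budgets` on a log-scaled x-axis gives the kind of aggregated curve shown in the figure; higher and further-left curves indicate better performance.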
Comparison of BFGS implementations in COCO by means of aggregated ECDFs.