Optimization

This chapter gives an overview over the optimization process. SACOBRA’s optimization process consists of two phases Phase I and Phase II, where Phase I is optional.

Phase I

TODO

Phase II

class cobraPhaseII.CobraPhaseII(cobra: CobraInitializer)

SACOBRA phase II executor.

Information is communicated via object CobraInitializer cobra (with elements SACoptions sac_opts and dictionary cobra.sac_res) and via object Phase2Vars p2 (internal variables needed in phase II).

get_cobra()

Returns:: object cobra with all COBRA variables
Return type:: CobraInitializer

get_p2()

Returns:: phase II variables
Return type:: Phase2Vars

start(gcop: GCOP | None = None)

Start the main optimization loop of phase II.

Perform in a loop, until the budget feval is exhausted:

select cyclically an element p2.ro from DRC XI. Add the appropriate DRC-constraint as extra constraint to the set of constraints

train RBF surrogate models on the current set of infill points (see Surrogator1.AdFitter, Surrogator1, Surrogator2, RBFmodel)

select start point xStart: either current-best xbest or random start (see RandomStarter)

perform sequential optimization, starting from xStart, subject to the nConstraint + 1 constraints. Result is xNew = p2.opt_res['x'] (see SeqOptimizer)

evaluate xNew on the real functions + (if EQU.active) do refine step. Result is the updated EvaluatorReal object p2.ev1

calculate p-effect for onlinePLOG (see Surrogator1.calcPEffect())

update cobra information: The new infill point xNew and its evaluation on the real functions is added to the cobra’s arrays A, Fres, Gres

update and save cobra: data frames df, df2, elements sac_res['xbest', 'fbest', 'ibest'] of dict cobra.sac_res

adjust margins p2.EPS, \(\mu\) (see EQUoptions), adjust \(\rho\) (see RBFoptions) and p2.Cfeas, p2.Cinfeas (see phase2Funcs.adjustMargins())

The result is a modified object cobra (detailed results in dict sac_res and detailed diagnostic info in data frames df, df2). The optimization results can be retrieved from cobra with methods is_feasible(), get_fbest(), get_feasible_best(), get_xbest() and get_xbest_cobra().

Further results are in object Phase2Vars self.p2.

Parameters:: gcop (GCOP) – (optional) a COP object, just needed to retrieve ncall at the end of optimization phase II. If None then the default ncall=0 remains.
Returns:: self

class phase2Vars.Phase2Vars(cobra: CobraInitializer)

This class is just a container for variables needed by CobraPhaseII (in addition to CobraInitializer cobra). These variables include:

EPS number, the current safety margin \(\epsilon\) in constraint surrogates, see safety margin
currentMu number, the current equality margin \(\mu\), see refine step
num the number of real function evaluations carried out
globalOptCounter counter of the global optimization steps in phase II, excluding repair and trust region
Cfeas how many feasible infills in a row (see phase2Funcs.adjustMargins(), updateInfoAndCounters)
Cinfeas how many infeasible infills in a row (see phase2Funcs.adjustMargins(), updateInfoAndCounters)
fitnessSurrogate the objective surrogate model
constraintSurrogates the constraint surrogate models
pEffect number, calculated in Surrogator1.calcPEffect() in each iteration: If > 0, apply plog(Fres).
PLOG boolean vector: whether plog(Fres) was applied in iteration i
pshift float vector: with which p-shift was plog(Fres) applied (if at all) in iteration i

Example: p2 = Phase2Vars(); print(p2.num);

phase2Funcs.adjustMargins(cobra: CobraInitializer, p2: Phase2Vars)

Adjust margins \(\epsilon =\) p2.EPS, \(\mu =\) p2.currentMu and \(\rho =\) cobra.sac_opts.RBF.rho; conditionally reset counters p2.Cfeas, p2.Cinfeas.

If p2.Cfeas >= Tfeas, then halve \(\epsilon\).

If p2.Cinfeas >= Tinfeas, then double \(\epsilon\) and clip it at epsilonMax.

Tfeas, Tinfeas, epsilonMax are members of cobra.sac_opts.SEQ.

Parameters:

cobra – SACOBRA settings and results
p2 – these members may be changed : EPS, currentMu, Cfeas, Cinfeas

The p-Effect

Functions with both very small and very large slopes (e.g. \(f(x)=exp(x^2)\)) are difficult to model accurately by RBF models: The models tend to oscillate. It is much better to squash the function with a log-like transform to a smaller slope range. On the other hand, for very simple functions like \(f(x)=x\) it is disadvantageous to squash them, because a linear function becomes non-linear and thus harder to model.

The p-effect is a number \(p_\text{eff}\) which allows to decide in each iteration whether to model the objective function \(f()\) directly or whether to model \(plog(f())\). Here

\[ plog(y) = \begin{cases} +\ln( 1+\bar{y}) & \mbox{ if } \quad \bar{y} \geq 0 \\ -\ln( 1-\bar{y}) & \mbox{ if } \quad \bar{y} < 0 \end{cases} \]

is a strictly monotonic squashing function, where \(\bar{y}=y-p_{shift}\) with default pshift = 0.

Given the set \(S\) of already evaluated points in input space and given \(M_f, M_p\) as the surrogate models for \(f(), plog(f())\) when using all points in \(S\), we calculate for a new infill point \(\vec{x}_{new}\) the ratio

\[ p_k = \left\{ \frac{\left\| M_f(\vec{x}_{new})-f(\vec{x}_{new})\right\| }{\left\| plog^{-1}\left(M_p(\vec{x}_{new})\right)-f(\vec{x}_{new})\right\|} \right\} \]`

If \(p_k>1\), then \(M_p\) has the smaller error. If \(p_k<1\), then \(M_f\) has the smaller error.

For a more stable decision we define the p-effect number as

\[ p_\text{eff} = \log_{10}(\text{median}\{ p_k \}) \]`

and decide to build \(M_p\) if \(p_\text{eff}>0\) and to build \(M_f\) else.

The calculation of \(p_\text{eff}\) is done in Surrogator1.calcPEffect(). The conditional application of \(plog()\) is done in Surrogator1.AdFitter.

Details for onlinePLOG

The above p-effect recipe is realized in case ISA.onlinePLOG = O_LOGIC.XNEW. It tracks online (in each iteration) which of the two objective surrogate models is better. It uses the new infill point \(\vec{x}_{new}\) (found after sequential optimization), which is usually different from all earlier points. But a closer look reveals that there are two drawbacks:

The p-effect can only be calculated after the sequential optimization of the current iteration and so it can be only used for the model decision in the next iteration, which might be confusing.
It is not guaranteed that the new infill point \(\vec{x}_{new}\) is different from all earlier points. Especially if the distance requirement is 0, \(\vec{x}_{new}\) may be very close or identical to earlier points. Then both approximation errors will be very small or even 0, which causes the error ratio to be unreliable.

An alternative (better) p-effect recipe is the case ISA.onlinePLOG = O_LOGIC.MIDPTS. In this case we replace \(\vec{x}_{new}\) with a set of points, namely all \(P=p(p-1)/2\) midpoints \(\vec{m}_{\ell}\) between the \(p =\) ISA.pEffNpts first points of initial design matrix A. This has the advantage that the p-effect can be calulated directly after training the surrogate models \(M_f, M_p\) (because the midpoints are known in advance) and that the new decision number

\[ p_k = \left\{ \frac{\sum_{\ell=1}^P{\left\| M_f(\vec{m}_{\ell})-f(\vec{m}_{\ell})\right\|} } {\sum_{\ell=1}^P{\left\| plog^{-1}\left(M_p(\vec{m}_{\ell})\right)-f(\vec{m}_{\ell})\right\|} } \right\} \]`

is more robust because it is based on many points \(\vec{m}_{\ell}\). These midpoints are usually not close to trained points and if a single midpoint \(\vec{m}_{\ell}\) happens to be close to trained points (and has small errors), then this will not change the new error ratio \(p_k\) very much, because numerator and denominator are now sums of errors. The calculation of \(p_\text{eff}\) from \(p_k\) proceeds in the same way as described in The p-Effect.

The calculation of the new ratio \(p_k\) and the emerging \(p_\text{eff}\) is done in Surrogator2.calcPEffectNew().

A third option is ISA.onlinePLOG = O_LOGIC.NONE (no online plog). In this case we do not decide online about \(plog()\) but instead make a decision once after initial design and keep this decision for all iterations. The decision is based on the min-max range of Fres after initial design, which is compared with threshold ISA.TFRange. If larger than threshold, apply \(plog()\) to \(f()\); if smaller than threshold, use the unsquashed \(f()\).

The Refine Step

The refine step is part of SACOBRA_Py’s equality handling mechanism and is executed if the COP contains equality constraints and EQU.active = EQU.refine = True.

SACOBRA starts with an initially large artificial feasibility band \(\mu\) around each equality constraint. After a sequential optimization step, the infill point will be usually at the rim of the feasibility band. In order to refine the solution, we run a refine optimization with objective

\[ \text{Minimize} \sum_i \left(\max(0,g_i(x)\right)^2 + \sum_j \left(h_j(x)\right)^2 \]`

In the case of one equality constraints, this projects from the rim of the feasibility band to the central equality surface. In the case of multiple constraints, things get of course more involved, but usually the refine step does a very good job in reducing the maximum constraint violation. This is important when the feasibility band \(\mu\) is reduced for the next iteration: Otherwise, a solution ‘on the rim’ that was (artificially) feasible in the current iteration, might get lost in the next iteration because it is then infeasible.

The refine step is carried out in EvaluatorReal.equ_refine() which is called from EvaluatorReal.update().

Safety Margin \(\epsilon\)

\(\epsilon =\) p2.EPS provides a safety margin to maintain feasibility in the early optimization phase (when the surrogate models are not yet precise): We add the term \(+ \epsilon^2\) to each constraint prediction during sequential optimization. The effect is that for every \(\epsilon > 0\) the infill point remains at a certain distance from the constraint boundary.

\(\epsilon\) is initialized with parameter epsilonInit from SEQoptions. Method phase2Funcs.adjustMargins() adjusts \(\epsilon\) after each iteration.

If SEQ.finalEpsXiZero == True, then we perform the final iteration with \(\epsilon = 0\) and with DRC = 0. That is, we exploit maximally in the final iteration (assuming that we need no safety margin for the final model).

Repair Infeasible Solutions

class repairInfeasRI2.RI2(cobra: CobraInitializer)

A class to perform the repair of infeasible solutions with method RI2, see RI2.repairInfeasRI2() for details.

See RIoptions for all repair-infeasible options and for the definition of eps-feasibility.

Parameters:: cobra – an initialized SACOBRA object with settings and results

find_best_infeasible(x, R, eps2: float, constrSurr: RBFmodel)

Given a set of k=0,...,K-1 points z[k] = x+R[k,:], find this point z[kBest] which has

the lowest number of constraints being not eps2-feasible, and,
if there is more than one such point, select among them the point with the lowest maximum violation.

Parameters:

x – point in input space, vector of dimension d. If ID.rescale==True then x should be (forward-) rescaled.
R – (K,d)-matrix where row k holds delta vector z[k]-x
eps2 – safety margin
constrSurr – constraint surrogate models

Returns:

the best residual R[kBest,:] = z[kBest]-x

is_epsilon_feasible(x: ndarray, eps2: float, constrSurr: RBFmodel)

Return True, if all constraint surrogates have a feasibility ‘better’ than eps2 (i.e. they are -eps2 or lower)

Parameters:

x – point in input space, vector of dimension d. If ID.rescale==True then x should be (forward-) rescaled.
eps2 – safety margin
constrSurr – constraint surrogate models

Returns:

whether x is eps2-feasible

repairInfeasRI2(x: ndarray, gReal: ndarray, constrSurr: RBFmodel, cobra: CobraInitializer, p2: Phase2Vars, currentMu, checkIt, true_grad=None) → ndarray

Repair an infeasible solution with method RI2.

If the solution x is infeasible, i.e. if there is any i or any j such that \(g_i(x) > 0\) or \(|h_j(x)| - \mu > 0\), then:

Estimate the gradient of the constraint surrogate function(s) (go a tiny step in each dimension in the direction of constraint increase).
Take RI.mmax random realizations in the feasible parallelepiped and select among them the best feasible solution, based on the surrogates.
Check whether the new solution is for every dimension in the bounds [lower, upper] of the search region. If not, set the gradient to 0 in these dimensions and re-iterate from step 2.

There is no guarantee but a good chance, that the returned solution z will be feasible.

For further details see:

[Koch15a] Koch, P.; Bagheri, S.; Konen, W. et al. “A New Repair Method For Constrained Optimization”. Proc. 17th Genetic and Evolutionary Computation Conference (GECCO), 2015.

Parameters:

x – an infeasible solution vector of dimension d
gReal – a vector \((g_1(x), \ldots, g_m(x), h_1(x), \ldots, h_r(x))\)
constrSurr – the constraint surrogate models
cobra – an object of class CobraInitializer, we need here: lower, upper, RIoptions s_opts.RI
currentMu – \(\mu\), margin for equality constraints. Only relevant if equality constraints exist
checkIt – if True, perform a check whether the returned solution is really feasible. Needs access to the true constraint functions.
true_grad – (optional) if not None and if checkIt=True, assert that the calculated grad_mat and true_grad are close

Returns:

z, a vector of dimension d with a repaired (hopefully feasible) solution