Optimization
This chapter gives an overview over the optimization process. SACOBRA’s optimization process consists of two phases Phase I and Phase II, where Phase I is optional.
Phase I
TODO
Phase II
- class cobraPhaseII.CobraPhaseII(cobra: CobraInitializer)
SACOBRA phase II executor.
Information is communicated via object
CobraInitializercobra(with elementsSACoptionssac_optsand dictionary cobra.sac_res) and via objectPhase2Varsp2(internal variables needed in phase II).- get_cobra()
- Returns:
object
cobrawith all COBRA variables- Return type:
- get_p2()
- Returns:
phase II variables
- Return type:
- start(gcop: GCOP | None = None)
Start the main optimization loop of phase II.
Perform in a loop, until the budget
fevalis exhausted:select cyclically an element
p2.rofrom DRCXI. Add the appropriate DRC-constraint as extra constraint to the set of constraintstrain RBF surrogate models on the current set of infill points (see
Surrogator1.AdFitter,Surrogator1,Surrogator2,RBFmodel)select start point
xStart: either current-bestxbestor random start (seeRandomStarter)perform sequential optimization, starting from
xStart, subject to thenConstraint + 1constraints. Result isxNew = p2.opt_res['x'](seeSeqOptimizer)evaluate
xNewon the real functions + (ifEQU.active) do refine step. Result is the updatedEvaluatorRealobjectp2.ev1calculate p-effect for
onlinePLOG(seeSurrogator1.calcPEffect())update cobra information: The new infill point
xNewand its evaluation on the real functions is added to thecobra’s arraysA,Fres,Gresupdate and save
cobra: data frames df, df2, elementssac_res['xbest', 'fbest', 'ibest']of dict cobra.sac_resadjust margins
p2.EPS, \(\mu\) (seeEQUoptions), adjust \(\rho\) (seeRBFoptions) andp2.Cfeas,p2.Cinfeas(seephase2Funcs.adjustMargins())
The result is a modified object
cobra(detailed results in dict sac_res and detailed diagnostic info in data frames df, df2). The optimization results can be retrieved fromcobrawith methodsis_feasible(),get_fbest(),get_feasible_best(),get_xbest()andget_xbest_cobra().Further results are in object
Phase2Varsself.p2.- Parameters:
gcop (GCOP) – (optional) a COP object, just needed to retrieve
ncallat the end of optimization phase II. IfNonethen the defaultncall=0remains.- Returns:
self
- class phase2Vars.Phase2Vars(cobra: CobraInitializer)
This class is just a container for variables needed by
CobraPhaseII(in addition toCobraInitializercobra). These variables include:EPS number, the current safety margin \(\epsilon\) in constraint surrogates, see safety margin
currentMu number, the current equality margin \(\mu\), see refine step
num the number of real function evaluations carried out
globalOptCounter counter of the global optimization steps in phase II, excluding repair and trust region
Cfeas how many feasible infills in a row (see
phase2Funcs.adjustMargins(),updateInfoAndCounters)Cinfeas how many infeasible infills in a row (see
phase2Funcs.adjustMargins(),updateInfoAndCounters)fitnessSurrogate the objective surrogate model
constraintSurrogates the constraint surrogate models
pEffect number, calculated in
Surrogator1.calcPEffect()in each iteration: If > 0, apply plog(Fres).PLOG boolean vector: whether plog(Fres) was applied in iteration i
pshift float vector: with which p-shift was plog(Fres) applied (if at all) in iteration i
Example:
p2 = Phase2Vars(); print(p2.num);
- phase2Funcs.adjustMargins(cobra: CobraInitializer, p2: Phase2Vars)
Adjust margins \(\epsilon =\)
p2.EPS, \(\mu =\)p2.currentMuand \(\rho =\)cobra.sac_opts.RBF.rho; conditionally reset countersp2.Cfeas,p2.Cinfeas.If
p2.Cfeas >= Tfeas, then halve \(\epsilon\).If
p2.Cinfeas >= Tinfeas, then double \(\epsilon\) and clip it atepsilonMax.Tfeas,Tinfeas,epsilonMaxare members ofcobra.sac_opts.SEQ.- Parameters:
cobra – SACOBRA settings and results
p2 – these members may be changed :
EPS,currentMu,Cfeas,Cinfeas
The p-Effect
Functions with both very small and very large slopes (e.g. \(f(x)=exp(x^2)\)) are difficult to model accurately by RBF models: The models tend to oscillate. It is much better to squash the function with a log-like transform to a smaller slope range. On the other hand, for very simple functions like \(f(x)=x\) it is disadvantageous to squash them, because a linear function becomes non-linear and thus harder to model.
The p-effect is a number \(p_\text{eff}\) which allows to decide in each iteration whether to model the objective function \(f()\) directly or whether to model \(plog(f())\). Here
\[ plog(y) = \begin{cases} +\ln( 1+\bar{y}) & \mbox{ if } \quad \bar{y} \geq 0 \\ -\ln( 1-\bar{y}) & \mbox{ if } \quad \bar{y} < 0 \end{cases} \]is a strictly monotonic squashing function, where
\(\bar{y}=y-p_{shift}\) with default pshift = 0.
Given the set \(S\) of already evaluated points in input space and given \(M_f, M_p\) as the surrogate models for \(f(), plog(f())\) when using all points in \(S\), we calculate for a new infill point \(\vec{x}_{new}\) the ratio
\[ p_k = \left\{ \frac{\left\| M_f(\vec{x}_{new})-f(\vec{x}_{new})\right\| }{\left\| plog^{-1}\left(M_p(\vec{x}_{new})\right)-f(\vec{x}_{new})\right\|} \right\} \]`If \(p_k>1\), then \(M_p\) has the smaller error. If \(p_k<1\), then \(M_f\) has the smaller error.
For a more stable decision we define the p-effect number as
\[ p_\text{eff} = \log_{10}(\text{median}\{ p_k \}) \]`and decide to build \(M_p\) if \(p_\text{eff}>0\) and to build \(M_f\) else.
The calculation of \(p_\text{eff}\) is done in Surrogator1.calcPEffect().
The conditional application of \(plog()\) is done in Surrogator1.AdFitter.
Details for onlinePLOG
The above p-effect recipe is realized in case ISA.onlinePLOG = O_LOGIC.XNEW. It tracks online (in each
iteration) which of the two objective surrogate models is better. It uses the new infill point \(\vec{x}_{new}\)
(found after sequential optimization), which is usually different from all earlier points.
But a closer look reveals that there are two drawbacks:
The p-effect can only be calculated after the sequential optimization of the current iteration and so it can be only used for the model decision in the next iteration, which might be confusing.
It is not guaranteed that the new infill point \(\vec{x}_{new}\) is different from all earlier points. Especially if the distance requirement is 0, \(\vec{x}_{new}\) may be very close or identical to earlier points. Then both approximation errors will be very small or even 0, which causes the error ratio to be unreliable.
An alternative (better) p-effect recipe is the case ISA.onlinePLOG = O_LOGIC.MIDPTS. In this case we replace
\(\vec{x}_{new}\) with a set of points, namely all \(P=p(p-1)/2\) midpoints \(\vec{m}_{\ell}\) between the \(p =\)
ISA.pEffNpts first points of initial design matrix A. This has the advantage that the p-effect can be calulated
directly after training the surrogate models \(M_f, M_p\) (because the midpoints are known in advance) and that the
new decision number
is more robust because it is based on many points \(\vec{m}_{\ell}\). These midpoints are usually not close to trained points and if a single midpoint \(\vec{m}_{\ell}\) happens to be close to trained points (and has small errors), then this will not change the new error ratio \(p_k\) very much, because numerator and denominator are now sums of errors. The calculation of \(p_\text{eff}\) from \(p_k\) proceeds in the same way as described in The p-Effect.
The calculation of the new ratio \(p_k\) and the emerging \(p_\text{eff}\) is done in Surrogator2.calcPEffectNew().
A third option is ISA.onlinePLOG = O_LOGIC.NONE (no online plog).
In this case we do not decide online about \(plog()\) but instead make a decision once after initial design and keep this decision for all iterations. The
decision is based on the min-max range of Fres after initial design, which is compared with
threshold ISA.TFRange. If larger than threshold, apply \(plog()\) to \(f()\); if smaller than threshold, use
the unsquashed \(f()\).
The Refine Step
The refine step is part of SACOBRA_Py’s equality handling mechanism and is executed if the COP contains equality
constraints and EQU.active = EQU.refine = True.
SACOBRA starts with an initially large artificial feasibility band \(\mu\) around each equality constraint. After a sequential optimization step, the infill point will be usually at the rim of the feasibility band. In order to refine the solution, we run a refine optimization with objective
\[ \text{Minimize} \sum_i \left(\max(0,g_i(x)\right)^2 + \sum_j \left(h_j(x)\right)^2 \]`In the case of one equality constraints, this projects from the rim of the feasibility band to the central equality surface. In the case of multiple constraints, things get of course more involved, but usually the refine step does a very good job in reducing the maximum constraint violation. This is important when the feasibility band \(\mu\) is reduced for the next iteration: Otherwise, a solution ‘on the rim’ that was (artificially) feasible in the current iteration, might get lost in the next iteration because it is then infeasible.
The refine step is carried out in EvaluatorReal.equ_refine() which is called from EvaluatorReal.update().
Safety Margin \(\epsilon\)
\(\epsilon =\) p2.EPS provides a safety margin to maintain feasibility in the early optimization phase
(when the surrogate models are not yet precise): We add the term \(+ \epsilon^2\) to each constraint prediction
during sequential optimization. The effect is that for every
\(\epsilon > 0\) the infill point remains at a certain distance from the constraint boundary.
\(\epsilon\) is initialized with parameter epsilonInit from SEQoptions.
Method phase2Funcs.adjustMargins() adjusts \(\epsilon\) after each iteration.
If SEQ.finalEpsXiZero == True, then we perform the final iteration with
\(\epsilon = 0\) and with DRC = 0. That is, we exploit maximally in the final iteration (assuming that we need
no safety margin for the final model).
Repair Infeasible Solutions
- class repairInfeasRI2.RI2(cobra: CobraInitializer)
A class to perform the repair of infeasible solutions with method RI2, see
RI2.repairInfeasRI2()for details.See
RIoptionsfor all repair-infeasible options and for the definition ofeps-feasibility.- Parameters:
cobra – an initialized SACOBRA object with settings and results
- find_best_infeasible(x, R, eps2: float, constrSurr: RBFmodel)
Given a set of
k=0,...,K-1pointsz[k] = x+R[k,:], find this pointz[kBest]which hasthe lowest number of constraints being not
eps2-feasible, and,if there is more than one such point, select among them the point with the lowest maximum violation.
- Parameters:
x – point in input space, vector of dimension
d. IfID.rescale==Truethenxshould be (forward-) rescaled.R –
(K,d)-matrix where rowkholds delta vectorz[k]-xeps2 – safety margin
constrSurr – constraint surrogate models
- Returns:
the best residual
R[kBest,:]=z[kBest]-x
- is_epsilon_feasible(x: ndarray, eps2: float, constrSurr: RBFmodel)
Return True, if all constraint surrogates have a feasibility ‘better’ than
eps2(i.e. they are-eps2or lower)- Parameters:
x – point in input space, vector of dimension
d. IfID.rescale==Truethenxshould be (forward-) rescaled.eps2 – safety margin
constrSurr – constraint surrogate models
- Returns:
whether
xiseps2-feasible
- repairInfeasRI2(x: ndarray, gReal: ndarray, constrSurr: RBFmodel, cobra: CobraInitializer, p2: Phase2Vars, currentMu, checkIt, true_grad=None) ndarray
Repair an infeasible solution with method RI2.
If the solution
xis infeasible, i.e. if there is anyior anyjsuch that \(g_i(x) > 0\) or \(|h_j(x)| - \mu > 0\), then:Estimate the gradient of the constraint surrogate function(s) (go a tiny step in each dimension in the direction of constraint increase).
Take
RI.mmaxrandom realizations in the feasible parallelepiped and select among them the best feasible solution, based on the surrogates.Check whether the new solution is for every dimension in the bounds
[lower, upper]of the search region. If not, set the gradient to 0 in these dimensions and re-iterate from step 2.
There is no guarantee but a good chance, that the returned solution
zwill be feasible.For further details see:
[Koch15a] Koch, P.; Bagheri, S.; Konen, W. et al. “A New Repair Method For Constrained Optimization”. Proc. 17th Genetic and Evolutionary Computation Conference (GECCO), 2015.
- Parameters:
x – an infeasible solution vector of dimension
dgReal – a vector \((g_1(x), \ldots, g_m(x), h_1(x), \ldots, h_r(x))\)
constrSurr – the constraint surrogate models
cobra – an object of class
CobraInitializer, we need here:lower,upper,RIoptionss_opts.RIcurrentMu – \(\mu\), margin for equality constraints. Only relevant if equality constraints exist
checkIt – if True, perform a check whether the returned solution is really feasible. Needs access to the true constraint functions.
true_grad – (optional) if not None and if checkIt=True, assert that the calculated
grad_matandtrue_gradare close
- Returns:
z, a vector of dimensiondwith a repaired (hopefully feasible) solution