2. Black-box optimization
In many areas of engineering, one is interested in finding the best solution from a set of available alternatives. The notion of “good” or “bad” is usually expressed via a cost function \(J(\mathbf{w})\) defined over a vector of variables \(\mathbf{w}=[w_1, \dots,w_N]^\top\). This leads to the mathematical problem of finding a minimum point of the function \(J(\mathbf{w})\), formally denoted as

$$
\min_{\mathbf{w}\in\mathbb{R}^N} \; J(\mathbf{w}).
$$
The above statement says to look over every possible vector \(\mathbf{w}\), and find one that gives the smallest value of \(J(\mathbf{w})\).
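To make this concrete, here is a minimal sketch (assuming Python with NumPy and a made-up quadratic cost \(J(\mathbf{w})=\|\mathbf{w}\|^2\), chosen only for illustration) in which \(J\) is treated as a black box: we can evaluate it at candidate points and keep the candidate with the smallest value, without ever using a formula for its minimizer.

```python
import numpy as np

def J(w):
    # Toy cost function used only for illustration; in a true black-box
    # setting we could evaluate J(w) but would not know this formula.
    return np.sum(w ** 2)

# A handful of candidate vectors w (here N = 2), chosen arbitrarily.
candidates = [np.array([1.0, 2.0]),
              np.array([-0.5, 0.3]),
              np.array([0.1, -0.1])]

# Evaluate the cost at every candidate and keep the best one.
values = [J(w) for w in candidates]
best = candidates[int(np.argmin(values))]
print("best w:", best, "with cost", min(values))
```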
It is generally impossible to derive an explicit formula for computing the points where a function attains its smallest value. Numerical methods are thus necessary for solving an optimization problem. There is, however, no universal algorithm for doing so. Rather, there exist many different approaches, each tailored to a particular type of optimization: continuous versus discrete, unconstrained versus constrained, deterministic versus stochastic, single-objective versus multi-objective, etc. It is the user’s responsibility to choose an algorithm that is appropriate for the specific optimization problem at hand. This choice is important: it determines whether the problem is solved rapidly or slowly, and indeed whether a solution is found at all.
Contents: This chapter is an introduction to mathematical optimization. The focus is on algorithms that do not require an analytical expression of the cost function to solve an optimization problem. These “derivative-free” methods are especially useful when such an expression is unavailable or impractical to write down.
- You will understand what an optimization problem, a global minimum, and a local minimum are.
- You will start to see the importance of convexity in optimization.
- You will study three algorithms: grid search, random search, and the cross-entropy method (a first sketch of random search is shown below).
- You will learn the limitations of derivative-free optimization algorithms.
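As a preview of the derivative-free approach, the following sketch implements random search on the same toy quadratic cost used above (the search box \([-3, 3]^N\) and the sample budget are illustrative assumptions, not prescriptions). Note that it only requires the ability to evaluate \(J(\mathbf{w})\) at sampled points.

```python
import numpy as np

def J(w):
    # Illustrative cost; any function we can evaluate pointwise would do.
    return np.sum(w ** 2)

rng = np.random.default_rng(0)
N = 2                 # dimension of w
num_samples = 1000    # number of random candidates to try

# Sample candidate vectors uniformly in the box [-3, 3]^N and keep the best.
best_w, best_val = None, np.inf
for _ in range(num_samples):
    w = rng.uniform(-3.0, 3.0, size=N)
    val = J(w)
    if val < best_val:
        best_w, best_val = w, val

print("approximate minimizer:", best_w, "cost:", best_val)
```

With enough samples the returned point lands close to the true minimizer \(\mathbf{w}=\mathbf{0}\), but the later sections of the chapter discuss why this brute-force strategy scales poorly with the dimension \(N\).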
Acknowledgment: Part of this chapter is based on the supplementary material for the textbook Machine Learning Refined (1st ed., 2016) written by Jeremy Watt, Reza Borhani, and Aggelos K. Katsaggelos.