Topics covered in the course

The level of detail in the course is not uniform. Some topics are covered in depth, while others are only briefly mentioned. The following list provides a comprehensive overview of the topics discussed in the course.

Definitions

  • Univariate, Multi-variate, real-valued, vector-valued functions
  • Convex, Non-Convex, Concave functions
  • Conditions for Local Minima
  • Taylor expansion
  • Unimodal, Multimodal
  • Lipschitz Continuity
  • Strongly Convex
  • Smoothness
  • Positive Definitness

Derivatives

  • Derivatives in Multiple Dimensions
  • Numerical Differentiation
  • Automatic Differentiation

Bracketing

Finding an Initial Bracket:

  • Fibonacci Search
  • Golden Section Search
  • Quadratic Fit Search
  • Bisection Method

Local Descent

Line Search:

  • Exact Line Search
  • Approximate Line Search:
    • Backtracking line search (Armijo line search)
    • Strong backtracking line search (bracketing + zoom)
  • Rate of convergence: linear, superlinear, quadratic
  • Trust Region Methods
  • Termination Conditions

First-Order Methods:

  • Gradient Descent
  • Conjugate Gradient
  • Momentum (general ideas only):
    • Nesterov Momentum
    • Adagrad
    • RMSProp
    • Adadelta
    • Adam
    • Hypergradient Descent

Second-Order Methods:

  • Newton’s Method and convergence results
  • Secant Method
  • Quasi-Newton Methods:
    • DFP
    • BFGS
    • L-BFGS

Direct Methods:

  • Black box optimization
  • Cyclic Coordinate Search
  • Powell’s Method
  • Hooke-Jeeves
  • Generalized Pattern Search
  • Nelder-Mead Simplex Method

Beyond Local Optima

  • Noisy Descent
  • Stochastic Gradient Descent
  • Mesh Adaptive Direct Search
  • Simulated Annealing
  • Cross-Entropy Method
  • Natural Evolution Strategies
  • Covariance Matrix Adaptation
  • Genetic Algorithms
  • Differential Evolution
  • Particle Swarm Optimization

Sampling Methods

  • Full Factorial
  • Random Sampling
  • Uniform Projection Plans
  • Stratified Sampling
  • Space-Filling Metrics
  • Space-Filling Subsets
  • Quasi-Random Sequences

Optimization for ML

ML tasks:

  • empirical risk minimization
  • stochastic gradient
  • batch gradient
  • mini-batch approach

Beyond stochastic gradient:

  • noise reduction methods (dynamic sample size, gradient aggregation, iterated averaging)

Constrained Optimization

  • Lagrangian multiplier method
  • KKT conditions
  • Min-max inequality and Duality
  • Penalty Methods
  • Interior Point Methods
  • Linear Programming
  • Duality in LP
  • Simplex Method
  • Modeling in LP
  • Primal-Dual Hybrid Gradient (PDHG) Method

Discrete Optimization

  • Integer Linear Programming
    • Branch and Bound
    • Cutting Planes
  • Dynamic Programming
  • ROAR-NET API Specification
  • Construction Heuristics
  • Local Search
  • Metaheuristics (only some ideas)