Topics covered in the course

The level of detail in the course is not uniform. Some topics are covered in depth, while others are only briefly mentioned. The following list provides a comprehensive overview of the topics discussed in the course.

Definitions

  • Univariate, Multi-variate, real-valued, vector-valued functions
  • Convex, Non-Convex, Concave functions
  • Unimodal, Multimodal
  • Lipschitz Continuity
  • Strongly Convex
  • Smoothness
  • Positive Definitness

Derivatives

  • Derivatives in Multiple Dimensions
  • Numerical Differentiation
  • Automatic Differentiation

Bracketing

Finding an Initial Bracket:

  • Fibonacci Search
  • Golden Section Search
  • Quadratic Fit Search
  • Bisection Method

Descent Direction Iteration

Line Search:

  • Exact Line Search
  • Approximate Line Search:
    • Backtracking line search (Armijo line search)
    • Strong backtracking line search (bracketing + zoom)
    • Trust Region Methods
  • Termination Conditions

First-Order Methods:

  • Gradient Descent
  • Conjugate Gradient
  • Momentum:
    • Nesterov Momentum
    • Adagrad
    • RMSProp
    • Adadelta
    • Adam
    • Hypergradient Descent

Second-Order Methods:

  • Newton’s Method
  • Secant Method
  • Quasi-Newton Methods:
    • DFP
    • BFGS
    • L-BFGS

Direct Methods:

  • Cyclic Coordinate Search
  • Powell’s Method
  • Hooke-Jeeves
  • Generalized Pattern Search
  • Nelder-Mead Simplex Method
  • (Divided Rectangles)

Beyond Local Optima

  • Noisy Descent
  • Stochastic Gradient Descent
  • Mesh Adaptive Direct Search
  • Simulated Annealing
  • Cross-Entropy Method
  • Natural Evolution Strategies
  • Covariance Matrix Adaptation
  • Genetic Algorithms
  • Differential Evolution
  • Particle Swarm Optimization

Optimization for ML

ML tasks:

  • empirical risk minimization

Convergence analysis of stochastic gradient

  • convergence rate
  • definitions and results

Beyond stochastic gradient:

  • noise reduction methods (dynamic sample size, gradient aggregation, iterated averaging)
  • second order methods

Constrained Optimization

  • Duality
  • Penalty Methods
  • Interior Point Methods
  • Augmented Lagrange Method
  • Linear Programming
  • Simplex Method
  • Modeling in LP
  • Dual certificates

Sampling Methods

  • Full Factorial
  • Random Sampling
  • Uniform Projection Plans
  • Stratified Sampling
  • Space-Filling Metrics
  • Space-Filling Subsets
  • Quasi-Random Sequences

Discrete Optimization

  • Integer Linear Programming
    • Branch and Bound
    • Cutting Planes
  • Dynamic Programming
  • (Constraint Programming and Backtracking)
  • Construction Heuristics and Local Search