optimizers

A set of mostly quasi-Newton optimizers for PyTorch.

This project started as an academic experiment. The quasi-Newton methods can speed up convergence, but they do so at the cost of higher memory usage, so they are best suited to relatively small systems.

The genetic and sampling-based optimizers are included for completeness, but in most practical settings they are not a good choice. If you reach for them because you want more variance or exploration in the weight updates, consider ExtendedKalmanFilter instead. Increasing its process noise parameter q (or decreasing the forgetting factor tau) inflates the state covariance P, which increases the Kalman gain and produces larger, more exploratory steps. This gives you a gradient-based alternative to stochastic search.
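To make the link between q and step size concrete, here is a minimal sketch of a textbook scalar Kalman recursion. It is not the ExtendedKalmanFilter implementation from this package: the function name and the parameters r, h, p0, and steps are assumptions chosen for illustration, and the forgetting factor tau is left out.

```python
def kalman_gain_after(q, r=1.0, h=1.0, p0=1.0, steps=200):
    """Run a scalar Kalman recursion and return the gain after `steps` updates.

    q is the process noise, r the measurement noise, h the (scalar)
    measurement Jacobian, and p0 the initial state covariance.
    """
    p = p0
    k = 0.0
    for _ in range(steps):
        p_pred = p + q                          # predict: process noise inflates P
        k = p_pred * h / (h * p_pred * h + r)   # Kalman gain
        p = (1.0 - k * h) * p_pred              # update: the measurement shrinks P
    return k


low_q_gain = kalman_gain_after(q=1e-4)
high_q_gain = kalman_gain_after(q=1e-1)
print(low_q_gain, high_q_gain)
```

In this toy setting the larger q keeps the covariance from collapsing, so the gain stays larger and each update moves the state further.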

Install with uv

This package currently targets Python 3.13+. The instructions below cover Git and source installs; installation from PyPI is not documented.

Add the package from GitHub:

uv add git+https://github.com/kovacoj/optimizers.git

Add the package as a git submodule if you want to pull upstream updates into your repo explicitly:

git submodule add https://github.com/kovacoj/optimizers.git optimizers
uv add ./optimizers

Sync the submodule to the latest upstream commit later with:

git submodule update --remote --merge optimizers
uv lock

For local development after cloning the repo:

uv venv
uv pip install -e .

Because this repository uses a src layout, importing directly from the repo checkout without installing requires PYTHONPATH=src.

Use

from optimizers import KalmanFilter
from optimizers import LevenbergMarquardt
from optimizers import Newton
from optimizers import line_search

Newton, Annealing, Metropolis, and Genetic expect closure() to return a scalar loss tensor. LevenbergMarquardt and ExtendedKalmanFilter expect a residual-vector closure. KalmanFilter expects closure() to return (errors, H).
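As a rough illustration of the first two contracts, the sketch below builds a scalar-loss closure and a residual-vector closure for the same toy regression problem using plain PyTorch. The model, the data, and the choice to flatten the residuals to one dimension are assumptions; whether a closure should also call backward() is not covered here, and the (errors, H) contract is omitted because the expected form of H is not specified in this section.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(3, 1)   # hypothetical toy model
x = torch.randn(8, 3)           # hypothetical inputs
y = torch.randn(8, 1)           # hypothetical targets


def scalar_closure():
    # Scalar-loss contract: returns a 0-dimensional tensor.
    return torch.nn.functional.mse_loss(model(x), y)


def residual_closure():
    # Residual-vector contract: one residual per sample, flattened to 1-D
    # (the flattening is an assumption made for this sketch).
    return (model(x) - y).reshape(-1)


loss = scalar_closure()
residuals = residual_closure()
print(loss.shape, residuals.shape)
```

The two closures describe the same problem: the mean of the squared residuals equals the scalar loss.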

Line search

The public optimizers.line_search submodule exposes pure-PyTorch callback-based helpers:

line_search.armijo_backtracking(phi, phi0, dphi0, ...)
line_search.strong_wolfe(phi, dphi, phi0, dphi0, ...)

Newton(..., line_search_method="armijo" | "wolfe") uses these helpers to scale the full Newton direction. LevenbergMarquardt(..., strategy="line search", line_search_method="armijo" | "wolfe") uses the same line-search methods for residual-vector problems, while strategy="trust region" switches to a trust-region LM update.
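For readers unfamiliar with the callback convention, the following is a minimal pure-Python sketch of Armijo backtracking that follows the same phi, phi0, dphi0 argument order. It is not the code in optimizers.line_search: the function name and the keyword arguments alpha0, c1, shrink, and max_iter are hypothetical, since the real helper's remaining parameters are elided above.

```python
def armijo_backtracking_sketch(phi, phi0, dphi0,
                               alpha0=1.0, c1=1e-4, shrink=0.5, max_iter=30):
    """Shrink alpha until phi(alpha) <= phi0 + c1 * alpha * dphi0.

    phi(alpha) is the objective along the search direction, phi0 = phi(0),
    and dphi0 is the directional derivative at alpha = 0.
    """
    if dphi0 >= 0:
        raise ValueError("dphi0 must be negative for a descent direction")
    alpha = alpha0
    for _ in range(max_iter):
        if phi(alpha) <= phi0 + c1 * alpha * dphi0:
            return alpha          # sufficient decrease reached
        alpha *= shrink           # otherwise try a smaller step
    return alpha                  # no trial passed; fall back to the final shrunken step


# f(x) = x^2 at x = 1, searched along d = -2 (the negative gradient),
# so phi(a) = (1 - 2a)^2, phi(0) = 1, and phi'(0) = -4.
def phi(a):
    return (1.0 - 2.0 * a) ** 2


alpha = armijo_backtracking_sketch(phi, phi0=1.0, dphi0=-4.0)
print(alpha)
```

Here the full step alpha = 1 overshoots to a point with the same objective value, so it fails the sufficient-decrease test and the sketch halves the step once.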

Public API

| Optimizer | Closure contract | `step()` return |
| --- | --- | --- |
| `Newton` | scalar loss tensor | `None` |
| `Annealing` | scalar loss tensor | scalar loss tensor |
| `Metropolis` | scalar loss tensor | scalar loss tensor |
| `Genetic` | scalar loss tensor | scalar loss tensor |
| `LevenbergMarquardt` | residual vector tensor | Python `float` |
| `ExtendedKalmanFilter` | residual vector tensor | scalar loss tensor |
| `KalmanFilter` | `(errors, H)` | scalar loss tensor |

KalmanFilter is the linear-residual variant. ExtendedKalmanFilter computes the residual Jacobian internally.

Newton defaults to the full Newton step when line_search_method=None. LevenbergMarquardt defaults to strategy="line search" and also accepts the compatibility aliases "line_search", "trust_region", and "heuristic".

Scaling Notes

Newton builds and solves a dense Hessian over all trainable parameters, so it is intended for small systems where exact second-order steps are practical. Use the Kalman or stochastic optimizers for experiments where forming a dense Hessian is too expensive.
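A back-of-the-envelope estimate shows why the dense Hessian limits problem size. The helper below is a hypothetical illustration, not part of the package, and it assumes float32 storage (4 bytes per entry) and counts only the Hessian itself, not the workspace needed to solve the linear system.

```python
def dense_hessian_bytes(n_params, bytes_per_element=4):
    """Memory needed to store an n_params x n_params dense matrix."""
    return n_params * n_params * bytes_per_element


for n in (1_000, 10_000, 100_000):
    gib = dense_hessian_bytes(n) / 2**30
    print(f"{n:>7} parameters -> {gib:.3f} GiB")
```

Storage grows quadratically: under these assumptions 10,000 parameters need roughly 0.37 GiB, and a tenfold increase in parameters multiplies that by one hundred.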
