~cgeoga/GPMaxlik.jl

Simple and terse maximum likelihood for parametric Gaussian process models
7b2b660a — Chris Geoga 8 months ago
tweak the updater.jl script
60f5acca — Chris Geoga 10 months ago
Add an `efish` kwarg to `gnll_forwarddiff` that secretly swaps in the
Add wrapper layer to the `gnll` function so that the coordinate

clone

read-only
https://git.sr.ht/~cgeoga/GPMaxlik.jl
read/write
git@git.sr.ht:~cgeoga/GPMaxlik.jl

You can also use your local clone with git send-email.

#GPMaxlik.jl

#Efficient evaluation of the Gaussian log-likelihood and its derivatives and a compatibility layer for indirect methods for working with large covariance matrices.

builds.sr.ht status

Documentation: https://man.sr.ht/~cgeoga/GPMaxlik.jl/

Issue tracker: https://todo.sr.ht/~cgeoga/GPMaxlik.jl/

Build status: https://builds.sr.ht/~cgeoga/GPMaxlik.jl/

This software is designed with a few simple goals:

  • Provide efficient methodology for evaluating the Gaussian log-likelihood, its gradient, and the expected Fisher information matrix of the model, enabling (quasi-)second order optimization.
  • Provide a compatibility layer to compartmentalize the problems of working with complex approximations for very large matrices and the problems of efficient evaluation of the log-likelihood and its derivatives. Just worry about writing your crazy FMM-based covariance matrix applications---by defining a few simple methods, you can plug that object into this code.
  • Minimize the matrix operations required to evaluate all of the above quantities at all costs.

Working with the giant matrices, be it in the form of updating low-rank factorizations of off-diagonal blocks, applying an iterative solver, or computing a trace, is a significant runtime bottleneck. This package aims to aggressively minimize all of these operations. Features to accomplish this include:

  • An interface that emphasizes the option to jointly compute quantities that would otherwise require redundant computations.
  • The option to use high-quality stochastic trace estimation in the gradient and expected Fisher information matrix (see [1]), *completely removing all matrix-matrix operations.
  • The option to provide a variable number of buffers for the derivatives of covariance matrices, which combined with near-optimal pathing in the exact construction of Fisher information matrices significantly reduces redundant matrix updates.
  • The option to cache intermediate results in the case of stochastic trace estimation so that sequential computations of gradients and expected Fisher information matrices do no redundant work.
  • The option to automatically switch from stochastic gradient and expected Fisher matrices to exact ones after a certain number of iterations or when the gradient reaches a sufficiently small norm, so you can cheaply get near your MLE and then polish off the optimization precisely.
  • A basic but functional trust region-based optimizer that directly takes functions that provide the objective, gradient, and Hessian matrix all at once in order to exploit any redundant work used in computing each quantity.

Please take a look at the documentation for complete information about functionality. But in addition to the docs, please refer to the files in ./example for complete usage demonstrations. You probably can almost copy-paste the relevant example file, write the covariance function you need, load in your actual data, and otherwise keep the script nearly as-is.

#Installation

Despite not being hosted on github, this package is in the official registries. So adding it is as easy as

julia> ]add GPMaxlik

The package has no dependencies outside of the standard library and should introduce very little complexity to your codebase, and AD will pass right through it.

#Used in the wild

This package has been used several times in the wild already.

  • The trust-region optimizer was used successfully in [1].
  • Both the optimizer and gnll interface were used to fit a complex 25-parameter model to 17k data points in [2]. Not only was this a very hard optimization problem, but efficiency was key when even the most efficient pass to compute the stochastic gradient and expected Fisher matrix required 25 matrices of size 17k. See the appendix of [2] for more details on how this framework made an otherwise infeasible problem feasible.
  • Your next cool project...

#Citation

If you use this software in your academic work, please cite this software library (GPMaxlik.jl):

@software{GB_GPMaxlik2021,
  author = {Geoga, Christopher J. and Beckman, Paul G.},
  title  = {GPMaxlik.jl},
  url    = {https://git.sr.ht/~cgeoga/GPMaxlik.jl},
  year   = {2021},
  publisher = {Sourcehut}
}

#References

[1]: Scalable Gaussian Process Computations Using Hierarchical Matrices

[2]: Flexible nonstationary spatio-temporal modeling of high-frequency monitoring data