Package: tdigest
Type: Package
Title: Wicked Fast, Accurate Quantiles Using 't-Digests'
Version: 0.2.0
Date: 2019-07-21
Authors@R: c(
  person("Bob", "Rudis", email = "bob@rud.is", role = c("aut", "cre"), 
         comment = c(ORCID = "0000-0001-5670-2640")),

#' Create a new t-digest histogram from a vector
#'
#' The t-digest construction algorithm uses a variant of 1-dimensional
#' k-means clustering to produce a very compact data structure that allows
#' accurate estimation of quantiles. This t-digest data structure can be used
#' to estimate quantiles, compute other rank statistics or even to estimate
#' related measures like trimmed means. The advantage of the t-digest over
#' previous digests for this purpose is that the t-digest handles data with
#' full floating point resolution. With small changes, the t-digest can handle
#' values from any ordered set for which we can compute something akin to a mean.
#' Quantile estimates produced by t-digests can be orders of magnitude more
#' accurate than those produced by previous digest algorithms.
#'
#' @param vec vector (will be converted to `double` if not already double)
#' @param compression the input compression value; should be >= 1.0; this
#'        will control how aggressively the TDigest compresses data together.
#' @export
#' @return a tdigest object
#' @references <https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf>
#' @importFrom stats quantile
#' @useDynLib tdigest, .registration = TRUE
#' @examples
#' set.seed(1492)
#' x <- sample(0:100, 1000000, replace = TRUE)

quantile.tdigest <- function(x, probs = seq(0, 1, 0.25), ...) {
  # S3 method for stats::quantile(); delegates to tquantile() and ignores `...`
  tquantile(x, probs = probs)
}

#' Printer for t-digest objects
#'
#' @rdname tdigest
#' @param x t-digest object
#' @param ... unused
#' @keywords internal

#' Wicked Fast, Accurate Quantiles Using 't-Digests'
#'
#' The t-digest construction algorithm uses a variant of 1-dimensional
#' k-means clustering to produce a very compact data structure that allows
#' accurate estimation of quantiles. This t-digest data structure can be used
#' to estimate quantiles, compute other rank statistics or even to estimate
#' related measures like trimmed means. The advantage of the t-digest over
#' previous digests for this purpose is that the t-digest handles data with
#' full floating point resolution. With small changes, the t-digest can handle
#' values from any ordered set for which we can compute something akin to a mean.
#' Quantile estimates produced by t-digests can be orders of magnitude more
#' accurate than those produced by previous digest algorithms.
#'
#' - URL: <https://gitlab.com/hrbrmstr/tdigest>
#' - BugReports: <https://gitlab.com/hrbrmstr/tdigest/issues>
#'
#' @md
#' @name `tdigest-package`
#' @keywords internal
#' @docType package
#' @author Bob Rudis (bob@@rud.is)
#' @importFrom stats quantile
#' @references <https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf>
#' @useDynLib tdigest, .registration = TRUE
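#' @examples
#' # A minimal usage sketch, assuming the package is installed and that
#' # quantile() dispatches to the tdigest method; the sample mirrors the
#' # one used in the tdigest() examples and is illustrative only.
#' set.seed(1492)
#' x <- sample(0:100, 1000000, replace = TRUE)
#' td <- tdigest(x)                          # default compression
#' quantile(td)                              # estimated quartiles
#' quantile(td, probs = c(0.5, 0.9, 0.99))   # estimated median and upper tail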
NULL

The R code and src/tdigest-main.c, src/init.c are MIT-licensed by the package author.

src/tdigest.h, src/tdigest.c are MIT-licensed & Copyright (c) 2018 ajwerner [REF: https://github.com/ajwerner/tdigestc; license below copied from that repository]

The original t-Digest implementation and algorithm have the following license:

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/create.R
\name{print.tdigest}
\alias{print.tdigest}
\title{Printer for t-digest objects}
\usage{
\method{print}{tdigest}(x, ...)
}
\arguments{
\item{x}{t-digest object}

\item{...}{unused}
}
\description{
Printer for t-digest objects
}
\keyword{internal}

% Please edit documentation in R/create.R
\name{tdigest}
\alias{tdigest}
\alias{print.tdigest}
\title{Create a new t-digest histogram from a vector}
\usage{
tdigest(vec, compression = 100)

\method{print}{tdigest}(x, ...)
}
\arguments{
\item{vec}{vector (will be converted to \code{double} if not already double)}

balance between precision and efficiency. It will land at very small (on the order of 1e-6 percentile points) errors at extreme points in the distribution, and compression ratios of around 500 for large data sets (~1 million data points). Defaults to 100.}

\item{x}{t-digest object}

\item{...}{unused}
}
\value{
a tdigest object
}
\description{
Create a new t-digest histogram from a vector

The t-digest construction algorithm uses a variant of 1-dimensional
k-means clustering to produce a very compact data structure that allows
accurate estimation of quantiles. This t-digest data structure can be used
to estimate quantiles, compute other rank statistics or even to estimate
related measures like trimmed means. The advantage of the t-digest over
previous digests for this purpose is that the t-digest handles data with
full floating point resolution. With small changes, the t-digest can handle
values from any ordered set for which we can compute something akin to a mean.
Quantile estimates produced by t-digests can be orders of magnitude more
accurate than those produced by previous digest algorithms.
}
\examples{
set.seed(1492)
x <- sample(0:100, 1000000, replace = TRUE)
td <- tdigest(x)

quantile(td)
}
\references{
\url{https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf}
}
\keyword{internal}

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tdigest-package.R
\docType{package}
\name{`tdigest-package`}
\alias{`tdigest-package`}
\title{Wicked Fast, Accurate Quantiles Using 't-Digests'}
\description{
The t-digest construction algorithm uses a variant of 1-dimensional
k-means clustering to produce a very compact data structure that allows
accurate estimation of quantiles. This t-digest data structure can be used
to estimate quantiles, compute other rank statistics or even to estimate
related measures like trimmed means. The advantage of the t-digest over
previous digests for this purpose is that the t-digest handles data with
full floating point resolution. With small changes, the t-digest can handle
values from any ordered set for which we can compute something akin to a mean.
Quantile estimates produced by t-digests can be orders of magnitude more
accurate than those produced by previous digest algorithms.
}
\details{
\itemize{
\item URL: \url{https://gitlab.com/hrbrmstr/tdigest}
\item BugReports: \url{https://gitlab.com/hrbrmstr/tdigest/issues}
}
}
\references{
\url{https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf}
}
\author{
Bob Rudis (bob@rud.is)
}
\keyword{internal}