~hrbrmstr/tdigest

1e8e9ea32ae74fe5072e9c562bd09abaf0240ba2 — hrbrmstr 4 years ago 1bddab1
updated copyright
7 files changed, 34 insertions(+), 78 deletions(-)

M DESCRIPTION
M R/create.R
D R/tdigest-package.R
M inst/COPYRIGHTS
D man/print.tdigest.Rd
M man/tdigest.Rd
D man/tick-tdigest-package-tick.Rd
M DESCRIPTION => DESCRIPTION +1 -1
@@ 2,7 2,7 @@ Package: tdigest
Type: Package
Title: Wicked Fast, Accurate Quantiles Using 't-Digests'
Version: 0.2.0
Date: 2019-04-03
Date: 2019-07-21
Authors@R: c(
    person("Bob", "Rudis", email = "bob@rud.is", role = c("aut", "cre"), 
           comment = c(ORCID = "0000-0001-5670-2640")),

M R/create.R => R/create.R +14 -2
@@ 1,5 1,16 @@
#' Create a new t-digest histogram from a vector
#'
#' The t-digest construction algorithm uses a variant of 1-dimensional
#' k-means clustering to produce a very compact data structure that allows
#' accurate estimation of quantiles. This t-digest data structure can be used
#' to estimate quantiles, compute other rank statistics or even to estimate
#' related measures like trimmed means. The advantage of the t-digest over
#' previous digests for this purpose is that the t-digest handles data with
#' full floating point resolution. With small changes, the t-digest can handle
#' values from any ordered set for which we can compute something akin to a mean.
#' The accuracy of quantile estimates produced by t-digests can be orders of
#' magnitude more accurate than those produced by previous digest algorithms.
#'
#' @param vec vector (will be converted to `double` if not already double)
#' @param compression the input compression value; should be >= 1.0; this
#'        will control how aggressively the TDigest compresses data together.


@@ 11,6 22,8 @@
#' @export
#' @return a tdigest object
#' @references <https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf>
#' @importFrom stats quantile
#' @useDynLib tdigest, .registration = TRUE
#' @examples
#' set.seed(1492)
#' x <- sample(0:100, 1000000, replace = TRUE)


@@ 50,8 63,7 @@ quantile.tdigest <- function(x, probs = seq(0, 1, 0.25), ...) {
  tquantile(x, probs=probs)
}

#' Printer for t-digest objects
#'
#' @rdname tdigest
#' @param x t-digest object
#' @param ... unused
#' @keywords internal

D R/tdigest-package.R => R/tdigest-package.R +0 -25
@@ 1,25 0,0 @@
#' Wicked Fast, Accurate Quantiles Using 't-Digests'
#'
#' The t-digest construction algorithm uses a variant of 1-dimensional
#' k-means clustering to produce a very compact data structure that allows
#' accurate estimation of quantiles. This t-digest data structure can be used
#' to estimate quantiles, compute other rank statistics or even to estimate
#' related measures like trimmed means. The advantage of the t-digest over
#' previous digests for this purpose is that the t-digest handles data with
#' full floating point resolution. With small changes, the t-digest can handle
#' values from any ordered set for which we can compute something akin to a mean.
#' The accuracy of quantile estimates produced by t-digests can be orders of
#' magnitude more accurate than those produced by previous digest algorithms.
#'
#' - URL: <https://gitlab.com/hrbrmstr/tdigest>
#' - BugReports: <https://gitlab.com/hrbrmstr/tdigest/issues>
#'
#' @md
#' @name `tdigest-package`
#' @keywords internal
#' @docType package
#' @author Bob Rudis (bob@@rud.is)
#' @importFrom stats quantile
#' @references <https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf>
#' @useDynLib tdigest, .registration = TRUE
NULL
\ No newline at end of file

M inst/COPYRIGHTS => inst/COPYRIGHTS +1 -1
@@ 1,6 1,6 @@
The R code and src/tdigest-main.c, src/init.c are MIT-licensed by the package author.

src/tdigest.h, src/tdigest.c is MIT-licensed & Copyright (c) 2018 ajwerner [REF: https://github.com/ajwerner/tdigestc]
src/tdigest.h, src/tdigest.c are MIT-licensed & Copyright (c) 2018 ajwerner [REF: https://github.com/ajwerner/tdigestc; license below copied from that repository]

The original t-Digest implementation and algorithm have the following license:


D man/print.tdigest.Rd => man/print.tdigest.Rd +0 -17
@@ 1,17 0,0 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/create.R
\name{print.tdigest}
\alias{print.tdigest}
\title{Printer for t-digest objects}
\usage{
\method{print}{tdigest}(x, ...)
}
\arguments{
\item{x}{t-digest object}

\item{...}{unused}
}
\description{
Printer for t-digest objects
}
\keyword{internal}

M man/tdigest.Rd => man/tdigest.Rd +18 -1
@@ 2,9 2,12 @@
% Please edit documentation in R/create.R
\name{tdigest}
\alias{tdigest}
\alias{print.tdigest}
\title{Create a new t-digest histogram from a vector}
\usage{
tdigest(vec, compression = 100)

\method{print}{tdigest}(x, ...)
}
\arguments{
\item{vec}{vector (will be converted to \code{double} if not already double)}


@@ 16,12 19,25 @@ balance between precision and efficiency. It will land at very small
(think like 1e-6 percentile points) errors at extreme points in the
distribution, and compression ratios of around 500 for large data sets
(~1 million datapoints). Defaults to 100.}

\item{x}{t-digest object}

\item{...}{unused}
}
\value{
a tdigest object
}
\description{
Create a new t-digest histogram from a vector
The t-digest construction algorithm uses a variant of 1-dimensional
k-means clustering to produce a very compact data structure that allows
accurate estimation of quantiles. This t-digest data structure can be used
to estimate quantiles, compute other rank statistics or even to estimate
related measures like trimmed means. The advantage of the t-digest over
previous digests for this purpose is that the t-digest handles data with
full floating point resolution. With small changes, the t-digest can handle
values from any ordered set for which we can compute something akin to a mean.
The accuracy of quantile estimates produced by t-digests can be orders of
magnitude more accurate than those produced by previous digest algorithms.
}
\examples{
set.seed(1492)


@@ 33,3 49,4 @@ quantile(td)
\references{
\url{https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf}
}
\keyword{internal}

D man/tick-tdigest-package-tick.Rd => man/tick-tdigest-package-tick.Rd +0 -31
@@ 1,31 0,0 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/tdigest-package.R
\docType{package}
\name{`tdigest-package`}
\alias{`tdigest-package`}
\title{Wicked Fast, Accurate Quantiles Using 't-Digests'}
\description{
The t-digest construction algorithm uses a variant of 1-dimensional
k-means clustering to produce a very compact data structure that allows
accurate estimation of quantiles. This t-digest data structure can be used
to estimate quantiles, compute other rank statistics or even to estimate
related measures like trimmed means. The advantage of the t-digest over
previous digests for this purpose is that the t-digest handles data with
full floating point resolution. With small changes, the t-digest can handle
values from any ordered set for which we can compute something akin to a mean.
The accuracy of quantile estimates produced by t-digests can be orders of
magnitude more accurate than those produced by previous digest algorithms.
}
\details{
\itemize{
\item URL: \url{https://gitlab.com/hrbrmstr/tdigest}
\item BugReports: \url{https://gitlab.com/hrbrmstr/tdigest/issues}
}
}
\references{
\url{https://raw.githubusercontent.com/tdunning/t-digest/master/docs/t-digest-paper/histo.pdf}
}
\author{
Bob Rudis (bob@rud.is)
}
\keyword{internal}
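
The description text that this commit moves into R/create.R says the t-digest uses a variant of 1-dimensional k-means clustering to build a compact structure for quantile estimation. The base R sketch below illustrates that idea only; it is not the C implementation the package wraps. The names `build_digest` and `estimate_quantile` are hypothetical, and the centroid size bound `4 * n * q * (1 - q) / compression` is an assumption chosen for illustration so that centroids stay small near the tails and grow toward the median.

```r
# Illustrative sketch only: a simplified, greedy t-digest-style summary.
# Assumes a non-empty numeric input; all names here are hypothetical.

# Cluster sorted values into centroids (mean, weight) in one pass.
build_digest <- function(values, compression = 100) {
  x <- sort(as.double(values))
  n <- length(x)
  means <- numeric(n)
  weights <- numeric(n)
  k <- 0     # number of centroids created so far
  seen <- 0  # total weight of all centroids before the current one
  for (v in x) {
    if (k > 0) {
      w <- weights[k]
      # Quantile at the middle of the current centroid if v were added.
      q <- (seen + (w + 1) / 2) / n
      # Assumed size bound: tiny in the tails, largest at the median.
      limit <- 4 * n * q * (1 - q) / compression
      if (w + 1 <= limit) {
        means[k] <- means[k] + (v - means[k]) / (w + 1)  # running mean
        weights[k] <- w + 1
        next
      }
      seen <- seen + w  # close the current centroid
    }
    k <- k + 1
    means[k] <- v
    weights[k] <- 1
  }
  list(means = means[seq_len(k)], weights = weights[seq_len(k)])
}

# Estimate quantiles by linear interpolation between centroid means,
# placing each mean at the midpoint of its cumulative weight span.
estimate_quantile <- function(digest, q) {
  if (length(digest$means) == 1) {
    return(rep(digest$means, length(q)))
  }
  pos <- cumsum(digest$weights) - digest$weights / 2
  target <- q * sum(digest$weights)
  # rule = 2 clamps targets outside the range to the first/last mean.
  approx(pos, digest$means, xout = target, rule = 2)$y
}

set.seed(1492)
x <- runif(20000)
d <- build_digest(x, compression = 100)
length(d$means)                           # far fewer centroids than inputs
estimate_quantile(d, c(0.01, 0.5, 0.99))  # compare with quantile(x, ...)
```

With the package itself, the corresponding calls shown in this diff are `tdigest(vec, compression = 100)` followed by `quantile(td)`; the sketch only mirrors their roles.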