~hrbrmstr/sergeant

23b9db7aadfb4cea8e844b61a3e25b2bed640250 — Bob Rudis 3 years ago 206a03e
tweak to tests; docs cleanup
A DESCRIPTION.txt => DESCRIPTION.txt +35 -0
@@ 0,0 1,35 @@
Package: sergeant
Title: Tools to Transform and Query Data with 'Apache' 'Drill'
Version: 0.5.0
Authors@R: c(person("Bob", "Rudis", email = "bob@rud.is", role = c("aut", "cre")),
             person("Edward", "Visel", email = "edward.visel@gmail.com", role = "ctb"))
Description: 'Apache Drill' is a low-latency distributed query engine designed to enable 
    data exploration and analytics on both relational and non-relational datastores, 
    scaling to petabytes of data. Methods are provided that enable working with 'Apache Drill'
    instances via the 'REST API', the JDBC interface, 'DBI' 'methods' and 'dplyr'/'dbplyr'.
Depends:
    R (>= 3.0.0),
    DBI,
    dplyr (>= 0.7.0),
    dbplyr
URL: http://github.com/hrbrmstr/sergeant
BugReports: https://github.com/hrbrmstr/sergeant/issues
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Imports:
    httr,
    jsonlite,
    htmltools,
    readr,
    purrr,
    utils,
    scales,
    methods
Suggests:
    RJDBC,
    rJava,
    testthat
RoxygenNote: 6.0.1
Remotes:
    tidyverse/dbplyr

M R/custom.r => R/custom.r +23 -1
@@ 1,4 1,20 @@
#' Drill expressions / custom functions `dplyr` translation:
#' Drill expressions / custom functions `dplyr` translations
#'
#' One benefit of `dplyr` is that it provides a nice DSL over database ops, but that
#' means there needs to be knowledge of the functions supported by the host database and
#' then a translation layer so they can be used in R.
#'
#' Similarly, there are functions like `grepl()` in R that don't directly exist in
#' databases. Yet, one can create a translation for `grepl()` that maps to a
#' [Drill custom function](https://github.com/parisni/drill-simple-contains) so you
#' don't have to think differently or rewrite your pipes when switching between core
#' tidyverse ops and database ops.
#'
#' Many functions translate on their own, but it's handy to provide explicit ones,
#' especially when you want to use parameters in a different order.
#'
#' If you want a particular custom function mapped, file a PR or issue request in
#' the link found in the `DESCRIPTION` file.
#'
#' - `as.character(x)` : `CAST( x AS CHARACTER )`
#' - `as.date(x)` : `CAST( x AS DATE )`


@@ 50,6 66,12 @@
#' - `upper(x)` = `UPPER(1)`
#' - `toupper(x)` = `UPPER(1)`
#'
#' You can get a compact list of these with:
#'
#' `sql_translate_env(src_drill()$con)`
#'
#' as well.
#'
#' @md
#' @name drill_custom_functions
NULL
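
The translation-layer idea described in the documentation above can be sketched in base R. This is a toy illustration only, not how `dbplyr` or `sergeant` implement translation: the `translators` table and the `to_sql()` helper are hypothetical names, and the `CONTAINS(column, pattern)` form is an assumed signature for a Drill custom function. It shows a lookup from R function names to SQL templates, including one mapping (`grepl()`) where the parameter order differs between R and SQL.

```r
# Toy lookup table from R function names to SQL templates.
# Hypothetical: this is not the dbplyr/sergeant implementation.
translators <- list(
  toupper      = function(x) sprintf("UPPER(%s)", x),
  as.character = function(x) sprintf("CAST( %s AS CHARACTER )", x),
  # R's grepl() takes the pattern first; the assumed SQL function
  # CONTAINS() takes the column first, so the arguments are swapped.
  grepl        = function(pattern, x) sprintf("CONTAINS(%s, %s)", x, pattern)
)

# Recursively turn a quoted R expression into a SQL string.
# String literals are quoted naively (no escaping), which is fine for a demo.
to_sql <- function(expr) {
  if (is.name(expr)) return(as.character(expr))           # column reference
  if (is.character(expr)) return(sprintf("'%s'", expr))   # string literal
  if (is.numeric(expr)) return(as.character(expr))        # numeric literal
  if (is.call(expr)) {
    fn <- as.character(expr[[1]])
    if (is.null(translators[[fn]])) stop("no translation for: ", fn)
    args <- lapply(as.list(expr)[-1], to_sql)
    return(do.call(translators[[fn]], args))
  }
  stop("unsupported expression")
}

upper_sql <- to_sql(quote(toupper(first_name)))
grepl_sql <- to_sql(quote(grepl("son", last_name)))
```

Here `upper_sql` holds the SQL text for the `toupper()` call and `grepl_sql` holds the reordered `CONTAINS()` call. A real translator also has to handle operators, quoting, and window functions, which this sketch leaves out.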

M R/dplyr.r => R/dplyr.r +2 -1
@@ 215,7 215,8 @@ sql_translate_env.DrillConnection <- function(con) {
    ),

    window = dbplyr::sql_translator(.parent = dbplyr::base_win,
      n = function() { dbplyr::win_over(dbplyr::sql("count(*)"), partition = dbplyr::win_current_group()) },
      n = function() { dbplyr::win_over(dbplyr::sql("count(*)"),
                                        partition = dbplyr::win_current_group()) },
      cor = dbplyr::win_recycled("corr"),
      cov = dbplyr::win_recycled("covar_samp"),
      sd =  dbplyr::win_recycled("stddev_samp"),

M man/drill_custom_functions.Rd => man/drill_custom_functions.Rd +23 -1
@@ 2,8 2,24 @@
% Please edit documentation in R/custom.r
\name{drill_custom_functions}
\alias{drill_custom_functions}
\title{Drill expressions / custom functions \code{dplyr} translation:}
\title{Drill expressions / custom functions \code{dplyr} translations}
\description{
One benefit of \code{dplyr} is that it provides a nice DSL over database ops, but that
means there needs to be knowledge of the functions supported by the host database and
then a translation layer so they can be used in R.
}
\details{
Similarly, there are functions like \code{grepl()} in R that don't directly exist in
databases. Yet, one can create a translation for \code{grepl()} that maps to a
\href{https://github.com/parisni/drill-simple-contains}{Drill custom function} so you
don't have to think differently or rewrite your pipes when switching between core
tidyverse ops and database ops.

Many functions translate on their own, but it's handy to provide explicit ones,
especially when you want to use parameters in a different order.

If you want a particular custom function mapped, file a PR or issue request in
the link found in the \code{DESCRIPTION} file.
\itemize{
\item \code{as.character(x)} : \code{CAST( x AS CHARACTER )}
\item \code{as.date(x)} : \code{CAST( x AS DATE )}


@@ 55,4 71,10 @@
\item \code{upper(x)} = \code{UPPER(1)}
\item \code{toupper(x)} = \code{UPPER(1)}
}

You can get a compact list of these with:

\code{sql_translate_env(src_drill()$con)}

as well.
}

M tests/testthat.R => tests/testthat.R +2 -1
@@ 1,4 1,5 @@
library(testthat)
library(dbplyr)
library(sergeant)
library(testthat)

test_check("sergeant")

M tests/testthat/test-sergeant.R => tests/testthat/test-sergeant.R +4 -2
@@ 1,9 1,11 @@
test_host <- Sys.getenv("DRILL_TEST_HOST", "localhost")

context("dplyr")
test_that("Core dbplyr ops work", {

 testthat::skip_on_cran()

  db <- src_drill("localhost")
  db <- src_drill(test_host)

  expect_that(db, is_a("src_drill"))



@@ 19,7 21,7 @@ test_that("REST API works", {

 testthat::skip_on_cran()

  dc <- drill_connection("localhost")
  dc <- drill_connection(test_host)
  expect_that(drill_active(dc), equals(TRUE))

  test_rest <- drill_query(dc, "SELECT * FROM cp.`employee.json` limit 10")
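
The test change above reads the Drill host from the `DRILL_TEST_HOST` environment variable and falls back to `"localhost"`. A minimal sketch of that fallback, using only base R, might look like the following; the host name `drill.example.internal` is a made-up value for illustration.

```r
# With the variable unset, Sys.getenv() returns the supplied default.
Sys.unsetenv("DRILL_TEST_HOST")
host_default <- Sys.getenv("DRILL_TEST_HOST", "localhost")

# With the variable set, the configured host wins over the default.
# "drill.example.internal" is a hypothetical host name.
Sys.setenv(DRILL_TEST_HOST = "drill.example.internal")
host_override <- Sys.getenv("DRILL_TEST_HOST", "localhost")

# Clean up so later code sees the original (unset) state.
Sys.unsetenv("DRILL_TEST_HOST")
```

Setting the variable in the shell or in `~/.Renviron` before running the tests would let the same test file target a remote Drill instance without editing the test code.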