~hrbrmstr/urlscan

ref: a60e6c38011f19dada7811352d7fa791e0fd5b3b urlscan/README.Rmd -rw-r--r-- 1.1 KiB
a60e6c38 boB Rudis initial commit 2 years ago
                                                                                
---
output: rmarkdown::github_document
---

# urlscan

Analyze Websites and Resources They Request

## Description

WIP

The <urlscan.io> service provides an 'API' for analyzing websites and
the resources they request. Much like the 'Inspector' in your browser,
<urlscan.io> lets you examine the individual resources that are
requested when a site is loaded. Tools are provided to search
public <urlscan.io> scan submissions.

## What's Inside The Tin

The following functions are implemented:

- `urlscan_search`: Perform a urlscan.io query
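
For orientation, a query string such as `domain:r-project.org` presumably ends up as the `q` parameter of a request to the public urlscan.io search endpoint. The sketch below only builds that URL in base R. The helper name `build_search_url` is hypothetical and not part of this package, and the endpoint path is an assumption based on the public urlscan.io API rather than on this package's source.

```r
# Hypothetical helper, NOT part of the urlscan package: builds the URL that a
# search for `query` would presumably be sent to.
# Assumption: the search endpoint lives at /api/v1/search/ and takes `q`.
build_search_url <- function(query,
                             base = "https://urlscan.io/api/v1/search/") {
  stopifnot(is.character(query), length(query) == 1L)
  # Percent-encode reserved characters (such as ":") so the whole query
  # travels as a single `q` parameter value.
  paste0(base, "?q=", utils::URLencode(query, reserved = TRUE))
}

build_search_url("domain:r-project.org")
```

Keeping URL construction separate from the HTTP call makes the query encoding easy to inspect without touching the network.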

## Installation

```{r eval=FALSE}
devtools::install_github("hrbrmstr/urlscan")
```

```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```

## Usage

```{r message=FALSE, warning=FALSE, error=FALSE}
library(urlscan)

# current version
packageVersion("urlscan")
```

```{r}
library(tidyverse)

x <- urlscan_search("domain:r-project.org")

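# Flatten the nested result: task fields (with the user agent pulled out of
# the nested `options` data frame), plus the stats and page data frames.
xdf <- bind_cols(
  select(x$results$task, -options) %>%
    mutate(user_agent = x$results$task$options$useragent),
  x$results$stats,
  x$results$page
) %>%
  as_tibble() # tbl_df() is deprecated in current dplyr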

xdf

glimpse(xdf)
```
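
The flattening step above relies on `x$results` holding data frames (`task`, `stats`, `page`) nested inside one another, with `options` nested again inside `task`. The sketch below reproduces that shape with mock data and flattens it in base R, so it runs without a network connection or extra packages. Every value and the column names `url`, `requests`, and `domain` are invented placeholders, not real urlscan.io output.

```r
# Mock of the nested shape assumed by the usage example. All values are
# invented placeholders.
task <- data.frame(
  url = c("https://example.org/", "https://example.com/"),
  stringsAsFactors = FALSE
)
# A data frame stored as a column of another data frame.
task$options <- data.frame(
  useragent = c("ua-1", "ua-2"),
  stringsAsFactors = FALSE
)
stats <- data.frame(requests = c(10L, 20L))
page <- data.frame(
  domain = c("example.org", "example.com"),
  stringsAsFactors = FALSE
)

# Base R counterpart of the select() / mutate() / bind_cols() pipeline:
# drop the nested `options` column, lift `useragent` to the top level,
# then bind the three pieces column-wise.
flat <- cbind(
  task[setdiff(names(task), "options")],
  user_agent = task$options$useragent,
  stats,
  page
)

str(flat)
```

Lifting `useragent` out before binding is what keeps the result a flat table with one row per scan.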