Analyze Websites and Resources They Request
The <urlscan.io> service provides an API for analyzing websites and the resources they request. Much like your browser's 'Inspector', <urlscan.io> lets you take a look at the individual resources that are requested when a site is loaded. Tools are provided to search public <urlscan.io> scan submissions/results and to submit URLs for scanning.
The following functions are implemented:

- `urlscan_search`: Perform a urlscan.io query
- `urlscan_result`: Retrieve detailed results for a given scan ID
- `urlscan_submit`: Submit a URL for scanning
```r
devtools::install_git("https://git.sr.ht/~hrbrmstr/urlscan")
# or
devtools::install_gitlab("hrbrmstr/urlscan")
# or
devtools::install_github("hrbrmstr/urlscan")
```
```r
library(urlscan)
library(tidyverse) # for demos

# current version
packageVersion("urlscan")
## '0.2.0'
```
```r
x <- urlscan_search("domain:r-project.org")

as_tibble(x$results$task) %>%
  bind_cols(as_tibble(x$results$page)) %>%
  mutate(
    time = anytime::anytime(time),
    id = x$results$`_id`
  ) %>%
  arrange(desc(time)) %>%
  select(url, country, server, ip, id) -> xdf

ures <- urlscan_result(xdf$id, include_dom = TRUE, include_shot = TRUE)

ures
```
```
## URL: https://cran.r-project.org/
## Scan ID: cdc2b957-548c-447a-a1b2-bebd6a734aec
## Malicious: FALSE
## Ad Blocked: FALSE
## Total Links: 0
## Secure Requests: 9
## Secure Req %: 100%
```
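The remaining function, `urlscan_submit`, queues a URL for a fresh scan rather than searching existing results. A minimal sketch of how it pairs with `urlscan_result` follows; note that urlscan.io submissions require an API key, and the `$uuid` field name and the environment-variable key name shown in the comments are assumptions based on the urlscan.io API, not confirmed package details:

```r
library(urlscan)

# Submissions need a urlscan.io API key; how the package picks it up
# (e.g. an URLSCAN_API_KEY environment variable) is an assumption here.

# Queue a scan of a URL
sub <- urlscan_submit("https://cran.r-project.org/")

# Scans take a short while to complete, so wait before retrieving the
# result by its scan ID (the `$uuid` field name is an assumption).
Sys.sleep(30)

res <- urlscan_result(sub$uuid, include_dom = TRUE, include_shot = TRUE)
res
```

Because submissions are processed asynchronously on urlscan.io's side, polling (or a fixed sleep as above) is needed before the result is available.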
Please note that the ‘urlscan’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.