~vesto/forerad

A simple toolkit for scraping and manipulating Citibike trip history in Python


clone

read-only
https://git.sr.ht/~vesto/forerad
read/write
git@git.sr.ht:~vesto/forerad

You can also use your local clone with git send-email.

#Forerad

2024-03-24 Update: It appears Lyft has changed the way they distribute the datasets in their S3 bucket. As a result, the scraper script is currently out of order until I have some time/energy to update it. If you're thinking about using this tool, feel free to send an email to ~vesto/feedon@lists.sr.ht and maybe I can speed up the timeline :).

This repository is a collection of utilities for working with Citibike data. It allows you to easily download all of Citibike's ride history archives, transform them as you see fit, and throw them into a SQLite database for easy querying.

This repository is what I use to build the SQLite database used in Citibike Explorer. It is also potentially useful if you don't feel like writing your own scraper to download, unzip, and load trip history archives into a pd.DataFrame.

#Installation and usage

Clone the repository, cd into the directory, and run:

$ python -m virtualenv .venv
$ source .venv/bin/activate
$ pip install -r ./requirements.txt

Once the requirements are installed, you can use ./bin/scraper to download the trip archives individually or all in one go. See ./bin/scraper --help for details.

There is also ./bin/hourly-volume-rollup, which parses all available archives and rolls up the trip data into an hourly time series. Note that this requires provisioning a SQLite database, which can be done by running yoyo apply.
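To give a feel for what an hourly rollup amounts to, here is a minimal sketch using only Python's built-in sqlite3 module. It is not the code in ./bin/hourly-volume-rollup: the table name `trips`, the `started_at` column, the helper `hourly_volume`, and the sample rows are all assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample data: one row per trip, with an ISO-8601 start time.
SAMPLE_TRIPS = [
    ("2024-01-01 08:05:00",),
    ("2024-01-01 08:45:00",),
    ("2024-01-01 09:10:00",),
    ("2024-01-02 08:30:00",),
]


def hourly_volume(conn):
    """Return (hour, trip_count) rows, one per hour that has at least one trip."""
    query = """
        SELECT strftime('%Y-%m-%d %H:00:00', started_at) AS hour,
               COUNT(*) AS trip_count
        FROM trips
        GROUP BY hour
        ORDER BY hour
    """
    return conn.execute(query).fetchall()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (started_at TEXT NOT NULL)")
conn.executemany("INSERT INTO trips (started_at) VALUES (?)", SAMPLE_TRIPS)

rollup = hourly_volume(conn)
for hour, trip_count in rollup:
    print(hour, trip_count)
```

Truncating each timestamp to the top of its hour with `strftime` and grouping on the result is one simple way to get an hourly series; hours with no trips do not appear in the output and would need to be filled in separately if a gap-free series is required.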

If you're just looking to load an archive into pandas, here's the code snippet you're looking for:

import forerad.scrapers.historical as historical

# List the cached trip archives
archives = historical.HistoricalTripArchive.list_cached()
# Load the first archive into a pandas DataFrame
df = archives[0].fetch_df()

print(df)
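Once an archive is in a DataFrame, getting it into SQLite for querying is a short step with pandas' `DataFrame.to_sql`. The sketch below is an illustration rather than this repository's own loading code: the DataFrame is a hand-made stand-in for whatever `fetch_df()` returns, and the column names `ride_id` and `started_at` and the table name `trips` are assumptions.

```python
import sqlite3

import pandas as pd

# Stand-in for the DataFrame returned by fetch_df(); the columns are assumed.
df = pd.DataFrame(
    {
        "ride_id": ["a1", "b2", "c3"],
        "started_at": [
            "2024-01-01 08:05:00",
            "2024-01-01 08:45:00",
            "2024-01-01 09:10:00",
        ],
    }
)

# Use a file path instead of ":memory:" to persist the database on disk.
conn = sqlite3.connect(":memory:")

# Write the frame to a table named "trips", replacing it if it already exists.
df.to_sql("trips", conn, if_exists="replace", index=False)

row_count = conn.execute("SELECT COUNT(*) FROM trips").fetchone()[0]
print(row_count)
```

Passing `if_exists="append"` instead would let you load several archives into the same table one after another.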

#FAQ

#What's with the stupid name?

I originally wanted to build a forecast of daily trip volume but ended up scaling back my ambitions (maybe just for now). Fore is for forecast, rad is for das Fahrrad, the German word for bike.