This repository is a collection of utilities for working with Citibike data. It allows you to easily download all of Citibike's ride history archives, transform them as you see fit, and throw them into a SQLite database for easy querying.
This repository is what I use to build the SQLite database used in Citibike Explorer. It may also be useful if you don't feel like writing your own scraper to download, unzip, and load trip history archives into a
Clone the repository, cd into the directory, and run:
    $ python -m virtualenv .venv
    $ source .venv/bin/activate
    $ pip install -r ./requirements.txt
Once requirements are installed, you can use ./bin/scraper to download the trip archives individually or all in one swoop. See ./bin/scraper --help for details.
There is also ./bin/hourly-volume-rollup, which will parse through all available archives and roll up the trip data into an hourly timeseries. Note that this requires provisioning a SQLite database, which can be done by running
If you're just looking to load an archive into pandas, here's the code snippet you're looking for:
    import forerad.scrapers.historical as historical

    archives = historical.HistoricalTripArchive.list_cached()
    df = archives.fetch_df()
    print(df)
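If you'd rather see what an hourly rollup in SQLite looks like before running anything, here's a minimal sketch using only Python's built-in sqlite3 module. It is not the code behind ./bin/hourly-volume-rollup: the trips table, its started_at and ended_at columns, and the sample rows are all made up for illustration, so adjust them to whatever schema you actually load.

```python
import sqlite3

# Hypothetical sample rows: (started_at, ended_at) as ISO-8601 strings.
# Real trip archives carry many more columns; these two are assumptions.
SAMPLE_TRIPS = [
    ("2023-06-01 08:05:12", "2023-06-01 08:21:40"),
    ("2023-06-01 08:47:03", "2023-06-01 09:02:15"),
    ("2023-06-01 09:10:00", "2023-06-01 09:25:30"),
]


def hourly_volume(trips):
    """Load trips into an in-memory SQLite table and count trips per start hour."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema, just enough to demonstrate the rollup.
        conn.execute("CREATE TABLE trips (started_at TEXT, ended_at TEXT)")
        conn.executemany("INSERT INTO trips VALUES (?, ?)", trips)
        # Truncate each start time to the top of its hour, then count per bucket.
        rows = conn.execute(
            """
            SELECT strftime('%Y-%m-%d %H:00:00', started_at) AS hour,
                   COUNT(*) AS trip_count
            FROM trips
            GROUP BY hour
            ORDER BY hour
            """
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    for hour, trip_count in hourly_volume(SAMPLE_TRIPS):
        print(hour, trip_count)
```

Doing the bucketing in SQL keeps the rollup next to the data, so the same query should work against an on-disk database by swapping ":memory:" for a file path.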
I originally wanted to build a forecast of daily trip volume but ended up scaling back my ambitions (maybe just for now).
Fore is for forecast, rad is for das Fahrrad, the German word for bike.