Download spec-access.pdf for the latest published version of the working draft report.
├── README.md: this README
├── analysis
│   ├── analyse.R: (2) analyses data, outputs figures
│   └── wrangle.R: (1) wrangles data into well-structured form from local scraped website
├── images
│   ├── first-alt.png
│   ├── second-alt.png
│   └── spectator-logo.png
├── spec-access.pdf: PDF report output
└── spec-access.tex: LaTeX source for report
To replicate data analysis, you must have a local copy of some part of the Spectator's website (or the website you wish to analyse).
Preferably, download my archive of the site within the analysis directory (filelist.txt included), then extract it:
tar -xzvf www.columbiaspectator.com.tar.gz
You are then ready to run the R scripts. If you are unable to download the above, you can directly scrape the site using wget within the analysis directory:
wget -rkp -l6 -np -nH -N --wait=1 -A html https://www.columbiaspectator.com
If you scrape a site yourself, you can create filelist.txt with:
find . -type f -name "*.html" > filelist.txt
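Before running the R scripts, it can help to confirm that filelist.txt matches what is actually on disk. The following is a minimal sketch, not part of the repository: `check_filelist` is a hypothetical helper name, and it assumes filelist.txt holds one path per line, relative to the directory you run it from.

```shell
# check_filelist [FILE]: count the paths listed in FILE (default: filelist.txt)
# and report any that do not exist as regular files.
# Hypothetical helper for illustration; not part of this repository.
check_filelist() {
    list="${1:-filelist.txt}"
    total=0
    missing=0
    while IFS= read -r path; do
        [ -n "$path" ] || continue          # skip blank lines
        total=$((total + 1))
        if [ ! -f "$path" ]; then
            missing=$((missing + 1))
            printf 'missing: %s\n' "$path" >&2
        fi
    done < "$list"
    printf '%d listed, %d missing\n' "$total" "$missing"
    [ "$missing" -eq 0 ]                    # non-zero exit status if anything is missing
}
```

Running `check_filelist filelist.txt` from the analysis directory prints a one-line summary and returns a non-zero status if any listed file is absent.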
Install the required packages by running install.packages("[package name here]") in the R console:
plotly (optional: to create interactive plotly graphs or SVG, requires additional installation of orca)
Wrangle data:
R -f wrangle.R
Analyse & create plots:
R -f analyse.R
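Because analyse.R depends on the output of wrangle.R, the two steps can be chained so that the second only runs if the first succeeds. A minimal sketch, assuming it is run from the analysis directory; `run_pipeline` and the `R_CMD` override are hypothetical names introduced here, not part of the repository.

```shell
# run_pipeline: run wrangle.R, then analyse.R, stopping at the first failure.
# R_CMD is a hypothetical override; it defaults to the "R -f" invocation used above.
run_pipeline() {
    for script in wrangle.R analyse.R; do
        printf 'running %s\n' "$script"
        # Left unquoted on purpose so that "R -f" splits into command and flag.
        ${R_CMD:-R -f} "$script" || {
            printf 'failed: %s\n' "$script" >&2
            return 1
        }
    done
}
```

Calling `run_pipeline` with no arguments runs both scripts in order; if wrangle.R fails, analyse.R is never started.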
Update the images directory with your newly generated figures, keeping the file names as they are. These images will be pulled in automatically when you compile the document.
Then, update any text/information in the report to reflect changes you might have found.
Compile the document twice with pdfLaTeX or XeLaTeX.
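The second pass lets LaTeX resolve cross-references written out during the first. A minimal sketch of the two-pass compile, assuming it is run from the repository root where spec-access.tex lives; `compile_report` is a hypothetical helper name, and `-interaction=nonstopmode` is added here only so that an error does not stop and wait for input.

```shell
# compile_report [ENGINE]: run the LaTeX engine twice over spec-access.tex.
# ENGINE defaults to pdflatex; pass xelatex to use XeLaTeX instead.
# Hypothetical helper for illustration; not part of this repository.
compile_report() {
    engine="${1:-pdflatex}"
    for pass in 1 2; do
        "$engine" -interaction=nonstopmode spec-access.tex || return 1
    done
}
```

For example, `compile_report xelatex` performs both passes with XeLaTeX and stops early if the first pass fails.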
To create other formats of the report, consider using pandoc to convert the LaTeX document. For example:
pandoc spec-access.tex -o spec-access.html
pandoc spec-access.tex -o spec-access.docx
pandoc spec-access.tex -o spec-access.odt
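The three conversions above differ only in the output extension, so they can be driven by a loop. A minimal sketch, assuming pandoc is installed and the command is run next to spec-access.tex; `convert_report` is a hypothetical helper name, not part of the repository.

```shell
# convert_report EXT...: convert spec-access.tex to each given format,
# letting pandoc infer the output format from the file extension.
# Hypothetical helper for illustration; not part of this repository.
convert_report() {
    for ext in "$@"; do
        pandoc spec-access.tex -o "spec-access.$ext" || return 1
    done
}
```

For example, `convert_report html docx odt` reproduces the three commands listed above and stops at the first conversion that fails.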
Copyright 2020 Nathaniel Ijams firstname.lastname@example.org
The code is licensed under the GNU General Public License Version 3 (GPLv3). See the LICENSE file.
The writing is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0). See the LICENSE-TEXT file.
R and LaTeX are well-supported ecosystems, each actively maintained for the roughly 30 years it has existed.
The packages and dependencies this project uses from both are popular and well supported.
The owner of this repository commits to support for its contents until January 1st, 2025.