LaTeX & R source for the Columbia Spectator Accessibility Report

#The Spectator Accessibility Report

Download spec-access.pdf for the latest published version of the working draft report.

Send comments, questions, or feedback to nate@ijams.me or to my public inbox at ~exprez135/public-inbox@lists.sr.ht. Follow mailing list etiquette.


Repository Contents:

├── README.md: this README
├── analysis
│   ├── analyse.R: (2) analyses data, outputs figures
│   └── wrangle.R: (1) wrangles data into well-structured form from the locally scraped website
├── images
│   ├── first-alt.png
│   ├── second-alt.png
│   └── spectator-logo.png
├── spec-access.pdf: PDF report output
└── spec-access.tex: LaTeX source for report

#Data & Analysis

To replicate the data analysis, you must have a local copy of some part of the Spectator's website (or of the website you wish to analyse).

#Scraping the Spectator website

Preferably, download my archive of the site (filelist.txt included) from within the analysis directory:

wget https://share.ijams.me/columbia/spectator/www.columbiaspectator.com.tar.gz

Then extract it:

tar -xzvf www.columbiaspectator.com.tar.gz

You are then ready to run the R scripts. If you are unable to download the above, you can directly scrape the site using wget within the analysis directory:

wget -rkp -l6 -np -nH -N --wait=1 -A html https://www.columbiaspectator.com
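
The short flags in the wget command above are compact but hard to read. As a sketch, here is the same invocation spelled with GNU wget's long option names and wrapped in a function. The function name `scrape_site` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: the wget command above, written with long options.
#   --recursive            (-r)   follow links found in downloaded pages
#   --convert-links        (-k)   rewrite links so they work locally
#   --page-requisites      (-p)   request files needed to display a page
#   --level=6              (-l6)  recurse at most six levels deep
#   --no-parent            (-np)  never ascend above the start directory
#   --no-host-directories  (-nH)  do not create a directory named after the host
#   --timestamping         (-N)   skip files that are not newer than local copies
#   --wait=1                      pause one second between requests
#   --accept html          (-A)   keep only files whose names end in html
scrape_site() {
    url="${1:-https://www.columbiaspectator.com}"
    wget \
        --recursive \
        --convert-links \
        --page-requisites \
        --level=6 \
        --no-parent \
        --no-host-directories \
        --timestamping \
        --wait=1 \
        --accept html \
        "$url"
}
```

Calling `scrape_site` with no argument uses the Spectator URL from the command above; pass another URL as the first argument to scrape a different site.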

#Create filelist.txt

If you scrape a site yourself, you can create filelist.txt with:

find . | grep -F ".html" > filelist.txt
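
Before running the R scripts, it may help to confirm that filelist.txt is non-empty and that every path in it still exists. The following is a minimal sketch, assuming the list holds one path per line as produced by find. The function name `check_filelist` is hypothetical and not part of this repository.

```shell
# Hypothetical helper: report entries in a file list that do not exist.
# Prints each missing path plus a summary line. Returns non-zero if the
# list is empty or absent, or if any listed path is missing.
check_filelist() {
    list="${1:-filelist.txt}"
    [ -s "$list" ] || { echo "empty or missing: $list" >&2; return 1; }
    missing=0
    while IFS= read -r path; do
        if [ ! -e "$path" ]; then
            echo "missing: $path"
            missing=$((missing + 1))
        fi
    done < "$list"
    echo "checked $(wc -l < "$list" | tr -d ' ') entries, $missing missing"
    [ "$missing" -eq 0 ]
}
```

Run `check_filelist` from the directory in which filelist.txt was created, since the paths written by find are relative to that directory.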

#R dependencies

Install each of the following packages by running install.packages("[package name here]") in the R console:

  • tidyverse

  • rvest

  • purrr

  • plyr

  • ggplot2

  • plotly (optional: to create interactive plotly graphs or SVG, requires additional installation of orca)
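
The packages above can also be installed in a single call from the shell instead of one at a time. This is only a sketch: it assumes Rscript is on your PATH, the function name `install_r_deps` is hypothetical, and the CRAN mirror URL is an assumption rather than something this repository specifies.

```shell
# Hypothetical helper: install every package listed above in one call.
# The repos URL is an assumed CRAN mirror; substitute your preferred one.
# plotly is optional and can be dropped from the vector.
install_r_deps() {
    Rscript -e 'install.packages(c("tidyverse", "rvest", "purrr", "plyr", "ggplot2", "plotly"), repos = "https://cloud.r-project.org")'
}
```

Orca, which the optional plotly export relies on, is a separate installation and is not covered by this call.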

#Running the scripts

Wrangle data (run from within the analysis directory):

R -f wrangle.R

Analyse & create plots:

R -f analyse.R
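
The repository contents list wrangle.R as step (1) and analyse.R as step (2), so it can be convenient to run them in that order and stop if the first one fails. A minimal sketch, assuming it is run from within the analysis directory; the function name `run_analysis` is hypothetical.

```shell
# Hypothetical wrapper: run both scripts in order, stopping at the first
# failure so that analyse.R never runs after a failed wrangle.R.
run_analysis() {
    for script in wrangle.R analyse.R; do
        echo "running $script" >&2
        R -f "$script" || { echo "failed: $script" >&2; return 1; }
    done
}
```

The function returns a non-zero status when either script fails, so it can be chained with later steps using `&&`.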

#Creating PDF from LaTeX

Update the images directory with your newly generated figures, keeping the file names as they are. These images will be pulled in automatically when you compile the document.

Then update any text in the report to reflect changes in your results.

Compile the document twice with pdfLaTeX or XeLaTeX.
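
A second pass typically lets LaTeX resolve cross-references and similar items that it writes to auxiliary files during the first pass. As a sketch, the two passes can be wrapped in one function. The name `compile_report` is hypothetical, and the `-interaction=nonstopmode` option is an addition not mentioned in this README; it keeps the engine from pausing for input on errors.

```shell
# Hypothetical helper: compile spec-access.tex twice with the given engine
# (pdflatex by default). The second pass runs only if the first succeeds.
compile_report() {
    engine="${1:-pdflatex}"
    "$engine" -interaction=nonstopmode spec-access.tex &&
    "$engine" -interaction=nonstopmode spec-access.tex
}
```

Use `compile_report xelatex` to compile with XeLaTeX instead.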

#Creating other document formats

To create other formats of the report, consider using pandoc to convert the LaTeX document. For example:

pandoc spec-access.tex -o spec-access.html

pandoc spec-access.tex -o spec-access.docx

pandoc spec-access.tex -o spec-access.odt

#Further Information

Copyright 2020 Nathaniel Ijams nate@ijams.me

The code is licensed under the GNU General Public License Version 3 (GPLv3). See the LICENSE file.

The writing is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0). See the LICENSE-TEXT file.

#Longevity Statement

R and LaTeX are well-supported ecosystems, each maintained throughout the ~30 years it has existed.

The packages and dependencies this project uses from both are popular and well supported.

The owner of this repository commits to support for its contents until January 1st, 2025.