~duncan-bayne/redbubble

Redbubble coding test


#generate

generate is a tool to generate static HTML for browsing Redbubble works.

#installing

Ensure you have Git and Ruby 2.2.2 installed. Optionally, install rbenv to manage Ruby versions, and editorconfig for automatic editor configuration.

Then, in a console execute:

git clone https://git.sr.ht/~duncan-bayne/redbubble
cd redbubble
gem install bundler
bundle install

#supported operating systems

generate should install on any recent UNIX or UNIX-like operating system, and on Microsoft Windows. I have only tested it on 64-bit Linux Mint 17.2 (Rafaela).

#running

#usage

Generates static HTML files for browsing images.  Usage: generate [options]
    -c, --clean                      First delete all HTML files from the output directory; default is false
    -i, --input INPUT                Specify the XML file containing work and EXIF data
    -o, --output-dir OUTPUT          Specify the directory in which to create static HTML files

#examples

$ mkdir tmp
$ ./generate -i spec/fixtures/works.xml -o tmp

The above will create a directory called tmp, and generate the static HTML files inside it. You can then browse the index with your favourite browser:

$ firefox tmp/index.html &
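
The option listing under usage resembles the output of Ruby's standard OptionParser, so a minimal sketch of parsing those flags might look like the following. Only the flag names and descriptions come from the usage text; the parse_options helper and the option keys are hypothetical, and the real executable may be structured differently.

```ruby
require 'optparse'

# Hypothetical helper: parses the documented flags into a Hash.
def parse_options(argv)
  options = { clean: false }

  parser = OptionParser.new do |opts|
    opts.banner = 'Generates static HTML files for browsing images.  Usage: generate [options]'

    opts.on('-c', '--clean', 'First delete all HTML files from the output directory; default is false') do
      options[:clean] = true
    end

    opts.on('-i', '--input INPUT', 'Specify the XML file containing work and EXIF data') do |path|
      options[:input] = path
    end

    opts.on('-o', '--output-dir OUTPUT', 'Specify the directory in which to create static HTML files') do |dir|
      options[:output_dir] = dir
    end
  end

  # parse (unlike parse!) leaves argv untouched; unknown flags raise OptionParser::ParseError.
  parser.parse(argv)
  options
end

options = parse_options(%w[-i spec/fixtures/works.xml -o tmp])
puts options.inspect
```

Keeping the parsing in one small method makes it straightforward to cover in a spec without shelling out to the executable.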

#development notes

#overview

I developed generate in an exploratory fashion; the commit log shows how I proceeded.

I chose to build an ActiveRecord-based application upon an in-memory SQLite3 database. This is slightly slower than a naive implementation, but it handles 100,000-work datasets (in around six and a half minutes on a development-spec laptop) and leaves room for extension. Potential improvements include using a more conventional database (e.g. PostgreSQL) for very large datasets, or implementing the database population code in a language more suited to fast XML processing than Ruby.

#issues

New issues can be logged on the Issues page.

#operation

  1. A SQLite3 in-memory database is created, and populated with tables for Works, CameraModels and CameraMakes.
  2. CameraMakes, then CameraModels, then Works are parsed out of XML and into the database.
  3. Index, then CameraMake, then CameraModel pages are created from ERB templates and written to disk.
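
As a rough illustration of step 3, the sketch below renders pages from an ERB template and writes them to disk. It is not the application's code: a plain Hash of made-up sample data stands in for the SQLite3 database of steps 1 and 2, CameraModel pages are left out, and the PageView class, the generate_pages helper, the inline template and the per-make file names are all hypothetical. Only index.html is a name taken from this README.

```ruby
require 'erb'
require 'tmpdir'

# Hypothetical view class; the real ones live in src/views.
class PageView
  # Assumption: an inline template standing in for the files in src/templates.
  TEMPLATE = ERB.new(<<-HTML)
<html>
  <head><title><%= ERB::Util.html_escape(@title) %></title></head>
  <body>
    <h1><%= ERB::Util.html_escape(@title) %></h1>
    <ul>
<% @works.each do |work| %>
      <li><%= ERB::Util.html_escape(work) %></li>
<% end %>
    </ul>
  </body>
</html>
  HTML

  def initialize(title, works)
    @title = title
    @works = works
  end

  # Evaluates the template against this object's instance variables.
  def render
    TEMPLATE.result(binding)
  end
end

# Writes the index page, then one page per camera make, and returns the paths written.
def generate_pages(works_by_make, output_dir)
  pages = { 'index.html' => PageView.new('Index', works_by_make.values.flatten) }
  works_by_make.each do |make, works|
    pages["#{make.downcase}.html"] = PageView.new(make, works)
  end

  pages.map do |filename, view|
    path = File.join(output_dir, filename)
    File.write(path, view.render)
    path
  end
end

# Made-up sample data standing in for the populated database.
works_by_make = { 'Canon' => ['a.jpg', 'b.jpg'], 'Nikon' => ['c.jpg'] }

Dir.mktmpdir do |output_dir|
  puts generate_pages(works_by_make, output_dir).map { |path| File.basename(path) }
end
```

Escaping every interpolated value keeps a work title or camera name containing markup from breaking the generated page.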

#application structure

This is the directory structure of the application, with the most important files identified:

├── doc           Documentation from Redbubble
├── generate      The application executable
├── script        Utility scripts
├── spec
│   ├── fixtures  Fixture files for specs
│   ├── lib       Specs for library classes (only one for now)
│   ├── models    Specs for models
│   ├── support   Shared examples for specs
│   └── views     Specs for view classes
└── src
    ├── db.rb     Contains database setup and schema
    ├── lib       Library classes
    ├── models    Model classes
    ├── templates ERB templates from which HTML is generated
    └── views     View classes

#assumptions

  • The names of camera makes and models are case insensitive (e.g. 'CANON' and 'Canon' would be considered the same make).
  • Names are only treated as the same if they are identical apart from case (e.g. 'FUJIFILM' and 'FUJI PHOTO FILM CO., LTD.' would be considered two separate makes).
  • Works lacking either a camera make or a camera model are ignored when extracting camera makes and models, and so only ever appear on the index page.
  • Camera model names are not assumed to be unique across makes; a work will only be assigned a camera model if it has both a camera model and make.
  • Invalid XML, and XML containing invalid data (e.g. invalid URLs or empty names) will cause processing to halt with an error.
  • The title of a work is its filename, which is not guaranteed to be unique.
  • This is a UNIX program (so it observes the Rule of Silence and only reports on errors or when instructed).

Depending upon the time available and the background of the intended users, some of these assumptions could easily be challenged. For example, Windows users are not used to case sensitivity or silent success.
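
The naming assumptions above can be made concrete with a small sketch. It uses a plain Struct rather than the ActiveRecord models, and the Work struct, the name_key and group_by_camera helpers, and the sample data are all hypothetical. It shows case-insensitive matching, model names scoped to their make, and works without both a make and a model being skipped.

```ruby
# Hypothetical record type; the real models are ActiveRecord classes.
Work = Struct.new(:filename, :make, :model)

# Returns a lower-cased key for case-insensitive comparison, or nil when the name is absent.
def name_key(name)
  name && name.downcase
end

# Groups works by make, then by model within that make.
# Works lacking either a make or a model are skipped.
def group_by_camera(works)
  grouped = {}
  works.each do |work|
    make = name_key(work.make)
    model = name_key(work.model)
    next if make.nil? || model.nil?

    grouped[make] ||= {}
    grouped[make][model] ||= []
    grouped[make][model] << work
  end
  grouped
end

# Made-up sample data.
works = [
  Work.new('a.jpg', 'CANON', 'EOS 20D'),
  Work.new('b.jpg', 'Canon', 'eos 20d'),
  Work.new('c.jpg', 'FUJIFILM', 'X100'),
  Work.new('d.jpg', 'FUJI PHOTO FILM CO., LTD.', 'X100'),
  Work.new('e.jpg', nil, 'X100')
]

group_by_camera(works).each do |make, models|
  models.each { |model, matched| puts "#{make} / #{model}: #{matched.size}" }
end
```

Nesting models under their make means two makes can share a model name without their works being merged.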

#performance testing

You can use the script/generate_fixture script to generate arbitrarily large work XML files for testing. For example:

$ ./script/generate_fixture -w 1000 > spec/fixtures/works-1k.xml
$ time ./generate -i spec/fixtures/works-1k.xml -o tmp
./generate -i spec/fixtures/works-1k.xml -o tmp  2.14s user 0.13s system 100% cpu 2.275 total

Running generate against a 100,000-work fixture should take around six and a half minutes on a development-spec laptop.
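
For a sense of what a fixture generator of this kind involves, here is a minimal sketch. It is not the code in script/generate_fixture: the fixture_xml helper, the sample cameras and, in particular, the XML element names are assumptions made for illustration, so its output will not necessarily match the schema that generate expects.

```ruby
# Made-up sample cameras; the [nil, nil] entry yields works without EXIF data.
SAMPLE_CAMERAS = [['Canon', 'EOS 20D'], ['NIKON', 'D80'], [nil, nil]].freeze

# Builds an XML document containing work_count works.
# Assumption: the element names below are illustrative, not the real schema.
def fixture_xml(work_count)
  lines = ['<?xml version="1.0"?>', '<works>']
  work_count.times do |index|
    make, model = SAMPLE_CAMERAS[index % SAMPLE_CAMERAS.size]
    lines << '  <work>'
    lines << "    <filename>#{index + 1}.jpg</filename>"
    if make && model
      lines << "    <exif><make>#{make}</make><model>#{model}</model></exif>"
    end
    lines << '  </work>'
  end
  lines << '</works>'
  lines.join("\n") + "\n"
end

puts fixture_xml(3)
```

Cycling through a small fixed list keeps the output deterministic, which makes timings comparable between runs.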

#other development tasks

Specs, profiling and quality checks are automated through rake. For example, to display a list of all tasks:

$ rake -T
rake clean          # Clean all tempfiles and test reports
rake default        # Runs all specs and quality tests
rake documentation  # Generates HTML documentation from Markdown files
rake profile        # Profiles generation from 10,000 works
rake roodi          # Run Roodi against all source files
rake specs          # Run RSpec code examples

#licence

All original code and documentation are Copyright © Duncan Bayne, and licensed under the WTFPL.

The requirements, sample works XML and template are Copyright © Redbubble Pty. Ltd. All rights reserved.