A Ruby script to convert version 0.9 Cricsheet YAML data files into CSV.
The CSV generated is considered an experimental Cricsheet format. Originally, CSV data was provided only for T20 and IPL matches, but it has since expanded to include all matches. The CSV version was created in response to a request for a simple version of the data in CSV format (rather than YAML). The data it generates is not as complete as the YAML data and does not currently support all fields. A list of known limitations can be found below.
The data generated is similar to, but not exactly the same as, that provided on the Cricsheet downloads page. The data generated by the convert.rb script does not include values for match_referee, unlike that provided on the website. A future version of this converter will include these fields.
You can manage the dependencies using Bundler. Once you have it installed you can install the dependencies using:
$ bundle install
convert.rb is a Ruby script. It takes the path to a single match file (in version 0.9 YAML format) and prints the generated CSV for that match to standard output.
Convert a single YAML file, and print the resulting CSV to the command line.
$ ./convert.rb data.yaml
Convert a single YAML file, and write the resulting CSV to a file named data.csv.
$ ./convert.rb data.yaml > data.csv
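Because convert.rb handles one match file per invocation, converting a whole directory needs a small wrapper. The following is a minimal sketch rather than part of this project: the helper names `csv_path_for` and `convert_all`, the `.yaml` extension, and the directory layout are all assumptions made for illustration.

```ruby
require "open3"
require "fileutils"

# Hypothetical helper: map "matches/123.yaml" to "<out_dir>/123.csv".
def csv_path_for(yaml_path, out_dir)
  File.join(out_dir, File.basename(yaml_path, ".yaml") + ".csv")
end

# Hypothetical helper: run the converter once per YAML file in in_dir and
# save whatever it prints on standard output into out_dir.
# Assumes the converter is executable at the path given by `script`.
def convert_all(in_dir, out_dir, script: "./convert.rb")
  FileUtils.mkdir_p(out_dir)
  Dir.glob(File.join(in_dir, "*.yaml")).sort.each do |yaml_path|
    csv, status = Open3.capture2(script, yaml_path)
    unless status.success?
      warn "skipping #{yaml_path}: converter exited with #{status.exitstatus}"
      next
    end
    File.write(csv_path_for(yaml_path, out_dir), csv)
  end
end

# Only run when called directly with an input and an output directory.
convert_all(ARGV[0], ARGV[1]) if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
```

Saved under a name of your choosing (for example convert_all.rb, also hypothetical), it could be run as `ruby convert_all.rb yaml_dir csv_dir`. Files for which the converter fails are skipped with a warning rather than aborting the whole run.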
Each file has a 'version' line, multiple 'info' lines, and multiple 'ball' lines. The 'version' is 1.2.0 right now, and will change as I make changes (in line with the Semantic Versioning guidelines). The 'info' entries should be fairly self-explanatory, but feel free to ask on Twitter (@cricsheet) if you're unsure. If you look carefully you may see some slight hints as to some data we'll be including in the full data files in the future.
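As a rough illustration of reading this layout, the sketch below groups the rows of a generated file by line type. It assumes that the first field of every row is the literal word version, info or ball, which this README does not spell out. The `group_rows` helper is hypothetical, and the sample rows are made-up placeholders, not real match data or real field names.

```ruby
require "csv"

# Hypothetical helper: group rows by their first field.
# Assumption: that field is one of "version", "info" or "ball".
def group_rows(csv_text)
  groups = Hash.new { |hash, key| hash[key] = [] }
  CSV.parse(csv_text) do |row|
    next if row.empty? # ignore blank lines
    type, *fields = row
    groups[type] << fields
  end
  groups
end

# Placeholder input for illustration only.
sample = <<~CSV
  version,1.2.0
  info,example_key,example value
  info,another_key,another value
  ball,a,b,c
CSV

rows = group_rows(sample)
puts "version: #{rows["version"].first.first}"
puts "info lines: #{rows["info"].length}"
puts "ball lines: #{rows["ball"].length}"
```

Rows are kept as plain arrays because the three line types carry different numbers of fields, so a single header row would not fit all of them.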
Each 'ball' line has the following fields:
There are a number of known limitations with the current CSV format. Discussion/feedback on how best to address these is welcome.
At the moment each delivery only shows how many extras were scored, and doesn't break that total down by the type of extras conceded. This is because the original request for CSV data specifically asked for just the total extras conceded per delivery.