# Modeling web navigation data into Markov chains of varying order
The only dependency of the tool is the tidyverse library.
The tool's main entry point is `simulation`:
simulation <- function(input, k, states, topics, skip) { ... }
- input: a file of navigation data, where each line is a navigation path: a sequence of numbers
identifying the topics that were chosen
- k: the highest Markov chain order to fit to the data
- states: the number of states (topics) that appear in the dataset
- topics: an array of length `states` that maps each state index to its topic name
- skip: the number of lines to ignore from input
The `topics` and `skip` arguments default to values suited to the analysis of the msnbc anonymous
navigation dataset that can be found here.
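To make the input format and the roles of `skip` and `topics` concrete, here is a minimal sketch in base R. It assumes that paths are whitespace-separated integers, one path per line; the sample lines, the temporary file, and the topic names are all made up for illustration, and this is not the tool's own parsing code.

```r
# Hypothetical sample: one header line, then one navigation path per line.
sample_lines <- c(
  "% header line that should be ignored",
  "1 1 2",
  "3 2 2 4",
  "1"
)
path_file <- tempfile(fileext = ".seq")
writeLines(sample_lines, path_file)

skip <- 1  # number of leading lines to ignore, mirroring the `skip` argument

raw <- readLines(path_file)
if (skip > 0) raw <- raw[-seq_len(skip)]

# Split each line on whitespace and convert the tokens to integer state indices.
paths <- lapply(strsplit(trimws(raw), "\\s+"), as.integer)

# Map state indices to names, mirroring the `topics` argument (names are invented).
topics <- c("frontpage", "news", "tech", "sports")
named_paths <- lapply(paths, function(p) topics[p])

print(named_paths)
```

Each element of `paths` is one navigation path, and `length(topics)` plays the role of `states`.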
`simulation` returns three tibbles:
- frequencies of topics
- log-likelihood of topics
- results that contain log-likelihood ratio statistics, AIC and BIC
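As an illustration of the statistics listed above, the following base R sketch fits Markov chains of order 0 and 1 to a toy dataset and compares them. It uses the standard textbook definitions (maximum-likelihood transition probabilities, `states^k * (states - 1)` free parameters for order `k`) and conditions every model on the first `max_k` symbols of each path so the fits are comparable. The helper names `transition_table`, `log_likelihood` and `fit_order` are hypothetical and the tool itself may compute these quantities differently.

```r
# Count transitions from each length-k context to the next state.
# The first max_k symbols of every path are used only as context.
transition_table <- function(paths, k, max_k = k) {
  ctx <- character(0)
  nxt <- character(0)
  for (p in paths) {
    n <- length(p)
    if (n <= max_k) next
    for (i in (max_k + 1):n) {
      context <- if (k == 0) "-" else paste(p[(i - k):(i - 1)], collapse = ",")
      ctx <- c(ctx, context)
      nxt <- c(nxt, as.character(p[i]))
    }
  }
  table(context = ctx, nxt = nxt)
}

# Log-likelihood under the maximum-likelihood transition probabilities.
log_likelihood <- function(tab) {
  probs <- tab / rowSums(tab)  # each row is divided by its own total
  sum(tab[tab > 0] * log(probs[tab > 0]))
}

# Fit one order and report log-likelihood, parameter count, AIC and BIC.
fit_order <- function(paths, k, states, max_k) {
  tab <- transition_table(paths, k, max_k)
  ll <- log_likelihood(tab)
  params <- states^k * (states - 1)
  n_obs <- sum(tab)
  list(order = k, loglik = ll, params = params,
       aic = -2 * ll + 2 * params,
       bic = -2 * ll + params * log(n_obs))
}

# Toy data with two states (made up for illustration).
paths <- list(c(1, 1, 2, 2, 1), c(2, 2, 2, 1), c(1, 2, 1, 2))
fit0 <- fit_order(paths, k = 0, states = 2, max_k = 1)
fit1 <- fit_order(paths, k = 1, states = 2, max_k = 1)

# Likelihood ratio test of order 0 against order 1.
lr <- 2 * (fit1$loglik - fit0$loglik)
df <- fit1$params - fit0$params
p_value <- pchisq(lr, df, lower.tail = FALSE)

print(c(lr = lr, df = df, p_value = p_value))
```

A lower AIC or BIC favours that order; a small `p_value` suggests that the higher order explains the data significantly better than the lower one.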
There are also some helper functions provided for the creation of graphs.
# Common Lisp implementation
The Lisp implementation is faster but not as polished as the R one. Proceed with care!