@@ 1,9 1,10 @@
# Modeling web navigation data into Markov chains of varying order
+The only dependency of the tool is the [tidyverse](https://www.tidyverse.org/) library.
The tool's main entry point is `simulation`:
``` R
-simulation <- function(input, k, states, topics) { ... }
+simulation <- function(input, k, states, topics, skip) { ... }
```
1. input: a file of navigation data, where each line is a navigation path of numbers that
@@ 11,8 12,15 @@ simulation <- function(input, k, states, topics) { ... }
2. k: the maximum Markov chain order to fit to the data
3. states: the number of states (topics) that appear in the dataset
4. topics: an array of length states that maps each state index to its topic name
+5. skip: the number of lines to ignore from input
-It returns
+The topics and skip arguments default to values suited to the msnbc anonymous navigation
+dataset, which can be found [here](http://kdd.ics.uci.edu/databases/msnbc/msnbc.html).
+
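To make the input format more concrete, here is a minimal sketch of how navigation paths could be turned into a first-order transition matrix. It is not the tool's implementation. It uses base R only, and the sample paths, the space-separated layout, and the names `path_lines`, `counts` and `probs` are assumptions made for illustration.

``` R
# Hypothetical example data: three navigation paths over 3 states,
# written as space-separated state indices (an assumed layout).
path_lines <- c("1 1 2", "2 3 3 1", "3")
states <- 3

# Split each line into an integer vector of visited states.
paths <- lapply(strsplit(trimws(path_lines), "\\s+"), as.integer)

# Count transitions from each state (rows) to the next state (columns).
counts <- matrix(0, nrow = states, ncol = states)
for (p in paths) {
  if (length(p) < 2) next  # a single visit has no transition
  for (i in seq_len(length(p) - 1)) {
    counts[p[i], p[i + 1]] <- counts[p[i], p[i + 1]] + 1
  }
}

# Row-normalise to get maximum-likelihood transition probabilities;
# rows with no observed transitions are left as zeros.
row_totals <- rowSums(counts)
probs <- counts / ifelse(row_totals == 0, 1, row_totals)
```

A chain of order k would condition on the last k states instead of one, so the rows would index histories of length k rather than single states.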
+It returns three tibbles:
+1. frequencies of topics
+2. log-likelihood of topics
+3. results that contain log-likelihood ratio statistics, AIC and BIC
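For readers unfamiliar with these criteria, the sketch below shows the textbook way to compute AIC, BIC and likelihood ratio statistics for Markov chains of increasing order. The tool's own definitions may differ. The log-likelihood values, the observation count, and the helper `n_params` are made up for illustration.

``` R
# Number of free parameters of an order-k chain over `states` states:
# states^k possible histories, each with (states - 1) free probabilities.
n_params <- function(k, states) states^k * (states - 1)

# Hypothetical inputs: log-likelihoods for orders 0..2 and the number
# of observations. These values are invented for the example.
states <- 3
orders <- 0:2
loglik <- c(-1200, -1100, -1090)
n_obs <- 1000

p <- n_params(orders, states)

# Standard definitions: lower values indicate a better trade-off
# between fit and model size.
aic <- -2 * loglik + 2 * p
bic <- -2 * loglik + p * log(n_obs)

# Likelihood ratio statistic of each order against the order below it,
# compared with a chi-squared distribution on the extra parameters.
lr_stat <- 2 * diff(loglik)
lr_df <- diff(p)
lr_pvalue <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

data.frame(order = orders, params = p, aic = aic, bic = bic)
```

Because the parameter count grows as states^k, the penalty terms rise quickly with the order, which is why higher orders need a large gain in log-likelihood to be preferred.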
There are also some helper functions provided for the creation of graphs.