A tool for modeling web navigation data as Markov chains of various orders and deriving statistics from them.

- read-only: https://git.sr.ht/~bendersteed/markov-chains-web-navigation
- read/write: git@git.sr.ht:~bendersteed/markov-chains-web-navigation

The only dependency of the tool is the tidyverse library.

The tool's main entry point is the simulation function:

```
simulation <- function(input, k, states, topics, skip) { ... }
```

- input: a file of navigation data, where each line is a navigation path given as a sequence of numbers identifying the topics that were chosen
- k: the highest Markov chain order to fit to the data
- states: the number of states (topics) that appear in the dataset
- topics: an array of length states that maps each index to a topic name
- skip: the number of lines to ignore at the start of input
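
To make the expected input concrete, here is a minimal base R sketch, independent of the tool's own code, that reads paths in the format described above and counts first-order transitions. The helper names (parse_paths, transition_counts) and the sample data are hypothetical, and it assumes that topics are whitespace-separated integers in the range 1 to states.

```r
# Hypothetical helper: parse lines of whitespace-separated topic indices,
# ignoring the first `skip` lines (e.g. a file header).
parse_paths <- function(lines, skip = 0) {
  if (skip > 0) lines <- lines[-seq_len(skip)]
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]          # drop empty lines
  lapply(strsplit(lines, "\\s+"), as.integer)
}

# Hypothetical helper: count first-order transitions i -> j over all paths.
transition_counts <- function(paths, states) {
  counts <- matrix(0L, nrow = states, ncol = states)
  for (p in paths) {
    if (length(p) < 2) next              # a single visit has no transition
    from <- p[-length(p)]
    to <- p[-1]
    for (t in seq_along(from)) {
      counts[from[t], to[t]] <- counts[from[t], to[t]] + 1L
    }
  }
  counts
}

# Made-up sample input: one header line, then three navigation paths.
sample_lines <- c("% header", "1 1 2", "2 3", "3")
paths <- parse_paths(sample_lines, skip = 1)
counts <- transition_counts(paths, states = 3)
```

With a real file, the lines would come from readLines(input) instead of the sample vector. Higher orders follow the same idea, with the last k topics taken together as the "from" context.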

The topics and skip arguments default to values suited to the msnbc anonymous navigation dataset, which can be found here.

It returns three tibbles:

- frequencies of topics
- log-likelihood of topics
- results, containing log-likelihood ratio statistics, AIC and BIC
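
For orientation, the standard definitions of these criteria can be sketched in base R as follows. This is not the tool's code: the function names and the example numbers are made up, and the parameter count states^k * (states - 1) is the usual one for a k-th order chain, assumed here rather than taken from the implementation.

```r
# Free parameters of a k-th order Markov chain: one probability
# distribution over `states` outcomes for each context of length k.
n_params <- function(states, k) states^k * (states - 1)

# Standard definitions; p is the parameter count and
# n is the number of observed transitions.
aic <- function(loglik, p) 2 * p - 2 * loglik
bic <- function(loglik, p, n) p * log(n) - 2 * loglik

# Likelihood ratio statistic comparing a lower-order (null) model
# against a higher-order (alternative) model.
lr_stat <- function(loglik_low, loglik_high) -2 * (loglik_low - loglik_high)

# Made-up example: 3 states, orders 1 and 2.
p1 <- n_params(states = 3, k = 1)   # 6 free parameters
p2 <- n_params(states = 3, k = 2)   # 18 free parameters
```

Lower AIC or BIC indicates a better trade-off between fit and complexity. BIC penalizes additional parameters more heavily as the number of observations grows, so it tends to favor lower orders than AIC.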

There are also some helper functions provided for the creation of graphs.

The Lisp implementation is faster but not as polished as the R one. Proceed with care!