e9b375ab3a641a861f49b1fa795cb50d0de1c75e — David Knight 1 year, 5 months ago d1b2684
housekeeping, update fedi link to fosstodon
3 files changed, 16 insertions(+), 16 deletions(-)

M content/blog/2020-06-06-experiment-cifar10-training-order.md
M layouts/index.html
@@ -1,4 +1,4 @@
Copyright (c) 2013-2021 David Knight
Copyright (c) 2013-2022 David Knight

Source code is licensed under the MIT License.

@@ -453,4 +453,4 @@ understandings, or agreements concerning use of licensed material. For
the avoidance of doubt, this paragraph does not form part of the
public licenses.

Creative Commons may be contacted at creativecommons.org.
\ No newline at end of file
Creative Commons may be contacted at creativecommons.org.

M content/blog/2020-06-06-experiment-cifar10-training-order.md => content/blog/2020-06-06-experiment-cifar10-training-order.md +12 -12
@@ -50,59 +50,59 @@ Grouped entropy sort was included to make the sorting more fair because average 

The key random ordered test was the one with the training set size at 100%. This configuration made the test essentially the same as the original PyTorch tutorial, so naturally I expected to see very similar results.


This is a standard chart plotting loss versus normalized training epoch, where the number of iterations per epoch is equal to the training set size. The results looked pretty textbook with loss leveling out asymptotically. As one would expect, the tests with smaller training sets produced shallower loss curves, indicating that the optimization had difficulty converging.


This chart shows total accuracy across all image classes against training set size. Again, these results seemed intuitive: accuracy decreased along with training set size. Accuracy achieved on the full training set was 53%, consistent with the number achieved in the original PyTorch tutorial.

## Results: Global Entropy Sort Ascending


The shape of these loss curves was more erratic than the random ordered tests. The interesting thing here was the upwards loss bump at the start of the second epoch. This was when the optimization finished the high entropy images at the end of the first epoch and switched to low entropy images again.

It's also worth pointing out that the loss started at a noticeably lower value than in the random ordered tests. Maybe there was less difference image-to-image because of the entropy sorting, and this helped the optimization?


No surprises here. Accuracy took a dive as the training set size shrank. Even though the loss chart looked very different from the random ordered tests, peak total accuracy was roughly the same.

## Results: Global Entropy Sort Descending


This loss chart looked more normal than the one from the previous section's tests. For large training set sizes the initial loss still started at a lower value than in the random ordered tests.


This was the most interesting chart so far! The largest training set size performed relatively poorly, and peak accuracy occurred at 40,000 training images (80%). It's worth pointing out that in this configuration 10,000 low entropy images were kept out of the training set. Something about these images reduced total accuracy.
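To make that concrete (a hypothetical sketch, not the experiment's actual code; the entropy scores here are random stand-ins): with a descending entropy order and an 80% training set size, the images that never enter training are exactly the lowest-entropy ones.

```python
import numpy as np

rng = np.random.default_rng(0)
entropy = rng.random(50_000)            # stand-in entropy score per CIFAR-10 image
order = np.argsort(entropy)[::-1]       # descending: highest entropy first

frac = 0.8                              # training set size = 80% -> 40,000 images
train_idx = order[: int(frac * len(order))]
unseen = order[int(frac * len(order)) :]

# the 10,000 held-out images are exactly the lowest-entropy ones
print(len(unseen), entropy[unseen].max() < entropy[train_idx].min())
```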

## Results: Grouped Entropy Sort Ascending


Unlike the ascending global entropy sort tests, this loss chart did not have the weird dip and hump between epochs. The lower initial loss values from that test were also not present here. This chart looked pretty similar to the random ordered tests.

It's possible that the lower initial loss in the global entropy sort tests was from some image classes having inherently higher or lower entropies, which would result in runs of training data that were not distributed uniformly across image classes. This possible unfairness in entropy distribution was the original reason for including the grouped entropy sort tests.
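The difference between the two orderings can be sketched as follows (an illustrative reconstruction, not the experiment's code; it assumes entropy is the Shannon entropy of the 8-bit pixel histogram, and uses tiny synthetic images). The grouped sort orders each class by entropy separately and then interleaves the classes round-robin, so every run of training data stays class-balanced.

```python
import numpy as np

def image_entropy(img):
    """Shannon entropy (bits) of an 8-bit image's pixel histogram."""
    p = np.bincount(img.ravel(), minlength=256) / img.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def global_entropy_order(images, ascending=True):
    """Indices of all images sorted by entropy, ignoring class."""
    ent = np.array([image_entropy(im) for im in images])
    order = np.argsort(ent)
    return order if ascending else order[::-1]

def grouped_entropy_order(images, labels, ascending=True):
    """Sort within each class by entropy, then interleave the classes
    round-robin so training batches stay distributed across classes."""
    labels = np.asarray(labels)
    per_class = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        ent = np.array([image_entropy(images[i]) for i in idx])
        sub = idx[np.argsort(ent)]
        per_class.append(sub if ascending else sub[::-1])
    # assumes balanced classes (true for CIFAR-10): one image per class in turn
    return np.stack(per_class, axis=1).ravel()

# toy data: 20 random 8x8 "images" across 2 classes
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 8, 8), dtype=np.uint8)
labels = np.arange(20) % 2
order = grouped_entropy_order(images, labels)
print(labels[order][:4])  # classes alternate at the start of the ordering
```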


Accuracy results looked pretty normal yet again. Peak total accuracy was roughly the same, too.

## Results: Grouped Entropy Sort Descending


That weird bump came back, but this time on the descending tests! I didn't have an explanation for it. Again, the initial loss started at around the same value as in the random ordered tests.


The fun accuracy curve was also back and very pronounced this time! Unlike the mysterious loss behavior, this accuracy curve consistently showed up on both sets of descending order tests. This lends more credence to the theory that some of the low entropy training images were not helping.

@@ -125,9 +125,9 @@ So does training order affect classifier performance? It does! Unfortunately the

The accuracy curves of the descending entropy sort tests (both global and grouped) were the most interesting part of the experiment. They suggest that some of the low entropy training images in the CIFAR-10 set might be irrelevant or even harmful to classifier performance.



Above are the 4 lowest entropy images from the global entropy sort (top) and grouped entropy sort (bottom) configurations. The obvious thing here is that these images have either had their backgrounds removed or have naturally homogeneous backgrounds.
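That observation is easy to sanity-check (a sketch under the same assumption as above, that entropy is computed over the pixel histogram; the images here are synthetic): a flat background concentrates the histogram into a few bins, so the image scores far lower Shannon entropy than a fully textured one.

```python
import numpy as np

def image_entropy(img):
    # Shannon entropy (bits) of the 8-bit pixel histogram
    p = np.bincount(img.ravel(), minlength=256) / img.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)
textured = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)  # noisy everywhere
flat = np.zeros((32, 32), dtype=np.uint8)                       # "removed" background
flat[8:24, 8:24] = textured[8:24, 8:24]                         # small textured subject

print(image_entropy(textured) > image_entropy(flat))  # True
```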

M layouts/index.html => layouts/index.html +2 -2
@@ -4,9 +4,9 @@
My name is David Knight.<br/>
I'm a software engineer in Northern California.<br/>
Get my code from <a href="https://git.sr.ht/~dvshkn">Sourcehut</a>.<br/>
Find me on the <a rel="me" href="https://mastodon.technology/@dvshkn">fediverse</a>.<br/>
Find me on the <a rel="me" href="https://fosstodon.org/@dvshkn">fediverse</a>.<br/>
Send me an <a href="mailto:~david~at~dvshkn~com~">email</a>.<br/>
{{ partial "latest_post" . }}
{{ end }}
\ No newline at end of file
{{ end }}