Randomly Sampling is a Strong Benchmark
As some might know, I’ve recently become very interested in active learning techniques. The big picture idea of active learning is that we might use the uncertainty of a model as a proxy for labelling priority. The hope is that, this way, we can sample from an unlabelled dataset as effectively as possible and reduce the time it takes to annotate it.
There are a few techniques in this space, but the overall goal is to label more effectively.
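To make the "uncertainty as labelling priority" idea concrete, here is a minimal sketch of one common variant, least-confidence sampling. The function name `least_confident_first` and the probabilities are made up for illustration; this is not taken from any particular library.

```python
def least_confident_first(probabilities, n_queries=1):
    """Return indices of the rows whose top predicted probability is lowest."""
    # Uncertainty = 1 - confidence in the most likely class.
    uncertainty = [1.0 - max(row) for row in probabilities]
    # Rank pool items from most to least uncertain.
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: uncertainty[i],
        reverse=True,
    )
    return ranked[:n_queries]


# Hypothetical predicted class probabilities for four unlabelled examples.
pool_probabilities = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]
queried = least_confident_first(pool_probabilities, n_queries=2)
print(queried)
```

The examples the model is least sure about get sent to the annotator first. Random sampling, by contrast, ignores the model entirely and picks indices uniformly.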
While doing a bit of research I “stumbled” on a very interesting paper titled “Practical Obstacles to Deploying Active Learning” by David Lowell, Zachary C. Lipton and Byron C. Wallace.
The paper describes how active learning doesn’t beat random sampling in many cases.
They try out text classification tasks as well as named entity recognition (NER) tasks. For text classification the results seem to suggest that there’s not a clear-cut benefit to using active learning techniques over random sampling.
For NER, active learning techniques seem to work more as expected.
The conclusion of the paper puts it directly:
Our findings indicate that AL performs unreliably. While a specific acquisition function and model applied to a particular task and domain may be quite effective, it is not clear that this can be predicted ahead of time.
They do restate later that this claim is mainly aimed at text classification tasks.
Results are more favorable to AL for NER, as compared to text classification, which is consistent with prior work.
But still … when I first read this I was plenty surprised! So I decided that I should run a benchmark myself. I used modAL, which is a likeable tool for active learning, to run some simulations and, sure enough, I was able to reproduce the result. The chart below shows simulations of active learning against the make_classification dataset from scikit-learn. The blue lines show simulations that used uncertainty-based sampling, while the orange lines just plucked randomly.
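For readers who want to try something similar, here is a rough sketch of this kind of simulation. It is not the script behind the chart: it uses plain scikit-learn instead of modAL, and the model (logistic regression), the dataset size, the number of queries and the `simulate` helper are all assumptions picked for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def simulate(strategy, seed, n_initial=10, n_queries=30):
    """Run one labelling simulation; return test accuracy before each query."""
    rng = np.random.default_rng(seed)
    X, y = make_classification(
        n_samples=600, n_features=10, n_informative=5, random_state=seed
    )
    X_pool, X_test, y_pool, y_test = train_test_split(
        X, y, test_size=0.5, random_state=seed
    )

    # Seed the labelled set with a few examples of each class so the
    # classifier can be fitted from the very first round.
    labelled = []
    for cls in np.unique(y_pool):
        cls_idx = np.flatnonzero(y_pool == cls)
        chosen = rng.choice(cls_idx, size=n_initial // 2, replace=False)
        labelled.extend(chosen.tolist())
    seen = set(labelled)
    unlabelled = [i for i in range(len(X_pool)) if i not in seen]

    model = LogisticRegression(max_iter=1000)
    scores = []
    for _ in range(n_queries):
        model.fit(X_pool[labelled], y_pool[labelled])
        scores.append(model.score(X_test, y_test))
        if strategy == "uncertainty":
            # Least-confidence sampling: lowest top-class probability wins.
            proba = model.predict_proba(X_pool[unlabelled])
            pick = int(np.argmin(proba.max(axis=1)))
        else:
            # Random baseline: pick any unlabelled point uniformly.
            pick = int(rng.integers(len(unlabelled)))
        labelled.append(unlabelled.pop(pick))
    return scores


curves = {
    strategy: [simulate(strategy, seed) for seed in range(5)]
    for strategy in ("uncertainty", "random")
}
for strategy, runs in curves.items():
    final = float(np.mean([run[-1] for run in runs]))
    print(strategy, round(final, 3))
```

Each entry in `curves` holds one accuracy curve per seed, so plotting them in two colours gives a chart in the same spirit as the one described above. Which strategy comes out ahead will depend on the seeds and settings, which is rather the point.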
I’ll be doing some more benchmarks to try to understand this better. If that sounds interesting, you can find my scripts over at my scikit-teach repository.