Between Test and Training data!
Let's study annotators
Computational Pun-derstanding that is.
A Dutch Abusive Language Corpus
This is a really cool use-case for Blender.
Are We Modeling the Task or the Annotator?
Learning from Teachers, more Literally
Randomly Sampling is a Strong Benchmark
Neat use-case for Active Learning.
Oh boy...
Colors and Convex Hulls
Statistics, Storks and Babies
Via GitHub Copilot!
Seems like a sensible baseline.
It's frequent and has hidden nasty bits.
Classification as a Heavy-Tail Regressor
Don't predict too far into the future.
And a use-case for it!
And How to Clone Them.
Manual_seed(3407) is All You Need
And only 24.1% of them actually ran.
Exploring Hugging Face while I'm at it.
A hypothesis *can* be a liability.
The Saga continues in Embeddings
It's Numbers that Differ!
As in ... text embeddings!
And how to render them.
Pretty table renders.
They're not very consistent.
How a Great Game became a Grand Challenge
How to find LOTS of them.
Tracking Metrics over Epochs to Understand Labels Better.
There's a lot of it.
A "shortcut" with 4 keys.
Pytest vs. Parrot
It's a great helper
Autocomplete Might be Better
Is it big or is it small?
Data Quality Strikes Again
I *really* like Svelte.
Graphs Mostly
It's an entertaining idea.
Ten Year Old Bug?
Data Quality Strikes Again
My take on Git-Scraping[tm]
Data Quality Strikes Again