A Look at The First Place Solution of a Dermatology Classification Kaggle Competition
One interesting thing I often think about is the gap between academic and real-world solutions. In general academic solutions play in the realm of idealized problem spaces, removing themselves from needing to care about the messiness of the real-world. Kaggle competitions are a (small) step in the right direction towards dealing with messiness, usually providing a true blind test set (vs. overused benchmarks), and opening a few degrees of freedom in terms the techniques that can be used, which usually eschews novelty in favour of more robust methods. To this end, I thought it would be useful to take a look at a more realistic problem (via a Kaggle competition) and understand the practical details that result in a superior solution.
This post will cover the first place solution [1] to the SIIM-ISIC Melanoma Classification [0] challenge. In addition to using tried and true architectures (mostly EfficientNets), they have some interesting tactics they use to formulate the problem, process the data, and train/validate the model. I'll cover background on the ML techniques, competition and data, architectural details, problem formulation, and implementation. I've also run some experiments to better understand the benefits of certain choices they made. Enjoy!