Model Explainability with SHapley Additive exPlanations (SHAP)

One of the big criticisms of modern machine learning is that it's essentially a black box: data in, prediction out, that's it. And in some sense, how could it be any other way? When you have a highly non-linear model with a high degree of interaction between features, how can you possibly hope to have a simple understanding of what the model is doing? Well, it turns out there is an interesting (and practical) line of research on exactly this problem.

This post will dive into the ideas behind a popular technique published in the last few years called SHapley Additive exPlanations (or SHAP). It builds upon previous work in this area by providing a unified framework for thinking about explanation models, as well as a new technique within this framework that uses Shapley values. I'll go over the math, the intuition, and how it works. No need for an implementation because there is already a nice little Python package! Confused yet? Keep reading and I'll explain.
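To give a flavour of what a Shapley value is, here is a minimal sketch that computes exact Shapley values for a tiny made-up model by averaging each feature's marginal contribution over every ordering of the features. The names `shapley_values` and `toy_value` are hypothetical, and treating an "absent" feature as a baseline of 0 is a simplifying assumption of this sketch; SHAP itself defines the value of a feature subset through a conditional expectation, and the Python package is not used here.

```python
from itertools import permutations


def shapley_values(value_fn, n_features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    phi = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        prev = value_fn(frozenset(present))
        for i in order:
            present.add(i)
            cur = value_fn(frozenset(present))
            phi[i] += cur - prev  # marginal contribution of feature i in this ordering
            prev = cur
    return [p / len(orderings) for p in phi]


def toy_value(subset):
    """Hypothetical model f(x) = 2*x0 + 3*x1 + x0*x2 evaluated at x = (1, 1, 1).

    Assumption: features not in `subset` are replaced by a baseline of 0.
    """
    x = [1.0 if i in subset else 0.0 for i in range(3)]
    return 2 * x[0] + 3 * x[1] + x[0] * x[2]


phi = shapley_values(toy_value, 3)
print(phi)
```

The interaction term `x0*x2` gets split evenly between features 0 and 2, and the values sum to the difference between the full prediction and the baseline prediction, which is the "additive" part of the name. Enumerating all orderings is only feasible for a handful of features.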

Read more…

A Note on Using Log-Likelihood for Generative Models

One of the things that I find is usually missing from many ML papers is how they relate to the fundamentals. There's always a throwaway line where the paper assumes something that is not at all obvious (see my post on Importance Sampling). I'm the kind of person who likes to understand things to a satisfactory degree (it's literally in the subtitle of the blog) so I couldn't help myself investigating a minor idea I read about in a paper.

This post investigates how to use continuous density outputs (e.g. a logistic or normal distribution) to model discrete image data (e.g. 8-bit RGB values). It seems like it might be something obvious, such as setting the loss to the average log-likelihood of the continuous density, and that's almost the whole story. But leaving it at that skips over so many interesting and non-obvious things that you would never know about if you didn't bother to look. I'm a curious fellow so come with me and let's take a look!
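As a rough illustration of the issue, here is a minimal sketch of one common way to attach a continuous density to discrete pixel levels: integrate a logistic density over a unit-wide bin around each integer level, so that each of the 256 levels gets a proper probability mass. This binning scheme is an assumption on my part for the sake of the example (it is the style used by discretized-logistic likelihoods), and the function names and the parameters `mu=100.0`, `s=5.0` are made up.

```python
import math


def logistic_cdf(x, mu, s):
    """CDF of a logistic distribution with location mu and scale s."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


def discretized_logistic_pmf(k, mu, s, levels=256):
    """Probability mass of integer level k from binning a logistic(mu, s).

    Bin k covers [k - 0.5, k + 0.5]; the first and last bins absorb the tails
    so that the masses over all levels sum to one.
    """
    upper = 1.0 if k == levels - 1 else logistic_cdf(k + 0.5, mu, s)
    lower = 0.0 if k == 0 else logistic_cdf(k - 0.5, mu, s)
    return upper - lower


pmf = [discretized_logistic_pmf(k, mu=100.0, s=5.0) for k in range(256)]
total = sum(pmf)
nll = -math.log(pmf[100])  # negative log-likelihood of observing pixel value 100
print(total, nll)
```

Because each level gets a genuine probability, the resulting log-likelihood is bounded above by zero, unlike a raw continuous density, which can be made arbitrarily large by shrinking the scale around a discrete value.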

Read more…


It's been a long time coming but I'm finally getting this post out! I read this paper a couple of years ago and wanted to really understand it because it was state of the art at the time (still pretty close even now). As usual though, once I started down the variational autoencoder line of posts, there was always yet another VAE paper to look into so I never got around to looking at this one.

This post is all about a proper probabilistic generative model called Pixel Convolutional Neural Networks or PixelCNN. It was originally proposed as a side contribution of Pixel Recurrent Neural Networks in [1] and later expanded upon in [2,3] (and I'm sure many other papers). The really cool thing about it is that it's (a) probabilistic, and (b) autoregressive. It's still counter-intuitive to me that you can generate images one pixel at a time, but I'm getting ahead of myself here. We'll go over some background material, the method, and my painstaking attempts at an implementation (and what I learned from it). Let's get started!
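The autoregressive part means each pixel may only depend on pixels that come before it in raster order (above, or to the left in the same row). A convolution can respect that ordering by zeroing out the kernel weights that would look at "future" pixels. Here is a minimal sketch of building such a mask for a single channel; the function name `causal_mask` is hypothetical, and the handling of colour-channel ordering is deliberately left out.

```python
def causal_mask(kernel_size, mask_type):
    """Build a kernel_size x kernel_size 0/1 mask for a raster-order convolution.

    Positions above the centre row, or left of the centre in the centre row,
    are allowed. Type "A" also hides the centre pixel itself (used for the
    first layer); type "B" allows it (used for later layers).
    """
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    c = kernel_size // 2
    mask = [[0] * kernel_size for _ in range(kernel_size)]
    for row in range(kernel_size):
        for col in range(kernel_size):
            if row < c or (row == c and col < c):
                mask[row][col] = 1
    if mask_type == "B":
        mask[c][c] = 1
    return mask


mask_a = causal_mask(3, "A")
mask_b = causal_mask(3, "B")
for row in mask_a:
    print(row)
```

Multiplying a convolution kernel elementwise by this mask before applying it keeps the output at each position independent of the pixel being predicted and everything after it.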

Read more…

Importance Sampling and Estimating Marginal Likelihood in Variational Autoencoders

It took a while but I'm back! This post is kind of a digression (which seems to happen a lot) along my journey of learning more about probabilistic generative models. There's so much in ML that you can't help learning a lot of random things along the way. That's why it's interesting, right?

Today's topic is importance sampling. It's a really old idea that you may have learned in a statistics class (I didn't) but somehow it's useful in deep learning. What's old is new, right? It's relevant here because when we have a large latent variable model (e.g. a variational autoencoder), we want to be able to efficiently estimate the marginal likelihood of the data. The marginal likelihood is kind of taken for granted in the experiments of some VAE papers when comparing different models. I was curious how it was actually computed and it took me down this rabbit hole. Turns out it's actually pretty interesting! As usual, I'll have a mix of background material, examples, math and code to build some intuition around this topic. Enjoy!
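To make the idea concrete, here is a minimal sketch of an importance sampling estimate of a log marginal likelihood on a toy latent variable model where the exact answer is known. The model (z ~ N(0, 1), x given z ~ N(z, 1)), the proposal q(z) = N(x/2, 1), and all function names are assumptions chosen for illustration; this is not a VAE and not the code from the post.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def estimate_log_marginal(x, n_samples=20000, seed=0):
    """Estimate log p(x) as the log of the average weight p(x, z) / q(z), z ~ q."""
    rng = random.Random(seed)
    q_mean, q_std = x / 2.0, 1.0  # assumed proposal, centred on the true posterior mean
    log_w = []
    for _ in range(n_samples):
        z = rng.gauss(q_mean, q_std)
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        log_w.append(log_joint - log_normal_pdf(z, q_mean, q_std))
    # log-mean-exp of the weights, computed stably
    m = max(log_w)
    return m + math.log(sum(math.exp(lw - m) for lw in log_w) / n_samples)


x_obs = 1.5
estimate = estimate_log_marginal(x_obs)
exact = log_normal_pdf(x_obs, 0.0, math.sqrt(2.0))  # here p(x) = N(0, 2) in closed form
print(estimate, exact)
```

The closer the proposal is to the true posterior over z, the lower the variance of the weights. In a VAE the encoder plays the role of the proposal, which is what makes the estimate practical in high dimensions.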

Read more…

I'm Brian Keng, a former academic, current data scientist and engineer. This is the place where I write about all things technical.

Twitter: @bjlkeng
