Beyond Collaborative Filtering

I wrote a couple of posts about some of the work on recommendation systems and collaborative filtering that we're doing at my job as a Data Scientist at Rubikloud:

Here's a blurb:

Here at Rubikloud, a big focus of our data science team is empowering retailers in delivering personalized one-to-one communications with their customers. A big aspect of personalization is recommending products and services that are tailored to a customer’s wants and needs. Naturally, recommendation systems are an active research area in machine learning with practical large scale deployments from companies such as Netflix and Spotify. In Part 1 of this series, I’ll describe the unique challenges that we have faced in building a retail specific product recommendation system and outline one of the main components of our recommendation system: a collaborative filtering algorithm. In Part 2, I’ll follow up with several useful applications of collaborative filtering and end by highlighting some of its limitations.

Hope you like it!

A Probabilistic View of Linear Regression

One thing that I always disliked about introductory material on linear regression is how randomness is explained. The explanations always seemed unintuitive because, as I have frequently seen them, they appear as an afterthought rather than the central focus of the model. In this post, I'm going to try another approach to building an ordinary linear regression model, starting from a probabilistic point of view (which is pretty much just a Bayesian view). After the general idea is established, I'll modify the model a bit and end up with a Poisson regression using the exact same principles, showing that generalized linear models aren't any more complicated. Hopefully, this will help explain the "randomness" in linear regression in a more intuitive way.

Read more…
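The core idea from the post can be sketched in a few lines: if we model $y \sim \mathcal{N}(\beta_0 + \beta_1 x,\ \sigma^2)$, then maximizing the Gaussian likelihood is exactly ordinary least squares, and the MLE of $\sigma^2$ is the mean squared residual. A minimal simulation (illustrative parameter values, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data under the probabilistic model: y ~ Normal(b0 + b1*x, sigma^2)
b0, b1, sigma = 1.0, 2.5, 0.5
x = rng.uniform(0, 1, 200)
y = b0 + b1 * x + rng.normal(0, sigma, 200)

# Maximizing the Gaussian log-likelihood in (b0, b1) is equivalent to
# minimizing the sum of squared residuals, i.e. ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# The MLE of sigma^2 is the mean squared residual.
resid = y - X @ beta_hat
sigma2_hat = np.mean(resid ** 2)
```

With 200 points the estimates land close to the true values, and the "randomness" is front and center: it is the noise distribution we wrote down, not a bolted-on error term.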

Normal Approximation to the Posterior Distribution

In this post, I'm going to write about how the ever versatile normal distribution can be used to approximate a Bayesian posterior distribution. Unlike some other normal approximations, this is not a direct application of the central limit theorem. The result has a straightforward proof using Laplace's method, whose main ideas I will attempt to present. I'll also simulate a simple scenario to see how it works in practice.

Read more…
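To give a flavor of the idea: Laplace's method approximates the posterior by a normal centered at the posterior mode, with variance given by the negative inverse of the second derivative of the log posterior at that mode. A quick sanity check on a case where the exact posterior is known (a binomial likelihood with a flat prior, so the posterior is a Beta; the numbers are illustrative, not from the post):

```python
from math import sqrt

# Observe k successes in n binomial trials; with a flat prior the exact
# posterior is Beta(k + 1, n - k + 1).
n, k = 100, 30

# Posterior mode of Beta(k+1, n-k+1) is k/n.
theta_hat = k / n

# Laplace's method: curvature of the log posterior at the mode.
# d^2/dtheta^2 [k*log(theta) + (n-k)*log(1-theta)] evaluated at k/n
# gives -n^3 / (k*(n-k)), so the approximating normal variance is:
laplace_var = theta_hat * (1 - theta_hat) / n

# Exact posterior variance of Beta(a, b) is a*b / ((a+b)^2 * (a+b+1)).
a, b = k + 1, n - k + 1
exact_var = a * b / ((a + b) ** 2 * (a + b + 1))

print(sqrt(laplace_var), sqrt(exact_var))  # very close already at n = 100
```

Even at moderate sample sizes the two standard deviations agree to a few percent, which is the behavior the approximation theorem guarantees asymptotically.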

Elementary Statistics for Direct Marketing

This post is going to look at some elementary statistics for direct marketing. Most of the techniques are direct applications of topics learned in a first-year statistics course, hence the "elementary". I'll start off by covering some background and terminology on direct marketing and then introduce some of the statistical inference techniques that are commonly used. As usual, I'll mix in some theory where appropriate to build some intuition.

Read more…
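One such first-year technique that comes up constantly in direct marketing is the two-proportion z-test, e.g. for checking whether a mailed test group responds at a higher rate than a held-out control group. A minimal sketch with made-up numbers (the post may use different examples):

```python
import math

# Hypothetical campaign: test group received the mailing, control did not.
n_test, resp_test = 5000, 260   # 5.2% response rate
n_ctrl, resp_ctrl = 5000, 200   # 4.0% response rate

p1 = resp_test / n_test
p2 = resp_ctrl / n_ctrl

# Two-proportion z-test: pool the response rate under the null hypothesis
# that the mailing had no effect, then standardize the observed difference.
p_pool = (resp_test + resp_ctrl) / (n_test + n_ctrl)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_test + 1 / n_ctrl))
z = (p1 - p2) / se  # compare against a standard normal critical value
```

A z-value near 2.9, as here, comfortably exceeds the usual 1.96 threshold at the 5% level, so the uplift would be judged statistically significant.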

I'm Brian Keng, a former academic, current data scientist and engineer. This is the place where I write about all things technical.

Twitter: @bjlkeng



RSS feed

Sign up for Email Blog Posts