How to become a data scientist, part 3: tell people about your work

Posted on Thu 14 April 2022 in data science


In part 1, I suggested you find a good problem. In part 2, I suggested you try to solve it. Here I discuss how to tell people what you did.

Let me start with a story. I was presenting some work, a couple weeks of digging in data to define …


Continue reading

What if Recommendation Algorithms Like Facebook’s Grappled Directly with Bad Content?

Posted on Thu 07 October 2021 in data science


Maybe we can improve recommendations by giving our models a different goal

Recommendation algorithms (a.k.a. recommender systems) are computer programs that choose which content to present to people. The Facebook news feed is powered by a recommendation algorithm.

I first worked on recommender systems in 1997 at Net …


Continue reading

Should you explain your predictions with SHAP or IG?

Posted on Tue 13 August 2019 in data science


Some of the most accurate predictive models today are black box models, meaning it is hard to really understand how they work. To address this problem, techniques have arisen to understand feature importance: for a given prediction, how important is each input feature value to that prediction? Two well-known techniques …


Continue reading

Causality in model explanations and in the real world

Posted on Wed 31 July 2019 in data science


You can’t always change a human’s input to see the output.

At Fiddler Labs, we place great emphasis on model explanations being faithful to the model’s behavior. Ideally, feature importance explanations should surface and appropriately quantify all and only those factors that are causally responsible for the …


Continue reading

“Hey, what’s that?” Debugging predictions using explanations

Posted on Mon 22 July 2019 in data science


Machine learning (ML) models are popping up everywhere. There is a lot of technical innovation (e.g., deep learning, explainable AI) that has made them more accurate, more broadly applicable, and usable by more people in more business applications. The lists are everywhere: banking, healthcare, tech, all of the above …


Continue reading

A gentle introduction to GA2Ms, a white box model

Posted on Mon 03 June 2019 in data science


This post is a gentle introduction to a white box machine learning model called a GA2M.

We’ll walk through:

  • What is a white box model, and why would you want one?
  • A classic example white box model: logistic regression
  • What is a GAM, and why would you want one …

Continue reading

Humans choose, AI does not

Posted on Wed 08 May 2019 in data science


Artificial intelligence isn’t human

“Artificial Intelligence Will Best Humans at Everything by 2060, Experts Say”. Well.

First, as Yogi Berra said, “It’s tough to make predictions, especially about the future.” Where is my flying car?

Second, the title reads like clickbait, but surprisingly it appears to be pretty …


Continue reading

A gentle introduction to algorithmic fairness

Posted on Tue 23 April 2019 in data science


A gentle introduction to issues of algorithmic fairness: some U.S. history, legal motivations, and four definitions with counterarguments.

History

In the United States, there is a long history of fairness issues in lending.

For example, redlining:

‘In 1935, the Federal Home Loan Bank Board asked the Home Owners’ Loan …


Continue reading

Case study: explaining credit modeling predictions with SHAP

Posted on Thu 21 March 2019 in data science


Introduction

At Fiddler Labs, we are all about explaining machine learning models. One recent interesting explanation technology is SHAP (SHapley Additive exPlanations). To learn more about how SHAP works in practice, we applied it to predicting loan defaults in data from Lending Club.

We built three models (random, logistic regression …


Continue reading

Mary and John: using first name to predict sex in the US works quite well

Posted on Mon 11 February 2019 in data science


Someone’s first name is a good clue of their sex. Mary is probably female. John is probably male. How good a clue exactly? The short story: in the US, first name is a very good clue of sex at birth, but it varies by at least name and year …


Continue reading