What if Recommendation Algorithms Like Facebook’s Grappled Directly with Bad Content?
Posted on Thu 07 October 2021 in data science
Maybe we can improve recommendations by giving our models a different goal
Recommendation algorithms (a.k.a. recommender systems) are computer programs that choose which content to present to people. The Facebook news feed is powered by a recommendation algorithm.
I first worked on recommender systems in 1997 at Net Perceptions, a Twin Cities startup that was a pioneer in the space. An early customer was Amazon, and although the business relationship turned sour, we were part of the path that led to their “If you like A, you’ll like B” feature.
I’ve also seen Silicon Valley from the inside, working at or with Internet companies (Google, Amazon, Pinterest). However, I have never worked at Facebook, and I have no special or inside knowledge about how their recommendation algorithms work, or about their corporate structure or internals. Given that, I’ll offer some opinions I think may be relevant.
The goal
Every recommender is a model (nowadays called a machine learning model, or AI model) trained toward a mathematically well-defined goal. At a high level, that goal requires a dataset labeled with at least two kinds of examples: good ones (what you want to recommend) and the rest (not good, or maybe bad).
This is where things get complicated. What should the goal be?
Suppose you choose to optimize a model to increase engagement with the content, as measured by clicks. Voila, now you understand one reason the Internet is full of clickbait: we optimize for clicks. This is not a great goal, because we actually care about more than just clicks. Companies care about whether the content is against policy, and society might care whether it’s divisive or reprehensible.
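To make that concrete, here is a minimal sketch of what “optimize for clicks” can look like, using scikit-learn with made-up features and labels (assumptions for illustration only; real systems use far richer features and models). The shape of the objective is the point: predict the probability of a click and rank content by that prediction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per (user, post) impression.
# Made-up features: [how sensational the post looks, source quality].
X = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.8, 0.3],
    [0.1, 0.9],
])
clicked = np.array([1, 0, 1, 0])  # the only label the model ever sees

# The goal: predict clicks. Nothing here knows about policy or quality.
model = LogisticRegression().fit(X, clicked)

# Rank candidate posts purely by predicted click probability.
candidates = np.array([[0.95, 0.05], [0.3, 0.7]])
scores = model.predict_proba(candidates)[:, 1]
print(np.argsort(-scores))  # the clickbait-looking post ranks first
```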
The goal you choose is crucially important to what is recommended.
The teams
Choosing a goal intersects with company organization in an interesting way. Large Internet companies are often led by people with two overriding goals: user growth and revenue. And growth has always mattered more, a lot more. That’s how they got big enough to matter. Facebook in particular has been obsessed with growth for years.
So, there are different groups within Facebook: some are associated directly with growth (there might be a “growth team”), and others are not. One of the teams that is not growth is “trust and safety”, the people tasked with making Facebook content conform to their policies (“community standards”).
I am guessing that for years Mark Zuckerberg has kept a constant eye on growth and tried to think about trust and safety as little as possible. The news stories I’ve seen over the years seem to support this guess.
Growth fuels the company. Trust and safety tries to lessen the damage. No one funds or pays a company primarily because it has good community standards. They fund or pay because it has users.
The way I’ve seen that work out on a technical level is that sometimes there are two different teams dealing with content, and they don’t work together. One team tries to get some primary positive metric (like engagement, or Facebook’s more recent “Meaningful Social Interactions”) to go up. A different team tries to get some negative metric (like posts that are against policy, or users posting bad content, or users seeing bad content) to go down.
These two teams might have entirely separate models, separate goals, even separate infrastructure. There are some good reasons for that (the problems are not the same), but if the teams are separate, the important team (growth, engagement) can ignore the goals of the less important team (trust and safety). I’ve seen it happen. People aren’t trying to be evil, but each team has its own goals.
For example, the team training a content recommender doesn’t want to recommend bad content, so they might simply remove it before training their model.
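In code, that choice is just a filter applied before training. A minimal sketch, assuming a hypothetical impression log with a `was_removed` flag standing in for whatever policy signal exists:

```python
import pandas as pd

# Hypothetical impression log; was_removed marks posts later taken down for policy.
log = pd.DataFrame({
    "post_id":     [1, 2, 3, 4],
    "engaged":     [1, 0, 1, 1],
    "was_removed": [False, False, True, False],
})

# The recommender team drops policy-violating posts before training...
train = log[~log.was_removed]
print(train.post_id.tolist())  # [1, 2, 4] -- post 3 is simply gone

# ...so the model's objective never penalizes recommending the next
# post that looks just like post 3.
```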
However, that ignores the problem. In fact, content that is engaging might be more likely to be bad. Facebook whistleblower Frances Haugen ‘stated that some of Facebook’s own research found that “angry content” is more likely to receive engagement, something that content producers and political parties are aware of.’
One model
What if, instead, one model were forced to grapple with the tradeoff between engagement (positive) and bad content (negative)? As a simple example: never remove any data from training, and give +1 for a post someone engaged with but -10 for a post that was later removed as against policy.
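Here is a minimal sketch of that scheme, again with made-up data (the features and the +1/-10 weights are assumptions for illustration; in practice the weights would be tuned). The key difference from the filtering approach above is that nothing is dropped: both signals collapse into one training target, so a single model must trade them off.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical features and outcomes for four posts.
X = np.array([
    [0.9, 0.1],
    [0.2, 0.8],
    [0.8, 0.3],   # engaging, but later removed for violating policy
    [0.1, 0.9],
])
engaged     = np.array([1, 0, 1, 1])
was_removed = np.array([0, 0, 1, 0])

# One goal for one model: +1 for engagement, -10 if the post was later
# removed as against policy. No data is removed from training.
target = 1 * engaged - 10 * was_removed

# The target is now a real-valued score, so use a regressor.
model = Ridge().fit(X, target)

# A candidate whose features resemble the engaging-but-removed post
# now gets a low (negative) predicted score.
print(model.predict(np.array([[0.85, 0.15]])))
```

A real system would fold this into its existing ranking model rather than a toy regressor, but the principle is the same: the policy signal shapes the very objective that drives recommendations, instead of living in a separate team’s separate model.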
This would point our sophisticated machine learning techniques at finding content that is engaging but not bad. However, such a model might also recommend only milquetoast content that offends no one, and severely reduce engagement. I’d love to know more.
Facebook might even already have experimented with this technique, but I’ve not seen public mention of it. If someone has, let me know.