In the buildup to the Oscars, all fans and critics have their hunches about who’ll come out on top. But can we move beyond guesswork? Is it possible to predict the winners more reliably using data and machine learning?
This was the challenge I took up a few years ago. Using data and machine learning, I developed an Oscar prediction model. This project not only sheds light on the potential outcome of the award ceremony each year, but also offers insights into the critical factors that contribute to an Oscar win.
This is an updated version of the 2022 model, which correctly predicted the winners in five out of six categories.
Introducing the Oscars prediction model
The model’s predictions are based on the following variables:
- #1 General movie and nominee information: This includes the movie’s genre, the release date, the age of the actors involved, and the public reception on platforms like IMDb and Rotten Tomatoes.
- #2 Oscar nominations and nomination statistics: I consider the number and kind of nominations each movie receives, as well as past nominations and wins of individual actors and directors. For example, movies with a higher total number of nominations and past wins have a greater likelihood of triumph.
- #3 Results of Oscar precursor ceremonies: I gather the outcomes of forerunner ceremonies such as the BAFTAs, DGAs, and Golden Globes. This is crucial because nominees and winners of these awards often go on to receive Oscars as well.
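To make the variable groups above more concrete, here is a minimal sketch of how one Best Picture nominee could be stored as a row of features. The class and field names are hypothetical and are not taken from the actual model. Only values mentioned in this article are filled in, so the group #1 fields (genre, release date, ratings) are left out.

```python
from dataclasses import dataclass, astuple


@dataclass
class BestPictureFeatures:
    """Hypothetical feature row for one Best Picture nominee."""
    # Group #2: Oscar nomination statistics
    total_nominations: int
    has_best_director_nomination: bool
    # Group #3: results of precursor ceremonies
    won_pga: bool
    won_dga: bool
    won_critics_choice: bool
    won_golden_globe: bool

    def to_vector(self):
        # Convert every field to a float so a model can consume the row.
        return [float(value) for value in astuple(self)]


# Values as described in this article.
everything_everywhere = BestPictureFeatures(
    total_nominations=11,
    has_best_director_nomination=True,
    won_pga=True,
    won_dga=True,
    won_critics_choice=True,
    won_golden_globe=False,
)
the_fabelmans = BestPictureFeatures(
    total_nominations=7,
    has_best_director_nomination=True,
    won_pga=False,
    won_dga=False,
    won_critics_choice=False,
    won_golden_globe=True,
)

print(everything_everywhere.to_vector())
print(the_fabelmans.to_vector())
```

A real feature table would have one such row per nominee per year, plus a label that records whether the nominee won.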
How does it all come together?
- The machine learning model looks for patterns in the data and tries to find the factors that are likely to contribute to a win. Each category has a different set of variables that are relevant for predicting the winners. For example, older actors and actresses have a better chance of receiving an award. When it comes to the Best Picture category, the number of nominations that a film received is an important predictor.
- I also analyze the relative importance of the variables of the model. For instance, the model can tell whether a musical is more likely to win than a horror movie.
- The model can also assess the interaction between variables like audience ratings and the ratings of critics. For example, is a low critics score and a high audience score a better predictor of a win than a high critics score and a low audience score?
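The three points above can be illustrated with a small sketch. This article does not say which algorithm the model uses, so the code below uses a plain logistic regression as a stand-in, written without external libraries. All numbers are made-up toy values scaled to the range 0 to 1, not real Oscar data, and the function and feature names are hypothetical. The interaction between critics and audience ratings is modeled as the product of the two scores, which is one common way to let a linear model weigh their combination.

```python
import math

FEATURE_NAMES = ["critics", "audience", "nominations", "critics_x_audience"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def add_interaction(row):
    # Append critics * audience so the model can weigh the combination.
    critics, audience, nominations = row
    return [critics, audience, nominations, critics * audience]


def predict(weights, bias, row):
    # Estimated win probability for one feature row.
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_logistic(rows, labels, lr=0.5, epochs=3000):
    # Batch gradient descent on the logistic loss.
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(rows, labels):
            error = predict(weights, bias, row) - label
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy data (made up): [critics score, audience score, share of nominations]
raw_rows = [
    [0.90, 0.80, 1.0], [0.95, 0.70, 0.9], [0.85, 0.85, 0.8], [0.80, 0.90, 0.7],
    [0.60, 0.70, 0.3], [0.50, 0.50, 0.2], [0.70, 0.60, 0.4], [0.65, 0.55, 0.5],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = won, 0 = lost

rows = [add_interaction(r) for r in raw_rows]
weights, bias = fit_logistic(rows, labels)

# Coefficient size as a rough importance measure (features share one scale).
for name, weight in sorted(zip(FEATURE_NAMES, weights), key=lambda p: -abs(p[1])):
    print(f"{name}: {weight:.2f}")

print(predict(weights, bias, add_interaction([0.90, 0.80, 1.0])))
```

With real data, the sign of the interaction weight would hint at whether agreement between critics and audiences helps or hurts. On a toy set this small, the individual weights should not be read as findings.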
Drum roll. The model's Oscar predictions for 2023
Best Picture is the most high-profile award, and the 2023 list sees the greatest ever number of sequels nominated. Both Avatar: The Way of Water and Top Gun: Maverick are nominated, but according to the model they are not in contention.
My model suggests that Everything Everywhere All at Once will come out on top. It has 11 nominations in total, meaning that the film has support from many segments of the Academy. It also won the Critics Choice, PGA, and DGA awards in the run-up, and these awards are all strong indicators of success at the Academy Awards.
Lastly, an important predictor for Best Picture is whether the movie was nominated for a Best Director award. In the history of the Oscars, a movie has won Best Picture without a Best Director nomination only five times.
Though the prediction has a high degree of confidence, this category has produced quite a few surprises in recent years. Of the five occasions on which a film won without a Best Director nomination, three happened in the past 10 years: Argo in 2013, Green Book in 2019, and CODA last year. CODA’s win was the only error in my 2022 model.
The strongest contenders are Banshees of Inisherin and The Fabelmans. Banshees of Inisherin won the Golden Globe’s Best Comedy award and has 9 total nominations. The Fabelmans won the Golden Globe’s Best Drama award and has 7 nominations.
Steven Spielberg has been nominated 8 times for Best Director, but he has won it only twice, for Schindler’s List and Saving Private Ryan. Is this the year Spielberg wins again? According to my model, probably not.
Historically, this is the category with the fewest surprises. The prize often goes to the director who won the DGA award. Because Daniel Kwan and Daniel Scheinert won that award this year, the model highlights them as the likely winners of the Best Director Oscar.
The Best Actor category features a notable age gap between the nominees, with Bill Nighy receiving his first nomination at the age of 73 and Austin Butler nominated for Elvis at just 31. Will this have any bearing on the outcome? Austin Butler is a strong contender, having won a BAFTA and a Golden Globe.
The model predicts that Brendan Fraser will pick up the Best Actor prize, partly thanks to his wins at the SAG and Critics Choice awards in the lead-up.
However, actors starring in a movie nominated for Best Picture have a better chance of winning acting awards. Unlike Elvis, The Whale wasn’t nominated for Best Picture, so this category is far from settled.
When it comes to the Best Actress category, there’s a close race between Cate Blanchett and Michelle Yeoh. The model predicts that Blanchett will triumph, thanks to her wins at BAFTA, Critics Choice, and the Golden Globes, as well as her previous two Oscar wins and seven total Oscar nominations.
Best Supporting Actor
This category has seen some iconic victories in the past. These include Heath Ledger’s performance as the Joker in The Dark Knight (2009) and Brad Pitt’s portrayal of Cliff Booth in Once Upon a Time in Hollywood (2020).
This year the top contenders are Ke Huy Quan and Brendan Gleeson. Brendan Gleeson is the older of the two, and he won a BAFTA for his performance in The Banshees of Inisherin. Ke Huy Quan received a SAG award and a Golden Globe for Everything Everywhere All at Once in the build-up to the Oscars. According to the model, he is the most likely to win Best Supporting Actor this year.
Best Supporting Actress
Jamie Lee Curtis is a Hollywood veteran, but this is the first time she’s been nominated for an Oscar. She is in a three-way race with Angela Bassett and Kerry Condon. The precursor awards for this category have been split between these three actresses, with Jamie Lee Curtis winning SAG, Angela Bassett winning the Golden Globe and Critics Choice, and Kerry Condon winning BAFTA.
Angela Bassett is the only one of the three who has already been nominated for an Oscar. It happened back in 1994 for her portrayal of Tina Turner in the movie What’s Love Got to Do with It. The model indicates that this might be enough to sway votes in her favor, leaving Jamie Lee Curtis empty-handed.
Who do you think will win?
My Oscar prediction model sheds light on some of the factors that influence the outcome of the awards and hopefully it has given you an insight into how machine learning works. Having seen the data presented and maybe even some of the movies in question, do you agree with the model’s predictions? Do you have your own favorites?