Netflix – DVD: Simply the Best…but Streaming is Another Story

In October 2006, Netflix launched a contest to improve its ability to predict which movies its customers, individually, would “really like”. The company had discovered that its movie recommender, Cinematch, had become a competitive advantage: by recommending movies that customers really liked, Netflix retained more subscribers for its DVD delivery business. Since its founding in 1997, Netflix had reshaped the DVD rental industry by staying on the leading edge of internet product delivery, so it was only logical that its movie recommender should be the best. Netflix offered $1,000,000 to anyone who could create a prediction algorithm at least 10% more accurate than Cinematch. It took almost three years, but a team of Austrian researchers, combined with a team of AT&T researchers, claimed the $1,000,000 Netflix Prize by developing a predictor that was 10.06% more accurate than Cinematch.
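For context, the contest was judged by root mean squared error (RMSE) on a held-out set of ratings. Here is a minimal sketch of that scoring metric; the code is my own illustration, not Netflix's, though the two RMSE figures in the comments are the published contest numbers:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual star ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Toy example: three predictions scored against three actual ratings.
print(rmse([3.5, 4.0, 2.5], [4.0, 4.0, 2.0]))  # ~0.41

# Published figures: Cinematch scored roughly 0.9525 RMSE on the contest's
# held-out test ratings; the winning entry scored roughly 0.8567.
improvement = (0.9525 - 0.8567) / 0.9525
print(f"Improvement over Cinematch: {improvement:.2%}")  # ~10.06%
```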

Netflix didn’t adopt the new algorithm in its entirety but it did incorporate some of the discoveries made by the winning team into Cinematch. To my knowledge, no other movie recommender out there has endured the trial by fire that Netflix endured with their Netflix Prize competition. Based on my very simple analysis of my own data, Netflix is simply the best at predicting which movies I will “really like”. Using the same approach I used to compare Rotten Tomatoes and IMDB in a previous post (Rotten Tomatoes, IMDB and the Wisdom of Crowds), take a look at the following comparison with Netflix:

Rotten Tomatoes

| | “Really Like” | Don’t “Really Like” | Total | % of Total | % “Really Like” |
|---|---|---|---|---|---|
| Cert. Fresh | 570 | 310 | 880 | 44.7% | 64.8% |
| Fresh | 326 | 399 | 725 | 36.8% | 45.0% |
| Rotten | 91 | 272 | 363 | 18.4% | 25.1% |

Netflix provides a Best Guess of how many stars you are likely to give a movie. Best Guesses of 3.8 stars and higher are Recommended Movies. Notice below how much more effectively Netflix sorts the recommended movies, the average movies, and the below average movies compared to Rotten Tomatoes. Netflix-DVD is the best tool out there for finding movies that you will “really like”.

Netflix

| Best Guess | “Really Like” | Don’t “Really Like” | Total | % of Total | % “Really Like” |
|---|---|---|---|---|---|
| 3.8 or higher | 659 | 218 | 877 | 44.6% | 75.1% |
| 3.4 to 3.7 | 267 | 457 | 724 | 36.8% | 36.9% |
| 3.3 or lower | 61 | 306 | 367 | 18.7% | 16.6% |
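For readers who want to build this kind of table from their own ratings, a minimal sketch in Python might look like this. The records and field layout here are hypothetical: each entry pairs a predicted score with a "really liked" flag, and the bins mirror the Netflix table above.

```python
# Hypothetical records: (best_guess, really_liked) pairs from a personal
# movie database. Real data would have thousands of entries.
movies = [(4.2, True), (3.9, True), (3.6, False), (3.1, False), (3.5, True)]

# Bins matching the Netflix table above.
bins = {"3.8 or higher": lambda g: g >= 3.8,
        "3.4 to 3.7":   lambda g: 3.4 <= g < 3.8,
        "3.3 or lower": lambda g: g < 3.4}

total = len(movies)
for label, in_bin in bins.items():
    group = [liked for guess, liked in movies if in_bin(guess)]
    if group:
        print(f"{label}: {len(group)} movies, "
              f"{len(group) / total:.1%} of total, "
              f"{sum(group) / len(group):.1%} 'really like'")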

It’s important to understand that Netflix Streaming, which brings you House of Cards and Orange Is the New Black, is not Netflix-DVD. In September 2011, Netflix announced that it was spinning its DVD business off into a new company called Qwikster; the Netflix brand would be used only for the streaming business. The outcry (and the cancelled subscriptions) was so great and so immediate that, in October 2011, Netflix backtracked and restored the Netflix brand to its DVD business. The vision remained, though, and very quietly Netflix separated the DVD and streaming operations. Today they are run as separate businesses with their own facilities, management, and profit centers. They also have different strategies for recommending movies.

Netflix-DVD continues to use its gold-standard algorithm to predict how many stars you are likely to give a particular movie or TV show, but one of the most popular features of the old Netflix is missing: the list of movies and shows that Netflix suggests for you. If you want suggestions, you have to hunt for them. For example, you can go to New Releases or the Drama genre and have those subsets of all movies sorted by suggestion. But Netflix-DVD no longer makes it easy to identify all of the movies that you will “really like”. The reason might be that the Netflix business vision no longer wants you to identify the movies that you will “really like”. It wants you to “really like” the movies it has.

On Monday, I’ll explore the Streaming side of Netflix and the conflict that pits the old Netflix against the new.

Is There Something Rotten (Tomatoes) in Denmark?

With apologies to William Shakespeare and Hamlet: do corporate profit incentives have a corrupting influence on movie recommender websites? Movie ratings have become big business. Amazon bought IMDB in 1998 to promote Amazon products; there appears to be a synergy between the two that doesn’t seem to affect IMDB’s rating system. The Netflix business model, on the other hand, has changed dramatically: what began as a DVD mail-order business is now heavily invested in original entertainment content for its online streaming service, and the streaming side uses a recommender algorithm different from the gold-standard algorithm used for the DVD business. Does the Netflix streaming algorithm better serve the interest of Netflix subscribers or Netflix profits? I’m sure Netflix would say it serves both. I’m not so sure. This will be a topic of interest for me in future posts. The more immediate concern is Rotten Tomatoes.

It was announced on February 17, 2016 that Rotten Tomatoes, along with the movie discovery site Flixster, had been sold to Fandango. For those of you who are not familiar with Fandango, it is one of the two major online advance movie ticket sellers; MovieTickets.com is the other. For a premium added to the ticket price, Fandango lets moviegoers buy tickets in advance and print them at home, avoiding long lines at the theater.

So, why should we be concerned? Let’s start with the perception that Rotten Tomatoes has become so influential that it makes or breaks movies before they are even released. Here are a couple of articles that express the growing concern filmmakers have with Rotten Tomatoes scores: Rotten Tomatoes: One Filmmaker’s Critical Conundrum and Summer Box Office: How Movie Tracking Went Off the Rails. Whether it is true or not, the movie industry believes that the box office success or failure of a film is in the hands of 200 or so critics and the website that aggregates their verdicts, Rotten Tomatoes.

The impact Rotten Tomatoes has on the box office each week may be a driving force behind Fandango’s acquisition. In CNN Money’s article announcing the purchase, Fandango president Paul Yanover states, “Flixster and Rotten Tomatoes are invaluable resources for movie fans, and we look forward to growing these successful properties, driving more theatrical ticketing and super-serving consumers with all their movie needs.” Fandango makes money when more people go to the movies, particularly on opening weekends for well-reviewed movies, when lines are expected to be long. Rotten Tomatoes’ Certified Fresh designation drives those opening-weekend lines. Logically, Fandango’s business interests would be better served by even more movies earning the Certified Fresh rating.

Am I being too cynical? According to a study by Nate Silver’s FiveThirtyEight site, Fandango has done this before: FiveThirtyEight found that Fandango used some creative rounding to inflate the movie ratings on its own site. Has Fandango learned its lesson? It claims that Rotten Tomatoes will maintain its independence within the corporate structure. Maybe, but in my experience corporate acquisitions are made to create profitable synergies – more Certified Fresh ratings, more moviegoers, more long lines for tickets, more “theatrical ticketing” in advance, more profits.
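To make the “creative rounding” concrete: FiveThirtyEight reported that Fandango’s displayed star ratings appeared to round up to the next half star rather than to the nearest one. A small sketch of the difference; this is my own illustration, not Fandango’s code:

```python
import math

def round_to_nearest_half(rating):
    """Conventional rounding: 4.1 displays as 4.0, 4.3 as 4.5."""
    return round(rating * 2) / 2

def round_up_to_next_half(rating):
    """The inflationary version FiveThirtyEight described: 4.1 displays as 4.5."""
    return math.ceil(rating * 2) / 2

for r in (4.0, 4.1, 4.3):
    print(r, round_to_nearest_half(r), round_up_to_next_half(r))
# 4.1 -> 4.0 conventionally, but 4.5 when always rounding up;
# across thousands of movies, that half star adds up.
```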

If you begin to “really like” fewer movies that are Certified Fresh on Rotten Tomatoes you might conclude that there may be something Rotten (Tomatoes) in Fandango…if not in Denmark.


Rotten Tomatoes, IMDB and the Wisdom of Crowds

In the introduction of James Surowiecki’s The Wisdom of Crowds, the author writes that “under the right circumstances, groups are remarkably intelligent, and are often smarter than the smartest people in them”. This prescient book, written in 2004, was describing the crowd-sourced, data-driven world that we live in today. If you want information, you type a couple of words into Google and find exactly what you were looking for on the first page of links. If you are visiting a new city and looking for a good restaurant, you check Yelp to identify the highest-rated restaurants. And if you want to go to the movies, you check Rotten Tomatoes and IMDB to see which of the movies you are considering is the highest rated.

The “right circumstances” for groups to be intelligent, according to Surowiecki, are that the group be big enough, that it be diverse, and that individual decisions within the group be made independently. Rotten Tomatoes is independent enough: most critic reviews are written prior to a movie’s release, without knowledge of how other critics are rating it. Diversity is an interesting question. They are all movie critics, after all, and most of them are men. Still, they certainly bring a diverse set of life experiences, so diversity isn’t optimal but still exists. The biggest question mark is whether the group is big enough. Star Wars: The Force Awakens is the most-reviewed movie I’ve come across on Rotten Tomatoes, with a little more than 335 critic reviews counted in its rating. My database average is 104 reviews. That is not a big sample size for statistical analysis. While, logically, movies rated Certified Fresh 95% should be better than Certified Fresh 75% movies, my data doesn’t support that.

| | “Really Like” | Don’t “Really Like” | Total | % “Really Like” |
|---|---|---|---|---|
| CF > 88% | 284 | 155 | 439 | 64.7% |
| CF < 88% | 283 | 154 | 437 | 64.8% |

There is virtually no difference between movies rated higher than Certified Fresh 88% and those less than Certified Fresh 88%. On the other hand, when you just look at Certified Fresh vs. Fresh vs. Rotten movies, the group allocates the movies intelligently.

| | “Really Like” | Don’t “Really Like” | Total | % of Total Database | % “Really Like” |
|---|---|---|---|---|---|
| Cert. Fresh | 567 | 309 | 876 | 44.6% | 64.7% |
| Fresh | 324 | 399 | 723 | 36.9% | 44.8% |
| Rotten | 91 | 272 | 363 | 18.5% | 25.1% |

It turns out that crowds of critics are pretty smart.

IMDB certainly meets the criteria for an intelligent group. It is big enough: Star Wars: The Force Awakens, for example, has over 450,000 votes. While not as diverse demographically as one might like, it is much more diverse than a crowd of critics. And moviegoers who vote on IMDB cast their votes independently (how influenced they are by other ratings is a subject for another day). When I rank the movies in my database by average IMDB rating and allocate them to groups identical to the Rotten Tomatoes table, the results are as follows:

| Avg. IMDB Rating | “Really Like” | Don’t “Really Like” | Total | % of Total Database | % “Really Like” |
|---|---|---|---|---|---|
| > 7.4 | 552 | 324 | 876 | 44.6% | 63.0% |
| 6.7 to 7.4 | 361 | 362 | 723 | 36.9% | 49.9% |
| < 6.7 | 69 | 294 | 363 | 18.5% | 19.0% |

Crowds of moviegoers are pretty smart as well.

Let’s go one step further. What would the results look like for movies that Rotten Tomatoes rated Certified Fresh and IMDB rated 7.4 or higher?

| “Really Like” | Don’t “Really Like” | Total | % of Total Database | % “Really Like” |
|---|---|---|---|---|
| 370 | 156 | 526 | 26.8% | 70.3% |

And how about if Rotten Tomatoes rated the movie Rotten and IMDB gave it an average rating of 6.7 or less?

| “Really Like” | Don’t “Really Like” | Total | % of Total Database | % “Really Like” |
|---|---|---|---|---|
| 24 | 193 | 217 | 11.1% | 11.1% |

This is the basis for my rating system. When you combine movie recommender systems, you improve your chances of selecting movies that you will “really like” and of avoiding movies you won’t. It turns out that crowds of critics and moviegoers together are the wisest crowds of all.
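A sketch of that combination rule, again with hypothetical fields (a Rotten Tomatoes designation, an average IMDB rating, and a "really liked" flag per movie):

```python
# Hypothetical records: (rt_designation, imdb_avg, really_liked).
movies = [("CF", 7.8, True), ("CF", 7.5, False), ("R", 6.1, False),
          ("F", 7.0, True), ("R", 6.5, False), ("CF", 8.1, True)]

def pct_really_like(subset):
    """Share of a subset flagged 'really liked'; 0.0 for an empty subset."""
    return sum(liked for _, _, liked in subset) / len(subset) if subset else 0.0

# Both recommenders agree the movie is good, or both agree it is bad.
both_good = [m for m in movies if m[0] == "CF" and m[1] >= 7.4]
both_bad  = [m for m in movies if m[0] == "R" and m[1] <= 6.7]

print(f"CF and IMDB >= 7.4:     {pct_really_like(both_good):.1%} 'really like'")
print(f"Rotten and IMDB <= 6.7: {pct_really_like(both_bad):.1%} 'really like'")
```

With my actual database, those two intersections produce the 70.3% and 11.1% figures in the tables above: agreement between the two crowds sharpens the signal at both ends.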


Rotten Tomatoes: The Critics Aren’t Always Right, but Collectively They Are Not Often Wrong

I lived in Chicago from 1976 to 1980. During that time I discovered a little show on WTTW, the local PBS channel, called Sneak Previews. In the show, a couple of local film critics showed clips from recent movies, and each gave his own review of each movie. Those film critics, Gene Siskel and Roger Ebert, were in the early years of a show that, over more than 35 years, would go through a number of name changes, would eventually be syndicated to a nationwide audience, and would endure contract disputes, the death of Siskel, and the serious illness of its other originator, Ebert. People across the nation tuned in to find out whether a movie they were thinking of seeing would get “two thumbs up”. Like Roman emperors at the coliseum, Siskel & Ebert could decide a movie’s box office fate with a thumbs up or a thumbs down. As a viewer, if a movie got “two thumbs up” it landed on my mental list of movies I’d consider watching. If it got “two thumbs down” it landed on my “don’t waste my time watching” list. But Siskel & Ebert were competitors from rival Chicago newspapers and, not surprisingly, they didn’t always agree about a movie. Some movies got a split decision: Siskel would give a “thumbs up” and Ebert a “thumbs down”, or vice versa. This left me in the quandary of having to choose which critic to put my faith in, since there was no consensus opinion.

This brings me to Rotten Tomatoes. With no disrespect intended to Siskel & Ebert, or any other critic, Rotten Tomatoes is the concept of “two thumbs up” on steroids. The website aggregates the opinions of critics from around the globe. Instead of giving a “thumbs up” or a “thumbs down”, critics label a movie “Fresh” or “Rotten”. Instead of two critics, a widely distributed movie might garner up to 300 critic reviews. Rotten Tomatoes includes reviews only from critics who have been certified by film critic associations or writers’ guilds. In addition, it designates some of those critics as “top critics”: well-respected critics writing for major newspapers or national magazines. Roger Ebert was one of those “top critics” before his death. If a movie has been reviewed by at least 40 critics, including at least 5 “top critics”, and at least 75% of those critics rate the movie “Fresh”, then the movie earns Rotten Tomatoes’ top designation, “Certified Fresh”. If fewer than 60% of the critics rate the movie “Fresh”, the movie is designated “Rotten”. Movies in between, for the most part, are designated “Fresh”.
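The designation rules described above reduce to a small decision function. This sketch encodes only the thresholds given in the paragraph; Rotten Tomatoes’ actual criteria have additional conditions and have changed over time:

```python
def rt_designation(fresh_pct, num_reviews, num_top_critics):
    """Simplified designation per the thresholds described above."""
    if fresh_pct >= 75 and num_reviews >= 40 and num_top_critics >= 5:
        return "Certified Fresh"
    if fresh_pct < 60:
        return "Rotten"
    return "Fresh"

print(rt_designation(92, 300, 40))  # Certified Fresh
print(rt_designation(70, 150, 12))  # Fresh (in between the thresholds)
print(rt_designation(45, 80, 6))    # Rotten
```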

I have a lot of respect for film critics. All of the other movie recommender websites that I use rely on feedback from moviegoers after they’ve seen the movie. Movie critics form their opinion, most of the time, before the movie has been released to the general public. They don’t know whether it will be a blockbuster at the box office or a flop. They rely on their expertise without the benefit of feedback from the viewing public. In my next article, I’ll get into how effective Rotten Tomatoes has been in leading me to movies that I “really like”. For now, I’ll just say it’s amazing how often good film critics get it right. Two Thumbs Up!

***

Beginning with this article, I am going to attempt to keep a regular schedule for my posts: two a week, on Monday and Thursday. In addition, I plan to update my movie lists each Wednesday. Look for my next article, Rotten Tomatoes, IMDB and the Wisdom of Crowds, to be posted March 10th.