I love baseball, movies and analyzing data. Since analyzing baseball data is a well-traveled path and analyzing data from baseball movies is too narrow a path, I am left with the intersection of movies and data analysis. Specifically, I analyze data generated by five movie ratings websites: IMDB, Rotten Tomatoes, Netflix-DVD, Movielens, and Criticker. There are other sites I could include, and maybe will include in the future. For example, Metacritic is a fairly well-known website but, for now, I’ve chosen not to use it because it is similar to Rotten Tomatoes but with a smaller volume of movie ratings. I focused on these five sites because they are a good cross section of the methodologies I’ve come across for rating movies.
Over the past few years I have built and maintained a database of all of the movies I have watched in the last 15 years. As of January 31, 2016, my database contains 1,957 movies. For each movie, I have entered the ratings provided by IMDB and Rotten Tomatoes, as well as the personalized ratings generated by Netflix-DVD, Movielens, and Criticker. If you are unfamiliar with these sites, the links at the top of the page will get you to the home page for each site. IMDB and Rotten Tomatoes don’t require any work on your part to see ratings. Criticker and Movielens base their ratings on the ratings you provide for the movies you’ve seen, so their value as movie guides depends on some effort on your part. Netflix-DVD also bases its ratings on yours but has the additional requirement that you be a subscriber to its DVD service.
At about this point you are probably asking, “Why is he doing this?” Initially, I wanted to test which website was the best at leading me to movies that I’d “really like”. Instead, I ended up with an algorithm, using all five websites, that provides me with the probability that I will “really like” a particular movie. For example, the two movies nominated for this year’s Best Picture Oscar that I haven’t seen are Bridge of Spies and The Revenant. Based on my algorithm, there is a 98.1% chance that I will “really like” Bridge of Spies while there is only a 49.5% chance that I will “really like” The Revenant. I will watch both movies, but with the recognition that Bridge of Spies is close to a sure thing while The Revenant is a 50/50 proposition. Where these probabilities come from is a topic for another day.
I’ve been using some form of this algorithm to select the movies I watch over the last two years. A comparison of the last two years with the first two years of my study suggests I’ve been pretty successful. I watched 165 movies during the 1999 and 2000 calendar years and “really liked” 72 of them. Over the last two years, 2014 and 2015, I watched 182 movies and “really liked” 163. I’ve gone from “really liking” 44% of my movies in the first two years of my study to 90% over the last two years.
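For anyone who wants to check the arithmetic, here is a minimal sketch of how those two percentages fall out of the counts above. The function name `really_like_rate` is just a label I made up for illustration; it is not part of the algorithm itself.

```python
def really_like_rate(liked: int, watched: int) -> float:
    """Return the share of watched movies that were "really liked", as a percentage."""
    if watched <= 0:
        raise ValueError("watched must be a positive number of movies")
    return 100.0 * liked / watched


# Counts taken from the paragraph above.
early = really_like_rate(72, 165)    # 1999 and 2000
recent = really_like_rate(163, 182)  # 2014 and 2015

print(f"1999-2000: {early:.1f}% really liked")
print(f"2014-2015: {recent:.1f}% really liked")
```

Rounded to whole numbers, these are the 44% and 90% quoted above.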
I have a friend who, on occasion, will ask me what I thought of a particular movie. Mostly, I’ll tell him I “really liked” it. He then dismisses my recommendation by saying “but you like everything.” He’s right! I’ve reached the point where I “really like” 9 out of 10 movies I watch. It’s not, however, because I like everything. It’s because I’m able to identify those movies that I probably will “really like” and avoid watching those that I probably won’t like.
As to the question posed in today’s title, “Will I ‘Really Like’ this Blog?”, I’ll say this: if you frequently watch movies that you wish you hadn’t, you will “really like” this blog. You won’t have to build your own personal movie selection algorithm. You will, though, gain a better understanding of various movie websites and how they can help you pick a movie to watch that you will “really like”.