movie sales analysis.pptx
DESCRIPTION
This presentation analyzes movie ticket sales and the films released during a particular period, and uses that analysis to predict sales for future releases.
TRANSCRIPT
Movie Madness
Applied Regression Analysis
Flora Amores, Kristen Bierfeldt, Ben Bronfman, Jena Deng,
Jennie Goldstein, Christiana Kwon, Jonathan Ragins
Objective & Approach
Identify factors that influence ticket sales
Create a model to help studios predict gross ticket sales
Descriptive Statistics
- Description of independent and dependent variables
- Histogram analysis
- Distribution Test
- Correlation Analysis
Regression Analysis
- Best subset regression analysis: identify relevant independent variables
- Residual analysis
Forecasting
- Forecasting gross domestic ticket sales for 2014 releases
Dataset
Data Field | Description | Source
Movie Title | Name of the film | Box Office Mojo
Total Gross | Total amount of money the film made while it was in theaters | Box Office Mojo
Awareness Score | Public’s awareness of top-billed celebrity; 0-100 rating | IMDB; E-Score
Appeal Score | Appeal of top-billed celebrity; 0-100 rating | IMDB; E-Score
Rotten Tomatoes Rating | Film critic and writer reviews | Rotten Tomatoes
Production Budget | Amount the studio spent on production of the film (excl. marketing) | IMDB
Studio | CBS, Disney, FilmDistrict, Focus, Fox, Lionsgate, Open Road, Paramount, Relativity, Screen Gems, Sony, Summit, Universal, Warner Bros, Weinstein | IMDB
Season | Season the film premiered in: Winter, Spring, Summer, Fall | Box Office Mojo
Genre | Action/Adventure, Animated, Comedy, Drama, Horror, Sci-Fi/Fantasy, Suspense/Thriller, Romance, Documentary | Movies.com
Rating | G, PG, PG-13, R | Movies.com
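The categorical fields in this table (Studio, Season, Genre, Rating) have to be turned into 0/1 indicator columns before they can enter a regression. The following is a minimal Python sketch of that step, assuming the usual convention of leaving one level of each field out as the baseline. The helper name `dummy_code` and the choice of Winter as the baseline are illustrative assumptions; the full model output lists Fall, Spring, and Summer terms but no Winter term, which is consistent with that choice.

```python
# Illustrative sketch only: dummy_code is a hypothetical helper, not the authors' code.
SEASONS = ["Fall", "Spring", "Summer", "Winter"]


def dummy_code(value, levels, baseline):
    """Return 0/1 indicators for every level except the baseline level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {level: int(value == level) for level in levels if level != baseline}


# A summer release gets a 1 in the Summer column only.
print(dummy_code("Summer", SEASONS, baseline="Winter"))
# A winter release is the baseline, so every indicator is 0.
print(dummy_code("Winter", SEASONS, baseline="Winter"))
```

Leaving one level out avoids perfect collinearity with the intercept; each remaining coefficient is then read as a difference from the baseline level.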
Skewed right with long right tail
Even among the top films, very few achieve extreme success (>$300M)
Most of the top 100 earned closer to $75M, compared to the mean production budget of about $74M
Histograms – Revenue Minus Cost
Skewed right with long right tail
Many production budgets for top grossing films are over $40M (long right tail)
However, about 30% are made with a <$40M budget, indicating a massive budget is not necessary for commercial success
Fairly normal, but skewed left
Top grossing films tend to star actors with higher appeal scores
Histogram – Independent Variables
Does not appear normal
Current stars or rising stars tend to be more common in top films
Lower awareness more common in top grossing films
Distribution Test
Skewed right with long right tail
Even among the top films, only very few are extremely successful (>$300M)
Actual blockbusters are rare; most of the top 100 earn closer to $75M
This compares to an average production budget of about $75M, indicating limited profitability even among the most successful films
Full Model Output
Coefficients
Term | Coef | SE Coef | T-Value | P-Value | VIF
Constant | -96476280 | 92212802 | -1.05 | 0.299 |
Awareness Score | 31609 | 347267 | 0.09 | 0.928 | 1.76
Appeal Score | 1499880 | 842243 | 1.78 | 0.079 | 2.13
Rotten Tomatoes Rating (Tomato | 114213804 | 33548802 | 3.40 | 0.001 | 1.92
Production Budget | 0.887 | 0.184 | 4.82 | 0.000 | 3.51
CBS | -20713796 | 74706088 | -0.28 | 0.782 | 1.34
Disney | 37249040 | 40706931 | 0.92 | 0.363 | 3.29
FilmDistrict | 40710067 | 58480417 | 0.70 | 0.489 | 1.62
Focus | -80902310 | 57291818 | -1.41 | 0.163 | 1.56
Fox | -29008644 | 38769488 | -0.75 | 0.457 | 3.56
Lionsgate | 68249084 | 40197434 | 1.70 | 0.094 | 2.21
Open Road | 25477257 | 49761753 | 0.51 | 0.610 | 1.74
Paramount | 25497332 | 40556061 | 0.63 | 0.532 | 2.93
Relativity | 15707538 | 48630710 | 0.32 | 0.748 | 2.20
Screen Gems | 16569478 | 59756962 | 0.28 | 0.782 | 1.69
Sony | -14683522 | 37111355 | -0.40 | 0.694 | 3.77
Summit | -22185861 | 44280947 | -0.50 | 0.618 | 2.25
Universal | 24118387 | 37529997 | 0.64 | 0.523 | 4.35
Warner Bros. | 4413709 | 37927330 | 0.12 | 0.908 | 4.19
Fall | 9031477 | 24175518 | 0.37 | 0.710 | 2.35
Spring | 28732675 | 21196246 | 1.36 | 0.180 | 1.93
Summer | 10441133 | 20942975 | 0.50 | 0.620 | 2.27
Action/Adventure | -68845716 | 86491210 | -0.80 | 0.429 | 36.51
Animated | -14257945 | 77891882 | -0.18 | 0.855 | 14.38
Comedy | -16598316 | 84045022 | -0.20 | 0.844 | 29.34
Drama | -45802940 | 84873963 | -0.54 | 0.591 | 23.44
Horror | -16445679 | 85217610 | -0.19 | 0.848 | 11.44
Sci-Fi/Fantasy | -88908859 | 87255620 | -1.02 | 0.312 | 16.59
Suspense/Thriller | -35039393 | 92033894 | -0.38 | 0.705 | 5.97
Romance | -26475377 | 97313741 | -0.27 | 0.786 | 4.49
PG-13 | -4388273 | 17825515 | -0.25 | 0.806 | 1.92
PG | 14749424 | 41440261 | 0.36 | 0.723 | 5.30
G | -97902869 | 88703709 | -1.10 | 0.274 | 1.89
R2 = 63.4%
Adj. R2 = 45.9%
Several variables with extremely high VIF scores
Multiple insignificant variables
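To make the VIF scores concrete: by the standard definition, the variance inflation factor of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the other predictors. The short Python sketch below inverts that formula for a few VIF values copied from the coefficient table. The function name `r_squared_from_vif` is an illustrative choice, not something from the original analysis.

```python
def r_squared_from_vif(vif):
    """Invert VIF = 1 / (1 - R^2) to recover the auxiliary R^2."""
    if vif < 1:
        raise ValueError("a VIF is never below 1")
    return 1.0 - 1.0 / vif


# VIF values copied from the full model's coefficient table.
for term, vif in [("Action/Adventure", 36.51), ("Comedy", 29.34), ("Awareness Score", 1.76)]:
    share = r_squared_from_vif(vif)
    print(f"{term}: {share:.1%} of its variance is explained by the other predictors")
```

A VIF of 36.51 means roughly 97% of the Action/Adventure indicator is already explained by the other variables, which is why its coefficient is estimated so imprecisely.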
Correlation Matrix
Production budget, Rotten Tomatoes rating, and Disney studio have the highest correlations with the dependent variable
Action/Adventure and budget are highly correlated, likely due to special effects costs. Viewers also benefit the most from this genre’s in-theater experience
Independent variables have low correlation to each other (<40%)
Lower triangle of the matrix; the diagonal is omitted. Columns follow the same order as the rows: Total Gross, Awareness Score, Appeal Score, Rotten Tomatoes Rating (Tomato Meter), Production Budget, CBS, Disney, FilmDistrict, Focus, Fox, Lionsgate, Open Road, Paramount, Relativity, Screen Gems, Sony, Summit, Universal, Warner Bros., Weinstein, Fall, Spring, Summer, Winter, Action/Adventure, Animated, Comedy, Drama, Horror, Sci-Fi/Fantasy, Suspense/Thriller, Romance, Documentary, PG-13, PG, G
Total Gross
Awareness Score 12.75%
Appeal Score 30.39% 35.33%
Rotten Tomatoes Rating (Tomato Meter) 36.15% -2.26% 26.98%
Production Budget 60.26% 20.61% 21.12% 6.98%
CBS -4.24% 9.12% 11.91% -2.69% -7.04%
Disney 36.46% 4.45% 8.05% 7.22% 35.52% -3.16%
FilmDistrict -1.53% -13.55% -2.34% -5.98% -7.93% -1.44% -4.49%
Focus -12.15% -3.93% 10.50% 20.10% -12.46% -1.44% -4.49% -2.04%
Fox -3.86% 10.40% 8.61% 0.08% 7.33% -3.53% -11.06% -5.02% -5.02%
Lionsgate 2.30% -7.81% -5.65% -11.21% -15.93% -2.54% -7.95% -3.61% -3.61% -8.88%
Open Road -13.68% -18.47% -7.09% -13.54% -13.39% -1.77% -5.53% -2.51% -2.51% -6.18% -4.44%
Paramount 8.42% 11.73% 0.48% 3.47% 9.00% -2.96% -9.27% -4.21% -4.21% -10.37% -7.45% -5.19%
Relativity -12.48% -4.58% -1.50% -24.87% -13.21% -2.05% -6.42% -2.92% -2.92% -7.18% -5.16% -3.59% -6.02%
Screen Gems -11.07% -19.95% -8.12% -12.70% -6.30% -1.44% -4.49% -2.04% -2.04% -5.02% -3.61% -2.51% -4.21% -2.92%
Sony -7.85% 9.13% -7.66% 4.97% -4.33% -3.89% -12.16% -5.52% -5.52% -13.59% -9.77% -6.80% -11.40% -7.89% -5.52%
Summit -9.06% 0.05% 2.85% 4.21% -4.41% -2.31% -7.21% -3.28% -3.28% -8.07% -5.80% -4.03% -6.77% -4.68% -3.28% -8.87%
Universal -1.80% 6.14% 0.43% 1.08% -6.94% -4.22% -13.21% -6.00% -6.00% -14.77% -10.61% -7.39% -12.39% -8.57% -6.00% -16.24% -9.64%
Warner Bros. 12.74% -9.81% -4.27% 11.97% 19.56% -4.06% -12.69% -5.76% -5.76% -14.18% -10.19% -7.10% -11.90% -8.24% -5.76% -15.60% -9.26% -16.95%
Weinstein -11.71% 3.04% -0.87% -1.14% -16.98% -2.31% -7.21% -3.28% -3.28% -8.07% -5.80% -4.03% -6.77% -4.68% -3.28% -8.87% -5.26% -9.64% -9.26%
Fall 5.97% 3.23% 26.65% 26.38% -13.91% 19.49% 9.52% 10.17% 10.17% -10.28% -2.69% -9.07% -6.15% 14.53% 10.17% -5.33% -0.56% -1.03% -6.65% -0.56%
Spring 9.19% -3.71% -21.54% -11.25% 15.15% -5.49% -0.58% 9.17% -7.81% -4.03% -3.80% 4.32% 10.16% 0.97% -7.81% 0.07% -1.64% -2.99% 5.34% -1.64% -28.18%
Summer 2.74% -3.11% 11.00% -0.95% 12.23% -6.74% 1.59% -9.58% 5.87% 10.99% -7.83% -11.79% -11.80% -13.68% 5.87% 19.10% 4.46% 2.12% -2.12% -5.46% -34.56% -36.63%
Winter -17.48% 3.89% -15.88% -12.86% -14.70% -5.80% -10.09% -8.25% -8.25% 1.85% 14.59% 16.92% 8.51% 0.00% -8.25% -15.45% -2.65% 1.62% 3.33% 7.95% -29.77% -31.55% -38.70%
Action/Adventure 16.57% 27.07% 32.24% -3.22% 37.74% -6.27% 11.52% 7.00% -8.91% -0.57% -6.38% -10.97% 6.24% -1.36% -8.91% 2.38% 6.13% 23.70% -18.74% -14.31% 0.66% 8.26% 6.36% -15.43%
Animated 25.80% 3.23% 0.57% 0.80% 19.67% -3.53% 22.45% -5.02% -5.02% 18.28% -8.88% -6.18% -10.37% 9.13% -5.02% 5.42% -8.07% -5.82% -14.18% 6.60% 5.41% -4.03% 10.99% -12.92% -21.92%
Comedy -16.02% -1.72% -2.18% -18.14% -30.97% 18.92% -8.27% -7.59% 9.66% 4.47% 6.91% 4.81% 2.14% 1.48% -7.59% -6.17% 9.97% -8.79% -0.56% -1.11% -3.67% -11.82% 6.16% 8.36% -33.12% -18.67%
Drama -8.25% 7.34% 0.22% 31.11% -22.34% -4.39% -4.19% -6.23% 13.25% -6.63% 11.95% 8.31% -2.82% -8.91% -6.23% -0.65% -10.01% -18.33% 13.84% 27.53% 4.29% -4.41% -17.46% 18.90% -27.22% -15.34% -23.18%
Horror -10.06% -29.61% -23.51% -3.22% -24.96% -2.76% -8.63% 24.08% -3.92% -9.65% 9.57% -4.82% -8.09% -5.60% 24.08% 1.05% -6.29% 10.43% 0.23% -6.29% 5.10% -5.68% -1.44% 2.26% -17.11% -9.65% -14.57% -11.97%
Sci-Fi/Fantasy -1.48% -23.31% -8.75% -8.56% 26.92% -3.35% -10.48% -4.76% -4.76% -1.07% -8.42% 13.68% 14.74% -6.80% 19.05% -12.89% -7.65% 4.67% 15.37% -7.65% -9.00% 13.47% -0.72% -3.85% -20.79% -11.72% -17.70% -14.55% -9.15%
Suspense/Thriller -4.80% 8.55% -0.77% 2.79% -7.84% -1.77% -5.53% -2.51% -2.51% -6.18% -4.44% -3.09% -5.19% -3.59% -2.51% 10.63% 22.86% -7.39% 9.80% -4.03% 5.33% 18.25% -11.79% -10.15% -10.97% -6.18% -9.34% -7.68% -4.82% -5.86%
Romance -8.57% -8.30% -11.33% -0.60% -10.55% -1.44% -4.49% -2.04% -2.04% -5.02% -3.61% -2.51% -4.21% 33.53% -2.04% -5.52% -3.28% -6.00% 14.82% -3.28% -7.37% -7.81% -9.58% 24.74% -8.91% -5.02% -7.59% -6.23% -3.92% -4.76% -2.51%
Documentary -8.29% -8.92% -35.07% 2.98% -9.80% -1.01% -3.16% -1.44% -1.44% -3.53% -2.54% -1.77% -2.96% -2.05% -1.44% 26.00% -2.31% -4.22% -4.06% -2.31% -5.18% -5.49% 14.99% -5.80% -6.27% -3.53% -5.34% -4.39% -2.76% -3.35% -1.77% -1.44%
PG-13 10.84% 10.61% 1.73% -3.37% 25.33% 10.46% 4.76% 0.57% -13.73% -20.98% 17.87% -5.16% 1.18% -9.40% 0.57% -7.38% 14.69% -6.73% 13.15% 5.51% -0.39% 14.08% -3.81% -9.25% 15.87% -33.78% -2.71% 7.21% -10.67% 14.68% 6.57% 0.57% -9.66%
PG 14.74% -3.57% -13.42% -5.35% 13.80% -4.22% 16.15% -6.00% -6.00% 38.94% -10.61% -7.39% -12.39% 5.72% -6.00% 8.74% -9.64% -9.80% -16.95% 3.21% -1.03% -2.99% 8.17% -4.85% -13.72% 74.74% -15.55% -18.33% -11.53% -4.67% -7.39% -6.00% 23.92% -40.36%
G 19.41% 7.07% 3.78% 9.03% 30.09% -1.01% 31.96% -1.44% -1.44% -3.53% -2.54% -1.77% -2.96% -2.05% -1.44% -3.89% -2.31% -4.22% -4.06% -2.31% -5.18% -5.49% 14.99% -5.80% -6.27% 28.59% -5.34% -4.39% -2.76% -3.35% -1.77% -1.44% -1.01% -9.66% -4.22%
R -26.27% -9.86% 7.39% 5.62% -42.86% -7.54% -23.59% 4.17% 19.05% -6.39% -10.18% 11.24% 8.60% 5.95% 4.17% 1.98% -7.65% 15.17% -0.24% -7.65% 2.25% -11.29% -5.23% 14.43% -5.01% -26.37% 15.49% 7.05% 20.25% -11.11% -0.98% 4.17% -7.54% -72.06% -31.51% -7.54%
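Each entry in the matrix is a pairwise correlation between two columns of the dataset. As a minimal sketch of how one such entry could be computed, assuming the Pearson coefficient was used (the slides do not name the method), the Python below works for both numeric fields and 0/1 dummy variables. The sample numbers are hypothetical and are not taken from the film dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values in $M, for illustration only.
budget = [20, 50, 80, 150, 200]
gross = [40, 90, 70, 300, 250]
is_animated = [0, 0, 1, 0, 1]  # a 0/1 dummy is handled the same way

print(f"budget vs gross:   {pearson(budget, gross):.2%}")
print(f"animated vs gross: {pearson(is_animated, gross):.2%}")
```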
Reduced Model - Method
Since Minitab handles only 31 independent variables, we ran a Best Subset after excluding the studios that had the lowest correlation with the dependent variable
- The Best Subset model had 13 variables, which were then used to run a regression
After running the regression, we continued to remove insignificant variables from the model until we judged that the best reduced model had been reached
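The second step, repeatedly dropping the weakest variable and refitting, can be sketched in Python. This is backward elimination by t statistic written from scratch, not Minitab's Best Subsets procedure and not the authors' exact steps, which relied on judgment. The function names, the synthetic data, and the cutoff `t_min=1.66` (roughly the two-sided 10% critical value at about 93 residual degrees of freedom) are all assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols_t_stats(design, y):
    """t statistic of every column in the design matrix (intercept column included)."""
    n, k = len(design), len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    resid = [y[i] - sum(design[i][j] * beta[j] for j in range(k)) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)
    # Diagonal of (X'X)^-1, one unit-vector solve per coefficient.
    inv_diag = [solve(xtx, [float(r == j) for r in range(k)])[j] for j in range(k)]
    return [beta[j] / math.sqrt(s2 * inv_diag[j]) for j in range(k)]


def backward_eliminate(names, X, y, t_min=1.66):
    """Drop the predictor with the smallest |t| until every remaining |t| >= t_min."""
    keep = list(range(len(names)))
    while keep:
        design = [[1.0] + [row[j] for j in keep] for row in X]
        t = ols_t_stats(design, y)[1:]  # skip the intercept
        worst = min(range(len(keep)), key=lambda i: abs(t[i]))
        if abs(t[worst]) >= t_min:
            break
        keep.pop(worst)
    return [names[j] for j in keep]


# Synthetic data: only the first column actually drives y.
random.seed(0)
names = ["signal", "noise_a", "noise_b"]
X = [[random.gauss(0, 1) for _ in names] for _ in range(100)]
y = [10 + 5 * row[0] + random.gauss(0, 1) for row in X]
print(backward_eliminate(names, X, y))
```

Refitting after every removal matters because the t statistics of the remaining variables change once a correlated variable leaves the model.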
Reduced Model - Output
R2 = 53%
Adj. R2 = 50%
All variables statistically significant at the ~90% level
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.73
R Square | 0.53
Adjusted R Square | 0.50
Standard Error | 61,965,779.77
Observations | 100
ANOVA
Source | df | SS | MS | F | Significance F
Regression | 6 | 398,577,185,982,704,000 | 66,429,530,997,117,400 | 17.30 | 0.00
Residual | 93 | 357,097,481,180,966,000 | 3,839,757,862,160,930 | |
Total | 99 | 755,674,667,163,671,000 | | |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0%
Intercept | (65,694,307.46) | 35,415,935.25 | (1.85) | 0.07 | (136,023,336) | 4,634,721 | (136,023,336) | 4,634,721
Appeal Score | 964,102.15 | 592,906.32 | 1.63 | 0.11 | (213,292) | 2,141,497 | (213,292) | 2,141,497
Rotten Tomatoes Rating (Tomato Meter) | 101,543,955.22 | 24,678,272.79 | 4.11 | 0.00 | 52,537,796 | 150,550,114 | 52,537,796 | 150,550,114
Production Budget | 0.62 | 0.11 | 5.90 | 0.00 | 0.41 | 0.83 | 0.41 | 0.83
Animated | 35,240,004.48 | 20,513,077.46 | 1.72 | 0.09 | (5,494,902) | 75,974,911 | (5,494,902) | 75,974,911
Disney | 40,129,621.38 | 23,533,412.34 | 1.71 | 0.09 | (6,603,072) | 86,862,314 | (6,603,072) | 86,862,314
Focus | (78,441,939.30) | 45,822,939.55 | (1.71) | 0.09 | (169,437,216) | 12,553,337 | (169,437,216) | 12,553,337
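The regression statistics follow directly from the ANOVA sums of squares and degrees of freedom. The Python sketch below recomputes them from the ANOVA figures using the standard formulas; the variable names are illustrative.

```python
import math

# Figures copied from the ANOVA table of the reduced model.
ss_regression = 398_577_185_982_704_000
ss_residual = 357_097_481_180_966_000
ss_total = 755_674_667_163_671_000
df_regression, df_residual, df_total = 6, 93, 99

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_square = ss_regression / ss_total                      # share of variance explained
adj_r_square = 1 - ms_residual / (ss_total / df_total)   # penalized for the 6 predictors
f_stat = ms_regression / ms_residual                     # overall significance test
standard_error = math.sqrt(ms_residual)                  # typical size of a residual, in dollars

print(f"R Square          {r_square:.2f}")
print(f"Adjusted R Square {adj_r_square:.2f}")
print(f"F                 {f_stat:.2f}")
print(f"Standard Error    {standard_error:,.2f}")
```

The standard error of roughly $62M is large relative to the typical gross of the films in the sample, which is worth keeping in mind when reading any single prediction from the model.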
Predictive Equation: Total Gross = -65,694,307 + 964,102 × Appeal Score + 101,543,955 × Rotten Tomatoes Rating + 0.62 × Production Budget + 35,240,004 × Animated + 40,129,621 × Disney - 78,441,939 × Focus
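As a sketch of how the predictive equation would be applied, the Python below uses the coefficients from the reduced model's coefficient table, rounded to whole dollars. It assumes the Rotten Tomatoes rating is entered as a fraction between 0 and 1, which the size of its coefficient suggests but the slides do not state. The function name and the example film are hypothetical.

```python
# Coefficients from the reduced model's coefficient table, rounded to whole dollars.
COEFFICIENTS = {
    "intercept": -65_694_307,
    "appeal": 964_102,
    "rotten_tomatoes": 101_543_955,
    "budget": 0.62,
    "animated": 35_240_004,
    "disney": 40_129_621,
    "focus": -78_441_939,
}


def predict_total_gross(appeal, rotten_tomatoes, budget, animated=0, disney=0, focus=0):
    """Predicted total gross in dollars.

    appeal: 0-100 appeal score of the top-billed celebrity
    rotten_tomatoes: Tomato Meter as a fraction in [0, 1] (assumed scale)
    budget: production budget in dollars
    animated, disney, focus: 0/1 indicator variables
    """
    c = COEFFICIENTS
    return (
        c["intercept"]
        + c["appeal"] * appeal
        + c["rotten_tomatoes"] * rotten_tomatoes
        + c["budget"] * budget
        + c["animated"] * animated
        + c["disney"] * disney
        + c["focus"] * focus
    )


# Hypothetical film: appeal score 60, 80% Tomato Meter, $100M budget, no indicators set.
print(f"${predict_total_gross(60, 0.80, 100_000_000):,.0f}")
```

For this hypothetical film the model predicts a gross of about $135M.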
Residuals
Our assumption of linearity does not hold
Heteroscedasticity is present
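Heteroscedasticity means the spread of the residuals changes with the fitted value. One crude way to check this numerically, in the spirit of a Goldfeld-Quandt comparison but not a formal test, is to compare the residual variance for the highest and lowest fitted values. The Python sketch below does this on synthetic data; the function name `spread_ratio` and the data are assumptions for illustration and do not reproduce the authors' residual plots.

```python
import random
import statistics


def spread_ratio(fitted, residuals):
    """Residual variance in the top third of fitted values divided by the
    variance in the bottom third. Values far from 1 suggest unequal spread."""
    pairs = sorted(zip(fitted, residuals))
    third = len(pairs) // 3
    low = [r for _, r in pairs[:third]]
    high = [r for _, r in pairs[-third:]]
    return statistics.pvariance(high) / statistics.pvariance(low)


# Synthetic data, not the film dataset: the noise grows with the fitted value.
random.seed(1)
fitted = [float(i) for i in range(1, 151)]
residuals = [random.gauss(0, 0.1 * f) for f in fitted]

print(f"variance ratio (top third / bottom third): {spread_ratio(fitted, residuals):.1f}")
```

A ratio well above 1, as in this synthetic case, matches the fan shape usually seen in a residual plot when higher predicted values come with larger errors.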
How Did We Do?
Using our final model, we predicted ticket sales for movies released and closed in 2014
Conclusions
Model indicates that Rotten Tomatoes Rating and Production Budget are significant contributors to Total Gross variability
Studio has little impact on the variability of Total Gross, with the exception of Disney (positive impact) and Focus (negative impact)
- Disney: People go to see a movie because it’s a Disney movie, which drives up ticket sales
- Focus: This indie film studio has smaller distribution and awareness
Ticket sales are difficult to predict
Moviegoers are irrational
Action/Adventure films have high production budgets but do not consistently generate high ticket sales. The genre is more prone to outliers in the form of blockbusters
QUESTIONS?
Appendix
Additional Data
In addition to the previous data, we also aggregated the following data that we did not include in our analysis:
Data Field | Description | Reason Not Included
Theaters (Opening) | Total number of theaters that showed the movie on its opening weekend | Several outliers that had limited openings
Opening | Film’s gross on its opening weekend | Opening gross is not necessarily representative of total movie performance, especially for limited openings
Theaters | Total number of theaters that ever showed the movie | Minimal variance within the data
Open | Date of opening | Opening date reflected within “Seasons” variable
Close | Date of closing | Closing date inherently reflected within total gross numbers