Question 1 – Investment Portfolio (12 marks)
Consider the daily percent change in the stock price of two companies, A and B, in an investment portfolio. The dataset is called Investment Portfolio.
Answer the following questions manually. Use statistical software or MS Excel for help with the computation of any summary statistics needed for manual computations.
a) Draw a scatterplot of the company A daily percent changes against the company B daily percent changes. Describe the relationship between daily percent changes that you see in this scatterplot.
b) Determine the regression equation to predict the daily percent change in the stock price of company A from the daily percent change in the stock price of company B. Interpret the value of the slope coefficient.
c) Find the correlation between the percent changes. Does the correlation value support your description of the scatterplot in part a)?
d) Compute the corresponding coefficient of determination and interpret its value. In financial terms, it represents the proportion of non-diversifiable risk in company A.
e) Compute the 95% confidence interval for the slope coefficient.
f) Test at the 5% significance level whether the slope coefficient is significantly different from 1, representing the beta of a highly diversified portfolio. Don’t forget to show your computations.
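The manual computations in parts b) to f) can be sketched as follows. This is only an illustration under stated assumptions: the function name simple_regression, the five data points, and the critical value are hypothetical and do not come from the Investment Portfolio dataset. The sketch uses the standard summary-statistic formulas for the slope, intercept, correlation, coefficient of determination, and standard error of the slope.

```python
import math

def simple_regression(x, y):
    """Least-squares fit of y on x from summary statistics (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    # Residual sum of squares; clamp at zero to guard against rounding
    sse = max(syy - slope * sxy, 0.0)
    s_e = math.sqrt(sse / (n - 2))      # residual standard error
    se_slope = s_e / math.sqrt(sxx)     # standard error of the slope
    return {"slope": slope, "intercept": intercept, "r": r,
            "r_squared": r ** 2, "se_slope": se_slope, "df": n - 2}

def slope_interval(slope, se_slope, t_crit):
    """Confidence interval for the slope; t_crit is read from a t table."""
    return slope - t_crit * se_slope, slope + t_crit * se_slope

def slope_t_statistic(slope, se_slope, hypothesized=1.0):
    """t statistic for H0: slope = hypothesized (here 1, as in part f)."""
    return (slope - hypothesized) / se_slope

# Illustrative data only: x plays the role of company B, y of company A.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fit = simple_regression(x, y)
# 3.182 is the two-sided 5% critical value of t with 3 degrees of freedom.
low, high = slope_interval(fit["slope"], fit["se_slope"], t_crit=3.182)
t_stat = slope_t_statistic(fit["slope"], fit["se_slope"], hypothesized=1.0)
print(fit, (low, high), t_stat)
```

With the real data, the degrees of freedom are n - 2 for the actual sample size, so the critical value has to be looked up again.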
Question 2 – Location Analysis (38 marks)
Location analysis is an important decision in operations management of production and service industries. A critical decision for many organizations is where to locate a processing plant, warehouse, retail outlet, etc. A large number of business variables are typically considered in this decision problem.
The management of a large motel/inn chain is aware of the challenges in choosing new motel locations. To make this type of decision, the chain’s management uses the “operating margin,” which is the ratio of the sum of profit, depreciation, and interest expenses to total revenue. In general, the higher the “operating margin,” the greater the success of the motel/inn.
The chain’s management has collected data on 100 of its current inns, selected at random. The objective is to use the “operating margin” to predict which sites are likely to generate more profit. Below is a description of the different variables considered in this analysis.
Variable: Description
Location ID Number: Location identifier
Operating Margin: Operating margin, in percent
Number: Number of motel, inn, and hotel rooms within 5 miles
Nearest: Number of miles to the closest competitor
Enrollment: Enrollment (in thousands) at nearby colleges and universities
Income: Average household income (in thousands) of the neighborhood
Distance: Distance from downtown
Quality: The quality of the service level of the location (1 = bad, 2 = average, 3 = good, 4 = excellent)
High Speed Internet: High speed internet availability (1 = no, 2 = yes)
Gym: Gym availability (1 = no, 2 = yes)
The dataset is called Location Analysis.
Part 1 (10 marks)
Using Minitab or any other statistical software, run a simple linear regression model to predict Operating Margin based on Distance and answer the following questions:
a) Using an appropriate graph, plot Operating Margin versus Distance and comment on the relationship between these two variables.
b) Write down your estimation of the regression equation for predicting Operating Margin from Distance. Draw the regression line on the plot in part a).
c) Assuming α = 0.01, test whether Distance has statistically significant predictive power in estimating Operating Margin. State the hypotheses, provide a test statistic and p-value, and state your conclusion. Show your calculations.
d) Interpret the values of the regression coefficients (slope and intercept).
Part 2 (6 marks)
Using Minitab or any other statistical software, now perform a multiple linear regression analysis of Operating Margin (response variable) against all the remaining variables as predictors, excluding Location ID Number.
a) Write down the regression equation and provide at least two summary measures of the fit of the model. Based on the summary measures, does the model provide a good fit for the data? Explain.
b) Plot the residuals against the fitted values and comment on whether the usual model conditions are met.
c) The variable Operating Margin New in the dataset is the Operating Margin variable with some values recorded as missing. Identify those values, explain what they are, and explain why they were recorded as missing.
Part 3 (12 marks)
Using statistical software, run the same multiple linear regression model as in Part 2 above but this time using Operating Margin New as the response variable. Then, answer the following questions:
a) Briefly compare the resulting regression equation and fit with those obtained in Part 2.
b) Plot the residuals against the fitted values and comment on whether the model complies with the usual conditions for multiple linear regression.
c) Provide an interpretation for the model intercept and for the regression coefficients associated with variables Income and Distance. Is an interpretation of the model intercept appropriate in this case? Compare the value of the regression coefficient for Distance with the one obtained in Part 1 above and clearly explain any difference.
d) Do you see any justification for dropping any variable(s) from the model? Explain (hint: multicollinearity; the significance of predictors).
e) Run a final model using Operating Margin New as the response variable and including only the significant predictors (hint: those with a p-value ≤ 5%).
f) Test the overall significance of the final model in part e). Use a 1% significance level and follow all the steps for hypothesis testing indicated in the Instructions section.
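The overall significance test in part f) uses an F statistic, which can be computed from the coefficient of determination of the fitted model. The sketch below is a minimal illustration; the function name overall_f_statistic and the example inputs are hypothetical and are not results from the Location Analysis dataset.

```python
def overall_f_statistic(r_squared, n, k):
    """F statistic for H0: all k slope coefficients equal zero.

    r_squared: coefficient of determination of the fitted model
    n: number of observations actually used in the fit
    k: number of predictors (excluding the intercept)
    """
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    df_model = k
    df_error = n - k - 1
    if df_error <= 0:
        raise ValueError("need more observations than parameters")
    # F = (explained variation per predictor) / (unexplained variation per error df)
    f_stat = (r_squared / df_model) / ((1 - r_squared) / df_error)
    return f_stat, df_model, df_error

# Hypothetical inputs for illustration only.
f_stat, df1, df2 = overall_f_statistic(r_squared=0.5, n=100, k=4)
print(f_stat, df1, df2)
```

The resulting statistic is compared with the critical value of the F distribution with (k, n - k - 1) degrees of freedom at the chosen significance level. Note that n should count only the rows used in the fit, which matters when the response has missing values.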
Part 4 (10 marks)
Based on your final model in Part 3 above, answer the following questions:
a) Test the marginal contribution of Quality, assuming that the other variables in the model remain constant. Use a 1% significance level, and make sure you follow all the steps for hypothesis testing indicated in the Instructions section. Show the computation of the t-statistic (i.e., the ratio used to compute it).
b) Calculate the 99% prediction interval for the actual operating margin of a new location with the same characteristics as those for Location ID Number 3098 in the data file. Check if the prediction interval includes the actual operating margin associated with Location ID Number 3098 and explain why it does or does not.
c) Calculate the 99% confidence interval for the mean operating margin of a new location with the same characteristics as those for Location ID Number 3098 in the data file. Explain any difference between the size of this interval and the one in part b) above.
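The difference in width between the intervals in parts b) and c) comes from one extra term under the square root. The sketch below illustrates this under stated assumptions: the function name interval_half_widths and all numeric inputs are hypothetical, and the leverage term x0'(X'X)^-1 x0 for the location of interest is assumed to be obtained from statistical software.

```python
import math

def interval_half_widths(s, leverage, t_crit):
    """Half-widths of the confidence and prediction intervals at one point.

    s: residual standard error of the fitted model
    leverage: x0' (X'X)^-1 x0 for the predictor values of interest
    t_crit: two-sided t critical value with the model's error df
    """
    # Mean response: uncertainty in the estimated regression surface only
    ci_half = t_crit * s * math.sqrt(leverage)
    # New observation: adds the variability of an individual outcome
    pi_half = t_crit * s * math.sqrt(1 + leverage)
    return ci_half, pi_half

# Hypothetical inputs for illustration only.
ci_half, pi_half = interval_half_widths(s=2.0, leverage=0.25, t_crit=2.0)
print(ci_half, pi_half)
```

Both intervals are centered on the same fitted value; the prediction interval is always the wider of the two because of the added 1 under the square root.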