ISE529 Predictive Analytics
Homework 5

2024 Fall


Due by: Nov. 20, 2024, 11:59 PM

Instructions:

1.   Print your first and last name and NetID on your answer sheets.

2.   Submit all your answers, including Python scripts and report, in a single Jupyter Lab file (.ipynb), or along with a single PDF, to Brightspace by the due date. No other file formats will be graded. No late submissions will be accepted.

3.   There are 3 problems in total. Total points: 100.

1. (30 points)

Predict the per capita crime rate in the Boston.csv data set. Split the data set into 70% for a training set and 30% for a test set. Fit a lasso model, a ridge regression model, and a PCR model. Use cross-validation to determine λ and M (the number of PCs). Present the test error and discuss the results for the approaches that you consider.
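The workflow described above (split, choose the tuning parameter by cross-validation, then score on the test set) can be sketched roughly as follows. This is a minimal sketch assuming scikit-learn, not a prescribed solution: the data are a synthetic stand-in rather than Boston.csv, and the λ grid, the 5 folds, and the random seeds are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import LassoCV, LinearRegression, RidgeCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption). For the homework, X and y would be
# read from Boston.csv, with the per capita crime rate column as y.
X, y = make_regression(n_samples=300, n_features=12, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)  # 70% train / 30% test

# Candidate λ values (scikit-learn names this parameter "alpha").
alphas = np.logspace(-3, 3, 50)

models = {
    # Lasso and ridge: λ chosen by 5-fold cross-validation on the training set.
    "lasso": Pipeline([("scale", StandardScaler()),
                       ("reg", LassoCV(alphas=alphas, cv=5, max_iter=10000))]),
    "ridge": Pipeline([("scale", StandardScaler()),
                       ("reg", RidgeCV(alphas=alphas, cv=5))]),
    # PCR: PCA followed by least squares; M chosen by 5-fold cross-validation.
    "pcr": GridSearchCV(
        Pipeline([("scale", StandardScaler()),
                  ("pca", PCA()),
                  ("reg", LinearRegression())]),
        param_grid={"pca__n_components": list(range(1, X.shape[1] + 1))},
        scoring="neg_mean_squared_error",
        cv=5),
}

test_mse = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_mse[name] = mean_squared_error(y_test, model.predict(X_test))
    print(f"{name}: test MSE = {test_mse[name]:.2f}")

print("lasso λ:", models["lasso"].named_steps["reg"].alpha_)
print("ridge λ:", models["ridge"].named_steps["reg"].alpha_)
print("PCR M:", models["pcr"].best_params_["pca__n_components"])
```

Placing the scaler inside each pipeline means the standardization is learned from the training data only, so the test set does not leak into the fit.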

2. (30 points)

Predict the number of applications received using the other variables in the College.csv data set. Split the data set into 60% for a training set and 40% for a test set.

(a) Fit a ridge regression model on the training set, with λ chosen by cross-validation. Report the test error obtained.

(b) Fit a lasso model on the training set, with λ chosen by cross-validation. Report the test error obtained, along with the number of non-zero coefficient estimates.

(c) Fit a PLS model on the training set, with M chosen by cross-validation. Report the test error obtained, along with the value of M selected by cross-validation.

3. (40 points)

Use the following code to generate a data set with n = 500 and p = 2, such that the observations belong to two classes with a quadratic decision boundary between them.
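The generating code referred to above is not reproduced in this copy of the assignment. Purely as an illustration, and not as the course's own code, data of the described shape could be generated as in the sketch below. The uniform predictors, the labeling rule X1² − X2² > 0, and the seed are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for this sketch
n = 500

# Two predictors, uniform on [-0.5, 0.5) (assumed distribution).
X1 = rng.uniform(size=n) - 0.5
X2 = rng.uniform(size=n) - 0.5

# Assumed labeling rule: class 1 where X1^2 > X2^2, so the decision
# boundary X1^2 - X2^2 = 0 is quadratic in the predictors.
y = (X1**2 - X2**2 > 0).astype(int)

X = np.column_stack([X1, X2])    # shape (500, 2): n = 500, p = 2
print(X.shape, np.bincount(y))
```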

 

(a) Plot the observations, colored according to their class labels. Your plot should display X1 on the x-axis and X2 on the y-axis.

(b) Fit a logistic regression model to the data using X1, X2, X1², X2², and X1×X2 as predictors. Obtain a class prediction for each training observation (using the full data set). Plot the observations, colored according to the predicted class labels.

(c) Fit an SVM using a non-linear kernel (polynomial with d>1 or RBF kernel) to the data. Obtain a class prediction for each training observation (using the full data set). Plot the observations, colored according to the predicted class labels.

(d) Comment on your results.
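The two fits in (b) and (c) can be sketched as below, assuming scikit-learn. The data-generation lines are a stand-in (an assumption, since the course's generating code is not shown here), and the values of C and gamma are illustrative rather than tuned.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Stand-in data (assumption): two uniform predictors, quadratic class boundary.
rng = np.random.default_rng(0)
X1 = rng.uniform(size=500) - 0.5
X2 = rng.uniform(size=500) - 0.5
y = (X1**2 - X2**2 > 0).astype(int)

# (b) Logistic regression on X1, X2, X1^2, X2^2 and X1*X2.
# A large C keeps scikit-learn's default L2 penalty weak.
X_quad = np.column_stack([X1, X2, X1**2, X2**2, X1 * X2])
logit = LogisticRegression(C=1e4, max_iter=5000).fit(X_quad, y)
pred_logit = logit.predict(X_quad)

# (c) SVM with an RBF kernel on the two raw predictors.
X = np.column_stack([X1, X2])
svm = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X, y)
pred_svm = svm.predict(X)

# Training accuracy of each model on the full data set.
acc_logit = float(np.mean(pred_logit == y))
acc_svm = float(np.mean(pred_svm == y))
print(f"logistic (quadratic terms): {acc_logit:.3f}")
print(f"SVM (RBF kernel):           {acc_svm:.3f}")
```

The predicted labels can then be plotted with, for example, matplotlib's `plt.scatter(X1, X2, c=pred_svm)`, which colors each observation by its predicted class.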
