CS 179: Introduction to Graphical Models

Homework 6
The submission for this homework should be a single PDF file containing all of the relevant code, figures, and any
text explaining your results. When coding your answers, try to write functions to encapsulate and reuse code,
instead of copying and pasting the same code multiple times. This will not only reduce your programming efforts,
but also make it easier for us to understand and give credit for your work. Show and explain the reasoning
behind your work!
In this homework, we will use Pyro’s stochastic variational inference procedure to explore a simple model for
collaborative filtering; specifically, to predict movie ratings on a (very small) subset of the MovieLens dataset.
For more examples of variational inference in Pyro, please see the course demos on Pyro (VI) and Bayesian
linear regression.
Part 1: Loading the data (20 points)
First, load the training data from 179-hw6-train.csv . The first line of the file is text, listing the names of
M = 10 movies; the remaining lines are comma-separated integers between zero and nine, or NaN (not-a-number),
indicating the value was not observed. The training data file contains a subset of the ratings of N = 200 users
(one line per user). Store the training data in a 200x10 numpy array called Xtr . You may find numpy ’s loadtxt
function helpful.
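As a sketch, numpy’s loadtxt with skiprows=1 (to skip the header of movie names) and delimiter="," handles both the text header and the NaN entries; the load_ratings wrapper name below is my own, not part of the assignment:

```python
import numpy as np

def load_ratings(path):
    """Load the ratings matrix: skip the one header line of movie names,
    then parse comma-separated values; 'NaN' entries become np.nan."""
    return np.loadtxt(path, delimiter=",", skiprows=1)

# Usage with the assignment's file:
# Xtr = load_ratings("179-hw6-train.csv")   # expect shape (200, 10)
```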
Part 2: Model Set-up (30 points)
We will use a Bayesian variant of the SVD low-rank decomposition, or latent space representation, for collaborative
filtering described in CS 178. If you haven’t seen this and want to read more about it, see the CS 178 course notes,
parts 10 (clustering) and 11 (latent space models).
Our model associates a K-dimensional “position” with each user n and each movie m, denoted Un and Vm,
respectively. Then, the degree to which user n likes movie m is a function of their dot product,
Un · Vm^T = Σ_k Unk Vmk.
This will try to place similarly rated movies in a similar direction from the origin, with users that like those movies
placed in the same direction, so that the dot product will be large.
A small modification helps the model in practice. Since some users are more positive than others, and some
movies are more widely acknowledged as good or bad, we may want to instead predict “relative” preference. For
this homework, we will do this very simply, by estimating and subtracting the mean over the movies:
movie_avg = np.nanmean(Xtr, axis=0, keepdims=True)  # ignore NaN when averaging
Xtr -= movie_avg
and then do the same thing for the average over the users (estimate the mean over axis 1, and subtract from X .)
(For more sophistication, we could add these as variables in the model and reason over them as well.)
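Both centering steps can be wrapped into one helper; a minimal sketch on a toy array (the center_ratings name is mine):

```python
import numpy as np

def center_ratings(X):
    """Subtract the per-movie (column) means, then the per-user (row) means,
    ignoring NaN entries in both averages."""
    X = X.astype(float).copy()
    movie_avg = np.nanmean(X, axis=0, keepdims=True)  # shape (1, M)
    X -= movie_avg
    user_avg = np.nanmean(X, axis=1, keepdims=True)   # shape (N, 1)
    X -= user_avg
    return X, movie_avg, user_avg
```

Keep the returned averages around: they must be added back when converting predicted “relative” preferences into actual ratings.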
We place simple Gaussian N(0, 1) priors on all the unknown values: the entries in both matrices U, V. Then,
we express the probability of our observations as Gaussian with fixed standard deviation:

rnm ∼ N( Un · Vm^T , 0.1 ).
To do so, define your model(...) function to generate a collection of M two-dimensional Gaussian random
variables (one for each movie). Your model will be most efficient if you use Pyro’s plate for indexing, which
informs Pyro that the indexed elements are conditionally independent from one another; see the course Pyro demo
for an example. Similarly, define a plate and a list of 2D Gaussian random variables, Un, for each user n.
Finally, the evidence of our model (the training data) is what makes our model interesting, but un-normalized
and thus difficult to reason about / sample from. Your model(...) function should accept a data set (along
with any other parameters you need) and add all the passed (non-missing / non-NaN) observations to the model.
Part 3: Variational Posterior (30 points)
Next, we define the guide(...) function. The guide function should take exactly the same parameters as the
model(...) function, and define exactly the same random variables, except for those whose values are observed.
We will define our guide to be a product of independent, 2D Gaussians, i.e., each Vm and Un are independent in
q(·). (Note: they are of course not independent in the true posterior!) More explicitly, define Pyro parameters
for the mean and covariance of each Vm and Un, and then a sample statement to define each variable itself.
Define an optimizer and an SVI inference engine as in the Bayesian regression example. Then, optimize over
the parameters of your q(·) by iteratively calling svi.step . This process can be quite slow; I recommend that you:
(1) Display the current means and variances of your movies, i.e., q(Vm), after every few iterations. You can use
the provided Gaussian plot function at the end of the homework, passing the means and covariances along with
colors if desired. I also prefer to clear the plot outputs each time; again see the Bayesian L.R. demo for an example.
(2) Call step using a sub-sample of the full data. This can be a collection of ≈20–50 randomly selected
ratings, or a subsample of ≈5–10 users’ ratings, depending on how you’ve implemented things. This should
speed up each iteration and help your model converge more quickly.
