MATH5905 - Statistical Inference
Assignment 1 Solutions
Problem One
i) It is known that for independent Poisson distributed random variables X1 ~ Poisson(λ1) and X2 ~ Poisson(λ2) it holds that X1 + X2 ~ Poisson(λ1 + λ2). Show that if Xi ~ Poisson(λi), i = 1, 2, . . . , k, are independent, then the conditional distribution of X1 given X1 + X2 + · · · + Xk is Binomial, and determine the parameters of this Binomial distribution.
ii) Suppose that X and Y are components of a continuous random vector with density

f_{X,Y}(x, y) = c x y^2,  0 < x < y, 0 < y < 2

(and zero else). Here c is a normalizing constant.
a) Show that c = 5/16.
b) Find the marginal density f_X(x) and the cdf F_X(x).
c) Find the marginal density f_Y(y) and the cdf F_Y(y).
d) Find the conditional density f_{Y|X}(y|x).
e) Find the conditional expected value a(x) = E(Y|X = x).
Make sure that you show your working and do not forget to always specify the support of the
respective distribution.
Solution: i) Let us denote Y = X1 + X2 + · · · + Xk. We are looking at P(X1 = x | Y = y), and we know that Y ~ Poisson(λ1 + · · · + λk). Here y is any fixed value 0, 1, 2, . . . and, for a given y, x can be 0, 1, . . . , y. Using the definition of conditional distribution we have, for any x in the support S_X:

f_{X1|Y}(x|y) = P(X1 = x | X1 + · · · + Xk = y)
            = P(X1 = x, X1 + · · · + Xk = y) / P(X1 + · · · + Xk = y)
            = P(X1 = x, X2 + · · · + Xk = y − x) / P(X1 + · · · + Xk = y).
As X2 + · · · + Xk ~ Poisson(λ2 + · · · + λk), this can be continued as:

[e^{−λ1} λ1^x / x!] · [e^{−(λ2+···+λk)} (λ2 + · · · + λk)^{y−x} / (y − x)!] / [e^{−(λ1+λ2+···+λk)} (λ1 + · · · + λk)^y / y!]
and after cancellation we get it equal to

[y! / (x! (y − x)!)] · (λ1 / (λ1 + · · · + λk))^x · (1 − λ1 / (λ1 + · · · + λk))^{y−x}

for any x = 0, 1, . . . , y (and zero else). This is precisely the Binomial(y, λ1/(λ1 + · · · + λk)) distribution.
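As a numerical sanity check (a Python sketch, not part of the required working; the rates λ1, λ2, λ3 below are arbitrary illustrative values), the conditional pmf built from the Poisson pmfs can be compared term by term with the Binomial pmf:

```python
# Check: for independent X_i ~ Poisson(lambda_i),
# P(X1 = x | X1 + ... + Xk = y) equals the Binomial(y, lambda_1 / sum(lambda)) pmf.
from math import exp, factorial, comb, isclose

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lams = [1.5, 0.7, 2.3]          # lambda_1, lambda_2, lambda_3 (hypothetical values)
total = sum(lams)
y = 6                            # any fixed value of X1 + ... + Xk

for x in range(y + 1):
    # P(X1 = x, X2 + ... + Xk = y - x) / P(X1 + ... + Xk = y)
    cond = (poisson_pmf(x, lams[0]) * poisson_pmf(y - x, total - lams[0])
            / poisson_pmf(y, total))
    p = lams[0] / total
    binom = comb(y, x) * p**x * (1 - p)**(y - x)
    assert isclose(cond, binom), (x, cond, binom)
print("conditional pmf matches Binomial(y, lambda_1/(lambda_1+...+lambda_k))")
```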
ii) a) The support of the density is the triangular region ∆ with vertices (0, 0), (0, 2) and (2, 2). Now it is easy to see that

∫∫_∆ c x y^2 dx dy = ∫_0^2 ∫_x^2 c x y^2 dy dx = 48c/15 = 16c/5.

As this integral must be equal to 1, we see that c = 5/16 must hold.
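The value of c can be checked with a crude midpoint Riemann sum over the triangle (a numerical sketch only; the grid size n is an arbitrary choice):

```python
# Integrate c*x*y^2 over the triangle 0 < x < y < 2 with a midpoint Riemann sum;
# with c = 5/16 the result should be close to 1.
n = 1000
c = 5 / 16
h = 2 / n
total = 0.0
for i in range(n):
    x = (i + 0.5) * h
    for j in range(n):
        y = (j + 0.5) * h
        if x < y:                      # restrict to the support 0 < x < y < 2
            total += c * x * y**2 * h * h
print(total)   # close to 1
```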
b) f_X(x) = ∫_x^2 f_{X,Y}(x, y) dy = (5/16) ∫_x^2 x y^2 dy = (5/48)(8x − x^4) for x ∈ (0, 2) (and zero else). For the cdf we get via integration F_X(x) = (5/48)(4x^2 − x^5/5), 0 < x < 2 (and, of course, F_X(x) = 0 when x < 0, F_X(x) = 1 when x > 2).
MATH5905, T1 2023 Assignment One Solutions Statistical Inference
c) f_Y(y) = ∫_0^y f_{X,Y}(x, y) dx = ∫_0^y (5/16) x y^2 dx = (5/32) y^4 for y ∈ (0, 2) (and zero else). For the cdf we get via integration: F_Y(y) = y^5/32, 0 < y < 2 (and, of course, F_Y(y) = 0 when y < 0, F_Y(y) = 1 when y > 2).
d) For each fixed x ∈ (0, 2) we have

f_{Y|X}(y|x) = f_{X,Y}(x, y) / f_X(x) = [(5/16) x y^2] / [(5/48)(8x − x^4)] = 3y^2 / (8 − x^3)

if y ∈ (x, 2) (and zero else).
e)

E(Y|X = x) = ∫_x^2 y f_{Y|X}(y|x) dy = ∫_x^2 y · 3y^2/(8 − x^3) dy = (3/4) · (16 − x^4)/(8 − x^3)

for every x ∈ (0, 2).
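Parts d) and e) can be checked numerically at any fixed x (a sketch; x = 0.7 below is an arbitrary test point): the conditional density should integrate to 1 over (x, 2), and its mean should match the closed form above.

```python
# Numerical check of f_{Y|X}(y|x) = 3y^2/(8 - x^3) on (x, 2) and of E(Y|X = x).
from math import isclose

x = 0.7                               # any point in (0, 2)
n = 200_000
h = (2 - x) / n
mass = mean = 0.0
for j in range(n):                    # midpoint rule over (x, 2)
    y = x + (j + 0.5) * h
    f = 3 * y**2 / (8 - x**3)
    mass += f * h
    mean += y * f * h
assert isclose(mass, 1.0, abs_tol=1e-6)
assert isclose(mean, 0.75 * (16 - x**4) / (8 - x**3), abs_tol=1e-6)
print("conditional density integrates to 1 and matches E(Y|X=x)")
```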
Problem 2
Let X and Y be independent uniformly distributed in (0, 1) random variables. Further, let
U(X,Y ) = X + Y, V (X,Y ) = Y −X
be a transformation.
a) Sketch the support S_(X,Y) of the random vector (X, Y) in R^2.
b) Sketch the support S_(U,V) of the random vector (U, V) in R^2.
c) Determine the Jacobian of the transformation.
d) Determine the density of the random vector (U, V).
Justify each step.
Solution: The joint density of X and Y has support on the unit square in R^2 with vertices (0, 0), (0, 1), (1, 1), (1, 0). Denote this region of support by S_(X,Y). It can be seen that the transformed vector (U, V) has support on the square with vertices (0, 0), (1, −1), (2, 0), (1, 1). Denote this region of support by S_(U,V). Sketches are provided below.

Using indicators we can write

f_{X,Y}(x, y) = I_(0,1)(x) I_(0,1)(y).

From the defined transformation, solving the system with respect to x and y we get

x = (u − v)/2,  y = (u + v)/2

(as long as (u, v) is in the support S_(U,V)). The Jacobian of this transformation is equal to 1/2. Hence the density f_(U,V)(u, v) = 1/2 when (u, v) ∈ S_(U,V) (and zero else).
[Figure: sketches of the supports S_(X,Y) and S_(U,V)]
Problem 3
You are going to the races and want to decide whether or not to bet on the horse Thunderbolt. You
want to apply decision theory to make a decision. You use the information from two independent horse-
racing experts. Data X represents the number of experts recommending you to bet on Thunderbolt
(due, of course, to their belief that this horse will win the race).
If you decide not to bet and Thunderbolt does not win, or if you bet and Thunderbolt wins the race, nothing is lost. If Thunderbolt does not win and you have decided to bet on him, your subjective judgment is that your loss would be four times higher than the cost of not betting when Thunderbolt does win (as you will have missed other opportunities to invest your money).
You have investigated the history of correct winning bets for the two horse-racing experts and it is as follows. When Thunderbolt has been a winner, each expert has correctly predicted his win with probability 5/6 (and a loss with probability 1/6). When Thunderbolt has not won a race, each expert has predicted a win for him with probability 3/5. You listen to both experts and make your decision based on the data X.
a) There are two possible actions in the action space A = {a0, a1} where action a0 is to bet and action
a1 is not to bet. There are two states of nature Θ = {θ0, θ1} labelled symbolically as 0 and 1,
respectively, where θ0 = 0 represents “Thunderbolt winning” and θ1 = 1 represents “Thunderbolt
not winning”. Define the appropriate loss function L(θ, a) for this problem.
b) Compute the probability mass function (pmf) for X under both states of nature.
c) The complete list of all the non-randomized decision rules D based on x is given by:
d1 d2 d3 d4 d5 d6 d7 d8
x = 0 a0 a1 a0 a1 a0 a1 a0 a1
x = 1 a0 a0 a1 a1 a0 a0 a1 a1
x = 2 a0 a0 a0 a0 a1 a1 a1 a1
For the set of non-randomized decision rules D compute the corresponding risk points.
d) Find the minimax rule(s) among the non-randomized rules in D.
e) Sketch the risk set of all randomized rules D̄ generated by the set of rules in D. You might want to use R (or your favorite programming language) to make this sketch more precise.
f) Suppose there are two decision rules d and d′. The decision rule d strictly dominates d′ if R(θ, d) ≤ R(θ, d′) for all values of θ and R(θ, d) < R(θ, d′) for at least one value of θ. Hence, given a choice between d and d′, we would always prefer to use d. Any decision rule which is strictly dominated by another decision rule (as d′ is in the above) is said to be inadmissible. Correspondingly, if a decision rule
d is not strictly dominated by any other decision rule then it is admissible. Show on the risk plot
the set of randomized decisions rules that correspond to the admissible decision rules.
g) Find the risk point of the minimax rule in the set of randomized decision rules D̄ and determine its minimax risk. Compare the two minimax risks of the minimax decision rule in D and in D̄. Comment.
h) Define the minimax rule in the set D̄ in terms of rules in D.
i) For which prior on {θ0, θ1} is the minimax rule in the set D̄ also a Bayes rule?
j) Prior to listening to the two experts, you believe that Thunderbolt will win the race with proba-
bility 1/2. Find the Bayes rule and the Bayes risk with respect to your prior.
k) For a small positive ϵ = 0.1, illustrate on the risk set the risk points of all rules which are ϵ-minimax.
Solution: a) There are two actions: a0, decide to bet on Thunderbolt, and a1, do not bet. There are two states of nature: θ0 = 0, "Thunderbolt wins the race", and θ1 = 1, "Thunderbolt loses the race". Let x be the number of experts predicting that Thunderbolt will win. The loss function is given by

L(θ1, a0) = 4, L(θ1, a1) = 0, L(θ0, a0) = 0, L(θ0, a1) = 1.
b) The pmf for both states of nature:

x    p(x|θ = 0)                 p(x|θ = 1)
0    (1/6)·(1/6) = 1/36         (2/5)·(2/5) = 4/25
1    2·(1/6)·(5/6) = 10/36      2·(3/5)·(2/5) = 12/25
2    (5/6)·(5/6) = 25/36        (3/5)·(3/5) = 9/25
c) There are 2^3 = 8 non-randomized decision rules:
d1 d2 d3 d4 d5 d6 d7 d8
x = 0 a0 a1 a0 a1 a0 a1 a0 a1
x = 1 a0 a0 a1 a1 a0 a0 a1 a1
x = 2 a0 a0 a0 a0 a1 a1 a1 a1
Calculation of the risk points (R(θ0, di), R(θ1, di)) is as follows:

For d1, with risk point (0, 4):

R(θ0, d1) = L(θ0, a0)P(x = 0|θ0) + L(θ0, a0)P(x = 1|θ0) + L(θ0, a0)P(x = 2|θ0) = 0, as L(θ0, a0) = 0.

R(θ1, d1) = L(θ1, a0)P(x = 0|θ1) + L(θ1, a0)P(x = 1|θ1) + L(θ1, a0)P(x = 2|θ1) = 4, as L(θ1, a0) = 4.
For d2, with risk point (1/36, 84/25):

R(θ0, d2) = L(θ0, a1)P(x = 0|θ0) + L(θ0, a0)P(x = 1|θ0) + L(θ0, a0)P(x = 2|θ0) = 1 × 1/36 + 0 × 10/36 + 0 × 25/36 = 1/36

R(θ1, d2) = L(θ1, a1)P(x = 0|θ1) + L(θ1, a0)P(x = 1|θ1) + L(θ1, a0)P(x = 2|θ1) = 0 × 4/25 + 4 × 12/25 + 4 × 9/25 = 84/25
For d3, with risk point (5/18, 52/25):

R(θ0, d3) = L(θ0, a0)P(x = 0|θ0) + L(θ0, a1)P(x = 1|θ0) + L(θ0, a0)P(x = 2|θ0) = 0 × 1/36 + 1 × 10/36 + 0 × 25/36 = 5/18
R(θ1, d3) = L(θ1, a0)P(x = 0|θ1) + L(θ1, a1)P(x = 1|θ1) + L(θ1, a0)P(x = 2|θ1) = 4 × 4/25 + 0 × 12/25 + 4 × 9/25 = 52/25
For d4, with risk point (11/36, 36/25):

R(θ0, d4) = L(θ0, a1)P(x = 0|θ0) + L(θ0, a1)P(x = 1|θ0) + L(θ0, a0)P(x = 2|θ0) = 1 × 1/36 + 1 × 10/36 + 0 × 25/36 = 11/36

R(θ1, d4) = L(θ1, a1)P(x = 0|θ1) + L(θ1, a1)P(x = 1|θ1) + L(θ1, a0)P(x = 2|θ1) = 0 × 4/25 + 0 × 12/25 + 4 × 9/25 = 36/25
For d5, with risk point (25/36, 64/25):

R(θ0, d5) = L(θ0, a0)P(x = 0|θ0) + L(θ0, a0)P(x = 1|θ0) + L(θ0, a1)P(x = 2|θ0) = 0 × 1/36 + 0 × 10/36 + 1 × 25/36 = 25/36

R(θ1, d5) = L(θ1, a0)P(x = 0|θ1) + L(θ1, a0)P(x = 1|θ1) + L(θ1, a1)P(x = 2|θ1) = 4 × 4/25 + 4 × 12/25 + 0 × 9/25 = 64/25
For d6, with risk point (26/36, 48/25):

R(θ0, d6) = L(θ0, a1)P(x = 0|θ0) + L(θ0, a0)P(x = 1|θ0) + L(θ0, a1)P(x = 2|θ0) = 1 × 1/36 + 0 × 10/36 + 1 × 25/36 = 26/36

R(θ1, d6) = L(θ1, a1)P(x = 0|θ1) + L(θ1, a0)P(x = 1|θ1) + L(θ1, a1)P(x = 2|θ1) = 0 × 4/25 + 4 × 12/25 + 0 × 9/25 = 48/25
For d7, with risk point (35/36, 16/25):

R(θ0, d7) = L(θ0, a0)P(x = 0|θ0) + L(θ0, a1)P(x = 1|θ0) + L(θ0, a1)P(x = 2|θ0) = 0 × 1/36 + 1 × 10/36 + 1 × 25/36 = 35/36

R(θ1, d7) = L(θ1, a0)P(x = 0|θ1) + L(θ1, a1)P(x = 1|θ1) + L(θ1, a1)P(x = 2|θ1) = 4 × 4/25 + 0 × 12/25 + 0 × 9/25 = 16/25
For d8, with risk point (1, 0):

R(θ0, d8) = L(θ0, a1)P(x = 0|θ0) + L(θ0, a1)P(x = 1|θ0) + L(θ0, a1)P(x = 2|θ0) = 1, as L(θ0, a1) = 1.

R(θ1, d8) = L(θ1, a1)P(x = 0|θ1) + L(θ1, a1)P(x = 1|θ1) + L(θ1, a1)P(x = 2|θ1) = 0, as L(θ1, a1) = 0.
This leads to the following risk points:

             d1    d2      d3      d4      d5      d6      d7      d8
R(θ0, di)    0     1/36    5/18    11/36   25/36   26/36   35/36   1
R(θ1, di)    4     84/25   52/25   36/25   64/25   48/25   16/25   0
d) For each non-randomized decision rule we need to compute sup_{θ∈{θ0,θ1}} R(θ, di):

                          d1   d2      d3      d4      d5      d6      d7      d8
sup_{θ∈{θ0,θ1}} R(θ, di)  4    84/25   52/25   36/25   64/25   48/25   35/36   1

Hence inf_{d∈D} sup_{θ∈Θ} R(θ, d) is attained at d7, which is therefore the minimax decision rule in the set D, with a minimax risk of 35/36.
e) Sketch of the set D̄ of randomized rules generated by the set of non-randomized decision rules D: see the attached graph of the set.

f) The admissible rules are those on the "south-west boundary" of the risk set: any convex combination of d8 and d4, of d4 and d2, or of d2 and d1. The randomized decision rules that correspond to admissible rules are colored in blue on the graph.
g) The minimax decision rule in the set D̄ is given by the intersection of the line y = x with the south-west boundary of the risk set. The line d8d4 passes through the risk points (1, 0) and (11/36, 36/25) and has equation y = −(1296/625)(x − 1).