Tutorial & Lab Sheet 1

Tutorial Problems

There are no tutorial problems in Week 1.

Computer Problems

We will be using the statistical software R for all analysis. Basic use of R is assumed knowledge in this course. If you are not familiar with R, you should learn it quickly; check the resource page in Canvas to learn or refresh. For Q1-Q3, you will treat R as a calculator to get the quantity you want. For Q4, you will get started with literate programming using R Markdown. For Q5, you should try to produce an R Markdown report yourself. Note that assignments will be based on your R Markdown output, so be sure to learn how to produce that output in your computer class.

Question 1 (Assumed knowledge)
Use R to find the following probabilities:
(a) P(Z > −0.785), Z ∼ N(0, 1), with pnorm(q, lower.tail = FALSE).
(b) P(t_2 ≥ −1.26), with pt(q, df, lower.tail = FALSE).
(c) P(χ²_4 < 4.7), with pchisq(q, df).
(d) P(|t_9| > 1.85), with pf(q^2, df1, df2, lower.tail = FALSE), after thinking about how the t and the F distributions relate to each other.

Question 2 (Assumed knowledge)
Use qnorm, qt, qchisq, and qf to find c in the following:
(a) P(t_4 ≥ c) = .995 with qt,
(b) P(|Z| ≤ c) = 1/11 with both qnorm and qchisq,
(c) P(F_{3,12} ≤ c) = .90 with qf.

Question 3 (Assumed knowledge)
A machine produces metal pieces that are cylindrical in shape. A sample of 8 pieces is taken and the diameters are 1.01, 0.97, 0.39, 1.03, 1.04, 0.99, 0.98, 0.99.
(a) Construct a box plot of this data set. (Hint: use x <- c(1.01,..., 0.99) and boxplot().)
(b) Estimate the average diameter, µ, produced by the machine. Estimate the standard error (= s/√n) of your estimate. (Hint: use mean(x), s = sd(x), and n = length(x).)
(c) Assuming that the diameter can be modelled by a normal distribution, calculate a 98% confidence interval for µ. (Hint: t.test(x, mu = 1, conf.level = 0.98) also gives you the answer to (d).)
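As a rough illustration of what t.test() does behind the scenes in (c), the sketch below builds a 98% t-interval by hand from the standard error and a qt() quantile, then compares it with the interval reported by t.test(). The vector y and the null value mu = 5 are hypothetical and chosen only for illustration; they are not the diameters from Question 3.

```r
# Hypothetical measurements for illustration only (NOT the Question 3 data).
y <- c(5.1, 4.9, 5.3, 5.0, 4.8, 5.2)

n    <- length(y)
ybar <- mean(y)
se   <- sd(y) / sqrt(n)                 # estimated standard error s / sqrt(n)

conf  <- 0.98
alpha <- 1 - conf
tcrit <- qt(1 - alpha / 2, df = n - 1)  # upper alpha/2 quantile of t with n - 1 df

# Interval computed by hand: ybar -/+ tcrit * se
ci_manual <- c(ybar - tcrit * se, ybar + tcrit * se)

# t.test() reports the same interval and also tests H0: mu = 5 (assumed value)
fit <- t.test(y, mu = 5, conf.level = conf)

print(ci_manual)
print(fit$conf.int)
print(fit$p.value)
```

The two intervals should agree. The two-sided test rejects H0 at level α exactly when the hypothesised value lies outside the (1 − α) confidence interval, which is why a single t.test() call can answer both an interval question and a testing question.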
(d) Would you reject the hypothesis H0 : µ = 1.00 at significance level α = .02 on the basis of these data?

Question 4
Go to RStudio > File > New File > R Markdown. Click ‘OK’ to get a pre-filled R Markdown file. Click the Knit button in the toolbar just under the file names and examine the output. Play around with the file and knit it again to understand how it works. Where did the data cars and pressure come from?

Question 5
In this question, you will attempt to write your own reproducible report analysing the lengths of time of passages of play from ten international rugby matches involving the “All Blacks”. The data are available as rugby.txt (like all other course data, from Canvas under Unit Schedule & Materials). This exercise helps you to digest part of Lectures 1-2 and to revisit assumed knowledge on R and graphical displays. To get started, you may like to modify the R Markdown file from Q4.
(a) Load the tidyverse, which loads a collection of R packages including ggplot2 and dplyr.
    library(tidyverse)
(b) Download the data and read it into R, storing it as a data frame called rugby. You can use the command below, but you will need to make sure that the file you have downloaded is in the right path. Make sure you master reading data into R.
    rugby <- read.table("rugby.txt", header = TRUE)
(c) Look at the data frame by simply typing its name, rugby, into an R chunk and compiling the pdf with Knit. You should see that the data frame has two columns. Scroll up to see that these columns are headed Game and Time respectively. (These headings were read in from the text file, rugby.txt; R was alerted to the presence of these headings by the header = TRUE argument in the read.table command.) The variable Game identifies the match (labelled A, B, ..., K) and the variable Time contains the times of passages of play, in seconds.
(d) In reports it is often preferable to show only the first couple of lines of a data frame.
Try the following:
    rugby[1:3, ]
    head(rugby)
    head(rugby, 2)
(e) Type rugby$Game into an R chunk and press Knit.
(f) The variable Game is categorical (a factor, in R's terminology). You can get the frequency of each category with table(rugby$Game). Which game had the most separate passages of play? Which had the fewest? You can use the help function to learn more: try help(table) or, equivalently, ?table in the R console.
(g) We can display the data using a bar plot. You can produce a bar plot with