PSYC20008
Lab and Statistics Resource
Calculating the Chi-Squared Statistic
A resource guide
PSYC20008 | Calculating Chi-squared Page 2 of 15
About this Guide
This guide has been developed to help students of PSYC20008 Developmental Psychology to conduct a chi-squared test of
independence in JASP. This guide is not for publication outside of the subject. Please do not share this guide on third party
websites (e.g., Chegg, Coursehero, StudentVIP, Studocu, and many more).
The guide offers step-by-step instructions to produce a contingency table with observed counts and expected counts, the chi-
squared statistic, degrees of freedom, and p-value, and calculate the standardised residuals. For guidance on what all
these numbers mean, please refer to the notes for PSYC20008 Lecture 4: Calculating Chi-Squared.
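As a rough illustration of how these numbers relate to one another, the following Python sketch computes them by hand for a 3 × 3 table. The counts are hypothetical (they are not from the PSYC20008 dataset), and the function names are made up for this example. Two assumptions to note: the residuals use the Pearson form, (observed − expected) / √expected, which may differ from the definition used in JASP or in Lecture 4; and the p-value helper relies on a closed form that only works when the degrees of freedom are even (as they are for a 3 × 3 table, where df = 4).

```python
import math

# Hypothetical observed counts (NOT from the PSYC20008 dataset).
# Rows: mid/late childhood profile; columns: adolescence profile.
observed = [
    [30, 15, 5],
    [12, 25, 13],
    [8, 10, 32],
]


def chi_squared_independence(table):
    """Return expected counts, chi-squared, df, and Pearson residuals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count = (row total x column total) / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Chi-squared = sum over all cells of (observed - expected)^2 / expected
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # df = (number of rows - 1) x (number of columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Pearson residual = (observed - expected) / sqrt(expected)
    # (assumption: JASP or the lecture may use an adjusted version)
    residuals = [
        [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    return expected, chi2, df, residuals


def p_value_even_df(chi2, df):
    """Upper-tail p-value of the chi-squared distribution, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive, even df")
    half = chi2 / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


expected, chi2, df, residuals = chi_squared_independence(observed)
p = p_value_even_df(chi2, df)
print(f"chi-squared({df}) = {chi2:.2f}, p = {p:.3g}")
```

A positive residual marks a cell with more participants than expected by chance, and a negative residual marks a cell with fewer. JASP reports all of these values for you; the sketch is only meant to show where they come from.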
The example used throughout this guide is a chi-squared test of independence for the first hypothesis of the lab report, which
tests the association between two categorical variables, among a specific subset (group) in the sample:
Variable 1: Mid/late childhood psychosocial crisis (three crisis profiles [3 categories]: industry, balanced LC*, inferiority)
Variable 2: Adolescence psychosocial crisis (three crisis profiles [3 categories]: identity, balanced Ad*, role confusion)
Subset of the sample: Participants who identify as adolescents.
Putting all of this together, the first hypothesis is:
Among those who self-identify as adolescents, there will be a statistically significant association between the psychosocial
crises of mid/late childhood and adolescence. More/Fewer participants with ________ crisis profiles will have ______ crisis
profiles than expected by chance. (Students can change the underlined text to match their prediction of what the data will
show).
By following this guide, students will have the statistics they need to address this first hypothesis and an understanding of what
those numbers mean. After completing these steps to address the first hypothesis, students can follow the steps a second
time, making amendments as they require, to examine their second hypothesis. The second hypothesis will be developed
according to students’ interests, using any subset of the data to investigate the association between any two variables and
predicting outcomes for respective levels within those variables.
About the survey responses
During the 5-week window of data collection (starting two weeks before the semester, and closing in Week 3), the survey was
accessed 1,105 times. We removed 344 cases from the data file, including duplicate cases, cases with
missing data, cases that were completed too quickly to be considered genuine (i.e., survey duration was less
than 3 minutes), and “test” or “pilot” cases created by the teaching team.
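The exclusion rules above can be sketched as a simple filter. This is an illustration only, not the teaching team's actual cleaning script: the field names (`participant_key`, `duration_minutes`, `is_test`) are hypothetical, and how duplicates were really identified is not stated in this guide.

```python
MIN_DURATION_MINUTES = 3  # cut-off stated in the guide


def clean_cases(cases):
    """Keep one copy of each complete, genuine case (hypothetical fields)."""
    seen_keys = set()
    kept = []
    for case in cases:
        # Drop cases with any missing value
        if any(value is None for value in case.values()):
            continue
        # Drop "test" or "pilot" cases
        if case["is_test"]:
            continue
        # Drop cases completed in less than 3 minutes
        if case["duration_minutes"] < MIN_DURATION_MINUTES:
            continue
        # Drop duplicates, keeping the first occurrence
        if case["participant_key"] in seen_keys:
            continue
        seen_keys.add(case["participant_key"])
        kept.append(case)
    return kept


# Made-up example cases
cases = [
    {"participant_key": "a", "duration_minutes": 8.0, "is_test": False},
    {"participant_key": "a", "duration_minutes": 7.5, "is_test": False},
    {"participant_key": "b", "duration_minutes": 2.0, "is_test": False},
    {"participant_key": "c", "duration_minutes": None, "is_test": False},
    {"participant_key": "d", "duration_minutes": 6.0, "is_test": True},
    {"participant_key": "e", "duration_minutes": 5.0, "is_test": False},
]
print([case["participant_key"] for case in clean_cases(cases)])
```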
On average, participants took approximately 8 minutes to complete the survey (SD = 3.2 minutes), with a Median of 7 minutes
and a Mode of 5 minutes.
Two variables in the dataset (Wellbeing, and Age) could only be created once we had the complete dataset and knew the
distribution of their respective continuous variables.
• Wellbeing: Participants’ wellbeing scores were calculated from the WEMWBS (Tennant et al., 2007). The scores were
normally distributed with the Median wellbeing score (43) and Mean score (M = 43, SD = 8.86) both slightly higher
than the scale’s midpoint (42). We used these midpoints to create a categorical variable (“Wellbeing (categorical)”)
with three levels of measurement:
o “Lower wellbeing” (scores ranging from 14 to 37);
o “Middle wellbeing” (scores ranging from 38 to 46); and
o “Higher wellbeing” (scores ranging from 47 to 70).
• Age: The age distribution for the sample was positively skewed, with the Mode (19) being slightly lower than the Median
(20) and Mean (21.25, SD = 5.26). We used this distribution to create a categorical variable (“Age (categorical)”) with
three levels of measurement:
o “17 to 19” (ages ranging from 17 to 19);
o “20 to 21” (ages ranging from 20 to 21); and
o “22 to 57” (ages ranging from 22 to 57).
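To make the recoding concrete, here is a minimal Python sketch of how a continuous wellbeing score could be mapped onto the three wellbeing categories listed above. The function name is hypothetical; the cut-offs are the ones stated in this section. The same pattern would apply to the age bands.

```python
def wellbeing_category(score):
    """Map a WEMWBS total score (14 to 70) onto the three wellbeing levels."""
    if not 14 <= score <= 70:
        raise ValueError("WEMWBS scores in this guide range from 14 to 70")
    if score <= 37:
        return "Lower wellbeing"
    if score <= 46:
        return "Middle wellbeing"
    return "Higher wellbeing"


# Example: scores at the boundaries of each band
for score in (14, 37, 38, 46, 47, 70):
    print(score, wellbeing_category(score))
```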
* NOTE: Balanced LC = Balanced late childhood profile, Balanced Ad = Balanced adolescence profile
Ahead of the Analysis
Obtaining the CSV file
1. Navigate to the Week 5 Activities page (or Week 4 Activities page) on the PSYC20008 Canvas site.
2. Download the document “2023 PSYC20008 lab report data.csv”
3. Save the file somewhere that you can access it again (e.g., save it to a USB drive).
Obtaining JASP
4. Navigate to https://jasp-stats.org.
5. Click the orange button that says “Download JASP”.
a) Click the orange button that is most appropriate for your device: Windows, macOS, or Linux.
b) While the file is downloading, check out the links to the “Getting Started” and “How to Use JASP” pages. Note that the
“How to Use JASP” page links to the YouTube video for “Contingency Tables”. If you are new to JASP, you can
return to these resources at any time.
c) Once the file has been downloaded, double click on the file to run the installation software. Follow the prompts to
install JASP.
d) Once the setup wizard has finished, click Finish to close the wizard and launch JASP.
Familiarising yourself with JASP
When JASP opens, you will see a screen similar to Figure 1.
Figure 1. The JASP home screen.
Opening the dataset in JASP
6. To open your dataset:
a) Click the menu button (three blue lines) in the top-left-hand corner of the window.
b) Select Open.
c) Select Computer. This is shown in Figure 2.
d) Select Browse. Navigate to the .csv file of the dataset that you saved earlier.
Figure 2. Opening a dataset in JASP.
JASP should now display the spreadsheet, similar to Figure 3.
Figure 3. The spreadsheet in JASP.
NOTE: When viewing the spreadsheet in JASP, you cannot adjust the cell values, delete cases, or create new variables. Most
students will not need to do these things. If you want to edit the spreadsheet, follow the steps in the final section of this
guide, called “Opening the dataset in Excel”.
Understanding your columns and rows
7. The columns represent the variables. Read through the columns and check that you know their meaning:
• Response ID: Each participant has a separate identifying number. This information is valuable when checking
details about your file.
• Age (Continuous): The age of the participant, in years. This is the only continuous variable in the dataset.
In this form, it cannot be used as a variable in the chi-squared test; however, it can be used as
a filter or to create other categorical age variables (in Excel).
• Age (Categorical): The age of participants, transformed into a categorical variable with three levels of
measurement: “17 to 19”, “20 to 22”, and “23 to 57”.
• Life Period: A categorical variable recording the self-identified life period of each participant. This variable has five
levels of measurement: Adolescence, Young Adulthood, Mid Adulthood, Late Adulthood, and Other.
• Gender: A categorical variable recording the self-nominated gender of each participant. This variable has five
levels of measurement: Male, Female, Non-binary, Genderqueer, and Prefer not to say.
• Enrolment Status: A categorical variable recording each participant’s enrolment status. This variable has two levels
of measurement: Domestic student and International student.
• Course: A categorical variable recording each participant’s course. This variable has three levels of measurement:
Bachelor degree, Graduate Diploma of Psychology, and Other Course.
• Early Childhood Crisis: A categorical variable recording how each participant has resolved (or is resolving) their
early childhood psychosocial crisis of “Initiative & Guilt”. For more detail about this crisis, revisit Lecture 2 or the
Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Initiative, Balanced EC*, and Guilt.
• Mid/Late Childhood Crisis: A categorical variable recording how each participant has resolved (or is resolving)
their mid/late childhood psychosocial crisis of “Industry & Inferiority”. For more detail about this crisis, revisit
Lecture 2 or the Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Industry,
Balanced LC*, and Inferiority.
• Adolescence Crisis: A categorical variable recording how each participant has resolved (or is resolving) their
adolescence psychosocial crisis of “Identity & Role Confusion”. For more detail about this crisis, revisit Lecture 2 or
the Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Identity, Balanced Ad*, and
Role confusion.
• Young Adulthood Crisis: A categorical variable recording how each participant has resolved (or is resolving) their
young adulthood psychosocial crisis of “Intimacy & Isolation”. For more detail about this crisis, revisit Lecture 2 or
the Hoffnung et al. (2019) textbook. This variable has three levels of measurement: Intimacy, Balanced YA*, and
Isolation.
• Wellbeing (categorical): A categorical variable reporting each participant’s level of psychological wellbeing,
relative to the wellbeing of the rest of the sample. This variable has three levels of measurement: Higher
Wellbeing, Middle Wellbeing, and Lower Wellbeing.
* NOTE: Balanced EC = Balanced early childhood profile, Balanced LC = Balanced late childhood profile, Balanced Ad =
Balanced adolescence profile, Balanced YA = Balanced young adulthood profile
8. Each row represents a different participant in the sample. Scroll down the dataset until you reach the bottom case. The
row number of that final case tells you how many participants are in the sample.
a) Record the total number of participants in the space below:
The total number of participants in the sample is:
______________________________________________
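If you would like to cross-check the column names and the participant count outside JASP, a short Python sketch such as the one below could be used. It is optional and not part of the JASP workflow. The function name is hypothetical, and the small in-memory sample stands in for the real file; its values are made up.

```python
import csv
import io


def summarise_csv(lines):
    """Return (column names, number of data rows) for CSV text.

    `lines` can be an open text file or any iterable of text lines.
    """
    reader = csv.DictReader(lines)
    n_rows = sum(1 for _ in reader)  # each data row is one participant
    return reader.fieldnames, n_rows


# Tiny made-up sample standing in for the real dataset
sample = io.StringIO(
    "Response ID,Life Period,Gender\n"
    "1,Adolescence,Female\n"
    "2,Young Adulthood,Male\n"
)
columns, n_participants = summarise_csv(sample)
print(columns)
print(n_participants)
```

To run it on the real dataset, pass an open file instead of the sample, for example `open("2023 PSYC20008 lab report data.csv", newline="")`, assuming the file is saved in your working directory.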
Adjusting JASP to suit your preferences
Before running any analyses, you can adjust some settings that control how JASP will read, present, and analyse the
dataset.
9. Click the menu button (three blue lines) in the top-left-hand corner of the window.
10. Select Preferences. The settings for four aspects of JASP are displayed:
a) Data preferences allow you to change how JASP is reading the .csv file.
• Ensure that Synchronise automatically on data file save is ticked. This means that if you make any edits to the .csv
file (e.g., in Excel) and save the .csv file, those changes will automatically be updated in JASP as well. This is handy
if you decide to create new variables or refresh the data.
• Ensure that Use default spreadsheet editor is ticked. This means that your experience in JASP will look similar to
the figures in this guide and to our video. If you are already quite proficient in JASP and you prefer a
different editor, feel welcome to use that. For most people (including us), the default editor is fine.
b) Results preferences allow you to change how the statistical analyses will be presented.
• Ensure that Display exact p-values is ticked. This will give you the p-value in the form that you need when writing
up the chi-squared statistics.
• Adjust the number of decimals to your preference: 0, 1, 2, or 3 decimal places.
c) For our purposes, you can leave the Interface and Advanced preferences set to the default.
Running the chi-squared test of independence for Hypothesis 1
Selecting a subset of the sample
Before you create your contingency table (i.e., cross-tabulation), you need to tell JASP who to include in that table. By default,
JASP will include all sample participants in the analysis. These next two steps show you how to ask JASP to focus on
adolescents and young adults.
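Conceptually, selecting a subset means keeping only the rows whose life period matches the group of interest. In JASP this is done through the interface, but the idea can be sketched in Python as follows. The function name and example rows are hypothetical, and the column name “Life Period” is assumed to match the header in the .csv file.

```python
def select_life_periods(rows, periods, column="Life Period"):
    """Keep only the rows whose life period is one of `periods`."""
    wanted = set(periods)
    return [row for row in rows if row[column] in wanted]


# Made-up example rows
rows = [
    {"Response ID": "1", "Life Period": "Adolescence"},
    {"Response ID": "2", "Life Period": "Young Adulthood"},
    {"Response ID": "3", "Life Period": "Adolescence"},
]
adolescents = select_life_periods(rows, ["Adolescence"])
print(len(adolescents))
```

Any contingency table built after filtering then describes only the selected group, which is why the total count in the table will be smaller than the full sample.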