ALY6000: Data Analysis

Overview and Rationale
Being able to ask appropriate questions of data is an important part of the work of data analytics. It is also critical to be able to interpret the results of the analysis. This assignment is intended to familiarize you with the data sets and to get you thinking about key business questions you can answer from this data.
Module Outcomes
This assignment is directly linked to the following learning outcomes:
• Investigate impacts of big data on industry
• Describe the evolution of big data
• Analyze data to complete a data rich and visually appealing report
Assignment Instructions
Find one dataset that is of interest to you. Some places to find datasets include:
• The R Project for Statistical Computing
• Kaggle
• U.S. Government’s Open Data
• or your own data.
Your data set should have at least 700, but fewer than 6,000, records and eight (8) attributes, and the data should not be “clean”. Part of this assignment will require you to clean the data yourself. If a Data Dictionary accompanies your chosen dataset, please consult it to understand the fields and values. The assignment has three parts.
Part I
If a Data Dictionary is provided, please review it as you review the dataset. To understand the data, we first need to run some descriptive statistics on the data set. Start by providing the following for each appropriate variable in the dataset:
1. Summarize the data in a table.
2. Create graphs that help visualize the data. These can be bar charts, histograms, pie charts, etc. Be sure the chosen graph best represents the information you want to highlight.
3. Explain the story the data is telling you.
• What business question do your descriptive analyses answer? Provide a brief discussion of the findings.
• If there are any unusual values, discuss them. If data values are “out of range,” clean the data as needed. Delete the out of range values and run the analysis again.
• If you remove out of range values for any of the variables, present both the analysis with the out of range values and the analysis without the out of range value(s).
• Identify additional questions that the data is leading you to ask. What new attributes are needed to answer those questions?
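The out-of-range step above can be sketched as follows. This is only an illustration using Python's standard library: the `ages` values, the `VALID_AGE` range, and the helper names `summarize` and `drop_out_of_range` are all hypothetical and not part of the assignment. What counts as "out of range" depends on your own dataset and its Data Dictionary.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
        "median": median(values),
        "sd": round(stdev(values), 2),
    }


def drop_out_of_range(values, low, high):
    """Keep only values inside the inclusive range [low, high]."""
    return [v for v in values if low <= v <= high]


# Hypothetical ages; -1 and 212 stand in for data-entry errors.
ages = [34, 29, 41, -1, 38, 212, 45, 30]
VALID_AGE = (0, 120)  # assumed plausible range, not given by the assignment

# Run the same analysis twice so both versions can be presented.
before = summarize(ages)
after = summarize(drop_out_of_range(ages, *VALID_AGE))

for label, stats in (("with out-of-range values", before),
                     ("without out-of-range values", after)):
    print(label, stats)
```

Keeping both summaries side by side makes it easy to show how much the unusual values distorted statistics such as the mean and the maximum.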
Part II
Create new attributes based on the data and the questions you identified in Part I. For your data set, compute differences between appropriate variable values and create a new variable. For example, if the data shows monthly sales for different years, calculate the increase or decrease in sales from month to month. Then, compute the mean and median for each of the variables you have computed.
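A minimal sketch of the month-to-month calculation, assuming a hypothetical list of six monthly sales figures (the numbers and the names `monthly_sales` and `monthly_change` are invented for illustration):

```python
from statistics import mean, median

# Hypothetical monthly sales, in calendar order (January to June).
monthly_sales = [1200, 1350, 1280, 1500, 1450, 1600]

# New attribute: change from the previous month.
# The first month has no predecessor, so the result has one fewer value.
monthly_change = [curr - prev
                  for prev, curr in zip(monthly_sales, monthly_sales[1:])]

print("month-to-month change:", monthly_change)
print("mean change:", mean(monthly_change))
print("median change:", median(monthly_change))
```

Reporting the median alongside the mean is useful here because a single unusually large swing can pull the mean away from the typical month-to-month change.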
Part III
Now that you have worked with the data, what is the data saying to you? What have you learned about the attributes? What are some follow-up questions you would like to have answered? Identify 3-5 observations or follow-up questions that you have.
What to Submit
A presentation slide deck (5-8 slides, not including the title and reference list slides) with your findings. Submit a single file with the following filename: _FinalProject.pptx
Format
Your presentation must:
• Tell the story of your data through the use of descriptive statistics and visualizations.
o Remember your visualizations are the primary vehicle you'll use to convey information in an analytics presentation.
o Include very concise written information, closely connected to the points made in the visualizations, in the Notes section of each slide.
• Properly cite all sources using APA citation rules.

Appendix
Assignment Part I Section Example
Business Question: What is the distribution of the status of the 2017 GxP Audits?
Analysis:
Descriptives Table

         Audit Status    Frequency   Percent   Valid Percent
Valid    Closed                 19      19.8            19.8
         Completed               4       4.2             4.2
         In Progress            18      18.8            18.8
         Scheduled              11      11.5            11.5
         Pending                14      14.6            14.6
         Not In Scope           26      27.1            27.1
         Cancelled               4       4.2             4.2
         Total                  96     100.0           100.0
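A frequency table like the one above can be produced from a column of category labels. The sketch below uses only Python's standard library and rebuilds the labels from the frequencies shown in the table; the function name `frequency_table` is hypothetical, and the example's original table was evidently produced with other software.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, count, percent) rows followed by a Total row."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = [(cat, n, round(100 * n / total, 1)) for cat, n in counts.items()]
    rows.append(("Total", total, 100.0))
    return rows


# One status label per audit, rebuilt from the frequencies in the table.
statuses = (["Closed"] * 19 + ["Completed"] * 4 + ["In Progress"] * 18
            + ["Scheduled"] * 11 + ["Pending"] * 14
            + ["Not In Scope"] * 26 + ["Cancelled"] * 4)

table = frequency_table(statuses)
for category, count, percent in table:
    print(f"{category:<14}{count:>6}{percent:>9.1f}")
```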

[Figure: Audit Status Count]

[Figure: Audit Status Percentages]

Discussion: The data file includes information on 96 audits in 2017 for GxP areas. It is unclear if the data file includes all the known GxP audits in 2017 or if it only includes a subset. A large percentage of all GxP Audits (27.1%) are not in scope. 19.8% of audits are closed and 4.2% of audits are completed. It is unclear what the difference between “closed” and “completed” audits is. We should perhaps ask the client. Do we really need two distinct values? 18.8% of the audits are in progress, 11.5% are scheduled and 14.6% are pending. For the pending audits, the dates of the audit process have not been established. 4.2% of the audits were canceled. It may be interesting to have a notes field where the reasons for cancelation are noted.