Load packages

    library(ggplot2)
    library(dplyr)
    library(Hmisc)
    library(corrplot)

Load data

    load("brfss2013.RData")

Part 1: About the Data

The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey in the United States. The BRFSS is designed to identify risk factors in the adult population and report emerging trends. For example, respondents are asked about their diet and weekly physical activity, their HIV/AIDS status, possible tobacco use, immunization, health status, healthy days (health-related quality of life), health care access, inadequate sleep, hypertension awareness, cholesterol awareness, chronic health conditions, alcohol consumption, fruit and vegetable consumption, arthritis burden, and seatbelt use.

Data Collection:

The data collection procedure is explained in brfss_codebook. The data were collected from all 50 U.S. states, the District of Columbia, Puerto Rico, Guam, American Samoa, the Federated States of Micronesia, and Palau, through both landline and cellular telephone surveys. Disproportionate stratified sampling (DSS) was used for the landline sample, while cellular telephone respondents were randomly selected, each with an equal probability of selection. The dataset we are working with contains 330 variables and a total of 491,775 observations for 2013.

The survey is based on a large stratified random sample, so the sample data should allow us to generalize to the population of interest. Potential biases are associated with non-response, incomplete interviews, missing values, and convenience bias (some potential respondents may have been excluded because they have neither a landline nor a cell phone).

No causal conclusions can be drawn, because BRFSS is an observational study and can only establish correlation or association between variables.

Part 2: Research Questions

Research question 1:

Does the distribution of the number of days in which physical and mental health was not good during the past 30 days differ by gender?

Research question 2:

Is there an association between the month in which a respondent was interviewed and the respondent's self-reported health perception?

Research question 3:

Is there any association between income and health care coverage?

Research question 4:

Is there any relation between smoking, drinking alcohol, cholesterol level, blood pressure, weight, and having a stroke? Eventually, I would like to see whether stroke can be predicted from these variables.

Part 3: Exploratory data analysis

Research question 1:

    ggplot(aes(x = physhlth, fill = sex), data = brfss2013) +
      geom_histogram(bins = 30, position = position_dodge()) +
      ggtitle('Number of Days Physical Health not Good in the Past 30 Days')

    ggplot(aes(x = menthlth, fill = sex), data = brfss2013) +
      geom_histogram(bins = 30, position = position_dodge()) +
      ggtitle('Number of Days Mental Health not Good in the Past 30 Days')

    ggplot(aes(x = poorhlth, fill = sex), data = brfss2013) +
      geom_histogram(bins = 30, position = position_dodge()) +
      ggtitle('Number of Days with Poor Physical Or Mental Health in the Past 30 Days')

    summary(brfss2013$sex)
    # Male Female NA's

The three figures above show how male and female respondents reported the number of days in the past 30 days with poor physical health, poor mental health, and poor physical or mental health. We can see that there were far more female respondents than male respondents.
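Because the female and male groups differ in size, the raw histogram counts for research question 1 are hard to compare directly. A per-group summary is one way to put the two groups on the same footing. The sketch below is only an illustration: it runs on a tiny made-up data frame named `toy` (a hypothetical stand-in, not values from the survey) that reuses the `sex`, `physhlth`, and `menthlth` column names, and it uses base R so it runs without any packages.

```r
# Hypothetical stand-in for brfss2013: made-up values, NOT survey data.
toy <- data.frame(
  sex      = factor(c("Male", "Female", "Female", "Male", "Female", NA)),
  physhlth = c(0, 5, 30, 2, NA, 1),
  menthlth = c(0, 10, 3, NA, 7, 2)
)

# Group sizes, including respondents with no recorded sex.
group_sizes <- table(toy$sex, useNA = "ifany")

# Drop rows with a missing sex before comparing the two groups.
known <- toy[!is.na(toy$sex), ]

# Mean number of unhealthy days per group, ignoring missing answers.
mean_phys <- tapply(known$physhlth, known$sex, mean, na.rm = TRUE)
mean_ment <- tapply(known$menthlth, known$sex, mean, na.rm = TRUE)

print(group_sizes)
print(mean_phys)
print(mean_ment)
```

To apply the same idea to the real data, replace `toy` with `brfss2013`. Medians or the share of respondents reporting zero days may be more informative than means, since these day counts tend to pile up at 0 and 30.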