SF_12 PCS Pathways #239

Closed
12 tasks done
ld-archer opened this issue May 16, 2023 · 9 comments · May be fixed by #243
@ld-archer
Collaborator

ld-archer commented May 16, 2023

Need some additional modules and some modifications to others:

ok my takeaway for the income -> PCS pathways:

  • housing (yes, as it is);
  • neighbourhood safety (yes, as it is);
  • Environment (green space, fast food etc. - will be difficult but may be proxies)
  • diet and nutrition (yes, might be a bit more complicated but fine for now)
  • access to private healthcare (??)
  • Job related stressors (exposure to toxins, etc. - proxied by employment sector?)
  • Tobacco (yes, as it is)
  • access to gym membership, etc (??)
  • Loneliness (as is)
  • Chronic disease -> In my head 2 tiers but I think this is worth a discussion:
    • Some chronic diseases have relatively large impact on health (cancer, diabetes, CVD, many more)
    • Some less large impact (asthma, arthritis, hypertension(??), osteoporosis)
    • Do we make a distinction between the two? Or not include the less impactful ones? Seems like all chronic conditions would have a big enough effect on PCS that they should be included but also seems daft to have a separate module for each one
    • Multi-morbidities could be very important here. Maybe distinction between 1 or 1+ is important.

Steps:

  • Deep dive into the variable search to find variables we can use for these pathways (or ones we can use to create proxies)
    • Dump all information here so we keep a record of it
  • Create modules for each of these using any current module skeleton as a starting point
    • Alcohol
    • Exercise/Fitness
    • Material deprivation
    • Chronic Disease
  • Create transition model equations for each module and add to the model_transitions.txt file
    • Test that current code works to fit the models and iron out any data issues (i.e. missing years / small sample sizes)
  • Create transition model equation for PCS incorporating all these new modules
  • Add modules into the default_config.yaml file and do a test run
  • Visualise PCS outcomes separately from MCS when new modules are included
  • VALIDATION
    • PCS already included in handovers and soon to be in cross-validation
    • Handovers and cross-validation can both be run with single make commands (make handovers, make cross_validation)

NOTE: We no longer need variables that span the entire length of the data, as we are currently only using the 2017-2018 model for SF_12_MCS. The only hard requirement is 2017 data for every variable that makes up a pathway, because we need to be able to fit a one-year model for the pathway variables (i.e. 2016-2017) and a one-year model for SF_12_PCS (2017-2018).

WORKING GOOGLE DOC

@ld-archer
Collaborator Author

Some work to finish off in #231 before this can be completely finished (cross-validation for VALIDATION point, and some fixes for the outcome visualisation of PCS), but the vast majority of this can be started at any point.

@ld-archer
Collaborator Author

ld-archer commented Jun 1, 2023

Discussion Points

Early indications that there will be lots to discuss from this round of data discovery, so I'm listing them all here and will try to arrange a meeting most likely next week.

  • Stress
    • This has to be a predictor of SF12 MCS, but could also be a predictor of PCS
    • A quick search found multiple studies 1 2 3 that either link psychological stress to health outcomes, or explore the pathways that could lead from one to the other.
    • However others suggest that risk behaviours could be a big part of this puzzle, which we are already tracking (tobacco use, poorer nutrition etc.)
  • Nutrition
    • Lots of good variables (with reasonable sample size) on the type of bread and milk most eaten by a respondent (including none).
    • Can we link this to our nutrition_quality variable? Think e.g. wholemeal bread could be a proxy for nutrition as a whole, as could something like skimmed milk. Worth discussion.
  • Neighbourhood Cohesion
    • This isn't really a point for PCS but more housing or MCS (keeping discussion points in one place for sanity)
    • This seems like it could also be an important piece of the neighbourhood puzzle for SF12 MCS? We only have neighbourhood safety, but no measure of the quality of the neighbourhood or community, which could be important for mental wellbeing.
    • See vars in Neighbourhood Quality section of Housing Data Discovery
    • This also seems like it would be a good predictor of loneliness if we can stomach making this a predicted variable. If not then we should also consider the neighbourhood effect on loneliness, maybe by including neighbourhood_safety as a proxy?
  • Predict X only if Y...
    • This is a bit of an abstract thought but, some variables are important for a pathway, and may only apply to a group of people, but could be very useful
    • Example I have found in variable search is caring for an elderly or disabled person in the household. Caring for a loved one has strong links to loneliness in literature, but predicting whether someone is a carer would produce instances where e.g. someone lived alone but was predicted as a carer
    • We can do prediction of these variables only when specific conditions are met (i.e. if living with pensioner OR someone with disability (based on labour_state of cohabitee / parent))
    • This is additional work, but in my head once we have done it once we will have a decent template to follow. We will probably have to tackle similar problems sooner or later when we come to think about children leaving home, or the hhsize variable changing when children leave home or marital status changes. We would also need to consider accounting for these newly predicted vars to ensure reasonable values
    • Would we be better having a single module that can handle all non-direct pathway vars? Or would each new set of things get its own module?
  • Satisfaction with X...
  • Immigrant / not UK born - ff_ukborn
    • Predictor of a few things including PCS? i.e. hh_income, job_nssec, SF12 and more?
    • Won't need to be predicted as can't change so might be quick win to improve other models?
    • Approx 10% of 30k sample not born in UK in wave 12
    • Combine with ff_yr2uk4 and can work out proportion of life spent in UK if not born here? Again won't change and could improve number of models
    • Combine with plbornc for country of origin (not fed forward). Maybe bin by continent?
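
A rough sketch of the "Predict X only if Y" idea above, using the carer example. The field names (`pid`, `labour_state`), the `labour_state` labels and the `model` callable are all hypothetical stand-ins, not existing project code; the point is only that the eligibility check runs before any prediction is made.

```python
# Hypothetical labour_state labels that mark a potential care recipient.
CARE_NEEDING_STATES = {"Retired", "Sick/Disabled"}


def lives_with_care_recipient(person, household):
    """True if anyone else in the household is in a care-needing labour_state."""
    return any(
        member["labour_state"] in CARE_NEEDING_STATES
        for member in household
        if member["pid"] != person["pid"]
    )


def predict_carer(person, household, model):
    """Apply `model` only to eligible people; everyone else is fixed at 0 (not a carer)."""
    if not lives_with_care_recipient(person, household):
        return 0
    return model(person)
```

With this shape, someone living alone can never be predicted as a carer, whatever the fitted model says, which is the failure case described above.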

@ld-archer
Collaborator Author

Modules

After discussion on 8/6/23, these are the modules we have settled on as a good mix of important and achievable.

Material Deprivation

Data Discovery

Should be a simple proxy to create: we have a couple of 4-level ordinal variables asked of every household with either no pensioners or pensioners and children. We can just take the mean of these variables as a material deprivation composite. This should be related to hh_income, as the questions are all of the form 'do you have enough money to, e.g., keep your house in a decent state of repair?'.
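
A minimal sketch of that composite, assuming hypothetical item names and a 1-4 coding with anything outside that range treated as missing. The real Understanding Society variable names and missing-value codes are not listed in this issue, so treat both as placeholders.

```python
from statistics import mean

# Hypothetical item names; the real survey variables are not listed here.
MATDEP_ITEMS = ["matdep_repair", "matdep_holiday", "matdep_savings"]


def material_deprivation(row, items=MATDEP_ITEMS):
    """Mean of the answered 4-level ordinal items, or None if none were answered."""
    # Assumption: valid answers are coded 1-4; anything else (e.g. -9) is missing.
    answered = [row[item] for item in items if row.get(item) in (1, 2, 3, 4)]
    return mean(answered) if answered else None


household = {"matdep_repair": 1, "matdep_holiday": 3, "matdep_savings": -9}
score = material_deprivation(household)  # mean of the two answered items
```

Averaging only the answered items keeps households with partial responses in the sample, at the cost of composites built from different numbers of items.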

Exercise / Fitness

Data Discovery

Can use government guidelines to determine what is a healthy level of exercise vs unhealthy, then create a binary variable. Guidelines state that a healthy activity level is at least 150 minutes of moderate intensity exercise, or 75 minutes of vigorous intensity exercise, per week.
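
One way the binary variable could be derived, as a sketch only. It assumes weekly minutes of moderate and vigorous activity are already available as two numbers (the function and argument names are made up), and it applies the two thresholds separately rather than combining moderate and vigorous minutes.

```python
MODERATE_MIN_PER_WEEK = 150
VIGOROUS_MIN_PER_WEEK = 75


def meets_activity_guideline(moderate_minutes, vigorous_minutes):
    """Return 1 if either weekly threshold is met, 0 if not, None if both inputs are missing."""
    if moderate_minutes is None and vigorous_minutes is None:
        return None
    # Assumption: a single missing input is treated as zero minutes.
    moderate = moderate_minutes or 0
    vigorous = vigorous_minutes or 0
    return int(moderate >= MODERATE_MIN_PER_WEEK or vigorous >= VIGOROUS_MIN_PER_WEEK)
```

Someone doing 100 moderate and 50 vigorous minutes is classed as not meeting the guideline under this rule; whether mixed activity should count is a modelling decision this sketch leaves open.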

Job Satisfaction

Data Discovery

Can just use the jbsat variable for this, although in the data discovery we have a large amount of related information that could be used to expand this in the areas of hours worked, working arrangements (part-time, on-call, work from home etc.), autonomy of work, psychological job stress (feels uneasy about job, feels depressed, miserable etc.).

Alcohol Use

Have AUDIT-C scores that we can use to create a 3 level ordinal:

  • Low risk
  • Increasing risk
  • High risk

Which should make quite a simple module.

Chronic Disease

Some work to do here before this can go into a module.

  1. Fit a regression model for all chronic diseases to SF-12 (MCS & PCS)?
    • See if there is any obvious way of binning diseases, or if some can be safely ignored (i.e. small coefficient and insignificant)
  2. Fit regression model for number of chronic diseases to SF-12.
    • i.e. 0 vs 1 vs 2 vs 3 vs 3+
    • Is a number of chronic diseases more useful than tiers? Is it easier to predict?
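
As a first look at step 2, the count of conditions can be top-coded and PCS summarised within each bin. This is a sketch under assumptions: the condition flags and the toy records are invented for illustration, and the group means shown here are only what a regression of PCS on count-bin dummies alone would estimate. A real fit would add controls such as age and sex.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical condition flags; the real list lives in the data discovery notes.
CONDITIONS = ["asthma", "arthritis", "diabetes", "cancer"]


def count_bin(person, conditions=CONDITIONS, top=3):
    """Number of reported conditions, top-coded so that `top` or more share one bin."""
    n = sum(1 for c in conditions if person.get(c) == 1)
    return f"{top}+" if n >= top else str(n)


def mean_pcs_by_bin(people):
    """Mean SF-12 PCS within each count bin."""
    groups = defaultdict(list)
    for person in people:
        groups[count_bin(person)].append(person["pcs"])
    return {label: mean(values) for label, values in groups.items()}


# Toy records, made up purely to exercise the functions.
sample = [
    {"pcs": 55.0},
    {"pcs": 51.0},
    {"pcs": 48.0, "asthma": 1},
    {"pcs": 40.0, "asthma": 1, "diabetes": 1},
    {"pcs": 30.0, "asthma": 1, "diabetes": 1, "cancer": 1, "arthritis": 1},
]
summary = mean_pcs_by_bin(sample)
```

Changing `top` makes it cheap to compare binnings (for example 0 / 1 / 2+ against 0 / 1 / 2 / 3+) before committing to one in a module.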

@paddy-r
Collaborator

paddy-r commented Jun 8, 2023

May be way too complicated an approach, but for quantitative measure of severity of chronic disease, how about years of life lost (YLL)? Quick search found this for Germany, see Figure 2, it's also age-specific. So for a given age group, (relative) severity of disease is proportion of YLL.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8212398/

@ld-archer
Collaborator Author

Looks like a good place to start and could be used to justify any decisions we make, cheers!

@paddy-r
Collaborator

paddy-r commented Jun 8, 2023

> Looks like a good place to start and could be used to justify any decisions we make, cheers!

Looks like data for England and Wales are available in massive detail...

https://digital.nhs.uk/data-and-information/publications/statistical/compendium-mortality/current/years-of-life-lost

Of the US variables you found, it looks like most are present one-to-one, e.g. asthma, coronary heart disease; others are present but not one-to-one, e.g. emphysema comes under "bronchitis and emphysema".

@ld-archer
Collaborator Author

ld-archer commented Jul 10, 2023

Alcohol

Docstring from generate_composite_vars.calculate_auditc_score():

Alcohol use disorders can be assessed via the AUDITC score. This score is derived from 3 questions that form part of
the full 10 question AUDIT screening test, where AUDITC specifically focuses on consumption. The 3 questions are:

  1. How often do you have a drink containing alcohol?
  2. How many units of alcohol do you drink on a typical day when you are drinking?
  3. How often have you had 6 or more units if female, or 8 or more if male, on a single occasion in the last year?

Each question is ordinal with 5 levels, depending on the 'severity' of the answer. We then score each question from
0-4, with higher scores meaning higher 'severity'. The total across the 3 questions then creates a score from 0-12,
with 0-4 meaning sensible drinking, 5-7 meaning hazardous drinking, and 8+ meaning harmful drinking.
See following link for information on scoring:
https://www.drinktalkingportal.co.uk/clinical-guidance/alcohol-abuse-screening/alcohol-audit-audit-c

To calculate this score, we rely on 4 variables in Understanding Society shown at the following link:
https://www.understandingsociety.ac.uk/documentation/mainstage/dataset-documentation?search_api_views_fulltext=auditc

Question 1 above relies on auditc1 & auditc3, question 2 relies on auditc4, and question 3 uses auditc5.

NOTE: The final variable used (auditc5) specifically mentions 6 or more drink frequency, rather than 6+/8+ units.
This could be a mistake in the description or the actual question asked being incorrect (not the true AUDITC
question). There's no information about which one it is, so I'm treating it the same as the AUDITC3 question
for our purposes. Added benefit that this is simpler to code without checking for gender also.
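
The scoring and banding described in the docstring could look roughly like the sketch below. It is not the body of calculate_auditc_score(): it assumes the three question scores (0-4 each) have already been derived from auditc1, auditc3, auditc4 and auditc5, and the function names and the missing-value handling are illustrative choices.

```python
def auditc_total(q1, q2, q3):
    """Sum three question scores (each 0-4) into a 0-12 AUDIT-C total."""
    scores = (q1, q2, q3)
    if any(s is None for s in scores):
        return None  # assumption: any missing item makes the total missing
    if any(s not in range(5) for s in scores):
        raise ValueError("each question score must be an integer from 0 to 4")
    return sum(scores)


def auditc_band(total):
    """Map a 0-12 total onto the three bands quoted in the docstring."""
    if total is None:
        return None
    if total <= 4:
        return "sensible"
    if total <= 7:
        return "hazardous"
    return "harmful"
```

Keeping the total and the band as separate steps means the 3-level ordinal can be re-cut later without recomputing the item scores.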

Sample Check

image
image

Interesting here that the number of missing values jumps in 2020. Assuming this has something to do with COVID? Maybe people were less happy to talk about their consumption during COVID lockdowns? Unfortunately due to the lack of information on the website we don't have any idea why... There is some literature using these variables though so I'll have a look through that also.

Handovers

image
image

Cross-Validation

[EDIT] Forgot to copy 2015 data onto 2014 as no alcohol data in 2014.
image

@ld-archer
Collaborator Author

Current PCS Plots

This is with the following variables as predictors of PCS:

  • PCS
  • age
  • sex
  • ethnicity
  • housing_quality
  • tobacco
  • nutrition_quality
  • loneliness
  • financial_situation
  • alcohol

There are arguments to be made, I think, over loneliness and financial situation. It will take some time before merging any of this work to get literature backing for any decisions we make.

Handovers

image
image

Cross-Validation

image
image
image
image
image

@ld-archer ld-archer self-assigned this Oct 30, 2023
@ld-archer ld-archer added this to the SF12 PCS and QALY/QALE milestone Oct 30, 2023
@ld-archer
Collaborator Author

Happy with pathways at the mo (few issues to resolve still but we have a working model) so closing this now.


2 participants