vignettes/using_own_classifications.Rmd
using_own_classifications.Rmd
The package has been built in mind of allowing user to use their own classifications for ICD codes (or any string codes). The classifications can utilise multiple different types of codes. For example, below we use ICD-10 and ICD-9 codes on a single classification definition. The classification definition need to hold following variable names: class_X
and label_X
, where X is the name of class defintion and also at least one other column what defines the regular expression for the class. The regular expression describes all the codes that belong to the same class. For example in our example data set, the first group is called “aids” and regular expression for icd10
column is "^B20|^B21|^B22|^B24"
. This means that all the codes starting with either B20, B21, B22 or B24, which are labeled as "icd10"
by another variable icd
(TODO: codetype) in the data, belong to this class. The symbol “^” states that the codes must begin with these characters. The symbol “|” is OR operator in regular expression syntax. For more information about regular expression, please read Regular Expression Cheat Sheet.
library(regstudies) # ´regstudies´ makes it easy to classify codes such as ICD-codes to groups and also to sum scores based on the groupings. sample_data <- left_join(sample_cohort,sample_regdata) sample_data %>% classify_charlson(icd_codes = CODE1) %>% sum_score(score_charlson) # Users can define the code groups themselves using simple regular expression syntax. my_classes <- tribble(~class_myclass, ~label_myclass, ~icd10, ~icd9, ~score, "aids", "AIDS or HIV", "^B20|^B21|^B22|^B24", "^042|^043|^044", 7, "dementia", "Dementia", "^F00|^F01|^F02|^F03|^F051|^G30|^G311", "^290|^2941|^3312", 3, "pud", "Peptic ulcer disease", "^K25|^K26|^K27|^K28", "^531|^532|^533|^534", 1 ) # This classification definition has two different type of codes. head(my_classes) # Data has three different types of codes. head(sample_data) # Different groups types can are automatically handled simultaneously. sample_data %>% classify_codes(codes=CODE1,diag_tbl = read_classes(my_classes)) %>% sum_score(score)
This calculates per row Elixhauser classification. Check more info of classification from classification tables-section.
# Classify by Elixhauser: my_classes <- read_classes(file = "../data/elixhauser_classes.csv") classified_d <- filtered_d %>% classify_codes(codes = CODE1, diag_tbl = my_classes) head(classified_d)
Another way (a shortcut) for Elixhauser classifications:
elixh_d <- filtered_d %>% classify_elixhauser(icd_codes = CODE1) head(elixh_d)
To calculate the scores of Elixhauser comorbidity index, we can use sum_score
function. As our classification table defines two alternative score definitions, namely score_AHRQ
and score_van_Walraven
, we can calculate both of them on one call as follows:
elixh_score <- filtered_d %>% classify_elixhauser(icd_codes = CODE1) %>% sum_score(score_AHRQ, score_van_Walraven) head(elixh_score)
We also provide Charlson comorbidity classification ready to use in regstudies
. It can be easilly used with function classify_charlson
. The use of classify_charlson
is very identical to use of classify_elixhauser
. Only difference is that in the classification the score-variable is called score_charlson
.
## Classifying Charlson comorbidity classes to long format charlson_score <- filtered_d %>% classify_charlson(icd_codes = CODE1) %>% sum_score(score_charlson) head(charlson_score)
The outputs of all classification functions (classify_elixhauser
, classify_charlson
and classify_codes
) gives output as long data. If user need to have class indicators or scores of each events as wide, it can be accomplished using pivot_wider
as follows.
charlson_d <- filtered_d %>% classify_charlson(icd_codes = CODE1) charlson_d %>% mutate(class_charlson=paste0("cl_",class_charlson)) %>% # TODO: laita tämä ominaisuus classify-funktioon sisälle! mutate(score_charlson=as.integer(score_charlson>0)) %>% tidyr::pivot_wider(names_from="class_charlson", values_from="score_charlson", values_fill=0) %>% select(-all_of(c("cl_NA","label_charlson"))) -> wide # Some events do not belong to any class. That creates NA to data and pivot_wider handles NA classes to cl_NA column in this code. head(wide)
## calculating labels regstudies:::charlson_classes %>% select(class_charlson,label_charlson) %>% mutate(class_charlson=paste0("cl_",class_charlson)) -> labels # setting up the labels for wide data for(i in 1:dim(labels)[1]) { l<-labels$class_charlson[i] if(!is.null(wide[[l]])) { attr(wide[[l]], "label") <- labels$label_charlson[i] } } head(wide)