
dta_mrq() splits a data frame column that holds multiple responses per row into several binary columns, one per response option. Each binary column indicates whether that option appears in the row's original value. The new columns can be given custom labels, converted to numeric, and assigned cleaned column names.

Usage

dta_mrq(
  dat,
  .column,
  delimeter,
  prefix = NULL,
  as_numeric = FALSE,
  labels = c(TRUE, FALSE),
  is_clean_names = TRUE
)

Arguments

dat

A data frame containing the column to be split.

.column

The name of the column to split.

delimeter

A string representing the delimiter used to separate the different responses in the original column.

prefix

A string that is added as a prefix to the newly created column names. Default is NULL, which means no prefix will be added.

as_numeric

Logical. If TRUE, the new columns will be converted to numeric (1 for TRUE, 0 for FALSE). Default is FALSE.

labels

A vector of length 2 specifying the labels for the TRUE and FALSE values in the binary columns. Default is c(TRUE, FALSE).

is_clean_names

Logical. If TRUE, janitor::clean_names() is used to standardize the column names (e.g., convert to lowercase and replace spaces with underscores). Default is TRUE.
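
The argument descriptions above can be illustrated in base R without calling dta_mrq() itself. This is only a rough sketch: the names x, parts, has_tablet, has_phone and yes_no are hypothetical, and the code is not taken from the package's implementation.

```r
# One hypothetical multi-response value that uses ", " as the delimiter.
x <- "Laptop, Tablet"

# `delimeter`: split the string into its individual responses.
parts <- strsplit(x, ", ", fixed = TRUE)[[1]]

# Logical indicators for two response options.
has_tablet <- "Tablet" %in% parts
has_phone <- "Smartphone" %in% parts

# `as_numeric = TRUE` corresponds to a 1 / 0 coding of TRUE / FALSE.
as.numeric(c(has_tablet, has_phone))

# `labels = c("Yes", "No")`: the first label stands for TRUE, the second for FALSE.
yes_no <- c("Yes", "No")
ifelse(c(has_tablet, has_phone), yes_no[1], yes_no[2])

# `prefix`: a string prepended to each new column name.
paste0("gad_", parts)
```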

Value

A data frame with new binary columns added. The new columns represent each of the unique responses found in the original column, with the option to clean column names.

Details

This function is typically used when dealing with survey data where multiple options may be selected for a given question, and you want to split those options into individual binary columns indicating the presence or absence of each option.
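
To make the idea concrete, the following base R sketch approximates the same kind of split. The helper split_multi_response() and the survey data frame are hypothetical and exist only for illustration; dta_mrq() may differ in column order, naming, and handling of missing values.

```r
# Hypothetical helper: a base R approximation of the idea, not dta_mrq() itself.
split_multi_response <- function(dat, column, delimiter) {
  # Split each cell into its individual responses.
  parts <- strsplit(as.character(dat[[column]]), delimiter, fixed = TRUE)
  # Collect the unique response options in order of first appearance.
  opts <- unique(unlist(parts))
  # Add one logical column per option.
  for (opt in opts) {
    dat[[opt]] <- vapply(parts, function(p) opt %in% p, logical(1))
  }
  dat
}

# Small made-up survey column with comma-separated answers.
survey <- data.frame(
  gadgets_owned = c("Smartwatch, Tablet", "Smartphone", "Tablet, Smartphone"),
  stringsAsFactors = FALSE
)

res <- split_multi_response(survey, "gadgets_owned", ", ")
res
```

The original column is kept and the indicator columns are appended after it, which mirrors the layout of the example output on this page.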

Examples

data("data_gadgets")
dat <- data_gadgets
dta_gtable(dat)
gadgets_owned
Smartwatch, Tablet, Smartphone
Tablet, Smartwatch, Smart TV, Desktop Computer
Smartphone
Laptop, Tablet
Tablet, Smart TV, Digital Camera, Laptop
Laptop, Desktop Computer, Digital Camera, Smart TV, Smartphone
Digital Camera, Smartphone, Desktop Computer, Smartwatch
Smartwatch, Smart TV, Laptop, Smartphone
Desktop Computer
Smartphone, Laptop, Smart TV, Smartwatch
Tablet
Digital Camera, Tablet, Desktop Computer
Digital Camera, Desktop Computer, Smart TV, Smartwatch, Laptop
Tablet, Desktop Computer, Smart TV
Digital Camera, Desktop Computer, Smart TV, Smartphone, Laptop
# Split `gadgets_owned` column into separate columns.
# The created columns will be logical (i.e. TRUE / FALSE).
df <- dta_mrq(
  dat = dat,
  .column = gadgets_owned,
  delimeter = ", ",
  is_clean_names = TRUE
)
dta_gtable(df)
gadgets_owned | Smartwatch | Tablet | Smartphone | Smart TV | Desktop Computer | Laptop | Digital Camera
Smartwatch, Tablet, Smartphone | TRUE | TRUE | TRUE | FALSE | FALSE | FALSE | FALSE
Tablet, Smartwatch, Smart TV, Desktop Computer | TRUE | TRUE | FALSE | TRUE | TRUE | FALSE | FALSE
Smartphone | FALSE | FALSE | TRUE | FALSE | FALSE | FALSE | FALSE
Laptop, Tablet | FALSE | TRUE | FALSE | FALSE | FALSE | TRUE | FALSE
Tablet, Smart TV, Digital Camera, Laptop | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE | TRUE
Laptop, Desktop Computer, Digital Camera, Smart TV, Smartphone | FALSE | FALSE | TRUE | TRUE | TRUE | TRUE | TRUE
Digital Camera, Smartphone, Desktop Computer, Smartwatch | TRUE | FALSE | TRUE | FALSE | TRUE | FALSE | TRUE
Smartwatch, Smart TV, Laptop, Smartphone | TRUE | FALSE | TRUE | TRUE | FALSE | TRUE | FALSE
Desktop Computer | FALSE | FALSE | FALSE | FALSE | TRUE | FALSE | FALSE
Smartphone, Laptop, Smart TV, Smartwatch | TRUE | FALSE | TRUE | TRUE | FALSE | TRUE | FALSE
Tablet | FALSE | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE
Digital Camera, Tablet, Desktop Computer | FALSE | TRUE | FALSE | FALSE | TRUE | FALSE | TRUE
Digital Camera, Desktop Computer, Smart TV, Smartwatch, Laptop | TRUE | FALSE | FALSE | TRUE | TRUE | TRUE | TRUE
Tablet, Desktop Computer, Smart TV | FALSE | TRUE | FALSE | TRUE | TRUE | FALSE | FALSE
Digital Camera, Desktop Computer, Smart TV, Smartphone, Laptop | FALSE | FALSE | TRUE | TRUE | TRUE | TRUE | TRUE
# Convert the created columns from logical (TRUE / FALSE)
# columns to numeric.
df2 <- dta_mrq(
  dat = dat,
  .column = gadgets_owned,
  delimeter = ", ",
  as_numeric = TRUE,
  is_clean_names = TRUE
)
dta_gtable(df2)
gadgets_owned | Smartwatch | Tablet | Smartphone | Smart TV | Desktop Computer | Laptop | Digital Camera
Smartwatch, Tablet, Smartphone | 1 | 1 | 1 | 0 | 0 | 0 | 0
Tablet, Smartwatch, Smart TV, Desktop Computer | 1 | 1 | 0 | 1 | 1 | 0 | 0
Smartphone | 0 | 0 | 1 | 0 | 0 | 0 | 0
Laptop, Tablet | 0 | 1 | 0 | 0 | 0 | 1 | 0
Tablet, Smart TV, Digital Camera, Laptop | 0 | 1 | 0 | 1 | 0 | 1 | 1
Laptop, Desktop Computer, Digital Camera, Smart TV, Smartphone | 0 | 0 | 1 | 1 | 1 | 1 | 1
Digital Camera, Smartphone, Desktop Computer, Smartwatch | 1 | 0 | 1 | 0 | 1 | 0 | 1
Smartwatch, Smart TV, Laptop, Smartphone | 1 | 0 | 1 | 1 | 0 | 1 | 0
Desktop Computer | 0 | 0 | 0 | 0 | 1 | 0 | 0
Smartphone, Laptop, Smart TV, Smartwatch | 1 | 0 | 1 | 1 | 0 | 1 | 0
Tablet | 0 | 1 | 0 | 0 | 0 | 0 | 0
Digital Camera, Tablet, Desktop Computer | 0 | 1 | 0 | 0 | 1 | 0 | 1
Digital Camera, Desktop Computer, Smart TV, Smartwatch, Laptop | 1 | 0 | 0 | 1 | 1 | 1 | 1
Tablet, Desktop Computer, Smart TV | 0 | 1 | 0 | 1 | 1 | 0 | 0
Digital Camera, Desktop Computer, Smart TV, Smartphone, Laptop | 0 | 0 | 1 | 1 | 1 | 1 | 1
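
One practical reason to prefer the 1 / 0 coding is that column sums and means then give response counts and shares directly. The sketch below uses a hypothetical data frame, toy, that copies four indicator columns from the first four rows of the numeric output; it does not call dta_mrq(), and the real column names may differ once cleaned.

```r
# Hypothetical data frame mirroring the first four rows of the numeric output
# (only four of the seven indicator columns are reproduced here).
toy <- data.frame(
  Smartwatch = c(1, 1, 0, 0),
  Tablet     = c(1, 1, 0, 1),
  Smartphone = c(1, 0, 1, 0),
  Laptop     = c(0, 0, 0, 1)
)

# Number of respondents who selected each option.
counts <- colSums(toy)

# Share of respondents who selected each option.
shares <- colMeans(toy)

counts
shares
```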
# You can specify the labels to be used. In the example
# below, the columns will be character with Yes / No.
df3 <- dta_mrq(
  dat = dat,
  .column = gadgets_owned,
  delimeter = ", ",
  labels = c("Yes", "No"),
  is_clean_names = TRUE
)
dta_gtable(df3)
gadgets_owned | Smartwatch | Tablet | Smartphone | Smart TV | Desktop Computer | Laptop | Digital Camera
Smartwatch, Tablet, Smartphone | Yes | Yes | Yes | No | No | No | No
Tablet, Smartwatch, Smart TV, Desktop Computer | Yes | Yes | No | Yes | Yes | No | No
Smartphone | No | No | Yes | No | No | No | No
Laptop, Tablet | No | Yes | No | No | No | Yes | No
Tablet, Smart TV, Digital Camera, Laptop | No | Yes | No | Yes | No | Yes | Yes
Laptop, Desktop Computer, Digital Camera, Smart TV, Smartphone | No | No | Yes | Yes | Yes | Yes | Yes
Digital Camera, Smartphone, Desktop Computer, Smartwatch | Yes | No | Yes | No | Yes | No | Yes
Smartwatch, Smart TV, Laptop, Smartphone | Yes | No | Yes | Yes | No | Yes | No
Desktop Computer | No | No | No | No | Yes | No | No
Smartphone, Laptop, Smart TV, Smartwatch | Yes | No | Yes | Yes | No | Yes | No
Tablet | No | Yes | No | No | No | No | No
Digital Camera, Tablet, Desktop Computer | No | Yes | No | No | Yes | No | Yes
Digital Camera, Desktop Computer, Smart TV, Smartwatch, Laptop | Yes | No | No | Yes | Yes | Yes | Yes
Tablet, Desktop Computer, Smart TV | No | Yes | No | Yes | Yes | No | No
Digital Camera, Desktop Computer, Smart TV, Smartphone, Laptop | No | No | Yes | Yes | Yes | Yes | Yes
# Any other labels could be used, for example
# Positive / Negative in the case of diseases.
df4 <- dta_mrq(
  dat = dat,
  .column = gadgets_owned,
  delimeter = ", ",
  labels = c("Positive", "Negative"),
  is_clean_names = TRUE
)
dta_gtable(df4)
gadgets_owned | Smartwatch | Tablet | Smartphone | Smart TV | Desktop Computer | Laptop | Digital Camera
Smartwatch, Tablet, Smartphone | Positive | Positive | Positive | Negative | Negative | Negative | Negative
Tablet, Smartwatch, Smart TV, Desktop Computer | Positive | Positive | Negative | Positive | Positive | Negative | Negative
Smartphone | Negative | Negative | Positive | Negative | Negative | Negative | Negative
Laptop, Tablet | Negative | Positive | Negative | Negative | Negative | Positive | Negative
Tablet, Smart TV, Digital Camera, Laptop | Negative | Positive | Negative | Positive | Negative | Positive | Positive
Laptop, Desktop Computer, Digital Camera, Smart TV, Smartphone | Negative | Negative | Positive | Positive | Positive | Positive | Positive
Digital Camera, Smartphone, Desktop Computer, Smartwatch | Positive | Negative | Positive | Negative | Positive | Negative | Positive
Smartwatch, Smart TV, Laptop, Smartphone | Positive | Negative | Positive | Positive | Negative | Positive | Negative
Desktop Computer | Negative | Negative | Negative | Negative | Positive | Negative | Negative
Smartphone, Laptop, Smart TV, Smartwatch | Positive | Negative | Positive | Positive | Negative | Positive | Negative
Tablet | Negative | Positive | Negative | Negative | Negative | Negative | Negative
Digital Camera, Tablet, Desktop Computer | Negative | Positive | Negative | Negative | Positive | Negative | Positive
Digital Camera, Desktop Computer, Smart TV, Smartwatch, Laptop | Positive | Negative | Negative | Positive | Positive | Positive | Positive
Tablet, Desktop Computer, Smart TV | Negative | Positive | Negative | Positive | Positive | Negative | Negative
Digital Camera, Desktop Computer, Smart TV, Smartphone, Laptop | Negative | Negative | Positive | Positive | Positive | Positive | Positive
# Use numeric values and specify a `prefix` for the
# column names.
df5 <- dta_mrq(
  dat = dat,
  .column = gadgets_owned,
  delimeter = ", ",
  prefix = "gad_",
  labels = c(1, 2),
  is_clean_names = TRUE
)
dta_gtable(df5)
gadgets_owned | Smartwatch | Tablet | Smartphone | Smart TV | Desktop Computer | Laptop | Digital Camera
Smartwatch, Tablet, Smartphone | 1 | 1 | 1 | 2 | 2 | 2 | 2
Tablet, Smartwatch, Smart TV, Desktop Computer | 1 | 1 | 2 | 1 | 1 | 2 | 2
Smartphone | 2 | 2 | 1 | 2 | 2 | 2 | 2
Laptop, Tablet | 2 | 1 | 2 | 2 | 2 | 1 | 2
Tablet, Smart TV, Digital Camera, Laptop | 2 | 1 | 2 | 1 | 2 | 1 | 1
Laptop, Desktop Computer, Digital Camera, Smart TV, Smartphone | 2 | 2 | 1 | 1 | 1 | 1 | 1
Digital Camera, Smartphone, Desktop Computer, Smartwatch | 1 | 2 | 1 | 2 | 1 | 2 | 1
Smartwatch, Smart TV, Laptop, Smartphone | 1 | 2 | 1 | 1 | 2 | 1 | 2
Desktop Computer | 2 | 2 | 2 | 2 | 1 | 2 | 2
Smartphone, Laptop, Smart TV, Smartwatch | 1 | 2 | 1 | 1 | 2 | 1 | 2
Tablet | 2 | 1 | 2 | 2 | 2 | 2 | 2
Digital Camera, Tablet, Desktop Computer | 2 | 1 | 2 | 2 | 1 | 2 | 1
Digital Camera, Desktop Computer, Smart TV, Smartwatch, Laptop | 1 | 2 | 2 | 1 | 1 | 1 | 1
Tablet, Desktop Computer, Smart TV | 2 | 1 | 2 | 1 | 1 | 2 | 2
Digital Camera, Desktop Computer, Smart TV, Smartphone, Laptop | 2 | 2 | 1 | 1 | 1 | 1 | 1