
dta_label() assigns variable labels to the columns of a data frame or tibble. It either takes a dictionary containing names and labels or takes vectors of names and labels directly. This function ensures that labels are applied correctly and handles missing or empty label names gracefully.

Usage

dta_label(dat, dict, .names, .labels)

Arguments

dat

A data frame or tibble to which the labels will be applied.

dict

A data frame or tibble (optional) containing two columns representing the variable names and their corresponding labels. If NULL, the labels will be generated from the .names and .labels arguments.

.names

A character vector of variable names. This is required if dict is not provided.

.labels

A character vector of variable labels corresponding to .names. This is required if dict is not provided.

Value

A tibble with the same structure as the input dat, but with the variable labels applied to the columns.

Details

If dict is provided, it should contain at least two columns that define the mapping between variable names and their labels. If dict is not provided, the function will directly use the .names and .labels arguments to assign labels to the variables.
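As a small illustration of the dictionary layout described above, the sketch below builds a two-column data frame by hand in base R. The column names `names` and `labels` mirror the `dict_labels` data set shown in the Examples; the object name `my_dict` and the sanity checks are assumptions made for illustration, not part of the package.

```r
# A hand-built dictionary: one row per variable to be labelled.
# Column names follow the `dict_labels` example data; `my_dict` is a
# hypothetical name used only for this sketch.
my_dict <- data.frame(
  names  = c("age", "height", "weight"),
  labels = c("Age in years", "Height in meters", "Weight in kilograms"),
  stringsAsFactors = FALSE
)

# Optional sanity checks before passing the dictionary as `dict`:
# at least two columns and no duplicated variable names.
stopifnot(ncol(my_dict) >= 2, !anyDuplicated(my_dict$names))
```

A data frame built this way could then be supplied as the dict argument in the same way dict_labels is used in the Examples.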

If the .names vector is missing or has missing entries, the function will attempt to use the column names from the dat object. If the length of .names and the number of columns in dat do not match, an error will be raised.
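The fallback described above can be sketched in base R. This is not the package's implementation: `label_sketch()` is a hypothetical helper, storing each label in the column's "label" attribute is an assumed convention, and the length check compares names with labels, which is only one reading of the rule. The sketch also handles only a NULL `.names`, not individual missing entries.

```r
# Hypothetical helper that mimics the documented fallback; it is NOT the
# code behind dta_label(). Labels are stored in each column's "label"
# attribute (an assumed storage convention).
label_sketch <- function(dat, .names = NULL, .labels) {
  # Fall back to the column names of `dat` when no names are supplied.
  if (is.null(.names)) {
    .names <- names(dat)
  }
  # Names and labels must pair up one-to-one.
  if (length(.names) != length(.labels)) {
    stop("`.names` and `.labels` must have the same length.")
  }
  # Attach each label to its column.
  for (i in seq_along(.names)) {
    attr(dat[[.names[i]]], "label") <- .labels[i]
  }
  dat
}

dat <- data.frame(age = c(25, 30), income = c(50000, 60000))

# No names given: labels are matched to the columns by position.
out <- label_sketch(dat, .labels = c("Age in years", "Annual income"))
attr(out$age, "label")

# A label vector of the wrong length raises an error, captured here
# as a message string instead of stopping the script.
res <- tryCatch(
  label_sketch(dat, .labels = "Age in years"),
  error = function(e) conditionMessage(e)
)
```

Failing early on a length mismatch avoids silently recycling or dropping labels, which is presumably why the function raises an error in that case.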

Examples

# Using vectors of names and labels

dat <- data.frame(
  age = c(25, 30, 35, 40),
  gender = c("Male", "Female", "Female", "Male"),
  income = c(50000, 60000, 55000, 65000)
)

names <- c("age", "income")
labels <- c("Age in years", "Annual income")

result <- dta_label(
  dat, dict = NULL, .names = names, .labels = labels
)

dta_gtable(result)
Age in years  gender  Annual income
25            Male    50000
30            Female  60000
35            Female  55000
40            Male    65000

# Using a dictionary data frame

data("data_bmi")

dta_gtable(head(data_bmi))
id        age  height  weight
STM/4921  50   1.64    59
STM/4396  34   1.98    57
STM/7908  50   1.95    84
STM/7243  39   1.52    63
STM/4801  52   1.69    65
STM/5134  50   1.71    73

data("dict_labels")

dta_gtable(dict_labels)
names   labels
age     Age in years
height  Height in meters
weight  Weight in kilograms

result2 <- dta_label(
  dat = data_bmi, dict = dict_labels, .names = names, .labels = labels
)

# Providing only the `.labels` argument will label all
# variables in `dat`. In such a case, the length of
# `.labels` must be equal to the number of columns in
# `dat`.
labels <- c("Unique identifier", dict_labels$labels)

result3 <- dta_label(
  data_bmi, dict = NULL, .names = NULL, .labels = labels
)