
dta_crosstab() creates a cross-tabulation (contingency table) for a given dataset, with options to include counts, row percentages, or column percentages. Totals can also be added to rows, columns, or both. The output is styled using the janitor package.

Usage

dta_crosstab(
  dat,
  .row,
  .column,
  cells = c("counts", "row", "col"),
  add_totals = c("both", "row", "col"),
  name = "Variable",
  add_percent_symbol = TRUE,
  digits = 2
)

Arguments

dat

A data frame (not a tibble) containing the variables to tabulate.

.row

The variable to be used as rows in the crosstab.

.column

The variable to be used as columns in the crosstab.

cells

A character string indicating the type of values to display in the crosstab:

"counts"

Display counts (default).

"row"

Display row percentages.

"col"

Display column percentages.

add_totals

A character string specifying where to add totals:

"both"

Add totals to both rows and columns (default).

"row"

Add a total row only (a "Total" row at the bottom of the table).

"col"

Add a total column only (a "Total" column at the right of the table).

name

A character string to rename the first column in the output. Default is "Variable".

add_percent_symbol

Logical indicating whether or not to add a % sign to percentages. Default is TRUE.

digits

An integer specifying the number of decimal places to use for percentages. Default is 2.
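
To make the cells and add_totals options concrete, the following base R sketch computes the same kinds of quantities by hand. It is only an illustration under stated assumptions: the toy data frame is hypothetical (it is not data_sample), and table(), prop.table(), and addmargins() stand in for whatever dta_crosstab() does internally. The mapping of "row" to a total row and "col" to a total column is inferred from the example output on this page.

```r
# Hypothetical toy data; not the package's data_sample
toy <- data.frame(
  region    = c("North", "North", "South", "South", "South", "West"),
  age_group = c("20-29", "30-39", "20-29", "20-29", "30-39", "30-39")
)

# cells = "counts": raw frequencies
counts <- table(toy$region, toy$age_group)

# cells = "row": each row sums to 100
row_pct <- prop.table(counts, margin = 1) * 100

# cells = "col": each column sums to 100
col_pct <- prop.table(counts, margin = 2) * 100

# Totals: margin = 1 appends a "Sum" row, margin = 2 a "Sum" column
with_total_row <- addmargins(counts, margin = 1)
with_total_col <- addmargins(counts, margin = 2)
with_both      <- addmargins(counts)

print(with_both)
```

Row percentages answer "how is each region split across age groups?", while column percentages answer "how is each age group split across regions?".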

Value

A data frame representing the cross-tabulation, styled using janitor functions. The format of the output depends on the cells argument:

Counts

Displays raw counts.

Percentages

Displays counts with the corresponding percentages in parentheses, e.g. 33 (13.64%), if cells is "row" or "col".
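
As a rough illustration of how digits and add_percent_symbol shape a percentage cell, the sketch below builds strings in the same form as the example output on this page. format_cell() is a hypothetical helper written for this illustration; it is not part of the package and may differ from the actual implementation.

```r
# Hypothetical helper, not part of the package: builds a "count (pct)" cell
format_cell <- function(n, pct, digits = 2, add_percent_symbol = TRUE) {
  # Fixed-point formatting with the requested number of decimal places
  pct_txt <- formatC(pct, format = "f", digits = digits)
  if (add_percent_symbol) {
    pct_txt <- paste0(pct_txt, "%")
  }
  paste0(n, " (", pct_txt, ")")
}

# 33 of 242 respondents aged 20-29 are in Central (column percentage)
format_cell(33, 100 * 33 / 242)
# "33 (13.64%)"

# 33 of 356 Central respondents are aged 20-29 (row percentage), no symbol
format_cell(33, 100 * 33 / 356, add_percent_symbol = FALSE)
# "33 (9.27)"
```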

Examples

data("data_sample")
df <- data_sample

# Crosstabulation of frequencies (counts)

result <- dta_crosstab(
  dat = df, .row = region, .column = age_group
)
dta_gtable(result)
Variable    20-29  30-39  40-49  50-59  60-69  70+  Total
Central        33     68    100     80     45   30    356
North East     65    133    199    174    127   66    764
South          94    151    177    206    154   69    851
West           50     93    117    128     96   45    529
Total         242    445    593    588    422  210   2500

# Calculate column percentages

result2 <- dta_crosstab(
  dat = df, .row = region, .column = age_group,
  cells = "col", add_totals = "col"
)
dta_gtable(result2)
region/age_group  20-29        30-39         40-49         50-59         60-69         70+          Total
Central           33 (13.64%)  68 (15.28%)   100 (16.86%)  80 (13.61%)   45 (10.66%)   30 (14.29%)  356 (14.24%)
North East        65 (26.86%)  133 (29.89%)  199 (33.56%)  174 (29.59%)  127 (30.09%)  66 (31.43%)  764 (30.56%)
South             94 (38.84%)  151 (33.93%)  177 (29.85%)  206 (35.03%)  154 (36.49%)  69 (32.86%)  851 (34.04%)
West              50 (20.66%)  93 (20.90%)   117 (19.73%)  128 (21.77%)  96 (22.75%)   45 (21.43%)  529 (21.16%)

# Calculate row percentages

result3 <- dta_crosstab(
  dat = df, .row = region, .column = age_group,
  cells = "row", add_totals = "row"
)
dta_gtable(result3)
region/age_group  20-29        30-39         40-49         50-59         60-69         70+
Central           33 (9.27%)   68 (19.10%)   100 (28.09%)  80 (22.47%)   45 (12.64%)   30 (8.43%)
North East        65 (8.51%)   133 (17.41%)  199 (26.05%)  174 (22.77%)  127 (16.62%)  66 (8.64%)
South             94 (11.05%)  151 (17.74%)  177 (20.80%)  206 (24.21%)  154 (18.10%)  69 (8.11%)
West              50 (9.45%)   93 (17.58%)   117 (22.12%)  128 (24.20%)  96 (18.15%)   45 (8.51%)
Total             242 (9.68%)  445 (17.80%)  593 (23.72%)  588 (23.52%)  422 (16.88%)  210 (8.40%)

# Remove the percent symbol

result4 <- dta_crosstab(
  dat = df, .row = region, .column = age_group,
  cells = "row", add_totals = "row",
  add_percent_symbol = FALSE
)
dta_gtable(result4)
region/age_group  20-29       30-39        40-49        50-59        60-69        70+
Central           33 (9.27)   68 (19.10)   100 (28.09)  80 (22.47)   45 (12.64)   30 (8.43)
North East        65 (8.51)   133 (17.41)  199 (26.05)  174 (22.77)  127 (16.62)  66 (8.64)
South             94 (11.05)  151 (17.74)  177 (20.80)  206 (24.21)  154 (18.10)  69 (8.11)
West              50 (9.45)   93 (17.58)   117 (22.12)  128 (24.20)  96 (18.15)   45 (8.51)
Total             242 (9.68)  445 (17.80)  593 (23.72)  588 (23.52)  422 (16.88)  210 (8.40)