Using nrow(subset(x, condition)) to count the rows where condition holds is inefficient: it builds a full subset of x only to count the rows of the result. Several equivalent expressions avoid the intermediate subset, e.g. with(x, sum(condition)) (or, more generally, with(x, sum(condition, na.rm = TRUE)) when condition may contain missing values).
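
As a minimal sketch of the equivalence described above, the snippet below compares the two approaches on a small made-up data.frame. The object `x`, its columns, and the variable names `n_subset`, `n_sum`, and `n_sum_na` are hypothetical and chosen only for illustration; the sketch relies on base R's subset() dropping rows where the condition is NA.

```r
# Hypothetical data: `x`, `id`, and `is_treatment` are illustrative names only.
x <- data.frame(
  id = 1:6,
  is_treatment = c(TRUE, FALSE, TRUE, NA, TRUE, FALSE)
)

# Flagged pattern: builds an intermediate data.frame just to count its rows.
# subset() drops rows where the condition evaluates to NA.
n_subset <- nrow(subset(x, is_treatment))

# Preferred pattern: counts TRUE values directly.
# na.rm = TRUE mirrors subset()'s treatment of missing values.
n_sum <- with(x, sum(is_treatment, na.rm = TRUE))

# Without na.rm = TRUE, a missing value propagates into the count.
n_sum_na <- with(x, sum(is_treatment))
```

Here `n_subset` and `n_sum` agree, while `n_sum_na` is NA, which is why the na.rm = TRUE form is the safer general replacement.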

Usage

nrow_subset_linter()

See also

linters for a complete list of linters available in lintr.

Examples

# will produce lints
lint(
  text = "nrow(subset(x, is_treatment))",
  linters = nrow_subset_linter()
)
#> ::warning file=<text>,line=1,col=1::file=<text>,line=1,col=1,[nrow_subset_linter] Use arithmetic to count the number of rows satisfying a condition, rather than fully subsetting the data.frame and counting the resulting rows. For example, replace nrow(subset(x, is_treatment)) with sum(x$is_treatment). NB: use na.rm = TRUE if `is_treatment` has missing values.

lint(
  text = "nrow(filter(x, is_treatment))",
  linters = nrow_subset_linter()
)
#> ::warning file=<text>,line=1,col=1::file=<text>,line=1,col=1,[nrow_subset_linter] Use arithmetic to count the number of rows satisfying a condition, rather than fully subsetting the data.frame and counting the resulting rows. For example, replace nrow(subset(x, is_treatment)) with sum(x$is_treatment). NB: use na.rm = TRUE if `is_treatment` has missing values.

lint(
  text = "x %>% filter(is_treatment) %>% nrow()",
  linters = nrow_subset_linter()
)
#> ::warning file=<text>,line=1,col=1::file=<text>,line=1,col=1,[nrow_subset_linter] Use arithmetic to count the number of rows satisfying a condition, rather than fully subsetting the data.frame and counting the resulting rows. For example, replace nrow(subset(x, is_treatment)) with sum(x$is_treatment). NB: use na.rm = TRUE if `is_treatment` has missing values.

# okay
lint(
  text = "with(x, sum(is_treatment, na.rm = TRUE))",
  linters = nrow_subset_linter()
)