Table Transformer: obtain a summary table for string columns
Source:R/table_transformers.R
tt_string_info.Rd
With any table object, you can produce a summary table that is scoped to
string-based columns. The output summary table will have a leading column
called ".param."
with labels for each of the three rows, each corresponding
to the following pieces of information pertaining to string length:
Mean String Length (
"length_mean"
)Minimum String Length (
"length_min"
)Maximum String Length (
"length_max"
)
Only string data from the input table will generate columns in the output table. Column names from the input will be used in the output, preserving order as well.
Arguments
- tbl
A data table
obj:<tbl_*>
// requiredA table object to be used as input for the transformation. This can be a data frame, a tibble, a
tbl_dbi
object, or atbl_spark
object.
Examples
Get string information for the string-based columns in the game_revenue
dataset that is included in the pointblank package.
tt_string_info(tbl = game_revenue)
#> # A tibble: 3 x 7
#> .param. player_id session_id item_type item_name acquisition country
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 length_mean 15 24 2.22 7.35 7.97 8.53
#> 2 length_min 15 24 2 5 5 5
#> 3 length_max 15 24 3 11 14 14
Ensure that player_id
and session_id
values always have the same fixed
numbers of characters (15
and 24
, respectively) throughout the table.
tt_string_info(tbl = game_revenue) %>%
col_vals_equal(
columns = player_id,
value = 15
) %>%
col_vals_equal(
columns = session_id,
value = 24
)
#> # A tibble: 3 x 7
#> .param. player_id session_id item_type item_name acquisition country
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 length_mean 15 24 2.22 7.35 7.97 8.53
#> 2 length_min 15 24 2 5 5 5
#> 3 length_max 15 24 3 11 14 14
We see data, and not an error, so both validations were successful!
Let's use a tt_string_info()
-transformed table with the
test_col_vals_lte()
to check that the maximum string length in column f
of the small_table
dataset is no greater than 4
.
tt_string_info(tbl = small_table) %>%
test_col_vals_lte(
columns = f,
value = 4
)
#> [1] TRUE
See also
Other Table Transformers:
get_tt_param()
,
tt_summary_stats()
,
tt_tbl_colnames()
,
tt_tbl_dims()
,
tt_time_shift()
,
tt_time_slice()