Table Transformer: slice a table with a slice point on a time column
Source: R/table_transformers.R (tt_time_slice.Rd)
With any table object containing date columns, date-time columns, or a
mixture thereof, any one of those columns can be used to slice the data table
in two at a slice_point, and you get to choose which of those slices to keep.
The slice point can be defined in several ways. One method uses a decimal
value between 0 and 1, which places the slice point at a time instant between
the earliest time value (at 0) and the latest time value (at 1). Another way
is to supply a time value; the following input types are accepted: (1) an
ISO 8601 formatted time string (as a date or a date-time), (2) a POSIXct
time, or (3) a Date object.
Usage
tt_time_slice(
  tbl,
  time_column = NULL,
  slice_point = 0,
  keep = c("left", "right"),
  arrange = FALSE
)
Arguments
- tbl
A data table
obj:<tbl_*> // required
A table object to be used as input for the transformation. This can be a data frame, a tibble, a tbl_dbi object, or a tbl_spark object.
- time_column
Column with time data
scalar<character> // default: NULL (optional)
The time-based column that will be used as a basis for the slicing. If no time column is provided then the first one found will be used.
- slice_point
scalar<numeric|character|POSIXct|Date> // default: 0
The location on the time_column where the slicing will occur. This can either be a decimal value from 0 to 1, an ISO 8601 formatted time string (as a date or a date-time), a POSIXct time, or a Date object.
- keep
Data slice to keep
singl-kw:[left|right] // default: "left"
Which slice should be kept? The "left" side (the default) contains data rows that are earlier than the slice_point and the "right" side will have rows that are later.
- arrange
Arrange data slice by the time data?
scalar<logical> // default: FALSE
Should the slice be arranged by the time_column? This may be useful if the input tbl isn't ordered by the time_column. By default, this is FALSE and the original ordering is retained.
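As a rough illustration of how the arguments above combine, the following sketch keeps the later slice of small_table (a dataset shipped with pointblank) and sorts it by the time column. The cutoff "2016-01-15" and the object name later_rows are arbitrary choices made for this example, not values taken from the package documentation.

```r
library(pointblank)

# Keep the rows on the "right" (later) side of an ISO 8601 date string
# and sort the result in ascending order of `date_time`.
# The cutoff date here is an arbitrary value picked for illustration.
later_rows <- tt_time_slice(
  tbl = small_table,
  time_column = "date_time",
  slice_point = "2016-01-15",
  keep = "right",
  arrange = TRUE
)

later_rows
```

Naming time_column explicitly avoids relying on the "first one found" fallback, which matters for small_table because it has both a date_time and a date column.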
Value
A data frame, a tibble, a tbl_dbi object, or a tbl_spark object depending on
what was provided as tbl.
Details
There is the option to arrange the table by the date or date-time values in
the time_column. This ordering is always done in an ascending manner. Any
NA/NULL values in the time_column will result in the corresponding rows
being removed (no matter which slice is retained).
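The behaviour described above can be mimicked in base R for a plain data frame. This is only a sketch of the documented semantics, not pointblank's implementation: the helper slice_by_time() and the events data frame are hypothetical, and sending rows that fall exactly on the slice point to the "right" slice is an assumption (the text only says "earlier" and "later").

```r
# Hypothetical helper that mirrors the documented behaviour in base R;
# it is NOT pointblank's implementation.
slice_by_time <- function(df, time_column, slice_point,
                          keep = c("left", "right"), arrange = FALSE) {
  keep <- match.arg(keep)

  # Rows with a missing time value are dropped whichever slice is kept
  df <- df[!is.na(df[[time_column]]), , drop = FALSE]
  times <- df[[time_column]]

  # "left" keeps earlier rows; "right" keeps later rows.
  # Assumption: a row exactly at the slice point goes to the right slice.
  keep_row <- if (keep == "left") times < slice_point else times >= slice_point
  df <- df[keep_row, , drop = FALSE]

  # Optional sort, always ascending on the time column
  if (arrange) {
    df <- df[order(df[[time_column]]), , drop = FALSE]
  }
  df
}

# Made-up data: unordered dates with one missing value
events <- data.frame(
  id = 1:5,
  when = as.Date(c("2020-01-05", NA, "2020-01-01", "2020-01-09", "2020-01-03"))
)

left <- slice_by_time(events, "when", as.Date("2020-01-04"),
                      keep = "left", arrange = TRUE)
right <- slice_by_time(events, "when", as.Date("2020-01-04"), keep = "right")

left$id   # 3 5 (sorted by date, NA row gone)
right$id  # 1 4 (original order retained)
```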
Examples
Let's use the game_revenue dataset, included in the pointblank package, as
the input table for the first demo. It has entries in the first 21 days of
2015 and we'll elect to get all of the records where the time values are
strictly for the first 15 days of 2015. The keep argument has a default of
"left" so all rows where the time column is less than "2015-01-16 00:00:00"
will be kept.
tt_time_slice(
  tbl = game_revenue,
  time_column = "time",
  slice_point = "2015-01-16"
)
#> # A tibble: 1,208 x 11
#> player_id session_id session_start time item_type
#> <chr> <chr> <dttm> <dttm> <chr>
#> 1 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 01:31:03 2015-01-01 01:31:27 iap
#> 2 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 01:31:03 2015-01-01 01:36:57 iap
#> 3 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 01:31:03 2015-01-01 01:37:45 iap
#> 4 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 01:31:03 2015-01-01 01:42:33 ad
#> 5 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 11:50:02 2015-01-01 11:55:20 ad
#> 6 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 11:50:02 2015-01-01 12:08:56 ad
#> 7 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 11:50:02 2015-01-01 12:14:08 ad
#> 8 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 11:50:02 2015-01-01 12:21:44 ad
#> 9 ECPANOIXLZHF896 ECPANOIXLZ~ 2015-01-01 11:50:02 2015-01-01 12:24:20 ad
#> 10 FXWUORGYNJAE271 FXWUORGYNJ~ 2015-01-01 15:17:18 2015-01-01 15:19:36 ad
#> # i 1,198 more rows
#> # i 6 more variables: item_name <chr>, item_revenue <dbl>,
#> # session_duration <dbl>, start_day <date>, acquisition <chr>, country <chr>
Omit the first 25% of records from small_table, also included in the package,
with a fractional slice_point of 0.25 on the basis of a timeline that begins
at 2016-01-04 11:00:00 and ends at 2016-01-30 11:23:00.
small_table %>%
  tt_time_slice(
    slice_point = 0.25,
    keep = "right"
  )
#> # A tibble: 8 x 8
#> date_time date a b c d e f
#> <dttm> <date> <int> <chr> <dbl> <dbl> <lgl> <chr>
#> 1 2016-01-11 06:15:00 2016-01-11 4 2-dhe-923 4 3291. TRUE mid
#> 2 2016-01-15 18:46:00 2016-01-15 7 1-knw-093 3 843. TRUE high
#> 3 2016-01-17 11:27:00 2016-01-17 4 5-boe-639 2 1036. FALSE low
#> 4 2016-01-20 04:30:00 2016-01-20 3 5-bce-642 9 838. FALSE high
#> 5 2016-01-20 04:30:00 2016-01-20 3 5-bce-642 9 838. FALSE high
#> 6 2016-01-26 20:07:00 2016-01-26 4 2-dmx-010 7 834. TRUE low
#> 7 2016-01-28 02:51:00 2016-01-28 2 7-dmx-010 8 108. FALSE low
#> 8 2016-01-30 11:23:00 2016-01-30 1 3-dka-303 NA 2230. TRUE high
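To see where a fractional slice_point of 0.25 lands on that timeline, the instant can be worked out by hand in base R. This is a sketch that assumes simple linear interpolation between the earliest and latest time values, as the description suggests; the UTC time zone and the variable names are assumptions made for the example.

```r
# Timeline endpoints quoted in the example text (time zone assumed to be UTC)
earliest <- as.POSIXct("2016-01-04 11:00:00", tz = "UTC")
latest   <- as.POSIXct("2016-01-30 11:23:00", tz = "UTC")

# Length of the timeline in seconds
span_secs <- as.numeric(difftime(latest, earliest, units = "secs"))

# A fraction of 0.25 sits a quarter of the way along the timeline
slice_instant <- earliest + 0.25 * span_secs

slice_label <- format(slice_instant, "%Y-%m-%d %H:%M:%S")
print(slice_label)
#> [1] "2016-01-10 23:05:45"
```

Under that assumption the slice point falls late on 2016-01-10, which is consistent with the first retained row above having a date_time of 2016-01-11 06:15:00.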
See also
Other Table Transformers: get_tt_param(), tt_string_info(),
tt_summary_stats(), tt_tbl_colnames(), tt_tbl_dims(), tt_time_shift()