With yaml_write() we can take different pointblank objects (these are
the ptblank_agent, ptblank_informant, and tbl_store) and write them to
YAML. With an agent, for example, yaml_write() will write everything
that is needed to specify an agent and its validation plan to a YAML file.
Once written, we can modify the YAML markup if so desired, or use it as is to
create a new agent with the yaml_read_agent() function. That agent will
have a validation plan and be ready to interrogate() the data. We can go a
step further and perform an interrogation directly from the YAML file with
the yaml_agent_interrogate() function. That returns an agent with intel
(having already interrogated the target data table). An informant object
can also be written to YAML with yaml_write().
One requirement for writing an agent or an informant to YAML is that we
need to have a table-prep formula specified (it's an R formula that is used
to read the target table when interrogate() or incorporate() is called).
This option can be set when using create_agent()/create_informant() or
with set_tbl() (useful with an existing agent or informant object).
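As a minimal sketch of the second route, an agent that was created with the table passed directly can be given a table-prep formula afterwards with set_tbl(), and only then written to YAML. The agent, its single validation step, and the use of tempdir() as the output location are illustrative assumptions, not part of the examples further down.

```r
library(pointblank)

# An agent created with the table passed directly (no table-prep formula yet)
agent <-
  create_agent(
    tbl = small_table,
    tbl_name = "small_table",
    label = "Agent created without a table-prep formula."
  ) %>%
  col_vals_gt(columns = d, value = 100)

# Swap in a table-prep formula so the agent can be written to YAML
agent <-
  agent %>%
  set_tbl(tbl = ~ small_table, tbl_name = "small_table")

# Write to a temporary directory (an illustrative location)
yaml_write(
  agent,
  filename = "agent-small_table.yml",
  path = tempdir(),
  quiet = TRUE
)
```

Supplying the formula at creation time with create_agent(tbl = ~ small_table, ...) avoids the extra set_tbl() step; the sketch above is mainly relevant for agents that already exist.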
Usage
yaml_write(
  ...,
  .list = list2(...),
  filename = NULL,
  path = NULL,
  expanded = FALSE,
  quiet = FALSE
)
Arguments
- ...
  Pointblank agents, informants, table stores
  <series of obj:<ptblank_agent|ptblank_informant|tbl_store>> // required
  Any mix of pointblank objects such as the agent (ptblank_agent), the informant (ptblank_informant), or the table store (tbl_store). The agent and informant can be combined into a single YAML file (so long as both objects refer to the same table). A table store cannot be combined with either an agent or an informant so it must undergo conversion alone.
- .list
  Alternative to ...
  <list of multiple expressions> // required (or, use ...)
  Allows for the use of a list as an input alternative to ....
- filename
  File name
  scalar<character> // default: NULL (optional)
  The name of the YAML file to create on disk. It is recommended that either the .yaml or .yml extension be used for this file. If not provided then default names will be used ("tbl_store.yml" for a table store, and the other objects will get default naming to the effect of "<object>-<tbl_name>.yml").
- path
  File path
  scalar<character> // default: NULL (optional)
  An optional path to which the YAML file should be saved (combined with filename).
- expanded
  Expand validation when repeating across multiple columns
  scalar<logical> // default: FALSE
  Should the written validation expressions for an agent be expanded such that tidyselect expressions for columns are evaluated, yielding a validation function per column? By default, this is FALSE so expressions as written will be retained in the YAML representation.
- quiet
  Inform (or not) upon file writing
  scalar<logical> // default: FALSE
  Should the function not inform when the file is written?
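To make the path, expanded, and quiet arguments more concrete, here is a small sketch that writes the same agent twice: once with the column expression kept as written and once with expanded = TRUE. The file names, the tempdir() location, and the single validation step are assumptions chosen for illustration.

```r
library(pointblank)

# A one-step agent whose step targets two columns at once
agent <-
  create_agent(
    tbl = ~ small_table,
    tbl_name = "small_table"
  ) %>%
  col_vals_not_null(columns = c(a, b))

# Illustrative output directory; any writable directory would do
out_dir <- tempdir()

# Column expression retained as written (the default, expanded = FALSE)
yaml_write(
  agent,
  filename = "agent-compact.yml",
  path = out_dir,
  quiet = TRUE
)

# Column expression evaluated, giving one validation function per column
yaml_write(
  agent,
  filename = "agent-expanded.yml",
  path = out_dir,
  expanded = TRUE,
  quiet = TRUE
)

# Print the expanded file to the console for comparison
yaml_agent_string(filename = file.path(out_dir, "agent-expanded.yml"))
```

Setting quiet = TRUE only suppresses the message about the file being written; the files are created either way.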
Examples
Writing an agent object to a YAML file
Let's go through the process of developing an agent with a validation plan.
We'll use the small_table dataset in the following examples and
eventually offload the developed validation plan to a YAML file.
small_table
#> # A tibble: 13 x 8
#>    date_time           date           a b             c      d e     f
#>    <dttm>              <date>     <int> <chr>     <dbl>  <dbl> <lgl> <chr>
#>  1 2016-01-04 11:00:00 2016-01-04     2 1-bcd-345     3  3423. TRUE  high
#>  2 2016-01-04 00:32:00 2016-01-04     3 5-egh-163     8 10000. TRUE  low
#>  3 2016-01-05 13:32:00 2016-01-05     6 8-kdg-938     3  2343. TRUE  high
#>  4 2016-01-06 17:23:00 2016-01-06     2 5-jdo-903    NA  3892. FALSE mid
#>  5 2016-01-09 12:36:00 2016-01-09     8 3-ldm-038     7   284. TRUE  low
#>  6 2016-01-11 06:15:00 2016-01-11     4 2-dhe-923     4  3291. TRUE  mid
#>  7 2016-01-15 18:46:00 2016-01-15     7 1-knw-093     3   843. TRUE  high
#>  8 2016-01-17 11:27:00 2016-01-17     4 5-boe-639     2  1036. FALSE low
#>  9 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high
#> 10 2016-01-20 04:30:00 2016-01-20     3 5-bce-642     9   838. FALSE high
#> 11 2016-01-26 20:07:00 2016-01-26     4 2-dmx-010     7   834. TRUE  low
#> 12 2016-01-28 02:51:00 2016-01-28     2 7-dmx-010     8   108. FALSE low
#> 13 2016-01-30 11:23:00 2016-01-30     1 3-dka-303    NA  2230. TRUE  high
Creating an action_levels object is a common workflow step when creating a
pointblank agent. We designate failure thresholds to the warn, stop,
and notify states using action_levels().
al <-
  action_levels(
    warn_at = 0.10,
    stop_at = 0.25,
    notify_at = 0.35
  )
Now let's create the agent and pass it the al object (which serves as a
default for all validation steps and can be overridden). The data will be
referenced in tbl with a leading ~ and this is a requirement for writing
to YAML since the preparation of the target table must be self-contained.
agent <-
  create_agent(
    tbl = ~ small_table,
    tbl_name = "small_table",
    label = "A simple example with the `small_table`.",
    actions = al
  )
Then, as with any agent object, we can add steps to the validation plan by
using as many validation functions as we want.
agent <-
  agent %>%
  col_exists(columns = c(date, date_time)) %>%
  col_vals_regex(
    columns = b,
    regex = "[0-9]-[a-z]{3}-[0-9]{3}"
  ) %>%
  rows_distinct() %>%
  col_vals_gt(columns = d, value = 100) %>%
  col_vals_lte(columns = c, value = 5)
The agent can be written to a pointblank-readable YAML file with the
yaml_write() function. Here, we'll use the filename
"agent-small_table.yml" and, after writing, the YAML file will be in the
working directory:
yaml_write(agent, filename = "agent-small_table.yml")
We can view the YAML file in the console with the yaml_agent_string()
function.
yaml_agent_string(filename = "agent-small_table.yml")
type: agent
tbl: ~small_table
tbl_name: small_table
label: A simple example with the `small_table`.
lang: en
locale: en
actions:
  warn_fraction: 0.1
  stop_fraction: 0.25
  notify_fraction: 0.35
steps:
- col_exists:
    columns: c(date, date_time)
- col_vals_regex:
    columns: c(b)
    regex: '[0-9]-[a-z]{3}-[0-9]{3}'
- rows_distinct:
    columns: ~
- col_vals_gt:
    columns: c(d)
    value: 100.0
- col_vals_lte:
    columns: c(c)
    value: 5.0
Incidentally, we can also use yaml_agent_string() to print YAML in the
console when supplying an agent as the input. This can be useful for
previewing YAML output just before writing it to disk with yaml_write().
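A brief sketch of that preview step follows. The agent is rebuilt here with a single step so that the snippet stands alone; that shortened agent is an assumption for illustration rather than the full plan developed above.

```r
library(pointblank)

# A shortened stand-in for the agent developed above
agent <-
  create_agent(
    tbl = ~ small_table,
    tbl_name = "small_table"
  ) %>%
  col_vals_gt(columns = d, value = 100)

# Print the YAML representation of the agent; nothing is written to disk
yaml_agent_string(agent = agent)
```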
Reading an agent object from a YAML file
There's a YAML file available in the pointblank package that's also
called "agent-small_table.yml". The path for it can be accessed through
system.file():
yml_file_path <-
  system.file(
    "yaml", "agent-small_table.yml",
    package = "pointblank"
  )
The YAML file can be read as an agent with a pre-existing validation plan by
using the yaml_read_agent() function.
agent <- yaml_read_agent(filename = yml_file_path)
agent
This particular agent is using ~ tbl_source("small_table", "tbl_store.yml")
to source the table-prep from a YAML file that holds a table store (can be
seen using yaml_agent_string(agent = agent)). Let's put that file in the
working directory (the pointblank package has the corresponding YAML
file):
yml_tbl_store_path <-
  system.file(
    "yaml", "tbl_store.yml",
    package = "pointblank"
  )
file.copy(from = yml_tbl_store_path, to = ".")
As can be seen from the validation report, no interrogation was yet
performed. Saving an agent to YAML will remove any traces of interrogation
data and serve as a plan for a new interrogation on the same target table. We
can either follow this up with interrogate() and get an agent with
intel, or we can interrogate directly from the YAML file with
yaml_agent_interrogate():
agent <- yaml_agent_interrogate(filename = yml_file_path)
agent
Writing an informant object to a YAML file
Let's walk through how we can generate some useful information for a really
small table. We can create an informant object with create_informant()
and we'll again use the small_table dataset.
informant <-
  create_informant(
    tbl = ~ small_table,
    tbl_name = "small_table",
    label = "A simple example with the `small_table`."
  )
Then, as with any informant object, we can add info text by
using as many info_*() functions as we want.
informant <-
  informant %>%
  info_columns(
    columns = a,
    info = "In the range of 1 to 10. (SIMPLE)"
  ) %>%
  info_columns(
    columns = starts_with("date"),
    info = "Time-based values (e.g., `Sys.time()`)."
  ) %>%
  info_columns(
    columns = date,
    info = "The date part of `date_time`. (CALC)"
  )
The informant can be written to a pointblank-readable YAML file with the
yaml_write() function. Here, we'll use the filename
"informant-small_table.yml" and, after writing, the YAML file will be in
the working directory:
yaml_write(informant, filename = "informant-small_table.yml")
We can inspect the YAML file in the working directory and expect to see the following:
type: informant
tbl: ~small_table
tbl_name: small_table
info_label: A simple example with the `small_table`.
lang: en
locale: en
table:
  name: small_table
  _columns: 8
  _rows: 13.0
  _type: tbl_df
columns:
  date_time:
    _type: POSIXct, POSIXt
    info: Time-based values (e.g., `Sys.time()`).
  date:
    _type: Date
    info: Time-based values (e.g., `Sys.time()`). The date part of `date_time`.
  a:
    _type: integer
    info: In the range of 1 to 10. (SIMPLE)
  b:
    _type: character
  c:
    _type: numeric
  d:
    _type: numeric
  e:
    _type: logical
  f:
    _type: character
Reading an informant object from a YAML file
There's a YAML file available in the pointblank package that's also
called "informant-small_table.yml". The path for it can be accessed through
system.file():
yml_file_path <-
  system.file(
    "yaml", "informant-small_table.yml",
    package = "pointblank"
  )
The YAML file can be read as an informant by using the
yaml_read_informant() function.
informant <- yaml_read_informant(filename = yml_file_path)
informant
As can be seen from the information report, the available table metadata was
restored and reported. If you expect metadata to change with time, it might
be beneficial to use incorporate() to query the target table. Or, we can
perform this querying directly from the YAML file with
yaml_informant_incorporate():
informant <- yaml_informant_incorporate(filename = yml_file_path)
There will be no apparent difference in this particular case since
small_table is a static table with no alterations over time. However,
using yaml_informant_incorporate() is good practice since this refreshing
of data will be important with real-world datasets.
See also
Other pointblank YAML:
yaml_agent_interrogate(),
yaml_agent_show_exprs(),
yaml_agent_string(),
yaml_exec(),
yaml_informant_incorporate(),
yaml_read_agent(),
yaml_read_informant()