Create consistent metadata for pins

The metadata argument in pins is flexible and can hold any kind of metadata that you can formulate as a dict (convertible to JSON). In some situations, you may want to read and write pins with consistent, customized metadata; to do so, you can create functions that wrap pin_write and pin_read for your particular use case.
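
Because the metadata has to be convertible to JSON, it can help to check a dict before handing it to a wrapper. The following is a minimal sketch using only the standard library; check_metadata is a hypothetical helper for illustration, not part of pins, and the round-trip comparison is an assumption about what "consistent" should mean here.

```python
import json


def check_metadata(metadata: dict) -> dict:
    """Return metadata unchanged if it survives a JSON round trip.

    Hypothetical helper for illustration; not part of the pins API.
    """
    try:
        round_tripped = json.loads(json.dumps(metadata))
    except TypeError as err:
        # Raised for values JSON cannot represent, such as sets
        raise TypeError(f"metadata is not JSON-serializable: {err}") from err
    if round_tripped != metadata:
        # Catches silent changes, e.g. tuples becoming lists
        raise ValueError("metadata changes when round-tripped through JSON")
    return metadata
```

A wrapper around pin_write could call a check like this on its metadata dict first, so that problems surface before anything is written to the board.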

We’ll begin by creating a temporary board for demonstration:

import pins
import pandas as pd

from pprint import pprint

board = pins.board_temp()

A function to store pandas Categoricals

Say you want to store a pandas Categorical object as JSON together with the categories of the categorical in the metadata.

For example, here is a simple categorical and its categories:

some_cat = pd.Categorical(["a", "a", "b"])

some_cat.categories
Index(['a', 'b'], dtype='object')

Notice that, in this example, the categories attribute is just the unique values in the categorical.

We can write a function wrapping pin_write that holds the categories in metadata, so we can easily re-create the categorical with them.

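def pin_write_cat_json(
    board,
    x: pd.Categorical,
    name,
    **kwargs
):
    # Keep the categories in metadata so the categorical can be rebuilt on read
    metadata = {"categories": x.categories.tolist()}
    # Store the values themselves as a plain list, which serializes to JSON
    json_data = x.tolist()
    board.pin_write(json_data, name=name, type="json", metadata=metadata, **kwargs)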

We can use this new function to write a pin as JSON with our specific metadata:

some_cat = pd.Categorical(["a", "a", "b", "c"])
pin_write_cat_json(board, some_cat, name = "some-cat")
Writing pin:
Name: 'some-cat'
Version: 20240430T203946Z-6ce8e

A function to read categoricals

It’s possible to read this pin using the regular pin_read function, but the object we get is no longer a categorical!

board.pin_read("some-cat")
['a', 'a', 'b', 'c']

However, notice that if we use pin_meta, the information we stored on categories is in the .user field.

pprint(
    board.pin_meta("some-cat")
)
Meta(title='some-cat: a pinned list object',
     description=None,
     created='20240430T203946Z',
     pin_hash='6ce8eaa9de0dfd54',
     file='some-cat.json',
     file_size=20,
     type='json',
     api_version=1,
     version=Version(created=datetime.datetime(2024, 4, 30, 20, 39, 46),
                     hash='6ce8e'),
     tags=None,
     name='some-cat',
     user={'categories': ['a', 'b', 'c']},
     local={})

This enables us to write a special function for reading, to reconstruct the categorical, using the categories stashed in metadata:

def pin_read_cat_json(board, name, version=None, hash=None, **kwargs):
    # Read the stored list of values, then the metadata holding the categories
    data = board.pin_read(name=name, version=version, hash=hash, **kwargs)
    meta = board.pin_meta(name=name, version=version, **kwargs)
    return pd.Categorical(data, categories=meta.user["categories"])

pin_read_cat_json(board, "some-cat")
['a', 'a', 'b', 'c']
Categories (3, object): ['a', 'b', 'c']
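
To see why keeping the categories in metadata matters, here is a small sketch that uses pandas alone, with no board involved. It assumes a categorical that declares a category ("c") which never appears in the values; the variable names are illustrative only.

```python
import pandas as pd

# A categorical with a declared category "c" that no value uses
original = pd.Categorical(["a", "a", "b"], categories=["a", "b", "c"])

# What a write wrapper would store: the values as JSON data,
# and the categories as metadata
json_data = original.tolist()
metadata = {"categories": original.categories.tolist()}

# Rebuilding from the values alone infers categories from the data,
# so the unused category is lost
naive = pd.Categorical(json_data)

# Rebuilding with the stored categories recovers the full set
restored = pd.Categorical(json_data, categories=metadata["categories"])
```

Here naive ends up with only the categories found in the data, while restored has the same categories as original.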

For an example of how this approach is used in a real project, look at how the vetiver package wraps these functions to write and read model binaries as pins.