pitz data model outline

I just finished a whole bunch of documentation on the pitz data model. You can read it here or just read all the stuff I copied below:

There are two classes in pitz: entities and bags. Everything else is a subclass of one of these two.


Entities

Making them

Every entity is an object like a dictionary. You can make an entity like this:

>>> from pitz import Entity
>>> e = Entity(title="example entity",
... creator="Matt",
... importance="not very")

You can also load an entity from a yaml file, but I’ll explain that later.

You can look up a value for any attribute like this:

>>> e['title']
'example entity'
>>> e.keys() #doctest: +NORMALIZE_WHITESPACE
['name', 'creator', 'importance', 'title', 'modified_time',
'created_time', 'type']
>>> e['type']
'entity'

Viewing them

Entities have a summarized view useful when you want to see a list of entities, and a detailed view that shows all the boring detail:

>>> e.summarized_view
'example entity (entity)'

>>> print(e.detailed_view) #doctest: +SKIP
example entity (entity)



not very

example entity

2009-04-04 07:47:09.456068

2009-04-04 07:47:09.456068


Notice how our entity has some attributes we never set, like name, type, created_time, and modified_time. I make these in the __init__ method of the entity class.

By the way, you can ignore the #doctest: +SKIP comment. It tells the doctest runner to skip this example, because the output contains unpredictable values such as timestamps.
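
As for those automatically added attributes, here is a rough sketch of how a dictionary-like class could fill them in from its __init__ method. This is not pitz's actual code: the class name SketchEntity, the uuid-based name, and the lowercased class name used as the type are all assumptions made for illustration.

```python
import uuid
from datetime import datetime


class SketchEntity(dict):
    """Hypothetical stand-in for an entity; not the real pitz class."""

    def __init__(self, **attributes):
        super().__init__(**attributes)
        now = datetime.now()
        type_name = type(self).__name__.lower()
        # Fill in bookkeeping attributes only when the caller left them out.
        self.setdefault('name', '%s-%s' % (type_name, uuid.uuid4()))
        self.setdefault('type', type_name)
        self.setdefault('created_time', now)
        self.setdefault('modified_time', now)


e = SketchEntity(title="example entity", creator="Matt")
print(sorted(e.keys()))
```

Because setdefault only writes missing keys, a caller who passes an explicit name or type keeps the value they supplied.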

Saving and loading them

Entities have an instance method named to_yaml_file and a from_yaml_file classmethod. Here’s how to use them:

>>> outfile = e.to_yaml_file('.') # Writes file to this directory.
>>> e2 = Entity.from_yaml_file(outfile)
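
The shape of that round trip (an instance method that writes a file and returns its path, plus a classmethod that rebuilds an object from that path) can be sketched without pitz. The sketch below swaps in JSON from the standard library purely to avoid a third-party dependency; pitz itself writes YAML. SketchEntity, to_json_file, and from_json_file are hypothetical names, and naming the file after the entity's name is an assumption.

```python
import json
import os
import tempfile


class SketchEntity(dict):
    """Hypothetical stand-in; the real pitz methods write YAML, not JSON."""

    def to_json_file(self, directory):
        # Write one file per entity, named after the entity, and return its path.
        path = os.path.join(directory, '%s.json' % self['name'])
        with open(path, 'w') as f:
            json.dump(self, f)
        return path

    @classmethod
    def from_json_file(cls, path):
        # Rebuild an instance of whichever class this was called on.
        with open(path) as f:
            return cls(json.load(f))


with tempfile.TemporaryDirectory() as directory:
    e = SketchEntity(name='example-1', title="example entity")
    outfile = e.to_json_file(directory)
    e2 = SketchEntity.from_json_file(outfile)

print(e2 == e)
```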


Bags

Making them

While entities are based on dictionaries, bags are based on lists. You can give a bag instance a title, which is nice for remembering what it is you want it for. Bags make it easy to organize a bunch of entities.

>>> from pitz import Bag
>>> b = Bag(title="Stuff that is not very important")
>>> b.append(e)

Viewing them

Converting a bag to a string gives the summarized view of all the entities inside:

>>> print(b) #doctest: +SKIP
Stuff that is not very important

1 entity entities

0: example entity (entity)

That number 0 can be used to pull out the entity at that position, just like a regular boring old list:

>>> e == b[0]
True

Querying them

Bags have a matches_dict method that accepts a bunch of key-value pairs and then returns a new bag that contains all the entities in the first bag that match all those key-value pairs.

First, I’ll make a few more entities:

>>> e1 = Entity(title="example #1", creator="Matt",
... importance="Really important")
>>> e2 = Entity(title="example #2", creator="Matt",
... importance="not very")

Now I’ll make a new bag that has both of these new entities:

>>> b = Bag('Everything')
>>> b.append(e1)
>>> b.append(e2)
>>> print(b)
Everything

2 entity entities

0: example #1 (entity)
1: example #2 (entity)

Here is how to get a new bag with just the entities that have an importance attribute set to “not very”:

>>> not_very_important = b.matches_dict(importance="not very")
>>> len(not_very_important) == 1
True
>>> not_very_important[0] == e2
True

Since matches_dict is the most common method I call on a bag, I made the __call__ method on the Bag class run matches_dict. So that means this works just as well:

>>> not_very_important = b(importance="not very")
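
To show the filtering idea on its own, here is a minimal sketch of a list-based bag with a matches_dict method and a __call__ shortcut. It is an illustration under assumptions, not the pitz implementation: SketchBag is a hypothetical class, its constructor signature is invented, and plain dictionaries stand in for entities.

```python
class SketchBag(list):
    """Hypothetical list-based bag; not the real pitz class."""

    def __init__(self, title='', entities=()):
        super().__init__(entities)
        self.title = title

    def matches_dict(self, **criteria):
        # Keep only the entities whose attributes equal every given value.
        matches = [e for e in self
                   if all(e.get(k) == v for k, v in criteria.items())]
        return SketchBag(title=self.title, entities=matches)

    # Calling the bag is shorthand for matches_dict.
    __call__ = matches_dict


first = dict(title="example #1", importance="Really important")
second = dict(title="example #2", importance="not very")
everything = SketchBag('Everything', [first, second])

not_very_important = everything(importance="not very")
print(len(not_very_important))
```

Since the result is itself a bag, filters can be chained, for example everything(creator="Matt")(importance="not very").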

Saving and loading them

Bags can send all contained entities to yaml files with to_yaml_files, and bags can load a bunch of entities from yaml files with from_yaml_files.

Right now, there is no way for a bag to save itself to yaml.

The Special Project Bag

After I finished bags and entities, I thought I was done, but then I ran into a few frustrations:

  • When I made a bunch of entities, but didn’t append them all into one bag, then I couldn’t run filters across all of them.
  • At the end of a session, it wasn’t easy for me to make sure that all of the entities got saved out to yaml.
  • I couldn’t figure out an elegant way to store one entity as a value for another entity’s attribute.

So I made a “special” Bag subclass called Project. The idea here is that every entity should be a member of the project bag. Also, every entity should have a reference back to the project.
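
The two-way link described here can be sketched in a few lines: the entity keeps a reference to its project, and it appends itself to the project when it is created. SketchProject and SketchEntity are hypothetical names, and the real pitz classes may wire this up differently.

```python
class SketchProject(list):
    """Hypothetical project bag; not the real pitz class."""

    def __call__(self, **criteria):
        # Return the member entities that match every key-value pair.
        return [e for e in self
                if all(e.get(k) == v for k, v in criteria.items())]


class SketchEntity(dict):
    """Hypothetical entity that registers itself with a project."""

    def __init__(self, project=None, **attributes):
        super().__init__(**attributes)
        self.project = project          # reference back to the project
        if project is not None:
            project.append(self)        # every entity joins the project bag


project = SketchProject()
alice = SketchEntity(project, title="Alice")
task = SketchEntity(project, title="Water the plants", assigned_to=alice)

print(len(project))
print(project(assigned_to=alice) == [task])
```

Because registration happens in the constructor, no entity can be forgotten when it is time to filter or save everything at once.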

Using a project is easy. Just pass it in as the first argument when you make an entity. Imagine I want to link some tasks to Matt and some other tasks to Lindsey. First I make a project:

>>> from pitz import Project
>>> weekend_chores = Project(title="Weekend chores")

Now I make the rest of the entities:

>>> matt = Entity(weekend_chores, title="Matt")
>>> lindsey = Entity(weekend_chores, title="Lindsey")
>>> t1 = Entity(weekend_chores, title="Mow the yard", assigned_to=matt)
>>> t2 = Entity(weekend_chores, title="Buy some groceries",
... assigned_to=lindsey)

Now it is easy to get the tasks for Matt:

>>> chores_for_matt = weekend_chores(assigned_to=matt)
>>> mow_the_yard = chores_for_matt[0]
>>> mow_the_yard['assigned_to'] == matt
True


Pointers

There’s a problem in that last example: when I send this mow_the_yard entity out to a YAML file, what will I store as the value for the “assigned_to” attribute?

In SQL, this is what foreign keys are good for. In my chores table, I would store a reference to a particular row in the people table.

I wanted the same functionality in pitz, so I came up with pointers. This is dry stuff, so here’s an example:

>>> class Chore(Entity):
...     pointers = dict(assigned_to='person')
>>> class Person(Entity):
...     pass
>>> matt = Person(weekend_chores, title="Matt")
>>> lindsey = Person(weekend_chores, title="Lindsey")
>>> ch1 = Chore(weekend_chores, title="Mow the yard", assigned_to=matt)
>>> ch2 = Chore(weekend_chores, title="Buy some groceries",
... assigned_to=lindsey)

Not much is different, but instead of matt, lindsey, and the various chores all being entities, they’re now subclasses. But here’s the advantage of defining pointers on Chore:

>>> ch1['assigned_to']
>>> matt['name'] # doctest: +SKIP
>>> ch1.replace_objects_with_pointers()
>>> ch1['assigned_to'] # doctest: +SKIP

First of all, notice how I printed out the name attribute on matt.

After running the replace_objects_with_pointers method, I don’t have a reference to the matt object. Instead, I have matt’s name now.

Now I can send this data out to a yaml file. And when I load it back in from yaml, I can then reverse this action, and go look up an entity with the same name:

>>> mn = matt.name
>>> matt == weekend_chores.by_name(mn)
True

In practice, I convert all the entities to pointers, then write out the yaml files, then convert all the pointers back into objects automatically.
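
That convert, write, convert-back cycle can be sketched as follows. This is an illustration under assumptions rather than pitz's code: the Sketch classes are hypothetical, names are passed in by hand instead of being generated, replace_pointers_with_objects is an invented name for the reverse step, and the target type stored in pointers is ignored.

```python
class SketchProject(list):
    """Hypothetical project bag with a lookup by entity name."""

    def by_name(self, name):
        for entity in self:
            if entity['name'] == name:
                return entity
        raise KeyError(name)


class SketchEntity(dict):
    pointers = {}   # attribute name -> target type (type unused in this sketch)

    def __init__(self, project, name, **attributes):
        super().__init__(name=name, **attributes)
        self.project = project
        project.append(self)

    def replace_objects_with_pointers(self):
        # Swap each referenced entity for its name, which is safe to serialize.
        for attr in self.pointers:
            value = self.get(attr)
            if isinstance(value, SketchEntity):
                self[attr] = value['name']

    def replace_pointers_with_objects(self):
        # Reverse step: look each stored name up in the project.
        for attr in self.pointers:
            value = self.get(attr)
            if isinstance(value, str):
                self[attr] = self.project.by_name(value)


class Person(SketchEntity):
    pass


class Chore(SketchEntity):
    pointers = dict(assigned_to='person')


chores = SketchProject()
matt = Person(chores, 'person-matt', title="Matt")
ch1 = Chore(chores, 'chore-mow', title="Mow the yard", assigned_to=matt)

ch1.replace_objects_with_pointers()
print(ch1['assigned_to'])            # now a plain string
ch1.replace_pointers_with_objects()
print(ch1['assigned_to'] is matt)    # the object is back
```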

That’s the end of the data model documentation. I hope it sheds enough light to make it obvious whether pitz would be useful to you.

I’m working on a separate article where I show some real-world workflows modeled in pitz, but that will be next week’s post.

Published by


My name is Matt Wilson and I live in Cleveland Heights, Ohio. I love random emails from strangers, so get in touch! matt@tplus1.com.

  • Medela

    Hey thanks a lot for such an informative post, I was looking for this one! Bookmarking this blog right away! thanks once again.

  • James

    This is very interesting! I'm looking forward to being able to use pitz from the command line…

  • http://blog.tplus1.com Matt Wilson

    Yeah, I'm working on it 🙂 Seriously, though, I have one command-line
    script written so far.

    $ pitz-shell path/to/project.yaml

    That will start up IPython and load the project described in that yaml
    file. I'm going to post about it soon.

