Need help with data files and setup.py

I’m working on a package that includes some files that are meant to be copied and edited by people using the package.

My project is named “pitz” and it is a bugtracker. Instead of using a config file to set the options for a project, I want to use python files.

When somebody installs pitz, I want to save some .py files somewhere so that when they run my pitz-setup script, I can go find those .py files and copy them into their working directory.

I have two questions:

  1. Do I need to write my setup.py file to specify that the .py files in a particular directory need to be treated like data, not code? For example, I don’t want the installer to hide those files inside an egg.
  2. How can I find those .py files later and copy them?

Here’s my setup.py so far:

from setuptools import setup, find_packages

version = '0.1'

setup(
    name='pitz',
    version=version,
    description="Python to-do tracker inspired by ditz (ditz.rubyforge.org)",

    long_description="""\
ditz (http://ditz.rubyforge.org) is the best distributed ticketing
system that I know of. There's a few things I want to change, so I
started pitz.""",

    classifiers=[],
    keywords='ditz',
    author='Matt Wilson',
    author_email='[email protected]',
    url='http://tplus1.com',
    license='',
    packages=find_packages(exclude=['ez_setup', 'examples', 'tests']),

    include_package_data=True,
    package_dir={'pitz': 'pitz'},

    data_files=[('share/pitz',
        [
            'pitz/pitztypes/agilepitz.py.sample',
            'pitz/pitztypes/tracpitz.py.sample',
        ])],

    zip_safe=False,
    install_requires=[
        # 'PyYAML',
        # 'sphinx',
        # 'nose',
        # 'jinja2',
        # -*- Extra requirements: -*-
    ],

    # I know about the much fancier entry points, but I prefer this
    # solution. Why does everything have to be zany?
    scripts=['scripts/pitz-shell'],

    test_suite='nose.collector',
)

When I run python setup.py install, I do get those .sample files copied, but they get copied into a folder way inside of my pitz install:

$ cd ~/virtualenvs/scratch/lib/
$ find -type f -name '*.sample'
./python2.6/site-packages/pitz-0.1dev-py2.6.egg/share/pitz/tracpitz.py.sample
./python2.6/site-packages/pitz-0.1dev-py2.6.egg/share/pitz/agilepitz.py.sample

I don’t know how I can write a script to copy those tracpitz.py.sample files out. Maybe I can ask pitz what its version is, and then build a string and use os.path.join, but that doesn’t look like any fun at all.

So, what should I do instead?
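
One possible direction, sketched under assumptions rather than offered as the answer: if the .sample files are installed alongside the package's own modules (for instance through setuptools' package_data option, with zip_safe=False keeping them as ordinary files on disk), a script can find them relative to the package directory instead of reconstructing the egg path. The helper names find_sample_files and copy_sample_files are made up for this example, and the pitztypes subdirectory is taken from the paths in the setup.py above.

```python
import glob
import os
import shutil


def find_sample_files(package_dir, pattern="*.sample"):
    """Return the sorted paths of sample files under package_dir/pitztypes."""
    return sorted(glob.glob(os.path.join(package_dir, "pitztypes", pattern)))


def copy_sample_files(package_dir, destination):
    """Copy each sample into destination, dropping the .sample suffix."""
    copied = []
    for source in find_sample_files(package_dir):
        # 'agilepitz.py.sample' becomes 'agilepitz.py'
        name = os.path.basename(source)[:-len(".sample")]
        target = os.path.join(destination, name)
        shutil.copyfile(source, target)
        copied.append(target)
    return copied


# In a pitz-setup script, the package directory could come from the
# installed package itself (assumption: the samples live next to the code):
#     import pitz
#     copy_sample_files(os.path.dirname(pitz.__file__), os.getcwd())
```

The point of the sketch is that the script never needs to know the version number or the egg name; it only asks the imported package where it lives.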

I submitted my proposal for PyOhio 2009

Here it is: Clever uses for metaclasses

SUMMARY

This talk introduces metaclasses and attempts to “defang” them, showing what they are good for, and when they are silly.

Metaclasses can reduce redundancy in code but can be very confusing, so in this talk, I will walk through several examples of how to use metaclasses to solve problems.

I plan to cover (at least) these examples:

  • Add extra methods and attributes to a class after its definition.
  • Verify a class correctly implements a specification.
  • Give a subclass its own class-variables rather than sharing them with the parent class.
  • Implement the basics of an object-relational mapper (ORM).

I published an article in the November 2008 issue of Python Magazine with the same title (Clever Uses for Metaclasses) and I’ll use some code from that article.

I want to give this talk in a friendly, informal manner, so that people that feel intimidated by metaclasses realize that there’s nothing to be scared of.
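
As a rough illustration of the second example in the list above (verifying that a class implements a specification), here is a minimal sketch written for Python 3. The names RequiresMethods, Storage, and YamlStorage, and the "specification" of save and load methods, are all invented for this illustration; they are not taken from the article or the talk.

```python
class RequiresMethods(type):
    """Metaclass that rejects subclasses missing the required methods."""

    required = ("save", "load")  # hypothetical specification

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # The root class (no explicit bases) only declares the specification.
        if not bases:
            return
        missing = [m for m in cls.required
                   if not callable(getattr(cls, m, None))]
        if missing:
            raise TypeError("%s is missing: %s" % (name, ", ".join(missing)))


class Storage(metaclass=RequiresMethods):
    """Root class; every subclass must define save() and load()."""


class YamlStorage(Storage):
    def save(self):
        return "saved"

    def load(self):
        return "loaded"
```

With this in place, defining a Storage subclass that lacks load raises TypeError as soon as the class statement runs, before any instance is created.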

EXPERTISE LEVEL

This talk is aimed at the intermediate-level programmer who is already familiar with object-oriented concepts and is really comfortable with Python.

I do not expect people in the audience to know ANYTHING about metaclasses before this talk.

AREAS OF PYTHON

  • object-oriented programming
  • metaprogramming
  • dynamic language tomfoolery
  • introspection

How to see the to-do list for pitz

pitz is (among other things) a to-do list tracker like Trac or Bugzilla or VersionOne.

I’m storing the list of stuff to do for pitz in the pitz source code. Here’s how to see the unfinished stuff in pitz.

Get a copy of the code


$ git clone git://github.com/mw44118/pitz.git
Initialized empty Git repository in /home/matt/pitz/.git/
remote: Counting objects: 621, done.
remote: Compressing objects: 100% (604/604), done.
remote: Total 621 (delta 383), reused 0 (delta 0)
Receiving objects: 100% (621/621), 98.52 KiB | 135 KiB/s, done.
Resolving deltas: 100% (383/383), done.

Install it, probably in a virtualenv


$ source ~/virtualenvs/pitz/bin/activate
$ python setup.py develop

Fire up pitz-shell

I have written one command-line tool so far: pitz-shell. Use it to start a python interpreter loaded with any pitz project. Here’s how to start a session for pitz itself:

$ pitz-shell pitz/pitzfiles/project-99c58812-5c1c-4fec-874c-c998933ba88b.yaml
/home/matt/virtualenvs/pitz/lib/python2.6/site-packages/ipython-0.9.1-py2.6.egg/IPython/Magic.py:38: DeprecationWarning: the sets module is deprecated
from sets import Set

pitz-shell imports a bunch of classes and makes an object named p (p stands for project). p has all the information about the project described in the yaml file passed in as an argument to pitz-shell. The __repr__ method on p gives some summarized data:

In [1]: p
Out[1]:

p.todo is a property that just returns a bag of unfinished tasks for the project:

In [4]: p.todo

Out[4]:

You can print any bag to see all the contents of the bag, and p.todo is no different:

In [5]: print(p.todo)
===========
Stuff to do
===========

(23 task entities)
------------------

0: Add support for something like 'ditz grep' (unknown status)
1: Update entities by loading a CSV file (unknown status)
2: Figure out why some tasks are not converting pointers to objects (unknown status)
3: Support intersection, union, and other set operations on bags (unknown status)
4: Demonstrate really simple tasks and priorities workflow (unknown status)
5: Support a .pitz config file with all pitz scripts (unknown status)
6: Add a todo property on project (or maybe bag) (unknown status)
7: Write code to use strings as keys (unknown status)
8: Prompt to save work at the end of an interactive pitz session (unknown status)
9: Make it possible to support a filter like attribute!=value (unknown status)
10: Write code to support sorting by anything (unknown status)
11: Support hooks (unknown status)
12: Write an attributes property on a bag that lists count of each attribute in any entities (unknown status)
13: Allow two bags to be compared for equality by using their entities (unknown status)
14: Make it easy to list each employee's tasks (unknown status)
15: Support a $PITZDIR env var to tell where yaml files live (unknown status)
16: Demonstrate release -< iteration -< user story -< task workflow. (unknown status)
17: Load new entities from a CSV file (unknown status)
18: Support grep on entities (unknown status)
19: write data to yaml in order (unknown status)
20: Support entity subclasses like releases, iterations, user stories, and tasks (unknown status)
21: A bag should dump to a single CSV file (unknown status)
22: Support using substring of name as name (unknown status)

That's how you see the to-do list for pitz!

In a future post, I'll show how to make new tasks and how to update tasks.

I also need to explain how Pitz lets you come up with whatever wacky workflow you want. When you set up a pitz project, you can use the classes I came up with, or subclass Entity into your own weird types. In a future post, I'll show how I'm using pitz to model an agile development system using releases, iterations, checkpoints, user stories, tasks, and people.

pitz data model outline

I just finished a whole bunch of documentation on the pitz data model. You can read it here or just read all the stuff I copied below:

There are two classes in pitz: entities and bags. Everything else is a subclass of one of these two.

Entities

Making them

Every entity is an object like a dictionary. You can make an entity like this:


>>> from pitz import Entity
>>> e = Entity(title="example entity",
... creator="Matt",
... importance="not very")

You can also load an entity from a yaml file, but I’ll explain that later.

You can look up a value for any attribute like this:


>>> e['title']
'example entity'
>>> e.keys() #doctest: +NORMALIZE_WHITESPACE
['name', 'creator', 'importance', 'title', 'modified_time',
'created_time', 'type']
>>> e['type']
'entity'

Viewing them

Entities have a summarized view useful when you want to see a list of entities, and a detailed view that shows all the boring detail:

>>> e.summarized_view
'example entity (entity)'

>>> print(e.detailed_view) #doctest: +SKIP
example entity (entity)
-----------------------

name:
entity-bdd31951-cff0-42a5-92b4-97ef966a6f6f

creator:
Matt

importance:
not very

title:
example entity

modified_time:
2009-04-04 07:47:09.456068

created_time:
2009-04-04 07:47:09.456068

type:
entity

Notice how our entity has some attributes we never set, like name, type, created_time, and modified_time. I make these in the __init__ method of the entity class.
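
To make that concrete, here is a minimal sketch of a dictionary-like class that fills in those bookkeeping attributes in __init__. It is not the pitz source: SketchEntity is a made-up name, and the way the name is built (type, a dash, then a UUID) is only inferred from the example output above.

```python
import uuid
from datetime import datetime


class SketchEntity(dict):
    """Dictionary-like object that fills in bookkeeping attributes."""

    def __init__(self, **attributes):
        super().__init__(**attributes)
        now = datetime.now()
        # The type is derived from the class name, so subclasses get their own.
        self["type"] = self.__class__.__name__.lower()
        self["name"] = "%s-%s" % (self["type"], uuid.uuid4())
        self["created_time"] = now
        self["modified_time"] = now

    @property
    def summarized_view(self):
        return "%(title)s (%(type)s)" % self


e = SketchEntity(title="example entity", creator="Matt")
print(e.summarized_view)  # prints: example entity (sketchentity)
```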

By the way, you can ignore the #doctest: +SKIP comment. That is there so the doctests will skip running this example, which would generate unpredictable values.

Saving and loading them

Entities have an instance method named to_yaml_file and a from_yaml_file classmethod. Here’s how to use them:

>>> outfile = e.to_yaml_file('.') # Writes file to this directory.
>>> e2 = Entity.from_yaml_file(outfile)

Bags

Making them

While entities are based on dictionaries, bags are based on lists. You can give a bag instance a title, which is nice for remembering what it is you want it for. Bags make it easy to organize a bunch of entities.

>>> from pitz import Bag
>>> b = Bag(title="Stuff that is not very important")
>>> b.append(e)

Viewing them

Converting a bag to a string prints the summarized view of all the entities inside:

>>> print(b) #doctest: +SKIP
================================
Stuff that is not very important
================================

1 entity entities
-----------------

0: example entity (entity)

That number 0 can be used to pull out the entity at that position, just like a regular boring old list:

>>> e == b[0]
True

Querying them

Bags have a matches_dict method that accepts a bunch of key-value pairs and then returns a new bag that contains all the entities in the first bag that match all those key-value pairs.

First, I’ll make a few more entities:

>>> e1 = Entity(title="example #1", creator="Matt",
... importance="Really important")
>>> e2 = Entity(title="example #2", creator="Matt",
... importance="not very")

Now I’ll make a new bag that has both of these new entities:

>>> b = Bag('Everything')
>>> b.append(e1)
>>> b.append(e2)
>>> print(b)
==========
Everything
==========

2 entity entities
-----------------

0: example #1 (entity)
1: example #2 (entity)


Here is how to get a new bag with just the entities that have an importance attribute set to “not very”:

>>> not_very_important = b.matches_dict(importance="not very")
>>> len(not_very_important) == 1
True
>>> not_very_important[0] == e2
True

Since matches_dict is the most common method I call on a bag, I made the __call__ method on the Bag class run matches_dict. So that means this works just as well:

>>> not_very_important = b(importance="not very")
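
The filtering idea can be sketched in a few lines. This is an illustration under assumptions, not the real Bag class: SketchBag is a hypothetical name, plain dictionaries stand in for entities, and only exact equality on each key is supported.

```python
class SketchBag(list):
    """List of dictionary-like entities with a simple equality filter."""

    def __init__(self, title="", entities=()):
        super().__init__(entities)
        self.title = title

    def matches_dict(self, **pairs):
        """Return a new bag of the entities matching every key-value pair."""
        matches = [e for e in self
                   if all(e.get(k) == v for k, v in pairs.items())]
        return SketchBag(title=self.title, entities=matches)

    # Calling the bag is shorthand for matches_dict.
    __call__ = matches_dict


b = SketchBag("Everything")
b.append({"title": "example #1", "importance": "Really important"})
b.append({"title": "example #2", "importance": "not very"})
not_very_important = b(importance="not very")
```

Because the result is another bag, filters can be chained by calling the returned bag again with more key-value pairs.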

Saving and loading them

Bags can send all contained entities to yaml files with to_yaml_files, and bags can load a bunch of entities from yaml files with from_yaml_files.

Right now, there is no way for a bag to save itself to yaml.

The Special Project Bag

After I finished bags and entities, I thought I was done, but then I ran into a few frustrations:

  • When I made a bunch of entities, but didn’t append them all into one bag, then I couldn’t run filters across all of them.
  • At the end of a session, it wasn’t easy for me to make sure that all of the entities got saved out to yaml.
  • I couldn’t figure out an elegant way to store one entity as a value for another entity’s attribute.

So I made a “special” Bag subclass called Project. The idea here is that every entity should be a member of the project bag. Also, every entity should have a reference back to the project.

Using a project is easy. Just pass it in as the first argument when you make an entity. Imagine I want to link some tasks to Matt and some other tasks to Lindsey. First I make a project:

>>> from pitz import Project
>>> weekend_chores = Project(title="Weekend chores")

Now I make the rest of the entities:

>>> matt = Entity(weekend_chores, title="Matt")
>>> lindsey = Entity(weekend_chores, title="Lindsey")
>>> t1 = Entity(weekend_chores, title="Mow the yard", assigned_to=matt)
>>> t2 = Entity(weekend_chores, title="Buy some groceries",
... assigned_to=lindsey)

Now it is easy to get tasks for matt:

>>> chores_for_matt = weekend_chores(assigned_to=matt)
>>> mow_the_yard = chores_for_matt[0]
>>> mow_the_yard['assigned_to'] == matt
True

Pointers

There’s a problem in that last example: when I send this mow_the_yard entity out to a YAML file, what will I store as the value for the “assigned_to” attribute?

In SQL, this is what foreign keys are good for. In my chores table, I would store a reference to a particular row in the people table.

I wanted the same functionality in pitz, so I came up with pointers. This is dry stuff, so here’s an example:

>>> class Chore(Entity):
...     pointers = dict(assigned_to='person')
...
>>> class Person(Entity):
...     pass
>>> matt = Person(weekend_chores, title="Matt")
>>> lindsey = Person(weekend_chores, title="Lindsey")
>>> ch1 = Chore(weekend_chores, title="Mow the yard", assigned_to=matt)
>>> ch2 = Chore(weekend_chores, title="Buy some groceries",
... assigned_to=lindsey)

Not much is different, but instead of matt, lindsey, and the various chores all being plain entities, they’re now instances of Entity subclasses. But here’s the advantage of defining pointers on Chore:

>>> ch1['assigned_to']
>>> matt['name'] # doctest: +SKIP
'person-530ad3cc-14f1-491a-bdb6-ed1dd65afe46'
>>> ch1.replace_objects_with_pointers()
>>> ch1['assigned_to'] # doctest: +SKIP
'person-530ad3cc-14f1-491a-bdb6-ed1dd65afe46'

First of all, notice how I printed out the name attribute on matt.

After running the replace_objects_with_pointers method, I don’t have a reference to the matt object. Instead, I have matt’s name now.

Now I can send this data out to a yaml file. And when I load it back in from yaml, I can then reverse this action, and go look up an entity with the same name:

>>> mn = matt.name
>>> matt == weekend_chores.by_name(mn)
True

In practice, I convert all the entities to pointers, then write out the yaml files, then convert all the pointers back into objects automatically.
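
The round trip can be sketched with plain dictionaries. This is a simplified stand-in, not the pitz implementation: the two functions take the pointer fields as an argument instead of reading a pointers attribute from the class, and by_name is an ordinary dict rather than a method on the project.

```python
def replace_objects_with_pointers(entity, pointer_fields):
    """Swap referenced entities for their names so the data can be dumped."""
    for field in pointer_fields:
        value = entity.get(field)
        if isinstance(value, dict):
            entity[field] = value["name"]


def replace_pointers_with_objects(entity, pointer_fields, by_name):
    """Swap stored names back for the entities they refer to."""
    for field in pointer_fields:
        value = entity.get(field)
        if isinstance(value, str) and value in by_name:
            entity[field] = by_name[value]


matt = {"name": "person-1", "title": "Matt"}
chore = {"name": "chore-1", "title": "Mow the yard", "assigned_to": matt}
by_name = {matt["name"]: matt}

# Before dumping: the chore holds the string "person-1" instead of matt.
replace_objects_with_pointers(chore, ["assigned_to"])
# After loading: the chore holds the matt object again.
replace_pointers_with_objects(chore, ["assigned_to"], by_name)
```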

That’s the end of the data model documentation. I hope that shines enough light so that it is obvious if pitz would be useful to you or not.

I’m working on a separate article where I show some real-world workflows modeled in pitz, but that will be next week’s post.

Instead of setting instance attributes within __init__

I want to make sure that instances of my class have a bunch of attributes. This is the way I’ve always done it in the past:

>>> class C(object):
...     def __init__(self):
...         self.a = 1
...         self.b = 2
...

It gets the job done fine. But when there’s real work to be done in __init__, then I end up with a really long __init__ method with essentially two separate goals. One section does interesting stuff with the parameters passed in, and the other section creates a bunch of attributes.

So now I’m experimenting with setting properties up for my classes like this:

>>> class D(object):
...     @property
...     def a(self):
...         if not hasattr(self, 'a'):
...             self.a = 1
...         return self.a

I like how I’ve moved the instance variables out of __init__, so that my __init__ method can focus entirely on handling the parameters. I’m curious what problems I’m going to have. At first, I thought I would trigger some infinite loop by accessing self.a when inside the property for a, but so far, I haven’t had any problems.

So what is wrong with this approach?
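
Two things seem worth checking here: a property defined without a setter rejects the assignment self.a = 1, and hasattr(self, 'a') inside the getter calls the getter again. A common variant, shown as a sketch and not as the one right answer, stores the value under a separate name. The class name E and the attribute name _a are choices made for this example.

```python
class E(object):
    @property
    def a(self):
        # Store the value under a different name than the property itself,
        # so neither hasattr nor the assignment goes back through the getter.
        if not hasattr(self, "_a"):
            self._a = 1
        return self._a


obj = E()
print(obj.a)  # prints: 1
```

The __init__ method stays free of attribute setup, and the default is only created the first time somebody reads obj.a.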

How to load ditz issues into python

Ditz is a fantastic distributed bug tracking system written in Ruby.

Here’s some code that you can use to load some ditz issues into a python interpreter. You need to install my pitz project first though.


>>> from pitz.junkyard.ditzloader import *
>>> from glob import glob
>>> issue_file_path = glob('../.ditz/issue-*.yaml')[0]
>>> import yaml
>>> issue = yaml.load(open(issue_file_path))
>>> issue.title
'Distribute reports by email'
>>> print issue.desc
Somehow allow people to sign up for report subscriptions.


So when new reports come out, they get updated. Maybe I can use
RSS feeds to hold the reports.
>>> print issue.log_events
[[datetime.datetime(2008, 9, 2, 17, 47, 44, 549355), 'Matthew Wilson ', 'created', ''], [datetime.datetime(2008, 9, 2, 19, 18, 3, 286902), 'Matthew Wilson ', 'assigned to release 3.5.1 from unassigned', ''], [datetime.datetime(2008, 9, 4, 18, 27, 19, 571991), 'Matthew Wilson ', 'unassigned from release 3.5.1', '']]

The code above just shows how to read a ditz issue. Updating the issue and saving it out again is easy, but I’m having trouble saving it in a format that ditz can still read:


>>> issue.title = 'Distribute reports by email or RSS'
>>> open(issue_file_path, 'w').write(yaml.dump(issue))

But when I try to load it with ditz, I get this error:

$ ditz show 1209b
/home/matt/checkouts/ditz/lib/ditz/model-objects.rb:124:in `sort_by': comparison of String with Time failed (ArgumentError)
from /home/matt/checkouts/ditz/lib/ditz/model-objects.rb:124:in `assign_issue_names!'
from /home/matt/checkouts/ditz/lib/ditz/model-objects.rb:51:in `issues='
from /home/matt/checkouts/ditz/lib/ditz/file-storage.rb:21:in `load'
from /home/matt/checkouts/ditz/bin/ditz:165

I compared my dumped file to the original ruby file and found that python wrote dates out like this:

>>> print yaml.dump(issue.creation_time)
2008-09-02 17:47:43.268059

But in the ruby yaml files, the dates look like this:

$ grep creation_time ../../../.ditz/issue-1209b17b64335383a710ccadf10b74c3401dbcb2.yaml
creation_time: 2008-09-02 17:47:43.268059 Z

That trailing Z seems important.
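
As a small sketch of what matching that format might involve, the function below builds the same string shape that appears in the ruby-written file. The function name ditz_style_timestamp is made up, treating the naive datetime as UTC is an assumption, and whether ditz accepts a file written this way is untested.

```python
from datetime import datetime


def ditz_style_timestamp(moment):
    """Format a datetime the way the ruby-written files show it."""
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f") + " Z"


creation_time = datetime(2008, 9, 2, 17, 47, 43, 268059)
print(ditz_style_timestamp(creation_time))
# prints: 2008-09-02 17:47:43.268059 Z
```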

Also, python and ruby seem to write out lists differently. Here’s how python dumped a list of lists:

>>> print yaml.dump(issue.log_events)
- [!!timestamp '2008-09-02 17:47:44.549355', Matthew Wilson , created,
'']
- [!!timestamp '2008-09-02 19:18:03.286902', Matthew Wilson , assigned
to release 3.5.1 from unassigned, '']
- [!!timestamp '2008-09-04 18:27:19.571991', Matthew Wilson , unassigned
from release 3.5.1, '']

But the same data dumped by ruby looks like:

log_events:
- - 2008-09-02 17:47:44.549355 Z
  - Matthew Wilson
  - created
  - ""
- - 2008-09-02 19:18:03.286902 Z
  - Matthew Wilson
  - assigned to release 3.5.1 from unassigned
  - ""
- - 2008-09-04 18:27:19.571991 Z
  - Matthew Wilson
  - unassigned from release 3.5.1
  - ""

So, there’s clearly some more work for me (or you, this is an open-source project) to do.

Some progress on pitz

Pitz is my open-source, distributed, plain-text, command-line, very flexible issue tracker. I just started working on it and I’m looking for feedback and contributors.

This post describes a little of what I’ve gotten done so far. I haven’t written any interface yet other than from within an interactive python session. Deal with it.

Getting started

You have to make a bag to keep all the tasks and then you make tasks with any keywords you can dream up.


>>> from pitz import Bag, Task
>>> b = Bag('.pitz')
>>> b.append(Task(title='Wash the dishes', creator='Matt', importance='Not very'))
>>> b.append(Task(title='Clean the cat box', creator='Matt', importance='Not very'))

You can add tasks to the bag by using the append method on the bag, or by passing in the bag as the first argument to the task:

>>> Task(b, title='Set new high score on Sushi-go-round', creator='Matt', importance='critical')

Printing a bag really prints the title of each task in the bag:


>>> print(b)
Clean the cat box
Set new high score on Sushi-go-round
Wash the dishes

Running queries

I’m proud of this one. This is the main reason I’m working on pitz. Bag instances have a feature called matching_pairs that can filter the tasks down to a smaller new bag.

You can filter by a single pair like this:


>>> critical_tasks = b.matching_pairs([('importance', 'critical')])
>>> print(critical_tasks)
Set new high score on Sushi-go-round

Or you can filter by multiple pairs and the filtered tasks must satisfy ALL the pairs.

>>> print(b.matching_pairs([('creator', 'Matt'), ('importance', 'Not very')]))
Clean the cat box
Wash the dishes

Dumping and loading

Use the bag that holds all the tasks to quickly write all the tasks out to your hard drive like this:


>>> b.to_yaml_files()
['.pitz/task-07e1af97-0ac6-4904-9187-0c2fd61692b6.yaml', '.pitz/task-6a7af07c-d0fb-4a77-9347-8dc78ef490fe.yaml', '.pitz/task-5ce725dc-c1db-4eca-a74c-55cd0e910786.yaml']

The returned stuff is a list of files that pitz just wrote.

Loading from the hard drive is pretty simple too. Just tell the bag where to load from.

>>> b2 = Bag('.pitz')
>>> print(b2)
Clean the cat box
Set new high score on Sushi-go-round
Wash the dishes

Task details

Printing a task by itself gives all the details on the task.


>>> t = Task.from_yaml_file('.pitz/task-4d9c1db2-fef3-4b50-8095-b2339384e118.yaml')
>>> print(t)
Do the 2008 taxes
-----------------

type: Task
name: task-4d9c1db2-fef3-4b50-8095-b2339384e118
title: Do the 2008 taxes
created date: 2009-03-01 22:29:58.242512
modified date: 2009-03-01 22:29:58.242512
creator: Matt
last modified by: Matt

description:
Do the 2008 taxes

Other stuff

You can use bags as iterators to go through the tasks one-by-one:

for task in b:
    print(task['title'])

Also notice that tasks are really just subclassed dictionaries (UserDict, actually) with some extra methods bolted on.

My new ticket tracking system is now vaporware!

I set up http://pitz.tplus1.com to host my pitz project, which is a python implementation of ditz.

Instead of just banging out code, I decided to write the documentation and the list of supported features FIRST.

Once I have my feature set established, then I’ll write the tests for those features, and finally, I’ll write the code.

I’m looking for feedback on the feature set. What would an ideal bugtracker look like?

git cherry is neat

Sometimes I’ll patch a production bug in my production branch and then I will forget to merge that commit into the development branch. This is how I can check for that:

$ git checkout production_branch
$ git cherry dev_branch

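This will spit out a list of commits that are reachable from production_branch but not from dev_branch. A leading + marks a commit whose change is missing from dev_branch; a leading - marks one whose change is already there under a different commit (for example, after a cherry-pick). It will not return any commits made to dev_branch but not in production_branch. It is not the same as a diff of the two branches either.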

Ditz versus bugs everywhere

A few months ago, I sketched out a ticket-tracking system that would be married with my source code. Then some commenters told me about bugs everywhere (be) and ditz.

I’ve looked at both, but I’ve been using ditz full-time while just watching be. Anyhow, here’s a few comparisons:

setting up a project

Here’s what it looks like when you set up a project in ditz:

$ ditz init
I wasn't able to find a configuration file ./.ditz-config.
We'll set it up right now.
Your name (enter for Matthew Wilson):
Your email address (enter for [email protected]):
Directory to store issues state in (enter for .ditz):
Use your text editor for multi-line input when possible (y/n)? y
Paginate output (always/never/auto)? auto
Project name (enter for scratch):
Issues can be tracked across the project as a whole, or the project can be
split into components, and issues tracked separately for each component.
Track issues separately for different components? (y/n): y

Current components:
None!

(A)dd component, (r)emove component, or (d)one: a
Component name: documentation

... snip ...

(A)dd component, (r)emove component, or (d)one: d
Ok, .ditz directory created successfully.

And here’s how you can create a single issue.

$ ditz add
Title: Write something justifying yet another web framework
Is this a (b)ugfix, a (f)eature, or a (t)ask? t
Choose a component:
1) scratch
2) documentation
3) model code
4) controller code
5) view code
Component (1--5): 2
Issue creator (enter for Matthew Wilson ):
Added issue documentation-1 (e8a4a43f78ee83300cc0372a13375d9534b97abb).

You can’t tell, but when I punched in the title, ditz opened my $EDITOR and I wrote a longer description in there.

Now the same thing in be:

$ be set-root
Guessing id 'matt '
No revision control detected.
Directory initialized.

$ be new 'Write something justifying yet another web framework'
Guessing id 'matt '
Guessing id 'matt '
Created bug with ID 4d4

Not quite the same experience!

Here’s what a ditz issue looks like:

$ ditz show documentation-1
Issue documentation-1
---------------------
Title: Write something justifying yet another web framework
Description: Why not just polish any of the ones already out there?
Type: task
Status: unstarted
Creator: Matthew Wilson
Age: four minutes
Release:
References:
Identifier: e8a4a43f78ee83300cc0372a13375d9534b97abb

Event log:
- created (matt, four minutes ago)

And in be:

$ be show 4d4
Guessing id 'matt '
ID : 4d4e6a17-2097-42bb-a3cd-3c17566ecce8
Short name : 4d4
Severity : minor
Status : open
Assigned :
Target :
Creator : matt
Created : Mon, 22 Dec 2008 20:25 (Tue, 23 Dec 2008 01:25:04 +0000)
Write something justifying yet another web framework

Ditz issues have titles, long descriptions, types (feature, bugfix, or task), releases (optionally) and links to components (also optionally). There are ditz plugins to add support for assigning issues to people.

be has most of the same concepts, just with different names.

data serialization and storage

ditz makes a .ditz directory at the top of a project and be makes a .be directory in the top of the project.

Inside the .ditz folder, there’s one project.yaml file that lists releases (groupings of issues) and components (also groupings of issues, but cross-cutting). Then each issue lives in its own yaml file, and they look like this:

$ cat .ditz/issue-ac3177b3bf8c6757625977ef27279c1fe05df662.yaml
--- !ditz.rubyforge.org,2008-03-06/issue
title: Write some "WHY?" documentation
desc: Justify the existence of this project.
type: :task
component: documentation
release:
reporter: Matthew Wilson
status: :unstarted
disposition:
creation_time: 2008-12-23 00:59:05.840956 Z
references: []

id: ac3177b3bf8c6757625977ef27279c1fe05df662
log_events:
- - 2008-12-23 00:59:05.841349 Z
  - Matthew Wilson
  - created
  - ""
- - 2008-12-23 01:08:58.605955 Z
  - Matthew Wilson
  - commented
  - |-
    Yeah, if you're gonna build another web framework, this needs to be
    really good.

Meanwhile, be is fairly similar, but bugs get whole directories to themselves. be uses what seems to be a home-made plain-text format for storing bugs:

$ cat .be/bugs/4da8ee85-9353-4a92-a654-8510bb8be0d0/values

creator=matt

severity=minor

status=open

summary=Write some "WHY?" documentation

time=Tue, 23 Dec 2008 01:12:13 +0000

There’s actually much more whitespace than that. I replaced the eight blank lines between each line of text with just two blank lines.

While ditz stores the comments inside the issue’s yaml file, be makes a directory under the issue’s directory, and then stores the text of the comments in one file and the information about who said it in a separate file.

The community

The ditz mailing list is really active with people debating ideas for new features. The be mailing list is now showing some signs of life after looking dead in August.

What ditz has that be lacks

ditz can make really pretty HTML pages for all the issues for a project.

yaml was a really good choice. yaml makes it easy to deserialize to higher objects than just crappy boring primitive types like arrays. Instead, you can hop all the way to your own weird home made objects by specifying a tag. Then all the stuff in the yaml file gets passed in to your object.

Ditz has lots and lots of commands that are only on the be roadmap. You can search your issues with regular expressions with ditz grep, you can claim issues for yourself, you can group issues by releases and components, etc, etc, etc.

The ditz issue data model can be extended with plugins. Like I mentioned earlier, one plugin makes it possible for people to claim issues as assigned to them.

What I like about be

It’s written in python. I hate to feed the python snobbery monster, but there are certain python niceties that I don’t like doing without. In particular, ipython is just too awesome. When I read the ditz code, I spent most of my time navigating the code to get to the part that I cared about. With ipython, I don’t have that problem. I just hit foo?? and immediately see the source code.

And ruby’s documentation is not what I’ve grown accustomed to with python. For comparison:

I think the Python docs have more explanatory text in just the table of contents.

In addition, ditz uses a lot of homemade code: there’s a homemade option parser library (trollop), a homemade hack on the way ruby stores data files so that all the HTML templates are available, and all sorts of gymnastic FP tricks to get a lot of shit done in a very small number of lines. That’s cool, but as a yellow-belt in Ruby, it is really @#$ing hard to make any contributions to this project. Here’s some code that I find a little difficult to read:

def operation method, desc, *args_spec, &options_blk
  @operations ||= {}
  @operations[method] = { :desc => desc, :args_spec => args_spec,
                          :options_blk => options_blk }
end

operation :stop, "Stop work on an issue", :started_issue do
  opt :comment, "Specify a comment", :short => 'm', :type => String
  opt :no_comment, "Skip asking for a comment", :default => false
end

def stop project, config, opts, issue
  puts "Stopping work on issue #{issue.name}: #{issue.title}."
  issue.stop_work config.user, get_comment(opts)
  puts "Recorded work stop for #{issue.name}."
end

After tracing through a few hundred lines of stuff like that, I usually get discouraged and just write a feature request rather than a patch.

In summary

I like ditz. I like reading nearly inscrutable Ruby code to see how wacky people solve problems. My experience with ditz so far has been about an A minus, which is pretty good!

Why I’m going to write my own

There have been a few times where the lack of a proper database system has bitten me. Like when I renamed a release, I had to do some searching and replacing in lots and lots of files. Also, regenerating my HTML views is taking almost two minutes now that I have so many issues. Also, certain operations, like moving a handful of issues from one release to another, or searching for intersections of issue subsets, are trickier than they should be.

Besides all that, I’m fascinated by couchdb, and I think this would be a good use.

I think my system is going to use a local couchdb server that loads in all the issues from local yaml files into the server on startup. Then after lots of work updating, I’ll write out all the issues back into yaml. So, when you update your checkout of your code, you’ll need to restart or reload your couchdb server. Then you can use the couchdb server to work with the system, and then at the end, re-serialize the data back out to JSON, and then to yaml.

ditz and be are sort of like old-school CGI web apps where each user action has to start up the framework, do the action, then tear down. My system will instead keep all the issue data in memory and require explicit startups and shutdowns.