Instead of setting instance attributes within __init__

I want to make sure that instances of my class have a bunch of attributes. This is the way I’ve always done it in the past:

>>> class C(object):
... def __init__(self):
... self.a = 1
... self.b = 2
...

It gets the job done fine. But when there’s real work to be done in __init__, then I end up with a really long __init__ method with essentially two separate goals. One section does interesting stuff with the parameters passed in, and the other section creates a bunch of attributes.

So now I’m experimenting with setting properties up for my classes like this:

>>> class D(object):
... @property
... def a(self):
... if not hasattr(self, 'a'):
... self.a =1
... return self.a

I like how I’ve moved the instance variables out of __init__, so that my __init__ method can focus entirely on handling the parameters. I’m curious what problems I’m going to have. At first, I thought I would trigger some infinite loop by accessing self.a when inside the property for a, but so far, I haven’t had any problems.

So what is wrong with this approach?

How to load ditz issues into python

Ditz is a fantastic distributed bug tracking system written in Ruby.

Here’s some code that you can use to load some ditz issues into a python interpreter. You need to install my pitz project first though.


>>> from pitz.junkyard.ditzloader import *
>>> from glob import glob
>>> issue_file_path = glob('../.ditz/issue-*.yaml')[0]
>>> import yaml
>>> issue = yaml.load(open(issue_file_path))
>>> issue.title
'Distribute reports by email'
>>> print issue.desc
Somehow allow people to sign up for report subscriptions.


So when new reports come out, they get updated. Maybe I can use
RSS feeds to hold the reports.
>>> print issue.log_events
[[datetime.datetime(2008, 9, 2, 17, 47, 44, 549355), 'Matthew Wilson ', 'created', ''], [datetime.datetime(2008, 9, 2, 19, 18, 3, 286902), 'Matthew Wilson ', 'assigned to release 3.5.1 from unassigned', ''], [datetime.datetime(2008, 9, 4, 18, 27, 19, 571991), 'Matthew Wilson ', 'unassigned from release 3.5.1', '']]

That above just showed how to read the ditz issue. I’m having trouble updating the issue and then saving it in a format that ditz can still read. Updating the ditz issue and saving it out again is easy:


>>> issue.title = 'Distribute reports by email or RSS'
>>> open(issue_file_path, 'w').write(yaml.dump(issue))

But when I try to load it with ditz, I get this error:

$ ditz show 1209b
/home/matt/checkouts/ditz/lib/ditz/model-objects.rb:124:in `sort_by': comparison of String with Time failed (ArgumentError)
from /home/matt/checkouts/ditz/lib/ditz/model-objects.rb:124:in `assign_issue_names!'
from /home/matt/checkouts/ditz/lib/ditz/model-objects.rb:51:in `issues='
from /home/matt/checkouts/ditz/lib/ditz/file-storage.rb:21:in `load'
from /home/matt/checkouts/ditz/bin/ditz:165

I compared my dumped file to the original ruby file and found that python wrote dates out like this:

>>> print yaml.dump(issue.creation_time)
2008-09-02 17:47:43.268059

But in the ruby yaml files, the dates look like this:

$ grep creation_time ../../../.ditz/issue-1209b17b64335383a710ccadf10b74c3401dbcb2.yaml
creation_time: 2008-09-02 17:47:43.268059 Z

That trailing Z seems important.

Also, python and ruby seem to write out lists differently. Here’s how python dumped a list of lists:

>>> print yaml.dump(issue.log_events)
- [!!timestamp '2008-09-02 17:47:44.549355', Matthew Wilson , created,
'']
- [!!timestamp '2008-09-02 19:18:03.286902', Matthew Wilson , assigned
to release 3.5.1 from unassigned, '']
- [!!timestamp '2008-09-04 18:27:19.571991', Matthew Wilson , unassigned
from release 3.5.1, '']

But the same data dumped by ruby looks like:

log_events:
- - 2008-09-02 17:47:44.549355 Z
- Matthew Wilson
- created
- ""
- - 2008-09-02 19:18:03.286902 Z
- Matthew Wilson
- assigned to release 3.5.1 from unassigned
- ""
- - 2008-09-04 18:27:19.571991 Z
- Matthew Wilson
- unassigned from release 3.5.1
- ""

So, there’s clearly some more work for me (or you, this is an open-source project) to do.

Some progress on pitz

Pitz is my open-source, distributed, plain-text, command-line, very flexible issue tracker. I just started working on it and I’m looking for feedback and contributors.

This post describes a little of what I’ve gotten done so far. I haven’t written any interface yet other than from within an interactive python session. Deal with it.

Getting started

You have to make a bag to keep all the tasks and then you make tasks with any keywords you can dream up.


>>> from pitz import Bag, Task
>>> b = Bag('.pitz')
>>> b.append(Task(title='Wash the dishes', creator='Matt', importance='Not very'))
>>> b.append(Task(title='Clean the cat box', creator='Matt', importance='Not very'))

You can add tasks to the bag by using the append method on the bag, or by passing in the bag as the first argument to the task:

>>> Task(b, title='Set new high score on Sushi-go-round', creator='Matt', importance='critical')

Printing a bag really prints the title of each task in the bag:


>>> print(b)
Clean the cat box
Set new high score on Sushi-go-round
Wash the dishes

Running queries

I’m proud of this one. This is the main reason I’m working on ditz. Bag instances have a feature called matching_pairs that can filter the tasks down to a smaller new bag.

You can filter by a single pair like this:


>>> critical_tasks = b.matching_pairs([('importance', 'critical')])
>>> print(critical_tasks)
Set new high score on Sushi-go-round

Or you can filter by multiple pairs and the filtered tasks must satisfy ALL the pairs.

>>> print(b.matching_pairs([('creator', 'Matt'), ('importance', 'Not very')]))
Clean the cat box
Wash the dishes

Dumping and loading

Use the bag that holds all the tasks to quickly write all the tasks out to your hard drive like this:


>>> b.to_yaml_files()
['.pitz/task-07e1af97-0ac6-4904-9187-0c2fd61692b6.yaml', '.pitz/task-6a7af07c-d0fb-4a77-9347-8dc78ef490fe.yaml', '.pitz/task-5ce725dc-c1db-4eca-a74c-55cd0e910786.yaml']

The returned stuff is a list of files that pitz just wrote.

Loading from the hard drive is pretty simple too. Just tell the bag where to load from.

>>> b2 = Bag('.pitz')
>>> print(b2)
Clean the cat box
Set new high score on Sushi-go-round
Wash the dishes

Task details

Printing a task by itself gives all the details on the task.


>>> t = Task.from_yaml_file ('.pitz/task-4d9c1db2-fef3-4b50-8095-b2339384e118.yaml')
>>> print(t)
Do the 2008 taxes
-----------------

type: Task
name: task-4d9c1db2-fef3-4b50-8095-b2339384e118
title: Do the 2008 taxes
created date: 2009-03-01 22:29:58.242512
modified date: 2009-03-01 22:29:58.242512
creator: Matt
last modified by: Matt

description:
Do the 2008 taxes

Other stuff

You can use bags as iterators to go through the tasks one-by-one:

for task in b:
print(t['title'])

Also notice that tasks are really just subclassed dictionaries (UserDict, actually) with some extra methods bolted on.

My new ticket tracking system is now vaporware!

I set up http://pitz.tplus1.com to host my pitz project, which is a python implementation of ditz.

Instead of just banging out code, I decided to write the documentation and the list of supported features FIRST.

Once I have my feature set established, then I’ll write the tests for those features, and finally, I’ll write the code.

I’m looking for feedback on the feature set. What would an ideal bugtracker look like?

Ditz versus bugs everywhere

A few months ago, I sketched out a ticket-tracking system that would be married with my source code. Then some commenters told me about bugs everywhere (be) and ditz.

I’ve looked at both, but I’ve been using ditz full-time while just watching be. Anyhow, here’s a few comparisons:

setting up a project

Here’s what it looks like when you set up a project in ditz:

$ ditz init
I wasn't able to find a configuration file ./.ditz-config.
We'll set it up right now.
Your name (enter for Matthew Wilson):
Your email address (enter for [email protected]):
Directory to store issues state in (enter for .ditz):
Use your text editor for multi-line input when possible (y/n)? y
Paginate output (always/never/auto)? auto
Project name (enter for scratch):
Issues can be tracked across the project as a whole, or the project can be
split into components, and issues tracked separately for each component.
Track issues separately for different components? (y/n): y

Current components:
None!

(A)dd component, (r)emove component, or (d)one: a
Component name: documentation

... snip ...

(A)dd component, (r)emove component, or (d)one: d
Ok, .ditz directory created successfully.

And here’s how you can create a single issue.

$ ditz add
Title: Write something justifying yet another web framework
Is this a (b)ugfix, a (f)eature, or a (t)ask? t
Choose a component:
1) scratch
2) documentation
3) model code
4) controller code
5) view code
Component (1--5): 2
Issue creator (enter for Matthew Wilson ):
Added issue documentation-1 (e8a4a43f78ee83300cc0372a13375d9534b97abb).

You can’t tell, but when I punched in the title, ditz opened my $EDITOR and I wrote a longer description in there.

Now the same thing in be:

$ be set-root
Guessing id 'matt '
No revision control detected.
Directory initialized.

$ be new 'Write something justifying yet another web framework'
Guessing id 'matt '
Guessing id 'matt '
Created bug with ID 4d4

Not quite the same experience!

Here’s what a ditz issue looks like:

$ ditz show documentation-1
Issue documentation-1
---------------------
Title: Write something justifying yet another web framework
Description: Why not just polish any of the ones already out there?
Type: task
Status: unstarted
Creator: Matthew Wilson
Age: four minutes
Release:
References:
Identifier: e8a4a43f78ee83300cc0372a13375d9534b97abb

Event log:
- created (matt, four minutes ago)

And in be:

$ be show 4d4
Guessing id 'matt '
ID : 4d4e6a17-2097-42bb-a3cd-3c17566ecce8
Short name : 4d4
Severity : minor
Status : open
Assigned :
Target :
Creator : matt
Created : Mon, 22 Dec 2008 20:25 (Tue, 23 Dec 2008 01:25:04 +0000)
Write something justifying yet another web framework

Ditz issues have titles, long descriptions, types (feature, bugfix, or task), releases (optionally) and links to components (also optionally). There are ditz plugins to add support for assigning issues to people.

be has most of the same concepts, just with different names.

data serialization and storage

ditz makes a .ditz directory at the top of a project and be makes a .be directory in the top of the project.

Inside the .ditz folder, there’s one project.yaml file that lists releases (groupings of issues) and components (also groupings of issues, but cross-cutting). Then each issue lives in its own yaml file, and they look like this:

$ cat .ditz/issue-ac3177b3bf8c6757625977ef27279c1fe05df662.yaml
--- !ditz.rubyforge.org,2008-03-06/issue
title: Write some "WHY?" documentation
desc: Justify the existence of this project.
type: :task
component: documentation
release:
reporter: Matthew Wilson
status: :unstarted
disposition:
creation_time: 2008-12-23 00:59:05.840956 Z
references: []

id: ac3177b3bf8c6757625977ef27279c1fe05df662
log_events:
- - 2008-12-23 00:59:05.841349 Z
- Matthew Wilson
- created
- ""
- - 2008-12-23 01:08:58.605955 Z
- Matthew Wilson
- commented
- |-
Yeah, if you're gonna build another web framework, this needs to be
really good.

Meanwhile, be is fairly similar, but bugs get whole directories to themselves. be uses what seems to be a home-made plain-text format for storing bugs:

$ cat .be/bugs/4da8ee85-9353-4a92-a654-8510bb8be0d0/values

creator=matt

severity=minor

status=open

summary=Write some "WHY?" documentation

time=Tue, 23 Dec 2008 01:12:13 +0000

There’s actually much more whitespace than that. I replaced the eight blank lines between each line of text with just two blank lines.

While ditz stores the comments inside the issue’s yaml file, be makes a directory under the issue’s directory, and then stores the text of the comments in one file and the information about who said it in a separate file.

The community

The ditz mailing list is really active with people debating ideas for new features. The be mailing list is now showing some signs of life after looking dead in August.

What ditz has that be lacks

ditz can make really pretty HTML pages for all the issues for a project. example.

yaml was a really good choice. yaml makes it easy to deserialize to higher objects than just crappy boring primitive types like arrays. Instead, you can hop all the way to your own weird home made objects by specifying a tag. Then all the stuff in the yaml file gets passed in to your object.

Ditz has lots and lots of commands that are only on the be roadmap. You can search your issues with regular expressions with ditz grep, you can claim issues for yourself, you can group issues by releases and components, etc, etc, etc.

The ditz issue data model can be extended with plugins. Like I mentioned earlier, one plugin makes it possible for people to claim issues as assigned to them.

What I like about be

It’s written in python. I hate to feed the python snobbery monster, but there are certain python niceties that I don’t like doing without. In particular, ipython is just too awesome. When I read the ditz code, I spent most of my time navigating the code to get to the part that I cared about that. With ipython, I don’t have that problem. I just hit foo?? and immediately see the source code.

And ruby’s documentation is not what I’ve grown accustomed to with python. For comparison:

I think the Python docs have more explanatory text in just the table of contents.

In addition, ditz uses a lot homemade code: there’s a homemade option parser library (trollop), a homemade hack on the way ruby stores data files so that all the HTML templates are available, and all sorts of gymnastic FP tricks to get a lot of shit done in a very small number of lines. That’s cool, but as a yellow-belt in Ruby, it is really @#$ing hard to make any contributions to this project. Here’s some code that I find a little difficult to read:

def operation method, desc, *args_spec, &options_blk
@operations ||= {}
@operations[method] = { :desc => desc, :args_spec => args_spec,
:options_blk => options_blk }
end

operation :stop, "Stop work on an issue", :started_issue do
opt :comment, "Specify a comment", :short => 'm', :type => String
opt :no_comment, "Skip asking for a comment", :default => false
end

def stop project, config, opts, issue
puts "Stopping work on issue #{issue.name}: #{issue.title}."
issue.stop_work config.user, get_comment(opts)
puts "Recorded work stop for #{issue.name}."
end

After tracing through a few hundred lines of stuff like that, I usually get discouraged and just write a feature request rather than a patch.

In summary

I like ditz. I like reading nearly inscrutable Ruby code to see how wacky people solve problems. My experience with ditz so far has been about an A minus, which is pretty good!

Why I’m going to write my own

There have been a few times where the lack of a proper database system has bit me. Like when I renamed a release, I had to do some searching and replacing in lots and lots of files. Also, regenerating my HTML views is taking almost two minutes now that I have so many issues. Also, certain operations, like moving a handful of issues from one release to another, or searching for intersections of issue subsets, are trickier than what they should be.

Besides all that, I’m fascinated by couchdb, and I think this would be a good use.

I think my system is going to use a local couchdb server that loads in all the issues from local yaml files into the server on startup. Then after lots of work updating, I’ll write out all the issues back into yaml. So, when you update your checkout of your code, you’ll need to restart or reload your couchdb server. Then you can use the couchdb server to work with the system, and then at the end, re-serialize the data back out to JSON, and then to yaml.

ditz and be are sort of like old-school CGI web apps where each user action has to start up some the framework, do the action, then tear down. My system will instead keep all the issue data in memory and require explicit startups and shutdowns.

A few simple PostgreSQL tricks

I’ve got a bunch of these things collecting dust in a text file in my home directory. Maybe they’ll help somebody else out.

Transpose display in psql with \x


matt=# select now(), 1 + 1 as sum;
now | sum
-------------------------------+-----
2008-08-20 13:00:44.178076-04 | 2
(1 row)

matt=# \x
Expanded display is on.
matt=# select now(), 1 + 1 as sum;
-[ RECORD 1 ]----------------------
now | 2008-08-20 13:00:46.930375-04
sum | 2
matt=# \x
Expanded display is off.
matt=# select now(), 1 + 1 as sum;
now | sum
-------------------------------+-----
2008-08-20 13:01:19.725394-04 | 2
(1 row)

See how long every query takes


matt=# \timing
Timing is on.
matt=# select now();
now
-----------------------------
2008-12-18 12:31:50.60008-05
(1 row)

Time: 76.322 ms

By the way, you can put these into your $HOME/.psqlrc to always turn on timing at the beginning of a session::

$ cat $HOME/.psqlrc

-- Report time used for each query.
\timing

Define a function in python

First you have to install plpython. Then connect to the database as somebody with sufficient privileges and then type:

CREATE LANGUAGE 'plpythonu';

to allow functions defined in plpython.

Here’s a toy example:

matt=# create or replace function snoz (INT)
returns INT AS
$$
x = args[0]
return x * x * x
$$ language 'plpythonu';
CREATE FUNCTION

And here it is in use:

matt=# select 3, snoz(3);
?column? | snoz
----------+------
3 | 27
(1 row)

Use a trigger to set the modified date column

First define a function to set the modifieddate column:

create or replace function ts_modifieddate()
returns TRIGGER as
'
BEGIN
NEW.modifieddate = now();
RETURN NEW;
END;
' language 'plpgsql';

Now set up the trigger to call that function::

create trigger set_modifieddate before update
on knid for each row
execute procedure ts_modifieddate();

Use a trigger to set one column based on other columns

I got a table like this:

create table snoz (
a bool default false,
b bool default false,
c bool default false
);

I want a trigger to set c to true when a and/or b is true.

First, I define a function that does what I want:

create or replace function set_c()
returns trigger as $$
BEGIN
if NEW.a = true or NEW.b = true then NEW.c = true; END IF;
RETURN NEW;
END;
$$ language 'plpgsql';

Now I wire up that trigger to execute when I want:

create trigger update_c
before insert or update
on snoz
for each row
execute procedure set_c();

Here it is all in action:

matt=# insert into snoz (a, b) values (false, false), (false, false);
INSERT 0 2
matt=# select * from snoz;
a | b | c
---+---+---
f | f | f
f | f | f
(2 rows)

matt=# insert into snoz (a, b) values (false, true), (true, false);
INSERT 0 2
matt=# select * from snoz;
a | b | c
---+---+---
f | f | f
f | f | f
f | t | t
t | f | t
(4 rows)

See, it works!

code-formatting people: I need your help

This is the sort of post my wife says doesn’t count when I tell her I wrote a new blog entry.

Anyhow, I’m curious how people format their code. For the stuff below, how would you write it? Feel free to do whatever you want; make temporary variables, rearrange blocks, whatever. I’m looking for interesting new ways to make my code as easy to comprehend as possible.

Here’s an example of how I break assignments over multiple lines:

# By the way, self.categories is a dictionary, not a list,
# because there's no guarantee that all the keys everywhere
# will be contiguous.
cat1, cat2, cat3, cat4, cat5 = \
[self.categories[x] for x in range(1, 6)]

This next excerpt shows a couple of my habits. I love the new conditional assignment possible in 2.5, I hate making lots of intermediate variables, and I really love using list comprehensions rather than for loops.

def my_send_methods(self, SendMethod, disabled=False):
return [(x.id, x.display_name,
{
'selected':1 if self.preferred_send_method == x else None,
'disabled':1 if disabled else None,
})
for x in SendMethod.constants.values()]

viewer discretion advised on this next block

This next block is the return statement from one of my TurboGears 1.0 controller methods.

return dict(htmlclass="advancedscheduling", v2org=v2org, name=name,

employees=[(0, 'None')] + [(x.id, x.display_name,
{'selected':1 if x.id == employee_id else None})
for x in v2org.employees],

locations=[(x.id, x.display_name, {'selected':1 if x.id == location_id else None})
for x in v2org.locations],

statuses=[(x.id, x.display_name, {'selected':1 if x.id == status_id else None})
for x in model.ShiftStatus.select()],

)

So, how to make that better? I want to preempt suggestions to do stuff like:

a = [...]
b = [...]
c = [...]
return dict(a=a, b=b, c=c)

That’s not better! For two reasons:

  1. You’re taking up way more vertical space

    Taken to its logical conclusion, you’ll take a statement like

    return dict(
    employees=[(0, 'None')] + [(x.id, x.display_name,
    {'selected':1 if x.id == employee_id else None})
    for x in v2org.employees]
    )

    And then rewrite it as

    employees = [(0, 'None')]
    for emp in v2org.employees:
    if emp.id == employee_id:
    d = {'selected': 1}
    else:
    d = {'selected':None}
    t = (emp.id, emp.display_name, d)
    employees.append(t)
    return dict(employees=employees)

    I find this annoying, but not as annoying as the next point.

  2. It is no longer obvious that each value is calculated independently

    Repeating the example from above:

    a = [...]
    b = [...]
    c = [...]
    return dict(a=a, b=b, c=c)

    Until I read inside those list comprehensions, or even worse, follow the functions you used to define each one of those, I can’t be certain that each of those variables are determined independently of each other.

    In other words, in that version, at a casual glance, it is possible that b depends on a. But in the dictionary approach, it is obvious that b does not depend on a, since a doesn’t exist in any way that b can get access to it.

Rewrite my ugly code

I have a big gnarly function named get_start_and_stop_dates. Please rewrite it into something that still passes all the doctests but isn’t so ugly.

If you save this in a file named matt_sucks.py, you can run all the doctests like this:

$ nosetests --with-doctest matt_sucks.py
Doctest: matt_sucks.get_first_and_last_dom ... ok
Doctest: matt_sucks.get_start_and_stop_dates ... ok
Doctest: matt_sucks.stubborn_datetimeparser ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.028s

OK

Here’s the code:

import simplejson
from datetime import date, datetime, timedelta

def get_start_and_stop_dates(d, s):

"""
Returns a tuple of datetime.date objects.

First checks dictionary d, then looks in the cookie s, then returns
the results of get_first_and_last_dom().

We return values from the dictionary d, even if the values exist in
simple_cookie s:

>>> d = {'start_date':'12-07-2008', 'stop_date':'12-20-2008'}
>>> import Cookie, simplejson
>>> s = Cookie.SimpleCookie()
>>> s['start_date'] = simplejson.dumps('12-08-2008')
>>> s['stop_date'] = simplejson.dumps('12-11-2008')
>>> a, b = get_start_and_stop_dates(d, s)
>>> from datetime import date
>>> isinstance(a, date) and isinstance(b, date)
True
>>> a.strftime('%m-%d-%Y'), b.strftime('%m-%d-%Y')
('12-07-2008', '12-20-2008')

If the dictionary d doesn't have values, then we get them from the
simple_cookie object s:

>>> a, b = get_start_and_stop_dates({}, s)
>>> from datetime import date
>>> isinstance(a, date) and isinstance(b, date)
True
>>> a.strftime('%m-%d-%Y'), b.strftime('%m-%d-%Y')
('12-08-2008', '12-11-2008')

We handle mix-and-match scenarios, like where one value is in d and
another is in s:

>>> s2 = Cookie.SimpleCookie()
>>> s2['stop_date'] = simplejson.dumps('2-28-1975')
>>> get_start_and_stop_dates({'start_date':'2-17-1975'}, s2)
(datetime.date(1975, 2, 17), datetime.date(1975, 2, 28))

When just one of the dates is specified, then the other will be
the first/last day of the month containing the other date:

>>> get_start_and_stop_dates({'start_date':'2-17-1975'},
... Cookie.SimpleCookie())
(datetime.date(1975, 2, 17), datetime.date(1975, 2, 28))

>>> get_start_and_stop_dates({'stop_date':'2-17-1975'},
... Cookie.SimpleCookie())
(datetime.date(1975, 2, 1), datetime.date(1975, 2, 17))

Finally, we call get_first_and_last_dom when all else fails:

>>> get_first_and_last_dom() == get_start_and_stop_dates({},
... Cookie.SimpleCookie())
True
"""

# I've revised this several times, but this is still pretty ugly.
# It probably can be divided into more functions.

# These are the last-resort values, holding the first and last days
# of the current month.
first, last = get_first_and_last_dom()

# These are the dateformats that the dates will be in.
dateformats = ['%m-%d-%Y', '%Y-%m-%d', '%Y-%m-%d %H:%M:%S']

start_date = stop_date = None

# Figure out the start_date first.
if 'start_date' in d and d['start_date']:
start_date = stubborn_datetimeparser(d['start_date'],
dateformats).date()

elif s.has_key('start_date') and s['start_date'].value:
start_date = stubborn_datetimeparser(simplejson.loads(s['start_date'].value),
dateformats).date()

# Now repeat the process for stop_date.
# TODO: pull this redundancy into a single function and call it
# twice.
if 'stop_date' in d and d['stop_date']:
stop_date = stubborn_datetimeparser(d['stop_date'],
dateformats).date()

elif s.has_key('stop_date') and s['stop_date'].value:
stop_date = stubborn_datetimeparser(simplejson.loads(s['stop_date'].value),
dateformats).date()

# Now figure out what to return. Remember, if we found one date,
# but not the other, then we return the first/last date of that month,
# not the current month.

if not start_date and not stop_date:
return first, last

elif start_date and stop_date:
return start_date, stop_date

elif start_date and not stop_date:
a, b = get_first_and_last_dom(start_date)
return start_date, b

elif not start_date and stop_date:
a, b = get_first_and_last_dom(stop_date)
return a, stop_date

def get_first_and_last_dom(dt="today"):
"""
Return a tuple of datetime.date objects with the first and last day of the
month holding dt, which defaults to today.

>>> first, last = get_first_and_last_dom(datetime(2008, 12, 6))
>>> first == datetime(2008, 12, 1).date()
True
>>> last == datetime(2008, 12, 31).date()
True

>>> first, last = get_first_and_last_dom(datetime(2008, 11, 30))
>>> first == datetime(2008, 11, 1).date()
True
>>> last == datetime(2008, 11, 30).date()
True
"""

if dt == "today":
dt = datetime.now()

# We have to be a little careful figuring out the last date.
if dt.month < 12: last_day = (datetime(dt.year, dt.month+1, 1) - timedelta(days=1)).date() else: last_day = (datetime(dt.year+1, 1, 1) - timedelta(days=1)).date() first_day = datetime(dt.year, dt.month, 1).date() return first_day, last_day def stubborn_datetimeparser(s, dateformats): """ Keep trying to parse s into a datetime object until we succeed or run out of dateformats. When the first format works, we immediately return: >>> dateformats = ['%Y-%m-%d', '%m-%d-%Y', '%m-%d-%Y %H:%M']
>>> stubborn_datetimeparser('12-1-2008', dateformats)
datetime.datetime(2008, 12, 1, 0, 0)

Otherwise, we keep trying until we parse it:

>>> stubborn_datetimeparser('12-1-2008', dateformats)
datetime.datetime(2008, 12, 1, 0, 0)

>>> stubborn_datetimeparser('12-1-2008 15:47', dateformats)
datetime.datetime(2008, 12, 1, 15, 47)

or we run out of formats, and raise a ValueError:

>>> stubborn_datetimeparser('12/1/2008', dateformats)
Traceback (most recent call last):
...
ValueError: I couldn't parse '12/1/2008' with any of my formats!
"""

for datefmt in dateformats:
try:
return datetime.strptime(s, datefmt)

except ValueError:
pass

# This else matches the for datefmt in dateformats loop. It means
# that we didn't break out of the loop early.
else:
raise ValueError("I couldn't parse '%s' with any of my formats!" % s)

Now get to work!

Define your validation schema inline

The TurboGears docs show how to assign validators for individual parameters in the validate decorator like this:

@validate(validators={'a':validators.Int(), 'b':validators.DateConverter()})
@error_handler()
def f(self, a, b, tg_errors=None):
# Now a is already an integer and b is already a datetime.date object,
# unless there were some validation errors.

That’s great, but there are some validations that depend on numerous parameters at the same time. For example, you might want to make sure that an employee’s hire date precedes the termination date.

I already knew how to subclass validators.Schema to do this, and then pass that instance into the validate decorator like this:

class MattSchema(validators.Schema):
a = validators.Int()
b = validators.DateConverter()
chained_validators = [blah] # pretend that blah does some compound validation.

@validate(validators=MattSchema())
def f(self, a, b)

This approach is fine, but today I discovered that it is also possible to define a Schema inline, inside the validate decorator, and specify the chained_validators right there, like this:

@expose('.templates.shiftreports.overtime')
@validate(validators=validators.Schema(
a=validators.Int(),
dt=validators.DateConverter(),
chained_validators=[blah]),
state_factory=matt_state_factory)
def f(self, a, b):

What’s the point? Well, it seems wasteful to define a class and hide it in another file if that schema is only going to be used for exactly one controller. Also, this makes it really fast for me to mix and match comound validators with controllers. I don’t need to pop open my separate validators file where all my elaborate schemas live. I can define them right here.

I’m very forgetful too, so I like to keep my code shallow so that I can instantly see what the heck something does. With all the validators right there, I can easily figure out what the system intends to do.

However, I would define a Schema subclass as soon as I see that I need the same thing twice.

I’m happy that the FormEncode authors had the foresight to support this inline approach along with the declarative style.