code-formatting people: I need your help

This is the sort of post my wife says doesn’t count when I tell her I wrote a new blog entry.

Anyhow, I’m curious how people format their code. For the stuff below, how would you write it? Feel free to do whatever you want; make temporary variables, rearrange blocks, whatever. I’m looking for interesting new ways to make my code as easy to comprehend as possible.

Here’s an example of how I break assignments over multiple lines:

# By the way, self.categories is a dictionary, not a list,
# because there's no guarantee that all the keys everywhere
# will be contiguous.
cat1, cat2, cat3, cat4, cat5 = \
[self.categories[x] for x in range(1, 6)]

This next excerpt shows a couple of my habits. I love the new conditional assignment possible in 2.5, I hate making lots of intermediate variables, and I really love using list comprehensions rather than for loops.

def my_send_methods(self, SendMethod, disabled=False):
return [(x.id, x.display_name,
{
'selected':1 if self.preferred_send_method == x else None,
'disabled':1 if disabled else None,
})
for x in SendMethod.constants.values()]

viewer discretion advised on this next block

This next block is the return statement from one of my TurboGears 1.0 controller methods.

return dict(htmlclass="advancedscheduling", v2org=v2org, name=name,

employees=[(0, 'None')] + [(x.id, x.display_name,
{'selected':1 if x.id == employee_id else None})
for x in v2org.employees],

locations=[(x.id, x.display_name, {'selected':1 if x.id == location_id else None})
for x in v2org.locations],

statuses=[(x.id, x.display_name, {'selected':1 if x.id == status_id else None})
for x in model.ShiftStatus.select()],

)

So, how to make that better? I want to preempt suggestions to do stuff like:

a = [...]
b = [...]
c = [...]
return dict(a=a, b=b, c=c)

That’s not better! For two reasons:

  1. You’re taking up way more vertical space

    Taken to its logical conclusion, you’ll take a statement like

    return dict(
    employees=[(0, 'None')] + [(x.id, x.display_name,
    {'selected':1 if x.id == employee_id else None})
    for x in v2org.employees]
    )

    And then rewrite it as

    employees = [(0, 'None')]
    for emp in v2org.employees:
    if emp.id == employee_id:
    d = {'selected': 1}
    else:
    d = {'selected':None}
    t = (emp.id, emp.display_name, d)
    employees.append(t)
    return dict(employees=employees)

    I find this annoying, but not as annoying as the next point.

  2. It is no longer obvious that each value is calculated independently

    Repeating the example from above:

    a = [...]
    b = [...]
    c = [...]
    return dict(a=a, b=b, c=c)

    Until I read inside those list comprehensions, or even worse, follow the functions you used to define each one of those, I can’t be certain that each of those variables are determined independently of each other.

    In other words, in that version, at a casual glance, it is possible that b depends on a. But in the dictionary approach, it is obvious that b does not depend on a, since a doesn’t exist in any way that b can get access to it.

15 thoughts on “code-formatting people: I need your help

  1. Thanks for the feedback!

    I'm talking about how the key-value pairs in your dictionary a are independent, not parameters going into a function. If you can figure out how to set one key in a dictionary based on the value of another key in that dictionary, DURING THE CONSTRUCTION, then I would really like to see it, because it would mean I'm completely wrong.

    Matt

  2. Regarding the “viewer discretion advised” block: I started trying to clean it up, but I don't think I can really make it better. The example is so small that the design of the containing system can't be escaped. What I would probably do, as a first round of repair, is:

    return dict(htmlclass="advancedscheduling",
                v2org=v2org,
                name=name,
                employees=v2org.employees.to_json(),
                locations=v2org.locations.to_json(),
                statuses=[status.to_json()
                          for status in model.ShiftStatus.select()))

    I'm assuming here that v2org is a model object. I don't know whether the constraints imposed by your ORM would allow you to say “v2org.employees.to_json()”, since “employees” might be some kind of result set from the database. But the basic idea I would advocate is that the model objects should know how to express themselves in whatever formats you need, and the controller should just ask them for those formats when it needs them. That way the controller does one job (mediates between user actions and model changes) and the model does one job (exposes the data).

    In general, my opinion is: if your code is complex enough that you need to think hard about formatting it, you definitely are doing too much in the method, and may be doing too much in the class.

  3. All of that looks pretty unpythonic to me. Long lines and long statements are not as clear as breaking things into separate lines and using temporary variables. Your style reminds me of Java where, by the time you are halfway through a construct, you forget what the construct was about.

    As far as actual code, in the first case I would just make the whole assignment in single line, since it is less than 80 characters.

    The next two… yeah, I would break things down to something like you showed in . I just find it much easier to read. I don't think it is a problem to use lots of vertical space when you have to use lots of vertical space. Break it into a function if it seems bothersome.

    I also don't think 2. is really true. Think about this trivial example:

    <pre>
    In [1]: a={'a':1, 'b':2}

    In [2]: def foo(x, y): print x, y
    …:

    In [3]: foo(a.pop('a'), a)
    1 {'b': 2}
    </pre>

    The second parameter is clearly dependent on how you derive the first parameter.

    List comprehensions and generators are great, but you shouldn't overdo them. If the expression becomes long, it is a hint that it is probably too complex to easily read and understand. The obvious exception to this rule would be a case where this piece of code was a performance bottleneck and list comprehension was the fix to make it fast enough.

  4. That's a really good idea. Thanks!

    For a while, I was hung up on the notion of putting code into the model that is really a view-y kind of thing. Somewhere back I wrote a post called MVC blasphemy to that effect, where I added methods to individual model objects so that they would have an HTML-ish form.

  5. Breaking assignments over multiple lines: try to avoid backslashes. They're fragile and ugly. One space after the backslash will break it. Instead, I'd do:

    cat1, cat2, cat3, cat4, cat5 = [
    self.categories[x] for x in range(1, 6)]

    IOW, use the properties of Python's brackets (and braces and parentheses).

    As for conditional expressions, I hope you realize that these two dictionaries are almost identical:

    {'selected':1 if self.preferred_send_method == x else None,
    'disabled':1 if disabled else None,}

    {'selected': self.preferred_send_method == x,
    'disabled': disabled,}

    Python's True == 1, and False == 0, so just use the condition directly as in the second version above. You used None as a false value; unless there's a good reason not to, you should use False instead.

  6. David,

    I flip-flop a lot on using backslashes vs opening a list on one line
    and then putting the meat of the assignments on the next line.

    Thanks for the tip on conditional expressions. I'll go back and see
    if that approach works. I prefer using None rather than False because
    I'm never going to misinterpret None as an integer.

    Thanks for the feedback.

  7. You are not a fan of OOP are you? A functional language like Erlang seems more apropiate for you, Python's parameter passing is not side effect free so your issue #2 doesn't apply much.

    Instead of returning tuples and dictionaries you should be returning class instances where the breaking down into separate lines is more natural. I for one can't understand why selected and disabled aren't properties or methods of employee.

    This also makes the code more readable for collaborators, experience tells me highly functional programs tend to be terse on documentation. If you are programming in Python is probably because you appreciate readability, so not only do I expect you to use intermediary variables, I even expect you to give them descriptive names so everybody knows what that value you just calculated is.

    I understand you have your style and respect it and given that most of your code is written to work like this its probably a bad idea to switch now, so I'll simply encourage you to follow the advises given by Gary and David about proper indentation.

    Nice post, cheers.

  8. I'm not against OOP, I just learned how to program in an environment without any OOP people around me, so I doesn't come naturally. And I've worked with some bad programmers that were also OOP fans, so I've read a lot of really ugly OOP code.

    For example, I dislike having to hop up four or five or more parent classes to find the code that actually does what I want. So maybe in reaction to that, I'm too far on the opposite extreme, where I avoid any anything but low-level data types and literals.

    I write these posts in order to get feedback and hopefully improve. Thanks for your comments.

  9. Thanks for the writeup. I think maybe you can do preformatted text with the code tag, and then use lots of nbsp's to indent.

    def f(a, b, c):
      return a + b + c

  10. <pre>
    The only real gripe I have with the non-standard bool values. If you must have a bool that can’t be used to do arithmetics create some kind of custom Matt bool:

    class mbool:
    def __init__(self,expr):
    self.val=bool(expr)
    def __nonzero__(self):
    return self.val
    def __bool__(self):
    return self.val

    then you can also extract the repeating part:

    def selected_lst(org_entity, select_id):
    return [(x.id, x.display_name, {'selected':mbool(x.id == status_id)})
    for x in org_entity]

    Which allows you to say:
    return dict(
    htmlclass="advancedscheduling",
    v2org=v2org,
    name=name,
    employees=[(0, 'None')] + selected_lst(v2org.employees,employee_id),
    locations=selected_lst(v2org.locations,location_id),
    statuses=selected_lst(model_ShiftStatus.select(),status_id),
    )

    </pre>

  11. Here's how I would format these:

    <pre>
    cat1, cat2, cat3, cat4, cat5 = [self.categories[x] for x in range(1, 6)]
    </pre>

    since it fits in 78 columns (duh!). If it didn't, I'd try

    <pre>
    cat1, cat2, cat3, cat4, cat5 = [self.categories[x]
                                    for x in range(1, 6)]
    </pre>

    and

    <pre>
    cat1, cat2, cat3, cat4, cat5 = [
            self.categories[x] for x in range(1, 6)]
    </pre>

    On to the second example. I find those repeated “1 if condition else None”
    expressions annoying; surely a helper can make them readable

    <pre>
    def output_bool(self, x):
        return 1 if x else None
    </pre>

    I'd name that helper $outputformat_bool for whatever output format you're
    producing.

    Next, I'd be inclined to put every tuple element on a separate line, if at
    least one of them is so large that it doesn't fit on a single line.

    <pre>
    def my_send_methods(self, SendMethod, disabled=False):
        return [(x.id,
                 x.display_name,
                 dict(selected=output_bool(x == self.preferred_send_method),
                      disabled=output_bool(disabled)),
                ) for x in SendMethod.constants.values()]
    </pre>

    For the third example, I see your argument, but I think the intimidation factor
    of a single expression spanning 9 lines of code is worse than having people
    explicitly make sure the expressions are independent:

    <pre>
    employees = ([(0, 'None')] +
                 [(x.id, x.display_name,
                   dict(selected=output_bool(x.id == employee_id)))
                  for x in v2org.employees])
    locations = [(x.id, x.display_name,
                  dict(selected=output_bool(x.id == location_id))
                 for x in v2org.locations]
    statuses = [(x.id, x.display_name,
                 dict(selected=output_bool(x.id == status_id))
                for x in model.ShiftStatus.select()]
    return dict(htmlclass=”advancedscheduling”, v2org=v2org, name=name,
                employees=employees, locations=locations, statuses=statuses)
    </pre>

    Although this particular example has enough regularities to want to extract
    them into a helper

    <pre>
    def item_list(items, selected_id):
        return [(x.id, x.display_name,
                 dict(selected=output_bool(x.id == selected_id))
                for x in items]

    employees = [(0, 'None')] + item_list(v2org.employees, employee_id)
    locations = item_list(v2org.locations, location_id)
    statuses = item_list(model.ShiftStatus.select(), status_id)
    return dict(htmlclass=”advancedscheduling”, v2org=v2org, name=name,
                employees=employees, locations=locations, statuses=statuses)
    </pre>

    HTH,

  12. Here's how I would format these:


    cat1, cat2, cat3, cat4, cat5 = [self.categories[x] for x in range(1, 6)]

    since it fits in 78 columns (duh!). If it didn't, I'd try


    cat1, cat2, cat3, cat4, cat5 = [self.categories[x]
                                    for x in range(1, 6)]

    and


    cat1, cat2, cat3, cat4, cat5 = [
            self.categories[x] for x in range(1, 6)]

    On to the second example. I find those repeated “1 if condition else None”
    expressions annoying; surely a helper can make them readable


    def output_bool(self, x):
        return 1 if x else None

    I'd name that helper $outputformat_bool for whatever output format you're
    producing.

    Next, I'd be inclined to put every tuple element on a separate line, if at
    least one of them is so large that it doesn't fit on a single line.


    def my_send_methods(self, SendMethod, disabled=False):
        return [(x.id,
                 x.display_name,
                 dict(selected=output_bool(x == self.preferred_send_method),
                      disabled=output_bool(disabled)),
                ) for x in SendMethod.constants.values()]

    For the third example, I see your argument, but I think the intimidation factor
    of a single expression spanning 9 lines of code is worse than having people
    explicitly make sure the expressions are independent:


    employees = ([(0, 'None')] +
                 [(x.id, x.display_name,
                   dict(selected=output_bool(x.id == employee_id)))
                  for x in v2org.employees])
    locations = [(x.id, x.display_name,
                  dict(selected=output_bool(x.id == location_id))
                 for x in v2org.locations]
    statuses = [(x.id, x.display_name,
                 dict(selected=output_bool(x.id == status_id))
                for x in model.ShiftStatus.select()]
    return dict(htmlclass="advancedscheduling", v2org=v2org, name=name,
                employees=employees, locations=locations, statuses=statuses)

    Although this particular example has enough regularities to want to extract
    them into a helper


    def item_list(items, selected_id):
        return [(x.id, x.display_name,
                 dict(selected=output_bool(x.id == selected_id))
                for x in items]

    employees = [(0, 'None')] + item_list(v2org.employees, employee_id)
    locations = item_list(v2org.locations, location_id)
    statuses = item_list(model.ShiftStatus.select(), status_id)
    return dict(htmlclass="advancedscheduling", v2org=v2org, name=name,
                employees=employees, locations=locations, statuses=statuses)

    HTH,

Comments are closed.