The Dark Knight deconstructed

This story articulates my inchoate thoughts when I walked out of the theater after seeing The Dark Knight.

This blockquote will hook you or repel you:

Wayne uses his position as an Anglo-Saxon capitalist to marshal vast resources to develop military grade technology and materiel in the ‘fight’ against the ‘gangs’ in Gotham. Missiles, grenades, various projectiles, military or special-forces transportation methods (Fox borrows ideas from the CIA’s assassination squads quite overtly, i.e. “Skyhook”), mass surveillance, and old-fashioned brutality are Wayne’s stock-in-trade. All the while, he obfuscates his identity as the “Batman” in order to protect his position as capitalist, and to avoid public responsibility for his extra-legal violence. Of course, the actual Gotham police have no intention of arresting Batman for vigilante savagery, despite public acrimony over the rule of law.

I’ve written stuff in this tone, but this fellow is a master.

Posted in mlp

Toward a horrifying new workflow system

Offline version control is nice. Reviewing logs, viewing diffs, and applying merges is dramatically faster when everything is local.

It would be nice to be able to use my ticketing system offline also. I don’t like the context switch of leaving my editor to go over to my web-based ticket system when I want to make a note of something to do later.

Completely unrelated to offline access is the idea that most of the time, my ticketing system and my source control system are barely aware of each other.

For example, when I write a “fix this low-priority bug” ticket, it isn’t obvious what revision and branch of my code I’m talking about. Sure, I can add that information manually, and maybe my ticket system will even require me to do that, but I can very easily put incorrect information in there.

I’m thinking about starting an open-source project to make this happen. Please leave a comment with your thoughts (if you have any).

Goals

  • Allow offline access to reading and writing tickets.
  • Unify source control with ticketing so that tasks, requested features, and bug reports are linked with the relevant code.

What fields are in a ticket?

I’ve used bugzilla, various homemade ticketing systems, trac, and redmine. They all have at least these fields in every ticket:

  • Globally unique ID
  • Title
  • Description
  • Comments
  • Deadline
  • Status
  • Priority
  • Assigned To

A decent ticketing system usually also supports category keywords, lists of interested people that want to be included in discussions, links to external files, estimated time cost, etc.

Also, a decent ticketing system has lots of nice aggregate views of tickets.

I don’t want to talk about building a *decent* distributed system. I’ll settle for a toy one that just barely works.

Text file approach

Put each ticket in a separate file. Name the files like “make-login-box-green.txt”. Use some markup to divide the file into fields.

Append comments and follow-up notes to the end of the ticket.
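Here is a rough sketch of what "some markup" could look like. It assumes a made-up format: "Field: value" header lines, then a blank line, then the free-form description and any appended comments. The field names and the parse_ticket helper are hypothetical, not part of any existing tool.

```python
def parse_ticket(text):
    """Split a ticket file into header fields and a free-form body.

    Assumed (hypothetical) format: 'Field: value' lines, then a blank
    line, then the description and any appended comments.
    """
    header, _, body = text.partition("\n\n")
    fields = {}
    for line in header.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'Field: value'
            fields[name.strip().lower()] = value.strip()
    return fields, body.strip()


SAMPLE = """Title: Make login box green
Status: open
Priority: low
Deadline: 2008-09-01

The login box should match the new color scheme.
"""

fields, body = parse_ticket(SAMPLE)
print(fields["title"])
print(body)
```

Splitting on the first colon only means a value such as a date or a title containing a colon survives intact.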

This is easy and intuitive. I can use vim to write the tickets. The source control system tracks changes to the text file itself, so it reveals if I renamed the title.

The disadvantage is that doing diffs on different versions of a text file just notices text differences.

For example, if I rename a ticket’s title from one line to two lines, and then also change the deadline, all I can see in my report of what changed is the information about what characters on what lines are different.

Local client approach

Use something like the text file approach, with a few bells and whistles.

Instead of just firing up vim/emacs/notepad/textmate and starting typing, I want to use a simple local app like this:

$ cd ~/projects/myproj1/trunk/tickets
$ dist-ticket new --title "Change color of login buttons from green to pink"

Then the app would open $EDITOR with a new file that includes the template of a new ticket, populated with the command-line options passed in.

This local client could give the new file a name that includes a UUID, so that file names are globally unique.
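A minimal sketch of the naming idea, assuming the file name is a readable slug of the title followed by a random UUID. The ticket_filename helper and the exact layout of the name are my own assumptions.

```python
import re
import uuid


def ticket_filename(title):
    """Build a file name from a readable slug plus a random UUID."""
    # Collapse anything that is not a letter or digit into single dashes.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}-{uuid.uuid4()}.txt"


name = ticket_filename("Change color of login buttons from green to pink")
print(name)
```

A random UUID means two people creating tickets offline never have to coordinate to avoid a name clash, which a simple counter like ticket-123 would require.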

The app would also be able to compare diffs between revisions of tickets, like this:

$ dist-ticket diff -r 1:2 ticket-123.txt
Title renamed from "Change color of login buttons from green to pink"
to "Change login buttons style"
Deadline extended by six months.
Changed priority from High to Low.

Of course, the underlying source control system would still be able to show the text differences between the two files. But the dist-ticket script would apply a little bit more intelligence.
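The "little bit more intelligence" could start as small as the sketch below. It assumes each revision of a ticket has already been parsed into a dict of fields; diff_fields and the message wording are hypothetical, and smarter messages like "Deadline extended by six months" would need per-field logic on top.

```python
def diff_fields(old, new):
    """Describe field-level changes between two parsed tickets (dicts)."""
    messages = []
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before == after:
            continue
        if before is None:
            messages.append(f"Added {name}: {after!r}")
        elif after is None:
            messages.append(f"Removed {name} (was {before!r})")
        else:
            messages.append(f"Changed {name} from {before!r} to {after!r}")
    return messages


rev1 = {"title": "Change color of login buttons from green to pink",
        "priority": "High"}
rev2 = {"title": "Change login buttons style", "priority": "Low"}

for message in diff_fields(rev1, rev2):
    print(message)
```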

Stuff I’m pretty sure about

I want this thing to be agnostic to the underlying source control system. People using centralized source control should still be able to use it and people using some homemade crap should all be able to use it.

Initially it might not be possible to have dist-ticket go and talk to the VCS and get revisions. Instead of the dist-ticket diff thing above, we might need to do this at first:

$ bzr cat -r 1 ticket-123.txt > /tmp/rev1.txt
$ bzr cat -r 2 ticket-123.txt > /tmp/rev2.txt
$ dist-ticket diff /tmp/rev1.txt /tmp/rev2.txt

So after the interface is defined, I can build the implementations for the VCS that I care about. The first two lines could be built into dist-ticket.
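One way the interface might look, as a sketch only: any backend object that offers cat(path, revision) can plug in. The BzrBackend class and fetch_revisions function are hypothetical names; the only real command used is the bzr cat -r call shown above.

```python
import subprocess


class BzrBackend:
    """Hypothetical adapter; the only bzr command used is `bzr cat -r`."""

    def cat_command(self, path, revision):
        return ["bzr", "cat", "-r", str(revision), path]

    def cat(self, path, revision):
        # Needs bzr on PATH and a working tree; raises if the command fails.
        result = subprocess.run(self.cat_command(path, revision),
                                capture_output=True, text=True, check=True)
        return result.stdout


def fetch_revisions(backend, path, old_rev, new_rev):
    """Return the text of one file at two revisions, using any backend
    object that offers cat(path, revision)."""
    return backend.cat(path, old_rev), backend.cat(path, new_rev)
```

Supporting another VCS would then mean writing one more small class with the same cat method, and nothing else in the tool would need to know which one is in use.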

Associating tickets, branches, and revisions

I know I want to link my source control system somehow to my ticketing system. I see a couple of ways to do it:

  • Underneath each branch, make a top-level directory called tickets. Then tickets would be seen as related to that branch. One ticket would be linked to exactly one branch.

    OR

  • Keep tickets in some place completely outside the branches directory.

    Make some other intermediate data structure that holds a many-to-many relationship between tickets and branches.

I like the second approach a lot more than the first approach. I would love to be able to look backward and see how I started working on feature X95321 as of revision 123, and then marked it complete as of revision 148. It would be nice to easily see which files are relevant and which ones are not relevant.

Maybe I would need to hack the VCS commit process somehow to track relevant tickets for each commit. Maybe I could just store extra text in the comment part of the commit.
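The "extra text in the commit comment" option can be sketched without touching the VCS at all. This assumes a made-up [ticket:ID] tag convention in commit messages and a list of (revision, message) pairs pulled from the log; both are my assumptions, as is the index_commits name.

```python
import re
from collections import defaultdict

# Hypothetical convention: commit messages carry tags like [ticket:X95321].
TICKET_REF = re.compile(r"\[ticket:([A-Za-z0-9-]+)\]")


def index_commits(commits):
    """Map each ticket ID to the sorted revisions that mention it.

    `commits` is an iterable of (revision_number, message) pairs.
    """
    index = defaultdict(set)
    for revision, message in commits:
        for ticket_id in TICKET_REF.findall(message):
            index[ticket_id].add(revision)
    return {ticket: sorted(revs) for ticket, revs in index.items()}


commits = [
    (123, "Start login rework [ticket:X95321]"),
    (130, "Unrelated typo fix"),
    (148, "Finish login rework [ticket:X95321] [ticket:X95400]"),
]
index = index_commits(commits)
print(index["X95321"])
```

Because one commit can name several tickets and one ticket can show up in many commits, the index is naturally many-to-many, and the first and last entries for a ticket give the "started at 123, finished at 148" view.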

I finally have a project that justifies learning prolog

Sometimes when I’m feeling batty, I’ll put the Three Laws of Robotics in my source code. Usually, this is a veiled insult aimed at myself; the comment says my script has gotten so morbidly complex that it threatens to wake up and kill me.

On a completely unrelated note, I’ve been picking at the edges of prolog for the last couple of years. I’ve worked my way through a free Prolog textbook, and now I’m very slowly working my way through Language, Proof and Logic in order to learn me some predicate calculus.

Now I thought of a project that combines these two. I’m gonna build, er, daydream about defining an ontology suitable for making robots comply with those three laws of robotics. Once I finish, it would work like this:


> Eat baby
Violates law #1!
> Mop floor
OK
> Burn down abandoned house
OK

You get the idea.

PyOhio was a smashing success

The Columbus Metro Library offered a fantastic location for us. Wireless internet, multiple meeting rooms, one room with about 30 workstations, etc. Really great location. A $15 donation makes you a “friend” of the library, and gets you a 15% discount at the coffee booth.

Catherine Devlin led the charge of organizing this conference, and she did it amazingly well.

The slides from my decorator talk are available here. I’ll be breaking them down into a series of blog posts with a lot more commentary, so stay tuned.

Howard Roark!

From the article:

The drink request Sunday, said Simmermon, who was visiting from Brooklyn, was denied by a barista who told him that Murky doesn’t do espresso over ice. Irked, Simmermon said he asked for a triple espresso and a cup of ice, which he said the barista provided, grudgingly.

Apparently Murky Coffee would prefer to do it right or not at all. Brilliant. I completely respect that.

Posted in mlp

Notes from clerb meeting on Thursday, July 17th

DimpleDough provided a great location for this month’s Cleveland Ruby Users Group and they even shelled out for dinner. We heard a really good talk about ruby and F#.

The ruby material covered some neat corners of the language like the method_missing method, which operates like python’s __getattr__. Here’s a toy example of how it can be used:

irb(main):005:0> class C
irb(main):006:1> def foo
irb(main):007:2> 1
irb(main):008:2> end
irb(main):009:1> end
=> nil
irb(main):010:0> c = C.new
=> #
irb(main):011:0> c.foo
=> 1
irb(main):012:0> class C
irb(main):013:1> def method_missing(m, *args)
irb(main):014:2> puts "you tried to call a method #{m}"
irb(main):015:2> end
irb(main):016:1> end
=> nil
irb(main):017:0> c.baz
you tried to call a method baz
=> nil
irb(main):018:0>

Incidentally, note how I added a new method to class C after I originally defined it. That’s a cute trick in ruby. I can imagine a lot of nasty misuses of that, but I think the “we’re all consenting adults” rule should apply. And when a class has dozens of methods, it might be helpful to divide them across different files.
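For comparison, here is a rough Python counterpart of the irb session, using __getattr__, which Python calls only when normal attribute lookup fails. One difference worth flagging: this sketch returns the message instead of printing it.

```python
class C:
    def foo(self):
        return 1

    def __getattr__(self, name):
        # Reached only for attributes that normal lookup cannot find.
        def missing(*args, **kwargs):
            return f"you tried to call a method {name}"
        return missing


c = C()
print(c.foo())
print(c.baz())
```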

We talked about currying as well, in the context of F#. I tend to use currying in this scenario:

  • I recognize that two separate functions could be refactored to be a single function with a whole bunch more parameters;
  • I remake the original functions as curried versions of the new super function.

In other words, if I already have two methods, like paint_it_red(it) and paint_it_green(it), it’s trivial to realize I could write a paint_it_some_color(it, color) and then replace the original paint_it_red with a curried version.

I found this really useful when it isn’t just a single parameter I’m fixing to a constant value, but maybe a whole bunch.
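In Python, the paint example can be sketched with functools.partial. Strictly speaking this is partial application rather than true currying, but it covers the refactoring described above. The function bodies are invented for illustration.

```python
from functools import partial


def paint_it_some_color(it, color):
    return f"{it} painted {color}"


# The old single-purpose functions become thin presets of the general one.
paint_it_red = partial(paint_it_some_color, color="red")
paint_it_green = partial(paint_it_some_color, color="green")

print(paint_it_red("fence"))
```

The same pattern scales to several fixed parameters at once: pass them all as keyword arguments to partial.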

Apparently, Ruby will add currying support in 1.9. I tried to see if I could “fake it” in the irb interpreter, but I just made a mess:

irb(main):036:0> def f(a, b)
irb(main):037:1> a + b
irb(main):038:1> end
=> nil

Nothing interesting so far. f adds its two parameters. So now I’m going to try to make a new function that returns a version of function f with the first parameter a set to a fixed value:

irb(main):046:0> def curried_f(a)
irb(main):047:1> def g(b)
irb(main):048:2> a+b
irb(main):049:2> end
irb(main):050:1> return g
irb(main):051:1> end
irb(main):053:0> curried_f(1)
ArgumentError: wrong number of arguments (0 for 1)
from (irb):50:in `g’
from (irb):50:in `curried_f’
from (irb):53
from :0

The problem (I think) stems from how in Ruby, if I just type the name of the function, the function gets called. So in line 50, when I’m trying to return a reference to the new function I just created, Ruby evaluates the result of calling g without giving it any parameters.

I bet I’m doing something very un-ruby-tastic with this approach. I’m probably supposed to leverage those anonymous blocks instead.
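For contrast, here is a sketch of what the irb session was reaching for, written in Python, where a bare function name is just a reference and nothing gets called until the parentheses appear.

```python
def f(a, b):
    return a + b


def curried_f(a):
    def g(b):
        return f(a, b)
    return g  # a bare name refers to the function; it is not called here


add_one = curried_f(1)
print(add_one(2))
```

The inner function g closes over a, so each call to curried_f hands back a new function with its own fixed first argument.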

F# looks really interesting. It supports all those weird prolog/erlang/haskell-style features like single assignment, pattern matching, and optimized tail-call recursion, with the benefit of having access to the .NET libraries as well.

One of the best professors I studied under made a remark that in COBOL, you think for five minutes and then type for two hours, but in prolog, you think for two hours, and then type for five minutes. I agree. I have learned a lot of languages, but I haven’t gotten any smarter. I’m just learning how to map my thoughts into notation much more quickly.

I would love to have the time and reason to do a project with F#. I think I’ll start by installing mono and messing around.

I need to write faster tests

This is not ideal:

----------------------------------------------------------------------
Ran 84 tests in 370.741s

OK

My tests take so long for two reasons. First of all, most of them use twill to simulate a browser walking through a version of the web app running on localhost. Second, my test code reads like a novel. Here’s an example, slightly embellished to make a point:

setup: connect to the database and find or create a hospital and an employee named “Nurse Ratched.” Find or create a bunch of open shifts in the emergency department. Find or create another nurse named Lunchlady Doris*.

test: Nurse Ratched wants to see what shifts are available to be picked up. So she logs into the app. Then she navigates to the “open shifts” screen, and then filters down to shifts in the emergency department over the next seven days. Then she wants to sign up for the shift starting at midnight on Saturday night. So, she clicks the “sign up” icon. The system verifies that this shift + her already-scheduled hours won’t push her into overtime, and she has no other flags on her account, so she is automatically scheduled.

Then the system sends her a confirmation message, which according to her preferences, is sent to her email address. Then the system queues an SMS message to be delivered an hour before the shift starts in order to remind her (also according to her preferences).

Finally, the test verifies that the shift is now not listed as available by simulating Lunchlady Doris logging in and checking that same “open shifts” screen.

If everything checks out, print a dot, and move on to the next chapter.

teardown: Unassign Nurse Ratched from the shift she picked up.

I think twill in itself is fine. Marching through a series of pages is problematic. I do this to set up conditions for testing later on. As a side benefit, I verify everything checks out along the way.

On the plus side, I’m confident that all these components do in fact play nice together. I don’t think it’s safe to abandon end-to-end testing like this, but I would like not to depend on it every time I want to make some slight change to a component. It would be nice to run these right before a commit, but only run some super-fast tests after each save.
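A super-fast test could look like the sketch below. It assumes the overtime rule from the story can be pulled out into a plain function with no database or browser involved. The would_cause_overtime name, the 40-hour limit, and the numbers are all hypothetical.

```python
def would_cause_overtime(scheduled_hours, shift_hours, weekly_limit=40):
    """Hypothetical rule: True if adding the shift exceeds the weekly limit."""
    return scheduled_hours + shift_hours > weekly_limit


# Plain test functions, the style that nose collects by name.
def test_shift_fits_under_limit():
    assert not would_cause_overtime(32, 8)


def test_shift_pushes_into_overtime():
    assert would_cause_overtime(36, 8)
```

Tests like these touch nothing outside the process, so they can run after each save, while the twill walkthroughs stay as the pre-commit check.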


[*] People that understand this reference should reevaluate their priorities in life.

I heart Python doctests

I wrote the doctests for the function below and then wrote the code to satisfy them in a total of about 30 seconds. As an extra plus, these doctests immediately clarify behavior in corner cases.

def has_no(s):
    """
    Return False if string s doesn't have the word 'no' inside.

    >>> has_no('no problem')
    True

    >>> has_no('not really')
    False

    >>> has_no('no')
    True

    >>> has_no('oh nothing')
    False
    """

    if s.lower() == 'no': return True
    if s.lower().startswith('no '): return True
    if s.lower().endswith(' no'): return True
    if ' no ' in s.lower(): return True

    return False

Writing tests in any other testing framework would have taken me much longer. It isn’t the assertions themselves. With nose, writing this:

assert not has_no('oh nothing')

wouldn’t take me any more time than

>>> has_no('oh nothing')
False

But that’s not all there is to it. With nose, I’d need to open a new test_blah.py file, then import my original blah.py module, then I would have to decide between putting each assert in a separate test function or just writing a single function with all my asserts.

That’s how a 30-second task turns into a 5-minute task.

Anyhow, I’m surprised doctests don’t get a lot more attention. They’re beautiful. Adding tests to an existing code base couldn’t be any simpler. Just load functions into an interpreter and then play around with them (ipython has a %doctest_mode, by the way).

For a lot of simple functions (like the one above) it is easy to just write out the expected results manually rather than record from a session.

It is also possible to store doctests in external text files. The Django developers use this trick frequently.
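Here is a small sketch of the external-text idea. To stay self-contained it keeps the "file" contents in a string and runs them with doctest's parser and runner; for a real file on disk, doctest.testfile does the same job. The compact has_no and the run_text_doctests helper are stand-ins written for this example.

```python
import doctest


def has_no(s):
    s = s.lower()
    return (s == "no" or s.startswith("no ")
            or s.endswith(" no") or " no " in s)


# Stand-in for the contents of an external text file of examples.
EXTERNAL_DOC = """
These examples live outside the module, the way a text file would.

>>> has_no('no problem')
True
>>> has_no('oh nothing')
False
"""


def run_text_doctests(text, globs):
    """Run the doctest examples found in `text`; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, dict(globs), "external_doc", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_text_doctests(EXTERNAL_DOC, {"has_no": has_no})
print(results.failed, results.attempted)
```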

Finally, I don’t try to solve every testing problem with doctests. I avoid doctests when I need elaborate test fixtures or mock objects. Most of my modules have a mix of functions with doctests and nose tests somewhere else to exercise the weird or composite stuff.

Incidentally, this post is where Tim Peters introduced the doctest module.

Supply-side economics explained

I wrote this post four years ago on kuro5hin. The picture has just gotten worse since then. I sure am glad I’m learning to grow food in the backyard.

George W. Bush’s economic policy is based on trickle-down economics, also known as supply-side stimulus. Reagan was a big fan of this idea also. Simply described, supply siders argue that the best way to stimulate the economy to grow is to cut taxes on the wealthy. When their tax rates fall, the rich will increase their investments. For example, a restaurant owner might decide to build a larger kitchen if she gets a big refund check. Then, she’ll have to hire more workers to staff that kitchen, and so employment goes up, indirectly because of that original tax cut.

It’s an appealing idea. Reagan argued that it even makes sense for the government to cut taxes to below current spending and take on debt because in the long run, the economy would grow back so that eventually the tax cut would pay for itself. This approach is called “supply-side” because the stimulus (the tax cut) is applied to the suppliers of goods and services (the business sector).

The common objection to supply-side economics is that there’s absolutely no guarantee that if you cut taxes on the wealthy, then they will use that money to invest in new business. In fact, since these tax cuts happen in bad economic times, investors might decide that their money is safer if they save it rather than invest it. Going back to the restaurant example, if the restaurant owner decides to just stuff that tax refund into a savings account, or just keep it in her mattress, then no job growth occurs.

Also, if the government did what Reagan (and George W. Bush) recommended and went into deficits to finance one of these tax cuts, and no economic growth occurs, then the government is in a really bad spot. They have to raise taxes back to sustainable levels, and then raise taxes again in order to get the money to pay for the debt, and then raise taxes even higher to pay for the interest on the debt. Or, they can do what Reagan did, and just roll the debt over by issuing more debt. This is sort of like paying off the Master Card bill with the Visa. It works great as long as you can always get another credit card to lend you more money. When the last credit card company decides not to give you a card, then you are in trouble.

George Herbert Walker Bush called supply-side economics “voodoo economics” because all of supply-side theory was based on a hope that the rich would invest those tax cuts and not just stick them in the bank. George W. Bush ignores his father’s opinions about the wisdom of his economic policy, however, and is a big supporter of supply-side economics.

Third-world countries do the Visa-Master Card swap trick all the time. They run up huge debts by spending more than they tax, and keep borrowing money from private investors in their country and abroad. When it becomes obvious that the country is so far in debt that they will never be able to pay it back, investors start selling off their debt, even if they sell them at steeply-discounted amounts. This is really, really bad for the country still trying to pay its bills by borrowing more. When investors start dumping your IOUs on the market, then your country’s currency quickly loses value. This is called hyper-inflation.

In 1997, investors all around the world had lots of money invested in east Asia. Then, people lost confidence in certain countries, and so investors all started selling off like mad. The investors sold debt denominated in Asian currency to buy dollars. This pushed down the value of Asian currencies relative to $US. In short, families in these countries found out that their life savings (which were stored in their home-country currency, like the Thai baht, or the Indonesian rupiah, not in $US) lost all of its value because of inflation. It was as if these people woke up, went to the store, and discovered that all the prices had doubled, and were probably going to double every day after that. That’s when the riots broke out, which scared away more investors, and the downward spiral continued.

The same thing happened recently in Argentina. Investors all started selling off Argentinian debt, so the value of the Argentinian currency plummeted, and people were wiped out. Also, when you have high, high inflation, goods imported from other countries become much more expensive.

What happened in the 1980s is like a big Rorschach test. Some economists see all the signs that supply-side economics worked, and others see the same period as the beginning of severe fiscal irresponsibility (“fiscal” means how the government manages spending). There’s no doubt the economy grew after the Reagan tax cuts, but it never grew enough to pay back the debt Reagan racked up. We’re still paying interest today on that debt. We’re also now adding to it because each year that the government spends more than it taxes, it creates a deficit, so that gets added to the debt, and we’ve been in a deficit ever since the George W. Bush tax cuts. Also, in some other recessions, the government has chosen to just wait it out, and most recessions end in about 11 months. Based on previous experience, the recession probably would have taken care of itself eventually, and we wouldn’t have all this debt hanging over us today from twenty years ago that we still haven’t paid off.

In 1991, part of the reason why George H. W. Bush had to break his “read my lips: no new taxes” pledge was because he was faced with the choice of either raising taxes, or putting the country further in debt. He made the politically painful move in order to protect the long-term interests of the country, even though he knew he was just about guaranteeing he would lose the 1992 election.

Clinton saw an opportunity to steal an issue from the Republicans in 1992. Since they were no longer the party of being fiscally responsible, Clinton made that his mantra. He balanced the budget early, by cutting spending and raising taxes. Then of course, the public didn’t like that, so in 1994, the Democrats lost control of Congress. Still, thanks to Clinton, we got out of deficits by the end of the 1990s, and in 2000 Gore wanted to start paying down the debt. But then George W. Bush won the election, and instead of paying down the $7 trillion that we owe (about $24,000 per US citizen, and growing every day), he pushed through his tax cuts.

The US debt is at an all-time high, and the financial world is starting to worry about the long-term stability of the US economy. The International Monetary Fund, in a release a few weeks ago, warned that the US debt was increasing to a size where it could threaten the world economy. The Bush administration almost entirely ignored the report and the mainstream US media didn’t make the report into a big story.

Meanwhile, the US dollar has lost about 30% of its value versus the EU Euro in the last 12 months. A weak currency in the short run may help our exports, but in the long run, it pushes up interest rates and frightens foreign investors. Since most of our debt is held by non-US investors, the US government’s ability to borrow depends on maintaining confidence that our currency will hold its value in the long term.

One economist described debt as more like termites in the walls, rather than a tornado outside. Both will eventually destroy the house, but it is a lot easier to pretend that the termite problem isn’t so bad.

The Brookings Institution, a think tank in Washington, DC, just finished a paper that describes some long-term consequences of ignoring the budget deficits. Alice Rivlin, former vice-Chair of the Federal Reserve Board of Governors, co-authored the paper. It is written for the interested outsider, rather than the professional economist. In short, allowing the government to run deficits indefinitely raises interest rates for all of us, risks inflation of US currency, and limits long-term economic growth.

Total employment (the number of people with jobs) has fallen by about 3 million jobs since the economy peaked in March of 2001. George W. Bush promoted the tax cut as a tool to create jobs, and by that standard, it hasn’t worked at all.