About matt

My name is Matt Wilson and I live in Cleveland Heights, Ohio. I love random emails from strangers, so get in touch! [email protected].

Python showboating

Posted on October 7, 2008 by matt

Some javascript sends a string like this: a_1_b_2_c_3, so here’s how I parse it:
>>> s
'a_1_b_2_c_3'
>>> a, b, c = [int(x) for x in s.split('_')[1::2]]
>>> a, b, c
(1, 2, 3)
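The slice `[1::2]` starts at index 1 and takes every second element, which is what picks the numbers out of the split string. A slightly longer sketch, assuming the keys and values always alternate (the name `parse_pairs` is invented for illustration), keeps the keys as well:

```python
def parse_pairs(s, sep='_'):
    """Turn 'a_1_b_2_c_3' into {'a': 1, 'b': 2, 'c': 3}."""
    parts = s.split(sep)
    keys = parts[0::2]    # even positions: 'a', 'b', 'c'
    values = parts[1::2]  # odd positions: '1', '2', '3'
    return dict(zip(keys, (int(v) for v in values)))


pairs = parse_pairs('a_1_b_2_c_3')
print(pairs)
```

This version raises `ValueError` if a value is not an integer, which is probably what you want when the input comes from a browser.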

More seriously, this guy’s post on why zip is the same as unzip blew my mind.
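For anyone who has not seen the trick: calling `zip` on already-zipped pairs, unpacked with `*`, hands back the original columns. A minimal sketch with made-up data:

```python
letters = ['a', 'b', 'c']
numbers = [1, 2, 3]

# zip pairs the two lists up element by element.
pairs = list(zip(letters, numbers))

# Unpacking the pairs with * and zipping again transposes them back.
unzipped_letters, unzipped_numbers = zip(*pairs)

print(pairs)
print(unzipped_letters, unzipped_numbers)
```

Note that the round trip returns tuples rather than the original lists.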

I really like The Knife

Posted on October 6, 2008 by matt

Random thoughts: Why does Sweden have so many bands that appeal to me? This song makes me think about Suspended in Gaffa by Kate Bush and separately a bunch of 1980s new wave bands. I’m sure all the cool kids have already moved on from The Knife to a band I won’t discover until after Sarah Palin gets sworn in (evil always wins — just accept it).

The high-level view is worthwhile

Posted on October 4, 2008 by matt

About two weeks ago, I wrote a provisional patent application.

According to our attorney, a patent application needs to describe the product with enough detail so that anyone skilled in the trade would be able to follow the instructions and build the product.

So I didn’t include a diagram of my database. I figure that somebody skilled in the trade would understand how to design an adequate database schema when I say stuff like “since the system supports sending the same information to many people, where some people receive an SMS message, and others get a voice call using text-to-speech, I keep the message data separate from the details about who receives that message and by what means.”

Likewise, when I say the system can forward an incoming SMS message from an employee to a supervisor’s email address, I don’t go through the details of how I parse the binary crapola from the SMS and then construct an email message. Somebody skilled in the trade knows how to do that already, or they know how to learn how to do it.

And I didn’t talk about which web application framework I used or how exactly I deploy my code on the production server. I figure that a skilled programmer can figure out those details without my help. Furthermore, there are probably a lot of different possible solutions.

So I left a lot of detail out. I focused on how the system responds to certain events and what problems the system was designed to solve.

Even after taking all those shortcuts, I ended up with nearly 10 pages of text and it took nearly an entire week to crank out. After I finished, I learned a few things:

My code follows patterns. I build one feature, then I build another feature using the same style. By the time I’m writing the third feature, again using the same style, similar code now lives in three places.
The application lacks symmetry. There are obvious examples, like where you can download data into a spreadsheet from one screen but not from another. In less-obvious cases, two different methods might solve the same problem, but one method uses a superior technique.

The second point is the opposite of the first point. Instead of solving different problems in similar ways, I’m solving similar problems in different ways.

I need more abstraction. I need to write code that is easier to reuse, and then actually reuse it rather than dashing off some ad-hoc fix.

All of this happened because of my design strategy. I talk to a pool of customers and then I make a list of their problems. Then I sort those problems by (A * B) / C, where:
A = size of the market with this problem
B = how severe it is (i.e., how much would they pay for the solution)
C = how difficult it is to solve

I pick enough of the best candidates to fill up about 30 days of work. Then I build, test, and deploy, and the cycle starts over again.
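The ranking and the 30-day cutoff can be sketched in a few lines. This is only an illustration: it assumes C is an estimate in days of work (the post does not say what unit difficulty is measured in), the backlog numbers are invented, and the greedy fill is one possible way to "pick enough of the best candidates".

```python
from dataclasses import dataclass


@dataclass
class Problem:
    name: str
    market_size: float  # A
    severity: float     # B
    difficulty: float   # C, assumed here to be estimated days of work

    @property
    def score(self):
        return (self.market_size * self.severity) / self.difficulty


def plan_cycle(problems, budget_days=30):
    """Take problems in descending score order, skipping any that no longer fit."""
    chosen, used = [], 0.0
    for p in sorted(problems, key=lambda p: p.score, reverse=True):
        if used + p.difficulty <= budget_days:
            chosen.append(p)
            used += p.difficulty
    return chosen


# Invented numbers, purely for illustration.
backlog = [
    Problem('export to spreadsheet', market_size=50, severity=2, difficulty=5),
    Problem('forward SMS to email', market_size=20, severity=8, difficulty=10),
    Problem('voice call scheduling', market_size=10, severity=9, difficulty=20),
    Problem('custom reports', market_size=5, severity=3, difficulty=15),
]
for p in plan_cycle(backlog):
    print(p.name, p.score)
```

With these numbers the 20-day item is skipped because it would blow the budget, and the lower-scoring 15-day item fills the remaining time instead.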

So far, I’m happy with working this way, but there’s clearly a downside to only focusing on defining and building the next feature. A colleague calls it “not seeing the forest for the trees.” I think that’s about right. I don’t think of my work as a gestalt as much as a bag of magic tricks. Anyhow, for a few brief moments, after taking a week off from writing code to write that patent application, I saw the forest. It’s a nice view.

aq punches edit

Posted on September 28, 2008 by matt

From lemonodor auxiliary.

AQ_PUNCHES from John Wiseman on Vimeo.

A worse blogging system

Posted on September 27, 2008 by matt

I’ve been daydreaming about this for a while. I took some time to write out my thoughts. They’re still half-baked.

Blogs and RSS feeds are pretty good. I don’t have to manually go to sites. My reader polls the sites I subscribe to and it pulls the feeds. But the situation could be a lot better.

Problems with blogging from the reader’s POV

Feed readers don’t work all that well offline. Sure, maybe the RSS feed itself is downloaded, but images won’t likely be pulled down.

Also, polling is kind of goofy. It would be nicer to use some kind of pub-sub framework where I get notified.

RSS feeds usually only store recent stories.

Very often I find a great blog that has dozens of stories. I would love to be able to download the entire blog for offline viewing.

What about Google Gears?

Yeah, what about it? I know of one single blog that actually uses it in this context. I would like to think there is a solution to this problem that doesn’t require building C++ extensions to the browser.

Problems from the writer’s POV

This section is based on my experiences with WordPress and Blogger. Obviously, publishing content on a remote site requires an internet connection to that remote site, but there is no real reason that I should need an internet connection to preview the rendering of my content.

Also, there’s no obvious way I can integrate my source control tools with my blog engine.

Several times I start an article on my laptop, upload it as a draft to my server, then work on it on my server, then lose my internet connection, and go back to an out-of-date draft on my laptop to continue work.

I can write an article much more quickly using simplified markup and I can be pretty certain that it will render into valid HTML. There are a few plugins for WordPress that support writing with markdown, but they require using the WordPress text editor. Sure, I could copy and paste from my real editor, but that’s less than ideal.

The idea

Take these ingredients:

Any decentralized source control system.
Any simplified markup language, like reStructuredText, markdown, or textile
Any tool to make pretty html out of that markup language.

And optionally:

A new tool to build lots of index files and RSS feeds.
A new tool to notify interested parties that something new is ready, by email, jabber, pingback, etc.

Here’s a simple example:

I write a text file using reStructuredText.
I use a local git repo to track revisions.
I use a local tool to render my text file into HTML and make sure I’m happy with the look. Git is set to ignore these HTML files.
When I’m done, I use git to push my work to a remote repository on a box with a webserver.
That repository has some code that fires whenever it receives a new push:
- It runs the exact same HTML rendering programs I used locally.
- It builds a new RSS feed.
- It rebuilds any internal indexes, tables of contents, whatever are appropriate.
- It interacts with whatever pub-sub crap is useful so other people learn about the new content.

On the remote git repository, all the rendered HTML, RSS, etc would be available for cloning and the webserver supports people reading my blog the old-fashioned way.
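A rough sketch of the publishing step, under a few assumptions: the renderer is passed in as a plain callable (so the laptop and the server can share it), source files end in `.rst`, and the function names `render_all` and `build_rss` are invented here. Something like git's `post-receive` hook could call these after each push; that wiring is left out.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def render_all(source_dir, render):
    """Render every .rst file in source_dir to a sibling .html file.

    `render` is any callable that takes markup text and returns HTML.
    """
    written = []
    for name in sorted(os.listdir(source_dir)):
        base, ext = os.path.splitext(name)
        if ext != '.rst':
            continue
        with open(os.path.join(source_dir, name)) as f:
            html = render(f.read())
        with open(os.path.join(source_dir, base + '.html'), 'w') as f:
            f.write(html)
        written.append(base + '.html')
    return written


def build_rss(title, site_url, pages):
    """Build a bare-bones RSS 2.0 document listing the rendered pages."""
    rss = ET.Element('rss', version='2.0')
    channel = ET.SubElement(rss, 'channel')
    ET.SubElement(channel, 'title').text = title
    ET.SubElement(channel, 'link').text = site_url
    ET.SubElement(channel, 'description').text = title
    for page in pages:
        item = ET.SubElement(channel, 'item')
        ET.SubElement(item, 'title').text = page
        ET.SubElement(item, 'link').text = site_url.rstrip('/') + '/' + page
    return ET.tostring(rss, encoding='unicode')


def fake_render(text):
    # Stand-in renderer; a real one would call a reStructuredText tool.
    return '<pre>' + text + '</pre>'


with tempfile.TemporaryDirectory() as repo:
    with open(os.path.join(repo, 'hello.rst'), 'w') as f:
        f.write('Hello')
    pages = render_all(repo, fake_render)
    feed = build_rss('t+1', 'http://matt.example.com', pages)
    print(pages)
    print(feed)
```

Because the output is just files in a directory, the webserver only has to serve them and readers can clone them along with the source.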

WordPress has other features like being able to navigate through archives, or select stories by tags, or send updates to twitter, etc. I think all of these could be solved somehow during the publishing phase.

For example, navigation through archives doesn’t really require any scripting. I just need to generate indexes for every date range.
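Generating those indexes is mostly a grouping exercise. A minimal sketch, assuming each post is known as a (date, title) pair and that monthly buckets are fine-grained enough (`build_archive_index` is an invented name):

```python
from collections import defaultdict
from datetime import date


def build_archive_index(posts):
    """Group (date, title) pairs by (year, month), newest first within a month."""
    index = defaultdict(list)
    for posted_on, title in posts:
        index[(posted_on.year, posted_on.month)].append((posted_on, title))
    return {
        key: [title for _, title in sorted(entries, reverse=True)]
        for key, entries in sorted(index.items())
    }


posts = [
    (date(2008, 10, 7), 'Python showboating'),
    (date(2008, 10, 6), 'I really like The Knife'),
    (date(2008, 9, 27), 'A worse blogging system'),
]
archive = build_archive_index(posts)
for month, titles in archive.items():
    print(month, titles)
```

Each key would then become one static page, rewritten at publish time.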

Tag-based navigation also doesn’t really require running:
SELECT POSTS.*
FROM POSTS, POST_TAGS, TAGS
WHERE POSTS.ID = POST_TAGS.POST_ID
AND POST_TAGS.TAG_ID = TAGS.ID
AND TAGS.NAME = 'some inoffensive tag name';
It would be sufficient to just regenerate indexes for every tag after each post during the publishing phase.
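The static equivalent of that join is an inverted index built once per publish. A sketch, assuming each post carries a list of tags in some metadata (the slugs, tags, and function names below are invented for illustration):

```python
from collections import defaultdict


def build_tag_index(posts):
    """Map each tag to the sorted list of post slugs that carry it."""
    index = defaultdict(set)
    for slug, tags in posts.items():
        for tag in tags:
            index[tag].add(slug)
    return {tag: sorted(slugs) for tag, slugs in sorted(index.items())}


def render_tag_page(tag, slugs):
    """Produce a plain HTML list for one tag; nothing is queried at read time."""
    items = ''.join('<li><a href="/%s.html">%s</a></li>' % (s, s) for s in slugs)
    return '<h1>%s</h1><ul>%s</ul>' % (tag, items)


posts = {
    'python-showboating': ['python'],
    'a-worse-blogging-system': ['python', 'blogging'],
}
tag_index = build_tag_index(posts)
page = render_tag_page('python', tag_index['python'])
print(page)
```

A real version would also need to HTML-escape tag names and slugs before writing them into the page.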

What about comments?

WordPress allows visitors to post comments on a blog, and it does a pretty good job filtering out spammers with the Akismet plugin. I see two solutions; one is straightforward and mediocre and one is preposterous.

The straightforward solution is to use a service like disqus to track comments on an external server.

The rendered HTML pages would include a blob of javascript. That javascript makes a request to pull all the comments for this URL to the site, and then it appends the text to the DOM. Of course, people that download the material for offline viewing won’t see the comments when they don’t have an internet connection.

Sure, it would be possible to regularly scrape the comments out of the remote server and rebuild all the files available for offline viewing, but that only solves the reading part.

Copyright issues with comments

Imagine I write a blog post with a mediocre code sample inside, and you think of a better way to write the same code.

You start writing a comment on my site (or on my Disqus section, it doesn’t matter) and you’re about to submit, when you see a little line that says all comments become my copyright, and you know you want to use this code in some GPL project.

Maybe you don’t see any lines at all that explain who owns blog comments, so then you’re uncertain about what applies.

Anyhow, there’s a deadweight loss here. You have something to say that would help me out, but you won’t say it. If I knew what you were going to say, I’d make a special exception just for this one comment.

By the way, if you want me to change my license so I don’t own the comments, then I’m faced with a bad situation where somebody can post a comment, and then demand later that I take it down. This is a serious problem for “real” sites. Look at the terms of service on reddit. It insists on a perpetual non-exclusive right to any content posted there.

The ridiculous solution

Just like it will be possible to clone my blog text, commenters should have their own repository where I can clone their comments.

So, when Lindsey comments on my (Matt’s) site, she really writes a post on her own site, and then sends my site a message that says:

Hi Matt,

I read your blog post [1] and I wrote a comment here on my site [2].

You can show my comment on your site as long as you agree with my comment license [3].

[1] http://matt.example.com/why-rinsing-is-as-good-as-washing

[2] http://lindsey.example.com/soap-is-not-optional

[3] http://lindsey.example.com/comment-license

Lindsey

This message could be an email, an HTTP post, whatever. I could manually process this message, or I could set up some handler that figures out what to do based on some rules ahead of time.
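One way such a handler might look, assuming the notification arrives as JSON with `in_reply_to`, `comment`, and `license` fields. The field names, the three possible outcomes, and the set of pre-approved licenses are all invented for this sketch; nothing above fixes a wire format.

```python
import json

# Hypothetical rule: licenses the site owner has agreed to ahead of time.
ACCEPTED_LICENSES = {'http://lindsey.example.com/comment-license'}


def handle_notification(raw_message, known_posts,
                        accepted_licenses=ACCEPTED_LICENSES):
    """Decide what to do with a comment notification.

    Returns 'pull' when the comment can be fetched and shown automatically,
    'review' when a human should look at the license first, and 'ignore'
    when the message is about a post this site never published.
    """
    message = json.loads(raw_message)
    if message['in_reply_to'] not in known_posts:
        return 'ignore'
    if message['license'] in accepted_licenses:
        return 'pull'
    return 'review'


raw = json.dumps({
    'in_reply_to': 'http://matt.example.com/why-rinsing-is-as-good-as-washing',
    'comment': 'http://lindsey.example.com/soap-is-not-optional',
    'license': 'http://lindsey.example.com/comment-license',
})
known = {'http://matt.example.com/why-rinsing-is-as-good-as-washing'}
decision = handle_notification(raw, known)
print(decision)
```

The handler only decides; actually fetching the comment from the commenter's site would be a separate step.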

So, we’ve changed the flow of comments from lots of people pushing text to me to a system where they just send me notifications and if I want to pull them, then I can.

This system allows more offline work to be done. Lindsey can clone my site and read it. Then she can write a comment. The next time she has an internet connection, she publishes her comment to her site, which triggers the message to be sent to my site.

Conversation hubs

So, pretend that I don’t show Lindsey’s comment on my site because I think her point makes me look stupid. Now how do third-parties get to see her remarks?

Well, this is a solution that is better than the status quo. Imagine that when Lindsey sent me a message about her comment, she also sent a similar message to another server called a conversation hub.

She tells that hub that her post http://lindsey.example.com/soap-is-not-optional is a response to my post http://matt.example.com/why-rinsing-is-as-good-as-washing.

When somebody clones a feed from my site, they can also check a few of these conversation hubs and optionally clone any posts that have indicated they are relevant to that post.

We’d need better tools to assemble a conversation thread from all the different pieces. But that’s not really that hard.
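A toy version of a hub, plus the merge a reader's tool would do across several hubs. Everything here is an in-memory sketch with invented names (`ConversationHub`, `collect_responses`); a real hub would be a network service with storage and authentication.

```python
from collections import defaultdict


class ConversationHub(object):
    """Toy in-memory hub that records which posts respond to which."""

    def __init__(self):
        self._responses = defaultdict(list)

    def register(self, response_url, target_url):
        """Record that response_url is a response to target_url."""
        if response_url not in self._responses[target_url]:
            self._responses[target_url].append(response_url)

    def responses_to(self, target_url):
        """Return the responses known for target_url, oldest first."""
        return list(self._responses.get(target_url, []))


def collect_responses(hubs, target_url):
    """Merge responses from several hubs, dropping duplicates."""
    seen, merged = set(), []
    for hub in hubs:
        for url in hub.responses_to(target_url):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


post = 'http://matt.example.com/why-rinsing-is-as-good-as-washing'
reply = 'http://lindsey.example.com/soap-is-not-optional'
hub_a, hub_b = ConversationHub(), ConversationHub()
hub_a.register(reply, post)
hub_b.register(reply, post)
hub_b.register('http://someone.example.com/a-third-opinion', post)
thread = collect_responses([hub_a, hub_b], post)
print(thread)
```

This only flattens direct replies; replies to replies would need the same lookup applied recursively.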

What about spamming the conversation hub?

A spammer could just send messages to the conversation hubs linking their posts to everything out there.

Well, the conversation hubs could insist on real authentication, and then allow feedback from people. Also, people that check for comments at a hub can request to only see comments that have received aggregate positive feedback.
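The feedback filter could be as simple as one vote per authenticated user per comment, with readers asking only for comments whose total is positive. A sketch under those assumptions (the class name and the +1/-1 scheme are invented, and authentication itself is not modeled):

```python
from collections import defaultdict


class FeedbackLedger(object):
    """Tracks one vote per authenticated user per comment."""

    def __init__(self):
        self._votes = defaultdict(dict)  # comment_url -> {user: +1 or -1}

    def vote(self, user, comment_url, value):
        if value not in (1, -1):
            raise ValueError('value must be +1 or -1')
        # A repeat vote from the same user overwrites the earlier one.
        self._votes[comment_url][user] = value

    def score(self, comment_url):
        return sum(self._votes.get(comment_url, {}).values())

    def positively_rated(self, comment_urls):
        """Keep only comments with aggregate positive feedback."""
        return [u for u in comment_urls if self.score(u) > 0]


ledger = FeedbackLedger()
good = 'http://lindsey.example.com/soap-is-not-optional'
spam = 'http://spam.example.com/buy-stuff'
ledger.vote('alice', good, 1)
ledger.vote('bob', good, 1)
ledger.vote('alice', spam, -1)
visible = ledger.positively_rated([good, spam])
print(visible)
```

One consequence of this rule is that a brand-new comment with no votes scores zero and stays hidden until somebody vouches for it.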

What about Adsense?

Well, if I switch to this approach, and people start downloading my text files to read offline, they ain’t gonna see my adsense ads, and I’ll be deprived of my $15/year revenue.

But for people that actually make real money off adsense, the question is valid. Remember that we’re talking about helping people read your site offline. Those people that are mostly offline aren’t seeing the site now anyway.

The online visitors can still see them though. Also, people that view the HTML files after cloning my publish node may still see them if they have a working internet connection and they allow the embedded javascript to run.

Sure, there’s a risk that some online viewers will switch to the offline-views and then turn off javascript or their internet connection so that they can’t see the ads.

Publishers would need to weigh this risk. Maybe the solution could be to sell offline copies at a price equal to the expected lost revenue from the switchers.

What about SEO?

It’s a non-issue. The HTML is available online just like it always was.

I just wrote another mock object framework.

Posted on September 26, 2008 by matt

Here it is:
class BazMock(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

Now I need to figure out how to make some mock tests to go with it.
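Joking aside, a class like this is enough to stand in for any object whose attributes are all the code under test touches. A small sketch; the `greeting` function and the attribute names are invented for illustration:

```python
class BazMock(object):
    def __init__(self, **kwargs):
        # Every keyword argument becomes an attribute on the instance.
        self.__dict__.update(kwargs)


def greeting(user):
    """Code under test: only needs an object with a .name attribute."""
    return 'Hello, %s!' % user.name


fake_user = BazMock(name='Matt', is_admin=False, save=lambda: True)

print(greeting(fake_user))
print(fake_user.save())
```

Since callables can be passed in as keyword arguments too, the mock can fake methods as well as plain attributes, though it records nothing about how it was called.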

Sarah Palin interviewed by Katie Couric … WOW.

Posted on September 25, 2008 by matt

My three-year old son is more coherent when he explains how dinosaurs live in the museum. If the voters choose McCain and Palin, then we deserve every bit of the hell that comes with it. It is painfully clear she is not ready for the job.

I’m so worked up over this bailout I’m participating in democracy

Posted on September 23, 2008 by matt

I just finished using a form on George Voinovich’s site to let him know my thoughts on this banking crisis.

I’m not adamantly opposed to the bailout in theory. I get the idea that some market activities have external consequences. But I also get that this administration always says “trust me!” right before shit gets really, really bad. If we’re going to do a bailout, let’s do it in a boring and well-thought-out way. I want to make sure that this bailout buys us enough safeguards and regulations so that we’re never faced with this crap again.

The villains on k5 have a pretty good discussion about this bailout. I like this comment:

Just about the only way that it would cost 700 billion to get with two chicks is if one was Natalie Portman and the other one was a clone of Natalie Portman. Even cloning a human probably wouldn’t get you particularly close to 700 billion but you might be in the same ballpark.

Ha ha.

Anyhow, I also went to Sherrod Brown’s website and read his statements from today’s hearing and I really like his angle. I’m not too worried about letting him know how I feel since he’s already there.

I also liked how Sherrod Brown has RSS feeds for his site, and a pretty nice looking color scheme. Maybe that’s because he just got there.

UPDATE

Another fine Ohio politician, Marcy Kaptur, is also on the right side of this:

What Django can learn from Zope

Posted on September 16, 2008 by matt

Mark Ramm makes a lot of interesting points about Django in this talk. Really good stuff.

The next season of Bizarre Foods starts Tuesday

Posted on September 6, 2008 by matt

Sometimes I wonder if Andrew Zimmern just wants a nice tame meal, but since he’s the star of Bizarre Foods, everywhere he goes, he’s imprisoned by it. Like maybe somebody invites him to a dinner party and everyone is eating spaghetti, but when the host delivers his bowl, he sees it has a bunch of crickets on it.

I think his neighbors probably dump all their rotten fruit at his house. People probably call him to ask if he wants to drink their expired milk.

Anyhow, this video has scenes from next season.

t+1

Programming, gardening, economics, life in Cleveland Heights

Author Archives: matt