Found a possible error in chapter 7 of the TurboGears book

I bought the TurboGears book about two weeks ago, and I have been working through it. I like the book in general, but I agree with the reviewers on Amazon who complain about the number of errors. I can’t think of another programming book that I’ve read with this many errors.

All of the errors I noticed are little glitchy typographical errors, rather than incorrect theory. The authors really do a good job of illustrating the MVC approach to web design, so I’m glad I bought it.

Anyway, this page lists mistakes found after publication, and the community of readers seems to be doing a good job of helping each other out.

I think I might have found another tiny error. This code appears at the bottom of page 109:

class ProjectFields(widgets.WidgetsList):
    title = TextField(label="project", validator=validators.NotEmpty())
    client_revenue = widgets.TextField(validator=validators.Number())
    project_form = widgets.TableForm(fields=ProjectFields(), action="save_project_test")

I don’t see the point in using both TextField and widgets.TextField. But more importantly, I think the indentation is wrong in the last line. I don’t think project_form is supposed to be an attribute of the ProjectFields class.

I think the code should look more like this:


class ProjectFields(widgets.WidgetsList):
    title = widgets.TextField(label="project", validator=validators.NotEmpty())
    client_revenue = widgets.TextField(validator=validators.Number())

# Moved outside the class.
project_form = widgets.TableForm(fields=ProjectFields(), action="save_project_test")

But maybe I’m missing something. I posted to the TurboGears Book mailing list, so hopefully I’ll find out.

That American Apparel dude has gone mad with power

One explanation for this is that he is trying to prove to his oligarch friends that he can make people buy anything. Or maybe he built some kind of matter / anti-matter thing, where he can harness all the irony released by people walking around wearing these.

The site has a few pictures and this one is my favorite:

Fanny Pack + stretch pants

I especially like how the individual modeling the fanny pack for us is also wearing gray stretch pants. Well, at least now he or she has a place to put his or her wallet and/or car keys.

Sidenote: this blog required a new category. I can’t think of any other points to make about fanny packs, so it might be a while before I write anything else on this subject.

becontrary.com is a neat site built with TurboGears

BeContrary.com is a very clever idea for a site. This debate on going dutch illustrates how the site works. And this is a good discussion of different styles of Python templates.

The site’s author, Will McGugan, wrote up a blog post describing his experience with TurboGears here. He says he chose TurboGears partially because he had already worked with CherryPy and really liked it. Will made this remark after talking about SQLObject:

Incidently, I don’t like the way that Python ORMs specify the database schema as Python code. Since the schema doesn’t change when the application is running, I would prefer a simple xml definition that would be used by the ORM to dynamically build classes for each table that could be derived from in the model code.

I like this idea, but instead of writing XML, I would prefer to write SQL and have Python parse that to build classes.
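For what it’s worth, here is a toy sketch of that SQL-parsing idea. Everything here is invented for illustration; a real version would need a proper SQL parser, since this regex approach chokes on column types like DECIMAL(10,2):

```python
import re

def class_from_sql(ddl):
    """Build a bare Python class from a CREATE TABLE statement (toy sketch)."""
    m = re.match(r"\s*CREATE\s+TABLE\s+(\w+)\s*\((.*)\)\s*;?\s*$",
                 ddl, re.IGNORECASE | re.DOTALL)
    table, body = m.group(1), m.group(2)
    # First word of each comma-separated chunk is the column name.
    columns = [chunk.split()[0] for chunk in body.split(",")]
    # Build a class named after the table, with one attribute per column.
    return type(table.capitalize(), (object,), {name: None for name in columns})

Book = class_from_sql("""
    CREATE TABLE book (
        title VARCHAR(100),
        author VARCHAR(100),
        publisher VARCHAR(100)
    );
""")
```

An ORM could then let the model code derive from the generated class, which is roughly what Will was asking for with XML.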

Why write me a response back at all?

I wrote this email to Samsung technical support a few days ago:

SUBJECT: Need Hayes commands (AT commands) for phone

Hi —

I own a Samsung A707 phone with AT&T service.

I can make a serial port connection to my phone via bluetooth. However, it seems like my phone doesn’t understand most AT commands.

Is there a list anywhere with all the AT commands that this phone supports?

Thanks for the help.

Matt

And here’s the reply I got back:

Dear Matthew,

Thank you for your inquiry. We suggest searching the internet for Hayes AT commands.

Do you have more questions regarding your Samsung Mobile Phone? For 24 hour information and assistance, we offer a new FAQ/ARS System (Automated Response System) at http://www.samsungtelecom.com/support It’s like having your very own personal Samsung Technician at your fingertips.

Thank you for your continued interest in Samsung products.

Sincerely,
Technical Support
Renee

WOW. Is Renee a chatbot, or did a real human being actually spend time writing this response? I especially like that they link to the tech support section of the website. That’s where I sent my email in the first place.

Fortunately for me, the people on the gnokii mailing list are helping me out.

We’re presenting at the Jump Start Angel Fair!

The First Annual Cleveland Angel Fair picked us to present. This is fantastic news.

In other news, I spent the day writing code in the house by myself. The wind is howling outside, and my fingers are shaking because I refuse to turn on the heater. I haven’t gotten so many hours of consecutive geek time in years.

This is the life. I can’t believe that I used to take a shower, put on clean clothes, and interact with humans every day.

A few different ways to store data with varying attributes

Got an email from a friend:

I want to create a database that acts just like the index card note cards one used to make for doing research papers in HS and Univ. I’ve got most of it down, but I am having trouble figuring out how to normalize the page numbers and the source data.
Let’s say one has three kinds of sources – books, magazines, and websites. Well, a book will have:

author(s)
title
place of publication
publisher
copyright date

a magazine:
author(s) – but only maybe – what does one do about The Economist?
title of article
title of magazine
date of publication

a website:
author(s) – again only maybe
title of website
URL

Here’s what I said in reply:

So, I think I get your question. Books, magazines, and websites are all different examples of sources that you might cite. They have some attributes in common and some attributes that are unique.

Going with the high school term paper example, let’s pretend that you wrote a paper and your bibliography looks like this:

  • (book) Tom Sawyer by Mark Twain. Published by Hustler, 1833, in NY.
  • (book) Huckleberry Finn by Mark Twain. Published by Hustler, 1834, in NY.
  • (magazine) “Indonesia sucks”, The Economist. No author listed. February 2001 issue. page 67.
  • (magazine) “No, the Economist Sucks”, Jakarta Post. Joe Brown is the author. Article appeared in the March 14, 2007 issue, on page 6D.

  • (website): “Indonesia” on Wikipedia, http://en.wikipedia.org/wiki/Indonesia. Lots and lots of authors. I used text found on the site as of June 1st, 2007.
  • (website) “blog #96 post”, http://anncoulter.com, Ann Coulter is the author, article posted on July 4th, 2007. I used text found on the site as of this date.

I can see at least three ways to set this up:

1. You can make a single table called sources that includes the union of all these different types. So, you would have a column called “publisher” and another column called URL. The book sources would have a blank URL field, and the website sources would have a blank publisher field. You could have a column called source type which would have values of “book”, “magazine”, “website”, or anything else that fits.

PROs: simple!

CONs: It is tricky to bake good data validation into your database. You can’t easily add rules to enforce that you get all the required data for each row. Also, every time you discover a new source type, you may need to modify the table and add even more columns.
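A minimal sketch of approach #1, using sqlite3 (the column names are my own, just for illustration); the fields that don’t apply to a given source type simply come out as NULLs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sources (
        source_id   INTEGER PRIMARY KEY,
        source_type TEXT NOT NULL,  -- 'book', 'magazine', 'website', ...
        title       TEXT,
        author      TEXT,
        publisher   TEXT,           -- books only; NULL for websites
        url         TEXT            -- websites only; NULL for books
    )
""")
# A book leaves url NULL; a website leaves publisher NULL.
conn.execute("INSERT INTO sources (source_type, title, author, publisher) "
             "VALUES ('book', 'Tom Sawyer', 'Mark Twain', 'Hustler')")
conn.execute("INSERT INTO sources (source_type, title, url) "
             "VALUES ('website', 'Indonesia', "
             "'http://en.wikipedia.org/wiki/Indonesia')")
rows = conn.execute(
    "SELECT title FROM sources WHERE source_type = 'book'").fetchall()
```

One query over one table answers everything, which is the whole appeal.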

2. You create a separate table for each source type. So, you have a books table, a magazines table, and then a websites table.

PROs: Now, you can easily make sure that all your books data has all the required data.

CONs: Accumulating all the results for one of your papers means you have to run a query against each table separately and then glue them together with the UNION keyword. Also, when you need to add a new source type, you’ll need to add a new table to your schema.
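Here is roughly what that UNION query looks like under approach #2, again sketched with sqlite3 and made-up column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT, publisher TEXT)")
conn.execute("CREATE TABLE magazines "
             "(article_title TEXT, magazine_title TEXT, author TEXT)")
conn.execute("INSERT INTO books VALUES ('Tom Sawyer', 'Mark Twain', 'Hustler')")
conn.execute("INSERT INTO magazines "
             "VALUES ('Indonesia sucks', 'The Economist', NULL)")

# Pulling a full bibliography means one SELECT per table,
# glued together with UNION ALL.
rows = conn.execute("""
    SELECT title AS citation, 'book' AS source_type FROM books
    UNION ALL
    SELECT article_title, 'magazine' FROM magazines
""").fetchall()
```

Every new source type adds another SELECT arm to this query, on top of the new table.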

3. Make a bunch of tables:

sources (source_id)

fields (field_id, field_name)

source_fields(source_id, field_id, field_value)

So, this entry:

(book) Tom Sawyer by Mark Twain. Published by Hustler, 1833, in NY.

Would get a single row in the sources table.

And the fields table would have these values:

(field_id, field_name)
1, source type
2, title
3, author
4, publisher
5, publish date
6, publish location

Then finally, we’d put the actual data in the source fields table:

(source_id, field_id, field_value)
1, 1, “book”
1, 2, “Tom Sawyer”
1, 3, “Mark Twain”
1, 4, “Hustler”

… you get the idea, hopefully.

Then, when you want to store a magazine, the first thing you do is add any new field types you need to the fields table, and then add your data.

PROs: you can make up new attributes for your data any time you want, and never have to change your database. For example, if you need to start storing TV shows, you can just add the new types to the fields table and you’re good.

CONs: The field_value field needs to accept any kind of data. So, probably, you’ll want to make it a column type like a TEXT column that can hold arbitrarily large objects, and then before you store anything in your database you need to convert it to a text object. So, you’re not going to be able to index this data well and you’re not going to be able to require that the data matches some formats.
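To see the query pain in approach #3 concretely, here is a sqlite3 sketch of the “all books by Mark Twain” question; every attribute you filter on or fetch costs another self-join on source_fields:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (source_id INTEGER PRIMARY KEY);
    CREATE TABLE fields (field_id INTEGER PRIMARY KEY, field_name TEXT);
    CREATE TABLE source_fields (source_id INTEGER, field_id INTEGER,
                                field_value TEXT);

    INSERT INTO sources VALUES (1);
    INSERT INTO fields VALUES (1, 'source type'), (2, 'title'), (3, 'author');
    INSERT INTO source_fields VALUES
        (1, 1, 'book'), (1, 2, 'Tom Sawyer'), (1, 3, 'Mark Twain');
""")
# Three attributes involved, so three aliases of the same table.
rows = conn.execute("""
    SELECT title.field_value
    FROM source_fields AS title
    JOIN source_fields AS author ON author.source_id = title.source_id
    JOIN source_fields AS kind   ON kind.source_id   = title.source_id
    WHERE title.field_id = 2
      AND author.field_id = 3 AND author.field_value = 'Mark Twain'
      AND kind.field_id   = 1 AND kind.field_value   = 'book'
""").fetchall()
```

Compare that to the one-line WHERE clause approach #1 would need for the same question.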

So, figuring out which of these approaches is the correct one depends on the specifics of the scenario.

How well can you predict today all the future types of data? If you have perfect clairvoyance, or if you don’t mind monkeying with the database, approach #3 is pointless. I recommend approach #3 in a scenario when you have lots of users, and you don’t want each of them monkeying with the schema.

How worried are you about bad data getting entered in? You can always use triggers and stored procedures or some outer application code to add validation on any of these, but it won’t be easy. Using approach #2 will make validation the easiest.

How fast do queries need to be? If we want to know all the books written by Mark Twain, approach #2 will likely give the fastest response, followed by #1 and then #3.

By the way, while we’re on the subject of bibliographies, all my class notes on normal forms are here and I wrote up a blog on compound foreign key constraints a while back here.

Matt

PS: I’m using this email as my next blog entry. And here it is 🙂

Is there a botanist in the house?

I have a variety of jalapeno plants in the backyard garden. Some are thriving. Some never did very well. One plant in particular grows beautiful, gigantic, bug-resistant peppers.

I want to harvest the seeds from this plant in the hopes of propagating it next year. Here’s my science question — do all the seeds in all the peppers on this plant have the same DNA?

Spreadsheets are the devil, but here is how to avoid getting burned.

Spreadsheets seem like they are adequate tools for serious analysis. And unfortunately, people are graduating from stats and OR programs without mastering any of the alternatives. But brother, I stand before you today to tell you that spreadsheets are the devil.

When you face a modeling problem, spreadsheets tempt you with the seemingly easy way out. It all starts with how easy it is to import data. Excel’s import wizard is fast and pretty smart about automatically assigning column types. Meanwhile, your hapless colleagues are going to spend a day reading manuals just to load in that same tab-delimited text file.

Now that you’ve got the raw inputs loaded, you figure that within a few days you’ll be done building your trendlines and you’ll kill time choosing fonts for your pie charts. But what happens, invariably, is that you think you are done, and then you look at the number on your final worksheet and realize it can’t be right. Now you must find the error somewhere in the possibly hundreds of tiny formulas all chained together. Welcome to cell HE11.

Meanwhile, with your numbers still laughably wrong, your SAS friend at least has his PROC REPORT output to show the boss after a few days, even if he did have to print it on the basement mainframe dot-matrix printer.

So, despite all that, sometimes, I find that I just have to use a spreadsheet. In that circumstance, I try to follow a set of rules. Any time I deviate from these rules, I always get burned.

  1. Put at the top of each sheet a few paragraphs that describe the model. Ideally, this text should be so clear and specific that I can rebuild the spreadsheet just based on this information. (This also helps make sure that you implemented the logic correctly.)
  2. Indicate which cells the user should play with and which cells should not be tweaked. Point out where the final answer pops out. Establish a color scheme to distinguish input data from formulas.
  3. Emulate the IRS 1040, where there is a column of text and just a few columns of numbers, and each row is as simple as possible. There’s a main column that gets summed at the bottom, and a secondary column where complex totals are broken down further.
  4. Decompose those formulas and don’t store literal data inside of formulas! For example, in a mortgage calculator, break out the interest rate, the mortgage size, and the number of years in the mortgage into separate cells, and then show the result in another cell:

    [screenshot: the interest rate, mortgage size, and term each in its own labeled cell, with the payment computed in a separate formula cell]

    Don’t be tempted to cram all those numbers inside a single cell like this:

    [screenshot: one cell containing the whole payment formula with the rate, size, and term hard-coded inside it]

    Sure, you save a few rows and it compresses the size of your sheet, but in the end, you make your sheet much less flexible, and it will be more difficult to separate data-entry errors from formula errors.

  5. Finally, put everything in top-to-bottom order in each sheet and have a single flow. Don’t have lots of parallel panels side-by-side. It becomes too confusing.
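Rule 4 translates directly to code, too. Here is a quick Python sketch of the mortgage example (the numbers are made up): the decomposed version names every input, while the crammed version buries the same literals inside one formula where nobody can check them:

```python
# Each input gets its own "cell" (variable) instead of living in the formula.
annual_rate = 0.06      # interest rate
principal   = 200_000   # mortgage size
years       = 30        # term of the mortgage

monthly_rate = annual_rate / 12
n_payments   = years * 12
# Standard fixed-rate monthly payment formula.
payment = principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)

# The crammed version: same arithmetic, zero transparency.
crammed = 200_000 * (0.06 / 12) / (1 - (1 + 0.06 / 12) ** -360)
```

When the answer looks wrong, the decomposed version lets you check each input and each intermediate value separately, which is exactly the separation of data-entry errors from formula errors that rule 4 is after.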

I am certain that there are even more rules that are better than these. Enlighten me.