The First Annual Cleveland Angel Fair picked us to present. This is fantastic news.
In other news, spent the day writing code in the house by myself. The wind is howling outside, and my fingers are shaking because I refuse to turn on the heater. I haven’t gotten so many hours of consecutive geek time in years.
This is the life. I can’t believe that I used to take a shower, put on clean clothes, and interact with humans every day.
Spreadsheets seem like adequate tools for serious analysis. And unfortunately, people are graduating from stats and OR programs without mastering any of the alternatives. But brother, I stand before you today to tell you that spreadsheets are the devil.
When you face a modeling problem, spreadsheets tempt you with the seemingly easy way out. It all starts with how easy it is to import data. Excel’s import wizard is fast and pretty smart about automatically assigning column types. Meanwhile, your hapless colleagues are going to spend a day reading manuals just to load in that same tab-delimited text file.
Now that you’ve got the raw inputs loaded, you figure that within a few days you’ll be done building your trendlines and you’ll kill time choosing fonts for your pie charts. But what happens — invariably — is that you think you are done and then you look at your number on your final worksheet and realize it can’t be right. You must now find the error in any of the possibly hundreds of tiny formulas all chained together. Welcome to cell HE11.
Meanwhile, as you stare at numbers that are laughably wrong, your SAS friend at least has, after a few days, his PROC REPORT output to show the boss, even if he did have to print it on the basement mainframe dot-matrix printer.
So, despite all that, sometimes, I find that I just have to use a spreadsheet. In that circumstance, I try to follow a set of rules. Any time I deviate from these rules, I always get burned.
- Put at the top of each sheet a few paragraphs that describe the model. Ideally, this text should be so clear and specific that I can rebuild the spreadsheet just based on this information. (This also helps make sure that you implemented the logic correctly.)
- Indicate which cells the user should play with and which cells should not be tweaked. Point out where the final answer pops out. Establish a color scheme to distinguish between input data and formulas.
- Emulate the IRS 1040, where there is a column of text and just a few columns of numbers, and each row is as simple as possible. There’s a main column that gets summed at the bottom, and a secondary column where complex totals are broken down further.
- Decompose those formulas and don’t store literal data inside of formulas! For example, in a mortgage calculator, break out the interest rate, the mortgage size, and the number of years in the mortgage into separate cells, and then show the result in another cell.
Don’t be tempted to cram all those numbers inside a single cell.
Sure, you save a few rows and it compresses the size of your sheet, but in the end, you make your sheet much less flexible, and it will be more difficult to separate data-entry errors from formula errors.
- Finally, put everything in top-to-bottom order in each sheet and have a single flow. Don’t have lots of parallel panels side-by-side. It becomes too confusing.
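To make the mortgage calculator rule concrete, here is a rough sketch in Python rather than in a spreadsheet. The numbers (a 200,000 loan at 6% for 30 years) and the names `annual_rate`, `principal`, `years`, and `monthly_payment` are all made up for illustration, and the formula is the usual fixed-rate payment formula with monthly compounding. The named variables stand in for separate input cells, and the last line shows the crammed version to avoid.

```python
# Each input gets its own named "cell". The figures are invented examples.
annual_rate = 0.06       # interest rate per year
principal = 200_000.0    # mortgage size
years = 30               # length of the mortgage in years


def monthly_payment(principal, annual_rate, years):
    """Fixed-rate loan payment, assuming monthly compounding."""
    monthly_rate = annual_rate / 12
    n_payments = years * 12
    if monthly_rate == 0:
        # No interest: just spread the principal evenly.
        return principal / n_payments
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


# The "result cell": it only refers to the input cells above.
payment = monthly_payment(principal, annual_rate, years)
print(f"Monthly payment: {payment:.2f}")

# The crammed version: same arithmetic, but the literals are buried in the
# formula, so a typo in the rate looks exactly like a bug in the formula.
crammed = 200000 * (0.06 / 12) / (1 - (1 + 0.06 / 12) ** -(30 * 12))
```

Both versions produce the same payment (about 1199.10 with these inputs), but only the first lets you change the rate in one obvious place and check the inputs separately from the logic.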
I am certain that there are even more rules that are better than these. Enlighten me.
TechLift is a non-profit organization that helps out tech firms in Ohio. I went to an overview tonight.
This was the first time I’ve been around a bunch of venture capitalists. The first thing I realized when I got there was that I wasn’t wearing a suit, but everyone else was. I thought nobody wore suits anymore. Now I realize that everybody above a certain level of wealth still wears suits. And the people that want those people’s money still wear suits.
Anyhow, TechLift is interesting. One speaker described its purpose as getting firms ready to collect venture capital. TechLift takes in hundreds of applications, thins the pool down to a few dozen, invites them in for presentations, then picks about five firms and coaches them through the startup process. TechLift prefers to select companies that are already well-developed rather than ones with interesting ideas but incoherent business plans. In the business life cycle of imagining -> incubating -> demonstrating, TechLift focuses on firms in the incubation stage.
Meanwhile, to reach out to those companies at the beginning, TechLift started the Idea Crossing site this year. That site is sort of like investment banking meets web 2.0. Startups describe themselves, and the site connects them to relevant mentors, investors, and service providers.
One speaker made a remark that I thought was clever:
It’s easy to forget that the goal was draining the swamp when you’re fighting the alligators.
People around me at work say phrases like:
- “Do we know how long this will take?”
- “Do we have someone that can figure that out?”
Wikipedia calls this the patronizing we and the description is dead on:
The patronizing we is sometimes used in addressing instead of “you”. A doctor may ask a patient: And how are we feeling today? This usage is emotionally non-neutral and usually bears a condescending, ironic, praising, or some other flavor, depending on an intonation: “Aren’t we looking cute?”.
I don’t like it. People tend to use it to assign an activity implicitly, like when somebody says “We’ll take care of it” and they really want me to do something, but they also want to somehow associate themselves with my labor.
And when some of the lazy marketing people say “Do we know how many of X there are?” what they really mean is “I’m so mushy-headed I can’t even bother thinking who I should ask to find this out”.
Finally, the “We need to get this done!” and “We need to make this a priority!” imperatives are the absolute worst. The speaker is admonishing subordinates and at the same time taking credit for anything that may happen.
Perhaps later I will construct a lookup table to disambiguate these phrases.