The efficient market hypothesis and search-engine optimization

The efficient market hypothesis (EMH) is an idea in the finance world that the market price for a commodity accurately reflects all information available at the moment. There’s little point in us trying to pick particular winning stocks, unless we have some very secret information. The best investment strategy involves diversifying risk so that results match the aggregate market changes.

It’s a very appealing idea because it means the laziest strategy is also the best.

Anyhow, I suspect that something sort of like EMH dominates the search engine world, but instead of prices for commodities at points in time, the market is search engine rankings and keywords. In my inchoate model, each keyword is its own market. The “price” of a given website would be how well it ranks in a search for that keyword.

I brought up EMH because if we view a website’s search engine ranking as a market price, and we make the assumption that search engines on average operate as efficiently as any other market, then ultimately, your site’s search engine ranking can’t really be pushed up through artificial means, for the same reason that you can’t push up stock prices artificially. I should make it clear that I mean that you can’t do it sustainably.

Hopping back into the financial world, an unfortunately common tactic is to run a pump-and-dump scam. The principle is simple: I own a whole bunch of shares of a company, and so I go out and promote the heck out of that company while at the same time I’m selling all my shares to all the suckers that I manage to convince. Merrill Lynch paid a $100 million fine a few years ago for doing this.

I get unsolicited email all the time with investment recommendations. These are sent from people running the same operation on a smaller scale.

Of course, even if these efforts do let the dumper get out and make some money, once the market learns about what is really going on, the stock price craters. See what happened to Enron for one of the most famous examples of how the market reacts to information asymmetry.

Getting to the point, these tactics may push up the price just long enough for the dumper to dump, but no serious stockholder would ever consider using a pump-and-dump scam as a long-run strategy to improve the price per share.

I think this is what search-engine marketers mean when they talk about how black-hat tactics don’t work in the long run and will likely backfire. The more I read and learn about search engine optimization and website marketing, the more the industry experts all seem to be saying the same thing: you have to have good content on your site, and you have to have recommendations from the larger community. Everything I read seems to say that the best SEO strategy is to build a really good website, rather than to build a shoddy product and market it aggressively.

In short, the invisible hand cannot be denied.

I don’t like the patronizing “we”

People around me at work say phrases like:

  • “Do we know how long this will take?”
  • “Do we have someone that can figure that out?”

Wikipedia calls this the patronizing we and the description is dead on:

The patronizing we is sometimes used in addressing instead of “you”. A doctor may ask a patient: And how are we feeling today? This usage is emotionally non-neutral and usually bears a condescending, ironic, praising, or some other flavor, depending on an intonation: “Aren’t we looking cute?”.

I don’t like it. People tend to use it to assign an activity implicitly, like when somebody says “We’ll take care of it” and they really want me to do something, but they also want to somehow associate themselves with my labor.

And when some of the lazy marketing people say “Do we know how many of X there are?” what they really mean is “I’m so mushy-headed I can’t even bother thinking who I should ask to find this out”.

Finally, the “We need to get this done!” and “We need to make this a priority!” imperatives are the absolute worst. The speaker is admonishing subordinates and at the same time taking credit for anything that may happen.

Perhaps later I will construct a lookup table to disambiguate these phrases.

When to use globals

I am dogmatic about never using global variables. But when people way smarter than me use them, like Brian Kernighan does when he builds a command-line interpreter in that chapter of The Unix Programming Environment, I wonder if maybe I’m being too reluctant.

I was looking at a Python module vaguely like this:

def foo():
    ...

def bar():
    ...

def baz():
    ...

def quux():
    ...

I had a bunch of independent functions. I wanted to add logging. I saw two easy ways to do it:

def foo():
    logger = get_logging_singleton()

def bar():
    logger = get_logging_singleton()

def baz():
    logger = get_logging_singleton()

def quux():
    logger = get_logging_singleton()

In the above code, I would get a reference to my logger object in each function call. No globals. Maybe I am violating some tenet of dependency injection, but I’ll talk about that later. Anyhow, the point I want to make is that the above approach is the way I would do it in the past.

Here’s how I decided to write it this time:

logger = get_logging_singleton()

def foo():
    ...

def bar():
    ...

def baz():
    ...

def quux():
    ...

All the functions access the logger created in the main namespace of the module. It feels a tiny bit wrong, but I think it is the right thing to do. The other way violates DRY in a big fat way.
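For what it’s worth, Python’s standard logging module fits this layout nicely. Here is a minimal sketch, assuming the stdlib logging.getLogger stands in for my get_logging_singleton(). The logger name "mymodule" and the function bodies are placeholders, not the real module.

```python
import logging

# logging.getLogger returns the same object every time it is called with
# the same name, so this module-level reference acts like the singleton.
logger = logging.getLogger("mymodule")  # "mymodule" is a placeholder name


def foo():
    logger.info("foo called")
    return "foo"


def bar():
    logger.info("bar called")
    return "bar"


if __name__ == "__main__":
    # Only the program's entry point configures handlers and levels.
    logging.basicConfig(level=logging.INFO)
    foo()
    bar()
```

Each function uses the one module-level logger, and nothing about the logger is repeated inside the function bodies.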

So, a third option would be to require the caller to pass in the logging object in every function call, like this:

def quux(logger):
    ...

This seems like the best possible outcome — it satisfies my hangup about avoiding global variables and the caller can make decisions about log levels by passing any particular logger it wants to.

There are two reasons why I didn’t take this approach:

  1. I was working on existing code, and I didn’t have the option of cramming in extra parameters in the calling library. So, I could do something like def quux(logger=globally_defined_logger) but I’m trying to make this prettier, not uglier. The whole reason that I wanted to add logging was that I wanted some visibility into what the heck was going wrong in my app. I didn’t have time to monkey with overhauling the whole system.
  2. I plan to control my logger behavior from an external configuration system. I don’t want to change code inside the caller every time I want to bump the log level up or down. It is the conventional wisdom in my work environment that I face less risk just tweaking a configuration file setting and restarting my app rather than editing my code*.

[*]I suspect that in the final analysis, this belief will be exposed as garbage. But for right now, it seems pretty true that bugs occur more frequently after editing code than after editing config files.
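To make the second reason concrete, here is a rough sketch of a log level driven by a config file rather than by code in the caller. Everything specific in it is an assumption made up for illustration: the [logging] section, the level option, the "myapp" logger name, and the level_from_config helper. The config text is inlined so the example runs on its own.

```python
import configparser
import logging

# Hypothetical config text; in practice this would be a file on disk.
EXAMPLE_CONFIG = """
[logging]
level = DEBUG
"""

# Map the names allowed in the config file to stdlib logging levels.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


def level_from_config(text, default="WARNING"):
    """Return the logging level named in the [logging] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # fallback covers both a missing section and a missing option.
    name = parser.get("logging", "level", fallback=default)
    # Unknown names fall back to WARNING instead of raising.
    return LEVELS.get(name.upper(), logging.WARNING)


logger = logging.getLogger("myapp")  # placeholder name
logger.setLevel(level_from_config(EXAMPLE_CONFIG))
```

Bumping the level up or down then means editing one line of config and restarting, with no change to the calling code.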

UPDATE: Apparently, I’m not just talking to myself here! Gary Bernhardt linked to this post and added some really interesting points. Also, his link to the post on the origin of the phrase now you have two problems was something I hadn’t heard of before.

Dependency Injection Demystified

I’m building a robot to make breakfast.

def make_breakfast():
    fridge = get_reference_to_fridge()
    eggs = fridge.get_eggs(number_of_eggs=2)
    fried_eggs = fry(eggs, over_easy)
    cabinet = get_reference_to_cabinet()
    plate = cabinet.get_plate()
    add(fried_eggs, plate)

    return plate

This is OK, but I realize my robot needs lots and lots of practice, and I don’t like wasting all my nice eggs and getting all my plates dirty. So, while my robot is still learning how to cook, I want to specify that it uses paper plates and some crappy expired eggs I fished out of the grocery store dumpster.
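One possible way to get there is to hand the fridge and the cabinet to the robot instead of letting it look them up itself. This is only a sketch: DumpsterFridge, PaperPlateCabinet, and this version of fry are hypothetical stand-ins invented for the example, and the plate is modeled as a plain dictionary.

```python
class DumpsterFridge:
    """Hypothetical practice fridge stocked with expired eggs."""

    def get_eggs(self, number_of_eggs):
        return ["expired egg"] * number_of_eggs


class PaperPlateCabinet:
    """Hypothetical practice cabinet that hands out paper plates."""

    def get_plate(self):
        return {"kind": "paper", "food": []}


def fry(eggs, style):
    # Stand-in for the real cooking step.
    return [f"{egg}, fried {style}" for egg in eggs]


def make_breakfast(fridge, cabinet):
    # The caller chooses the fridge and cabinet, so practice runs and
    # real breakfasts go through exactly the same code.
    eggs = fridge.get_eggs(number_of_eggs=2)
    fried_eggs = fry(eggs, "over easy")
    plate = cabinet.get_plate()
    plate["food"].extend(fried_eggs)
    return plate


practice_plate = make_breakfast(DumpsterFridge(), PaperPlateCabinet())
```

Swapping in the good eggs later means passing different objects, not editing make_breakfast.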


Google presentation at Clepy on August 6th, 2007

Tonight Brian Fitzpatrick (Fitz) from the Chicago Google office did a presentation for the Clepy group on version control at Google. They use Subversion on top of their own super-cool Bigtable filesystem back end.

We had a good discussion on the merits of centralized vs. decentralized version control. According to Fitz, decentralized systems discourage collaboration. He made the joke, “Did you hear about the decentralized version control conference? Nobody showed up.” He made the point that centralized repositories encourage review and discussion. I agree with that.

Apparently Subversion 1.5, which will be released in a few months, will have much improved merging facilities. We won’t need to use --stop-on-copy to figure out where we branched. Also, it will be safe to repeat a merge, because nothing will happen on the second attempt.

I don’t like the dispatching system in TurboGears

I wanted to translate a(1).b(2) into the TurboGears URL /a/1/b/2. A browser making a request for /a/1/b/2 would trigger that code.

This page explains how to do it. You build a single default method that catches everything and then does introspection to figure out where to send the request.

It works fine, but it isn’t nearly as obvious or concise as the regular-expression approach I’ve seen in Rails and Django.
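To show the general shape of the catch-all approach, here is a framework-free sketch. It is not TurboGears code and does not use its API. Root, AResource, and this default function are hypothetical names invented for the example, and it assumes every URL argument is an integer.

```python
class AResource:
    """Hypothetical object returned by a(1); it exposes b()."""

    def __init__(self, a_id):
        self.a_id = a_id

    def b(self, b_id):
        return f"a={self.a_id}, b={b_id}"


class Root:
    """Hypothetical root of the object tree."""

    def a(self, a_id):
        return AResource(a_id)


def default(root, path):
    """Catch-all handler: walk name/argument pairs and call each method."""
    parts = [p for p in path.split("/") if p]
    if len(parts) % 2:
        raise ValueError(f"expected name/argument pairs, got {path!r}")
    target = root
    for name, arg in zip(parts[0::2], parts[1::2]):
        method = getattr(target, name, None)
        # Refuse private names and anything that is not callable.
        if name.startswith("_") or not callable(method):
            raise LookupError(f"no handler for {name!r}")
        target = method(int(arg))
    return target


result = default(Root(), "/a/1/b/2")  # same as Root().a(1).b(2)
```

The introspection lives in the getattr loop, which is exactly the part a regular-expression route table would spell out explicitly.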

Learning flex without spending $0.01, day one.

I downloaded the command-line compiler and lots of documentation PDF files from here earlier today.

Then I started working through the “getting started” tutorial PDF.

I made a few edits to my ~/_vimrc file so that working with mxml files would be a little easier:

" Do some specific maps for flex files (.mxml files).
" F10 rebuilds the swf.
autocmd BufNewFile,BufRead *.mxml map <F10> :! mxmlc %<CR>

" F11 executes the swf.
autocmd BufNewFile,BufRead *.mxml map <F11> :! start %<.swf<CR>

And I was able to build this do-nothing widget after about 15 minutes of goofing off:


That’s a screenshot of my homemade swf running above the vim session where I wrote it.

Next stuff to figure out:

  • Where do my trace statements go?
  • I need to figure out how to pass in locations of actionscript files when I compile my mxml files into swf files.
  • I need to learn the tags in MXML. They’re different than HTML.
  • I need to learn how to talk to a webserver.

Pre-employment drug screens ate my balls

This is a really old k5 diary that I’m proud of. The original post is here. I made some minor edits.

The bratty and unhelpful HR troglodyte just called and said she forgot to mention earlier that I’ve gotta take a pre-employment drug screening.

Now I gotta go and piss in a goddamn cup.

She’s unhelpful because so far, she’s been unable to answer even one of my questions about vacation, benefits, health insurance or retirement without putting me on hold and asking somebody else first. She’s bratty because she doesn’t like how I ask her to explain the nonsensical corporate jargon she throws out; I think she’d prefer that I just trust her judgement about my options rather than try to comprehend them myself.

Anyway, she should have told me about the drug test two weeks ago when I got the offer. Now I gotta get to the new city a few days ahead of my original time so I can excrete urine for these fuckers.

Dogs urinate as a sign of submission. Maybe that’s how this drug-testing thing got started; it’s just a way to break the worker’s spirit right off the bat.

I tend to glaze over when libertarians fuss about stuff like grocery store club cards, or Radio Shack asking for my address. I agree with them, but I don’t really get upset about it.

But pre-employment drug tests really make me mad. It’s not what you’re thinking. I’m clean, man. Just like our president, I can pass the FBI background check that examines the last seven years.

And I doubt these tests really accomplish anything, anyway. It’s not as if American productivity shot up after employers started screening new hires. Almost any of the tests can be easily circumvented. They’re just another sign that we’re slowly giving our dignity away.

I’m not mad because I think drugs ought to be legal. I don’t really care anymore about whether they should or shouldn’t be illegal; they are illegal, and most likely, they’re going to be illegal for a really long time. People might as well rant about bad weather. Furthermore, I wouldn’t touch them anyway.

I don’t like being treated like a criminal. Drug screening places are always shitty hellhole offices, with employees that are unhappy at the fact that they handle piss all day, so they take it out on the poor saps that need work bad enough to submit to this degradation.

This job didn’t perform a credit check (well, at least not to my knowledge) but I’ve been asked to grant permission for those for other jobs.

My sister had to take a lie detector test in order to get a promotion at one job. Where does it fucking end? Will firms send out investigators to root around houses of job applicants and look for anything that might mark them as a bad employee? Why not profile family members and find out if any of them indicate a family predisposition towards deviance? Maybe future junior partners at PricewaterhouseCoopers will have to go out and kill some nameless victim in order to make it to full partner; that way, the company always has something on them.

It’s days like this that make me want to cash out my retirement and head out of town, buy a farm, and live off the grid. But that’s not really very safe anymore either, right? A bunch of bored ATF assholes would probably come after me.

A friend warned me not to eat any poppy-seed muffins before the test. That got me thinking. In some other parallel universe, I’m gonna look up every drug analog possible and eat all of them: poppy seeds, cough syrup, cranberry juice, etc, and try to grand-slam that drug test. I want a goddamn siren to go off because of everything I (falsely) test positive for. Based on the results, doctors will want to know how I can remain standing.

But in this universe, I figured out what I’m gonna do. I’m going on an all-asparagus diet the week before my test. My piss is gonna stink so bad, lab techs will have to wear masks or risk losing consciousness. They’ll have to close the place down to fumigate.

Using dictionaries rather than complex if-elif-else clauses

Lately, I’ve been using dictionaries as a dispatching mechanism. It seems especially elegant when I face some fairly elaborate switching logic.

For example, instead of:
if a == 1 and b == 1:
    log("everything worked!")

elif a == 1 and b == 0:
    log("a good, b bad")

else:
    log("a failed")

Do this:

d = {
    (1, 1): commit,
    (1, 0): report_that_b_failed,
    (0, 1): report_that_a_failed,  # a failed, whatever b did
    (0, 0): report_that_a_failed,
}

k = (a, b)
f = d[k]
f()

This approach is also really useful when you face the need to change logic based on runtime values. You can imagine that d might be built after parsing an XML file.
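Here is a minimal sketch of that idea, building d from XML with the standard library’s xml.etree.ElementTree. The XML layout, the HANDLERS registry, the build_dispatch_table helper, and the handler bodies are all assumptions invented for the example.

```python
import xml.etree.ElementTree as ET


# Hypothetical handlers; each returns a message so results are easy to check.
def commit():
    return "committed"


def report_that_b_failed():
    return "b failed"


def report_that_a_failed():
    return "a failed"


# Only functions listed here can be named in the XML.
HANDLERS = {
    "commit": commit,
    "report_that_b_failed": report_that_b_failed,
    "report_that_a_failed": report_that_a_failed,
}

# Hypothetical rule file, inlined so the example is self-contained.
RULES_XML = """
<rules>
  <rule a="1" b="1" handler="commit"/>
  <rule a="1" b="0" handler="report_that_b_failed"/>
  <rule a="0" b="1" handler="report_that_a_failed"/>
  <rule a="0" b="0" handler="report_that_a_failed"/>
</rules>
""".strip()


def build_dispatch_table(xml_text):
    """Turn each <rule> element into an (a, b) -> function entry."""
    table = {}
    for rule in ET.fromstring(xml_text).iter("rule"):
        key = (int(rule.get("a")), int(rule.get("b")))
        table[key] = HANDLERS[rule.get("handler")]
    return table


d = build_dispatch_table(RULES_XML)
outcome = d[(1, 0)]()
```

Looking handlers up in a fixed registry, rather than evaluating names from the file, keeps the XML from calling arbitrary code.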