Rewrite my ugly code

I have a big gnarly function named get_start_and_stop_dates. Please rewrite it into something that still passes all the doctests but isn’t so ugly.

If you save this in a file named matt_sucks.py, you can run all the doctests like this:

$ nosetests --with-doctest matt_sucks.py
Doctest: matt_sucks.get_first_and_last_dom ... ok
Doctest: matt_sucks.get_start_and_stop_dates ... ok
Doctest: matt_sucks.stubborn_datetimeparser ... ok

----------------------------------------------------------------------
Ran 3 tests in 0.028s

OK

Here’s the code:

import simplejson
from datetime import date, datetime, timedelta

def get_start_and_stop_dates(d, s):

"""
Returns a tuple of datetime.date objects.

First checks dictionary d, then looks in the cookie s, then returns
the results of get_first_and_last_dom().

We return values from the dictionary d, even if the values exist in
simple_cookie s:

>>> d = {'start_date':'12-07-2008', 'stop_date':'12-20-2008'}
>>> import Cookie, simplejson
>>> s = Cookie.SimpleCookie()
>>> s['start_date'] = simplejson.dumps('12-08-2008')
>>> s['stop_date'] = simplejson.dumps('12-11-2008')
>>> a, b = get_start_and_stop_dates(d, s)
>>> from datetime import date
>>> isinstance(a, date) and isinstance(b, date)
True
>>> a.strftime('%m-%d-%Y'), b.strftime('%m-%d-%Y')
('12-07-2008', '12-20-2008')

If the dictionary d doesn't have values, then we get them from the
simple_cookie object s:

>>> a, b = get_start_and_stop_dates({}, s)
>>> from datetime import date
>>> isinstance(a, date) and isinstance(b, date)
True
>>> a.strftime('%m-%d-%Y'), b.strftime('%m-%d-%Y')
('12-08-2008', '12-11-2008')

We handle mix-and-match scenarios, like where one value is in d and
another is in s:

>>> s2 = Cookie.SimpleCookie()
>>> s2['stop_date'] = simplejson.dumps('2-28-1975')
>>> get_start_and_stop_dates({'start_date':'2-17-1975'}, s2)
(datetime.date(1975, 2, 17), datetime.date(1975, 2, 28))

When just one of the dates is specified, then the other will be
the first/last day of the month containing the other date:

>>> get_start_and_stop_dates({'start_date':'2-17-1975'},
... Cookie.SimpleCookie())
(datetime.date(1975, 2, 17), datetime.date(1975, 2, 28))

>>> get_start_and_stop_dates({'stop_date':'2-17-1975'},
... Cookie.SimpleCookie())
(datetime.date(1975, 2, 1), datetime.date(1975, 2, 17))

Finally, we call get_first_and_last_dom when all else fails:

>>> get_first_and_last_dom() == get_start_and_stop_dates({},
... Cookie.SimpleCookie())
True
"""

# I've revised this several times, but this is still pretty ugly.
# It probably can be divided into more functions.

# These are the last-resort values, holding the first and last days
# of the current month.
first, last = get_first_and_last_dom()

# These are the dateformats that the dates will be in.
dateformats = ['%m-%d-%Y', '%Y-%m-%d', '%Y-%m-%d %H:%M:%S']

start_date = stop_date = None

# Figure out the start_date first.
if 'start_date' in d and d['start_date']:
start_date = stubborn_datetimeparser(d['start_date'],
dateformats).date()

elif s.has_key('start_date') and s['start_date'].value:
start_date = stubborn_datetimeparser(simplejson.loads(s['start_date'].value),
dateformats).date()

# Now repeat the process for stop_date.
# TODO: pull this redundancy into a single function and call it
# twice.
if 'stop_date' in d and d['stop_date']:
stop_date = stubborn_datetimeparser(d['stop_date'],
dateformats).date()

elif s.has_key('stop_date') and s['stop_date'].value:
stop_date = stubborn_datetimeparser(simplejson.loads(s['stop_date'].value),
dateformats).date()

# Now figure out what to return. Remember, if we found one date,
# but not the other, then we return the first/last date of that month,
# not the current month.

if not start_date and not stop_date:
return first, last

elif start_date and stop_date:
return start_date, stop_date

elif start_date and not stop_date:
a, b = get_first_and_last_dom(start_date)
return start_date, b

elif not start_date and stop_date:
a, b = get_first_and_last_dom(stop_date)
return a, stop_date

def get_first_and_last_dom(dt="today"):
"""
Return a tuple of datetime.date objects with the first and last day of the
month holding dt, which defaults to today.

>>> first, last = get_first_and_last_dom(datetime(2008, 12, 6))
>>> first == datetime(2008, 12, 1).date()
True
>>> last == datetime(2008, 12, 31).date()
True

>>> first, last = get_first_and_last_dom(datetime(2008, 11, 30))
>>> first == datetime(2008, 11, 1).date()
True
>>> last == datetime(2008, 11, 30).date()
True
"""

if dt == "today":
dt = datetime.now()

# We have to be a little careful figuring out the last date.
if dt.month < 12: last_day = (datetime(dt.year, dt.month+1, 1) - timedelta(days=1)).date() else: last_day = (datetime(dt.year+1, 1, 1) - timedelta(days=1)).date() first_day = datetime(dt.year, dt.month, 1).date() return first_day, last_day def stubborn_datetimeparser(s, dateformats): """ Keep trying to parse s into a datetime object until we succeed or run out of dateformats. When the first format works, we immediately return: >>> dateformats = ['%Y-%m-%d', '%m-%d-%Y', '%m-%d-%Y %H:%M']
>>> stubborn_datetimeparser('12-1-2008', dateformats)
datetime.datetime(2008, 12, 1, 0, 0)

Otherwise, we keep trying until we parse it:

>>> stubborn_datetimeparser('12-1-2008', dateformats)
datetime.datetime(2008, 12, 1, 0, 0)

>>> stubborn_datetimeparser('12-1-2008 15:47', dateformats)
datetime.datetime(2008, 12, 1, 15, 47)

or we run out of formats, and raise a ValueError:

>>> stubborn_datetimeparser('12/1/2008', dateformats)
Traceback (most recent call last):
...
ValueError: I couldn't parse '12/1/2008' with any of my formats!
"""

for datefmt in dateformats:
try:
return datetime.strptime(s, datefmt)

except ValueError:
pass

# This else matches the for datefmt in dateformats loop. It means
# that we didn't break out of the loop early.
else:
raise ValueError("I couldn't parse '%s' with any of my formats!" % s)

Now get to work!

12 thoughts on “Rewrite my ugly code

  1. It's not for the function you're wanting help with, but dateutil would make much shorter work of your date parsing issues.

    My typical usage for parsing date strings is (in a nutshell):

    from dateutil.parser import parse as parse_date
    dt = parse_date(some_date_string)

    You can add your own formats to it if you need to do crazy stuff.

  2. Thanks Mike — besides helping with parsing, what else can dateutil
    help with? I remember you showed a bunch of neat recurrence trickery
    as well. I think I should read all the docs.

  3. The standard library module “calendar” contains the function monthrange(), which will give you the numerical day of the first and last days of the month. The last day of the month in get_first_and_last_dom() becomes last_day = datetime.date(dt.year, dt.month, calendar.monthrange(dt.year,dt.month)[1])

  4. That calendar.monthrange trick is nice. Allows me to forget about nonsense like leap year and special logic for December.

    I read somewhere that every big python app ends up building a crappier version of what's already in the standard library. I'm happy to see I've reached that point 🙂

    Thanks for the help!

  5. I like that quote! And thanks, this was a fun exercise. I've been sick for a couple of days, so I haven't had my own ugly code to stare at for a while. 🙂

Comments are closed.