As near as I can tell, looking up a string in a Python set is effectively free: it costs about as much as a single string comparison.
I made a 9000-element set whose elements are the strings foo_0, foo_1, foo_2, …, up to foo_8999. Then I measured how long it takes to test whether one string belongs to that set:
$ python -m timeit -s 's = set(["foo_%s" % i for i in range(9000) ])' '"foo_4500" in s'
1000000 loops, best of 3: 0.447 usec per loop
Next I measured a few scans across a list of the same size:
$ python -m timeit -s 's = ["foo_%s" % i for i in range(9000) ]' '"foo_0" in s'
1000000 loops, best of 3: 0.447 usec per loop
$ python -m timeit -s 's = ["foo_%s" % i for i in range(9000) ]' '"foo_1" in s'
1000000 loops, best of 3: 0.659 usec per loop
$ python -m timeit -s 's = ["foo_%s" % i for i in range(9000) ]' '"foo_900" in s'
10000 loops, best of 3: 130 usec per loop
$ python -m timeit -s 's = ["foo_%s" % i for i in range(9000) ]' '"foo_4500" in s'
1000 loops, best of 3: 631 usec per loop
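It takes more time to do two string comparisons (finding foo_1 in the list) than it does to hash the string and look it up in the set. The list scan only gets slower the deeper the target sits: finding foo_4500 takes 631 usec, roughly 1,400 times longer than the set lookup.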
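If you would rather run the comparison from a script than from the shell, here is a minimal sketch using the standard timeit module. The helper name time_lookup and the repeat counts are my own choices for illustration, not part of the measurements above. Wrapping the lookup in a lambda adds a little call overhead to every measurement, so the absolute numbers will not match the command-line figures; the gap between the set and the list should still be obvious.

```python
import timeit

N = 9000
names = ["foo_%s" % i for i in range(N)]
as_set = set(names)    # hash-based membership test
as_list = list(names)  # membership test scans from the front


def time_lookup(container, key, number=1000):
    """Return the best per-lookup time in microseconds over 3 repeats.

    time_lookup is a hypothetical helper for this sketch; the lambda
    adds a small, constant call overhead to every measurement.
    """
    timer = timeit.Timer(lambda: key in container)
    best = min(timer.repeat(repeat=3, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    print("set  foo_4500: %.3f usec" % time_lookup(as_set, "foo_4500"))
    for key in ("foo_0", "foo_1", "foo_900", "foo_4500"):
        print("list %-8s: %.3f usec" % (key, time_lookup(as_list, key)))
```

The list is scanned in order, so the time for each key grows with its position, while the set time stays roughly flat whichever key you pick.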