Speeding up url_for for Flask
17th February 2018
```python
import functools

from flask import url_for


@functools.lru_cache(maxsize=2560)
def caching_url_for(endpoint, **values):
    """A version of `url_for` that caches the output"""
    return url_for(endpoint, **values)
```
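A nice side benefit of `lru_cache` is the introspection it gives us for free: the wrapped function exposes `cache_info()` and `cache_clear()`. A minimal sketch of this, using a hypothetical stand-in for `url_for` so it runs outside an application context:

```python
import functools


# Hypothetical stand-in for Flask's `url_for`, so the caching behaviour
# can be shown without an app/request context
def url_for(endpoint, **values):
    parts = [endpoint] + [str(v) for _, v in sorted(values.items())]
    return '/' + '/'.join(parts)


@functools.lru_cache(maxsize=2560)
def caching_url_for(endpoint, **values):
    """A version of `url_for` that caches the output"""
    return url_for(endpoint, **values)


caching_url_for('foo', bar=1)  # miss: builds the URL
caching_url_for('foo', bar=1)  # hit: served from the cache
print(caching_url_for.cache_info())
```

`cache_info()` reports hits, misses, and the current cache size, which is handy for checking the cache is actually being used in production.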
Update: The snippet above replaces the original code I posted with a version built on functools' lru_cache (least recently used), which is not only more concise but also provides the option to control the size of the cache. Thanks to cyanydeez for the suggestion. The original implementation follows:
```python
def caching_url_for(endpoint, **values):
    """A version of `url_for` that caches the output"""
    # Build a hashable key for the values, sorting the items so that
    # keyword order doesn't create duplicate entries
    value_key = tuple(sorted(values.items())) if values else None

    # Check for a cached version of the endpoint/values
    if endpoint in caching_url_for._cache:
        if value_key in caching_url_for._cache[endpoint]:
            return caching_url_for._cache[endpoint][value_key]

    # No cached version so generate the URL
    url = url_for(endpoint, **values)

    # Store the URL in the cache for the next look up
    if endpoint not in caching_url_for._cache:
        caching_url_for._cache[endpoint] = {}
    caching_url_for._cache[endpoint][value_key] = url

    return url

# Define a cache for the `caching_url_for` function; URLs are cached in
# process memory, so restarting the application clears the cache.
caching_url_for._cache = {}
```
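One detail worth calling out in the hand-rolled version: the cache key is built from the sorted `values` items, so the order in which keyword arguments are passed doesn't produce duplicate entries. A quick demonstration of the key scheme:

```python
# Sorting the items normalises keyword order, so both call patterns
# map to the same cache key
key_a = tuple(sorted(dict(bar=1, baz=2).items()))
key_b = tuple(sorted(dict(baz=2, bar=1).items()))
assert key_a == key_b == (('bar', 1), ('baz', 2))
```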
On a recent project I was asked to help optimise the server-side render time for a page displaying 200+ carpet swatches (I live on the edge, I'm telling you).
The tech stack involved in this case was Flask > MongoFrames > Jinja2, and the page that was slow to render can be seen here: https://www.brintons.co.uk/carpets
I expected the performance cost to be shared somewhat between querying the database (via MongoFrames) and rendering the template (Jinja2). However, after adding some basic profiling to the view I was surprised to see that several hundred calls to url_for were adding up to a significant amount of time: the bottleneck was actually in Flask.
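The profiling involved was nothing fancy. A minimal sketch of the sort of thing I mean (the `timed` decorator and `timings` dict are my own names for illustration, not code from the project):

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated time per function name (hypothetical helper)
timings = defaultdict(float)


def timed(fn):
    """Wrap `fn` so the total time spent in it is recorded."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            timings[fn.__name__] += time.perf_counter() - start
    return wrapper

# e.g. wrap `url_for` with `timed` in the view module, render the page,
# then inspect `timings` to see where the time went.
```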
To resolve this issue I implemented the code above to provide a version of url_for that caches the output.
Benchmarks
I put together some basic benchmarks, however take these with a pinch of salt as the implementation used to generate them was very simple (I've included the code at the end of the article).
| Function | Avg. time (in secs) for 200 calls |
|---|---|
| url_for | 0.0189 |
| caching_url_for | 0.0006 |
| caching_url_for (pre-populated*) | 0.0003 |

\* For this result I first made sure the cache was populated.

*Benchmark results*
The results show that the caching_url_for function is (once its cache is populated) approximately 60x faster.
But wait, we're only saving a little under two hundredths of a second here! My guess is that this is down to the fact I'm running the benchmark against a site with just a handful of registered rules. On a site with 10s or 100s of rules (like Brintons.co.uk) the performance difference would be greater, because (from my interpretation of the code) URLs are built/matched by looping over the registered rules until one matches. This also means some endpoints would be found faster than others.
On a large and busy website with lots of concurrent visitors this seems like a simple-to-implement and worthwhile optimisation. However, if anyone can spot a flaw in my thinking/code here I'd love to hear from you.
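One caveat with the lru_cache version specifically: functools.lru_cache keys on the keyword arguments in the order they were passed, so `url_for('foo', a=1, b=2)` and `url_for('foo', b=2, a=1)` occupy two separate cache slots (the hand-rolled version above avoids this by sorting the key). A small demonstration with a hypothetical stand-in function:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def build(endpoint, **values):
    # Hypothetical stand-in for URL building
    return (endpoint, tuple(sorted(values.items())))


build('foo', a=1, b=2)
build('foo', b=2, a=1)  # same result, but a separate cache entry
print(build.cache_info())  # misses == 2, hits == 0
```

This only wastes cache slots rather than returning wrong URLs, but it's worth knowing when sizing `maxsize`.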
Benchmark code
```python
import time

from flask import Flask, url_for

app = Flask(__name__)


@app.route('/foo/<int:bar>')
def foo(bar):
    pass


@app.route('/benchmark')
def benchmark():
    # Time 1,000 runs of 200 `url_for` calls
    s = time.time()
    for i in range(1000):
        for j in range(200):
            url_for('foo', bar=j)
    avg_time = (time.time() - s) / 1000
    output = 'url_for: ' + str(avg_time) + '<br>'

    # Populate the cache before timing the cached version
    for j in range(200):
        caching_url_for('foo', bar=j)

    # Time 1,000 runs of 200 `caching_url_for` calls
    s = time.time()
    for i in range(1000):
        for j in range(200):
            caching_url_for('foo', bar=j)
    avg_time = (time.time() - s) / 1000
    output += 'caching_url_for: ' + str(avg_time)

    return output
```