Speeding up url_for for Flask

17th February 2018

import functools

from flask import url_for


@functools.lru_cache(maxsize=2560)
def caching_url_for(endpoint, **values):
    """A version of `url_for` that caches the output"""
    return url_for(endpoint, **values)

Update: The version above replaces my original snippet (below) with functools' lru_cache (least recently used), which is not only more concise but also provides the option to control the size of the cache. Thanks to cyanydeez for the suggestion.
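As a quick standalone illustration of what lru_cache does here (using a stand-in build function rather than Flask's url_for, so it runs outside a request context), repeated calls with the same arguments skip the function body entirely, and cache_info() reports the hits and misses:

```python
import functools

call_count = 0

@functools.lru_cache(maxsize=2560)
def build_url(endpoint, **values):
    # Stand-in for `url_for` so the example runs outside a request context
    global call_count
    call_count += 1
    parts = ['', endpoint] + [str(v) for _, v in sorted(values.items())]
    return '/'.join(parts)

print(build_url('foo', bar=1))  # /foo/1
print(build_url('foo', bar=1))  # /foo/1 (served from the cache)
print(call_count)               # 1 - the body only ran once
print(build_url.cache_info())   # hits=1, misses=1, maxsize=2560, currsize=1
```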

from flask import url_for


def caching_url_for(endpoint, **values):
    """A version of `url_for` that caches the output"""

    # Build a hashable key from the endpoint's values (a sorted tuple so the
    # order the keyword arguments are passed in doesn't matter)
    value_key = tuple(sorted(values.items())) if values else None

    # Check for a cached version of the endpoint/values
    if endpoint in caching_url_for._cache:
        if value_key in caching_url_for._cache[endpoint]:
            return caching_url_for._cache[endpoint][value_key]

    # No cached version so generate the URL
    url = url_for(endpoint, **values)

    # Store the URL in the cache for the next look up
    caching_url_for._cache.setdefault(endpoint, {})[value_key] = url

    return url

# Define a cache for the `caching_url_for` function, URLs are cached locally
# so that restarting the application clears the cache.
caching_url_for._cache = {}
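The reason for the tuple(sorted(values.items())) dance in the hand-rolled version: the keyword arguments arrive as a dict, which isn't hashable and so can't be used as a dictionary key directly. A sorted tuple of its items is hashable, and sorting also makes the key independent of the order the arguments were passed in:

```python
# Two dicts built with the keyword arguments in different orders
a = dict(bar=1, baz=2)
b = dict(baz=2, bar=1)

key_a = tuple(sorted(a.items()))
key_b = tuple(sorted(b.items()))

print(key_a)           # (('bar', 1), ('baz', 2))
print(key_a == key_b)  # True - argument order doesn't change the key
```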

On a recent project I was asked to help optimise the server-side render time for a page displaying 200+ carpet swatches (I live on the edge, I'm telling you).

The tech stack involved in this case was Flask > MongoFrames > Jinja2, and the page that was slow to render can be seen here: https://www.brintons.co.uk/carpets

I expected the performance cost to be shared somewhat between querying the database (via MongoFrames) and rendering the template (Jinja2). However, after adding some basic profiling to the view I was surprised to see that several hundred calls to url_for were adding up to a significant amount of time - the bottleneck was actually in Flask.

To resolve this issue I implemented the code above to provide a version of url_for that caches the output.

Benchmarks

I put together some basic benchmarks, however take these with a pinch of salt as the implementation used to generate them was very simple (I've included the code at the end of the article).

Function                            Avg. time (in secs) for 200 calls
url_for                             0.0189
caching_url_for                     0.0006
caching_url_for (pre-populated*)    0.0003

* For this result I first made sure the cache was populated.

Benchmark results

The results show that the caching_url_for function is (once populated) approximately 60x faster.

But wait - we're only saving a little under two hundredths of a second here! My guess is that this is because I'm running the benchmark against a site with just a handful of registered rules. On a site with 10s or 100s of rules (like Brintons.co.uk) the performance difference is greater, because (from my interpretation of the code) URLs are built/matched by looping over the registered rules until one matches - which also means some endpoints are found faster than others.
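A toy illustration of that hunch (this isn't Werkzeug's actual implementation, just a sketch of why building a URL by scanning rules gets slower as more rules are registered, while a cache lookup stays flat):

```python
# 500 hypothetical rules, mapping an endpoint name to a URL path
rules = [('rule%d' % i, '/path/%d/' % i) for i in range(500)]

def build_by_scan(endpoint):
    # Walk the rules in order until one matches the endpoint
    for name, path in rules:
        if name == endpoint:
            return path

# What caching_url_for effectively builds up over time
cache = dict(rules)

print(build_by_scan('rule499'))  # scans all 500 rules to find a match
print(cache['rule499'])          # a single hash lookup, same result
```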

On a large and busy website with lots of concurrent visitors this seems like a simple-to-implement and worthwhile optimisation. However, if anyone can spot a flaw in my thinking/code here I'd love to hear from you.

Benchmark code

import time

from flask import Flask, url_for

app = Flask(__name__)


@app.route('/foo/<int:bar>')
def foo(bar):
    pass


@app.route('/benchmark')
def benchmark():

    s = time.time()

    for i in range(1000):
        for j in range(200):
            url_for('foo', bar=j)

    avg_time = ((time.time() - s) / 1000)
    output = 'url_for:' + str(avg_time) + '<br>'

    for j in range(200):
        caching_url_for('foo', bar=j)

    s = time.time()

    for i in range(1000):
        for j in range(200):
            caching_url_for('foo', bar=j)

    avg_time = ((time.time() - s) / 1000)
    output += 'caching_url_for:' + str(avg_time)

    return output