AMPing ContentTools

25th January 2018

Over the last couple of days I've been working on introducing AMP (Accelerated Mobile Pages) to this journal as I have a number of upcoming projects that will also use AMP.

AMP requires modified HTML markup. For example, img tags are converted to amp-img, there are additional tag attributes such as layout and placeholder, and there's a bunch of rules your page must satisfy before it will validate.

ContentTools outputs HTML which is saved to a database and then retrieved and rendered out as part of an HTML template. This is problematic because the stored HTML will likely fail AMP validation.
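
To make the problem concrete, here is a minimal sketch of a pre-check that scans stored HTML for the things the filter later in this post rewrites: plain img tags, images without both dimensions, and inline styles. The names `AmpIssueScanner` and `scan` are hypothetical, it relies only on Python's standard `html.parser`, and it is in no way a substitute for the AMP validator.

```python
from html.parser import HTMLParser


class AmpIssueScanner(HTMLParser):
    """Collect markup that the AMP conversion in this post has to deal with.

    Hypothetical helper: it only knows the few cases discussed in this
    post and does not implement the real AMP validation rules.
    """

    def __init__(self):
        super().__init__()
        self.issues = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)

        if tag == 'img':
            # Plain images need converting to amp-img
            self.issues.append('img should be amp-img')
            # Responsive amp-img tags need both dimensions
            if 'width' not in attrs or 'height' not in attrs:
                self.issues.append('img is missing width/height')

        if tag == 'style':
            self.issues.append('style element')

        if 'style' in attrs:
            self.issues.append('inline style on <%s>' % tag)


def scan(html):
    """Return a list of issue descriptions found in an HTML string."""
    scanner = AmpIssueScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.issues


sample = '<p style="color: red">Hi</p><img src="a.jpg" alt="A" width="640">'
for issue in scan(sample):
    print(issue)
```

Running something like this over a few saved pages gives a quick feel for how much of the ContentTools output needs rewriting before it has a chance of validating.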

I've run across this type of problem before, and it's made me think that at some point I'd like to add support for JSON output in ContentTools to allow for control over the rendered output. For now though I've implemented a Jinja filter for manhattan that converts the HTML output by ContentTools into valid AMP markup. Whilst this doesn't feel as clean as having control over the initial rendering of the HTML, it does seem to work pretty well and so I thought I'd share.

from bs4 import BeautifulSoup

__all__ = [
    'amped'
]


def amped(html, snippet=None):
    """Convert an HTML string to an AMP HTML string"""

    # Parse the HTML string
    soup = BeautifulSoup(html, 'html.parser')

    # Build a map of images (assets) stored against the snippet
    assets_map = {}
    if snippet:
        # Local snippets store their own contents, global snippets share them
        if snippet.scope == 'local':
            contents = snippet.local_contents
        else:
            contents = snippet.global_snippet.contents
        assets_map = {a[0]: a[1] for a in contents.get('__assets__', [])}

    # Ensure images have width and height attributes
    for tag in soup.select('[data-mh-asset-key]'):
        asset = assets_map.get(tag['data-mh-asset-key'])
        if asset:
            variations = asset.get('variations')
            image_size = variations['image']['core_meta']['image']['size']
            image = tag.find('img')
            if image:
                image['width'] = image_size[0]
                image['height'] = image_size[1]

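    # Remove image fixtures (swap each fixture wrapper for the image inside it)
    for tag in soup.select('[data-ce-tag="img-fixture"]'):
        image = tag.find('img')
        if image:
            # Replacing in place keeps the image at the fixture's position
            # regardless of what was parsed immediately before it
            tag.replace_with(image.extract())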

    # Convert images to amp images
    for tag in soup.select('img'):
        if not tag.get('src'):
            # Drop images without a source and skip the conversion
            tag.extract()
            continue
        tag.name = 'amp-img'
        tag['layout'] = 'responsive'

    # Ensure images within amp-iframes are set as placeholders
    for tag in soup.select('amp-iframe'):
        image = tag.find('amp-img')
        if image:
            image['layout'] = 'fill'
            # An empty string serialises as placeholder="" rather than
            # placeholder="True"
            image['placeholder'] = ''
            del image['width']
            del image['height']

    # Remove any inline styles
    for tag in soup.select('style'):
        tag.extract()

    for tag in soup.select('[style]'):
        del tag['style']

    return soup.prettify()
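
For anyone wanting to try this outside of manhattan, here is a minimal sketch of registering a function as a Jinja filter and calling it from a template. How manhattan itself registers filters isn't covered in this post, so the plain `jinja2.Environment` setup is an assumption, and the `amped` defined in the example is only a stand-in so the example runs on its own.

```python
from jinja2 import Environment


def amped(html, snippet=None):
    """Stand-in for the real filter above so this example is runnable."""
    return html.replace('<img ', '<amp-img layout="responsive" ')


# Autoescaping is switched on here to show why the `safe` filter is needed
env = Environment(autoescape=True)
env.filters['amped'] = amped

# The filter returns markup, so mark the result as safe to avoid escaping
template = env.from_string('<article>{{ body|amped|safe }}</article>')

print(template.render(body='<img src="a.jpg" width="640" height="480">'))
```

If the snippet is available in the template context it can be passed as the filter's argument, for example `{{ body|amped(snippet)|safe }}`.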

What's that snippet stuff?

The code relating to the snippet argument here is another workaround.

Currently ContentTools doesn't set a height against image fixtures, and height is a required attribute for responsive images in AMP. Fortunately this information is stored as metadata against the snippet of content in manhattan, so we can access it and add the height for the image.
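
As a rough illustration of the metadata the filter reads, the sketch below builds a stand-in snippet and performs the same lookup. The nesting of keys is taken from the filter code above, but the `SimpleNamespace` object, the `hero` asset key and the dimensions are made up for the example and aren't real manhattan data.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a manhattan snippet with a single image asset.
# Each entry in `__assets__` is a [key, asset] pair.
snippet = SimpleNamespace(
    scope='local',
    local_contents={
        '__assets__': [
            ['hero', {
                'variations': {
                    'image': {
                        'core_meta': {'image': {'size': [1200, 800]}}
                    }
                }
            }]
        ]
    }
)

# Build the same key -> asset map the filter uses
assets_map = {
    key: asset
    for key, asset in snippet.local_contents.get('__assets__', [])
}

# The size is stored as [width, height]
asset = assets_map['hero']
width, height = asset['variations']['image']['core_meta']['image']['size']

print(width, height)
```

The key in the map is what the filter matches against the `data-mh-asset-key` attribute in the saved HTML.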

I'm not sure why height isn't currently set against image fixtures, but fixing this would clearly simplify the code above, so that's on my todo list.