Jinja: overrideable autoescape

Created on 15 Oct 2015 · 11 Comments · Source: pallets/jinja

At work, we are using Jinja to render templates as YAML. We'd like to customize all {{ variable }} expansions to do something sensible to Python values like None (which is null in YAML) and empty string (which is encoded as '' in YAML). It seems like the Jinja autoescaping mechanism would be perfect for this. Unfortunately, I don't see any way to configure the autoescaping functionality to not autoescape to HTML. Instead, I've had to monkeypatch Markup and escape in jinja2.runtime. It would be nice if there were an officially-sanctioned method to do this, for example by overriding something in the environment.

All 11 comments

Not sure if autoescaping is a proper way to do this.

I'd rather use a filter. tojson seems a good candidate, since YAML is a superset of JSON.
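As a rough illustration of the filter suggestion (a sketch, not something posted in the thread; the template text and the variable names are made up), Jinja's built-in tojson filter already turns None into null and the empty string into a quoted empty string, both of which are valid YAML scalars:

```python
from jinja2 import Environment

# Hypothetical template: every value goes through the built-in tojson filter.
env = Environment()
template = env.from_string(
    "name: {{ name | tojson }}\n"
    "parent: {{ parent | tojson }}"
)

# An empty string and None are the two cases raised in the original post.
rendered = template.render(name="", parent=None)
print(rendered)
```

The drawback discussed further down still applies: the filter has to be written on every variable expansion.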


Besides that: any reason why you use string-based templates to build YAML? Serializing a Python dictionary as YAML seems much cleaner. The only case I can see where this might not be sufficient is when you want to include comments. But in that case you could use a YAML engine that handles non-destructive reading/writing, so you'd simply load a YAML template with your comments and empty items, update the data, and re-serialize it as YAML.

The point of using autoescaping is that I don't want to have to put a filter on every single variable expansion.

Our use case is to write configuration files for our microservices. We have a "template" included with each service and we want to fill it with per-environment configuration which is specific to that template. I agree that using YAML directly is cleaner, but we couldn't find a generic "engine" that knows how to replace variables whose location it doesn't already know. (There are YAML aliases and references, but they're not really a good fit for this -- they're more for serializing cyclic structures.) There are ways I can think of to get this to work using YAML, but they're kind of hacky too. This setup also gives more control over the kinds of substitutions that we can perform, although in practice we don't really use that for anything.

You could use a jinja plugin to apply the filter to all variables. It's not too hard as long as you don't have to use i18n blocks. Maybe you can take some ideas from something I've written some time ago: https://github.com/indico/indico/blob/master/indico/web/flask/templating.py#L187

This is very helpful, thanks! I feel like autoescaping is still an OK approach to solving this because you can e.g. flag variables as |safe. But I concede that conceptually, this isn't really like autoescaping, because we're not just ensuring that variables are valid YAML, but formatting them in a certain way. I'll write an extension.

@glasserc did you succeed with this extension? I have a similar problem. Can you point me to a solution or share some ideas?

The first part of my solution was the extension that runs all variable calls through a filter. This is the bare minimum that I needed to get that working.

import jinja2.ext
from jinja2.lexer import Token

class YAMLEverythingExtension(jinja2.ext.Extension):
    """
    Insert a `|yaml` filter at the end of every variable substitution.

    This will ensure that all injected values are converted to YAML.
    """
    def filter_stream(self, stream):
        # This is based on https://github.com/indico/indico/blob/master/indico/web/flask/templating.py.
        for token in stream:
            if token.type == 'variable_end':
                yield Token(token.lineno, 'pipe', '|')
                yield Token(token.lineno, 'name', 'yaml')
            yield token

This could be smarter. For example, it could try to skip inserting the filter if it sees a safe or yaml filter explicitly in the variable block. But I never ended up needing or having time for that.
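For anyone who does need that, here is one way the "smarter" variant might look. This is only a sketch and not code from the thread: the class name SelectiveYAMLExtension and the SKIP_FILTERS set are invented for illustration, and json.dumps stands in for the real yaml filter (JSON output is also valid YAML) so that the example runs without PyYAML. It only checks whether the block ends in one of the listed filters; a safe or yaml filter earlier in the expression does not count.

```python
import json

import jinja2.ext
from jinja2 import Environment
from jinja2.lexer import Token

# Filters that suppress the automatic one when written last in a block.
SKIP_FILTERS = {"safe", "yaml"}


class SelectiveYAMLExtension(jinja2.ext.Extension):
    """Append `|yaml` to variable blocks unless they already end in a skip filter."""

    def filter_stream(self, stream):
        prev = before_prev = None  # the two most recent tokens seen
        for token in stream:
            if token.type == "variable_end":
                ends_with_skip_filter = (
                    prev is not None
                    and before_prev is not None
                    and before_prev.type == "pipe"
                    and prev.type == "name"
                    and prev.value in SKIP_FILTERS
                )
                if not ends_with_skip_filter:
                    yield Token(token.lineno, "pipe", "|")
                    yield Token(token.lineno, "name", "yaml")
            yield token
            before_prev, prev = prev, token


# Stand-in filter (assumption): json.dumps instead of a real YAML serializer.
env = Environment(extensions=[SelectiveYAMLExtension])
env.filters["yaml"] = json.dumps

rendered = env.from_string("a: {{ a }}, b: {{ b|safe }}").render(a=None, b="raw")
print(rendered)
```

Here the first block gets the filter appended and the second one, which ends in an explicit safe, is left alone.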

The second part is the filter itself. Just using yaml.dump wasn't sophisticated enough, so I had to poke around the yaml internals a little bit.

import io

import jinja2
import yaml

def yaml_filter(val):
    """Serialize some value in isolation, not as part of any document.

    We can't just use yaml.dump because that outputs an entire document, including newlines, which isn't helpful for
    inserting into a YAML document."""
    if isinstance(val, jinja2.Undefined):
        # Raise Jinja's usual undefined-variable error rather than serializing the placeholder object.
        val._fail_with_undefined_error()
    # In-memory text buffer that receives the serialized output.
    stream = io.StringIO()
    dumper = yaml.dumper.Dumper(stream)
    dumper.open()
    node = dumper.represent_data(val)
    dumper.serialize(node)
    # The serialized node tends to have a \n at the end.  The template might not
    # want a \n inserted here, e.g. if two variables are on the same line, so
    # strip.
    return stream.getvalue().strip()

This ties it all together, including making sure that the filter of the given name is available in the environment:

from jinja2 import loaders
from jinja2 import Environment, StrictUndefined

def get_environment():
    """Create a standard Jinja environment that has everything in it.
    """
    jinja_env = Environment(extensions=(YAMLEverythingExtension,),
                            # some other options that we use at work
                            loader=loaders.FileSystemLoader(['.', '/']),
                            undefined=StrictUndefined)
    jinja_env.filters["yaml"] = yaml_filter
    return jinja_env

This would still be great for escaping non-HTML templates. I'm using Jinja2 to generate JSON, and want to use JSON escaping instead of HTML escaping.

Using Jinja templates or any other string-based template language or string operations at all to generate JSON is just wrong. I can see why you would do this for YAML since it is meant to be human-friendly, too, but JSON should be generated from e.g. a Python dict and not from a Jinja template.

(that said, I am curious why you want to do that and what you are trying to do :p)

I guess that makes sense. :-) I'll close #571.

Maybe JSON is not good to be generated by Jinja, but I'm very interested in this kind of feature because I'm generating LaTeX using jinja. I've gotten most of the way there by modifying the block_start_string and block_end_string and others in the Environment. For now, I've set autoescape = False, but I would ideally like to pass something in to do the escaping myself (maybe like a regex).
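A minimal sketch of what that setup could look like while there is no escaping hook, assuming an explicit filter is acceptable. The \BLOCK{ and \VAR{ delimiters, the latex filter name, and the escape table are illustrative choices rather than anything from this thread, and the table is not an exhaustive list of LaTeX special characters:

```python
import re

from jinja2 import Environment

# Characters with special meaning in LaTeX and one common replacement for each.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
# A single regex pass, so already-escaped output is never escaped a second time.
LATEX_PATTERN = re.compile("|".join(re.escape(char) for char in LATEX_SPECIALS))


def latex_escape(value):
    """Escape LaTeX special characters in the string form of value."""
    return LATEX_PATTERN.sub(lambda match: LATEX_SPECIALS[match.group()], str(value))


# Delimiters chosen (as an assumption) so they do not clash with LaTeX syntax.
env = Environment(
    block_start_string=r"\BLOCK{",
    block_end_string="}",
    variable_start_string=r"\VAR{",
    variable_end_string="}",
    comment_start_string=r"\#{",
    comment_end_string="}",
    autoescape=False,
)
env.filters["latex"] = latex_escape

rendered = env.from_string(r"\section{\VAR{title | latex}}").render(title="50% & more")
print(rendered)
```

The filter still has to be spelled out on each expansion, which is exactly the inconvenience an overridable autoescape would remove.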

This also applies to generating plaintext and Markdown. Jinja has been a great tool so far for templating non-HTML documents.

I was looking into using jinja to generate SQL queries because using str.format() can get messy if you need multiple conditional query parts.

There are already libraries to do that, namely jinja-vanish, which allows custom escape functions, and jinjasql.
The first one has to disable constant evaluation at compile time, which seems a little bad. The other one
uses the filter approach, adding a special filter to every variable expression, which is OK but also doesn't feel right.
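To make the filter approach concrete, here is a rough sketch of the idea. It is not jinjasql's actual implementation; the render_query helper, the bind filter name, and the "?" placeholder style are all assumptions for illustration. The filter swaps each value for a placeholder and collects the value separately, so nothing is interpolated into the SQL text:

```python
from jinja2 import Environment


def render_query(template_source, **context):
    """Render a SQL template, returning the query text and its bind parameters."""
    params = []

    def bind(value):
        # Record the value and emit a placeholder instead of the value itself.
        params.append(value)
        return "?"

    env = Environment()
    env.filters["bind"] = bind
    sql = env.from_string(template_source).render(**context)
    return sql, params


# Hypothetical query with one conditional part.
QUERY = (
    "SELECT * FROM users WHERE active = 1"
    "{% if name %} AND name = {{ name | bind }}{% endif %}"
)

sql, params = render_query(QUERY, name="o'brien")
print(sql, params)
```

The resulting pair can then be handed to a database driver that accepts "?" placeholders, for example sqlite3's cursor.execute(sql, params).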

I haven't looked too deeply into it, but it does not seem to be that hard to implement. Could this be considered, or is it out of scope for the library?
