Jinja: lstrip_blocks doesn't preserve space before end block on the same line

Created on 29 Jan 2020  ·  13Comments  ·  Source: pallets/jinja

Look at the following template code:

set_property verilog_define {
  {%- for k, v in vlogdefine.items() %}{{ k }}={{ v|param_value_str }} {% endfor -%}
  } [get_filesets sources_1]

https://github.com/olofk/edalize/blob/bdb6c9ccc666e9f60333279ad53ed09cda88b3dc/edalize/templates/vivado/vivado-project.tcl.j2#L27-L29

Up to Jinja2 2.10.3 this would result in something like

set_property verilog_define {vlogdefine_bool=1 vlogdefine_int=42 vlogdefine_str=hello } [get_filesets sources_1]

https://github.com/olofk/edalize/blob/bdb6c9ccc666e9f60333279ad53ed09cda88b3dc/tests/test_vivado/test_vivado_0.tcl#L10

The code has lstrip_blocks = True.

Now with 2.11.0 the code is rendered like this:

set_property verilog_define {vlogdefine_bool=1vlogdefine_int=42vlogdefine_str=hello} [get_filesets sources_1]

@towoe bisected the behavior change down to https://github.com/pallets/jinja/commit/7d00a40465c89bee141ab5a3db545a20e7d30509 (by @petee-d in https://github.com/pallets/jinja/issues/857)

According to the documentation

lstrip_blocks
If this is set to True leading spaces and tabs are stripped from the start of a line to a block. Defaults to False.

I don't think the whitespace should be removed before the endfor, since that's not whitespace "from the beginning of the line". Am I misunderstanding that, or does the new version introduce an unexpected change in behavior?

Python 3.7.3 (but also happens on 3.5 in CI)

bug

Most helpful comment

Just released 2.11.2 with this.

All 13 comments

I'll look into it, sorry for the causing that confusing debugging session you linked! At first glance, it looks like the behavior is more consistent now, and we didn't have a test for the previous behavior, but I'll have to check. Would it be possible for you to use {%+ endfor -%} to explicitly declare preserving the whitespace?

@davidism thanks for taking a look.

Yes, we can make changes to edalize (the library where Jinja is used) in various ways (add the +, switch to lstrip_blocks = False, etc.), but I'd like to first ensure that we have a shared understanding of the expected behavior.

This also affects me in the same way as the original poster. The documentation seems pretty clear to me that the 2.11 behavior is wrong. From https://github.com/pallets/jinja/blob/2.11.x/docs/templates.rst#whitespace-control:

(Nothing will be stripped if there are other characters before the start of the block.)

This is due to #858, which fixed a pretty big speed issue with parsing space. I'm fine with fixing this issue it caused, but I don't want to just revert that change. If anyone wants to play around with the regex in the lexer and fix this, I'm happy to review it.

Sorry about the fix breaking the intended behavior of lstrip_blocks, I remember being a bit confused about what exactly it's supposed to do and being unable to find proper documentation for it. I'm not sure why Google didn't take me to the docs part @kenyon linked to, very weird. So I ended up figuring out the intended behavior from the tests and the implementation and apparently missed this.

As @davidism said, reverting the fix would cause other trouble so a new fix needs to be devised. I will try to make time for that this week.

from jinja2 import Template

t = Template(
    "{% if x %}{{ x }} {% endif %}y",
    lstrip_blocks=True,
)
out = t.render(x="x")
assert out == "x y"

https://github.com/pallets/jinja/blob/547e6e39ec3994c9dd0c806ee7bb29353843060e/src/jinja2/lexer.py#L721-L735

Here's the issue. When this template is being tokenized, find("\n") is handed text containing only the space between {{ x }} and {% endif %}, sees no newline, then strips from the beginning of the text.

Adding "\n" in text to the elif makes a bunch of tests fail because the first line of a template doesn't contain a newline but should be stripped if it only has space. If we get a little smarter and track whether we're on the first line, it still fails if trim_blocks is also enabled since that strips off the newline as part of the regex.

It also seems that the behaviour of {%- raw %} {% endraw -%} is affected. Before 2.11.0 this will force a space at this place. Now this space is removed.

from jinja2 import Template

t = Template(
    "{{x}}\n{%- raw %} {% endraw -%}\n{{ y }}",
    lstrip_blocks=True,
)
out = t.render(x="x", y="y")
assert out == "x y"

@petee-d any chance you have some time to take a look at this again? I'm going to revert it for 2.11.2 otherwise.

Ouch, totally forgot about this. Will try to actually make the time this week, otherwise revert it if the side effects are as bad as it seems.

My tentative plan is to release on Saturday. Let me know if you're working on it, I can push that back if so.

I think I have a fix. I tried not to be influenced in my fix by your previous comments @davidism and yet still came to the same solution you tried, just went a bit further.

First of all, I think I originally intended for my fix to behave differently if l_pos = text.rfind("\n") + 1 didn't find any newline character but then forgot about it. So I no longer do the stripping if there is no newline in text. Then I also tried to solve the problem with the source first line needing to be stripped anyway, but instead of using or pos == 0, I used a line_starting boolean flag intialized at True. Then I just set this flag after processing each token based on whether the matching string ends with a newline and it also solves the trim_blocks issue. See the fix in #1183.

The current test suite passes, tomorrow I will also add new tests for the cases in this issue and other I can think of. I will also run the original version, first fix and the second fix about my freshly collected 5GB set of user created templates I collected from my project, compare the parse trees and performance.

Tests added, perf tests showed no degradation, so it's done from my side. :)

Just released 2.11.2 with this.

Was this page helpful?
0 / 5 - 0 ratings