I was just testing the default HTML Purifier config, which now allows YouTube embeds and no longer strips them out when an entry is saved. However, it is still not working properly. I believe this is because the regex used to determine whether the iframe src URL belongs to YouTube or Vimeo is not quite right: it doesn't account for an edge case in which YouTube URLs omit the https: part at the beginning of the URL.
This is happening because when you first embed the youtube video, the resulting HTML looks like this:
<figure><iframe style="width: 500px; height: 281px;" src="//www.youtube.com/embed/roY6H75d9wE" frameborder="0" allowfullscreen=""></iframe></figure>
As you can see above, the src parameter starts with //, instead of an explicit scheme such as https://.
As a result, the above src value gets stripped out because the current regex being used doesn't account for the possibility of a missing scheme.
Here is a regex I've used in the past that works with this edge case:
'URI.SafeIframeRegexp' => '%^(https?:)?//(www\.youtube(?:-nocookie)?\.com/embed/|player\.vimeo\.com/video/)%',
Please note the additional brackets and question mark around the https?: part, marking it as optional. It also accounts for YouTube's nocookie URLs. Lastly, the . (dot) characters have also been escaped, as we want them to match an actual dot and not just any character.
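To illustrate the behaviour described above, here is a small sketch that checks the proposed pattern against a few iframe src values. It uses Python's re module as a stand-in for the PCRE matching that HTML Purifier performs in PHP, so the % delimiters are dropped. The helper name is_safe_iframe_src, the Vimeo video ID, and the example.com URL are made up for illustration.

```python
import re

# Proposed pattern from above, without the PHP "%" delimiters.
SAFE_IFRAME = re.compile(
    r"^(https?:)?//(www\.youtube(?:-nocookie)?\.com/embed/|player\.vimeo\.com/video/)"
)


def is_safe_iframe_src(src: str) -> bool:
    """Hypothetical helper: True if src starts with an allowed embed prefix."""
    return SAFE_IFRAME.match(src) is not None


samples = [
    "//www.youtube.com/embed/roY6H75d9wE",                # scheme omitted (the edge case)
    "https://www.youtube.com/embed/roY6H75d9wE",          # explicit https scheme
    "http://www.youtube-nocookie.com/embed/roY6H75d9wE",  # nocookie host
    "https://player.vimeo.com/video/12345",               # made-up Vimeo ID
    "https://example.com/embed/roY6H75d9wE",              # host not on the allow list
]

for src in samples:
    print(is_safe_iframe_src(src), src)
```

Only the last sample should be rejected. The first one is the scheme-less form that was being stripped.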
Also, I think those . (dot) characters in the regex need to be escaped properly, as the intent is to match a literal . character and not any character. I have updated the regex solution above to escape them.
@andris-sevcenko just checking in to see if you have seen this issue. Thank you.
@sidm1983 I have. Are you willing to submit a PR for this?
Just released 2.8.3 which fixes this.
Just updated the default HTML Purifier config for new Craft projects as well. You will need to make this change manually if you have a config/htmlpurifier/Default.json file: https://github.com/craftcms/craft/commit/2e3710956cd6d8f8a1ac572a2b10c53e714f1799
Thanks for that @andris-sevcenko & @brandonkelly. Apologies I didn't submit a PR for this, as I didn't get a chance at the time. Thanks for sorting it out though.
@brandonkelly, quick question about the change you made in the above commit. It looks like the dot characters are not being escaped, which means they will match any character and not just a single dot. Is this intentional?
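For context on why the escaping matters, the sketch below compares an escaped and an unescaped pattern, again using Python's re module rather than PHP's PCRE. The unescaped pattern is an assumption written for illustration (a YouTube-only pattern with the backslashes removed), not a copy of the regex in the commit, and the wwwXyoutube.com host is made up.

```python
import re

# Dots escaped: each "\." matches only a literal dot.
ESCAPED = re.compile(r"^(https?:)?//www\.youtube\.com/embed/")

# Dots unescaped (illustrative assumption): each "." matches any single character.
UNESCAPED = re.compile(r"^(https?:)?//www.youtube.com/embed/")

genuine = "//www.youtube.com/embed/roY6H75d9wE"
lookalike = "//wwwXyoutube.com/embed/roY6H75d9wE"  # made-up host, "X" in place of a dot

print(bool(ESCAPED.match(genuine)))      # True
print(bool(UNESCAPED.match(genuine)))    # True
print(bool(ESCAPED.match(lookalike)))    # False
print(bool(UNESCAPED.match(lookalike)))  # True
```

Both patterns accept the genuine URL, but only the unescaped one also accepts the lookalike host, which is the looser matching the question above is about.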
@brandonkelly just wanted to check if you've seen my comment above about escaping the dot character in the regex. Thank you. 😊🙏🏽