Skip to content

Fixed escaping of alt text in ContentFormat.img() #2036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bmispelon
Copy link
Member

@bmispelon bmispelon commented Apr 17, 2025

This should fix #2035 and also address escaping issues in the other formats (markdown and restructured text).

@bmispelon bmispelon changed the title Fixed escaping of alt text in ContentFormat.HTML Fixed escaping of alt text in ContentFormat.img() Apr 17, 2025
@bmispelon bmispelon marked this pull request as ready for review April 17, 2025 07:43
Copy link
Member

@thibaudcolas thibaudcolas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable but I don’t think I’ll have time to actually test this / review in more depth, sorry

Copy link
Member

@adamzap adamzap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the final ReST test that includes a newline (see my other comment), I'm able to get the tests to pass with this small change:

diff --git a/blog/models.py b/blog/models.py
index 0083dd22..9ca310b0 100644
--- a/blog/models.py
+++ b/blog/models.py
@@ -1,3 +1,4 @@
+import html
 from urllib.parse import urlparse
 
 from django.conf import settings
@@ -73,7 +74,7 @@ class ContentFormat(models.TextChoices):
         CF = type(self)
         return {
             CF.REST: f".. image:: {url}\n   :alt: {alt_text}",
-            CF.HTML: f'<img src="{url}" alt="{alt_text}">',
+            CF.HTML: f'<img src="{url}" alt="{html.escape(alt_text)}">',
             CF.MARKDOWN: f"![{alt_text}]({url})",
         }[self]

I'm wondering if _md_escape is not needed since markdown escapes alt text automatically:

>>> import markdown
>>> markdown.markdown('![Al"t t"ext](/path/to/img.jpg)')
'<p><img alt="Al&quot;t t&quot;ext" src="/path/to/img.jpg" /></p>'

If the intent behind _md_escape is something beyond fixing alt text quoting, could we put it in a different PR?

(ContentFormat.REST, "test-", '<img src="." alt="test-">'),
(ContentFormat.REST, "test.", '<img src="." alt="test.">'),
(ContentFormat.REST, "test!", '<img src="." alt="test!">'),
(ContentFormat.REST, "te\nst", '<img src="." alt="te st">'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to support newlines in alt text? I'm having trouble verifying that newlines would have any affect on rendering.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ContentFormat image HTML source needs to escape attributes
3 participants