Harden fenced markdown rendering against code-fence breakout and mention abuse #534

Closed
Copilot wants to merge 6 commits into main from copilot/audit-discord-user-code


Conversation

Copilot AI commented Apr 10, 2026

This change closes remaining markdown injection surfaces where untrusted external text was interpolated into fenced Discord output.

  • Scope of hardening

    • Applied fenced-content sanitization to untrusted outputs across GitHub, snippet, release-feed, package, and error-rendering paths.
    • Covered: issue/PR/repo/user/org/gist views, snippet line extraction, release body rendering, PyPI/Crates summaries, and extension/error text formatting.
  • Sanitizer behavior

    • Centralized via Manager.sanitize_codeblock_content(...).
    • Neutralizes fence delimiters to prevent breakout for both:
      • triple backticks: ```
      • triple tildes: ~~~
    • Neutralizes mentions in non-code textual outputs to reduce ping abuse.
  • Parity-preserving exception for code views

    • For code-centric outputs (gist/snippet code blocks), mention neutralization is disabled to preserve literal @ characters and copy/paste fidelity.
    • Fence breakout protection still applies.
  • Representative usage

    body = self.bot.mgr.sanitize_codeblock_content(
        self.bot.mgr.truncate(pr['bodyText'], 387, full_word=True)
    )
    embed.add_field(name=':notepad_spiral: Body:', value=f"```{body}```", inline=False)
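
For illustration, the sanitizer behavior described above could be sketched roughly as follows. This is not the code from this PR: the function body, the zero-width-space strategy, and the `FENCE_RUN` pattern are assumptions; only the name `sanitize_codeblock_content` and the `neutralize_mentions` flag come from the description and review comments.

```python
import re

# Assumption: fences and mentions are broken up with a zero-width space,
# which is invisible in Discord but stops the sequence from being parsed.
ZERO_WIDTH_SPACE = "\u200b"

# Any run of three or more backticks or tildes can act as a fence delimiter.
FENCE_RUN = re.compile(r"`{3,}|~{3,}")


def sanitize_codeblock_content(content, neutralize_mentions=True):
    """Hypothetical stand-in for Manager.sanitize_codeblock_content."""
    if content is None:
        return ""
    text = str(content)
    # Interleave a zero-width space between the characters of each fence run,
    # so no three consecutive backticks or tildes remain.
    text = FENCE_RUN.sub(lambda m: ZERO_WIDTH_SPACE.join(m.group(0)), text)
    if neutralize_mentions:
        # Break "@everyone", "@here", and "<@id>" style mentions.
        text = text.replace("@", "@" + ZERO_WIDTH_SPACE)
    return text


if __name__ == "__main__":
    untrusted = "nice PR ``` @everyone free nitro"
    # Non-code text: fences and mentions are both neutralized.
    print(repr(sanitize_codeblock_content(untrusted)))
    # Code view: literal "@" is preserved, fences are still neutralized.
    print(repr(sanitize_codeblock_content("user@host ~~~", neutralize_mentions=False)))
```

In this sketch a code-centric caller passes `neutralize_mentions=False`, matching the parity-preserving exception described for gist and snippet views.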

Summary by Sourcery

Harden Discord fenced markdown rendering for untrusted external text and error outputs by centralizing sanitization of content embedded in code blocks.

Bug Fixes:

  • Prevent code-fence breakout in Discord embeds by sanitizing untrusted text before interpolation into fenced code blocks.
  • Reduce unwanted user and role pings by neutralizing mentions in non-code textual outputs shown inside Discord code fences.

Enhancements:

  • Introduce a reusable manager-level sanitizer for codeblock content that escapes fence delimiters and optionally neutralizes mentions.
  • Apply the centralized sanitizer across GitHub views, release feed bodies, package summaries, snippet rendering, commit/issue/PR bodies, and error/debug embeds while preserving literal content for code-centric snippets.

Summary by Sourcery

Harden Discord embed rendering by sanitizing untrusted text before interpolation into fenced code blocks.

Bug Fixes:

  • Prevent markdown code-fence breakout in error, GitHub, and feed embeds by sanitizing interpolated content.
  • Reduce potential mention abuse in fenced, non-code text by neutralizing mention characters in sanitized content.

Enhancements:

  • Introduce a centralized manager-level sanitizer for codeblock content that escapes fence delimiters and optionally neutralizes mentions.
  • Apply the sanitizer across error logging, extension management, GitHub repo/user/org/commit/issue views, release feeds, package info, snippet rendering, and gists while preserving literal content for code-centric snippets.

Copilot AI and others added 6 commits April 10, 2026 21:08
Agent-Logs-Url: https://github.com/statch/gitbot/sessions/46b730a3-a822-49cd-9473-c994c6dd7db6

Co-authored-by: seven7ty <63970738+seven7ty@users.noreply.github.com>

@sourcery-ai sourcery-ai bot left a comment

Hey - I've left some high level feedback:

  • In sanitize_codeblock_content, using if not content: will convert falsy values like 0 to an empty string; consider changing this to an explicit if content is None: check so numeric or boolean values are preserved when passed in.
  • You currently rely on callers to remember neutralize_mentions=False for code-centric paths (snippets, gists, etc.); consider adding small helper wrappers like sanitize_code() vs sanitize_text() or defaulting based on context to reduce the chance of accidentally neutralizing @ in future code views.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `sanitize_codeblock_content`, using `if not content:` will convert falsy values like `0` to an empty string; consider changing this to an explicit `if content is None:` check so numeric or boolean values are preserved when passed in.
- You currently rely on callers to remember `neutralize_mentions=False` for code-centric paths (snippets, gists, etc.); consider adding small helper wrappers like `sanitize_code()` vs `sanitize_text()` or defaulting based on context to reduce the chance of accidentally neutralizing `@` in future code views.
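
A minimal sketch of what the two review suggestions might look like together, assuming a simplified standalone sanitizer. The wrapper names `sanitize_code()` and `sanitize_text()` are taken from the review comment; the sanitizer body is a hypothetical stand-in, not the PR's implementation.

```python
import re
from typing import Any

ZERO_WIDTH_SPACE = "\u200b"  # assumption: used to break fences and mentions
FENCE_RUN = re.compile(r"`{3,}|~{3,}")


def sanitize_codeblock_content(content: Any, neutralize_mentions: bool = True) -> str:
    # Explicit None check: with `if not content:` the values 0 and False
    # would be collapsed to "", here they survive as "0" and "False".
    if content is None:
        return ""
    text = FENCE_RUN.sub(lambda m: ZERO_WIDTH_SPACE.join(m.group(0)), str(content))
    if neutralize_mentions:
        text = text.replace("@", "@" + ZERO_WIDTH_SPACE)
    return text


def sanitize_text(content: Any) -> str:
    """For non-code text (issue bodies, release notes): neutralize mentions."""
    return sanitize_codeblock_content(content, neutralize_mentions=True)


def sanitize_code(content: Any) -> str:
    """For code views (gists, snippets): keep literal '@' characters."""
    return sanitize_codeblock_content(content, neutralize_mentions=False)
```

With wrappers like these, a call site states its intent by name, so a future code view is less likely to neutralize `@` by accident through a forgotten keyword argument.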

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@github-actions github-actions bot left a comment

Hi! This is your first pull request, we're glad to have you with us 😄 Make sure to follow the contribution guidelines. Happy coding!

@seven7ty seven7ty closed this Apr 10, 2026