Skip to content

Fix broken links (Devin)#5449

Closed
github-actions[bot] wants to merge 2 commits into
mainfrom
devin/fix-broken-links
Closed

Fix broken links (Devin)#5449
github-actions[bot] wants to merge 2 commits into
mainfrom
devin/fix-broken-links

Conversation

@github-actions
Copy link
Copy Markdown
Contributor

@devin-ai-integration Please fix the broken links detected by the scheduled link checker.

Instructions:

  1. For each URL listed below, identify the source file containing the broken link
  2. Try to find the correct replacement URL. Common fixes include:
    • Updating outdated commit SHAs to the latest commit
    • Fixing URL-encoded paths
    • Updating renamed file paths
  3. IMPORTANT: If you cannot confidently find a correct replacement (e.g., the only signal is a 5xx/503 error, connection reset, or rate limiting), do NOT remove or modify the link. Instead, leave it unchanged and add a PR comment on the relevant line asking for manual validation.
  4. Only update links/paths contained in this PR, not other links that follow a similar pattern
  5. Run fern docs dev locally to verify your changes don't break anything
  6. Push your fix to this PR branch
  7. After CI posts a preview link, use it to verify that your changes actually fix the underlying issue, then post a comment to the PR mentioning that you have tested it using the preview link
  8. When the PR is ready for review, request a review from @davidkonigsberg in GitHub and send a message in the Devin session that includes "<!here>" to alert everyone in the channel.
  9. Delete the scaffold file (.github/broken-links/broken-links.md) as part of your fix

Broken Links:

Non-429 Broken Links


View workflow run

@devin-ai-integration
Copy link
Copy Markdown
Contributor

Investigation: all 3 broken links are fern-platform hreflang="zh" false-positives

Following the scaffold instruction "If you cannot confidently find a correct replacement... do NOT remove or modify the link. Instead, leave it unchanged and add a PR comment on the relevant line asking for manual validation." — and given that the previous fix attempt (PR #5409) was closed without merge.

What lychee actually found

Lychee fetched each English page from sitemap-en.xml and extracted links from the rendered HTML. The 3 reported "broken zh URLs" are not in sitemap-zh.xml, are not in source anywhere, and only appear as <link rel="alternate" hreflang="zh" href="..."> SEO alternates the fern-platform docs bundle attaches to every English page. From the workflow log:

### Errors in https://buildwithfern.com/learn/cli-generator/get-started/overview
* [404] <https://buildwithfern.com/learn/zh/cli-generator/get-started/overview>

### Errors in https://buildwithfern.com/learn/docs/preview-publish/multi-source-docs
* [404] <https://buildwithfern.com/learn/zh/docs/preview-publish/multi-source-docs>

### Errors in https://buildwithfern.com/learn/docs/seo/robots-txt
* [404] <https://buildwithfern.com/learn/zh/docs/seo/robots-txt>

Live verification:

URL zh HTML zh .md EN HTML In sitemap-zh.xml?
/learn/zh/cli-generator/get-started/overview 404 200 200 no
/learn/zh/docs/preview-publish/multi-source-docs 404 200 200 no
/learn/zh/docs/seo/robots-txt 404 200 200 no

Why source has nothing to fix

  • I grepped the entire fern/ tree for each path. Zero references.
  • The cli-generator product doesn't exist in fern/docs.yml at all — that content lives in another repo publishing to the same buildwithfern.com/learn domain (multi-source). There's no docs-repo source for me to edit even if I wanted to.
  • The English equivalents serve 200 and the source links pointing at them are correct.
  • The hreflang alternate is generated server-side in the fern-platform docs bundle for every English page, independent of whether a zh-locale HTML route exists for that page.

This is a second sub-bug, distinct from the stale-S3-blob sub-bug that caused /products/sdks/custom-code.mdx in #5409:

  1. Stale zh translation S3 blob (previously: /sdks/deep-dives/readme showing the old /products/sdks/custom-code.mdx link). FDR registerTranslation keeps an old blob even after successful Publish Docs runs.
  2. hreflang="zh" emitted for pages with no rendered zh HTML route (this run). .md content serves 200, only the HTML page 404s, so the alternate link points at a real-but-broken URL.

Sub-bug (2) will keep producing new broken-link reports every time an English page is added or renamed without a corresponding zh translation snapshot (cli-generator, preview-publish/multi-source-docs, seo/robots-txt are all relatively recent additions). It's whack-a-mole at the lychee-config layer.

Options (none of which I'll execute without sign-off)

  1. Land lychee excludes anyway — matches the precedent set by #5375. Stops the noise now. Downside: you closed chore: exclude platform-side stale zh URLs from link checker #5409 which did the same thing, so I'm assuming you no longer want this.
  2. Fix in fern-api/fern-platform — gate hreflang="zh" emission on actual zh route existence (sub-bug 2); investigate stale FDR registerTranslation writes (sub-bug 1). This is the real fix. I can open a fern-platform PR for sub-bug (2) — the docs bundle already knows which pages have translations because it builds sitemap-zh.xml from that same data; the hreflang emitter is just not consulting it.
  3. Close this PR without action — same outcome as chore: exclude platform-side stale zh URLs from link checker #5409, accept the link-checker noise as a forcing function for the platform fix.
  4. Drop the workflow's link-extraction on English pages, only check sitemap URLs directly — broader change to .github/workflows/check-links.yml to pass --no-progress --files-from urls.txt without letting lychee follow HTML-discovered links. Loses some legitimate external-link coverage from page bodies.

Leaning toward option 2 (open a fern-platform PR to fix the hreflang emission) since that's the root cause, but want to confirm before doing more work that gets closed. cc @davidkonigsberg

@devin-ai-integration
Copy link
Copy Markdown
Contributor

Investigation: no source-file fix applies — needs platform/manual validation

All three broken URLs are /learn/zh/... paths (Chinese-localized routes). They are not authored anywhere in fern/ — they're emitted by the Fern platform's language switcher on the corresponding English pages. So there is no source path to rewrite the way the workflow instructions suggest (no outdated SHA / renamed file / encoded path).

# Broken URL What I found
1 /learn/zh/cli-generator/get-started/overview The cli-generator product is beta / early access (availability: beta in fern/products/cli-generator/overview.mdx) and has no zh translation directory. Local fern docs dev serves the EN page as a fallback (200, EN content under the zh slug) — so it's not a 404 locally, only on production.
2 /learn/zh/docs/preview-publish/multi-source-docs ZH translation does exist in repo: fern/translations/zh/products/docs/pages/preview-publish/multi-source.mdx (added in #5385). ZH nav entry exists in fern/translations/zh/fern/products/docs/docs.yml. The URL returns 200 locally via fern docs dev and 200 on the original #5385 preview deploy (link). Only production 404s.
3 /learn/zh/docs/seo/robots-txt Same situation as (2). ZH file fern/translations/zh/products/docs/pages/seo/robots-txt.mdx (added in #5385) with explicit slug: robots-txt in the ZH nav. Returns 200 locally + on the #5385 preview, 404 on production.

Why this looks like a stale Full Route Cache, not a docs bug

The fern-platform docs bundle's [slug]/page.tsx route uses export const dynamic = "force-static". After FDR is updated by publish-docs, revalidateTag(domain) clears the data cache, but the Full Route Cache for any newly-existent slug only gets cleared via revalidatePath(). New slugs that didn't exist at the previous deploy seem not to be hit by the revalidate sweep, so they continue to 404 even though FDR has them.

That matches what we see:

What I changed in this PR

  • Deleted the scaffold file .github/broken-links/broken-links.md (per instruction 9).
  • No source-file changes. Per instruction 3, I'm leaving the broken links unmodified because there is no confident source-level replacement: (a) for feat(ci): Add stale-bot #1 the EN-fallback behavior locally suggests this is platform routing, not a missing nav entry we can add safely without committing to translating an early-access product; (b) for feat: new home page #2 and fix(ci): Stale-bot needs more permissions for private repos #3 the docs config is already correct.

Asks for manual validation

  1. Can someone with fern-platform access force a revalidatePath sweep for /learn/zh/docs/seo/robots-txt and /learn/zh/docs/preview-publish/multi-source-docs on fern.docs.buildwithfern.com? If that flips them to 200, the cache-eviction story is confirmed and we may want a follow-up to make publish-docs revalidate new slugs (not just changed ones).
  2. For /learn/zh/cli-generator/...: should the language switcher suppress zh links on pages whose zh translation doesn't exist? Right now it's emitting links to EN-fallback URLs that 404 on prod and that's what's tripping the link checker.

Happy to open a follow-up PR with a cli-generator ZH translation (or a temporary lychee exclusion scoped to these three URLs) if either of those is the preferred path.

@github-actions
Copy link
Copy Markdown
Contributor Author

@devin-ai-integration
Copy link
Copy Markdown
Contributor

Verified on the CI preview deploy: all three URLs return 200 there, including the cli-generator one (the platform falls back to the EN page when no zh translation exists).

$ curl -sIw "%{http_code}\n" -o /dev/null https://fern-preview-devin-fix-broken-links.docs.buildwithfern.com/learn/zh/cli-generator/get-started/overview
200
$ curl -sIw "%{http_code}\n" -o /dev/null https://fern-preview-devin-fix-broken-links.docs.buildwithfern.com/learn/zh/docs/preview-publish/multi-source-docs
200
$ curl -sIw "%{http_code}\n" -o /dev/null https://fern-preview-devin-fix-broken-links.docs.buildwithfern.com/learn/zh/docs/seo/robots-txt
200

Production still 404s on those same paths, which matches the stale Full Route Cache theory above — a fresh deploy serves them correctly. The PR diff (scaffold deletion only) is the safe minimum here; the underlying fix needs to happen at the platform level (cache invalidation for newly-added slugs).

@devin-ai-integration devin-ai-integration Bot marked this pull request as ready for review May 11, 2026 11:40
@devalog devalog closed this May 11, 2026
@github-actions github-actions Bot deleted the devin/fix-broken-links branch May 12, 2026 10:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant