Rate limit HTTP requests to Azure Container Registry APIs #2137
Merged
… errors LifecycleMetadataService.IsDigestAnnotatedForEolAsync catches all exceptions and returns null, so a transient registry failure (e.g. an ACR HTTP 429 rate-limit error) is silently treated as "not annotated". This false negative lets already-EOL-annotated digests leak back into the annotation list and fail the publish stage with a conflicting EOL date. Make OciArtifactType public so the test can reference it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the catch-all in IsDigestAnnotatedForEolAsync that logged and returned null for any exception. A transient registry failure (e.g. an ACR HTTP 429 rate-limit error) was being treated as "not annotated", causing already-EOL-annotated digests to leak into the annotation list and fail the publish stage with a conflicting EOL date. The legitimate "not annotated" case is still represented by a null return when no lifecycle referrer is present; genuine registry failures now propagate so callers fail loudly instead of producing a false negative. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the ad-hoc IHttpClientProvider/HttpClientProvider abstraction with the standard Microsoft.Extensions IHttpClientFactory. Consumers now inject IHttpClientFactory and call CreateClient(). The custom logging DelegatingHandler is dropped in favor of the factory's built-in HTTP request/response logging. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mthalman
reviewed
Jun 8, 2026
mthalman
left a comment
Member
Note that ContainerRegistryClient and ContainerRegistryContentClient are not getting the benefit of these changes to support rate limiting. Consider updating their respective factory classes to configure an options instance for the client.
var options = new ContainerRegistryClientOptions
{
    Transport = new HttpClientTransport(httpClientFactory.CreateClient())
};
mthalman
approved these changes
Jun 9, 2026
lbussell
added a commit
that referenced
this pull request
Jun 9, 2026
…eouts (#2139)
Related: #2137
Spurious timeouts were occurring during high-volume ACR operations (e.g. ORAS referrer lookups). Time spent waiting in the client-side rate limiter's queue was being charged against the request timeout.
- **Split the resilience pipeline into three levels:** `Retry → RateLimiter → per-attempt Timeout`. The limiter now sits *inside* retry (so each attempt is counted) but *outside* the per-attempt timeout (so queue-wait isn't timed). Retries no longer bypass the limiter.
- **Bound concurrency** for all ACR operations. This keeps the rate limiter from becoming overloaded and throttling too hard. This increases throughput from ~1.4 to ~3.5 req/s in my local testing.
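The layering described in that commit can be sketched with BCL primitives alone. This is an illustration, not the code from #2139: `LayeredPipeline` and its parameters are hypothetical names, and a `SemaphoreSlim` stands in for the rate limiter (it only bounds concurrency, which is a simplification). The point it shows is the ordering: the retry loop is outermost, every attempt must acquire a permit, and the timeout clock starts only after the permit is granted.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LayeredPipeline
{
    // Hypothetical helper: Retry -> limiter -> per-attempt timeout.
    public static async Task<T> ExecuteAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        SemaphoreSlim limiter,
        TimeSpan attemptTimeout,
        int maxAttempts,
        CancellationToken cancellationToken = default)
    {
        for (int attempt = 1; ; attempt++)
        {
            // The queue wait happens outside the timeout, so time spent
            // waiting for a permit is not charged against the attempt.
            await limiter.WaitAsync(cancellationToken);
            try
            {
                // The timeout starts only once the permit has been acquired.
                using var timeoutCts =
                    CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
                timeoutCts.CancelAfter(attemptTimeout);
                return await operation(timeoutCts.Token);
            }
            catch (Exception) when (attempt < maxAttempts
                                    && !cancellationToken.IsCancellationRequested)
            {
                // Swallow the failure and loop; the next attempt goes back
                // through the limiter instead of bypassing it.
            }
            finally
            {
                limiter.Release();
            }
        }
    }
}
```

A real pipeline would retry only on transient failures and add backoff between attempts; the catch-all filter here is only to keep the sketch short.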
Ever since #2050, checking for image lifecycle annotations has been faster. As a result, checking for existing EOL annotations can exceed ACR's per-identity rate limit.
The original LifecycleAnnotationService assumed that any error while checking for annotations/referrers meant that the annotation does not exist. If an HTTP 429 error occurred while checking for lifecycle annotations, it would report that the annotation doesn't exist. This caused issues in the EOL annotation step of the golang image publishing pipeline.
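To make the distinction concrete, here is a minimal sketch of the intended behavior, not the actual service code. `Referrer`, `EolLookup`, the delegate parameter, and the artifact type string are all hypothetical. A null return means only "no lifecycle referrer was found"; any exception from the lookup (such as a failure caused by an HTTP 429) propagates to the caller because there is no catch-all.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a referrer entry returned by the registry.
public record Referrer(string ArtifactType, string EolDate);

public static class EolLookup
{
    // Assumed artifact type, for illustration only.
    public const string LifecycleArtifactType = "application/vnd.example.lifecycle";

    public static async Task<Referrer?> FindEolAnnotationAsync(
        Func<string, Task<IReadOnlyList<Referrer>>> getReferrersAsync,
        string digest)
    {
        // No try/catch here: a registry failure surfaces as an exception
        // rather than being reported as "not annotated".
        IReadOnlyList<Referrer> referrers = await getReferrersAsync(digest);

        // Null is returned only when the lookup succeeded and found nothing.
        return referrers.FirstOrDefault(r => r.ArtifactType == LifecycleArtifactType);
    }
}
```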
This PR adds:
For the rate limiting, I decided to migrate away from our ad-hoc `HttpClientProvider`. It has existed since before we adopted `Microsoft.Extensions.Hosting` in #1860. `M.E.Hosting` provides an `IHttpClientFactory` by default, and it already includes logging as well. Now, we configure `HttpClient` settings and rate limiting through standard means.