Skip to content

SREP-3738: fix misleading error messages in dynatrace gather-logs#877

Open
rolandmkunkel wants to merge 2 commits intoopenshift:masterfrom
rolandmkunkel:SREP-3738
Open

SREP-3738: fix misleading error messages in dynatrace gather-logs#877
rolandmkunkel wants to merge 2 commits intoopenshift:masterfrom
rolandmkunkel:SREP-3738

Conversation

@rolandmkunkel
Copy link
Copy Markdown
Contributor

@rolandmkunkel rolandmkunkel commented Apr 14, 2026

Summary

  • Fix log.Print("failed to get request token", err) calls that concatenate
    the message and error without a separator, producing garbled output like
    "failed to get request tokenrequest failed: 401 Unauthorized". Changed to
    log.Printf with proper formatting.
  • Improve the token provider error message to hint at a missing/misconfigured
    vault CLI

Ref: SREP-3738

Test plan

  • Run osdctl dt gather-logs without vault installed and verify the
    error message mentions the vault CLI
  • Run osdctl hcp must-gather and confirm DT query errors are logged
    with proper formatting (space/colon between message and error)

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced error messages for storage authentication failures to include specific guidance on Dynatrace/vault CLI installation and configuration.
    • Improved error logging for request token failures with better formatting and complete error details, providing clearer diagnostics across multiple operations.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 14, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 14, 2026

@rolandmkunkel: This pull request references SREP-3738 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

  • Fix log.Print("failed to get request token", err) calls that concatenate
    the message and error without a separator, producing garbled output like
    "failed to get request tokenrequest failed: 401 Unauthorized". Changed to
    log.Printf with proper formatting.
  • Improve the token provider error message to hint at a missing/misconfigured
    vault CLI

Ref: SREP-3738

Test plan

  • Run osdctl dt gather-logs without vault installed and verify the
    error message mentions the vault CLI
  • Run osdctl hcp must-gather and confirm DT query errors are logged
    with proper formatting (space/colon between message and error)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 14, 2026

Walkthrough

Single file modified with error message clarity improvements and logging format consistency fixes. Error messages for token provider failures now include Dynatrace/vault CLI installation context, and request token failure logs updated to use formatted log.Printf with error details.

Changes

Cohort / File(s) Summary
Error Message and Logging Improvements
cmd/dynatrace/hcpGatherLogsCmd.go
Updated error message when storage token provider fails to reference Dynatrace/vault CLI installation/configuration. Changed multiple error logs for request token failures from log.Print() to log.Printf() with formatted error values for better clarity and consistency.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

Check name Status Explanation Resolution
Stable And Deterministic Test Names ❌ Error Unable to execute shell command to access ./cmd/dynatrace/hcpGatherLogsCmd_test.go Provide the file content or check that the file path is correct
✅ Passed checks (9 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'SREP-3738: fix misleading error messages in dynatrace gather-logs' directly summarizes the main change: fixing error message formatting and clarity in the dynatrace gather-logs command, which matches the changeset.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Test Structure And Quality ✅ Passed The custom check is scoped to Ginkgo test code, but this PR contains no Ginkgo test changes, only implementation file modifications.
Microshift Test Compatibility ✅ Passed The MicroShift Test Compatibility check is not applicable to this PR as it only modifies a Cobra CLI command file and does not add any Ginkgo e2e tests.
Single Node Openshift (Sno) Test Compatibility ✅ Passed The SNO Test Compatibility check is not applicable to this pull request; the PR modifies only CLI command implementation files containing business logic, not Ginkgo e2e tests.
Topology-Aware Scheduling Compatibility ✅ Passed PR modifies only CLI command utility file with error message and logging format improvements; no deployment manifests, operators, controllers, or Kubernetes scheduling constraints affected.
Ote Binary Stdout Contract ✅ Passed The changes do not violate the OTE Binary Stdout Contract. log.Printf() calls are in command handler methods, not process-level code. Go's log package writes to os.Stderr by default, not os.Stdout.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed The PR modifies cmd/dynatrace/hcpGatherLogsCmd.go, a command implementation file with no new Ginkgo e2e tests.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 14, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rolandmkunkel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 14, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 14, 2026

@rolandmkunkel: This pull request references SREP-3738 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

  • Fix log.Print("failed to get request token", err) calls that concatenate
    the message and error without a separator, producing garbled output like
    "failed to get request tokenrequest failed: 401 Unauthorized". Changed to
    log.Printf with proper formatting.
  • Improve the token provider error message to hint at a missing/misconfigured
    vault CLI

Ref: SREP-3738

Test plan

  • Run osdctl dt gather-logs without vault installed and verify the
    error message mentions the vault CLI
  • Run osdctl hcp must-gather and confirm DT query errors are logged
    with proper formatting (space/colon between message and error)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/dynatrace/hcpGatherLogsCmd.go (1)

286-292: ⚠️ Potential issue | 🟠 Major

Return early when restarted-pod request token retrieval fails.

After Line 288, execution continues into getLogs even though request token creation failed. This hides the real failure and can produce misleading follow-up errors.

🔧 Proposed fix
 	podLogsRequestToken, err := getDTQueryExecution(DTURL, accessToken, restartedPodLogsQuery.finalQuery)
 	if err != nil {
 		log.Printf("failed to get request token: %v", err)
-
+		_ = f.Close()
+		return fmt.Errorf("failed to get request token for restarted pod logs: %w", err)
 	}
 	err = getLogs(DTURL, accessToken, podLogsRequestToken, f)
 	f.Close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/dynatrace/hcpGatherLogsCmd.go` around lines 286 - 292, The code calls
getLogs even when getDTQueryExecution failed: check the error returned from
getDTQueryExecution(DTURL, accessToken, restartedPodLogsQuery.finalQuery) and
return early (after closing f) if err != nil so you don't proceed with an
invalid podLogsRequestToken; specifically, inside the error branch for
getDTQueryExecution log the error, ensure f.Close() is invoked, and return from
the enclosing function instead of falling through to getLogs(DTURL, accessToken,
podLogsRequestToken, f).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/dynatrace/hcpGatherLogsCmd.go`:
- Around line 180-184: The loop opens a file handle 'f' then calls
getDTQueryExecution (assigning eventsRequestToken) and on error uses continue
without closing 'f', leaking file descriptors; fix by ensuring 'f.Close()' is
called before each continue on the error paths around the getDTQueryExecution
call (and the similar path near lines 240-244), or even better, arrange to close
the file immediately after opening (e.g., defer f.Close() inside a scoped
function or ensure explicit f.Close() before every continue) so
getDTQueryExecution and the eventsRequestToken error path do not leak the file
handle.

---

Outside diff comments:
In `@cmd/dynatrace/hcpGatherLogsCmd.go`:
- Around line 286-292: The code calls getLogs even when getDTQueryExecution
failed: check the error returned from getDTQueryExecution(DTURL, accessToken,
restartedPodLogsQuery.finalQuery) and return early (after closing f) if err !=
nil so you don't proceed with an invalid podLogsRequestToken; specifically,
inside the error branch for getDTQueryExecution log the error, ensure f.Close()
is invoked, and return from the enclosing function instead of falling through to
getLogs(DTURL, accessToken, podLogsRequestToken, f).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f77a2560-b9a7-4743-b184-78353fa3e848

📥 Commits

Reviewing files that changed from the base of the PR and between 4e14680 and d8ff584.

📒 Files selected for processing (1)
  • cmd/dynatrace/hcpGatherLogsCmd.go

Comment on lines 180 to 184
eventsRequestToken, err := getDTQueryExecution(DTURL, accessToken, eventQuery.finalQuery)
if err != nil {
log.Print("failed to get request token", err)
log.Printf("failed to get request token: %v", err)
continue
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Close f before continue when request token creation fails.

On Line 183 and Line 243 paths, the loop continues without closing the opened file handle. Repeated failures can exhaust file descriptors.

🔧 Proposed fix
@@
 		eventsRequestToken, err := getDTQueryExecution(DTURL, accessToken, eventQuery.finalQuery)
 		if err != nil {
+			_ = f.Close()
 			log.Printf("failed to get request token: %v", err)
 			continue
 		}
 		err = getEvents(DTURL, accessToken, eventsRequestToken, f)
 		_ = f.Close()
@@
 		podLogsRequestToken, err := getDTQueryExecution(DTURL, accessToken, podLogsQuery.finalQuery)
 		if err != nil {
+			_ = f.Close()
 			log.Printf("failed to get request token: %v", err)
 			continue
 		}
 		err = getLogs(DTURL, accessToken, podLogsRequestToken, f)
 		_ = f.Close()

Also applies to: 240-244

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/dynatrace/hcpGatherLogsCmd.go` around lines 180 - 184, The loop opens a
file handle 'f' then calls getDTQueryExecution (assigning eventsRequestToken)
and on error uses continue without closing 'f', leaking file descriptors; fix by
ensuring 'f.Close()' is called before each continue on the error paths around
the getDTQueryExecution call (and the similar path near lines 240-244), or even
better, arrange to close the file immediately after opening (e.g., defer
f.Close() inside a scoped function or ensure explicit f.Close() before every
continue) so getDTQueryExecution and the eventsRequestToken error path do not
leak the file handle.

@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 15, 2026

@rolandmkunkel: This pull request references SREP-3738 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary

  • Fix log.Print("failed to get request token", err) calls that concatenate
    the message and error without a separator, producing garbled output like
    "failed to get request tokenrequest failed: 401 Unauthorized". Changed to
    log.Printf with proper formatting.
  • Improve the token provider error message to hint at a missing/misconfigured
    vault CLI

Ref: SREP-3738

Test plan

  • Run osdctl dt gather-logs without vault installed and verify the
    error message mentions the vault CLI
  • Run osdctl hcp must-gather and confirm DT query errors are logged
    with proper formatting (space/colon between message and error)

Summary by CodeRabbit

  • Bug Fixes
  • Enhanced error messages for storage authentication failures to include specific guidance on Dynatrace/vault CLI installation and configuration.
  • Improved error logging for request token failures with better formatting and complete error details, providing clearer diagnostics across multiple operations.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@rolandmkunkel
Copy link
Copy Markdown
Contributor Author

/retest

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Apr 15, 2026

@rolandmkunkel: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


accessToken, err := tokenProvider.Token()
if err != nil {
f.Close()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this is gathering a bunch of data after opening the file without really needing the file - might be easier to just gather all data we need first, once we know we have everything open the file and write.
Might also make the codeflow easier to just push all the file-writing into it's own function so we can just use 'defer f.Close()' instead of sprinkling f.Close() everywhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants