Troubleshooting
Authentication errors
Symptom: API request failed (HTTP 401) or API request failed (HTTP 403) in workflow logs.
Causes:
- Token has expired or been revoked
- Token lacks the required scopes
- Wrong token for the provider (e.g., DD_API_KEY in the DD_APP_KEY field)
Fix:
- Verify the token is set in your environment (GitHub Actions secrets or .env for CLI)
- Check that the token has the required permissions:
  - GitHub: fine-grained PAT (recommended) with Contents read/write, Issues read/write, and Pull requests read/write; or a classic token with the repo scope
  - Linear: API key with write access
  - Sentry: auth token with project:read, event:read, issue:read scopes
  - Datadog: both DD_API_KEY and DD_APP_KEY are required (they are different keys)
- Regenerate the token if it has expired
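To tell a 401 from a 403, it can help to call the provider directly with the same token. The sketch below assumes curl is installed and uses the GitHub REST endpoint https://api.github.com/user as the probe. The helper names explain_status and github_token_status are hypothetical and not part of SWEny, and the mapping from status code to cause follows general HTTP conventions, so individual providers may behave differently.

```shell
# Map an HTTP status code to the most likely cause listed above.
# (Assumption: the provider follows common HTTP conventions.)
explain_status() {
  case "$1" in
    200) echo "token accepted" ;;
    401) echo "token missing, expired, or revoked" ;;
    403) echo "token valid but lacks required scopes or permissions" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Print the HTTP status GitHub returns for the token passed as $1.
# -s hides progress, -o /dev/null discards the body, -w prints the status code.
github_token_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $1" \
    https://api.github.com/user
}

# Usage (requires network access):
#   explain_status "$(github_token_status "$GITHUB_TOKEN")"
```

A 200 here only shows that the token is accepted; it does not confirm that every permission listed above has been granted.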
“Missing required config” error
Symptom: Missing required config: github.GITHUB_TOKEN (env: GITHUB_TOKEN) at startup.
Cause: The workflow references a skill whose required environment variables are not set.
Fix:
- Run sweny check (CLI) to see which skills are configured and which are missing credentials
- Set the missing environment variables in your .env file or GitHub Actions secrets
- If you don’t need a particular skill, use a workflow that doesn’t reference it
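A quick pre-flight check in the shell can show which variables are actually visible before SWEny starts. This is a minimal sketch: missing_vars is a hypothetical helper, and the variable names are the ones mentioned on this page. It only sees variables set in the current shell, so values that live solely in .env will be reported as missing unless you load that file first.

```shell
# Print the name of every listed environment variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    # Only pass trusted variable names, since eval expands them.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Names taken from this page; adjust to the skills your workflow references.
missing_vars GITHUB_TOKEN DD_API_KEY DD_APP_KEY
```

No output means every listed variable is set to a non-empty value.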
Workflow validation failures
Symptom: Entry node "X" not found or Edge references unknown node: "Y".
Cause: The workflow YAML has a structural error — a typo in a node ID, an edge pointing to a non-existent node, or a missing entry field.
Fix:
    sweny workflow validate my-workflow.yaml
This reports all structural errors: unknown node references, missing entry node, and edges pointing to non-existent nodes. Fix the YAML and re-validate.
Node timeouts
Symptom: A node runs for a long time and eventually fails, or the GitHub Actions job hits its timeout.
Cause: The Claude agent is taking too many turns on a complex task, or the task itself is too broad.
Fix:
- Increase turn limits:
  - max-investigate-turns (default: 50) for investigation nodes
  - max-implement-turns (default: 30) for implementation nodes
- Narrow the scope:
  - Use service-filter to limit which services are analyzed
  - Use time-range to reduce the log window (e.g., 4h instead of 24h)
  - Use investigation-depth: quick for faster but less thorough analysis
- Increase the GitHub Actions job timeout:
    timeout-minutes: 90
Rate limiting
Symptom: HTTP 429 errors from observability or issue tracker APIs.
Cause: The Claude agent is making too many API calls in a short period. Observability APIs (Datadog, Sentry, New Relic) often have stricter rate limits than other services.
Fix:
- Reduce investigation-depth from thorough to standard or quick
- Narrow time-range to reduce the volume of data queried
- Use service-filter to limit the scope
- For Datadog, verify your API key tier supports the request volume
MCP server connection failures
Symptom: MCP server failed to start or Connection refused errors in logs.
Causes:
- stdio servers — the npm package failed to install or the command is not found
- HTTP servers — the endpoint is unreachable or the authentication headers are wrong
Fix for stdio servers:
- Test the server locally: npx -y @modelcontextprotocol/server-github@latest
- Check that the GitHub Actions runner has network access to npm
- Verify the env values are correct (e.g., GITHUB_PERSONAL_ACCESS_TOKEN for the GitHub MCP server, not GITHUB_TOKEN)
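Before testing a stdio server, it is worth confirming that the commands it depends on are on PATH, since a missing binary produces the same "command is not found" failure. A minimal sketch follows; require_cmd is a hypothetical helper, and the assumption is that the server is launched through npx as in the command above.

```shell
# Succeed if the command is on PATH; otherwise report it on stderr and fail.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing command: $1" >&2
  return 1
}

# Servers started via npx need both node and npx to be available.
if require_cmd node && require_cmd npx; then
  echo "node and npx found"
fi
```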
Fix for HTTP servers:
- Verify the URL is correct (some servers require a trailing path like /mcp)
- Check authentication headers; each server has its own format:
  - Linear: Authorization: Bearer <token>
  - Datadog: DD_API_KEY and DD_APPLICATION_KEY as headers
  - PagerDuty: Authorization: Token token=<key>
  - New Relic: Api-Key: <key> (not Authorization)
- Check if the service has a status page for outages
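The header formats above can be captured in a small helper so that a manual probe uses exactly the format each service expects. This is a sketch under stated assumptions: auth_headers and probe are hypothetical helpers, the service labels (linear, datadog, pagerduty, newrelic) are invented for this example, and the URL in the usage comment is a placeholder.

```shell
# Print the authentication header line(s) for a service, one per line.
# Header formats are the ones listed above.
auth_headers() {
  service=$1
  key=$2
  app_key=${3:-}   # only used for Datadog, which needs two keys
  case "$service" in
    linear)    printf 'Authorization: Bearer %s\n' "$key" ;;
    datadog)   printf 'DD_API_KEY: %s\nDD_APPLICATION_KEY: %s\n' "$key" "$app_key" ;;
    pagerduty) printf 'Authorization: Token token=%s\n' "$key" ;;
    newrelic)  printf 'Api-Key: %s\n' "$key" ;;
    *)         echo "unknown service: $service" >&2; return 1 ;;
  esac
}

# Send a request to URL $1 with the single header $2 and print only the
# HTTP status code. Services that need two headers require a second -H.
probe() {
  curl -s -o /dev/null -w '%{http_code}' -H "$2" "$1"
}

# Usage (requires network access; URL and TOKEN are placeholders):
#   probe "https://example.com/mcp" "$(auth_headers linear "$TOKEN")"
```

A 401 or 403 from the probe points at the headers, while a connection error points at the URL. Other status codes are harder to interpret, because a plain GET may not be a request the server accepts.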
GitHub Actions permissions
Symptom: Resource not accessible by integration or 403 when creating PRs or issues.
Fix: Add the required permissions to your workflow file:
    permissions:
      contents: write        # Create branches and push commits
      pull-requests: write   # Open pull requests
      issues: write          # Create issues (when using github-issues tracker)
For cross-repo dispatch (when the fix belongs to a different repository), use a bot-token with repo and actions scopes instead of the default GITHUB_TOKEN:
    - uses: swenyai/triage@v1
      with:
        bot-token: ${{ secrets.BOT_TOKEN }}
Dry run produces no output
Symptom: sweny triage --dry-run runs but you see no investigation report.
Causes:
- No errors found in the configured time range
- The observability provider credentials are valid but the service/index/log group has no data
- severity-focus is set to errors but the logs only contain warnings
Fix:
- Widen the time range: --time-range 7d
- Change severity focus: --severity-focus all
- Remove the service filter: --service-filter '*'
- Check the raw API response by running sweny check to verify the provider can connect and return data
“No configured skills” warning
Symptom: Workflow references unregistered skill: "X" warnings at startup, followed by a node failing.
Cause: The environment variables for a skill are not set, so the skill was not registered.
Fix:
Run sweny check to see which skills are detected:
    sweny check
This lists every built-in skill and shows which ones have their credentials configured. Set the missing environment variables for the skills your workflow needs.
Stale results after code changes
Symptom: You changed your workflow or code, but SWEny produces the same results as before.
Cause: The executor caches completed node results. If the workflow definition or input hasn’t changed enough to invalidate the cache, previous results are reused.
Fix:
    sweny triage --no-cache   # Disable cache for this run
    rm -rf .sweny/cache       # Clear the cache entirely
The default cache TTL is 24 hours (86400 seconds). For iterative development where you want fresh results every run, use --no-cache.
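To see whether anything in the cache is still young enough to be reused, you can list entries modified within the 24-hour TTL. This is only a sketch: it assumes that cache entries are plain files under .sweny/cache and that a file's modification time marks when the result was stored, neither of which is confirmed here. recent_cache_files is a hypothetical helper.

```shell
# List cache files modified within the last 24 hours (1440 minutes).
# Assumption: entries are regular files and mtime reflects when they were cached.
recent_cache_files() {
  dir=${1:-.sweny/cache}
  [ -d "$dir" ] || return 0   # no cache directory means nothing to list
  find "$dir" -type f -mmin -1440
}

recent_cache_files
```

If this prints nothing, a stale cache is unlikely to be the cause under those assumptions.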
Claude produces unexpected results
Symptom: The investigation is shallow, the implementation misses the point, or the PR changes the wrong files.
Fix:
- Add additional-instructions to guide Claude:
    additional-instructions: "Focus on the payment webhook handler. The error is likely in src/webhooks/stripe.ts."
- Use investigation-depth: thorough for deeper analysis
- Check that the correct skills are available at the relevant node; Claude can only use tools from the node’s declared skills
- Review the execution events in the GitHub Actions log or Studio to see what tools Claude called and what data it received
Still stuck?
- GitHub Issues: report bugs and request features
- GitHub Discussions: ask questions and share workflows
- Marketplace: browse community workflows for inspiration