One of the most frustrating parts of working with CI/CD pipelines is figuring out why something failed.
If you've worked with pipelines long enough, you've probably experienced this:
- A pipeline fails
- You open the logs
- You're scrolling through hundreds of lines of output
- Somewhere in there is a single line that actually caused the problem
Today we're going to build something that helps solve that problem.
Instead of manually digging through logs, we're going to send the pipeline failure output to AWS Bedrock, let AI analyze the results, and then automatically post a summary of the failure inside GitHub Actions.
Even better, we're going to intentionally break the pipeline so we can see how the system behaves when something goes wrong.
Let's walk through the setup.
Project Overview
In this project we are building a pipeline that:
- Runs a Python test suite
- Fails intentionally
- Captures the failure output
- Sends the failure logs to AWS Bedrock
- Uses AI to generate a summary of what happened
- Posts the AI analysis directly into the GitHub Actions Step Summary
The result is a pipeline that doesn't just fail — it explains why it failed.
Prerequisites
To follow along you will need:
- An AWS account
- A GitHub repository
- AWS Bedrock access
- GitHub Actions enabled
- OIDC integration between GitHub and AWS
In the previous project we created the infrastructure needed to allow GitHub Actions to securely talk to AWS using OIDC. That setup includes:
- Terraform configuration
- IAM roles
- IAM policies
- OIDC provider configuration
I'll link the previous setup in the repo if you want to recreate the full environment.
Project Structure
The repository used in this project is intentionally simple.
project-root/
│
├── app/
│   └── sqrt.py
│
├── tests/
│   └── test_sqrt.py
│
├── scripts/
│   ├── analyze_failure.py
│   └── print_markdown_summary.py
│
├── requirements.txt
└── .github/workflows/ci.yml
- A Python file
- A failing test
- A GitHub Actions workflow
- A script that sends logs to Bedrock
Creating a Pipeline That Fails
Our Python code simply calculates the square root of a number.
Example:
import math

def calculate_sqrt(x):
    return math.sqrt(x)
The test suite intentionally contains a failure.
def test_sqrt_failure():
    assert calculate_sqrt(16) == 5
Since the correct value is 4, the test fails — which triggers our pipeline failure.
This is exactly what we want.
Generating the Workflow with Copilot
To speed up development, I used Copilot Chat inside VS Code to generate the workflow and scripts.
Instead of manually writing every file, we can prompt Copilot to generate:
- the GitHub Actions workflow
- the Bedrock analysis script
- the markdown summary formatter
For example, the first prompt asks Copilot to generate a GitHub Actions workflow that:
- checks out the repository
- installs Python dependencies
- runs pytest
- captures failure logs
- sends logs to AWS Bedrock
- prints an AI summary in GitHub Actions
The generated YAML workflow includes steps like:
- Checkout code
- Setup Python
- Install dependencies
- Run pytest
- Generate failure context JSON
- Configure AWS credentials
- Run AI analysis
- Print summary
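The steps above could be sketched roughly as follows. Treat this as an outline, not the exact workflow: the role name, region, file names, and script arguments are assumptions you would adjust for your own repository.

```yaml
name: CI with AI failure analysis

on: [push]

permissions:
  id-token: write   # required for OIDC federation with AWS
  contents: read

jobs:
  test-and-analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - run: pip install -r requirements.txt

      # Capture pytest output but let the job continue so the analysis steps run.
      - name: Run pytest
        id: tests
        continue-on-error: true
        run: |
          set -o pipefail
          pytest 2>&1 | tee pytest_output.txt

      # Credentials must be configured BEFORE any Bedrock call (see Troubleshooting).
      - name: Configure AWS credentials
        if: steps.tests.outcome == 'failure'
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{ secrets.AWS_ACCOUNT_ID }}:role/github-actions-bedrock
          aws-region: us-east-1

      - name: Run AI analysis
        if: steps.tests.outcome == 'failure'
        run: python scripts/analyze_failure.py pytest_output.txt > analysis.json

      - name: Print summary
        if: steps.tests.outcome == 'failure'
        run: python scripts/print_markdown_summary.py analysis.json
```

Using `continue-on-error` with `steps.tests.outcome == 'failure'` lets the later steps run only when the tests actually failed, without marking the whole job green.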
Sending Failure Logs to AWS Bedrock
The key part of the project is the analyze_failure.py script.
This script:
- Reads the pipeline failure output
- Sends the logs to Bedrock
- Asks the model to analyze the failure
- Returns structured JSON describing the issue
Example output:
{
  "failure_type": "test_failure",
  "severity": "low",
  "root_cause": "The test expected sqrt(16) to equal 5, but the correct result is 4.",
  "confidence": 0.95,
  "explanation": "The failing test assertion is incorrect.",
  "recommended_fix": "Update the test to assert calculate_sqrt(16) == 4."
}
This gives developers an instant explanation of the failure without digging through logs.
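A minimal sketch of what analyze_failure.py could look like, assuming boto3 and the Claude messages format on Bedrock. The model ID and the expectation that the model answers with plain JSON are assumptions; in practice you may need to harden the parsing.

```python
import json
import sys

# Assumed model ID; use whichever Claude model is enabled in your account.
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

def build_prompt(log_text):
    """Ask the model for a structured JSON verdict on the failure."""
    return (
        "Analyze this CI failure log. Respond with only a JSON object with the "
        "keys failure_type, severity, root_cause, confidence, explanation, "
        "and recommended_fix.\n\nLog:\n" + log_text
    )

def build_request_body(log_text):
    """Request body in the Claude-on-Bedrock messages format."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": build_prompt(log_text)}],
    })

def analyze(log_text):
    import boto3  # needs the credentials configured earlier in the workflow
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(modelId=MODEL_ID, body=build_request_body(log_text))
    payload = json.loads(response["body"].read())
    # Claude returns a list of content blocks; the JSON verdict is in the text.
    return json.loads(payload["content"][0]["text"])

if __name__ == "__main__":
    with open(sys.argv[1]) as f:  # path to the captured pytest output
        print(json.dumps(analyze(f.read()), indent=2))
```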
Formatting the GitHub Actions Summary
To make the results easier to read, we generate a markdown summary.
A second script reads the AI output JSON and converts it into a nicely formatted report for the GitHub Actions Step Summary.
Example summary:
AI Failure Analysis

Failure Type: test_failure
Severity: low
Root Cause: The test assertion expected the wrong result.
Recommended Fix: Update the expected value from 5 to 4.
Now when the pipeline fails, developers immediately see a clear explanation.
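The formatter itself can be sketched in a few lines. GitHub exposes the step-summary file path through the GITHUB_STEP_SUMMARY environment variable; the input file name here is an assumption.

```python
import json
import os
import sys

def render_markdown(analysis):
    """Turn the AI analysis JSON into a markdown report."""
    return "\n".join([
        "## AI Failure Analysis",
        f"- **Failure Type:** {analysis['failure_type']}",
        f"- **Severity:** {analysis['severity']}",
        f"- **Root Cause:** {analysis['root_cause']}",
        f"- **Recommended Fix:** {analysis['recommended_fix']}",
    ])

if __name__ == "__main__":
    with open(sys.argv[1]) as f:  # e.g. analysis.json from the previous step
        analysis = json.load(f)
    # Anything appended to this file appears in the Actions Step Summary.
    with open(os.environ["GITHUB_STEP_SUMMARY"], "a") as summary:
        summary.write(render_markdown(analysis) + "\n")
```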
Troubleshooting the Workflow
During development we ran into a few common issues:
Missing GitHub Secrets
Initially the pipeline failed because the AWS account ID secret was not configured.
GitHub repository secrets must include:
AWS_ACCOUNT_ID
Bedrock Model Parameters
The Bedrock API requires specific parameters depending on the model being used. In this case we had to adjust the request format to match the Claude model requirements.
Workflow Ordering
Another issue occurred when the AWS credentials step ran after the AI analysis script. This caused the script to try to call Bedrock before authentication was configured.
Moving the AWS credentials configuration earlier in the workflow resolved the issue.
Script Path Reference
The final bug was simply a path issue when referencing the script directory.
Once the correct path was used, the workflow executed successfully.
Final Result
After fixing the issues, the pipeline now:
- Runs tests
- Fails intentionally
- Captures the logs
- Sends them to Bedrock
- Generates an AI summary
- Displays the explanation in GitHub Actions
Instead of digging through logs, developers immediately see what went wrong and how to fix it.
Why This Matters
As AI becomes more integrated into development workflows, this type of automation can dramatically improve productivity.
Potential future improvements could include:
- Automatic pull request comments
- Suggested code fixes
- Failure classification across builds
- Historical failure analysis
AI won't replace debugging entirely, but it can make understanding failures much faster.
Repository
You can find the full project and the prompts used here:
GitHub Repo:
https://github.com/sparlor/ai-pipeline-doctor/tree/master
Final Thoughts
This project shows how combining GitHub Actions, AWS Bedrock, and a small amount of automation can make CI/CD pipelines significantly more developer friendly.
Instead of just failing, your pipeline can now explain the failure for you.
Thanks for following along — and I'll see you in the next post.
