Building a Simple GitHub PR Review Agent with ReAct
Imagine having an intelligent assistant that could automatically review your GitHub pull requests, providing thoughtful feedback, detecting bugs, and ensuring code quality standards are met. In this post, we'll build an initial version of that - a simple but effective GitHub PR review agent using the ReAct pattern.
There are many agent frameworks and patterns out there, and most of them are reasonably simple under the hood, which is why we argue for building agents yourself, so they can be tuned for your specific tasks. The ReAct (Reasoning, Acting, Observation) pattern is a powerful approach to building AI agents that can reason through complex problems step-by-step. It's an iterative process where the agent:
- Reasons about the current state and goals
- Acts by selecting and executing a relevant tool or action
- Observes the results of the action
This approach leads to more transparent, reliable, and effective agents compared to agents that attempt to solve problems in a single step.
Note: All the code for this blog post is available in the react-agent/01-github-pr-reviewer directory of the repository. You can run and modify these examples to see the ReAct pattern in action.
Prerequisites
To follow along, you'll need:
- Python 3.9+
- A GitHub account and personal access token (optional for public repositories)
- Basic familiarity with Pydantic and async Python
- The Opper AI SDK (`pip install opperai`)
Understanding the ReAct Pattern
Before diving into code, let's understand why the ReAct pattern is so effective for building agents:
- Transparent reasoning: Each step in the agent's thought process is explicit
- Modularity: Tools and actions can be added, removed, or modified independently
- Error recovery: The agent can observe errors and try alternative approaches
- Trace-ability: Each step can be logged and analyzed for debugging
The ReAct pattern mirrors how humans solve problems - reasoning about the situation, taking an action, observing the result, and then continuing with this new information.
Core Architecture: The ReAct Loop
At the heart of our agent is the ReAct loop - a cycle of reasoning, action, and observation. Here's a simplified version of the core loop:
```python
# ReAct loop
while current_step < self.max_steps:
    # Step 1: REASONING - Analyze the current state
    reasoning = await self._react_reasoning(agent, context)

    # Step 2: ACTION SELECTION - Select the next action
    action = await self._react_action_selection(agent, context, reasoning)

    # If the action is to finish, we're done
    if action.action_type == "finish":
        return action.output or {}

    # Step 3: OBSERVATION - Execute the selected tool
    if action.action_type == "use_tool" and action.tool_name:
        tool_name = action.tool_name
        tool_params = action.tool_params or {}

        # Execute the tool
        result = await self.tools[tool_name](tool_params)
        observation = str(result)

        # Update context with observation
        context["last_observation"] = observation
        context["intermediate_results"][f"step_{current_step}"] = result

    # Advance to the next step
    current_step += 1
```
This loop represents the core of our agent's execution model, alternating between reasoning, selecting actions, and making observations. The full implementation can be found in agent_runner.py.
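To make the control flow concrete, here is a minimal, runnable skeleton of the same loop. It is a sketch, not the agent_runner.py implementation: the LLM-backed reasoning and action-selection steps are replaced with hard-coded stubs, and the `fetch` tool and its return value are invented for illustration.

```python
# Minimal, runnable sketch of a ReAct loop. The reasoning and action
# selection are stubbed so the control flow can be run without API keys.
import asyncio

async def run_react_loop(tools, max_steps=5):
    context = {"last_observation": None, "intermediate_results": {}}
    for step in range(max_steps):
        # REASONING (stubbed): do we already have an observation to work from?
        have_data = context["last_observation"] is not None
        # ACTION SELECTION (stubbed): finish once data is in hand, else call the tool
        if have_data:
            return {"summary": f"done after {step} tool call(s)"}
        # OBSERVATION: execute the selected tool and record the result
        result = await tools["fetch"]({})
        context["last_observation"] = str(result)
        context["intermediate_results"][f"step_{step}"] = result
    return {"error": "max steps reached"}

async def fake_fetch(params):
    # Stand-in for a real tool such as github_pr_tool
    return {"status": "success", "data": 42}

result = asyncio.run(run_react_loop({"fetch": fake_fetch}))
print(result)  # {'summary': 'done after 1 tool call(s)'}
```

In the real agent, the two stubbed decisions are LLM calls that return structured `AgentReasoning` and `AgentAction` objects, but the cycle of reason, act, observe is exactly this shape.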
Schema-Driven Design with Pydantic
A key principle in our agent implementation is using schemas to clearly define inputs and outputs. We use Pydantic models for this purpose:
```python
class AgentReasoning(BaseModel):
    """Model for agent's reasoning step output."""
    content: str = Field(..., description="The agent's reasoning about the current state")
    confidence: float = Field(..., description="Confidence in the reasoning, from 0.0 to 1.0")

class AgentAction(BaseModel):
    """Model for agent's action selection output."""
    action_type: str = Field(..., description="Type of action: 'use_tool' or 'finish'")
    tool_name: Optional[str] = Field(None, description="Name of the tool to use")
    tool_params: Optional[Dict[str, Any]] = Field(None, description="Parameters for the tool")
    output: Optional[Dict[str, Any]] = Field(None, description="Final output if finishing")

class AgentOutput(BaseModel):
    """Model for the final PR review output."""
    review_summary: str = Field(..., description="Summary of the PR changes")
    issues_found: List[str] = Field(default_factory=list, description="List of issues found")
    suggestions: List[str] = Field(default_factory=list, description="List of suggestions")
    overall_assessment: str = Field(..., description="Overall assessment of the PR")
```
These schemas serve multiple purposes:
- Validation: Ensure data meets our expectations
- Documentation: Self-document the interface for developers
- Structure for LLMs: Give the language model clear guidance on expected outputs
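The validation benefit is easy to see in isolation. The following sketch redefines `AgentAction` so it is self-contained, and validates one well-formed and one malformed payload; the owner, repo, and PR number values are made up for illustration:

```python
# Sketch: how a Pydantic model validates (or rejects) an LLM's structured output.
from typing import Any, Dict, Optional
from pydantic import BaseModel, Field, ValidationError

class AgentAction(BaseModel):
    """Model for agent's action selection output (redefined here to be self-contained)."""
    action_type: str = Field(..., description="Type of action: 'use_tool' or 'finish'")
    tool_name: Optional[str] = Field(None, description="Name of the tool to use")
    tool_params: Optional[Dict[str, Any]] = Field(None, description="Parameters for the tool")
    output: Optional[Dict[str, Any]] = Field(None, description="Final output if finishing")

# A well-formed response parses into a typed object
action = AgentAction(
    action_type="use_tool",
    tool_name="github_pr_tool",
    tool_params={"owner": "octocat", "repo": "hello-world", "pr_number": 1},
)

# A malformed response (missing the required action_type) raises ValidationError,
# which the agent loop could catch in order to retry the LLM call
try:
    AgentAction(tool_name="github_pr_tool")
    valid = True
except ValidationError:
    valid = False
```

Because invalid outputs fail loudly at the boundary, the agent loop never has to reason about half-formed actions.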
The GitHub PR Tool Implementation
Our agent needs a way to interact with GitHub. While we could create a generic MCP client that connects to a GitHub MCP server, here we implement a simple GitHubPRTool class that fetches PR information:
```python
@trace(name="github_pr_tool.execute")
async def execute(self, params: Dict[str, Any]) -> Dict[str, Any]:
    """Execute the GitHub PR tool."""
    try:
        # Get PR information
        pr_info = await self._get_pr_info(params["owner"], params["repo"], params["pr_number"])

        # Check if repository is private and we're not authenticated
        if pr_info.get("private", False) and "Authorization" not in self.headers:
            return {
                "error": "This is a private repository. A GitHub token is required for access.",
                "status": "error",
            }

        # Get PR files and diff
        files = await self._get_pr_files(params["owner"], params["repo"], params["pr_number"])
        diff = await self._get_pr_diff(params["owner"], params["repo"], params["pr_number"])

        # Return structured result
        return {
            "pr_title": pr_info["title"],
            "pr_author": pr_info["user"]["login"],
            "changed_files": [f["filename"] for f in files],
            "additions": pr_info["additions"],
            "deletions": pr_info["deletions"],
            "diff": self._truncate_diff(diff),
            "pr_description": pr_info["body"] or "",
            "pr_url": pr_info["html_url"],
            "repository_private": pr_info.get("private", False),
            "status": "success",
        }
    except Exception as e:
        logger.error(f"Error executing GitHub PR tool: {e}", exc_info=True)
        return {"error": f"Error retrieving PR information: {str(e)}", "status": "error"}
```
The tool handles fetching various pieces of PR information from the GitHub API and returns them in a structured format. For the complete implementation, see github_pr_tool.py.
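One detail worth noting is the `self._truncate_diff(diff)` call: large diffs can blow past the LLM's context window, so the tool caps them before returning. The real helper lives in github_pr_tool.py; the version below is a hedged sketch of one reasonable implementation, and the 20,000-character cutoff is an assumption rather than the repository's actual value:

```python
# Sketch of a diff-truncation helper. The max_chars default is an
# assumed value, not the one used in github_pr_tool.py.
def truncate_diff(diff: str, max_chars: int = 20_000) -> str:
    """Return the diff unchanged if it fits, otherwise cut it and mark the truncation."""
    if len(diff) <= max_chars:
        return diff
    # Keep the head of the diff, which usually contains the file headers,
    # and append an explicit marker so the LLM knows content was dropped
    return diff[:max_chars] + "\n... [diff truncated for length] ..."
```

Appending an explicit truncation marker matters: without it, the model may treat a cut-off diff as a complete one and review it with false confidence.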
Using Opper for LLM Calls
A critical part of our agent is the LLM-powered reasoning and decision making. We use the Opper SDK for making structured LLM calls:
```python
async def _react_reasoning(self, agent: Dict[str, Any], context: Dict[str, Any]) -> AgentReasoning:
    """Generate reasoning based on the current context."""
    reasoning_instructions = """
    You are in the REASONING phase of a ReAct (Reasoning-Acting-Observation) loop.

    In this phase, you should:
    1. Analyze the current state and context
    2. Think step-by-step about what you know and what you need to find out
    3. Consider what tools or actions might be helpful
    4. Determine your next steps

    Your reasoning should be thorough, logical, and clear.

    Additionally, provide a confidence score from 0.0 to 1.0 indicating how
    confident you are in your reasoning.
    """

    result, _ = await opper.call(
        name="agent_reasoning",
        instructions=reasoning_instructions,
        input={
            "agent_instructions": agent.get("instructions", ""),
            "context": context,
            "step_number": context.get("current_step", 0),
            "last_observation": context.get("last_observation", None),
        },
        output_type=AgentReasoning,
    )
    return result
```
```python
async def _react_action_selection(
    self, agent: Dict[str, Any], context: Dict[str, Any], reasoning: AgentReasoning
) -> AgentAction:
    """Select the next action based on reasoning."""
    # Get the list of available tools
    available_tools = list(self.tools.keys())

    action_instructions = """
    You are in the ACTION SELECTION phase of a ReAct (Reasoning-Acting-Observation) loop.
    Based on your prior reasoning, you must now decide on the next action to take.

    You have two options:
    1. Use a tool to gather more information or make progress:
       - action_type: "use_tool"
       - tool_name: Select from the available tools in the input
       - tool_params: Provide the necessary parameters for the tool
    2. Finish the task if you have enough information:
       - action_type: "finish"
       - output: Provide your final review with:
         - review_summary: A concise summary of the PR changes
         - issues_found: A list of issues or concerns
         - suggestions: A list of improvement suggestions
         - overall_assessment: Your final assessment of the PR
    """

    result, _ = await opper.call(
        name="agent_action",
        instructions=action_instructions,
        input={
            "reasoning": reasoning.content,
            "reasoning_confidence": reasoning.confidence,
            "context": context,
            "available_tools": available_tools,
            "agent_instructions": agent.get("instructions", ""),
            "step_number": context.get("current_step", 0),
        },
        output_type=AgentAction,
    )
    return result
```
The key features here are:
- Structured inputs: We carefully prepare the context for the LLM
- Structured outputs: We use Pydantic models to define expected responses
- Tracing: The `@trace` decorator enables observability
Comprehensive Tracing
One important aspect of building reliable agents is observability. We use Opper's tracing capabilities to track each step of execution:
```python
@trace(name="agent_runner.run_agent")
async def run_agent(self, agent_id: str, agent: Dict[str, Any], input_data: Dict[str, Any]) -> Dict[str, Any]:
    """Run an agent with the given input data."""
    # ... implementation ...
```
By wrapping key functions with `@trace`, we get comprehensive traces for each agent run, including:
- Time spent in each function
- Inputs and outputs at each step
- Error conditions
- Custom metrics
Putting It All Together
Our main script ties everything together, creating a complete GitHub PR Reviewer:
```python
# Agent configuration
PR_REVIEW_AGENT = {
    "instructions": """
    You are a GitHub PR reviewer. Your task is to review pull requests and provide helpful feedback.

    You should:
    1. Fetch the PR information using the github_pr_tool
    2. Analyze the changes and their impact
    3. Identify potential issues or improvements
    4. Provide a detailed review with actionable feedback

    Your final output should include:
    - A summary of the changes
    - List of issues found (if any)
    - Suggestions for improvement
    - Overall assessment
    """,
    "verbose": False,  # Will be set from command line args
}

# Initialize services
agent_runner = AgentRunnerService()

# Initialize GitHub PR tool
github_token = os.getenv("GITHUB_TOKEN")  # Optional for public repositories
github_pr_tool = GitHubPRTool(github_token)

# Register tools
agent_runner.register_tools({
    "github_pr_tool": github_pr_tool.execute,
})

# Run the agent
result = await agent_runner.run_agent(
    agent_id="github_pr_reviewer",
    agent=PR_REVIEW_AGENT,
    input_data={
        "owner": args.owner,
        "repo": args.repo,
        "pr_number": args.pr_number,
    },
)
```
For the complete working example, see main.py.
Running the Example
To run the complete example:
- Clone the repository
- Navigate to the example directory: `cd react-agent/01-github-pr-reviewer`
- Install dependencies: `pip install -r requirements.txt`
- Create a `.env` file with your Opper API key (GitHub token optional)
- Run the script: `python main.py <owner> <repo> <pr_number>`
- Add the `-v` flag to see the agent's thought process: `python main.py <owner> <repo> <pr_number> -v`
Detailed instructions are available in the README.
Next Steps
This implementation demonstrates the core concepts, but there are many ways to enhance it:
- Error handling and retries: Add robust error handling for API calls and LLM calls
- Caching: Cache API responses to avoid rate limiting. Opper supports returning cached responses for the same inputs.
- Advanced PR analysis: Add code quality checks and security scanning
- State persistence: Save and retrieve agent state between runs
- Human feedback: Allow humans to provide feedback on the agent's reviews
Conclusion
We've built a simple but functional GitHub PR review agent using the ReAct pattern. This agent demonstrates several key principles:
- ReAct pattern for structured reasoning, action, and observation cycles
- Schema-driven design with Pydantic for clear interfaces
- Opper tracing for comprehensive observability
- Modular tools architecture for extensibility
For a deep-dive into the implementation, explore the complete code.
In a future post, we'll extend the agent with more advanced GitHub features and more sophisticated reasoning.
Stay tuned!