Introducing Opperator: A composable agent to automate tasks on the web

By Göran Sandahl - 2/7/2025

We've built Opperator, a proof-of-concept agent for automating web tasks from code. It runs entirely in the background without supervision and is optimized for declaring tasks and consuming results programmatically. We've packaged it as a container with a REST API for easy integration into existing infrastructure and applications.

Demo

Here's a demo of Opperator running parallel tasks via its REST API and a small UI:

Features

Autonomous: No direct human supervision required; runs all tasks in the background with optional callbacks.
Programmable: Designed to be triggered from code, with completion schemas for interpreting task results.
Observable: The project is automatically traced so that tasks can be debugged and interpreted.
Scalable: The agent is concurrent, can process multiple tasks in parallel, and can be vertically scaled in most environments.
Compound: Uses multiple models (Molmo, Sonnet, Flash) for efficient and effective task completion.
Open source: MIT licensed, with code available on GitHub. Feel free to modify or contribute.

Motivation

Most agents today require direct human supervision, making them impractical for tasks where the time investment outweighs the benefits. In the case of automating web tasks, having to be an assistant to the assistant rather defeats the purpose!

Building fully autonomous agents also pushes us to explore some challenging problems around AI reliability: how do we optimize calls so they work, and how do we handle model resilience, deployment, observability and control? The key with autonomy is that it just needs to work.

We also believe there are many atomic tasks that can become valuable to automate if the process is simple, reliable and easily integrated into existing services.

Example tasks

For example, we may want to test our login flow after a deployment:

After we have deployed a change, go to platform.opper.ai, log in and create a new api key. Report completion as a structured JSON object with status as a boolean and the api key as a string, so that I can catch any issues with our login flow and raise an alert

Or maybe list our top 3 most visited blog posts every day and push them to our Discord channel:

Go to our Google Analytics and report back our top 3 pages for the last 24 hours. Report completion as a structured JSON object with page title, url and count so that I can put it in our dashboard

Or maybe issue a notification in our Discord channel when a new AI-related story is trending on Hacker News:

Go to hackernews frontpage and identify any AI related stories. Report completion as a structured JSON object with story title, url and score so that I have it reported in our Discord channel

Automating tasks like this could be very valuable, especially when their benefits compound over time. AI can help tackle tasks that would otherwise be impractical to invest human effort into.
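
Each of these prompts asks for a structured JSON result. As a rough sketch, the Hacker News task could be paired with a JSON Schema like the one below, passed as the `response_schema` field shown later in this post. The field names (`stories`, `title`, `url`, `score`) and the `matches_schema` helper are illustrative assumptions, not part of Opperator.

```python
# Hypothetical response schema for the Hacker News task above.
# Field names are assumptions chosen for illustration.
HN_STORIES_SCHEMA = {
    "type": "object",
    "properties": {
        "stories": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "url": {"type": "string"},
                    "score": {"type": "integer"},
                },
                "required": ["title", "url", "score"],
            },
        }
    },
    "required": ["stories"],
}

# Maps the JSON Schema type names used above to Python types.
_PY_TYPES = {"object": dict, "array": list, "string": str, "integer": int, "boolean": bool}


def matches_schema(value, schema):
    """Tiny structural check covering only the schema keywords used above.

    This is not a full JSON Schema validator.
    """
    expected = _PY_TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly for integers.
    if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
        return False
    if schema["type"] == "object":
        if any(key not in value for key in schema.get("required", [])):
            return False
        return all(
            matches_schema(value[key], sub)
            for key, sub in schema.get("properties", {}).items()
            if key in value
        )
    if schema["type"] == "array":
        item_schema = schema.get("items")
        return item_schema is None or all(matches_schema(item, item_schema) for item in value)
    return True
```

A check like this lets the consuming code reject a malformed result before it is pushed to Discord or a dashboard.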

Source code

Full code is available on GitHub

How to use

Note! To run Opperator, you'll need an API key from Opper. See our Free Tier if you are interested in testing it out.

A container for the willing

The repo includes a Docker Compose file for building the container manually. The image is also available on the GitHub Container Registry, so you can run it with:

docker run --rm -ti \
  -e OPPER_API_KEY=op-<your-api-key> \
  -p 8000:8000 \
  ghcr.io/opper-ai/opper-webagent:latest

The container will start a REST API on port 8000, and you can interact with it like this:

# Execute a web task (returns session id)
curl -X POST http://localhost:8000/run \
  -H "Content-Type: application/json" \
  -d '{
    "goal": "Go to https://opper.ai and check the pricing",
    "response_schema": {
      "type": "object",
      "properties": {
        "has_free_tier": {"type": "boolean"},
        "pricing_details": {"type": "string"}
      },
      "required": ["has_free_tier", "pricing_details"]
    }
  }'

# Stream status updates
curl -N http://localhost:8000/status-stream/<session_id>
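
For callers that prefer Python over curl, the two requests above can be sketched with the standard library. The endpoint paths and payload fields come from the curl example; the shape of the `/run` response is not documented in this post, so the `session_id` key read below is an assumption to check against the README. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # matches the docker run port mapping above


def build_run_request(goal, response_schema, base_url=BASE_URL):
    """Build the POST /run request without sending it."""
    body = json.dumps({"goal": goal, "response_schema": response_schema}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_task(goal, response_schema, base_url=BASE_URL):
    """Submit a task and return its session id."""
    request = build_run_request(goal, response_schema, base_url)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the JSON response carries the id under "session_id".
    return payload["session_id"]


def stream_status(session_id, base_url=BASE_URL):
    """Yield non-empty lines from the status stream as they arrive."""
    with urllib.request.urlopen(f"{base_url}/status-stream/{session_id}") as response:
        for raw_line in response:
            line = raw_line.decode("utf-8").strip()
            if line:
                yield line


if __name__ == "__main__":
    schema = {
        "type": "object",
        "properties": {
            "has_free_tier": {"type": "boolean"},
            "pricing_details": {"type": "string"},
        },
        "required": ["has_free_tier", "pricing_details"],
    }
    session = start_task("Go to https://opper.ai and check the pricing", schema)
    for update in stream_status(session):
        print(update)
```

Keeping request construction separate from sending makes the payload easy to inspect or unit test without a running container.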

There is also a simple web UI running on http://localhost:8000/ that you can use to run tasks.

To run the interface shown in the demo video, clone the repo, run the container as described above, and then run:

uv run examples/multi/app.py

See the README for additional instructions.
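
Because the agent can process tasks in parallel, a client can also submit several goals at once. A minimal sketch using a thread pool follows. `submit_all` is a hypothetical helper, and the injected `submit` callable stands for whatever function posts one goal to `/run` and returns its session id; neither is part of Opperator.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_all(goals, submit, max_workers=4):
    """Call submit(goal) for every goal concurrently.

    Returns a dict mapping each goal to its result, or to the exception
    raised while submitting it, so one failure does not hide the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {goal: pool.submit(submit, goal) for goal in goals}
        for goal, future in futures.items():
            try:
                results[goal] = future.result()
            except Exception as error:  # keep going; record the failure
                results[goal] = error
    return results
```

Threads are enough here since each submission is a short, I/O-bound HTTP call; the heavy lifting happens inside the container.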

A library for the brave

While the container and the REST API are probably the easiest way to get started, they wrap the same Python library, which you can also use directly to interact with the agent.

from opper_webagent import run

# Example: Verify blog post existence
result = run(
    goal="Go to https://opper.ai and verify that there is a blog post covering DeepSeek-R1 there",
    response_schema={
        "type": "object",
        "properties": {
            "is_posted": {"type": "boolean"},
            "post_title": {"type": "string"}
        }
    },
    callback=lambda action, details, screenshot: print(f"Action: {action}, Details: {details}")
)
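
The lambda above only prints each step. For longer runs it can help to collect the steps for later inspection, which the sketch below does with a small recorder class. The callback arguments (`action`, `details`, `screenshot`) follow the example above, but their exact types are not specified in this post, so the recorder treats them as opaque values. `StepRecorder` is a hypothetical helper, not part of `opper_webagent`.

```python
class StepRecorder:
    """Callable that stores every step reported by the agent."""

    def __init__(self):
        self.steps = []

    def __call__(self, action, details, screenshot):
        # Record only whether a screenshot was supplied; its format is
        # not documented here, so we avoid interpreting it.
        self.steps.append(
            {
                "action": action,
                "details": details,
                "has_screenshot": screenshot is not None,
            }
        )


recorder = StepRecorder()
# Pass it as the callback: run(goal=..., response_schema=..., callback=recorder)
# After the run, recorder.steps holds one dict per reported step.
```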

Conclusion

With Opperator, we're exploring the boundaries of agent autonomy and composability. By treating agents as independent workers rather than assistants, we enable scalable automation without direct supervision. The low implementation cost and ease of integration make this approach particularly promising.

Try it out

The Opper free tier includes a $10 monthly credit for inference and tracing, enough to explore Opperator's capabilities. To get going, sign up to the Opper platform, issue an API key, and see the detailed instructions in the GitHub repository.