Using o1-preview and o1-mini with RAG and structured output
In this blog we do a quick exploratory test of OpenAI's new reasoning models o1-mini and o1-preview on a retrieval use case that demands challenging reasoning and structured output. Since these models are very new - essentially preview releases - they lack many of the features we have come to expect from model APIs. For example, there is no native support for structured output, and there is little documentation on how to prompt them properly.
At Opper we have evolved a generic way of interacting with models using structured input and output, and we provide accessible APIs for indexing and retrieving custom knowledge. In the example below, we simply plug o1-mini and o1-preview into an existing RAG pipeline (very similar to the one described in our earlier blog post on Simple RAG with Mistral). We will show that the responses are very good, require no prompt modifications, and that the models follow instructions and utilize context very well. One difference from the Mistral blog post is that we feed in far more context for the model to reason over (north of 25k words).
We chose a SWOT analysis of the Reddit S1 filing as the use case since it is a typical high-reasoning task that can be challenging: it requires utilizing a lot of context data and reasoning over it to form an analysis that is comprehensive, to the point, and well structured. The task we will try to complete is "Provide a data-driven SWOT analysis of Reddit with emphasis on impact from AI", using the Reddit S1 filing PDF as source knowledge.
The pipeline is built with the following steps (available in full in our Python cookbook):
- Index the Reddit S1 filing as a PDF using the Opper SDK.
- To retrieve a large amount of relevant context, generate relevant "sub-questions" with o1-mini and retrieve matching segments using the Opper SDK.
- Use o1-preview to reason through the retrieved knowledge and produce a SWOT analysis, using structured input and output.
- Ask for clear citations for where data comes from, highlighting the page number and document name.
- Use structured input and output in all parts of the pipeline, something that isn't expressly supported by these models.
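The steps above can be sketched end to end. The sketch below abstracts the model and index calls behind plain callables so the flow is visible without the Opper SDK; the function names here are illustrative, not the SDK's API:

```python
from typing import Callable, List

def run_pipeline(
    question: str,
    generate_subquestions: Callable[[str], List[str]],  # e.g. o1-mini via opper.call
    retrieve: Callable[[str], str],                     # e.g. index.query
    answer: Callable[[str, List[str]], str],            # e.g. o1-preview via opper.call
) -> str:
    # Break the main question into retrieval-friendly subquestions
    subquestions = generate_subquestions(question)
    # Gather one segment of context per subquestion
    knowledge = [retrieve(sq) for sq in subquestions]
    # Reason over the retrieved knowledge to produce the final answer
    return answer(question, knowledge)

# Stubbed usage, standing in for the real model and index calls:
result = run_pipeline(
    "Provide a SWOT analysis of Reddit",
    lambda q: ["How does Reddit use AI?"],
    lambda sq: f"segment for: {sq}",
    lambda q, k: f"answer using {len(k)} segment(s)",
)
print(result)  # answer using 1 segment(s)
```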
The Result:
Strengths:
- Reddit's massive corpus of conversational data is foundational to current AI technology and many LLMs, making it valuable for model training [1].
- Reddit is investing in AI to enhance the user experience, making it more personalized and safer, and to improve search capabilities, which is expected to increase user engagement and retention [2].
- AI is expected to improve Reddit's ability to localize content and moderate content as they expand internationally [2].
Weaknesses:
- New AI applications require additional investment, increasing costs and complexity, which may impact gross margin [3].
- Market acceptance of AI technologies is uncertain; Reddit may be unsuccessful in its product development efforts [3].
- Reddit may face competition from LLMs; users might choose to use AI models instead of visiting Reddit directly [4].
Opportunities:
- Emerging opportunity in data licensing given the value of Reddit's data in sentiment analysis and trend identification [1].
- Reddit can harness AI to improve content recommendations, driving user growth and retention [2].
Threats:
- AI is subject to evolving regulatory scrutiny; Reddit may need to adjust offerings as legal frameworks develop [3].
- Potential misuse of Reddit data by third parties could harm Reddit's business and reputation [3].
[1] "We believe that Reddit will be core to the capabilities of organizations that use data as well as the next generation of generative AI and LLM platforms." from reddit-sec.pdf page 17
[2] "We are investing in ways to harness AI to make the user experience more personalized and safer and to improve our search capabilities, which we expect will increase user engagement and retention." from reddit-sec.pdf page 136
[3] "Developing, testing and deploying these technologies may also increase the cost profile of our offerings due to the nature of the computing costs involved in such initiatives. Moreover, market acceptance of AI technologies is uncertain, and we may be unsuccessful in our service or product development efforts." from reddit-sec.pdf page 66
[4] "In addition, we face competition from large language models ("LLMs"), such as ChatGPT, Gemini, and Anthropic; Redditors may choose to find information using LLMs, which in some cases may have been trained using Reddit data, instead of visiting Reddit directly." from reddit-sec.pdf page 43
Here are a few selected segments of the implementation:
Using o1-mini to drive context retrieval
We used o1-mini to generate subquestions that drive the context retrieval, calling the model through the opper.call API. Note that we ask the model to return List[str], giving us a structured set of subquestions to iterate over for retrieval.
from typing import List

question = "Provide a data-driven SWOT analysis of Reddit with emphasis on impact from AI"

subquestions, _ = opper.call(
    name="generate_subquestions",
    instructions=(
        "Given that you can query Reddit's S1 filing to answer the question, "
        "generate a list of subquestions that you would want the answer to "
        "in order to answer the main question. Only return the subquestions, "
        "not the question."
    ),
    input=question,
    output_type=List[str],
    model="openai/o1-mini",
)

# Retrieve the top matching segment for each subquestion
knowledge = []
for subquestion in subquestions:
    print(subquestion)
    result = index.query(
        query=subquestion,
        k=1,
    )
    knowledge.append(result)
# How does Reddit's revenue model currently perform, and what are its primary sources of income?
# What weaknesses does Reddit face in its platform infrastructure and user experience?
# In what ways is Reddit integrating AI to enhance content moderation and user interactions?
# How is Reddit addressing ethical considerations related to AI, such as bias and transparency in algorithms?
# What are the projected financial implications of AI integration on Reddit's operational costs and revenue growth?
# What opportunities does AI present for Reddit to innovate its services or expand its user base?
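Since several subquestions can pull back the same passage, the retrieved knowledge may contain duplicates before it is handed to o1-preview. For illustration, assuming each index.query result is a list of dicts with a content field and metadata (this shape is an assumption, not the documented Opper return type), a hypothetical deduplication step might look like:

```python
# Hypothetical retrieved segments; the real shape of index.query
# results depends on the Opper SDK version.
knowledge = [
    [{"content": "We are investing in ways to harness AI...",
      "metadata": {"file_name": "reddit-sec.pdf", "page": 136}}],
    [{"content": "...market acceptance of AI technologies is uncertain...",
      "metadata": {"file_name": "reddit-sec.pdf", "page": 66}}],
    [{"content": "We are investing in ways to harness AI...",
      "metadata": {"file_name": "reddit-sec.pdf", "page": 136}}],
]

# Flatten and drop segments retrieved by more than one subquestion
seen = set()
unique_segments = []
for result in knowledge:
    for segment in result:
        if segment["content"] not in seen:
            seen.add(segment["content"])
            unique_segments.append(segment)

print(len(unique_segments))  # 2 of the 3 segments are unique
```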
Using o1-preview for the SWOT
We built a response object that includes a thought process, an answer, and a list of citations, and gave that to o1-preview to complete. The knowledge in this case was roughly 25k words. This call took around 90 seconds to complete, which is around 5-10 times longer than with other models. Note that the output structure here is slightly more complex: a Response type that contains a list of Citation types.
from typing import List

from pydantic import BaseModel

class Citation(BaseModel):
    file_name: str
    page_number: int
    citation: str

class Response(BaseModel):
    thoughts: str
    answer: str
    citations: List[Citation]

response, _ = opper.call(
    name="o1/respond",
    model="openai/o1-preview",
    instructions="Produce an answer to the question using knowledge. Refer to any facts with [1], [2] etc.",
    input={
        "question": question,
        "knowledge": knowledge,
    },
    output_type=Response,
)

print(response.answer)
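With no native structured-output support in these models, the guarantee ultimately comes from validating the returned JSON against the Pydantic schema. A minimal standalone illustration of that validation step (the payload below is made up, not taken from the actual run):

```python
from typing import List

from pydantic import BaseModel

class Citation(BaseModel):
    file_name: str
    page_number: int
    citation: str

class Response(BaseModel):
    thoughts: str
    answer: str
    citations: List[Citation]

# Made-up payload in the shape a model might return
raw = {
    "thoughts": "The filing discusses AI in several risk factors.",
    "answer": "Reddit's data is valuable for LLM training [1].",
    "citations": [
        {
            "file_name": "reddit-sec.pdf",
            "page_number": 17,
            "citation": "We believe that Reddit will be core...",
        }
    ],
}

# Raises pydantic.ValidationError if the model strayed from the schema
response = Response.model_validate(raw)
print(response.citations[0].page_number)  # 17
```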
The response was printed in full earlier, so we will leave it out here. I find it to be to the point, with correct, relevant citations. It is striking how plug-and-play these models were with our existing pipeline: structured output and RAG worked out of the box, with no modifications or adaptations. I believe these models may become very useful in AI pipelines, especially for high-reasoning tasks.
Takeaways
In this blog post we showed how to use OpenAI's new reasoning models o1-mini and o1-preview to answer a question using knowledge retrieval and structured output, in a plug-and-play manner. We used the Opper indexing API to store and query the PDF, and Opper's opper.call API to call the models. We think these new models are an exciting addition to the toolbox for building capable AI features, and we look forward to exploring them in greater depth :)