How I Built a Multi-Agent Research Crew That Becomes an Expert on Any Topic
What I discovered from building specialized CrewAI agents that collaborate on research tasks, and the surprising insights about dynamic expertise that emerged.
I’ve been experimenting with CrewAI to address something that’s been bothering me about the current AI landscape: most models try to do everything at once but fall short when it comes to deep, focused research. On the other end of the spectrum, tools like OpenAI’s Deep Research offer stronger depth, but at a steep cost. Don’t believe me? Check out this article I wrote recently…
I spent $96 and burned 150 million tokens with OpenAI’s Deep Research API and all I got were these 5 great insights
Creating my newsletter has always been a fun but time-consuming task, and it has gone through several iterations over the years.
So, I built a system with two AI agents that work sequentially on research tasks. What I discovered about dynamic specialization and agent collaboration was more interesting than I expected.
The Problem with Jack-of-All-Trades AI
We’ve gotten used to treating AI like a Magic 8-Ball: ask it anything and hope for wisdom. But that’s not how real expertise works. When serious research is needed, you don’t hire one person to do everything. You build a team.
The researcher who excels at digging up obscure sources might not be great at turning them into actionable insights. The analyst who sees patterns in data could miss important industry developments without strong research support.
It got me thinking: maybe AI should work the same way. Instead of expecting one model to do everything, what if we built a team of specialized agents, each one genuinely skilled at a specific role?
The System: Two Agents, Sequential Workflow
Here's what I actually built using CrewAI:
Agent 1 - The Researcher
This agent uses CrewAI tools to perform real-time web searches. But here's the interesting part: it doesn't have a fixed identity. When I input "Quantum Computing," it becomes a "Quantum Computing Senior Data Researcher." When I switch to "Urban Beekeeping," it becomes an "Urban Beekeeping Senior Data Researcher." Same agent, completely different expertise context.
Agent 2 - The Reporting Analyst
This agent receives everything the researcher found and transforms it into a structured report. Like the researcher, it dynamically adapts its role, becoming a "Quantum Computing Reporting Analyst" or "Urban Beekeeping Reporting Analyst" based on the topic.
The workflow is strictly sequential: the researcher completes its task first, then hands off all findings to the analyst. This does more than divide labor. It creates a context handoff that lets the analyst build on comprehensive, focused research rather than starting from a generic prompt.
What the System Actually Produces
I tested the two-agent workflow on a variety of topics to see what kind of output it could generate. One of the most compelling examples came from a research request on quantum computing.
The researcher agent gathered up-to-date information on IBM’s quantum roadmap, investment trends in quantum startups, and recent technical breakthroughs. The reporting analyst then organized this into a well-structured report, including sections on IBM’s goal of building a 4,000+ qubit quantum-centric supercomputer, the $2 billion invested in quantum startups in 2024, and advances in fault-tolerant quantum computing.
What stood out was the quality of the synthesis. The researcher didn’t just collect and paste search results. It surfaced relevant insights through the lens of a domain expert in quantum computing. The analyst then wove those findings into a cohesive narrative and outlined key implications, such as how specialized quantum systems are beginning to replace broader, universal quantum computing models.
The final product was a clean markdown report that reads like something produced by a professional research team, not a collection of disconnected search summaries.
The Technical Implementation
The core system is surprisingly straightforward. Here's the actual crew definition:
from crewai import Agent, Crew, Process
from crewai.project import CrewBase, agent, crew
from crewai_tools import SerperDevTool

@CrewBase
class ResearchTeam():
    # agents_config / tasks_config are loaded from config/agents.yaml
    # and config/tasks.yaml by default

    @agent
    def researcher(self) -> Agent:
        # The researcher gets a live web search tool via Serper
        return Agent(
            config=self.agents_config['researcher'],
            verbose=True,
            tools=[SerperDevTool()]
        )

    @agent
    def reporting_analyst(self) -> Agent:
        # The analyst works only from the researcher's handoff, so no tools
        return Agent(
            config=self.agents_config['reporting_analyst'],
            verbose=True
        )

    # @task-decorated methods (omitted here for brevity) define the research
    # and reporting tasks that populate self.tasks

    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,  # researcher runs first, then analyst
            verbose=True,
        )
The magic happens in the YAML configuration files. The researcher agent's role is defined as `{topic} Senior Data Researcher` with a goal to `Uncover cutting-edge developments in {topic}`. When you input "Quantum Computing," CrewAI's templating system automatically substitutes this into the agent's identity.
The tasks are similarly templated. The research task asks the agent to "Conduct thorough research about {topic}" and expects "10 bullet points of the most relevant information." The reporting task takes this output and expands it into "a fully fledged report with main topics, each with a full section of information."
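For reference, here is a minimal sketch of what those YAML files might look like. The roles, goals, and task descriptions quoted above come straight from my config; the backstories, the analyst's goal wording, and the output_file setting are illustrative, though the keys follow CrewAI's standard agents.yaml / tasks.yaml convention:

# config/agents.yaml
researcher:
  role: >
    {topic} Senior Data Researcher
  goal: >
    Uncover cutting-edge developments in {topic}
  backstory: >
    A seasoned researcher with a knack for finding the most relevant information.

reporting_analyst:
  role: >
    {topic} Reporting Analyst
  goal: >
    Turn {topic} research findings into a detailed report
  backstory: >
    A meticulous analyst who turns raw findings into clear, structured reports.

# config/tasks.yaml
research_task:
  description: >
    Conduct thorough research about {topic}.
  expected_output: >
    10 bullet points of the most relevant information about {topic}.
  agent: researcher

reporting_task:
  description: >
    Expand each topic into a full section for a report.
  expected_output: >
    A fully fledged report with main topics, each with a full section of information.
  agent: reporting_analyst
  output_file: report.md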
To make the system usable in practice, I wrapped it in a FastAPI app that runs the workflow and returns the final markdown report. Here's the core function that kicks off the crew:
from datetime import datetime

def run_crew_sync(topic: str):
    # Build the crew and run the full researcher -> analyst pipeline
    crew = ResearchTeam()
    result = crew.crew().kickoff(inputs={'topic': topic})  # final text is also in result.raw

    # The reporting task writes the finished report to report.md
    # (via its output_file setting), so read it back from disk
    with open("report.md", "r") as f:
        report = f.read()

    return {
        "success": True,
        "topic": topic,
        "report": report,
        "timestamp": datetime.now().isoformat(),
        "agents_used": ["Researcher Agent", "Reporting Analyst"]
    }
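The surrounding FastAPI wiring is thin. A sketch along these lines (the endpoint path and request model are illustrative, not my exact code) exposes the crew as a single POST endpoint:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ResearchRequest(BaseModel):
    topic: str

@app.post("/research")
def research(request: ResearchRequest):
    # Runs the full researcher -> analyst pipeline and returns the report
    return run_crew_sync(request.topic)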
Key Insights from Building This
Template-based specialization works better than expected
Using `{topic}` variables in agent configurations creates genuine specialization. The same agent code produces different research approaches when it believes it's a "Quantum Computing Senior Data Researcher" versus a "Space Exploration Senior Data Researcher."
Sequential processing enables context building
The analyst receives the researcher's complete output, not just my original query. This context handoff allows the analyst to build on specific findings rather than generating generic responses about the topic.
Task structure shapes output quality
The research task specifically asks for "10 bullet points of the most relevant information," giving the researcher a concrete target. The reporting task then asks to "expand each topic into a full section," which creates structured, comprehensive reports rather than loose summaries.
Interface design matters for understanding
Building both a Streamlit web app and FastAPI backend revealed how much the interface affects perception of the system. The Streamlit version shows real-time progress ("Researcher Agent: Searching..." then "Reporting Analyst: Processing..."), which makes the collaboration visible.
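As a stripped-down sketch of the Streamlit side (the per-agent status messages in my app come from hooking CrewAI's task callbacks; here I compress them into a single status widget):

import streamlit as st

st.title("Research Crew")
topic = st.text_input("Topic to research")

if st.button("Run crew") and topic:
    # st.status gives the user visible progress while the crew runs
    with st.status("Researcher Agent: Searching...") as status:
        result = ResearchTeam().crew().kickoff(inputs={"topic": topic})
        status.update(label="Report complete", state="complete")
    st.markdown(result.raw)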
Where This Actually Gets Used
The system works well for exploratory research where you need both current information and structured analysis:
Topic exploration
When I need to understand a new field quickly, the two-agent approach gives me both current developments and organized context. Much more structured than asking a single AI model.
Content research
Before writing about unfamiliar topics, the research agent finds recent information while the analyst organizes it into usable sections.
Technology landscape analysis
Understanding emerging technologies, market trends, or regulatory changes where you need both facts and synthesis.
The Architecture Scales Interestingly
Because the system runs entirely on API calls, the cost sits primarily in LLM usage during the research and analysis phases. Serper's web search has its own per-query cost, but it is typically far lower than the LLM spend.
What's interesting is how the sequential architecture affects scaling: you can batch multiple research requests, but each reporting task needs to wait for its corresponding research to complete.
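For the batching case, CrewAI's kickoff_for_each runs one full researcher-then-analyst pass per input dict; the topic list here is just an example:

topics = ["Quantum Computing", "Urban Beekeeping", "Space Exploration"]

crew = ResearchTeam().crew()
# One sequential researcher -> analyst run per topic
results = crew.kickoff_for_each(inputs=[{"topic": t} for t in topics])

for topic, result in zip(topics, results):
    print(topic, "->", len(result.raw), "characters of report")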
What Could Be Extended
The current system demonstrates a useful pattern that could be extended in several directions:
Additional specialized agents: A fact-checking agent that validates claims, or a source quality agent that evaluates information reliability
Parallel research streams: Multiple researcher agents focusing on different aspects (technical, market, regulatory) feeding into a synthesis agent (see the sketch after this list)
Output format specialization: Agents that format results for specific use cases—executive summaries, technical documentation, or presentation slides
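As a sketch of the parallel-streams idea (agents defined in code rather than YAML for brevity; roles, goals, and backstories are illustrative), CrewAI's async_execution flag and context parameter let two researchers run concurrently and feed one synthesis task:

from crewai import Agent, Crew, Process, Task
from crewai_tools import SerperDevTool

technical = Agent(role="Technical Researcher",
                  goal="Find technical breakthroughs in {topic}",
                  backstory="Deep-dives into papers and engineering blogs.",
                  tools=[SerperDevTool()])
market = Agent(role="Market Researcher",
               goal="Find market and investment trends in {topic}",
               backstory="Tracks funding rounds and industry reports.",
               tools=[SerperDevTool()])
synthesizer = Agent(role="Synthesis Analyst",
                    goal="Merge research streams into one report on {topic}",
                    backstory="Turns parallel findings into a single narrative.")

# async_execution lets both research tasks run concurrently
tech_task = Task(description="Research technical aspects of {topic}.",
                 expected_output="Bullet points of technical findings.",
                 agent=technical, async_execution=True)
market_task = Task(description="Research market aspects of {topic}.",
                   expected_output="Bullet points of market findings.",
                   agent=market, async_execution=True)

# context waits for both streams and hands their output to the synthesizer
synthesis_task = Task(description="Combine both research streams into a report on {topic}.",
                      expected_output="A structured markdown report.",
                      agent=synthesizer, context=[tech_task, market_task])

crew = Crew(agents=[technical, market, synthesizer],
            tasks=[tech_task, market_task, synthesis_task],
            process=Process.sequential)

result = crew.kickoff(inputs={"topic": "Quantum Computing"})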
The core insight is that template-based role specialization combined with sequential handoffs creates more structured output than single-agent approaches.
The Technical Setup
If you want to reproduce this:
Install dependencies:
pip install crewai[tools]
(Python 3.10+ required)
Get API keys: an OpenAI API key for the LLMs (set as OPENAI_API_KEY) and a Serper API key for web search (set as SERPER_API_KEY)
Configure agents: YAML files defining roles with `{topic}` templates
Define tasks: Sequential tasks with clear expected outputs
Choose interface: Streamlit for interactive testing, FastAPI for programmatic access
The project structure separates configuration (YAML files) from implementation (Python classes), making it easy to modify agent behavior without changing code.
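For orientation, the layout roughly follows the scaffold that crewai create generates; exact folder names vary with your project name:

research_crew/
├── pyproject.toml
├── .env                  # OPENAI_API_KEY, SERPER_API_KEY
└── src/research_crew/
    ├── config/
    │   ├── agents.yaml   # role/goal/backstory templates with {topic}
    │   └── tasks.yaml    # task descriptions and expected outputs
    ├── crew.py           # the ResearchTeam class shown above
    └── main.py           # entry point / FastAPI or Streamlit wiring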
What This Suggests About AI Tool Design
Building this system revealed something surprising about AI architecture: role templating can create the appearance of deep specialization without training separate models for each domain.
The same agent behaves differently when framed as a quantum computing specialist than as an urban planning one. Role definition through templating alone drives surprisingly effective specialization, even on a general-purpose model.
The sequential handoff structure also improves output quality. It forces the synthesis step to build on grounded research rather than starting from a generic prompt, which leads to more coherent and actionable reports.
The real value is not just in using multiple agents. It comes from designing workflows where each agent has a clear role, specific goals, and relevant context to work from.
If you are working on similar challenges that require both current information and structured analysis, this pattern of lightweight specialization and structured collaboration may be worth exploring.