Building With Vercel Eve: What I Learned About Agent Loops, Slack, and Enterprise Harnesses

I spent the weekend wiring up Vercel's new agent framework to Slack and Notion. The useful takeaway was less about building another bot, and more about understanding where managed agent loops may fit in enterprise architecture.

Jun 22, 2026

Introduction

I spent part of this weekend building with Vercel’s new agent framework, Eve.

My goal was pretty simple. I wanted to understand how it actually worked by wiring up a real workflow instead of only reading the announcement. I have been spending a lot of time thinking about agents lately, especially how they fit into enterprise environments where security, reliability, approvals, identity, and auditability matter just as much as model quality.

The first idea was straightforward: build a Slack-based content workflow. A user messages an agent in Slack, the agent drafts a LinkedIn post, runs it through a style check, asks for approval, and then writes the finished draft to Notion.

I kept the workflow small on purpose. I wanted something I could finish in a weekend, while still touching the pieces that matter: Slack as the user surface, a model behind the agent loop, Notion as a downstream system, OAuth, human approval, evals, and deployment.

After getting the full workflow running, the part I kept thinking about was the agent loop.

The Two Agent Patterns I Had In My Head

Before building with Eve, I had been thinking about agents in two broad patterns.

The first pattern is closer to a graph or workflow. There is a start node, a set of steps, maybe some branching, maybe fan-out, maybe a critic step, and eventually an end node. This is useful when the shape of the work is mostly known ahead of time. You can model the process, test the transitions, and reason about the execution path.

The second pattern is closer to a generalist agent inside a harness. The model has instructions, tools, context, permissions, memory or state, and a runtime. The developer does not script every step. The agent reasons, calls a tool, observes the result, and decides what to do next. It may ask a question, hit an approval gate, or delegate to another agent.

Both patterns have a place.

If I know the process, I generally want the graph. If the agent needs to interpret a messy user request, recover from missing context, pick the right tool, or decide whether it needs a human, the generalist loop starts to make more sense.

What I wanted to understand was where Eve fits.

What I Built

I started with Vercel’s Eve content agent template and turned it into a Slack-first content assistant.

The workflow looked like this:

I send a message to the Eve bot in Slack.
Eve receives the Slack event through a Vercel Connect Slack connector.
The agent interprets the request and drafts a LinkedIn post.
It loads the LinkedIn style skill.
It calls a deterministic lint tool before showing the draft.
I iterate in the Slack thread.
When I say the draft is ready to publish, Eve requests approval.
Slack renders the approval step.
After approval, Eve writes the draft to a Notion database.
Eve replies back in Slack with the Notion page link.

The actual file structure was pretty clean:

agent/
  agent.ts
  instructions.md
  channels/
    slack.ts
  connections/
    notion.ts
  tools/
    lint_against_style.ts
    request_publish_approval.ts
  skills/
    linkedin-style/
    blog-style/
    newsletter-style/
    release-notes-style/

That structure is a big part of how Eve works. The agent is authored as a directory. The model config lives in agent.ts, the standing behavior lives in instructions.md, Slack lives in channels/, Notion lives in connections/, tools live in tools/, and reusable procedures or style guides live in skills/.

In this template, I was working with one deployed Eve agent rather than a universal agent router.

That distinction matters.

When I message the bot in Slack, the event routes to this deployed Eve app. Inside that app, the model sees the harness I authored and decides which available capability to use.

The Harness Is Where The Engineering Shows Up

The word “harness” has become useful for me here.

The harness is the bounded environment around the model:

instructions
tools
connections
skills
auth
approvals
state
evals
channels

In my case, the harness said:

You are a Slack-first content assistant.
You can draft LinkedIn posts.
You can load the LinkedIn style skill.
You can lint drafts.
You can ask for publish approval.
You can write to Notion through the signed-in user.
You should not publish before approval.

That is where most of the meaningful engineering work happened. The model mattered, but most of the leverage came from the environment the model was allowed to operate inside.
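
One way I find it helpful to think about this is to write the harness down as data. The sketch below is only an illustration: the type names, the notion_create_page tool id, and the mayRun helper are my own assumptions, not part of Eve's API. It shows the "do not publish before approval" rule as a check in code rather than a sentence in a prompt.

```typescript
// Hypothetical shape for illustration only; none of these names come from Eve.
type HarnessTool = {
  id: string;
  sideEffect: "none" | "write";
  requiresApproval: boolean;
};

type HarnessSpec = {
  role: string;
  skills: string[];
  tools: HarnessTool[];
};

const contentHarness: HarnessSpec = {
  role: "Slack-first content assistant",
  skills: ["linkedin-style"],
  tools: [
    { id: "lint_against_style", sideEffect: "none", requiresApproval: false },
    { id: "request_publish_approval", sideEffect: "none", requiresApproval: false },
    // Assumed id for the Notion write; the real connection may name it differently.
    { id: "notion_create_page", sideEffect: "write", requiresApproval: true },
  ],
};

// Decide whether a tool call is allowed given the approvals granted so far.
function mayRun(spec: HarnessSpec, toolId: string, approved: Set<string>): boolean {
  const tool = spec.tools.find((t) => t.id === toolId);
  if (!tool) return false; // unknown tools are outside the harness
  return !tool.requiresApproval || approved.has(toolId);
}
```

With this shape, mayRun(contentHarness, "notion_create_page", new Set<string>()) is false until the tool id is added to the approved set.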

That is also where the enterprise questions start to show up.

What can the agent see? What can it do? Whose credentials does it use? Which actions require approval? What gets logged? What happens if the workflow pauses? What happens if OAuth is missing? What does a successful run look like? How do we test that behavior later?

Those questions are harder than “can the model draft a LinkedIn post?”

The Loop

The most interesting part of Eve for me was the managed loop.

A deterministic automation might look like this:

receive input
run step A
run step B
run step C
return output

An agent loop is different:

receive input
call the model
model requests a tool
run the tool
append the result
call the model again
maybe ask the human
maybe wait for OAuth
maybe call another tool
eventually return output

In code, the mental model is something like:

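// Mental model only: model(), runTool(), and the park helpers are placeholders, not Eve APIs.
let done = false;

while (!done) {
  const next = await model({
    history,
    instructions,
    tools,
    connections,
    state,
  });

  if (next.toolCall) {
    // Run the requested tool and feed the result into the next model call.
    const result = await runTool(next.toolCall);
    history.push(result);
    continue;
  }

  if (next.needsHumanInput) {
    // Park until the human answers, then record the answer so the model can see it.
    const reply = await parkUntilHumanResponds();
    history.push(reply);
    continue;
  }

  if (next.needsOAuth) {
    // Park until the user finishes the sign-in flow, then retry the step.
    await parkUntilAuthorizationCompletes();
    continue;
  }

  // No tool call and nothing to wait for: the model has produced its final answer.
  done = true;
}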

With Eve, I did not write that loop directly. I authored the capabilities around the loop. Eve handled the runtime behavior: session creation, step execution, tool results, parking, resuming, and streaming events back to Slack.
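
To make the shape of that loop concrete without any framework, here is a toy simulation that runs as written. The "model" is a scripted stub that lints once and then answers, and every name in it (LoopStep, scriptedModel, toolTable, runLoop) is hypothetical. It is not how Eve implements its runtime; it only shows the call, observe, call-again cycle and a step cap so the loop cannot run forever.

```typescript
// Toy simulation of a model-tool loop. Everything here is hypothetical.
type LoopStep =
  | { kind: "tool"; tool: string; input: string }
  | { kind: "final"; text: string };

type LoopEntry = { role: "user" | "tool" | "assistant"; content: string };

// Stub model: request the lint tool once, then produce a final answer.
function scriptedModel(log: LoopEntry[]): LoopStep {
  const linted = log.some((e) => e.role === "tool");
  if (!linted) {
    return { kind: "tool", tool: "lint_against_style", input: log[0]?.content ?? "" };
  }
  return { kind: "final", text: "Draft ready for review." };
}

const toolTable: Record<string, (input: string) => string> = {
  lint_against_style: (input) => (input.length > 0 ? "lint: ok" : "lint: empty draft"),
};

function runLoop(userMessage: string, maxSteps = 8): LoopEntry[] {
  const log: LoopEntry[] = [{ role: "user", content: userMessage }];
  for (let i = 0; i < maxSteps; i++) {
    const step = scriptedModel(log);
    if (step.kind === "final") {
      log.push({ role: "assistant", content: step.text });
      return log;
    }
    // Run the requested tool and append its output so the next call can see it.
    const tool = toolTable[step.tool];
    const output = tool ? tool(step.input) : `unknown tool: ${step.tool}`;
    log.push({ role: "tool", content: output });
  }
  throw new Error("loop did not terminate within maxSteps");
}

const transcript = runLoop("Draft a LinkedIn post about agent loops");
console.log(transcript.map((e) => `${e.role}: ${e.content}`).join("\n"));
```

A managed runtime adds the parts this toy leaves out: durable state between steps, parking, resuming, and streaming.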

That was the part that started to feel important from an enterprise architecture perspective.

If a company has many Slack or Teams-based agents, we probably do not want every team rebuilding this loop layer from scratch. We do not want every team solving durable state, OAuth pauses, approval rendering, event streams, tool call history, retries, and observability in a slightly different way.

We probably want the loop layer to be managed by a platform.

Any platform in this category has to be robust enough for us to shape the harness around it.

What Worked

The Slack path worked first.

I added a simple status command that bypassed the model and responded directly from the Slack channel handler. That helped prove the connector path before worrying about the model:

Slack -> Vercel Connect trigger -> /eve/v1/slack -> Eve channel -> Slack thread reply
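
The idea behind that status command is easy to sketch. The handler below is a hypothetical stand-in, not Eve's channel API: handleSlackText, SlackText, and the reply text are names I made up for illustration. It shows the short-circuit: a literal "status" message is answered by the channel code, and everything else is passed to the agent.

```typescript
// Hypothetical channel handler; the real Eve channel interface may differ.
type SlackText = { text: string };
type Handled = { handledBy: "channel" | "agent"; reply: string };

function handleSlackText(msg: SlackText, runAgent: (text: string) => string): Handled {
  // Answer "status" directly so the connector path can be tested without a model.
  if (msg.text.trim().toLowerCase() === "status") {
    return { handledBy: "channel", reply: "ok: connector and route are reachable" };
  }
  return { handledBy: "agent", reply: runAgent(msg.text) };
}
```

Because the model is passed in as a function, the same handler can be exercised with a stub while credits or credentials are still being set up.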

After setting up Vercel AI Gateway credits and choosing deepseek/deepseek-v4-pro, the model-backed path worked too. Eve could receive a Slack message, draft the LinkedIn post, call the lint tool, and reply in thread.

The publishing path was more interesting.

I added a custom request_publish_approval tool. The tool itself does not publish anything. It creates a human approval checkpoint before the Notion write. Slack renders that approval step, Eve parks the workflow, and once I approve, the workflow resumes and continues to Notion.

That made the workflow feel much more real.

There is a big difference between telling an agent, “ask before publishing,” and putting an approval-gated tool in the execution path. The first is an instruction. The second is part of the harness.
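
Here is a minimal sketch of what "approval in the execution path" can mean. It is not Eve's implementation; ApprovalGate and writeDraft are hypothetical names, and the write is simulated with a string. The point is that the write function refuses to run unless an approval record exists for the exact target and draft, regardless of what the instructions say.

```typescript
// Hypothetical approval gate, for illustration only.
type ApprovalRecord = { target: string; draft: string; approvedBy: string };

class ApprovalGate {
  private records = new Map<string, ApprovalRecord>();

  // Key approvals by the exact payload so an edited draft needs a new approval.
  private key(target: string, draft: string): string {
    return JSON.stringify([target, draft]);
  }

  approve(target: string, draft: string, approvedBy: string): void {
    this.records.set(this.key(target, draft), { target, draft, approvedBy });
  }

  isApproved(target: string, draft: string): boolean {
    return this.records.has(this.key(target, draft));
  }
}

// Simulated write: returns a status string instead of calling Notion.
function writeDraft(gate: ApprovalGate, target: string, draft: string): string {
  if (!gate.isApproved(target, draft)) {
    return "blocked: approval required";
  }
  return `written to ${target}`;
}
```

Keying the approval to the payload is a design choice I would want in any version of this: approving one draft should not silently approve a later edit.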

What Failed First

The first few failures were useful.

The model path was blocked until Gateway credits were enabled. I treated that as setup friction rather than a framework failure. The Slack connector worked. The deployed route worked. The model call was waiting on account setup.

The Notion path also surfaced a useful identity distinction. The project had a Notion connector configured, but the Slack user still needed to complete the per-user Notion authorization flow. That is exactly the kind of thing that shows up in real enterprise systems.

There is a difference between:

This application has a connector configured.

and:

This user has authorized this connector and can act through it.

That distinction matters for Salesforce, Jira, GitHub, ServiceNow, Snowflake, Databricks, and almost every other enterprise system you might connect to an agent.
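
The two states can be sketched as a two-step check. The types and the checkAccess function below are assumptions for illustration and are not Vercel Connect's API. They show that "connector configured" and "user authorized" are separate questions with separate failure modes.

```typescript
// Hypothetical model of connector state; not Vercel Connect's actual data model.
type ConnectorState = {
  configured: boolean;
  authorizedUsers: Set<string>;
};

type AccessDecision = "ok" | "connector_missing" | "user_authorization_required";

function checkAccess(connector: ConnectorState | undefined, userId: string): AccessDecision {
  // Application-level question: does this project have the connector at all?
  if (!connector || !connector.configured) return "connector_missing";
  // User-level question: has this specific person completed the OAuth flow?
  if (!connector.authorizedUsers.has(userId)) return "user_authorization_required";
  return "ok";
}
```

In my run, the Notion failure was the second case: the connector existed, but my Slack user had not yet authorized it.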

After I completed the per-user Notion authorization, Eve was able to search for the draft target, find the correct Notion database, request publish approval in Slack, and create the final page.

Evals Were More Useful Than I Expected

I also added a few Eve evals.

They were simple, but they caught real behavior gaps:

If the user asks for a LinkedIn draft, call the lint tool.
If the user says not to publish, do not request publish approval.
If the user asks to publish to Notion, park on the approval tool first.
If the surface is ambiguous, ask which surface to draft for.
If publishing, include the Notion target in the approval request.

This is another enterprise takeaway for me. If the agent loop is going to be managed, the evaluation layer still has to be ours. The platform can expose the event stream and test harness, but we have to define what good behavior looks like.

For this little content workflow, that meant testing for tool calls and approval behavior.

For an enterprise workflow, that might mean:

Did it call Salesforce for CRM requests?
Did it avoid Salesforce for unrelated requests?
Did it use read-only mode when required?
Did it ask for approval before writes?
Did it preserve the requesting user's identity?
Did it produce an audit trail?
Did it stop when the task was complete?

That is a different way of thinking about evals. I started grading the path through the loop alongside the final answer.
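
Grading the path can be as simple as asserting over a recorded trace. The sketch below assumes a trace is a flat list of events with a type and a name; that shape, the notion_create_page tool id, and the helper names are my own assumptions rather than Eve's eval format.

```typescript
// Hypothetical trace shape; a real event stream will carry more fields.
type TraceEvent = { type: "tool_call" | "parked" | "message"; name: string };

// True when the named tool was called at least once.
function called(trace: TraceEvent[], tool: string): boolean {
  return trace.some((e) => e.type === "tool_call" && e.name === tool);
}

// True when every call to `guarded` is preceded by at least one call to `gate`.
function gatedBy(trace: TraceEvent[], gate: string, guarded: string): boolean {
  let gateSeen = false;
  for (const e of trace) {
    if (e.type !== "tool_call") continue;
    if (e.name === gate) gateSeen = true;
    if (e.name === guarded && !gateSeen) return false;
  }
  return true;
}

const publishTrace: TraceEvent[] = [
  { type: "tool_call", name: "lint_against_style" },
  { type: "tool_call", name: "request_publish_approval" },
  { type: "parked", name: "approval" },
  { type: "tool_call", name: "notion_create_page" },
];

console.log(gatedBy(publishTrace, "request_publish_approval", "notion_create_page"));
```

The same two helpers cover the "do not publish" case too: the eval passes when called(trace, "request_publish_approval") is false.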

Where This Fits With Other Agent Patterns

This experiment also helped me separate a few concepts that are easy to blend together.

Claude Code feels like a generalist coding assistant inside a very rich harness. It can move through context, files, tools, and commands in a way that feels closer to an open-ended agent loop.

Claude Managed Agents are a different pattern. They are closer to hosted managed agents with persistent memory, a vault, MCP, subagents, and supervisor-style delegation.

LangGraph is useful when I want to explicitly design the graph. I can define nodes, edges, state transitions, routing logic, retries, and termination criteria.

Eve, at least through this template, feels like a framework for packaging one agentic workflow as a deployed app. It gives you the loop, the channel surface, the connection model, the approvals, and the eval surface. You still design the architecture.

If I wanted one Slack bot to route across three different agents, I would still need to build that supervisor pattern. Eve would not magically route across all of my deployed agents through Vercel AI Gateway. AI Gateway handles model access. Agent routing is part of the architecture I would design.

That is an important practical point.

Eve still leaves plenty of agent architecture to design. What it reduces is the amount of loop infrastructure I would have to build around that architecture.
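
To show what I mean by "designing the supervisor myself," here is a deliberately plain sketch. A real supervisor would more likely use a model to classify the request. The agent names, keywords, and routeMessage function are all hypothetical. What I want to illustrate is the fallback: when the request matches zero agents or more than one, the router asks the human instead of guessing.

```typescript
// Hypothetical keyword router; agent names and keywords are made up.
type RouteRule = { agent: string; keywords: string[] };

const routeRules: RouteRule[] = [
  { agent: "content-agent", keywords: ["draft", "post", "newsletter"] },
  { agent: "crm-agent", keywords: ["opportunity", "account", "pipeline"] },
];

function routeMessage(text: string, rules: RouteRule[], fallback = "ask-user"): string {
  const lower = text.toLowerCase();
  const matches = rules.filter((r) => r.keywords.some((k) => lower.includes(k)));
  // Route only when exactly one agent matches; otherwise hand back to the human.
  const [only] = matches;
  return matches.length === 1 && only ? only.agent : fallback;
}
```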

Connectors And Identity

The connector layer may be one of the most important pieces for enterprise use.

With a naive agent integration, it is very easy to end up with a shared API key or service account in an environment variable. That works for demos, but it gets uncomfortable quickly.

In Eve, the Notion connection used Vercel Connect. That means the agent can hit an OAuth boundary, park the workflow, ask the user to sign in, resume after authorization, and then call Notion with a token the model never sees.

That pattern opens up the more interesting enterprise version:

Human identity: who invoked the agent in Slack
Runtime identity: the Vercel deployment executing the workflow
Agent identity: the named workflow or bot performing the action
Downstream authority: either a user OAuth token or an integration user

For Salesforce, this could mean a few different designs.

The agent might call Salesforce through REST or SOQL using a user-scoped OAuth token. It might call a Salesforce MCP server. It might call an OpenAPI connection. It might call a custom tool we write. It might eventually call a Salesforce-native agent or Agentforce-style endpoint if that is exposed through an API.

The important part is that the auth model and capability boundary are explicit. I would not describe the agent as simply having Salesforce. I would describe a defined Salesforce capability, a known auth mode, a schema, a risk level, and a policy around approval.
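
Written down, a capability like that might look like the sketch below. The field names, the salesforce.update_opportunity id, and the approval rule (anything that is not read-only needs approval) are assumptions I chose for illustration, not a Salesforce or Eve schema.

```typescript
// Hypothetical capability descriptor, for illustration only.
type AuthMode = "user_oauth" | "integration_user";
type RiskLevel = "read" | "write" | "destructive";
type FieldType = "string" | "number" | "boolean";

type CapabilitySpec = {
  id: string;
  system: string;
  authMode: AuthMode;
  risk: RiskLevel;
  inputSchema: Record<string, FieldType>;
};

// Assumed policy: anything beyond read-only requires human approval.
function needsApproval(cap: CapabilitySpec): boolean {
  return cap.risk !== "read";
}

// Return a list of schema violations; an empty list means the input is acceptable.
function validateInput(cap: CapabilitySpec, input: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const field of Object.keys(cap.inputSchema)) {
    const expected = cap.inputSchema[field];
    if (typeof input[field] !== expected) {
      errors.push(`${field}: expected ${expected}`);
    }
  }
  return errors;
}

const updateOpportunity: CapabilitySpec = {
  id: "salesforce.update_opportunity",
  system: "salesforce",
  authMode: "user_oauth",
  risk: "write",
  inputSchema: { opportunityId: "string", amount: "number" },
};
```

A registry of entries like this is what I would want to hand to the harness, so the policy travels with the capability rather than living in a prompt.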

Nested Human Input

One of the more interesting questions that came out of this is what happens when a downstream agent needs human input.

Imagine a Slack-facing Eve agent delegates to a Salesforce agent. The Salesforce agent starts working, but then it needs the user to choose between two opportunities before it can continue.

That is a nested loop problem.

The Salesforce-side agent needs human input, but the human is sitting in Slack. Something has to translate that interruption across the boundary.

The contract might look like this:

{
  "status": "input_required",
  "prompt": "Which opportunity should I use?",
  "options": [
    { "id": "opp_1", "label": "Acme Renewal FY26" },
    { "id": "opp_2", "label": "Acme Expansion" }
  ],
  "resume_handle": "opaque-salesforce-run-token"
}

Eve would then render the question in Slack, park the outer workflow, collect the user’s answer, and send the response back to Salesforce with the resume handle.
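
The translation step on the outer side can be sketched in a few lines. The InputRequired type mirrors the JSON contract above; the ResumeMessage shape and both helper functions are hypothetical, since neither Eve nor Salesforce defines this contract as far as I know.

```typescript
// Mirrors the illustrative contract above; the resume shape is an assumption.
type InputRequired = {
  status: "input_required";
  prompt: string;
  options: { id: string; label: string }[];
  resume_handle: string;
};

type ResumeMessage = { resume_handle: string; selected_option_id: string };

// Render the downstream question as numbered plain text for the Slack thread.
function toSlackText(req: InputRequired): string {
  const lines = req.options.map((o, i) => `${i + 1}. ${o.label}`);
  return [req.prompt, ...lines].join("\n");
}

// Map the user's 1-based choice back to an option id and attach the resume handle.
function toResume(req: InputRequired, choice: number): ResumeMessage {
  const option = req.options[choice - 1];
  if (!option) throw new Error(`choice ${choice} is out of range`);
  return { resume_handle: req.resume_handle, selected_option_id: option.id };
}
```

The resume handle stays opaque to the outer agent. It is stored while the workflow is parked and passed back unchanged.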

That is probably a second post by itself, because it gets into interrupt propagation across agent loops. But it is the same basic lesson: once agents move into enterprise workflows, the hard parts are state, identity, permissions, interruptions, and auditability.

What I Am Taking Away

After building with Eve, I am thinking about enterprise agent platforms in a slightly different way.

There is still plenty of architecture to design. We still need to decide when to use deterministic graphs, when to use generalist loops, how to expose tools, how to design supervisors, how to evaluate behavior, and how to enforce permissions.

But I am more convinced that the loop layer is something many teams will not want to build over and over again.

The valuable platform layer is the one that can:

receive work from Slack or another modality
run a model-tool loop
checkpoint state
resolve user-scoped auth
pause for OAuth or human input
resume later
avoid duplicate side effects
emit useful traces
support evals against the actual behavior
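
Of the items above, "avoid duplicate side effects" is the one I find easiest to underestimate, because a parked workflow that resumes or retries can reach the same write twice. A common way to handle this is an idempotency key. The sketch below is a minimal in-memory version with hypothetical names; a real platform would persist the keys alongside the workflow state.

```typescript
// Hypothetical in-memory idempotency guard, for illustration only.
class IdempotentWriter {
  private completed = new Map<string, string>();
  public writes = 0;

  // Run `create` at most once per key; later calls return the stored result.
  write(key: string, create: () => string): string {
    const existing = this.completed.get(key);
    if (existing !== undefined) return existing;
    const pageId = create();
    this.writes += 1;
    this.completed.set(key, pageId);
    return pageId;
  }
}

const writer = new IdempotentWriter();
const firstId = writer.write("run-42:notion-create", () => "page_1");
// A retry after resume uses the same key, so no second page is created.
const retryId = writer.write("run-42:notion-create", () => "page_2");
console.log(firstId === retryId, writer.writes);
```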

Then the enterprise work is to build the harness around that loop:

tools
connectors
schemas
permissions
approvals
policies
subagents
evals
audit expectations

That is the part I would watch as these frameworks mature.

Before adopting one of these frameworks, I would still ask whether the agent can complete the task. I would also ask whether the platform gives us enough control to build the harness around the loop, inject our enterprise capability registry, and test the orchestration path safely.

That is what this Eve experiment clarified for me.
