
Building Multi-Agent Systems in Java Without Leaving the JVM

You don’t need Python to build multi-agent systems.

I know that’s a contrarian take. The entire AI agent ecosystem — CrewAI, AutoGen, LangGraph — is Python-first. But if your backend is Java, your CI is Gradle or Maven, and your team thinks in terms of interfaces and generics rather than decorators and duck typing, there’s no reason to introduce a second language just for agent orchestration.

This post is a hands-on walkthrough. We’ll build four progressively more sophisticated multi-agent systems, all in Java, all running on the JVM, using AgentEnsemble.

Add the dependency to your build.gradle.kts:

dependencies {
    implementation("net.agentensemble:agentensemble-core:2.3.0")
}

Or if you’re using Maven:

<dependency>
    <groupId>net.agentensemble</groupId>
    <artifactId>agentensemble-core</artifactId>
    <version>2.3.0</version>
</dependency>

You’ll also need a LangChain4j model provider. For OpenAI:

implementation("dev.langchain4j:langchain4j-open-ai:1.0.0-beta4")

The classic starting point: two agents executing sequentially, with the first agent's output feeding into the second.

// Create an LLM model
ChatLanguageModel model = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o-mini")
    .build();

// Define agents with roles and goals
Agent researcher = Agent.builder()
    .role("Senior Research Analyst")
    .goal("Find comprehensive, accurate information about {{topic}}")
    .background("Expert at synthesizing information from multiple sources")
    .build();

Agent writer = Agent.builder()
    .role("Technical Writer")
    .goal("Transform research into clear, engaging content")
    .background("Skilled at making complex topics accessible")
    .build();

// Define tasks -- the writer's task depends on the researcher's output
Task researchTask = Task.builder()
    .description("Research {{topic}} thoroughly, covering key concepts, "
        + "current state, and recent developments")
    .expectedOutput("Comprehensive research notes with sources")
    .agent(researcher)
    .build();

Task writeTask = Task.builder()
    .description("Write a well-structured article based on the research")
    .expectedOutput("A polished, publication-ready article")
    .agent(writer)
    .context(List.of(researchTask)) // <-- this creates the dependency
    .build();

// Build and run the ensemble
EnsembleOutput output = Ensemble.builder()
    .agents(researcher, writer)
    .tasks(researchTask, writeTask)
    .chatLanguageModel(model)
    .inputs(Map.of("topic", "WebAssembly beyond the browser"))
    .build()
    .run();

System.out.println(output.getRaw());

A few things to notice:

  • {{topic}} is a template variable, resolved at runtime from the inputs() map.
  • context(List.of(researchTask)) tells the framework that writeTask needs the output of researchTask. This is how dependencies are expressed.
  • No workflow declaration. The framework sees a linear dependency chain and infers sequential execution.
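Under the hood, template resolution is plain string substitution. Here is a minimal sketch of what resolving `{{topic}}` against the `inputs()` map could look like; this is my own illustration, not AgentEnsemble's actual implementation:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of {{variable}} resolution, not the framework's code:
// replace each {{name}} placeholder with the matching entry from the inputs map.
public class TemplateResolver {
    private static final Pattern VAR = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    static String resolve(String template, Map<String, String> inputs) {
        Matcher m = VAR.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = inputs.get(m.group(1));
            if (value == null) {
                // Fail fast rather than send an unresolved placeholder to the LLM
                throw new IllegalArgumentException("missing input: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve(
            "Find comprehensive, accurate information about {{topic}}",
            Map.of("topic", "WebAssembly beyond the browser")));
    }
}
```

Failing fast on a missing input is the design choice worth noting here: a silently unresolved `{{topic}}` would otherwise reach the model verbatim and quietly degrade results.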

Now let’s build something more interesting — a manager agent that delegates to specialist workers.

ChatLanguageModel managerModel = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o")
    .build();

ChatLanguageModel workerModel = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o-mini")
    .build();

Agent marketResearcher = Agent.builder()
    .role("Market Research Specialist")
    .goal("Analyze market trends and competitive landscape")
    .llm(workerModel)
    .build();

Agent financialAnalyst = Agent.builder()
    .role("Financial Analyst")
    .goal("Analyze financial data and provide investment insights")
    .llm(workerModel)
    .build();

Agent reportWriter = Agent.builder()
    .role("Report Writer")
    .goal("Compile findings into a comprehensive report")
    .llm(workerModel)
    .build();

// Note: no .agent() -- the manager decides who handles what
Task comprehensiveReport = Task.builder()
    .description("Create a comprehensive analysis of {{company}}")
    .expectedOutput("A detailed report covering market position, "
        + "financials, and strategic outlook")
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(marketResearcher, financialAnalyst, reportWriter)
    .tasks(comprehensiveReport)
    .chatLanguageModel(managerModel) // the manager's brain
    .workflow(Workflow.HIERARCHICAL)
    .inputs(Map.of("company", "Tesla"))
    .build()
    .run();

Key differences from the sequential pipeline:

  • No agent assigned to the task. The manager decides which worker handles what.
  • Different models for different roles. The manager gets gpt-4o for its coordination reasoning; workers get the cheaper gpt-4o-mini.
  • Workflow.HIERARCHICAL is explicit here because there’s a single unassigned task — the framework needs to know you want delegation, not just a single-agent execution.

You can also add constraints to the delegation:

HierarchicalConstraints constraints = HierarchicalConstraints.builder()
    .requiredWorkers(List.of("Market Research Specialist", "Financial Analyst"))
    .maxCallsPerWorker(2)
    .globalMaxDelegations(6)
    .build();

Ensemble.builder()
    // ... agents, tasks, model ...
    .workflow(Workflow.HIERARCHICAL)
    .hierarchicalConstraints(constraints)
    .build()
    .run();

This ensures the manager consults both specialists and doesn’t get stuck in a loop delegating endlessly to one agent.
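To make the constraint semantics concrete, here is a rough sketch of the bookkeeping such limits imply. The class and method names below are mine for illustration, not AgentEnsemble's API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of delegation-budget enforcement, not the
// framework's actual code: count calls per worker and overall, and
// reject delegations once either budget is exhausted.
public class DelegationBudget {
    private final Map<String, Integer> callsPerWorker = new HashMap<>();
    private final int maxCallsPerWorker;
    private final int globalMaxDelegations;
    private int totalDelegations = 0;

    public DelegationBudget(int maxCallsPerWorker, int globalMaxDelegations) {
        this.maxCallsPerWorker = maxCallsPerWorker;
        this.globalMaxDelegations = globalMaxDelegations;
    }

    // Returns true if the manager may delegate to this worker, recording the call.
    public boolean tryDelegate(String worker) {
        int used = callsPerWorker.getOrDefault(worker, 0);
        if (used >= maxCallsPerWorker || totalDelegations >= globalMaxDelegations) {
            return false; // budget exhausted -- pick another worker or finish up
        }
        callsPerWorker.put(worker, used + 1);
        totalDelegations++;
        return true;
    }
}
```

The per-worker cap is what breaks the "delegate to the same agent forever" loop; the global cap bounds total cost even when many workers are available.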

What if you have tasks that can run concurrently? Let’s build a competitive intelligence pipeline:

Agent marketAnalyst = Agent.builder()
    .role("Market Analyst")
    .goal("Analyze market positioning and trends")
    .build();

Agent financialAnalyst = Agent.builder()
    .role("Financial Analyst")
    .goal("Analyze financial performance and projections")
    .build();

Agent strategist = Agent.builder()
    .role("Strategy Consultant")
    .goal("Synthesize findings into strategic recommendations")
    .build();

// These two tasks are independent -- they can run in parallel
Task marketResearch = Task.builder()
    .description("Analyze market position of {{company}}")
    .expectedOutput("Market analysis report")
    .agent(marketAnalyst)
    .build();

Task financialAnalysis = Task.builder()
    .description("Analyze financial performance of {{company}}")
    .expectedOutput("Financial analysis report")
    .agent(financialAnalyst)
    .build();

// This task depends on BOTH -- it waits for them to finish
Task swotAnalysis = Task.builder()
    .description("Create a SWOT analysis based on market and financial findings")
    .expectedOutput("Complete SWOT analysis")
    .agent(strategist)
    .context(List.of(marketResearch, financialAnalysis)) // both must complete first
    .build();

// This depends on everything
Task executiveSummary = Task.builder()
    .description("Write an executive summary of all findings")
    .expectedOutput("One-page executive summary")
    .agent(strategist)
    .context(List.of(marketResearch, financialAnalysis, swotAnalysis))
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(marketAnalyst, financialAnalyst, strategist)
    .tasks(marketResearch, financialAnalysis, swotAnalysis, executiveSummary)
    .chatLanguageModel(model)
    .inputs(Map.of("company", "Nvidia"))
    .build()
    .run();

Again, no explicit workflow declaration. The framework sees that marketResearch and financialAnalysis have no dependencies on each other, so it runs them concurrently. swotAnalysis waits for both. executiveSummary waits for all three.

The dependency graph is a DAG (directed acyclic graph), and the framework does topological scheduling automatically.
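To see what topological scheduling means in practice, here is a standalone sketch (my own illustration, not the framework's scheduler) that groups tasks into waves: every task in a wave has all of its dependencies completed in earlier waves, so tasks within a wave can run concurrently.

```java
import java.util.*;

// Illustrative wave-based topological scheduling, not AgentEnsemble's
// actual scheduler. Input: task name -> set of dependency task names.
// Output: ordered waves; each wave's tasks can run in parallel.
public class TopoWaves {
    static List<List<String>> schedule(Map<String, Set<String>> deps) {
        List<List<String>> waves = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> remaining = new HashSet<>(deps.keySet());
        while (!remaining.isEmpty()) {
            List<String> wave = new ArrayList<>();
            for (String task : remaining) {
                if (done.containsAll(deps.get(task))) {
                    wave.add(task); // all dependencies already finished
                }
            }
            if (wave.isEmpty()) {
                throw new IllegalStateException("cycle in task dependencies");
            }
            Collections.sort(wave); // deterministic ordering for display
            done.addAll(wave);
            remaining.removeAll(wave);
            waves.add(wave);
        }
        return waves;
    }

    public static void main(String[] args) {
        // The competitive-intelligence pipeline above as a dependency map.
        Map<String, Set<String>> deps = Map.of(
            "marketResearch", Set.of(),
            "financialAnalysis", Set.of(),
            "swotAnalysis", Set.of("marketResearch", "financialAnalysis"),
            "executiveSummary", Set.of("marketResearch", "financialAnalysis", "swotAnalysis"));
        // Wave 1: financialAnalysis + marketResearch (run in parallel)
        // Wave 2: swotAnalysis
        // Wave 3: executiveSummary
        System.out.println(schedule(deps));
    }
}
```

Note the cycle check: a `context()` chain that loops back on itself has no valid execution order, so a scheduler must detect it rather than spin forever.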

If you want resilience, add an error strategy:

Ensemble.builder()
    // ...
    .workflow(Workflow.parallel()
        .errorStrategy(ParallelErrorStrategy.CONTINUE_ON_ERROR)
        .build())
    .build()
    .run();

Now if the financial analysis fails, the market research still completes, and downstream tasks get whatever results are available.
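The semantics are roughly "keep what succeeded, drop what failed." A simplified sketch of that behavior with plain java.util.concurrent (illustrative only, not the framework's internals):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.*;

// Illustrative sketch of CONTINUE_ON_ERROR semantics, not the framework's
// code: run independent tasks concurrently, keep whatever succeeds, and
// skip failures instead of aborting the whole run.
public class ContinueOnError {
    static Map<String, String> runAll(Map<String, Callable<String>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        Map<String, Future<String>> futures = new LinkedHashMap<>();
        tasks.forEach((name, work) -> futures.put(name, pool.submit(work)));

        Map<String, String> results = new LinkedHashMap<>();
        futures.forEach((name, future) -> {
            try {
                results.put(name, future.get());
            } catch (Exception e) {
                // Task failed: skip it. Downstream tasks see only the
                // results that actually made it into the map.
            }
        });
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> tasks = new LinkedHashMap<>();
        tasks.put("marketResearch", () -> "market report");
        tasks.put("financialAnalysis", () -> { throw new IllegalStateException("rate limited"); });
        System.out.println(runAll(tasks)); // only marketResearch survives
    }
}
```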

Raw text output is fine for articles, but many use cases need structured data. Java records make this clean:

record CompetitorProfile(
    String name,
    String marketPosition,
    List<String> strengths,
    List<String> weaknesses,
    double estimatedMarketShare
) {}

Task profileTask = Task.builder()
    .description("Create a detailed profile of {{competitor}}")
    .expectedOutput("A structured competitor profile")
    .agent(analyst)
    .outputType(CompetitorProfile.class)
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(analyst)
    .tasks(profileTask)
    .chatLanguageModel(model)
    .inputs(Map.of("competitor", "AMD"))
    .build()
    .run();

// Typed access -- no parsing, no casting
CompetitorProfile profile = output.getTaskOutputs().get(0)
    .getStructuredOutput(CompetitorProfile.class);

System.out.println(profile.name()); // "AMD"
System.out.println(profile.strengths()); // ["Strong GPU lineup", ...]
System.out.println(profile.estimatedMarketShare()); // 24.5

The framework instructs the LLM to return JSON conforming to the record’s schema, deserializes it, and hands you back a typed object. If the LLM’s output doesn’t parse correctly, the framework retries automatically (configurable via maxOutputRetries()).
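The retry loop itself is a simple pattern. Here is a hedged sketch of what maxOutputRetries-style behavior amounts to; the names are illustrative, not the framework's:

```java
import java.util.function.Supplier;

// Illustrative sketch of parse-with-retries, not AgentEnsemble's code:
// call the model, try to parse its output, and on a parse failure ask
// again, up to maxOutputRetries extra attempts.
public class OutputRetry {
    interface Parser<T> {
        T parse(String raw) throws Exception;
    }

    static <T> T parseWithRetries(Supplier<String> llmCall, Parser<T> parser, int maxOutputRetries) {
        Exception last = null;
        for (int attempt = 0; attempt <= maxOutputRetries; attempt++) {
            try {
                return parser.parse(llmCall.get());
            } catch (Exception e) {
                last = e; // malformed output -- retry with a fresh model call
            }
        }
        throw new IllegalStateException("output never parsed", last);
    }

    public static void main(String[] args) {
        // Simulated model: returns garbage once, then a valid value.
        String[] replies = {"not a number", "42"};
        int[] i = {0};
        int value = parseWithRetries(() -> replies[i[0]++], Integer::parseInt, 2);
        System.out.println(value); // 42
    }
}
```

In the real framework the parser step would be JSON deserialization into your record, but the control flow is the same: the retry budget bounds how many times a malformed response costs you another model call.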

All four examples share the same building blocks: Agent.builder(), Task.builder(), Ensemble.builder(). The same API produces a simple pipeline, a hierarchical delegation system, a parallel DAG, or a typed extraction pipeline.

And all of it runs on your existing JVM. Your existing Gradle build. Your existing CI pipeline. Your existing logging and monitoring infrastructure.

No Python sidecar. No REST wrapper. No new deployment topology.

This post covered the core building blocks. In upcoming posts in this series, we’ll dig into:

  • Production concerns: observability, error handling, cost tracking, rate limiting
  • Advanced patterns: MapReduce ensembles, dynamic agent creation, tool pipelines
  • Human-in-the-loop: review gates, approval workflows, pre-flight validation

AgentEnsemble is MIT-licensed and available on GitHub.