
Building Multi-Agent Systems in Java Without Leaving the JVM

You don’t need Python to build multi-agent systems.

I know that’s a contrarian take. The entire AI agent ecosystem — CrewAI, AutoGen, LangGraph — is Python-first. But if your backend is Java, your CI is Gradle or Maven, and your team thinks in terms of interfaces and generics rather than decorators and duck typing, there’s no reason to introduce a second language just for agent orchestration.

This post is a hands-on walkthrough. We’ll build four progressively more sophisticated multi-agent systems, all in Java, all running on the JVM, using AgentEnsemble.

Add the dependency to your build.gradle.kts:

dependencies {
    implementation("net.agentensemble:agentensemble-core:2.3.0")
}

Or if you’re using Maven:

<dependency>
    <groupId>net.agentensemble</groupId>
    <artifactId>agentensemble-core</artifactId>
    <version>2.3.0</version>
</dependency>

You’ll also need a LangChain4j model provider. For OpenAI:

implementation("dev.langchain4j:langchain4j-open-ai:1.0.0-beta4")

The classic starting point: two agents executing sequentially, with the first agent's output feeding into the second.

// Create an LLM model
ChatLanguageModel model = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o-mini")
    .build();

// Define agents with roles and goals
Agent researcher = Agent.builder()
    .role("Senior Research Analyst")
    .goal("Find comprehensive, accurate information about {{topic}}")
    .background("Expert at synthesizing information from multiple sources")
    .build();

Agent writer = Agent.builder()
    .role("Technical Writer")
    .goal("Transform research into clear, engaging content")
    .background("Skilled at making complex topics accessible")
    .build();

// Define tasks -- the writer's task depends on the researcher's output
Task researchTask = Task.builder()
    .description("Research {{topic}} thoroughly, covering key concepts, "
        + "current state, and recent developments")
    .expectedOutput("Comprehensive research notes with sources")
    .agent(researcher)
    .build();

Task writeTask = Task.builder()
    .description("Write a well-structured article based on the research")
    .expectedOutput("A polished, publication-ready article")
    .agent(writer)
    .context(List.of(researchTask)) // <-- this creates the dependency
    .build();

// Build and run the ensemble
EnsembleOutput output = Ensemble.builder()
    .agents(researcher, writer)
    .tasks(researchTask, writeTask)
    .chatLanguageModel(model)
    .inputs(Map.of("topic", "WebAssembly beyond the browser"))
    .build()
    .run();

System.out.println(output.getRaw());

A few things to notice:

  • {{topic}} is a template variable, resolved at runtime from the inputs() map.
  • context(List.of(researchTask)) tells the framework that writeTask needs the output of researchTask. This is how dependencies are expressed.
  • No workflow declaration. The framework sees a linear dependency chain and infers sequential execution.
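Under the hood, template resolution is plain string substitution. Here is a minimal sketch of what resolving `{{topic}}` against the `inputs()` map could look like; this is my own illustration, not AgentEnsemble's actual implementation:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of {{variable}} resolution, not the framework's code:
// replace each {{name}} placeholder with the matching entry from the inputs map.
public class TemplateResolver {
    private static final Pattern VAR = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    static String resolve(String template, Map<String, String> inputs) {
        Matcher m = VAR.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = inputs.get(m.group(1));
            if (value == null) {
                // Fail fast rather than send an unresolved placeholder to the LLM
                throw new IllegalArgumentException("missing input: " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve(
            "Find comprehensive, accurate information about {{topic}}",
            Map.of("topic", "WebAssembly beyond the browser")));
    }
}
```

Failing fast on a missing input is the design choice worth noting here: a silently unresolved `{{topic}}` would otherwise reach the model verbatim and quietly degrade results.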

Now let’s build something more interesting — a manager agent that delegates to specialist workers.

ChatLanguageModel managerModel = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o")
    .build();

ChatLanguageModel workerModel = OpenAiChatModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName("gpt-4o-mini")
    .build();

Agent marketResearcher = Agent.builder()
    .role("Market Research Specialist")
    .goal("Analyze market trends and competitive landscape")
    .llm(workerModel)
    .build();

Agent financialAnalyst = Agent.builder()
    .role("Financial Analyst")
    .goal("Analyze financial data and provide investment insights")
    .llm(workerModel)
    .build();

Agent reportWriter = Agent.builder()
    .role("Report Writer")
    .goal("Compile findings into a comprehensive report")
    .llm(workerModel)
    .build();

// Note: no .agent() -- the manager decides who handles what
Task comprehensiveReport = Task.builder()
    .description("Create a comprehensive analysis of {{company}}")
    .expectedOutput("A detailed report covering market position, "
        + "financials, and strategic outlook")
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(marketResearcher, financialAnalyst, reportWriter)
    .tasks(comprehensiveReport)
    .chatLanguageModel(managerModel) // the manager's brain
    .workflow(Workflow.HIERARCHICAL)
    .inputs(Map.of("company", "Tesla"))
    .build()
    .run();

Key differences from the sequential pipeline:

  • No agent assigned to the task. The manager decides which worker handles what.
  • Different models for different roles. The manager gets gpt-4o for its coordination reasoning; workers get the cheaper gpt-4o-mini.
  • Workflow.HIERARCHICAL is explicit here because there’s a single unassigned task — the framework needs to know you want delegation, not just a single-agent execution.

You can also add constraints to the delegation:

HierarchicalConstraints constraints = HierarchicalConstraints.builder()
    .requiredWorkers(List.of("Market Research Specialist", "Financial Analyst"))
    .maxCallsPerWorker(2)
    .globalMaxDelegations(6)
    .build();

Ensemble.builder()
    // ... agents, tasks, model ...
    .workflow(Workflow.HIERARCHICAL)
    .hierarchicalConstraints(constraints)
    .build()
    .run();

This ensures the manager consults both specialists and doesn’t get stuck in a loop delegating endlessly to one agent.
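To make the constraint semantics concrete, here is a rough sketch of the bookkeeping such limits imply. The class and method names below are mine for illustration, not AgentEnsemble's API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of delegation-budget enforcement, not the
// framework's actual code: count calls per worker and overall, and
// reject delegations once either budget is exhausted.
public class DelegationBudget {
    private final Map<String, Integer> callsPerWorker = new HashMap<>();
    private final int maxCallsPerWorker;
    private final int globalMaxDelegations;
    private int totalDelegations = 0;

    public DelegationBudget(int maxCallsPerWorker, int globalMaxDelegations) {
        this.maxCallsPerWorker = maxCallsPerWorker;
        this.globalMaxDelegations = globalMaxDelegations;
    }

    // Returns true if the manager may delegate to this worker, recording the call.
    public boolean tryDelegate(String worker) {
        int used = callsPerWorker.getOrDefault(worker, 0);
        if (used >= maxCallsPerWorker || totalDelegations >= globalMaxDelegations) {
            return false; // budget exhausted -- pick another worker or finish up
        }
        callsPerWorker.put(worker, used + 1);
        totalDelegations++;
        return true;
    }
}
```

The per-worker cap is what breaks the "delegate to the same agent forever" loop; the global cap bounds total cost even when many workers are available.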

What if you have tasks that can run concurrently? Let’s build a competitive intelligence pipeline:

Agent marketAnalyst = Agent.builder()
    .role("Market Analyst")
    .goal("Analyze market positioning and trends")
    .build();

Agent financialAnalyst = Agent.builder()
    .role("Financial Analyst")
    .goal("Analyze financial performance and projections")
    .build();

Agent strategist = Agent.builder()
    .role("Strategy Consultant")
    .goal("Synthesize findings into strategic recommendations")
    .build();

// These two tasks are independent -- they can run in parallel
Task marketResearch = Task.builder()
    .description("Analyze market position of {{company}}")
    .expectedOutput("Market analysis report")
    .agent(marketAnalyst)
    .build();

Task financialAnalysis = Task.builder()
    .description("Analyze financial performance of {{company}}")
    .expectedOutput("Financial analysis report")
    .agent(financialAnalyst)
    .build();

// This task depends on BOTH -- it waits for them to finish
Task swotAnalysis = Task.builder()
    .description("Create a SWOT analysis based on market and financial findings")
    .expectedOutput("Complete SWOT analysis")
    .agent(strategist)
    .context(List.of(marketResearch, financialAnalysis)) // both must complete first
    .build();

// This depends on everything
Task executiveSummary = Task.builder()
    .description("Write an executive summary of all findings")
    .expectedOutput("One-page executive summary")
    .agent(strategist)
    .context(List.of(marketResearch, financialAnalysis, swotAnalysis))
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(marketAnalyst, financialAnalyst, strategist)
    .tasks(marketResearch, financialAnalysis, swotAnalysis, executiveSummary)
    .chatLanguageModel(model)
    .inputs(Map.of("company", "Nvidia"))
    .build()
    .run();

Again, no explicit workflow declaration. The framework sees that marketResearch and financialAnalysis have no dependencies on each other, so it runs them concurrently. swotAnalysis waits for both. executiveSummary waits for all three.

The dependency graph is a DAG (directed acyclic graph), and the framework does topological scheduling automatically.
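To see what topological scheduling means in practice, here is a standalone sketch (my own illustration, not the framework's scheduler) that groups tasks into waves: every task in a wave has all of its dependencies completed in earlier waves, so tasks within a wave can run concurrently.

```java
import java.util.*;

// Illustrative wave-based topological scheduling, not AgentEnsemble's
// actual scheduler. Input: task name -> set of dependency task names.
// Output: ordered waves; each wave's tasks can run in parallel.
public class TopoWaves {
    static List<List<String>> schedule(Map<String, Set<String>> deps) {
        List<List<String>> waves = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> remaining = new HashSet<>(deps.keySet());
        while (!remaining.isEmpty()) {
            List<String> wave = new ArrayList<>();
            for (String task : remaining) {
                if (done.containsAll(deps.get(task))) {
                    wave.add(task); // all dependencies already finished
                }
            }
            if (wave.isEmpty()) {
                throw new IllegalStateException("cycle in task dependencies");
            }
            Collections.sort(wave); // deterministic ordering for display
            done.addAll(wave);
            remaining.removeAll(wave);
            waves.add(wave);
        }
        return waves;
    }

    public static void main(String[] args) {
        // The competitive-intelligence pipeline above as a dependency map.
        Map<String, Set<String>> deps = Map.of(
            "marketResearch", Set.of(),
            "financialAnalysis", Set.of(),
            "swotAnalysis", Set.of("marketResearch", "financialAnalysis"),
            "executiveSummary", Set.of("marketResearch", "financialAnalysis", "swotAnalysis"));
        // Wave 1: financialAnalysis + marketResearch (run in parallel)
        // Wave 2: swotAnalysis
        // Wave 3: executiveSummary
        System.out.println(schedule(deps));
    }
}
```

Note the cycle check: a `context()` chain that loops back on itself has no valid execution order, so a scheduler must detect it rather than spin forever.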

If you want resilience, add an error strategy:

Ensemble.builder()
    // ...
    .workflow(Workflow.parallel()
        .errorStrategy(ParallelErrorStrategy.CONTINUE_ON_ERROR)
        .build())
    .build()
    .run();

Now if the financial analysis fails, the market research still completes, and downstream tasks get whatever results are available.
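The semantics are roughly "keep what succeeded, drop what failed." A simplified sketch of that behavior with plain java.util.concurrent (illustrative only, not the framework's internals):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.*;

// Illustrative sketch of CONTINUE_ON_ERROR semantics, not the framework's
// code: run independent tasks concurrently, keep whatever succeeds, and
// skip failures instead of aborting the whole run.
public class ContinueOnError {
    static Map<String, String> runAll(Map<String, Callable<String>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        Map<String, Future<String>> futures = new LinkedHashMap<>();
        tasks.forEach((name, work) -> futures.put(name, pool.submit(work)));

        Map<String, String> results = new LinkedHashMap<>();
        futures.forEach((name, future) -> {
            try {
                results.put(name, future.get());
            } catch (Exception e) {
                // Task failed: skip it. Downstream tasks see only the
                // results that actually made it into the map.
            }
        });
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> tasks = new LinkedHashMap<>();
        tasks.put("marketResearch", () -> "market report");
        tasks.put("financialAnalysis", () -> { throw new IllegalStateException("rate limited"); });
        System.out.println(runAll(tasks)); // only marketResearch survives
    }
}
```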

Raw text output is fine for articles, but many use cases need structured data. Java records make this clean:

record CompetitorProfile(
    String name,
    String marketPosition,
    List<String> strengths,
    List<String> weaknesses,
    double estimatedMarketShare
) {}

Task profileTask = Task.builder()
    .description("Create a detailed profile of {{competitor}}")
    .expectedOutput("A structured competitor profile")
    .agent(analyst)
    .outputType(CompetitorProfile.class)
    .build();

EnsembleOutput output = Ensemble.builder()
    .agents(analyst)
    .tasks(profileTask)
    .chatLanguageModel(model)
    .inputs(Map.of("competitor", "AMD"))
    .build()
    .run();

// Typed access -- no parsing, no casting
CompetitorProfile profile = output.getTaskOutputs().get(0)
    .getStructuredOutput(CompetitorProfile.class);

System.out.println(profile.name()); // "AMD"
System.out.println(profile.strengths()); // ["Strong GPU lineup", ...]
System.out.println(profile.estimatedMarketShare()); // 24.5

The framework instructs the LLM to return JSON conforming to the record’s schema, deserializes it, and hands you back a typed object. If the LLM’s output doesn’t parse correctly, the framework retries automatically (configurable via maxOutputRetries()).
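The retry loop itself is a simple pattern. Here is a hedged sketch of what maxOutputRetries-style behavior amounts to; the names are illustrative, not the framework's:

```java
import java.util.function.Supplier;

// Illustrative sketch of parse-with-retries, not AgentEnsemble's code:
// call the model, try to parse its output, and on a parse failure ask
// again, up to maxOutputRetries extra attempts.
public class OutputRetry {
    interface Parser<T> {
        T parse(String raw) throws Exception;
    }

    static <T> T parseWithRetries(Supplier<String> llmCall, Parser<T> parser, int maxOutputRetries) {
        Exception last = null;
        for (int attempt = 0; attempt <= maxOutputRetries; attempt++) {
            try {
                return parser.parse(llmCall.get());
            } catch (Exception e) {
                last = e; // malformed output -- retry with a fresh model call
            }
        }
        throw new IllegalStateException("output never parsed", last);
    }

    public static void main(String[] args) {
        // Simulated model: returns garbage once, then a valid value.
        String[] replies = {"not a number", "42"};
        int[] i = {0};
        int value = parseWithRetries(() -> replies[i[0]++], Integer::parseInt, 2);
        System.out.println(value); // 42
    }
}
```

In the real framework the parser step would be JSON deserialization into your record, but the control flow is the same: the retry budget bounds how many times a malformed response costs you another model call.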

All four examples share the same building blocks: Agent.builder(), Task.builder(), Ensemble.builder(). The same API produces a simple pipeline, a hierarchical delegation system, a parallel DAG, or a typed extraction pipeline.

And all of it runs on your existing JVM. Your existing Gradle build. Your existing CI pipeline. Your existing logging and monitoring infrastructure.

No Python sidecar. No REST wrapper. No new deployment topology.

This post covered the core building blocks. In upcoming posts in this series, we’ll dig into:

  • Production concerns: observability, error handling, cost tracking, rate limiting
  • Advanced patterns: MapReduce ensembles, dynamic agent creation, tool pipelines
  • Human-in-the-loop: review gates, approval workflows, pre-flight validation

AgentEnsemble is MIT-licensed and available on GitHub.