
Your Java AI Agent Isn't Dumb. Your Context Is.

57% of organizations now run AI agents in production, and one in three cite quality as their top barrier. Most teams respond by switching models. The actual fix is usually one of these 5 context engineering mistakes, and I've made most of them myself.

Written by Kiryl Rusanau


I built my first LangChain4j agent thinking the hard part was picking the right model. Spoiler: the model was fine from day one. What kept breaking was everything the model didn't see, or saw wrong, or saw at the wrong moment.

Turns out this is common. According to LangChain's State of Agent Engineering report (2025), which surveyed 1,340 professionals, 57% of organizations have AI agents in production, and one in three cite quality as their top barrier. Most of those failures trace back not to which model you picked, but to how you structured what the model sees.

There's a name for this: Context Engineering.

  • Prompt Engineering is how you ask the model.
  • Context Engineering is what the model sees when you ask — what data, from where, in what format, at what moment.

For Java developers, this shows up in 5 recurring mistakes. I've made most of them myself.


Mistake 1: The giant system prompt

You know the pattern. System prompt grows over time. Team keeps adding requirements. Edge cases accumulate. Six months later you have a 4,000-token document that reads like a terms of service nobody asked for.

The model technically "sees" everything. But seeing 4,000 tokens of static context is not the same as processing relevant information at the right moment.

The fix is just-in-time loading. Don't predict what the agent needs. Give it ways to retrieve what it needs, when it needs it.

// Don't — static blob loaded upfront
String systemPrompt = loadEntireKnowledgeBase(); // 5000 tokens

// Do — retrieve dynamically at query time
RetrievalAugmentor augmentor = DefaultRetrievalAugmentor.builder()
    .contentRetriever(EmbeddingStoreContentRetriever.builder()
        .embeddingStore(vectorStore)
        .maxResults(5)
        .minScore(0.75)  // this line matters — see Mistake 2
        .build())
    .build();

Practitioners building production agents generally report that roughly 20% of the final context is static instructions, while 80% arrives dynamically based on what the user actually asked. Static prompts don't scale — retrieval does.
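To make the just-in-time idea concrete without any framework, here's a minimal plain-Java sketch. The class, the toy knowledge base, and the keyword matching are all illustrative assumptions, not anyone's API; a real system would use embeddings, as in the retriever above.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of just-in-time context loading: instead of concatenating the
// whole knowledge base into the prompt, look up only the entries relevant
// to the current question. Everything here is a toy stand-in.
public class JustInTimeContext {

    // Toy "knowledge base": topic -> snippet.
    private final Map<String, String> knowledgeBase = Map.of(
        "refunds",  "Refunds are processed within 5 business days.",
        "shipping", "Standard shipping takes 3-7 days.",
        "returns",  "Items can be returned within 30 days."
    );

    // Static approach: every snippet goes into the prompt, every time.
    public String staticContext() {
        return String.join("\n", knowledgeBase.values());
    }

    // Just-in-time approach: only snippets whose topic appears in the query.
    public List<String> retrieve(String query) {
        String q = query.toLowerCase();
        return knowledgeBase.entrySet().stream()
            .filter(e -> q.contains(e.getKey()))
            .map(Map.Entry::getValue)
            .collect(Collectors.toList());
    }
}
```

The point of the sketch: the static path always pays for all three snippets, while the retrieval path scales with relevance, not with the size of the knowledge base.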


Mistake 2: Missing minScore

This one is subtle and causes confident hallucinations.

Without minScore, your retriever returns the top-k results by similarity — even if none of them are actually relevant. Your agent gets slightly-wrong context, reasons from it confidently, and produces a wrong answer that sounds right.

In a banking environment this is not acceptable. A retriever returning UK compliance rules when the client asked about Polish regulations is a real problem.

EmbeddingStoreContentRetriever.builder()
    .embeddingStore(vectorStore)
    .maxResults(5)
    .minScore(0.75)   // nothing rather than irrelevant garbage
    .build()

0.75 is not a magic number. Tune it for your domain. But starting without it is a guaranteed path to context pollution.
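If it helps to see the mechanics, here's a toy illustration of what a minScore cutoff does inside a retriever. The `ScoredDoc` record and the scores are made up; the shape of the logic is the point: below-threshold results are dropped entirely, so the agent gets nothing rather than plausible-but-wrong context.

```java
import java.util.List;
import java.util.stream.Collectors;

// Toy version of top-k retrieval with a minimum similarity score.
public class MinScoreFilter {

    public record ScoredDoc(String text, double score) {}

    public static List<String> topK(List<ScoredDoc> candidates,
                                    int maxResults, double minScore) {
        return candidates.stream()
            .filter(d -> d.score() >= minScore)           // the minScore cut
            .sorted((a, b) -> Double.compare(b.score(), a.score()))
            .limit(maxResults)                            // then top-k
            .map(ScoredDoc::text)
            .collect(Collectors.toList());
    }
}
```

Without the filter line, the UK compliance document at 0.55 similarity would still make it into the context of a question about Polish regulations, simply because it was among the top k.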


Mistake 3: Shared ChatMemory across users

I see this in almost every first Java AI implementation.

// This is a data leakage bug
ChatMemory sharedMemory = MessageWindowChatMemory.withMaxMessages(20);

@AiService
interface SupportAgent {
    String chat(String question); // all users share one memory
}

User A's session bleeds into User B's context. At best, irrelevant responses. At worst, user data leaking between sessions.

The fix is one annotation:

@AiService
interface SupportAgent {
    String chat(@MemoryId String userId, @UserMessage String question);
}

LangChain4j then manages a separate ChatMemory instance per unique userId. Spring AI handles this with explicit conversation IDs passed to ChatMemory. Different API, same principle.

In any regulated industry, per-user memory scoping is not a nice-to-have. It's a compliance requirement.
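Conceptually, what @MemoryId buys you is one bounded message window per user, never shared. Here's a hedged plain-Java sketch of that idea; the class and method names are hypothetical, not LangChain4j internals.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of per-user memory scoping: a separate, bounded message window
// per user ID. Illustrative only.
public class PerUserMemory {

    private final int maxMessages;
    private final Map<String, Deque<String>> memories = new ConcurrentHashMap<>();

    public PerUserMemory(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    public void add(String userId, String message) {
        Deque<String> window = memories.computeIfAbsent(userId, id -> new ArrayDeque<>());
        window.addLast(message);
        if (window.size() > maxMessages) {
            window.removeFirst(); // evict oldest message, keep the window bounded
        }
    }

    public List<String> history(String userId) {
        return List.copyOf(memories.getOrDefault(userId, new ArrayDeque<>()));
    }
}
```

Two properties matter here, and both are what the shared-memory version breaks: no message ever crosses user boundaries, and each window is independently bounded.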


Mistake 4: Treating @Tool descriptions as Javadoc

The tool description is not documentation for humans. It is the context the model uses to decide which tool to call.

// The model will make poor choices with this
@Tool("Get customer data")
public Customer getCustomer(String id) { ... }

// Better
@Tool("Retrieve customer account details: balance, status, last 5 " +
      "transactions. Use when the user asks about their account or " +
      "recent activity. Returns null if customer ID not found.")
public Customer getCustomer(String id) { ... }

I rewrote every tool description in my agent after debugging a case where it consistently picked the wrong retrieval method. The model wasn't confused. It lacked context about when to use what.

Write tool descriptions like you're briefing a new team member who needs to know exactly when to call this function — and when not to.


Mistake 5: RAG and conversation history are not the same thing

They serve different purposes. Mixing them creates a mess.

  • Semantic memory (your RAG / vector store): the agent's library. Documentation, policies, domain knowledge. Content that doesn't change per user.
  • Episodic memory (conversation history): the agent's journal. What this specific user said, what decisions were made, the context of this session.

Querying conversation history through vector similarity is wrong. You need chronological recall, not semantic scoring. Loading all user history into the vector store pollutes domain knowledge with personal data.

// Two systems, two jobs
ContentRetriever documentRetriever =
    EmbeddingStoreContentRetriever.from(vectorStore); // semantic

ChatMemory sessionMemory = MessageWindowChatMemory.builder()
    .maxMessages(20)
    .id(userId)
    .build(); // episodic, scoped per user

RetrievalAugmentor augmentor = DefaultRetrievalAugmentor.builder()
    .contentRetriever(documentRetriever) // semantic only
    .build();

Spring AI maps to the same separation: QuestionAnswerAdvisor for semantic retrieval, MessageChatMemoryAdvisor for conversation history. Same two-layer design, different API.
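The two layers only meet at prompt-assembly time. Here's a rough sketch of that final step, assuming the frameworks above have already done their jobs: semantic results arrive ranked by relevance, episodic messages arrive in chronological order, and the assembler just concatenates. All names are illustrative.

```java
import java.util.List;

// Sketch of combining semantic and episodic context into one prompt.
public class PromptAssembler {

    public static String assemble(String systemInstructions,
                                  List<String> retrievedDocs,   // semantic: ranked by similarity
                                  List<String> recentMessages,  // episodic: chronological order
                                  String userQuestion) {
        StringBuilder prompt = new StringBuilder(systemInstructions).append("\n\n");
        if (!retrievedDocs.isEmpty()) {
            prompt.append("Relevant knowledge:\n");
            retrievedDocs.forEach(d -> prompt.append("- ").append(d).append("\n"));
        }
        if (!recentMessages.isEmpty()) {
            prompt.append("Conversation so far:\n");
            recentMessages.forEach(m -> prompt.append(m).append("\n"));
        }
        return prompt.append("User: ").append(userQuestion).toString();
    }
}
```

Notice what the sketch does not do: it never sorts conversation history by similarity, and it never stores conversation turns in the document list. Each layer keeps its own ordering principle all the way to the final string.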


The reframe

If you've been building with LangChain4j or Spring AI, you've been doing Context Engineering already. You just didn't call it that.

  • RAG pipeline → semantic memory design
  • ChatMemory → episodic memory management
  • Tool calling → context for agent decision-making

The difference between an agent that works and one that doesn't is rarely the model. It's whether the model sees the right information, at the right time, in the right format.

More than half of enterprises now run agents in production, and a third of them name quality as their top barrier. Most teams respond by switching models. The actual fix, in my experience, is usually one of the five things above.


What's the hardest context engineering problem you've run into building Java agents? I'm especially curious whether the tooling gap vs Python is actually a problem in practice, or whether LangChain4j is good enough for most production use cases.

Kiryl Rusanau

I help businesses running on Java adopt AI without rewriting what already works. 7+ years building enterprise Java systems in FinTech — I add AI capabilities that are safe, measurable, and maintainable.