A Dutch Court Rulings MCP Server: Powered by Solr, Not Vectors

For a new project, I had to dust off my Solr skills, and it turned into a great opportunity to see if I could build a vector-less AI chat application on a truly large dataset.

Last year, I downloaded all Dutch public court rulings since 1990 (3.3 million records, roughly 30 GB of text), originally for a RAG experiment.

The idea was simple: create embeddings for each document, store them in a vector database, and build a semantic search system on top. Classic RAG 101.

But generating embeddings for 30 GB of text took days and became quite expensive using hosted models.

The project stalled due to other obligations, but revisiting Solr gave me a chance to explore whether a search-only RAG setup could handle such a massive dataset.

Spoiler alert: it works great.


Back to Basics: Solr Still Delivers

I deployed a Solr Cloud instance on Kubernetes and indexed all public Rechtspraak.nl cases.

Indexing still took a few hours, but that is a fraction of the days that embedding generation required.

No token limits. No embedding costs. Just classic full-text indexing, filters, and facets: the things Solr has always been good at.
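To make "full-text indexing, filters, and facets" concrete, here is a minimal sketch of how such a request to Solr's standard select handler could be assembled. The host, the collection name (rulings), and the field names (body, court, year) are assumptions for illustration; the actual schema in the repository may differ.

```python
from urllib.parse import urlencode

# Hypothetical host and collection name; adjust to your own Solr setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/rulings/select"


def build_query_url(text, court=None, year=None, facet_field=None, rows=10):
    """Build a Solr select URL with optional filter queries and one facet.

    The field names "body", "court" and "year" are assumed, not taken
    from the real index schema.
    """
    params = [
        ("q", text),          # main full-text query
        ("df", "body"),       # default field to search in
        ("rows", str(rows)),  # number of documents to return
        ("wt", "json"),       # ask for a JSON response
    ]
    if court is not None:
        # Filter queries narrow the result set without affecting scoring.
        escaped = court.replace('"', '\\"')
        params.append(("fq", 'court:"' + escaped + '"'))
    if year is not None:
        params.append(("fq", "year:" + str(int(year))))
    if facet_field is not None:
        # Facets return counts per distinct value of the given field.
        params.append(("facet", "true"))
        params.append(("facet.field", facet_field))
    return SOLR_SELECT_URL + "?" + urlencode(params)


# A pure counting request: match everything, filter, return no documents.
count_url = build_query_url("*:*", court="Den Haag", year=2022, rows=0)
```

With rows=0, Solr returns no documents, only the total hit count (numFound) and any requested facets, which is all a "how many cases" question needs.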


The Challenge

How do you use that Solr index in an LLM-style chat interface, without creating embeddings, chunks, or a RAG pipeline?

I wanted to ask questions like:

  • How many cases were there in Den Haag in 2022?
  • How many robbery cases were tried in 2020?
  • What was the highest sentence for fraud in 2024?
  • Compare the number of civil cases between 2015 and 2020.

And also more interpretive ones:

  • What does the Deliveroo ruling mean for freelancers?
  • Why can a DGA (director and major shareholder) be seen as a freelancer under the DBA Act?
  • Why do limited companies receive higher tax fines than sole proprietors?

All answers should come only from official Rechtspraak.nl data: grounded, traceable, and verifiable.


Enter MCP: Model Context Protocol

MCP stands for Model Context Protocol.

Where traditional APIs expose resources for developers, MCP exposes tools that language models can use directly.

It is an emerging standard for connecting LLMs with external data sources and services such as databases, note apps, booking systems, or, in this case, a national legal archive.

I built an MCP server that connects my Solr index to tools like Claude Desktop and ChatGPT Desktop.

The model can use the MCP tools to query Solr in real time, retrieve relevant rulings, and generate grounded answers.
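For readers new to MCP, the sketch below shows the general shape of the exchange: MCP messages are JSON-RPC 2.0, a server advertises its tools in response to "tools/list", and the client invokes one with "tools/call". This is a simplified illustration in plain Python, not the code from the repository. The tool name search_rulings, its arguments, and the placeholder result are all hypothetical, and a real server would also handle the initialization handshake and a transport such as stdio, which an MCP SDK normally takes care of.

```python
import json

# Tool description in the shape returned from "tools/list".
# The name and the input schema are hypothetical.
SEARCH_TOOL = {
    "name": "search_rulings",
    "description": "Full-text search over court rulings.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "year": {"type": "integer"},
        },
        "required": ["query"],
    },
}


def search_rulings(query, year=None):
    """Placeholder for the real Solr call; returns a dummy hit."""
    return [{"id": "example-1", "query": query, "year": year}]


def handle_request(request):
    """Dispatch a single JSON-RPC request to the matching MCP method."""
    method = request.get("method")
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    if method == "tools/list":
        response["result"] = {"tools": [SEARCH_TOOL]}
    elif method == "tools/call":
        params = request.get("params", {})
        if params.get("name") != SEARCH_TOOL["name"]:
            response["error"] = {"code": -32602, "message": "Unknown tool"}
            return response
        hits = search_rulings(**params.get("arguments", {}))
        # Tool results are returned as a list of content items.
        response["result"] = {
            "content": [{"type": "text", "text": json.dumps(hits)}],
            "isError": False,
        }
    else:
        response["error"] = {"code": -32601, "message": "Method not found"}
    return response
```

The model never sees Solr directly. It only sees the tool description, decides when to call the tool, and receives the hits as text it can cite in its answer.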

On top of that, results that lend themselves to comparison, such as case counts per year, can be displayed as charts.
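As a rough idea of how chart data can come straight out of Solr: by default, Solr's JSON response encodes field facets as a flat list that alternates values and counts. The sketch below turns that list into label/value pairs that a charting client could plot. The field name year and the numbers in the sample are made up for illustration and are not real statistics.

```python
def facet_to_series(solr_response, field):
    """Convert Solr's flat facet list [value, count, value, count, ...]
    into a list of {"label": ..., "value": ...} points for charting."""
    flat = solr_response["facet_counts"]["facet_fields"][field]
    labels = flat[0::2]  # even positions hold the facet values
    counts = flat[1::2]  # odd positions hold the matching counts
    return [{"label": label, "value": count} for label, count in zip(labels, counts)]


# Illustrative response fragment with invented counts.
sample_response = {
    "facet_counts": {
        "facet_fields": {
            "year": ["2015", 10, "2020", 25],
        }
    }
}

series = facet_to_series(sample_response, "year")
```

Because the counts are computed by Solr, the model only has to present them; it does not have to count or estimate anything itself.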


Stack Overview

  • Search engine: Apache Solr Cloud
  • Dataset: All public Rechtspraak.nl rulings (1990 – present)
  • Interface: Custom MCP server exposing search, statistics, and similarity tools
  • Clients: Claude Desktop and ChatGPT Desktop
  • Embeddings: None required

Why This Works

  1. Speed – Solr answers in milliseconds, even across millions of documents.
  2. Cost – No embedding generation or vector storage.
  3. Precision – Legal data depends on exact matches and metadata filters.
  4. Transparency – Every query can be inspected and verified.
  5. Scalability – Solr Cloud scales horizontally by adding shards and replicas.

Sometimes Old Tech Beats New Hype

There is a lot of excitement around vector databases and embeddings, and they are great for unstructured data and semantic similarity search.

But for structured, citation-driven domains like law, the old tools still shine.

With MCP, we can combine that reliability with the conversational power of modern LLMs. Complex RAG pipelines are not always necessary.

Not every AI problem needs the newest stack.

Sometimes, pairing proven search technology with a lightweight MCP bridge gives you the best of both worlds: trustworthy retrieval and natural language interaction.

If you are working with large text corpora or open datasets, you do not always need vector embeddings.

You might discover that classic search engines like Solr, combined with MCP, are already more than enough.


Try It Out

For technical readers:

Repository available here → github.com/axyr/rechtspraak-solr-mcp-server

For non-technical readers:

If you would like to try it out, feel free to send me a message.
