Installing Neo4J Graph Database Self-Managed for Retrieval Augmented Generation (RAG)

In previous posts, we learned how to install PostgreSQL and then how to install pgvector (a vector search extension for PostgreSQL). The Book Search Archive (our toy app to show off Mindfire’s growing low-cost open-source AI stack) currently uses a PostgreSQL database, with vectors handled by pgvector, to do our semantic searches using the HNSW index built into pgvector.

The Woes of PostgreSQL and Graph Database Extensions

However, PostgreSQL is a standard relational database, and I really wanted to add graph database capabilities to our open-source stack. Lucky for me – or so I thought! – PostgreSQL has a graph database extension called Apache AGE. (GitHub repo here.) Apache AGE is the extension used for the “for pay” PuppyGraph database. So, I thought Apache AGE looked promising. I imagined having a single database with regular relational tables, vector fields and indexes for semantic search, and graph database capabilities. It sounded incredible.

What’s that old saying about if it sounds too good?

It turns out that Apache AGE is not truly compatible with Windows (as noted here and here and here). I also had PostgreSQL 16 loaded and, at the time, Apache AGE did not support version 16. However, checking their website today, it looks like maybe they do now.

I’ll revisit Apache AGE at some future point. The temptation of having relational, graph, and vector databases all wrapped up in one stable open-source database is just too much for me. But for now, I’ve decided to try out a simpler solution for a Windows-based graph database: Neo4J.

Introducing Neo4J: An Amazing Open-Source Graph Database

Neo4J is a graph database that also has vector search support. Haystack also supports Neo4J with a built-in integration. So, at a minimum, I’d like the Mindfire open-source AI stack to provide support for Neo4J even if we decide later to go with Apache AGE as default.

What is a graph database, you ask?

Unlike a regular relational database, where you have tables with rows, a graph database has nodes joined by named connections. Here is an example right off the Neo4J website:

image 1

As you can imagine, this will create a sprawling database of connections between ideas. Some kinds of data work much better organized as graphs rather than as tables. Plus, graph theory is basic to both computer science in general and AI in particular. So having graph database capabilities would be a real boon for many kinds of algorithms used in and out of Artificial Intelligence. (Find an introduction to graph theory here.)
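To make "nodes with named connections" concrete, here is a minimal sketch in plain Python. It does not use Neo4J at all. The `Node`, `Relationship`, and `Graph` names are hypothetical and exist only for this illustration, and the sample data is my own toy example rather than the one pictured above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str  # the kind of thing, e.g. "Person" or "Movie"
    name: str   # an identifying property


@dataclass(frozen=True)
class Relationship:
    start: Node
    kind: str   # the *named* connection, e.g. "ACTED_IN"
    end: Node


@dataclass
class Graph:
    relationships: list = field(default_factory=list)

    def connect(self, start, kind, end):
        """Record a directed, named connection between two nodes."""
        self.relationships.append(Relationship(start, kind, end))

    def neighbors(self, node, kind):
        """Nodes reachable from `node` over outgoing connections of the given kind."""
        return [r.end for r in self.relationships
                if r.start == node and r.kind == kind]


graph = Graph()
keanu = Node("Person", "Keanu Reeves")
matrix = Node("Movie", "The Matrix")
graph.connect(keanu, "ACTED_IN", matrix)
print(graph.neighbors(keanu, "ACTED_IN"))
```

In a relational database the same question ("which movies did this person act in?") would usually go through a join table. In a graph, the connection itself is a first-class piece of data.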

Installing Neo4J

The good news is that installing Neo4J is quite a bit easier than trying to get PostgreSQL + pgvector to work. The bad news is that despite some extensive documentation on how to install and use Neo4J, the official instructions still get a few things wrong. I’ll try to walk you through the correct way to avoid these mistakes.

Let’s start with the easy part. Browse to the Neo4J deployment center and find the version applicable for your operating system. For me (installing on Windows) it looked like this:

image 2

Be sure to select Community edition if you want the free open-source edition. For me, I selected Windows Executable and version 5.25.1. Strangely, version 4.4.39 has a later release date than version 5.25.1. My best guess is that 4.4 is a long-term support line that still receives patch releases.

This will take you to a download page with further instructions that, strangely, are only available on this download page. Scroll down a bit until you see something like this (for Windows):

image 3

Downloading Java JDK: Trying to Figure Out the Right Version

You may already have the Java JDK installed. If not, this is where the official instructions are a bit off. The official instructions gave me a link to download either OpenJDK or Oracle Java 8. I’m going open source, so I chose OpenJDK. This is the link they gave me:

https://openjdk.org/

You then have to read, somewhat painfully, through that page to find the right link. It is found here:

image 4

And here is the link they give you:

https://openjdk.org/projects/jdk/23/

From this page you again have to read carefully to figure out which link you need.

image 5

Here is the one I found:

https://jdk.java.net/23/

image 6

Why didn’t they just send me directly there?

Because guess what? They sent you to JDK 23, which is the wrong version for Neo4J! If you make the mistake of following their instructions and that cascade of links and then try to run Neo4J, it will work, but you’ll get this warning:

Please use Java(TM) 17 or Java(TM) 21 to run Neo4j.

Yup! You need to downgrade to version 21 (or 17) of the JDK to avoid this warning.

Also note that they sent us to a version of JDK without an installer. No biggie, I’ll give some instructions below on how to deal with that. But it would have been easier to have an installer.

Downloading the JDK: For Real This Time

Here are some better ways to find the right version of the JDK. Make sure you get version 17 or 21 (the two versions named in the warning) and avoid all the pain above.
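If you are not sure which JDK you already have, a small script can check it against the warning quoted earlier. This is only a sketch: the function names are my own, the supported set of 17 and 21 is taken from that warning message, and the parsing assumes the usual quoted version string on the first line of the `java -version` output.

```python
import re
import subprocess

SUPPORTED_MAJORS = {17, 21}  # taken from the Neo4j warning message


def java_major_version(version_output):
    """Extract the major version from `java -version` output, or None if not found."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: Java 8 and earlier report themselves as "1.8.0_xxx".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def is_supported(version_output):
    """True if the reported major version is one Neo4j says it wants."""
    return java_major_version(version_output) in SUPPORTED_MAJORS


def read_java_version():
    """Run `java -version`; the JDK writes this text to stderr, not stdout."""
    try:
        result = subprocess.run(["java", "-version"],
                                capture_output=True, text=True)
    except FileNotFoundError:
        return ""  # no java on the PATH at all
    return result.stderr


# Illustrative sample line in the format the JDK prints.
print(is_supported('openjdk version "23" 2024-09-17'))
```

To check your own machine, call `is_supported(read_java_version())`.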

Links to download JDK Version 21 – preferably with an installer:

Setting Up JAVA_HOME for Windows

If you didn’t use an installer (like me) then you will need to copy the downloaded JDK somewhere (I put it right off the root of the D drive) and then you’ll need to set the JAVA_HOME environment variable so that your Windows operating system knows where to find the JDK.

First run the “Edit the system environment variables” program:

image 7

You’ll see this screen below. Click “Environment Variables”:

image 8

You’ll then see this modal below. Click “New”:

image 9

Set JAVA_HOME to the folder where you put the JDK, like this:

image 10

You will need to open a new command prompt for the change to take effect, and in some cases you may have to reboot.
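A quick way to confirm the variable took effect is to check it from a freshly started process. The sketch below is one way to do that in Python. `check_java_home` is a hypothetical helper name, and it only assumes that a JDK folder contains a `bin` directory with the `java` launcher in it.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return (ok, message) describing whether JAVA_HOME points at a JDK.

    `env` defaults to the current process environment; passing a dict makes
    the function easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return False, "JAVA_HOME is not set"
    bin_dir = Path(java_home) / "bin"
    # The launcher is java.exe on Windows and plain java elsewhere.
    if any((bin_dir / name).is_file() for name in ("java.exe", "java")):
        return True, f"JAVA_HOME looks good: {java_home}"
    return False, f"No java executable found under {bin_dir}"


ok, message = check_java_home()
print(message)
```

If this reports that JAVA_HOME is not set even though you just set it, the script was probably started from a window that was opened before the change.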

Installing Neo4J

You previously (above) downloaded Neo4J. Now ‘install’ it by unzipping the neo4j-community-5.25.1-windows.zip file somewhere. I put it off the root of the D drive at this location:

D:\neo4j-community-5.25.1

Now (for Windows) open a command prompt (i.e. ‘cmd’), navigate to the bin folder of the Neo4J install directory, and run the command below. (I’m using my directory as an example. Replace D:\neo4j-community-5.25.1 with wherever you installed yours.)

D:\neo4j-community-5.25.1\bin>neo4j console

image 11

You’ll see something like this:

image 12

Alternatively, you can install it as a Windows service and then start it:

D:\neo4j-community-5.25.1\bin>neo4j windows-service install

D:\neo4j-community-5.25.1\bin>neo4j start

Running the Neo4J Console in the Browser

Now you are ready to run the Neo4J console by navigating in your browser to:

http://localhost:7474

You’ll see something like this:

image 13
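If the page does not load right away, the server may still be starting. Here is a small sketch that polls the port until it accepts connections. The helper names are my own, and the only assumption is that the console is served on localhost port 7474, as in the URL above.

```python
import socket
import time


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, attempts=30, delay=1.0):
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_port("localhost", 7474)` returns True as soon as something is listening there. It only tells you the port is open, not that Neo4J has finished starting.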

The default login is username neo4j with password neo4j. It will prompt you for a new password. I generated one at random. Be sure to save it or write it down.
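Once you have a password, you can also talk to the server from code. The sketch below builds, but does not send, a request for Neo4J's HTTP API using only the Python standard library. It rests on some assumptions: the HTTP API is reachable on the same port 7474, the default database is named neo4j, and the transactional endpoint is `/db/neo4j/tx/commit` as I understand it for version 5. The function name and the placeholder password are hypothetical.

```python
import base64
import json
import urllib.request


def build_cypher_request(statement, password, user="neo4j",
                         base_url="http://localhost:7474", database="neo4j"):
    """Build (but do not send) an HTTP request that runs one Cypher statement."""
    body = json.dumps({"statements": [{"statement": statement}]}).encode("utf-8")
    # HTTP Basic auth: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/db/{database}/tx/commit",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# "your-new-password" is a placeholder for the password you chose above.
request = build_cypher_request("RETURN 1 AS ok", password="your-new-password")
print(request.get_method(), request.full_url)

# To actually send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

For real application code, the official Neo4J Python driver is probably the better choice. This version is only meant to show what is going over the wire.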

Trying out Neo4J

There are two built-in tutorials that you can play with. I won’t go into all the detail but run this in the console at the top:

:play movie graph

image 14

The app will then walk you through the tutorial. The results are pretty amazing! For example, here is the graph playing “6 Degrees from Kevin Bacon” (well, actually 4 degrees, but who is counting?):

image 15

And here is the shortest path from Kevin Bacon to Meg Ryan:

image 16

Go try to do that in your mundane relational SQL database!
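For the curious, here is roughly what a shortest-path query has to do, sketched as a breadth-first search in plain Python. This is not how Neo4J implements it, and the credits below are a tiny hand-typed subset I picked for illustration, not data exported from the tutorial. All names in the code are my own.

```python
from collections import deque

# Hand-typed toy subset of actor/movie credits; not exported from Neo4j.
ACTED_IN = {
    "Kevin Bacon": ["Apollo 13", "A Few Good Men"],
    "Tom Hanks": ["Apollo 13", "Sleepless in Seattle"],
    "Meg Ryan": ["Sleepless in Seattle", "Top Gun"],
    "Tom Cruise": ["A Few Good Men", "Top Gun"],
}


def build_adjacency(acted_in):
    """Turn actor -> movies into an undirected adjacency map over both kinds of node."""
    adjacency = {}
    for actor, movies in acted_in.items():
        for movie in movies:
            adjacency.setdefault(actor, set()).add(movie)
            adjacency.setdefault(movie, set()).add(actor)
    return adjacency


def shortest_path(adjacency, start, goal):
    """Breadth-first search; returns one shortest path as a list of nodes, or None."""
    previous = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the back-pointers to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


adjacency = build_adjacency(ACTED_IN)
path = shortest_path(adjacency, "Kevin Bacon", "Meg Ryan")
print(" -> ".join(path))
```

Doing the same thing in SQL typically means a recursive query over a join table, which is exactly the kind of work a graph database is built to take off your hands.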

Conclusion

This post went over pretty much the same install instructions for Neo4J that you get from Neo4J themselves. I tried to add some value by correcting a few mistakes (e.g. sending you to the wrong version of the JDK) and clarifying a few rough spots in the instructions. I also went over a quick comparison of a relational SQL database vs a graph database and explained why you might want a graph database for an AI stack. I also did a quick comparison of Apache AGE vs Neo4J and explained why, for Windows at least, I’m currently going with Neo4J in our stack. (Though I hope to revisit Apache AGE in the future.)

Finally, here is a collection of helpful links for getting started with Neo4J.

Links
