Self Hosted Jupyter Notebook for Neptune

Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets. Whether you’re creating a new graph data model and queries, or exploring an existing graph dataset, it can be useful to have an interactive query environment that allows you to visualize the results.

One way to achieve this is to use Jupyter notebooks. For steps to use an Amazon SageMaker hosted Jupyter notebook, refer to Analyze Amazon Neptune Graphs using Amazon SageMaker Jupyter Notebooks. If you want to deploy Jupyter notebooks yourself, on premises or in any other environment, you can follow the steps in this post. This option also gives you the flexibility to customize the notebook and its configuration to suit your business needs.

We can use Docker containers to deploy a self-hosted Jupyter notebook. You can deploy the notebook using Amazon Elastic Container Service (Amazon ECS) as described in this post, or you can deploy the notebook using Amazon Elastic Kubernetes Service (Amazon EKS) or any other host like an Amazon Elastic Compute Cloud (Amazon EC2) instance or on-premises server using similar steps.

You can only create a Neptune DB cluster in Amazon Virtual Private Cloud (Amazon VPC), and its endpoints are only accessible within that VPC. Therefore, if you’re deploying this Jupyter notebook outside the VPC of your Neptune cluster, you also need to establish connectivity via SSH tunnelling or a proxy (like Application Load Balancer, Network Load Balancer, or Amazon API Gateway), which is outside the scope of this post.
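Before deploying, it can help to confirm that the machine that will host the notebook can reach the cluster at all. The following is a minimal sketch, assuming `curl` is installed, that IAM authentication is not enabled on the cluster, and that the cluster answers on its HTTPS `/status` endpoint. The endpoint name is the placeholder used elsewhere in this post, and the helper function names are hypothetical.

```shell
#!/bin/sh
# Placeholder endpoint and default port; override them in your environment.
NEPTUNE_HOST="${NEPTUNE_HOST:-neptune.cluster-XXXXXXX.us-east-1.neptune.amazonaws.com}"
NEPTUNE_PORT="${NEPTUNE_PORT:-8182}"

# Hypothetical helper: build the URL of the Neptune status endpoint.
neptune_status_url() {
  printf 'https://%s:%s/status\n' "$NEPTUNE_HOST" "$NEPTUNE_PORT"
}

# Hypothetical helper: returns 0 if the endpoint answers successfully
# within 5 seconds, and a non-zero status otherwise.
check_neptune() {
  curl --silent --fail --max-time 5 "$(neptune_status_url)" > /dev/null
}
```

For example, running `check_neptune && echo "reachable" || echo "not reachable"` on the host gives a quick yes or no. A failure here usually points to networking (VPC, security groups, tunnel, or proxy) rather than to the notebook itself.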

Before we begin with the walkthrough, let’s understand the environment variables that we’re using:

GRAPH_NOTEBOOK_AUTH_MODE – This variable indicates the authentication mode. Possible values include DEFAULT and IAM. For this post, we use DEFAULT.
GRAPH_NOTEBOOK_HOST – This variable is the cluster endpoint of your Neptune cluster.
GRAPH_NOTEBOOK_PORT – This variable is the port of your Neptune cluster. For our post, we use 8182.
NEPTUNE_LOAD_FROM_S3_ROLE_ARN – This variable is needed if we plan to load data from Amazon Simple Storage Service (Amazon S3) into our Neptune cluster. For more information, refer to Prerequisites: IAM Role and Amazon S3 Access. For our post, we leave it blank.
AWS_REGION – This variable indicates the Region where our Neptune cluster resides. For our post, we use us-east-1.
NOTEBOOK_PORT – This variable indicates the notebook’s port that we use while accessing the hosted Jupyter notebook. For our post, we use 8888.
LAB_PORT – This variable indicates the port of JupyterLab. For our post, we leave it blank.
GRAPH_NOTEBOOK_SSL – This variable indicates whether to use SSL when communicating with Neptune. Neptune now enforces SSL connections to your database; you can disable SSL only in Regions, such as US East (N. Virginia) or Europe (London), where both SSL and non-SSL connections are supported. For our post, we use True.
NOTEBOOK_PASSWORD – This variable indicates the password that we use to access the hosted Jupyter notebook. If you’re hosting the notebook on an EC2 instance and leave this variable blank, the image falls back to the default value (the EC2 instance ID).
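The variables above can also be collected in a file instead of being passed one by one. The following is a sketch only: the file name `graph_notebook.env` and the password value are assumptions, and the endpoint is a placeholder. It relies on the standard `--env-file` option of `docker run`, which reads `NAME=value` lines literally (no quotes around values).

```shell
#!/bin/sh
# File name is an assumption; pick any path you like.
ENV_FILE="${ENV_FILE:-graph_notebook.env}"

# One NAME=value pair per line. A name followed by "=" and nothing else
# is passed to the container as an empty value.
cat > "$ENV_FILE" <<'EOF'
GRAPH_NOTEBOOK_AUTH_MODE=DEFAULT
GRAPH_NOTEBOOK_HOST=neptune.cluster-XXXXXXX.us-east-1.neptune.amazonaws.com
GRAPH_NOTEBOOK_PORT=8182
NEPTUNE_LOAD_FROM_S3_ROLE_ARN=
AWS_REGION=us-east-1
NOTEBOOK_PORT=8888
LAB_PORT=
GRAPH_NOTEBOOK_SSL=True
NOTEBOOK_PASSWORD=change-me
EOF

# The file contains a password, so restrict it to the current user.
chmod 600 "$ENV_FILE"
```

Once the image is built, the container could then be started with `docker run --env-file graph_notebook.env -p 8888:8888 -d graph_notebook:latest`. Keeping the settings in a file keeps the password out of your shell history.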

About the Docker image

The container image available in this repository uses an Amazon Linux container image as the base image and adds Anaconda, Node.js, Conda, and Jupyter Notebook. It also includes the Jupyter notebook libraries for integration with Apache TinkerPop and RDF SPARQL. A Jupyter notebook environment is created automatically from the configuration you provide.

Build and deploy this image with the following code:

docker build -t graph_notebook .

docker run \
--env GRAPH_NOTEBOOK_AUTH_MODE="DEFAULT" \
--env GRAPH_NOTEBOOK_HOST="neptune.cluster-XXXXXXX.us-east-1.neptune.amazonaws.com" \
--env GRAPH_NOTEBOOK_PORT="8182" \
--env NEPTUNE_LOAD_FROM_S3_ROLE_ARN="" \
--env AWS_REGION="us-east-1" \
--env NOTEBOOK_PORT="8888" \
--env LAB_PORT="" \
--env GRAPH_NOTEBOOK_SSL="True" \
--env NOTEBOOK_PASSWORD="mypassword@123" \
-p 8888:8888 \
-d graph_notebook:latest
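Before opening the browser, you may want to confirm that the container started and that the notebook is answering. The following is a rough sketch, assuming `docker` and `curl` are available on the host and that the notebook is published on port 8888 as above. All three function names are hypothetical helpers, not part of the image.

```shell
#!/bin/sh
# Hypothetical helper: list running containers started from the image built above.
show_notebook_container() {
  docker ps --filter "ancestor=graph_notebook:latest" \
    --format '{{.ID}}  {{.Status}}  {{.Ports}}'
}

# Hypothetical helper: print the notebook URL for a host and port
# (defaults: localhost and the NOTEBOOK_PORT used in this post).
notebook_url() {
  printf 'http://%s:%s\n' "${1:-localhost}" "${2:-8888}"
}

# Hypothetical helper: poll the notebook URL until it returns any HTTP
# response (a redirect to the login page counts) or 30 attempts are used up.
wait_for_notebook() {
  url="$(notebook_url "$@")"
  attempts=0
  while [ "$attempts" -lt 30 ]; do
    if curl --silent --output /dev/null --max-time 2 "$url"; then
      echo "Notebook is answering at $url"
      return 0
    fi
    attempts=$((attempts + 1))
    sleep 2
  done
  echo "Notebook did not answer at $url" >&2
  return 1
}
```

Run `show_notebook_container` to see the container ID and port mapping, then `wait_for_notebook` to block until the notebook responds. If the container is missing from the list, `docker logs` with the container ID is usually the quickest way to see why it exited.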

Browse to the URL of your machine (http://<your-machine-address>:<NOTEBOOK_PORT>).
Log in using the password provided for variable NOTEBOOK_PASSWORD.
In the Jupyter window, open the Neptune directory, and then the Getting-Started directory.

Now you can load data into the database, query it, and visualize the results using this Jupyter notebook.

