Bayer-Group / COLID-Setup

The setup repository is part of the Corporate Linked Data Catalog (COLID) application. It helps set up a local environment based on Docker Compose.

Home Page: https://bayer-group.github.io/COLID-Documentation/

Easy setup of the whole environment

This setup repository is part of the Corporate Linked Data Catalog (COLID) application. An introduction to the application, a description of all its functions, and the complete guide are available in the COLID documentation at https://bayer-group.github.io/COLID-Documentation/.

This repository helps set up a local environment based on Docker Compose.

Installation instructions

  1. Install Docker Desktop for Windows from Docker Hub (last tested with Docker Desktop 4.21.1)
  2. Clone this repository locally
    git clone --recursive [URL to this Git repo]
  3. Pull all changes in all submodules
    git pull --recurse-submodules
  4. Create a file named .env in the same directory as docker-compose.yml and add the following variables (example values are shown):
    MESSAGEQUEUE_COOKIE=SWQOKODSQALRPCLNMEQG
    MESSAGEQUEUE_USERNAME=guest
    MESSAGEQUEUE_PASSWORD=guest
    GRAPHDATABASE_USERNAME=admin
    GRAPHDATABASE_PASSWORD=admin
    RELATIONAL_DATABASE_ROOT_PASSWORD=dbadminpass
    RELATIONAL_DATABASE_USERNAME=dbuser
    RELATIONAL_DATABASE_PASSWORD=dbpass
    MINIO_ACCESS_KEY=minio
    MINIO_SECRET_KEY=minio123
    MINIO_BUCKET_NAME=colid-files
    SMTP_USERNAME=any
    SMTP_PASSWORD=any
    
  5. Run docker-compose up to download and build all Docker images and start the environment
  6. Wait until docker-compose has finished starting all containers
  7. Open the COLID Data Marketplace frontend (see URL below). Go to the profile menu in the upper right corner and click on "Administration". Open the Metadata Graph Configuration sub-menu page and click the "Start reindex" button in the upper right corner.
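
Before running docker-compose up, it can help to confirm that the .env file from step 4 defines every variable. The following is a minimal sketch and not part of the repository: the function name check_env_file is hypothetical, and the variable list is copied from the example values above.

```shell
#!/bin/sh
# Hypothetical helper (not part of COLID-Setup): checks that a .env file
# defines every variable listed in step 4 with a non-empty value.

REQUIRED_VARS="MESSAGEQUEUE_COOKIE MESSAGEQUEUE_USERNAME MESSAGEQUEUE_PASSWORD
GRAPHDATABASE_USERNAME GRAPHDATABASE_PASSWORD
RELATIONAL_DATABASE_ROOT_PASSWORD RELATIONAL_DATABASE_USERNAME RELATIONAL_DATABASE_PASSWORD
MINIO_ACCESS_KEY MINIO_SECRET_KEY MINIO_BUCKET_NAME
SMTP_USERNAME SMTP_PASSWORD"

check_env_file() {
    env_file="$1"
    if [ ! -f "$env_file" ]; then
        echo "file not found: $env_file" >&2
        return 2
    fi
    missing=0
    for name in $REQUIRED_VARS; do
        # "Defined" means a line that starts with NAME= followed by at least one character.
        if ! grep -q "^${name}=." "$env_file"; then
            echo "missing or empty: $name" >&2
            missing=1
        fi
    done
    # Returns 0 when every variable is present, 1 otherwise.
    return "$missing"
}
```

For example, check_env_file .env && docker-compose up starts the environment only when the check passes.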

Known problems

  • While building the frontend, the following error can occur. The Dockerfiles of the frontend applications run node with an increased heap size during the build (node --max_old_space_size=8000). If the error occurs, try increasing this value.

    FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
    
  • After starting the application a second time, the fuseki database may throw exceptions. In that case, delete the Docker container of the fuseki database with docker container rm fuseki. ATTENTION: This removes all data you have created and reloads the database with the initial data.

  • fuseki-loader/loader.sh may contain carriage return (CR) characters; remove them.
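
One way to remove the carriage return characters mentioned in the last item is sketched below. The function name strip_cr is hypothetical. It deletes every CR byte in the file, which is fine for a shell script but would be too aggressive for binary files.

```shell
#!/bin/sh
# Hypothetical helper: delete all carriage return (CR) characters from a file in place.
# Usage: strip_cr fuseki-loader/loader.sh
strip_cr() {
    file="$1"
    tmp="$(mktemp)" || return 1
    # Write the cleaned content to a temporary file first, then copy it back with
    # cat so that the original file keeps its permissions (e.g. the executable bit).
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

After running strip_cr fuseki-loader/loader.sh, rebuild the affected image so that the cleaned script is picked up.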

Application URLs

| Component | URL for Docker environment | URL for local environment | Username | Password |
|---|---|---|---|---|
| COLID frontend | http://localhost:4200/ | http://localhost:4201/ | - | - |
| Data Marketplace frontend | http://localhost:4300/ | http://localhost:4301/ | - | - |
| COLID API Swagger documentation | http://localhost:51770/swagger | http://localhost:51771/swagger | - | - |
| COLID Indexing Crawler Service API Swagger documentation | http://localhost:51780/swagger | http://localhost:51781/swagger | - | - |
| COLID Search Service API Swagger documentation | http://localhost:51800/swagger | http://localhost:51801/swagger | - | - |
| COLID AppData Service API Swagger documentation | http://localhost:51810/swagger | http://localhost:51811/swagger | - | - |
| COLID Scheduler Service Hangfire | http://localhost:51820/hangfire | http://localhost:51821/hangfire | - | - |
| COLID Reporting Service API Swagger documentation | http://localhost:51910/swagger | http://localhost:51911/swagger | - | - |
| Apache Jena Fuseki Database Webinterface | http://localhost:3030/ | - | admin | admin |
| RabbitMQ Webinterface | http://localhost:15672/ | - | guest | guest |
| KGE-Editor-Frontend | http://localhost:4400/ | http://localhost:4400/ | - | - |
| KGE-Web-Service | http://localhost:8080/ | http://localhost:8080/ | - | - |
| Resource Relationship Manager-Service | http://localhost:51830/ | http://localhost:51831/ | - | - |
| Resource Relationship Manager-Frontend | http://localhost:7000/ | http://localhost:7000/ | - | - |
| COLID API Carrot2 Service | http://localhost:4305/ | http://localhost:4305/ | - | - |
| Opensearch Dashboard | http://localhost:5601/ | - | admin | admin |
| Minio Browser | http://localhost:9001/ | - | minio | minio123 |
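
To check from the command line whether one of the URLs above already responds, a small polling loop can be used. This is a sketch under assumptions: the function name wait_for_url and its default of 30 attempts with a 5 second delay are hypothetical, and curl must be installed.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-5}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: treat HTTP errors (status >= 400) as failure, -s: silent,
        # -o /dev/null: discard the response body.
        if curl -fs -o /dev/null "$url"; then
            echo "up: $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not reachable after $attempts attempts: $url" >&2
    return 1
}
```

For example, wait_for_url http://localhost:4300/ blocks until the Data Marketplace frontend of the Docker environment answers, or gives up after the configured number of attempts.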

Quick Tips

Some quick tips for working faster.

Docker

To purge all stopped containers, unused networks, unused images (dangling or not), and the build cache, run the following command. Unused volumes are only removed if you also pass the --volumes flag:

docker system prune -a

To remove all stopped containers (add -f to also remove running ones):

docker container rm $(docker container ls -aq)

Opensearch & Kibana

  • After the first start, some indices and aliases need to be created.
  • Open http://localhost:5601, go to Dev Tools in the left panel, then enter and run the following commands:
    PUT dmp-resource-1970-01-01_00.00.00
    
    PUT dmp-metadata-1970-01-01_00.00.00
    {
        "mappings": {
            "enabled": false 
        }
    }
    
    POST /_aliases
    {
        "actions" : [
            { "add" : { "index" : "dmp-resource-1970-01-01_00.00.00", "aliases" : ["dmp-search-resource", "dmp-update-resource"] } },
            { "add" : { "index" : "dmp-metadata-1970-01-01_00.00.00", "aliases" : ["dmp-search-metadata", "dmp-update-metadata"] } }
        ]
    }
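
The same three requests can also be sent from a shell instead of Dev Tools. The sketch below rests on assumptions that this README does not confirm: it expects the OpenSearch REST API to be reachable at http://localhost:9200 (the OpenSearch default port) without TLS or authentication, and the names OPENSEARCH_URL, aliases_body, and create_indices are hypothetical. Adjust the URL and add credentials as your setup requires.

```shell
#!/bin/sh
# ASSUMPTION: the REST endpoint below is not stated in this README.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Print the request body for POST /_aliases for a given index suffix.
aliases_body() {
    suffix="$1"
    cat <<EOF
{
  "actions": [
    { "add": { "index": "dmp-resource-${suffix}", "aliases": ["dmp-search-resource", "dmp-update-resource"] } },
    { "add": { "index": "dmp-metadata-${suffix}", "aliases": ["dmp-search-metadata", "dmp-update-metadata"] } }
  ]
}
EOF
}

# Create both indices and attach the aliases; stops at the first failing request.
create_indices() {
    suffix="$1"
    curl -fsS -X PUT "${OPENSEARCH_URL}/dmp-resource-${suffix}" &&
    curl -fsS -X PUT "${OPENSEARCH_URL}/dmp-metadata-${suffix}" \
        -H 'Content-Type: application/json' \
        -d '{"mappings": {"enabled": false}}' &&
    aliases_body "$suffix" | curl -fsS -X POST "${OPENSEARCH_URL}/_aliases" \
        -H 'Content-Type: application/json' \
        -d @-
}
```

Calling create_indices 1970-01-01_00.00.00 issues the same requests as the Dev Tools commands above.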
    

App-wide customization for URL Domain (Optional)

On the Semantic Web, URIs identify not just Web documents but also real-world objects like people and cars, and even abstract ideas and non-existent things like a mythical unicorn. We call these real-world objects or things. COLID uses bayer.com as the default domain in each of its URIs, because the project was conceived for Bayer AG. For example: https://pid.bayer.com/kos/19050/hasLabel

However, you can configure a custom domain for the URIs if needed. To do so, update all triples in the triplestore and all references to the URIs to use the custom domain before building the Docker containers. Several files across the projects need to be changed from bayer.com to the custom domain, for example: https://pid.orange.com/kos/19050/hasLabel. Details are listed below.

| File | Project | Variable | Comments |
|---|---|---|---|
| loader.sh | fuseki-staging | baseUrl | Change baseUrl (example.com) as needed in the shell script before uploading triples. |
| appsettings.json | AppData Service | ServiceUrl, HttpServiceUrl | Change both variables to your custom domain: "ServiceUrl": "https://pid.example.com/", "HttpServiceUrl": "http://pid.example.com/" |
| appsettings.json | Indexing Crawler Service | ServiceUrl, HttpServiceUrl | Change both variables to your custom domain: "ServiceUrl": "https://pid.example.com/", "HttpServiceUrl": "http://pid.example.com/" |
| appsettings.json | Registration Service | ServiceUrl, HttpServiceUrl | Change both variables to your custom domain: "ServiceUrl": "https://pid.example.com/", "HttpServiceUrl": "http://pid.example.com/" |
| appsettings.json | Reporting Service | ServiceUrl, HttpServiceUrl | Change both variables to your custom domain: "ServiceUrl": "https://pid.example.com/", "HttpServiceUrl": "http://pid.example.com/" |
| appsettings.json | Search Service | ServiceUrl, HttpServiceUrl | Change both variables to your custom domain: "ServiceUrl": "https://pid.example.com/", "HttpServiceUrl": "http://pid.example.com/" |
| appsettings.json | Scheduler Service | ServiceUrl, HttpServiceUrl | Change both variables to your custom domain: "ServiceUrl": "https://pid.example.com/", "HttpServiceUrl": "http://pid.example.com/" |
| appsettings.json | Resource Relationship Manager Backend Service | ServiceUrl, HttpServiceUrl | Change both variables to your custom domain: "ServiceUrl": "https://pid.example.com/", "HttpServiceUrl": "http://pid.example.com/" |
| environment.ts, environment.docker.ts | Editor Frontend | baseUrl, PidUriTemplate.baseUrl | Change baseUrl (example.com) in both sections to your custom domain. |
| environment.ts, environment.docker.ts | Data Marketplace Frontend | baseUrl | Change baseUrl (example.com) to your custom domain. |
| environment.ts, environment.docker.ts | Resource Relationship Manager Frontend | baseUrl | Change baseUrl (example.com) to your custom domain. |
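
After editing the files listed above, it can be useful to search the checkout for references to the old domain that were missed. The following is a minimal sketch: the function name find_domain_refs is hypothetical, and it only lists files without changing them.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory that still mention a domain.
# Usage: find_domain_refs DIRECTORY [DOMAIN]
find_domain_refs() {
    dir="$1"
    domain="${2:-bayer.com}"
    # -r: recurse into subdirectories, -l: print only the names of matching files,
    # -F: treat the domain as a literal string (so "." is not a wildcard).
    grep -rlF -- "$domain" "$dir"
}
```

For example, find_domain_refs . prints every file in the current directory tree that still contains bayer.com. The exit status is non-zero when no file matches.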

COLID: Carrot2 clustering service

Carrot2 is an open-source engine for clustering text. It can automatically discover groups of related documents and label them with short key terms or phrases. Publish a few resources in your local COLID setup, and you can then view the clusters in the Data Marketplace. Refer to the link below for more details.

Minio and S3

The repository contains a local S3 bucket image for Minio. To use certain features, such as exporting to and importing from Excel, follow the steps below:

  • Make sure the Minio container is running
  • Browse to http://localhost:9001
  • Create the bucket 'colid-files'
  • Now you can use the Export and Import functionality in the Data Marketplace

Links

License:BSD 3-Clause "New" or "Revised" License

