rubenkruiper / irec

Text-to-KG tool for learning a Compliance Checking lexicon, published at LDAC 2023

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Intelligent Regulatory Compliancy (iReC)

Scripts and data to reproduce some of the work done for the iReC project, a collaboration between Northumbria University (NU) and Heriot-Watt University (HWU) that was funded by NU and the Building Research Establishment (BRE). More details on the work in this repository can been found in the paper describing the code and results in this repository, as well as our position paper on Automated Compliance Checking.


How to get started

Download and install the free version of the Anaconda package manager for your system. If needed, there are many tutorials online on how to get started with Anaconda and Jupyter Notebook; see this one for example.

After installing anaconda, open a terminal/console window (mac/linux) or Anaconda prompt (Windows) and verify your installation by running: conda -V

The terminal should return the version of Anaconda that is now installed on your system. Next run conda install -c anaconda git -y

Navigate to the directory on your computer where you'd like to create a folder with the code for the iReC project, e.g., some specific folder for coding projects or simply cd ~/Documents/ Then clone this repository git clone https://github.com/rubenkruiper/irec.git Sign in to your GitHub account if prompted. Navigate into the new folder cd irec

  1. Create a separate iReC environment that runs python 3.9:
  • conda create --name irec python=3.9 -y
  • conda activate irec
  1. Install dependencies:
  • pip install -r requirements.txt
  • conda install -c conda-forge pdftotext -y
  1. Start a notebook, which will open up a browser window:
  • jupyter notebook
  • Inside the browser window double-click on one of the notebooks to start it. Numbering indicates the order in which notebooks should be run, e.g., start with 1. Term Extraction.ipynb.
  • Run the cells from top to bottom, either by selecting Cell > Run All in the drop-down menu or pressing Shift+Enter on the top cell and working your way down.

Figure: Example output of running Louvain community detection on a network representation of the term KG: alt text

If you have any questions, please let me know!


Visualize the graph in GraphDB

Download Ontotext GraphDB, the free version will do!

  1. Running GraphDB should open a tab in your preferred browser with the tool's interface.
  2. Go to Setup > Repositories, and create a new repository
  • use graphDB free,
  • enter some name for the repository at Repository ID, e.g., 'IReC'
  • all standard settings should be fine
  1. Go to Import > RDF
  • In the very top right of your screen you can select which repository you are working on, select which repository you would like to load the data into.
  • click Upload RDF files and select the "intial_graph.ttl" file that is created in the folder "data/graph_output"
  • click om 'import', all standard settings should be fine (assuming you are using an empty repository that doesn't have any data in the default graph, otherwise replace the data in the default graph -- but that shouldn't be necessary)
  1. Once imported, go to Explore > Graphs overview, and click on the default graph.
  2. Select any of the nodes to get a detailed view of its relations, you'll need to select a node to start the visual graph I think.
  3. You can find the "visual graph" button in the top right, double-clicking on a node expands its edges.

Again, let me know if you have any questions!

Figure: Example visualisation of KG terms using GraphDB: alt text



If you use any of our work in your research, please consider citing one or both of our papers:

@inproceedings{Kruiper2023-LDAC_irec,
    title = "Taking stock: a Linked Data inventory of Compliance Checking terms derived from Building Regulations",
    author = "Kruiper, Ruben  and
      Konstas, Ioannis  and
      Gray, Alasdair J.G.  and
      Sadeghineko, Farhad  and
      Watson, Richard  and
      Kumar, Bimal",
    month = jun,
    year = "2023",
    address = "Matera, Italy",
    booktitle = "11th Linked Data in Architecture and Construction (LDAC) Workshop",
    url = "https://linkedbuildingdata.net/ldac2023/files/papers/papers/LDAC2023_paper_4692.pdf"
}
@inproceedings{Kruiper2023-LDAC_position,
    title = "Don’t Shoehorn, but Link Compliance Checking Data",
    author = "Kruiper, Ruben  and
      Konstas, Ioannis  and
      Gray, Alasdair J.G.  and
      Sadeghineko, Farhad  and
      Watson, Richard  and
      Kumar, Bimal",
    month = jun,
    year = "2023",
    address = "Matera, Italy",
    booktitle = "11th Linked Data in Architecture and Construction (LDAC) Workshop",
    url = "https://linkedbuildingdata.net/ldac2023/files/papers/papers/LDAC2023_paper_2636.pdf"
}

The code in this repository is licensed under a Creative Commons Attribution 4.0 License.

About

Text-to-KG tool for learning a Compliance Checking lexicon, published at LDAC 2023


Languages

Language:HTML 76.5%Language:Jupyter Notebook 22.0%Language:Python 1.5%