yago-mendoza / suskind-knowledge-graph

Graph-based NLP framework leveraging a curated database and an intuitive CLI for advanced, context-rich language understanding.

Suskind Knowledge Graph

This documentation presents a graph-based framework for NLP, inspired by graph neural networks (GNNs) and graph transformer models. Entities are stored as nodes in a dynamic graph and linked by contextual relevance, which exposes relationships and context that linear text analysis does not capture. The underlying database was built from three years of detailed data collection. The framework includes command-line interfaces (CLIs) for database management and supports customizable graph search algorithms for flexible data exploration. Together, these pieces combine graph-based analysis with algorithm customization for a range of analytical needs.
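As a rough illustration of entities stored as nodes and linked by contextual relevance, the sketch below builds a tiny undirected graph in plain Python. The class name `ToyGraph`, its methods, and the sample words are hypothetical and chosen only for this example; they are not taken from the project's code.

```python
from collections import defaultdict


class ToyGraph:
    """Hypothetical undirected graph: entities are nodes, edges are contextual links."""

    def __init__(self):
        # Maps each node to the set of nodes it is linked to.
        self._adj = defaultdict(set)

    def link(self, a, b):
        """Connect two entities in both directions."""
        self._adj[a].add(b)
        self._adj[b].add(a)

    def neighbors(self, node):
        """Return the linked entities in sorted order (empty list if unknown)."""
        return sorted(self._adj.get(node, set()))


g = ToyGraph()
g.link("perfume", "scent")
g.link("perfume", "flower")
g.link("scent", "memory")

print(g.neighbors("perfume"))  # ['flower', 'scent']
```

An adjacency-set layout like this keeps neighbor lookups cheap, which is the operation graph search algorithms rely on most.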

Documentation

The following documentation covers every component of the system in detail, especially those related to interaction through the CLIs.

Roadmap

  • Database Preparation

    • Cleaning Initial Node Database
  • SK Components Design

    • Design of Node Class
    • Design of NodeSet
    • Design of Graph
  • Implementation of Search Algorithms

    • Design of Centrality Algorithm
    • Development of Density Search Algorithm
    • Optional Design of Shortest Path Algorithm
  • Organizing All Files into a Single Directory

  • Development of Main Command Line Interface (CLI)

    • Implementation of Basic Commands (cd, ls, etc.)
    • Development of Helper Functions
    • Creation of User Interfaces for Specific Functions
    • Design of Auxiliary CLIs
      • LS_Interface
      • VG_Interface
      • NW_Interface
      • GB_Interface
    • Integration of Centrality, Density Search, and Shortest Path Algorithms
    • Edge-case testing
  • AI for data processing

    • Study existing heuristic methods applicable to network analysis.

    • Create a test corpus aligned with the principles of "atomicity" and "proximity".

    • Large Language Model (LLM) Integration

      • Research and select an appropriate LLM for the task.
      • Design the display to rate the success (granularity + contextual placement).
      • Testing and Validation
        • Implement the heuristic model on a subset of data.
        • Monitor performance and adjust prompting to maximize success rate.
  • CLI Integration

    • Design a pipeline that connects generative AI capabilities to a gateway in the CLI.
    • Test success rate with updated tooling.
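The roadmap names centrality and shortest-path algorithms without specifying them. As a minimal sketch under stated assumptions, the code below uses degree centrality (one of several possible centrality measures) and breadth-first search over an unweighted graph. The function names, the adjacency-dictionary layout, and the sample data are all hypothetical and may differ from what the project implements.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(adj)
    if n <= 1:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def shortest_path(adj, start, goal):
    """Breadth-first search; returns a shortest list of nodes, or None."""
    if start not in adj or goal not in adj:
        return None
    previous = {start: None}  # node -> node it was reached from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the predecessor chain back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nbr in sorted(adj[current]):
            if nbr not in previous:
                previous[nbr] = current
                queue.append(nbr)
    return None


# Hypothetical sample graph (undirected, stored in both directions).
adj = {
    "perfume": {"scent", "flower"},
    "scent": {"perfume", "memory"},
    "flower": {"perfume"},
    "memory": {"scent"},
}

print(degree_centrality(adj))
print(shortest_path(adj, "flower", "memory"))
# ['flower', 'perfume', 'scent', 'memory']
```

Breadth-first search is enough here because the sample edges carry no weights; a weighted notion of contextual relevance would call for a different algorithm such as Dijkstra's.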

License

MIT

Languages

Language:Python 100.0%