mallika2011 / WikiCats


WikiCats: Leveraging Wikipedia Category Information to Enrich Content in Similar Articles

Directory Structure:

.
├── data
├── documents
├── LICENSE
├── README.md
├── scripts
├── src
└── Union_Territories

Descriptions of each file in the repository are available in ./FileDescriptions.md

Data Description:

  • The project focuses on the Union Territories subdomain. The Wikipedia category tree is used to explore the graph properties of its nodes and the relationships between them.

  • There are a total of 19625 articles and 1969 categories under the Union Territories category.

  • To represent each category by its knowledge graph embedding, the WikiData embeddings from PyTorch BigGraph were used. Each category embedding has 200 dimensions.
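
The data description above can be illustrated with a minimal sketch. It is not taken from the repository's code: the toy tree, the function names `collect` and `cosine`, and the choice of cosine similarity as a way to compare embeddings are all assumptions made for illustration. It walks a small category tree breadth-first to count the categories and articles beneath a root, and compares two category vectors of the stated size (200 dimensions).

```python
from collections import deque
import math

# Hypothetical toy category tree (NOT the real data):
# category -> (list of subcategories, list of articles).
TOY_TREE = {
    "Union Territories": (["Delhi", "Puducherry"], ["Union territory"]),
    "Delhi": (["Delhi Metro"], ["New Delhi", "Red Fort"]),
    "Puducherry": ([], ["Pondicherry", "Karaikal"]),
    "Delhi Metro": ([], ["Red Fort"]),  # an article can sit under several categories
}


def collect(tree, root):
    """Breadth-first walk from `root`; returns (categories, articles) as sets."""
    seen_categories = {root}
    articles = set()
    queue = deque([root])
    while queue:
        category = queue.popleft()
        subcategories, members = tree.get(category, ([], []))
        articles.update(members)  # a set, so shared articles are counted once
        for sub in subcategories:
            if sub not in seen_categories:  # guard against cycles in the graph
                seen_categories.add(sub)
                queue.append(sub)
    return seen_categories, articles


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed comparison metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


categories, articles = collect(TOY_TREE, "Union Territories")
print(len(categories), "categories,", len(articles), "articles")

# Placeholder 200-dimensional vectors standing in for category embeddings.
emb_a = [1.0] * 200
emb_b = [1.0] * 100 + [0.0] * 100
print(round(cosine(emb_a, emb_b), 4))
```

Articles are collected in a set because the same article may appear under more than one category, so a plain running count would overstate the total. In practice the placeholder vectors would be replaced by the embeddings looked up for each category.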

Contributors:


License: MIT License


Languages: Python 97.5%, Shell 2.5%