khyatigarg1 / test_company

{\rtf1\ansi\ansicpg1252\cocoartf2580
\cocoatextscaling0\cocoaplatform0{\fonttbl\f0\fswiss\fcharset0 Helvetica;\f1\froman\fcharset0 Times-Roman;}
{\colortbl;\red255\green255\blue255;\red0\green0\blue0;}
{\*\expandedcolortbl;;\cssrgb\c0\c0\c0;}
\paperw11900\paperh16840\margl1440\margr1440\vieww11520\viewh8400\viewkind0
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0

\f0\fs24 \cf0 Title: 
\f1 \cf2 \expnd0\expndtw0\kerning0
\outl0\strokewidth0 \strokec2 CompanyName2Vec: Company Entity Matching Based on Job Ads\
Project: CEM\
Author: Ran Ziv\
Date: September 2021\
\
Contains:\
\'97\'97\'97\'97\'97\'97\
- /code/ - directory with an archive of the CEM project\
   - The project contains the following:\
      - This Readme file\
      - jobAdsProcessing - contains fingerprinting and job ads corpus processing jobs\
      - model - contains the method building blocks\
         - model_emb.py - primary executable; includes the environment settings. Loads the input data, generates the model, and calls the evaluation process\
         - evaluate_emb.py - contains the evaluation process. Calls index_emb.py\
         - index_emb.py - builds an index for evaluation purposes\
         - datasetFiltering.py - prepares the job ads corpus and saves it to the input directory. Includes filtering capabilities for testing purposes\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
\cf2 \expnd0\expndtw0\kerning0
\outl0\strokewidth0       - utils - contains utilities such as a t-test calculator, a fuzzy distance test function, and data export/import functions. \expnd0\expndtw0\kerning0
\outl0\strokewidth0 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
\cf2 - /Data/cem-input-export.zip - compressed directory with input directories and files for the CEM project\
- \expnd0\expndtw0\kerning0
\outl0\strokewidth0 /Data/cem-output-export.zip - compressed directory with output directories and files for the CEM project\expnd0\expndtw0\kerning0
\outl0\strokewidth0 \
\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
\cf2 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
\cf2 Instructions:\
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
\cf2 \expnd0\expndtw0\kerning0
\outl0\strokewidth0 \'97\'97\'97\'97\'97\'97\expnd0\expndtw0\kerning0
\outl0\strokewidth0 \
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\pardirnatural\partightenfactor0
\cf2 - Download and extract files from both Code and Data directories\
- Make sure Python 3.6.8 or a compatible version is installed, along with all required packages\
- Edit environment variables in model_emb.py\
- Execute model_emb.py\
- Results will be printed to stdout\
}
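
The first two instructions (extracting the data archives and checking the Python version) can be sketched as a small helper script. This is only an illustration under stated assumptions: the archive names are taken from the list above, but the target directory names `input` and `output`, and the helper names `check_python` and `extract_archives`, are hypothetical and not part of the CEM project. The layout that model_emb.py actually expects should be taken from its environment settings.

```python
import sys
import zipfile
from pathlib import Path

# Archive names come from this README; the target directory names
# ("input", "output") are assumptions made for illustration only.
ARCHIVES = {
    "Data/cem-input-export.zip": "input",
    "Data/cem-output-export.zip": "output",
}


def check_python(minimum=(3, 6)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= minimum


def extract_archives(root, archives=ARCHIVES):
    """Extract every archive found under `root`; return the target directories."""
    root = Path(root)
    extracted = []
    for archive, target in archives.items():
        archive_path = root / archive
        if not archive_path.is_file():
            # Report and skip archives that have not been downloaded yet.
            print(f"missing archive: {archive_path}")
            continue
        target_dir = root / target
        target_dir.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(target_dir)
        extracted.append(target_dir)
    return extracted
```

Calling `check_python()` and then `extract_archives(".")` from the directory that holds `Data/` would unpack whichever archives are present. Editing the settings in model_emb.py and running it remain manual steps.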
