DrakenWan / Rekrut

Chrome extension that will scrape a linkedin profile.

Update 2023

I will be archiving this repo in a few hours. Most of the scraping code is quite old and no longer useful. The server-side scripting can still be used if someone wants to integrate the Yale3 repository with it.

UPDATE 2021

A simplified version of Rekrut has been released: see Yale3. That extension doesn't need a server for communication and has no Rekrut login or database storage features. It works entirely client-side and just prints raw data in a sidebar that appears when you click the extension icon on a LinkedIn profile page.

     

I haven't made core changes to this extension for months, so the extraction tool may not work as intended. If you feel like making appropriate changes, you can enable developer mode, see the changes made to the LinkedIn document model, and use the navtress for reference.

Rekrut

Rekrut is a Chrome extension that automatically scrapes a LinkedIn profile. Company recruiters can use it to extract important details from a profile and store them in their recruitment database.
Current version is 3.0

Installation or Setting Up

You need to ensure that you have the following dependencies installed within your system:

  • node.js : Ensure this is in path.
  • mongodb : Ensure the mongodb is properly installed and it is accessible through shell.
  • npm : Node Package Manager to install Rekrut Server dependencies and other stuff
  1. Clone this repository to a suitable place on your system and extract the folder if needed. Go to the test folder in a command shell.
  2. To install all the internal dependencies, run:
         npm install
     After the packages have been installed in the test folder, type:
         npm start
     Alternatively, type:
         node test.js <database> <portnumber>
     The latter lets you choose the database and port number yourself. If you choose a new database name, the details will be stored from scratch.

If there are errors while running the server with node.js, please inform me promptly. They are most likely dependency errors. If possible, note the name of the missing module and type npm install modulename
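
The optional <database> and <portnumber> arguments to test.js could be handled roughly as follows. This is a minimal sketch, not the code in test.js: the function name parseServerArgs and the default database name 'rekrut' are assumptions made for illustration, and the default port 3000 is taken from the listen(3000) call mentioned later in this README.

```javascript
// Hypothetical defaults: the README does not state what test.js falls back to.
const DEFAULT_DATABASE = 'rekrut';
const DEFAULT_PORT = 3000;

// Parse arguments in the style of `node test.js <database> <portnumber>`.
// `argv` is expected to be process.argv.slice(2).
function parseServerArgs(argv) {
  const [database = DEFAULT_DATABASE, portText] = argv;
  const port = portText === undefined ? DEFAULT_PORT : Number(portText);
  // Reject anything that is not a valid TCP port number.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port number: ${portText}`);
  }
  return { database, port };
}

// Usage inside a server script:
//   const { database, port } = parseServerArgs(process.argv.slice(2));
```

Validating the port up front means a mistyped argument fails immediately instead of surfacing later as a confusing listen error.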

  3. Now that your server is running, ensure that your mongodb server is running in the background. It should be running by default (it normally starts when the PC starts up).

  4. Open the Chrome browser. Type chrome://extensions in the address bar. Turn on developer mode. Click the Load unpacked extension button. Browse to the Rekrut-master folder and select the "extension" folder. The extension is now installed in your browser!

  5. If your mongodb URL and port differ from the ones used in the code, you need to change them in the code.
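
One way to avoid editing the code each time the mongodb location changes is to build the connection URL from configuration. The sketch below is an illustration only: the function name buildMongoUrl, the environment variable names MONGO_HOST, MONGO_PORT and MONGO_DB, and the database name 'rekrut' are all assumptions and do not appear in the Rekrut code. Port 27017 is mongodb's standard default.

```javascript
// Build a mongodb connection URL from an environment-like object.
// All variable names here are hypothetical, chosen for illustration.
function buildMongoUrl(env = process.env) {
  const host = env.MONGO_HOST || 'localhost';
  const port = Number(env.MONGO_PORT || 27017); // 27017 is mongodb's default port
  const database = env.MONGO_DB || 'rekrut';    // hypothetical database name
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid mongodb port: ${env.MONGO_PORT}`);
  }
  return `mongodb://${host}:${port}/${encodeURIComponent(database)}`;
}

// Usage: pass the result to whichever mongodb client the server uses.
//   const url = buildMongoUrl();
```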

The extension only works on linkedin.com and only extracts useful information from linkedin.com/in/{profileurl} pages. You also need to log in to LinkedIn first.
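
The restriction to linkedin.com/in/{profileurl} pages can be expressed as a small URL check. This is a sketch under assumptions: the function name isLinkedInProfileUrl is hypothetical, and the extension's actual matching logic is not shown in this README. Note that this version deliberately rejects deeper paths below a profile.

```javascript
// Return true only for URLs of the form https://<sub>.linkedin.com/in/<slug>[/].
function isLinkedInProfileUrl(href) {
  let url;
  try {
    url = new URL(href);
  } catch {
    return false; // not a parseable absolute URL
  }
  const host = url.hostname;
  const onLinkedIn = host === 'linkedin.com' || host.endsWith('.linkedin.com');
  // Exactly one path segment after /in/, with an optional trailing slash.
  return onLinkedIn && /^\/in\/[^/]+\/?$/.test(url.pathname);
}

// Usage in a content script:
//   if (isLinkedInProfileUrl(window.location.href)) { /* start extraction */ }
```

Comparing the parsed hostname, rather than searching the whole string for "linkedin.com", avoids matching look-alike addresses such as linkedin.com.example.org.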

Registering your own details of authentication

You can register yourself using the signup_user.js file. It checks whether the username already exists in the database; if it does not, the registration succeeds. Say you want to sign up with username "drakenwan" and password "ilikecupcakes". Then type:
    node signup_user.js drakenwan ilikecupcakes
There should not be any spaces in the username or password. Your details will be stored in the database, but your unique token will not be generated yet.
Go back to Chrome and log in with your registered details. The browser checks whether you have the unique token. If not, it authenticates your details; once they are authenticated, your unique token is stored in the browser and you stay logged in until you erase the browser's memory or cookies.
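
The sign-up and first-login flow described above can be sketched as follows. This is not the code from signup_user.js or test.js: an in-memory Map stands in for the mongodb collection, the function names signup and login are hypothetical, and the salted scrypt password hashing is an assumption, since this README does not say how passwords are stored.

```javascript
const crypto = require('crypto');

// In-memory stand-in for the users collection (the real project uses mongodb).
const users = new Map();

function hashPassword(password, salt) {
  return crypto.scryptSync(password, salt, 32).toString('hex');
}

// Mirrors `node signup_user.js <username> <password>`:
// refuse spaces and refuse usernames that already exist.
function signup(username, password) {
  if (/\s/.test(username) || /\s/.test(password)) {
    throw new Error('Username and password must not contain spaces');
  }
  if (users.has(username)) {
    throw new Error(`Username "${username}" already exists`);
  }
  const salt = crypto.randomBytes(16).toString('hex');
  // No token yet: it is only issued on the first successful login.
  users.set(username, { salt, hash: hashPassword(password, salt), token: null });
}

// Returns the user's token on success, or null if the credentials do not match.
function login(username, password) {
  const user = users.get(username);
  if (!user) return null;
  const candidate = Buffer.from(hashPassword(password, user.salt), 'hex');
  const stored = Buffer.from(user.hash, 'hex');
  if (!crypto.timingSafeEqual(candidate, stored)) return null;
  if (!user.token) user.token = crypto.randomBytes(24).toString('hex');
  return user.token;
}
```

On the browser side, the returned token would then be kept in extension storage or a cookie, which is why clearing browser data logs the user out.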

The extension will only authenticate your login details if the server file test.js is running in the background. If you have your own server, I suggest you change certain parameters in the testing folder files as well as in the extension folder. Look for constants such as SERVER_URL or just SERVER in the files and change the link to your own server's. Then transfer the express code from the testing folder files to your server, or write your own middleware if you are using a different framework.

3.1

Rekrut 3.1 adds new features. It now has a login feature: recruiters can access the extension's main feature only if their name is in the database. The details are stored in the database without any failure.
Small bugs were removed.

  • Feature to log out the user has been added.

3.1.2

  • OpenSSL certificate added to the server. The extension works over https now. (It will not work on any host other than 'localhost', possibly because the browser treats my certificate as invalid.)

    Note: The browser will treat the certificate as invalid because I, as the issuer, am not a recognized authority. If you have a certificate from an appropriate authority such as Comodo, Symantec, etc., you can deploy it, together with its key, in the server folder. Only small changes to the test.js file are needed: add the object {key: fs.readFileSync('yourkeyfilename'), cert: fs.readFileSync('yourcertificatefilename'), passphrase: yourpassphrase} at |1| in the code: https.createServer( |1|, app).listen(3000). This will work if you have a valid certificate. Then change the HOST name in the content.js file of the extension and reload the extension.
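
The certificate change described in the note can be sketched like this. The helper names buildTlsOptions and startSecureServer are hypothetical and are not part of test.js; the file names are the same placeholders used in the note, and app is assumed to be the existing express application.

```javascript
const fs = require('fs');
const https = require('https');

// Build the options object that goes at |1| in https.createServer(|1|, app).
function buildTlsOptions(keyPath, certPath, passphrase) {
  const options = {
    key: fs.readFileSync(keyPath),   // private key file
    cert: fs.readFileSync(certPath), // certificate file
  };
  // Only needed when the private key is encrypted.
  if (passphrase) options.passphrase = passphrase;
  return options;
}

// Start an https server for an existing request handler such as an express app.
function startSecureServer(app, tlsOptions, port = 3000) {
  return https.createServer(tlsOptions, app).listen(port);
}

// Usage (placeholder file names, as in the note above):
//   const tls = buildTlsOptions('yourkeyfilename', 'yourcertificatefilename', 'yourpassphrase');
//   startSecureServer(app, tls);
```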

  • Added the feature to extract skills. The feature works on almost all profile pages. The issue in the previous code update has been resolved. You can retrieve the details from your own profile too.

  • Can now extract the licenses and certifications section.

    If there are any errors, please inform me with a screenshot of the error message in the developer console. (Press F12 to open the developer console.)

  • [timestamp: 7319525]: Many changes made. The procedure for starting the server has changed. Added an option in the server file to choose between a secure and an unsecure server. If isSecure is set to false in both the test.js and content.js files, the server shares details over an unsecure connection. Also added a devmode constant. If it is set to true, certain features are enabled, such as logging details of the extension's operation to the developer console. By default both constants are set to true. You can change them to false. They do not interfere with the normal UI of the extension.
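
A minimal sketch of how the isSecure and devmode constants might be used is shown below. Only the constant names isSecure, devmode and HOST come from this README; PORT, serverUrl and devLog are hypothetical names introduced for illustration, and the actual code in test.js and content.js may differ.

```javascript
// Constants named in the README. Keep isSecure identical in test.js and content.js.
const isSecure = true;
const devmode = true;
const HOST = 'localhost';
const PORT = 3000; // hypothetical constant; 3000 is the port used with listen() above

// Choose the protocol from the isSecure flag.
function serverUrl(secure, host, port) {
  return `${secure ? 'https' : 'http'}://${host}:${port}`;
}

// Log to the developer console only when devmode is enabled.
function devLog(enabled, ...args) {
  if (enabled) console.log('[devmode]', ...args);
  return enabled;
}

devLog(devmode, 'Server URL:', serverUrl(isSecure, HOST, PORT));
```

If the two files disagree on isSecure, the extension would request one protocol while the server listens on the other, so the flag has to be changed in both places together.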

Note: Change the devmode constant (in every file) to false if it is set to true, or else you will notice unnerving changes on the website. However, if you are indeed in devmode, first zoom out to 25% and then do a hard refresh. If a single node does not exist on the LinkedIn profile, the background of the entire page will change to red. The background of successfully extracted nodes will turn green. Errors and warnings are logged in the console, and you can check them to verify whether something has changed on the website. Errors may occur frequently because the LinkedIn profile chosen for debugging is not always ideal, but the messages logged in the console compensate for that. Do a hard refresh (Ctrl+F5) after every change you make.
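
The red and green highlighting described in the note can be sketched as follows. This illustrates the idea only and is not the code in content.js: the function name highlightExtraction is hypothetical, and any selectors passed to it are placeholders, because LinkedIn's real markup changes over time.

```javascript
// Colour each node that was found green; if any node is missing,
// colour the whole page red and report which selectors failed.
function highlightExtraction(doc, selectors) {
  const missing = [];
  for (const selector of selectors) {
    const node = doc.querySelector(selector);
    if (node) {
      node.style.backgroundColor = 'green';
    } else {
      missing.push(selector);
      console.warn(`[devmode] node not found: ${selector}`);
    }
  }
  if (missing.length > 0) {
    doc.body.style.backgroundColor = 'red';
  }
  return missing;
}

// Usage in a content script (placeholder selectors):
//   highlightExtraction(document, ['.profile-name', '.profile-skills']);
```

Passing the document in as a parameter keeps the function easy to exercise outside the browser with a stub object.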

License:MIT License


Languages

JavaScript 93.0%, HTML 4.9%, CSS 2.0%, Dockerfile 0.1%