Larusyang / LagouJob

Job data mining repo for lagou.com

Home Page: https://www.zhihu.com/question/36132174/answer/94392659

Data Analysis of Lagou Job

Introduction

This repository holds the code for analyzing job data from Lagou. Its main functions are:

  1. Crawl job data from Lagou to get the latest information on Internet-industry jobs.
  2. Analyze and visualize the data.
  3. Crawl job detail pages and generate a word cloud as a "Job Impression".
  4. Store interviewees' comments in MongoDB so they can be used to train a machine-learning NLP model.
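
As an illustration of the last item, here is a minimal sketch of how crawled interview comments might be written to MongoDB. The function names, the document fields (`company_id`, `text`, `fetched_at`), and the database and collection names in the usage comment are assumptions made for this example, not the repository's actual schema. The only driver call relied on is `insert_many` from `pymongo`.

```python
from datetime import datetime, timezone


def build_comment_docs(company_id, comments):
    """Turn raw comment strings into documents (hypothetical schema)."""
    fetched_at = datetime.now(timezone.utc)
    return [
        {"company_id": company_id, "text": text.strip(), "fetched_at": fetched_at}
        for text in comments
        if text and text.strip()  # skip empty or whitespace-only comments
    ]


def store_comments(collection, company_id, comments):
    """Insert the comments into `collection` and return how many were stored.

    `collection` is expected to behave like a pymongo Collection.
    """
    docs = build_comment_docs(company_id, comments)
    if not docs:
        return 0  # insert_many rejects an empty list, so return early
    result = collection.insert_many(docs)
    return len(result.inserted_ids)


# Usage against a local mongod (assumed names, requires `pip3 install pymongo`):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   collection = client["lagou"]["interview_comments"]
#   store_comments(collection, 12345, ["Friendly interviewers", "Long process"])
```

Passing the collection in as an argument keeps the function independent of any particular connection setup, which also makes it easy to exercise with a stand-in object.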

Prerequisites

  1. Install 3rd party libraries

    sudo pip3 install -r requirements.txt
    
  2. Install MongoDB and start the MongoDB service

    sudo service mongod start
    

How to Use

  1. Clone this project from GitHub.
  2. Run m_lagou_spider.py to crawl job data. It outputs an Excel file.
  3. Run hot_words.py to segment the job text into words and return the top 30 hot words.
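
The hot-word step in item 3 boils down to tokenizing text and counting word frequencies. The sketch below shows that idea using only the standard library. It is not the code in hot_words.py: `simple_tokenize` and `top_hot_words` are hypothetical names, and the regex tokenizer only handles Latin-script terms, so Chinese job descriptions would need a proper word segmenter passed in through the `tokenize` parameter.

```python
import re
from collections import Counter


def simple_tokenize(text):
    """Hypothetical tokenizer: lowercase Latin-script terms such as 'python' or 'c++'."""
    return re.findall(r"[a-z][a-z0-9+#]*", text.lower())


def top_hot_words(descriptions, n=30, tokenize=simple_tokenize, stop_words=frozenset()):
    """Return the n most frequent words across all descriptions as (word, count) pairs."""
    counts = Counter()
    for text in descriptions:
        counts.update(word for word in tokenize(text) if word not in stop_words)
    return counts.most_common(n)


if __name__ == "__main__":
    sample = [
        "Python developer with SQL and Spark experience",
        "Data analyst: Python, SQL, Excel",
    ]
    for word, freq in top_hot_words(sample, n=5, stop_words={"and", "with"}):
        print(word, freq)
```

The resulting (word, count) pairs are the kind of input a word-cloud library typically takes as a frequency table.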

Analysis Results

[Figures: Image1 through Image7]

Report

  • For technical details, please refer to my answer on Zhihu: https://www.zhihu.com/question/36132174/answer/94392659
  • The PDF report can be downloaded from here.

LICENSE

Apache-2.0

Languages

Python 99.1%, Shell 0.9%