GuoGuang / flask_isomerism

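A basic scheduled crawler built on the Flask framework. It supplies data to a website and integrates with Eureka to form a basic heterogeneous system.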

Version scrapy License: MIT Twitter: GuoGuang0536

Flask isomerism

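Provides basic crawler APIs and crawler scripts and integrates with Eureka, mainly so the service can run as part of a heterogeneous system. To add a new script, place it under jobs\tasks.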
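The README does not say how the Eureka integration is done. As a rough sketch only, the following builds the JSON body that Eureka's REST registration endpoint expects for one service instance. The app name `flask-isomerism`, the host, port 5000, and the server address `http://localhost:8761/eureka` are all assumptions for illustration, and nothing is sent over the network.

```python
import json


def build_eureka_instance(app_name, host, ip_addr, port):
    """Build the registration payload for one service instance (illustrative values)."""
    return {
        "instance": {
            "instanceId": "%s:%s:%d" % (host, app_name.lower(), port),
            "hostName": host,
            "app": app_name.upper(),
            "ipAddr": ip_addr,
            "status": "UP",
            "port": {"$": port, "@enabled": "true"},
            "vipAddress": app_name.lower(),
            "dataCenterInfo": {
                "@class": "com.netflix.appinfo.InstanceInfo$DefaultDataCenterInfo",
                "name": "MyOwn",
            },
        }
    }


def registration_url(eureka_base, app_name):
    """Return the URL a client would POST the payload to."""
    return "%s/apps/%s" % (eureka_base.rstrip("/"), app_name.upper())


# Hypothetical service details; replace with the real deployment values.
payload = build_eureka_instance("flask-isomerism", "localhost", "127.0.0.1", 5000)
url = registration_url("http://localhost:8761/eureka/", "flask-isomerism")
body = json.dumps(payload)
print(url)
```

A real client would also need to send periodic heartbeats to stay registered, so an existing Eureka client library may be a better fit than hand-written requests.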

Prerequisites

  • python3
  • Flask

Install

git clone https://github.com/GuoGuang/spider.git

Table structure

CREATE TABLE `movie`  (
  `id` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `name` varchar(200) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Movie title',
  `desc` text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL COMMENT 'Movie description',
  `classify` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Category',
  `actor` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Lead actors',
  `director` varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Director',
  `cover_pic` varchar(300) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Cover image',
  `pics` varchar(1000) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Image URLs',
  `magnet_url` varchar(5000) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Magnet download URL',
  `online_url` varchar(5000) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Online playback URL',
  `pub_date` bigint(20) NOT NULL COMMENT 'Release date',
  `rating` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Rating',
  `source` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL DEFAULT '' COMMENT 'Source',
  `visits` int(11) NOT NULL DEFAULT 0 COMMENT 'View count',
  `is_recommend` int(11) NOT NULL DEFAULT 0 COMMENT 'Recommended flag: 0 = not recommended, 1 = recommended',
  `update_at` bigint(20) NOT NULL,
  `create_at` bigint(20) NOT NULL,
  PRIMARY KEY (`id`) USING BTREE,
  INDEX `idx_pu_date`(`pub_date`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8mb4 COLLATE = utf8mb4_0900_ai_ci ROW_FORMAT = Dynamic;

# When you add a new entity, generate its model automatically (replace `user` with the table name)
flask-sqlacodegen "mysql://root:123456@127.0.0.1/movie_cat" --tables user --outfile "common/models/user.py"  --flask
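Several columns in the `movie` table are NOT NULL without a default (`id`, `desc`, `pub_date`, `update_at`, `create_at`), so a crawler has to fill them before inserting. The sketch below shows one way to do that. It uses an in-memory SQLite table with a subset of the columns as a stand-in for MySQL, and it assumes the bigint timestamps are epoch seconds and that `id` is a UUID hex string; neither is stated in the schema, and `build_movie_row` and `insert_movie` are hypothetical helpers.

```python
import sqlite3
import time
import uuid

# Simplified stand-in for the `movie` table: SQLite types, subset of columns.
SCHEMA = """
CREATE TABLE movie (
    id        TEXT PRIMARY KEY,
    name      TEXT NOT NULL DEFAULT '',
    "desc"    TEXT NOT NULL,
    pub_date  INTEGER NOT NULL,
    visits    INTEGER NOT NULL DEFAULT 0,
    update_at INTEGER NOT NULL,
    create_at INTEGER NOT NULL
)
"""


def build_movie_row(name, desc, pub_date, now=None):
    """Fill every NOT NULL column that has no default value."""
    now = int(time.time()) if now is None else now  # assumption: epoch seconds
    return {
        "id": uuid.uuid4().hex,  # assumption: any unique string fits varchar(100)
        "name": name,
        "desc": desc,
        "pub_date": pub_date,
        "update_at": now,
        "create_at": now,
    }


def insert_movie(conn, row):
    """Insert one row; column names are quoted because `desc` is a reserved word."""
    columns = ", ".join('"%s"' % c for c in row)
    placeholders = ", ".join(":%s" % c for c in row)
    conn.execute("INSERT INTO movie (%s) VALUES (%s)" % (columns, placeholders), row)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
row = build_movie_row("Example Movie", "A placeholder description",
                      pub_date=1546300800, now=1546300900)
insert_movie(conn, row)
conn.commit()
stored = conn.execute("SELECT name, visits, create_at FROM movie").fetchone()
print(stored)
```

Columns with a default, such as `visits`, can be left out of the insert and the database fills them in.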

Usage

# Start the crawler with the following command
python manager.py runjob -m movie

# Start the Flask web server with the following command
python manager.py runserver
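The README does not show how `runjob -m movie` finds the script under jobs\tasks. One plausible arrangement, sketched here, is that the value of `-m` names a module in a `jobs.tasks` package and that each module exposes a `JobTask` class with a `run()` method. The class name, the method name, and the import path are assumptions, not taken from the repository; a fake module stands in for the real one so the sketch runs on its own.

```python
import argparse
import importlib
import types


def parse_runjob_args(argv):
    """Parse the arguments that follow `manager.py runjob`."""
    parser = argparse.ArgumentParser(prog="manager.py runjob")
    parser.add_argument("-m", dest="module", required=True,
                        help="name of a task module under jobs/tasks")
    return parser.parse_args(argv)


def run_job(module_name, loader=importlib.import_module):
    """Import jobs.tasks.<module_name> and run its JobTask (hypothetical convention)."""
    module = loader("jobs.tasks.%s" % module_name)
    return module.JobTask().run()


class JobTask:
    """Stand-in for what a script such as jobs/tasks/movie.py might define."""

    def run(self):
        return "movie task finished"


# Demo with a fake loader so no real jobs.tasks package is required.
requested = []
fake_module = types.SimpleNamespace(JobTask=JobTask)


def fake_loader(path):
    requested.append(path)
    return fake_module


args = parse_runjob_args(["-m", "movie"])
result = run_job(args.module, loader=fake_loader)
print(result)
```

Under this convention, adding a new crawler would only mean adding a new module with its own `JobTask`; the dispatcher itself would not change.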

Job task

Scheduling is implemented with Linux crontab.

# Edit the crontab file
crontab -e

# Add an entry that runs the crawler automatically at the start of every hour
0 */1 * * * export ops_config=local && python3 /Yourdirectory/manager.py runjob -m movie
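The cron entry exports `ops_config=local` before starting the job, which suggests the application picks its settings from that environment variable. A minimal sketch of such a selection follows. The file names `config/base_setting.py` and `config/<profile>_setting.py` and the helper `config_files` are hypothetical and only illustrate the idea.

```python
import os


def config_files(environ=os.environ):
    """Return the settings files to load, base file first, then the profile override."""
    files = ["config/base_setting.py"]  # hypothetical base settings file
    profile = environ.get("ops_config")
    if profile:
        # Hypothetical naming scheme: one override file per profile.
        files.append("config/%s_setting.py" % profile)
    return files


with_profile = config_files({"ops_config": "local"})
without_profile = config_files({})
print(with_profile)
print(without_profile)
```

In a Flask application, each returned path could then be passed to `app.config.from_pyfile`, so that later files override earlier ones.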

Author

👤 GuoGuang

🤝 Contributing

Contributions, issues and feature requests are welcome!
Feel free to check the issues page.

Show your support

Give a ⭐️ if this project helped you!

📝 License

Copyright © 2019 GuoGuang.
This project is MIT licensed.
