ZHOU (AugustLONG)

Company: None

Location: Shanghai

ZHOU's repositories

medusa

(2015) (Python) Web service, crawler, indexing, and search (built on django, scrapy, elasticsearch, postgresql, redis)

Language: Python | Stargazers: 0 | Issues: 0

iScript

Assorted scripts for Xiami (xiami.com), Baidu Netdisk (pan.baidu.com), 115 Netdisk (115.com), NetEase Music (music.163.com), Baidu Music (music.baidu.com), 360 Netdisk/Cloud Drive (yunpan.cn), video parsing (flvxz.com), bt torrent ↔ magnet conversion, ed2k search, tumblr image download, and unzip

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

autograd

Efficiently computes derivatives of numpy code.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

newspaper

News, full-text, and article metadata extraction in Python 3 good

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

machine_learning-1

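Machine learning notes drawn from Li Hang's "Statistical Learning Methods", Zhou Zhihua's "Machine Learning", Peter Harrington's "Machine Learning in Action", and Python's open-source Scikit-Learn library.

License: GPL-3.0 | Stargazers: 0 | Issues: 0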

NewsSpider

Crawls news from Toutiao, NetEase, Tencent, and other sources, and builds a simple search engine

Language: Python | Stargazers: 0 | Issues: 0

WechatSogou

Crawler interface for WeChat official accounts, based on Sogou WeChat search

Language: Python | Stargazers: 0 | Issues: 0

CoolplaySpark

Coolplay Spark: Spark source code analysis, Spark libraries, and more

Language: Scala | Stargazers: 0 | Issues: 0

heritrix3

Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.

Language: Java | Stargazers: 0 | Issues: 0

Google-ML-Recipes-Chs-sub-and-code

Chinese subtitle translations and sample code for Google's introductory machine learning videos

Language: Python | Stargazers: 0 | Issues: 0

kindo

A lightweight automated deployment tool developed with Python

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

flume-json-interceptor

Apache Flume interceptor that tracks JSON events and passes them to different Kafka queues based on the JSON input

Language: Java | Stargazers: 0 | Issues: 0

keepcrawler

A crawler for Keep, built on WebMagic

Language: Java | Stargazers: 0 | Issues: 0

mysql-sink

Flume MySQL sink

Language: Java | Stargazers: 0 | Issues: 0

EasySearchEngine-V2.0

A simple search engine based on Today's HIT (Harbin Institute of Technology) news, v2.0 good

Language: Python | Stargazers: 0 | Issues: 0

distributed_systems_readings

A list of papers, conferences, books, MOOCs, Q&A, and other resources for distributed systems

Stargazers: 0 | Issues: 0

my_blog

Multiple processes, Python

Language: HTML | Stargazers: 0 | Issues: 0

nutcher

nutcher is Chinese-language documentation for nutch, covering nutch configuration and source code analysis; continuously updated.

Language: HTML | License: GPL-2.0 | Stargazers: 0 | Issues: 0

flume-mysql-sink

Get Kafka events into MySQL

Language: Java | Stargazers: 0 | Issues: 0

Scrapy-1

Uses the Scrapy framework to crawl Amazon book information and save it to a MySQL database.

Language: Python | Stargazers: 0 | Issues: 0

ppd_code

Paipaidai (PPDai) algorithm competition

Language: Python | Stargazers: 0 | Issues: 0

webmagic

A scalable web crawler framework.

Language: Java | Stargazers: 0 | Issues: 0

SimpleSearchEngine

Web data mining coursework: a simple search engine

Language: Python | Stargazers: 0 | Issues: 0

SparkStreaming_Crawler_Redis

Use PySpark to process data from a web crawler and pass the data through Redis

Language: Python | Stargazers: 0 | Issues: 0

zufang

Douban rental data search engine

Language: Python | Stargazers: 0 | Issues: 0

DigWebForChemNoun

Dig the web for chemistry nouns. This is for educational purposes.

Language: Python | Stargazers: 0 | Issues: 0

SpiderIndex

A simple search engine with two parts: a crawler and word segmentation (including PageRank)

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

MysqlToElasticsearch

Imports data from MySQL into the Elasticsearch search engine, for sharded databases and tables whose table structures are identical.

Language: Python | Stargazers: 0 | Issues: 0