ldkhanh / Kiwi-Spider

Crawl over 10,000 deliverable emails and phone numbers per day from LinkedIn, SalesGenie, and Google based on your keywords, and send bulk emails.

Contact Crawler

Crawl Over Ten Thousand Deliverable Emails & Phone Numbers Per Day ✔️

Request-Handling Pipeline:
insert pending query -> query gets scanned -> query's status is updated -> check results during or after the crawl
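
The pipeline above can be read as a small state machine on each query. The following is a minimal sketch of that idea, not the project's actual code: the status names (`PENDING`, `SCANNING`, `COMPLETED`, `FAILED`) and the `QueryRecord` class are hypothetical stand-ins for whatever the query table really stores.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical status values; the real status names used in the database may differ. */
enum QueryStatus {
    PENDING, SCANNING, COMPLETED, FAILED;

    /** Statuses that may directly follow this one in the pipeline. */
    Set<QueryStatus> next() {
        switch (this) {
            case PENDING:
                return EnumSet.of(SCANNING);
            case SCANNING:
                return EnumSet.of(COMPLETED, FAILED);
            default:
                // COMPLETED and FAILED are terminal in this sketch.
                return EnumSet.noneOf(QueryStatus.class);
        }
    }
}

/** Hypothetical in-memory stand-in for one row of the query table. */
class QueryRecord {
    final String keywords;
    private QueryStatus status = QueryStatus.PENDING;

    QueryRecord(String keywords) {
        this.keywords = keywords;
    }

    QueryStatus status() {
        return status;
    }

    /** Moves to the target status and rejects transitions the pipeline does not allow. */
    void moveTo(QueryStatus target) {
        if (!status.next().contains(target)) {
            throw new IllegalStateException(status + " -> " + target + " is not allowed");
        }
        status = target;
    }
}
```

A newly inserted query starts as `PENDING`; the scanner would call `moveTo(QueryStatus.SCANNING)` when it picks the query up and `moveTo(QueryStatus.COMPLETED)` or `moveTo(QueryStatus.FAILED)` when the crawl ends. Results can be read at any point, which matches "during or after the crawl".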

Front-End:

- login and schedule crawl queries
- check query status
- display results in tables
- export results to .csv files and download
- send bulk emails

Documentation:

| html | Description |
| --- | --- |
| index.html | the login homepage |
| search.html | enter a query and insert it into the database |
| result.html | show the result table fetched from the database |

| php | Description |
| --- | --- |
| login.php | verify the account against the database |
| searchQ.php | insert the query into the database and fetch the existing queries |
| result.php | show the result table fetched from the database for LinkedIn searches |
| result_sg.php | show the result table fetched from the database for SalesGenie searches |
| refresh.php | refresh the query asynchronously |
| delete.php | delete the query from the database asynchronously |

| javascript | Description |
| --- | --- |
| table2CSV.js | convert an HTML table to a downloadable CSV file |
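
The table-to-CSV conversion runs in the browser in table2CSV.js. As an illustration of the quoting rule any such export needs, here is a minimal sketch written in Java, the back-end language. The class and method names (`CsvExport`, `escape`, `toCsv`) are hypothetical and are not taken from the repository. It follows the common RFC 4180 convention: fields containing a comma, a quote, or a line break are wrapped in double quotes, and embedded quotes are doubled.

```java
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper showing CSV quoting; not the logic of table2CSV.js itself. */
class CsvExport {

    /** Quotes a field only when it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    /** Joins each row with commas and terminates every row with CRLF. */
    static String toCsv(List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            StringJoiner line = new StringJoiner(",");
            for (String cell : row) {
                line.add(escape(cell));
            }
            sb.append(line.toString()).append("\r\n");
        }
        return sb.toString();
    }
}
```

Without this quoting, a result such as `Doe, Jane` would split into two columns when the exported file is opened in a spreadsheet.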

Back-End:

- scan and process pending queries on multiple threads
- crawl target emails & phone numbers from LinkedIn & SalesGenie & Google
- verify that the emails are valid and deliverable
- store results in MySQL
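
The first back-end step, scanning pending queries and processing them on several threads, can be sketched as follows. This is an assumption-laden illustration, not the project's implementation: an in-memory `Queue` stands in for the pending rows in MySQL, the `crawl` function stands in for the real browser-driven crawl, and the names `PendingQueryPoller` and `processAll` are hypothetical.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical poller: drains pending queries and runs each one on a worker thread. */
class PendingQueryPoller {

    /**
     * Takes every query currently in the queue, hands it to a fixed-size thread pool,
     * and returns the collected results once all workers have finished.
     * Result order is not guaranteed because the workers run concurrently.
     */
    static List<String> processAll(Queue<String> pending, int threads,
                                   Function<String, String> crawl) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> results = new CopyOnWriteArrayList<>();
        String query;
        while ((query = pending.poll()) != null) {
            final String current = query;
            pool.execute(() -> results.add(crawl.apply(current)));
        }
        pool.shutdown();
        try {
            // Wait for the submitted crawls; one minute is an arbitrary limit for this sketch.
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }
}
```

A real poller would run this repeatedly on a timer and write each result to MySQL instead of collecting it in a list. The fixed pool size caps how many crawls run at once.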

Documentation:

| Package crawler | Description |
| --- | --- |
| EmailCrawlerAPI | Program entry point; launches the Spring Boot application |
| EmailCrawlerConfig | Reads the configuration file |

| Package crawler.controller | Description |
| --- | --- |
| CrawlEmailController | Maps the RESTful API endpoints |

| Package crawler.DAO | Description |
| --- | --- |
| MySQLConnector | Encapsulates the methods for connecting to MySQL |
| RecnctThread | Reconnects to MySQL to avoid connection timeouts |
| CompanyDAO | Inserts and updates data in the Company table |
| CustomerDAO | Inserts and updates data in the Customer table |
| EmailDAO | Inserts and updates data in the Email table |
| ResultDAO | Inserts and updates data in the Result table |
| ResultSgDAO | Inserts and updates data in the ResultSg table |
| SalesgenieDAO | Inserts and updates data in the Salesgenie table |
| SearchQueryDAO | Inserts and updates data in the SearchQuery table |

| Package crawler.model | Description |
| --- | --- |
| CrawlerQuery | Data model of a query |
| Customer | Data model of a customer |
| Email | Data model of an email |
| SalesGenieResult | Data model of a result from SalesGenie |

| Package crawler.service | Description |
| --- | --- |
| Callback | Callback interface invoked when a query has completed or failed |
| PollSearchQueryService | Checks the database for pending queries and sends any it finds to the processing pipeline |
| CrawlEmailService | The process of crawling emails from LinkedIn |
| CrawlSalesGenieService | The process of crawling SalesGenie |
| DriveBrowserService | Implements the general browser operations |
| DriveLinkedinService | Implements the browser operations for crawling LinkedIn |
| DriveSalesgenieService | Implements the browser operations for crawling SalesGenie |
| EmailVerifyService | Verifies whether a given email address is deliverable |
| GeneratAccurateEmailsService | Generates a person's candidate email addresses from their name and companies |
| SendEmailService | Sends email from Java code; should not be used to send spam |
| LaunchWindowService | Java Swing UI (discarded) |

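
Generating candidate addresses from a name and a company can be illustrated with a short sketch. The patterns below (`first.last`, `firstlast`, `flast`, `firstl`, `first`, `last`) are common corporate conventions chosen for illustration; which patterns GeneratAccurateEmailsService actually uses is not documented here. The class name `EmailCandidates` is hypothetical, and the sketch assumes the company's mail domain is already known.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical generator of candidate addresses for one person at one domain. */
class EmailCandidates {

    /** Returns de-duplicated candidates in a fixed order, or an empty list if any input is blank. */
    static List<String> generate(String firstName, String lastName, String domain) {
        String first = normalize(firstName);
        String last = normalize(lastName);
        String host = domain.trim().toLowerCase(Locale.ROOT);
        List<String> candidates = new ArrayList<>();
        if (first.isEmpty() || last.isEmpty() || host.isEmpty()) {
            return candidates;
        }
        // LinkedHashSet keeps insertion order while dropping duplicate local parts.
        Set<String> localParts = new LinkedHashSet<>();
        localParts.add(first + "." + last);
        localParts.add(first + last);
        localParts.add(first.substring(0, 1) + last);
        localParts.add(first + last.substring(0, 1));
        localParts.add(first);
        localParts.add(last);
        for (String local : localParts) {
            candidates.add(local + "@" + host);
        }
        return candidates;
    }

    /** Lower-cases the name and strips everything except ASCII letters and digits. */
    private static String normalize(String name) {
        return name.trim().toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }
}
```

Each candidate would then be passed to a deliverability check such as EmailVerifyService, so that only addresses that pass verification are stored.
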
| Package crawler.thread | Description |
| --- | --- |
| CrawlCompanyThread | Unit of work executed when crawling emails for one of a person's companies (not used) |
| CrawlCustomerThread | Unit of work executed when crawling emails for a person |
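
The relationship between a crawl thread and the completion callback can be sketched as below. The method names `onCompleted` and `onFailed`, and the types `QueryCallback` and `CrawlTask`, are hypothetical; the repository's Callback interface and CrawlCustomerThread may be shaped differently. The `crawl` supplier stands in for the real crawling work.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical callback, notified once per query when the crawl completes or fails. */
interface QueryCallback {
    void onCompleted(String queryId, List<String> emails);

    void onFailed(String queryId, Exception cause);
}

/** Hypothetical unit of work: runs one crawl and reports the outcome through the callback. */
class CrawlTask implements Runnable {
    private final String queryId;
    private final Supplier<List<String>> crawl;
    private final QueryCallback callback;

    CrawlTask(String queryId, Supplier<List<String>> crawl, QueryCallback callback) {
        this.queryId = queryId;
        this.crawl = crawl;
        this.callback = callback;
    }

    @Override
    public void run() {
        List<String> emails;
        try {
            emails = crawl.get();
        } catch (RuntimeException e) {
            // Report the failure and stop, so exactly one callback fires per task.
            callback.onFailed(queryId, e);
            return;
        }
        callback.onCompleted(queryId, emails);
    }
}
```

Because `CrawlTask` is a `Runnable`, it can be submitted to a thread pool, and the callback is a natural place to update the query's status in the database.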

Copyright and license

Code and documentation copyright 2016-2017 Jianyang Zhang, Wentao Wang, and Yihan Lu. Code released under the MIT License.

Languages

PHP 45.1%, Java 39.5%, HTML 13.0%, JavaScript 1.7%, Vue 0.3%, ApacheConf 0.3%