Webpage Grabber
This is an email-to-web gateway service. It uses fetchmail to receive emails, procmail to match them against rules, then either links or wget to retrieve files.
Setup
This is broken into three phases so you can do a chunk at a time.
Prerequisites
- Ensure you have wget, zip, mime-construct, fetchmail, procmail, xvfb and a working SMTP server (I am using msmtp).
- Install LWP, LWP::Protocol::https and HTML::TokeParser via CPAN so Perl can run url.pl. On Ubuntu you need libnet-ssleay-perl and libcrypt-ssleay-perl for Net::SSLeay to work.
- Test url.pl and ensure it works. You may need to update Text::Wrap.
    echo -n "" | ./url.pl -t https://www.yahoo.com
- Install wkhtmltopdf so you have at least version 0.12.x in the path.
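A quick way to see which prerequisites are still missing is to ask the shell whether each command is on the PATH. The sketch below is only a convenience check, not part of this project. The binary names are assumptions (for example, xvfb is usually installed as Xvfb, and msmtp only matters if you use it as your SMTP client), so adjust the list for your system.

```shell
#!/bin/sh
# Print each named command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed binary names; change them to match your distribution.
missing=$(missing_tools wget zip mime-construct fetchmail procmail Xvfb msmtp wkhtmltopdf perl)

if [ -n "$missing" ]; then
    echo "Missing tools:"
    echo "$missing"
else
    echo "All prerequisite tools found."
fi
```

This only confirms that the commands exist. It does not check the wkhtmltopdf version or the Perl modules, so the url.pl test above is still worth running.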
Set up Fetchmail
- Copy fetchmailrc-template to fetchmailrc.
- Change permissions on fetchmailrc: chmod 0600 fetchmailrc
- Run touch logs/fetchmail.log to create the initial log file.
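The three fetchmail steps can be sketched as one small script. The function name setup_fetchmail is hypothetical, and creating the logs directory with mkdir -p is an assumption in case it does not exist yet. The copy is skipped when fetchmailrc is already present so that local edits are not overwritten.

```shell
#!/bin/sh
# Prepare fetchmail's config and log file inside the given project directory.
setup_fetchmail() {
    dir=${1:-.}
    # Copy the template only if no config exists yet, so local edits survive.
    if [ ! -f "$dir/fetchmailrc" ]; then
        cp "$dir/fetchmailrc-template" "$dir/fetchmailrc" || return 1
    fi
    # Restrict the file to its owner; it typically holds mailbox credentials.
    chmod 0600 "$dir/fetchmailrc" || return 1
    # Assumption: logs/ may be missing, so create it before touching the log.
    mkdir -p "$dir/logs" && touch "$dir/logs/fetchmail.log"
}

# Run against the current directory only when the template is present.
if [ -f ./fetchmailrc-template ]; then
    setup_fetchmail .
fi
```

You still need to edit fetchmailrc by hand afterwards to fill in your own mail server details.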
Set up Outgoing Mail
- Install an MTA. I've configured msmtp to relay to another SMTP server. If you use Postfix to relay, you may want to investigate adding message_size_limit = 20480000 to your config.
- Run the test files in tests/.
- Once that works, set up a cron job to run fetchmail.
    @reboot fetchmail -f webpage/fetchmailrc > /dev/null 2>&1
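If you would rather add the cron entry from a script than through crontab -e, one possible approach is a small filter that appends the line only when it is not already there. The helper name ensure_line is hypothetical. The piping shown in the final comment assumes your crontab command accepts "-" to read from standard input, which common cron implementations do.

```shell
#!/bin/sh
# Read crontab text on stdin and print it back, appending the given line
# only if no identical line is already present.
ensure_line() {
    line=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: literal match, -x: whole line, -q: no output
    printf '%s\n' "$existing" | grep -Fxq -- "$line" || printf '%s\n' "$line"
}

# Example (not run here, because it rewrites your crontab):
#   crontab -l 2>/dev/null \
#     | ensure_line '@reboot fetchmail -f webpage/fetchmailrc > /dev/null 2>&1' \
#     | crontab -
```

Because the filter is idempotent, running it twice leaves a single entry. Review the output before piping it back into crontab.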
Configuration
Create a file called banned and add the email addresses of any users you want to ban from the system.
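The format of the banned file is not spelled out here. Assuming one address per line, a lookup could be sketched as below. The function name is_banned is hypothetical, and the case-insensitive whole-line match is an illustration, not necessarily how the project's own rules read the file.

```shell
#!/bin/sh
# Succeed (exit 0) when the address appears as a whole line in the banned file.
# Assumption: the file lists one email address per line.
is_banned() {
    address=$1
    banned_file=${2:-banned}
    [ -f "$banned_file" ] || return 1
    # -F: literal match, -x: whole line, -i: ignore case, -q: no output
    grep -Fxiq -- "$address" "$banned_file"
}

# Example:
#   is_banned "spammer@example.com" && echo "sender is banned"
```

Matching whole lines avoids banning bob@example.com just because notbob@example.com is listed.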