analytically / hadoop-ansible

Ansible playbook that installs a Hadoop cluster, with HBase, Hive, Presto for analytics, and Ganglia, Smokeping, Fluentd, Elasticsearch and Kibana for monitoring and centralized log indexing.

Spin up some DigitalOcean boxes when running Travis to deploy a full Hadoop stack

analytically opened this issue

It'd be good if I could fully test the playbook using Travis CI. The build would go like this:

  • validate the playbook
  • before_script: set up hosts on the Travis CI box (/etc/hosts)
  • run an Ansible script to spin up 8 DigitalOcean (DO) hosts with an encrypted key using my account
  • run the playbook (this might take a long time, need to check if Travis CI is OK with this)
  • validate that Hadoop was correctly installed (port checks?)
  • after_script: destroy the DO hosts and check that they are destroyed
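
The "port checks" step in the list above could be sketched roughly as follows. This is only a minimal illustration, not what the playbook's tests actually do: the function names are made up for this example, and which hosts and ports to probe depends entirely on how the cluster is configured.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def closed_ports(checks, timeout=3.0):
    """Return the (host, port) pairs from checks that could not be reached."""
    return [(host, port) for host, port in checks
            if not port_is_open(host, port, timeout)]
```

A build step could call closed_ports with the list of expected services and fail the build if the returned list is not empty. An open port only shows that something is listening, so this is a smoke test rather than proof that Hadoop works.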

Travis build timeouts are probably okay:

Because it is very common to see test suites or before scripts to hang up, Travis CI has hard time limits. If a script or test suite takes longer to run, the build will be forcefully terminated and you will see a message about this in your build log.

With our current timeouts, a build will be terminated if it's still running after 50 minutes (respectively 70 on travis-ci.com), or if there hasn't been any log output in 10 minutes.
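
If some step of the build turns out to be silent for long stretches, one way to stay clear of the 10-minute no-output limit is to wrap the command and print a heartbeat line while it runs. The sketch below is an assumption about how that could be done (run_with_heartbeat is a hypothetical helper, not part of the repo), and it does nothing about the overall 50-minute limit.

```python
import subprocess
import sys


def run_with_heartbeat(cmd, interval=300.0, out=sys.stdout):
    """Run cmd and print a line every `interval` seconds while it is running.

    Returns the command's exit code.
    """
    proc = subprocess.Popen(cmd)
    while True:
        try:
            # wait() returns the exit code as soon as the process finishes.
            return proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            print("still running...", file=out, flush=True)
```

For example, run_with_heartbeat(["ansible-playbook", "site.yml"]) would emit a line every five minutes; the playbook file name here is only a placeholder.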

Secure environment variables are important here too. Note that Travis strips them from some builds (namely pull requests from third parties), so be sure to fail gracefully if they're absent.
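
A minimal sketch of failing gracefully, assuming the DigitalOcean credentials are passed in as two secure variables. The names DO_CLIENT_ID and DO_API_KEY are placeholders invented for this example, not the names the repo uses.

```python
import os

# Hypothetical variable names; substitute whatever the build really defines.
REQUIRED_VARS = ("DO_CLIENT_ID", "DO_API_KEY")


def missing_secrets(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def should_run_integration(env=None):
    """Print a notice and return False when credentials are unavailable."""
    missing = missing_secrets(env)
    if missing:
        print("Skipping droplet tests, missing: " + ", ".join(missing))
        return False
    return True
```

The build script can then skip the droplet steps (and exit 0) when should_run_integration() returns False, so third-party pull requests still get the playbook validation step without failing on missing credentials. Travis also sets TRAVIS_SECURE_ENV_VARS to indicate whether secure variables are available, which could be checked instead.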

Cleanup is a consideration since after_script might not actually get run, e.g. if the Travis node explodes mid-test. Launching the droplets and immediately issuing shutdown -h +60 won't help since DigitalOcean doesn't have a way to terminate droplets on shutdown. I think the failsafe needs to be a recurring process on some other machine that finds and terminates any testing-related DO droplets that have been running longer than an hour.
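
The selection logic for such a failsafe could look roughly like the sketch below. It assumes each droplet is available as a dict with "id", "name" and a UTC "created_at" timestamp such as "2014-01-01T10:30:00Z", and that test droplets share a name prefix ("travis-" is an invented convention). Fetching the droplet list and destroying the stale ones through the DigitalOcean API is deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def stale_droplets(droplets, now, prefix="travis-", max_age=timedelta(hours=1)):
    """Return ids of droplets whose name starts with prefix and that are older than max_age.

    `now` must be a timezone-aware datetime.
    """
    stale = []
    for droplet in droplets:
        if not droplet["name"].startswith(prefix):
            continue  # never touch droplets that are not test machines
        created = datetime.strptime(
            droplet["created_at"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if now - created > max_age:
            stale.append(droplet["id"])
    return stale
```

A cron job on another machine could run this every few minutes with now=datetime.now(timezone.utc) and destroy whatever ids come back. Matching on a dedicated prefix keeps the reaper from ever touching non-test droplets in the same account.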

It's partially functioning. Ironing out the last issues. https://travis-ci.org/analytically/hadoop-ansible

Fully functional.

I feel like CI screenshots don't belong inside this repo. Each build adds 800 KB of screenshots to hadoop-ansible that everyone will have to clone.

Now pushes screenshots to S3.