A proving ground for an image-conversion utility.
Currently, this project contains a script, run locally, that scales images with the greatest possible throughput.
All tests are run three times in a row, with the median time displayed here.
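As a rough illustration of how the median is taken, the sketch below parses times in the `XmY.YYYs` format printed by the shell's `time` and returns the middle of three runs. The class and method names are hypothetical, and the sample inputs are placeholders, not measurements from this project.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MedianTiming {
    // Matches the "XmY.YYYs" format printed by the shell's `time`, e.g. "1m51.552s".
    private static final Pattern TIME = Pattern.compile("(\\d+)m(\\d+(?:\\.\\d+)?)s");

    // Converts a time such as "1m51.552s" into seconds.
    static double parseSeconds(String text) {
        Matcher m = TIME.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized time: " + text);
        }
        return Integer.parseInt(m.group(1)) * 60 + Double.parseDouble(m.group(2));
    }

    // Sorts the three runs and returns the middle one.
    static double medianOfThree(String a, String b, String c) {
        double[] runs = { parseSeconds(a), parseSeconds(b), parseSeconds(c) };
        Arrays.sort(runs);
        return runs[1];
    }

    public static void main(String[] args) {
        // Placeholder inputs for illustration only.
        System.out.println(medianOfThree("0m30.000s", "0m10.000s", "0m20.000s"));
    }
}
```

Using the median rather than the mean keeps a single unusually slow or fast run from skewing the reported number.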
Latest Version (2 instances of RemoteMaster, each w/ 4 workers)
0m29.069s
Wow, a 3x improvement! What changed here was using the gm command `scale` instead of `resize`, at the suggestion of gm4java's author. The big difference is partly because the image no longer needs its dimensions checked, since `scale` takes an output resolution while `resize` takes a percentage.
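To make the two invocations concrete, here is a minimal sketch that builds the equivalent plain `gm convert` command lines. It is only an approximation of what happens here: the project drives gm through gm4java rather than spawning commands like this, and the class name, file names, and sizes are placeholders.

```java
import java.util.Arrays;
import java.util.List;

final class GmCommands {
    // `-scale` with an explicit output geometry such as "800x600".
    static List<String> scaleCommand(String in, String out, int width, int height) {
        return Arrays.asList("gm", "convert", in, "-scale", width + "x" + height, out);
    }

    // `-resize` with a percentage geometry such as "50%".
    static List<String> resizeCommand(String in, String out, int percent) {
        return Arrays.asList("gm", "convert", in, "-resize", percent + "%", out);
    }

    public static void main(String[] args) {
        // Only prints the argument lists; pass one to ProcessBuilder to actually
        // run it on a machine where GraphicsMagick is installed.
        System.out.println(scaleCommand("photo.jpg", "photo_small.jpg", 800, 600));
        System.out.println(resizeCommand("photo.jpg", "photo_small.jpg", 50));
    }
}
```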
Latest Version (1 instance of RemoteMaster w/ 8 workers)
0m27.819s
Version: e1a34b9 (2 instances of RemoteMaster each w/ 4 workers)
1m51.552s
Version: e1a34b9 (3 instances of RemoteMaster each w/ 3 workers)
1m39.190s
Version: e1a34b9 (4 instances of RemoteMaster each w/ 2 workers)
1m39.871s
Version: e1a34b9 (4 instances of RemoteMaster each w/ 3 workers)
1m34.310s
Version: e1a34b9 (4 instances of RemoteMaster: 2 w/ 4 workers, 2 w/ 3) (4 w/ 4 failed)
1m32.502s
localdir Documents/doublet/test_images/avg_size
Running serially (version in the 2nd commit: e2b7af8)
real 1m58.674s
Old Parallel akka version with 5 workers (commit: ab97939):
real 0m43.117s
(with 4 workers it was 2 seconds slower)
Version: e1a34b9 (two instances of RemoteMaster each w/ 4 workers)
real 0m34.700s
The number of `ImageProcessor` workers that you should start (each of which starts 1 gm process) should equal the number of logical cores on your machine, as returned by `Runtime.getRuntime.availableProcessors`.
Ideally, there should be as many gm processes started as there are effective cores in your computer. If you are on an Intel chipset, your processor is likely hyper-threaded, so the number of independent execution engines available is not equal to the number of physically distinct processors in your machine.
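A minimal sketch of that sizing rule follows. `Runtime.getRuntime().availableProcessors()` is the standard JVM call; the `workersPerManager` helper and its round-down policy are assumptions made for illustration, not something this project prescribes.

```java
final class WorkerCount {
    // Number of processors the JVM reports as available (typically logical cores).
    static int logicalCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Hypothetical helper: splits a total worker budget evenly across managers,
    // rounding down but never going below 1 worker per manager.
    static int workersPerManager(int totalWorkers, int managers) {
        if (managers < 1) {
            throw new IllegalArgumentException("managers must be >= 1");
        }
        return Math.max(1, totalWorkers / managers);
    }

    public static void main(String[] args) {
        int cores = logicalCores();
        System.out.println("logical cores: " + cores);
        System.out.println("workers per manager with 2 managers: " + workersPerManager(cores, 2));
    }
}
```

On a machine reporting 8 logical cores, this split gives 4 workers for each of 2 managers, which matches one of the configurations benchmarked above.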
Open 3 terminal windows in the root directory of singlet:
- In window 1, run `sbt 'run-main remote.Remote "1" "4"'`
  - this is the 1st Remote worker manager
- In window 2, run `sbt 'run-main remote.Remote "2" "4"'`
  - this is the 2nd Remote worker manager
- In window 3, run `sbt 'run-main local.Local "2" "/Users/me/Pictures/photo.jpg" "/Users/me/Pictures/diff_photo.jpg"'` to convert these two images in parallel
  - we are telling the local master that there are 2 remote worker managers
The example above can be run with more or fewer Remote worker managers, each with its own number of Actors (capped by how much memory you have available for each JVM).
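For other configurations, the commands can be generated rather than typed by hand. The sketch below assumes that the first argument to `remote.Remote` is the manager's number and the second is its worker count (inferred from the example and the benchmark descriptions, not confirmed), and that the first argument to `local.Local` is the number of Remote worker managers. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LaunchCommands {
    // Builds one sbt command per Remote worker manager, numbered from 1.
    // Assumption: the second argument is the number of workers for that manager.
    static List<String> remoteCommands(int managers, int workersPerManager) {
        List<String> commands = new ArrayList<String>();
        for (int i = 1; i <= managers; i++) {
            commands.add("sbt 'run-main remote.Remote \"" + i + "\" \"" + workersPerManager + "\"'");
        }
        return commands;
    }

    // Builds the Local command: manager count first, then the image paths.
    static String localCommand(int managers, List<String> imagePaths) {
        StringBuilder sb = new StringBuilder("sbt 'run-main local.Local \"" + managers + "\"");
        for (String path : imagePaths) {
            sb.append(" \"").append(path).append('"');
        }
        return sb.append("'").toString();
    }

    public static void main(String[] args) {
        for (String command : remoteCommands(3, 3)) {
            System.out.println(command);
        }
        System.out.println(localCommand(3, Arrays.asList(
                "/Users/me/Pictures/photo.jpg", "/Users/me/Pictures/diff_photo.jpg")));
    }
}
```

Each printed line is meant to be run in its own terminal window, Remote managers first.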
Requires Java JDK 7 for `java.nio`.
- portability/modularity
- providers
- storage
- support custom processing (either natively or through pre/post-processing via a separate service)
- List of these is coming
- overall processing speed
- parallelized processing of different versions
- scalability across a cluster is easily supported by the communication between Actors in different JVMs, as demonstrated by `LocalMaster` and `RemoteMaster`
- prioritizing more immediately-needed versions
- available and coming soon http://doc.akka.io/docs/akka/snapshot/scala/mailboxes.html
- handle very large original files
- performance tests for these are coming
- handle wide variety of original formats
- List of supported formats includes animated GIFs
- preserve originals
- handle new kinds of image versions, either based on ad-hoc URLs, or by management of approved versions
- would be easy, as demonstrated in the `ImageResize` actor
- incremental backup (particularly of originals)
- good local/development workflow (it should at least work)
- stability: processing should recover, no matter how large or corrupt an original
- Re-queuing an image on failure is easy with actors. Actors provide great encapsulation for individual image failures. A demo of this is coming
- feasible migration path for legacy images
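
The goal of prioritizing more immediately-needed versions can be illustrated without Akka. The sketch below uses a plain `java.util.concurrent.PriorityBlockingQueue` as a stand-in for a priority mailbox; it is not how this project is implemented. The `VersionJob` type, the version names, and the "lower number is more urgent" convention are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;

final class PrioritizedVersions {
    // Hypothetical job type: one requested version of an image.
    static final class VersionJob implements Comparable<VersionJob> {
        final String name;
        final int priority; // lower number = needed sooner (assumed convention)

        VersionJob(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }

        @Override
        public int compareTo(VersionJob other) {
            return Integer.compare(priority, other.priority);
        }
    }

    // Returns the version names in the order a single worker would take them.
    static List<String> processingOrder(List<VersionJob> jobs) {
        PriorityBlockingQueue<VersionJob> queue = new PriorityBlockingQueue<VersionJob>(jobs);
        List<String> order = new ArrayList<String>();
        VersionJob next;
        while ((next = queue.poll()) != null) {
            order.add(next.name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<VersionJob> jobs = new ArrayList<VersionJob>();
        jobs.add(new VersionJob("large", 2));
        jobs.add(new VersionJob("thumbnail", 0));
        jobs.add(new VersionJob("medium", 1));
        System.out.println(processingOrder(jobs)); // prints [thumbnail, medium, large]
    }
}
```

In an actor system, the same ordering would be applied to messages waiting in a worker's mailbox rather than to an explicit queue.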
(c) Artsy, 2015 + Ilya Kavalerov
MIT