ryansb / sklearn-build-lambda

Build the numpy/scipy/scikitlearn packages and strip them down to run in Lambda

how does this work

Jfeng3 opened this issue

Hi

  1. Thanks for making this, it really helps!
  2. I'm new to Lambda. May I ask, when you say "to use them add your handler file to the zip, and add the lib directory so it can be used for shared libs", how exactly does that work? Say I create a main.py with a handler function: where do I place main.py?

I put my "main.py" file at the same level as the lib (and other) directories in the example at https://github.com/ryansb/sklearn-build-lambda/blob/master/sample-site-packages-2016-02-20.zip. You must also specify the handler function from that file in the AWS Lambda configuration (for a function named handler in main.py, the handler setting is main.handler).

Also, I don't use sklearn, just numpy and scipy, and ever since AWS Lambda started supporting the lib directory on the default search path, I have found NO NEED for this (slowish) loading step from the README.md example:

import ctypes
import os

for d, _, files in os.walk('lib'):
    for f in files:
        if f.endswith('.a'):
            continue
        ctypes.cdll.LoadLibrary(os.path.join(d, f))
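
To check this on your own function, here is a minimal sketch (not from the README) that lists the directories in LD_LIBRARY_PATH, the environment variable the dynamic linker consults for shared libraries. The helper name library_search_dirs is hypothetical. That the deployed lib directory appears in this list is an assumption to verify in your own runtime, for example by logging the result from inside the handler.

```python
import os


def library_search_dirs(env=None):
    """Return the directories listed in LD_LIBRARY_PATH, in order.

    `env` defaults to os.environ; passing a dict makes the helper
    easy to exercise outside Lambda.
    """
    env = os.environ if env is None else env
    raw = env.get("LD_LIBRARY_PATH", "")
    # Drop empty entries left behind by leading/trailing separators
    return [d for d in raw.split(os.pathsep) if d]


if __name__ == "__main__":
    # Print one search directory per line
    for d in library_search_dirs():
        print(d)
```

If the lib directory is in that list, the explicit ctypes loading loop should not be needed.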

Oh, really? Thanks @thunderfish24 I'll have to test that out. I'd love to dump that hack 😄

@thunderfish24 What does the import section of your Lambda handler look like? Cheers,

The directory structure for the lambda deploy looks like (recall that I don't use sklearn):

  • lambda_function_package
    • main.py
    • lib/
      • libatlas.a
      • ...
    • numpy/
      • compat/
      • ...
    • scipy/
      • _build_utils/
      • ...
    • my_package/
      • ...

After some other deploy pre-processing for my specific application, I use this bash command to zip up all the files FROM WITHIN THE lambda_function_package DIRECTORY:

# Zip it up at maximum compression
zip -r9 ../lambda_function_package.zip * -x "*.pyc" -x "*tests*"
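
For anyone who would rather script this step, a rough Python equivalent of the zip command is sketched below. It is only an illustration: build_package is a hypothetical helper name, the exclude patterns copy the two -x patterns above, and compresslevel=9 assumes Python 3.7 or newer on the build machine. Paths are stored relative to the package directory, so main.py lands at the root of the archive.

```python
import fnmatch
import os
import zipfile


def build_package(src_dir, zip_path, exclude=("*.pyc", "*tests*")):
    """Zip the contents of src_dir so its files sit at the archive root."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED,
                         compresslevel=9) as zf:
        for root, _, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Archive name relative to src_dir, with forward slashes
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                # Skip anything matching an exclude pattern
                if any(fnmatch.fnmatchcase(arcname, pat) for pat in exclude):
                    continue
                zf.write(full, arcname)
    return zip_path


if __name__ == "__main__":
    # Write the zip next to the package directory, not inside it
    build_package("lambda_function_package", "lambda_function_package.zip")
```

Keep zip_path outside src_dir, as in the usage line, so the archive does not try to include itself.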

The handler file, main.py, looks like:

import numpy as np
import scipy as sp

import my_package

def handler(event, context):
    # ...handler code...
    pass

# Allow testing from command line
if __name__ == "__main__":
    # Create test event
    event = ...

    print(handler(event, None))

This is great. Thanks.

I cannot figure out why, though, my directory structure looks so different after running build.sh:

lambda_function_package
---main.py
---lib/
------python2.7/
---------site-packages/
------------sklearn/
------------...
---lib64/
------python2.7/
---------site-packages/
------------sklearn/
------------...

Any idea how to bring the packages (opencv-python, sklearn, etc.) to the top level?
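
One possible approach, offered as an untested sketch rather than anything confirmed in this thread, is to copy everything under lib/python2.7/site-packages and lib64/python2.7/site-packages up to the package root before zipping. The helper name flatten_site_packages is hypothetical, the paths come from the listing above, and dirs_exist_ok assumes Python 3.8 or newer on the build machine.

```python
import os
import shutil


def flatten_site_packages(package_root, python_dir="python2.7"):
    """Copy packages from lib/ and lib64/ site-packages to package_root.

    Returns the names of the entries that were copied, in order.
    """
    copied = []
    for lib in ("lib", "lib64"):
        site = os.path.join(package_root, lib, python_dir, "site-packages")
        if not os.path.isdir(site):
            continue
        for entry in sorted(os.listdir(site)):
            src = os.path.join(site, entry)
            dst = os.path.join(package_root, entry)
            if os.path.isdir(src):
                # Merge into dst if the same package exists in both trees
                shutil.copytree(src, dst, dirs_exist_ok=True)
            else:
                shutil.copy2(src, dst)
            copied.append(entry)
    return copied


if __name__ == "__main__":
    print(flatten_site_packages("lambda_function_package"))
```

The original lib/python2.7 and lib64/python2.7 trees are left in place; whether they can be removed afterwards depends on what else the build put there.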