josiahdavis / lambdamap

lambdamap

Massively parallel serverless computing using AWS Lambda.

Installation

Python 3.8, conda, npm

Python 3.8 is preferred and can be easily installed via conda.

npm can then be installed as follows:

conda install -c conda-forge nodejs

Python Package

# Install the `lambdamap` python package
pip3 install -e .

Lambda Container Stack

# Install the AWS CDK Toolkit and CLI
npm i -g aws-cdk

cd ./cdk

# Install the AWS CDK Python 3.x dependencies
pip3 install -r requirements.txt

# Deploy the LambdaMap stack
cdk bootstrap
cdk deploy

# Generate the CloudFormation (CFN) template
cdk synth

Dockerfile

You can specify additional system and Python packages to be used by the Lambda container in ./cdk/stack/Dockerfile.

lambdamap
├── cdk/
│   ├── app.py
│   ├── cdk.json
│   ├── requirements.txt
│   ├── setup.py
│   ├── source.bat
│   └── stack/
│       ├── Dockerfile <--- modify to include custom system and Python packages
│       ├── __init__.py
│       ├── lambda.py
│       └── stack.py
├── lambdamap/
│   ├── core.py
│   └── __init__.py
├── README.md
└── setup.py

Example Usage

import pandas as pd
from lambdamap import LambdaExecutor

# Define your custom function
def my_power(x, **kwargs):
    import numpy as np
    import pandas as pd
    
    exponent = kwargs.get("exponent", 2)
    
    df = pd.DataFrame()
    df["formula"] = [f"{x}**{exponent}"]
    df["result"] = np.power(x, exponent)
    
    return df

# Instantiate the Lambda executor
executor = LambdaExecutor(
    max_workers=1000,
    lambda_arn="LambdaMapFunction")

# Generate the function payloads
payloads = [{"args": (i,), "kwargs": {"exponent": 2}} for i in range(1000)]

# Distribute the function calls over the lambdas
results = executor.map(my_power, payloads)

# Concatenate the list of results into a single dataframe
df_results = pd.concat(results)
df_results
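
Before sending work to Lambda, it can help to dry-run a few payloads locally. The sketch below is one way to do that with a thread pool. It assumes each payload is applied as func(*args, **kwargs), which is inferred from the payload shape in the example above and not confirmed against lambdamap's source. local_map and my_power_plain are hypothetical names for illustration only; neither is part of lambdamap.

```python
from concurrent.futures import ThreadPoolExecutor


def local_map(func, payloads, max_workers=4):
    """Hypothetical local stand-in for executor.map (not part of lambdamap).

    Assumption: each payload is applied as func(*args, **kwargs).
    """
    def call(payload):
        return func(*payload.get("args", ()), **payload.get("kwargs", {}))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input payloads
        return list(pool.map(call, payloads))


def my_power_plain(x, **kwargs):
    # Dependency-free variant of my_power, used only for this local check
    exponent = kwargs.get("exponent", 2)
    return {"formula": f"{x}**{exponent}", "result": x ** exponent}


# Same payload shape as in the example above, with a small range
payloads = [{"args": (i,), "kwargs": {"exponent": 2}} for i in range(5)]
local_results = local_map(my_power_plain, payloads)
```

If the local results look right, the same payload list can be passed to executor.map as in the example above. Running locally first catches argument and import errors without deploying anything.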

About

License: Apache License 2.0
