emaxerrno / ai-wasm


A simple example of exporting a transformer model with Python, then loading it into tract to do extractive question answering. This example uses the MobileBERT model (previous commits contain an ALBERT implementation). We also load the model into Redpanda (via WebAssembly!) to show how AI models can be embedded directly into your favorite message broker.

To Use

First, export the pre-trained transformer model using Python and PyTorch:

pip3 install transformers torch torchinfo
python3 export.py
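
For reference, the core of the export looks roughly like the sketch below; export.py is the source of truth, and the checkpoint name, sequence length, and output path here are assumptions. tract consumes ONNX, so the model is traced and exported in that format.

# A rough sketch of the export; the checkpoint and max_length are assumptions.
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

checkpoint = "csarron/mobilebert-uncased-squad-v2"  # hypothetical QA checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)
model.eval()
model.config.return_dict = False  # export plain tuples, which ONNX tracing prefers

# Save the tokenizer files alongside the model so the Rust side can tokenize.
tokenizer.save_pretrained("./mobilebert")

# Trace with fixed-length dummy inputs so the ONNX graph has static shapes.
dummy = tokenizer("What is my name?", "My name is Bert.",
                  return_tensors="pt", padding="max_length", max_length=384)
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"], dummy["token_type_ids"]),
    "./mobilebert/model.onnx",
    input_names=["input_ids", "attention_mask", "token_type_ids"],
    output_names=["start_logits", "end_logits"],
    opset_version=13,
)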

The exported model and tokenizer are saved in ./mobilebert. Next, build the Wasm module for deployment in Redpanda:

RUSTFLAGS="-Ctarget-feature=+simd128" cargo build --release --target=wasm32-wasi
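
Inside the module, the ONNX graph is embedded into the binary and loaded with tract. Below is a minimal load-and-run sketch, not the repo's exact code; the embedded path and the [1, 384] input shapes are assumptions.

// Minimal tract load/run sketch; the path and shapes are assumptions.
use std::io::Cursor;
use tract_onnx::prelude::*;

// Embedding the model bytes in the binary is why the binary size limit
// below has to be raised.
static MODEL_BYTES: &[u8] = include_bytes!("../mobilebert/model.onnx");

fn main() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_read(&mut Cursor::new(MODEL_BYTES))?
        // Pin shapes for input_ids, attention_mask, and token_type_ids
        // so tract can fully optimize the graph.
        .with_input_fact(0, InferenceFact::dt_shape(i64::datum_type(), tvec!(1, 384)))?
        .with_input_fact(1, InferenceFact::dt_shape(i64::datum_type(), tvec!(1, 384)))?
        .with_input_fact(2, InferenceFact::dt_shape(i64::datum_type(), tvec!(1, 384)))?
        .into_optimized()?
        .into_runnable()?;

    // Dummy ids just to show the call; real code feeds tokenized text.
    let ids: Tensor = tract_ndarray::Array2::<i64>::zeros((1, 384)).into();
    let outputs = model.run(tvec!(ids.clone().into(), ids.clone().into(), ids.into()))?;
    // outputs[0] and outputs[1] hold the start/end logits; the answer span
    // comes from the argmax of each.
    println!("{:?}", outputs);
    Ok(())
}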

Enable WebAssembly data transforms and raise the default limits in the latest build of Redpanda:

rpk cluster config set data_transforms_enabled true
# NOTE: These limits allow for a single transform with half a GiB of memory.
rpk cluster config set data_transforms_per_core_memory_reservation 536870912
rpk cluster config set data_transforms_per_function_memory_limit 536870912
# Since we're hackily embedding the model in the Wasm binary, we need to support large binaries.
rpk cluster config set data_transforms_binary_max_size 125829120
# Allow some extra time on startup over the default. This could probably be lower.
rpk cluster config set data_transforms_runtime_limit_ms 30000
# Restart Redpanda!

Deploy the model into Redpanda!

rpk topic create questions answers -r 3
rpk wasm deploy --file ./target/wasm32-wasi/release/ai-qa-wasi.wasm \
  --input-topic questions \
  --output-topic answers \
  --var CONTENT='My name is Bert and I live in the broker.'
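
For orientation, the transform's entry point mirrors the identity-transform skeleton from Redpanda's data transforms Rust SDK; the QA step is summarized in comments, since the exact wiring lives in this repo.

// Identity-transform skeleton from the Redpanda data transforms Rust SDK,
// with comments marking where the QA model plugs in.
use redpanda_transform_sdk::*;
use std::error::Error;

fn main() {
    // Register the callback that runs for every record produced to the
    // input topic (questions).
    on_record_written(my_transform);
}

fn my_transform(event: WriteEvent, writer: &mut RecordWriter) -> Result<(), Box<dyn Error>> {
    // Real code would: read the question from event.record, read the context
    // from the CONTENT env var (set by --var above), run the embedded
    // MobileBERT model with tract, and write the extracted answer span
    // instead of echoing the input record.
    writer.write(event.record)?;
    Ok(())
}

Then ask it a question: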

echo "What is my name?" | rpk topic produce questions
rpk topic consume answers
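
If everything is wired up, the record consumed from answers should contain the span the model extracts from CONTENT (for the question above, "Bert").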
