Model Transparency

Overview
Projects
- Model Signing
- SLSA for ML
Status
Contributing

Overview

There is currently significant growth in the number of ML-powered applications. This brings benefits, but it also provides grounds for attackers to exploit unsuspecting ML users. This is why Google launched the Secure AI Framework (SAIF) to establish industry standards for creating trustworthy and responsible AI applications. The first principle of SAIF is to

Expand strong security foundations to the AI ecosystem

Building on our work with the Open Source Security Foundation, we are creating this repository to demonstrate how the ML supply chain can be strengthened in the same way as the traditional software supply chain.

This repository hosts a collection of utilities and examples related to the security of machine learning pipelines. The focus is on providing verifiable claims about the integrity and provenance of the resulting models, meaning users can check for themselves that these claims are true rather than having to just trust the model trainer.

Projects

Currently, there are two main projects in the repository: model signing (to prevent tampering with models after publication to ML model hubs) and SLSA (to prevent tampering with models during the build process).

Model Signing

This project demonstrates how to protect the integrity of a model by signing it with Sigstore, a tool for making code signatures transparent without requiring management of cryptographic key material.

When users download a given version of a signed model, they can check that the signature comes from a known or trusted identity, and thus that the model hasn't been tampered with after training.
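To make the integrity check concrete, here is a minimal sketch of how a model directory could be reduced to a single digest that a signature then covers. It is an illustration under stated assumptions, not the implementation used in this repository: the function name `digest_model_dir` and the exact way file names and hashes are combined are hypothetical, and the actual signing with Sigstore is not shown.

```python
import hashlib
from pathlib import Path


def digest_model_dir(model_dir: str, chunk_size: int = 1 << 20) -> str:
    """Return a hex SHA-256 digest covering every file under model_dir.

    Hypothetical helper for illustration only; the repository's own
    hashing scheme may differ.
    """
    root = Path(model_dir)
    files = [p for p in root.rglob("*") if p.is_file()]
    # Sort by relative path so the digest does not depend on listing order.
    files.sort(key=lambda p: p.relative_to(root).as_posix())

    outer = hashlib.sha256()
    for path in files:
        file_hash = hashlib.sha256()
        with path.open("rb") as f:
            # Read in chunks so large weight files are never fully in memory.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                file_hash.update(chunk)
        # Bind each file's relative name to its content hash. The length
        # prefix keeps different name/content splits from colliding.
        rel = path.relative_to(root).as_posix().encode("utf-8")
        outer.update(len(rel).to_bytes(8, "big"))
        outer.update(rel)
        outer.update(file_hash.digest())
    return outer.hexdigest()
```

Under this scheme, changing, renaming, adding, or removing any file changes the digest, so a signature over the digest no longer verifies for a modified model.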

Signing and verification stay fast even for large models: in the measurements below, both take roughly two seconds or less for models up to 14 GB.

Model	Size	Sign Time	Verify Time
roberta-base-11	8 KB	1s	0.6s
hustvl/YOLOP	215 MB	1s	1s
bertseq2seq	2.8 GB	1.9s	1.4s
bert-base-uncased	3.3 GB	1.6s	1.1s
tiiuae/falcon-7b	14 GB	2.1s	1.8s

See model_signing/README.md for more information.

SLSA for ML

This project shows how we can generate SLSA provenance for ML models, using either GitHub Actions or Google Cloud Platform.

SLSA was originally developed for traditional software to protect against tampering with builds, such as in the SolarWinds attack, and this project is a proof of concept that the same supply chain protections can be applied to ML.

We support both TensorFlow and PyTorch models. The examples train a model on the CIFAR10 dataset, save it in one of the supported formats, and generate provenance for the output. The supported formats are:

Workflow Argument	Training Framework	Model format
`tensorflow_model.keras`	TensorFlow	Keras format (default)
`tensorflow_hdf5_model.h5`	TensorFlow	Legacy HDF5 format
`tensorflow_hdf5.weights.h5`	TensorFlow	Legacy HDF5 weights only format
`pytorch_model.pth`	PyTorch	PyTorch default format
`pytorch_full_model.pth`	PyTorch	PyTorch complete model format
`pytorch_jitted_model.pt`	PyTorch	PyTorch TorchScript format
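As a rough picture of what provenance for a saved model contains, the sketch below builds an unsigned statement whose subject is the model file's SHA-256 digest, then checks a file against it. The field names follow the in-toto Statement v1 and SLSA v1.0 provenance layouts as we understand them; the helper names, the `build_type` and `builder_id` values, and the `externalParameters` content are placeholders. Real provenance is generated and signed by the trusted builder rather than by the training script, and the workflows in this repository may emit a different SLSA version.

```python
import hashlib


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large models do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def make_provenance_statement(
    model_path: str, model_name: str, build_type: str, builder_id: str
) -> dict:
    """Build an unsigned, illustrative provenance statement (hypothetical helper)."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        # The subject ties the provenance to one exact model artifact.
        "subject": [
            {"name": model_name, "digest": {"sha256": sha256_file(model_path)}}
        ],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                "buildType": build_type,  # placeholder value
                "externalParameters": {"model": model_name},  # assumed content
            },
            "runDetails": {"builder": {"id": builder_id}},  # placeholder value
        },
    }


def model_matches_statement(model_path: str, statement: dict) -> bool:
    """Check that the file's digest appears among the statement's subjects."""
    actual = sha256_file(model_path)
    return any(
        s.get("digest", {}).get("sha256") == actual
        for s in statement.get("subject", [])
    )
```

Note that `model_matches_statement` only compares digests. A full verification would also check the signature on the provenance and that the builder identity is one the user trusts.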

See slsa_for_models/README.md for more information.

Status

This project is currently experimental and is not ready for all production use cases. We may make breaking changes until the first official release.

Contributing

Please see the Contributor Guide for more information.
