google / oculi

A Google Cloud-based pipeline for tagging image and video ads based on their content, enabling advanced creative analysis.

Note: This is not an officially supported Google product. It is a reference implementation.

Oculi

Oculi is a Google Cloud-based pipeline for tagging large sets of images or videos with labels based on their content, generating a BigQuery dataset for further analysis. Content tagging is done through Google Cloud's pre-trained computer vision models (the Vision API and the Video Intelligence API).

The primary use case is analyzing creatives (images and videos) in digital advertising. Combined with creative performance data, the output from this pipeline can be used to explore correlations between advertising content and performance (for example, whether creatives with a human model tend to perform better).

Creative Sources

This pipeline supports three sources of creatives (see footnote 1):

  • A Google Campaign Manager (CM) account. Oculi will attempt to extract all creatives on the account that have an image or video asset in a suitable format (see footnote 2), then download the asset and save a copy to Cloud Storage. Users of DV360 may be able to use this option (see FAQ).
  • A BigQuery table of URLs. URLs must point to images or videos, and be accessible without login. Oculi will download the asset and save a copy to Cloud Storage. The required table columns are:
    • Creative_ID, a unique integer for each image or video
    • Advertiser_ID, an integer identifying a parent entity
    • Creative_Name, a text field
    • Full_URL, the URL to the image or video file
  • A Google Cloud Storage (GCS) bucket of creative files. Files must be image or video files (JPG and MP4 preferred) at the top level of the bucket.
    • If the filenames follow the convention {numeric_id}_{other_stuff}.jpg, then the numeric_id will be used as the creative_id.
    • Otherwise, a creative_id is generated by calling the Python hash() function on the entire filename.
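
The table-column and filename conventions above can be illustrated with a minimal sketch. This is not Oculi's actual code: the helper names `missing_columns` and `creative_id_from_filename` are hypothetical, and the regular expression assumes that a leading run of digits followed by an underscore marks the numeric ID, whatever the file extension. Note also that Python salts string hashes per interpreter process unless PYTHONHASHSEED is set, so a hash()-based ID may differ between runs.

```python
import re

# Required columns for the BigQuery table of URLs, as listed above.
REQUIRED_COLUMNS = ("Creative_ID", "Advertiser_ID", "Creative_Name", "Full_URL")

# Assumption: leading digits followed by "_" are the numeric ID,
# regardless of the file extension.
_NUMERIC_PREFIX = re.compile(r"^(\d+)_")


def missing_columns(row):
    """Return the required column names absent from a row dict (hypothetical helper)."""
    return [name for name in REQUIRED_COLUMNS if name not in row]


def creative_id_from_filename(filename):
    """Derive a creative_id from a file name in the bucket (hypothetical helper)."""
    match = _NUMERIC_PREFIX.match(filename)
    if match:
        # Convention {numeric_id}_{other_stuff}: use the numeric prefix.
        return int(match.group(1))
    # Fallback described above: hash the entire filename.
    # The result is only stable across runs if PYTHONHASHSEED is fixed.
    return hash(filename)
```

For example, `creative_id_from_filename("12345_summer_sale.jpg")` yields the integer 12345, while a name such as `banner.jpg` falls through to the hash-based ID.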

You can deploy Oculi in either of two ways:

  • Colab - Files are located under the colab subdirectory
  • Dataflow - Files are located under the Dataflow subdirectory

Footnotes

  1. The term "creative" is used to indicate just the image or video content in an ad, excluding other components like targeting preferences.

  2. This excludes dynamic creatives. The goal of this pipeline is to break images and videos down into their content components; dynamic creatives are already broken into content components. Instead, analysis of the data from this pipeline can be used to inform a dynamic creative strategy.

About

License: Apache License 2.0

