RADAR-CNS back-end

RADAR-Backend is a Java application based on Confluent Platform that standardizes, analyzes and persists data collected by RADAR-CNS data sources. It supports the backend requirements of the RADAR-CNS project. Data is produced and consumed in Apache Avro format using the schemas stored in the RADAR-CNS schema repository.

RADAR-Backend provides an abstraction layer to monitor and analyze streams of wearable data and to write data to hot or cold storage. The Application Programming Interfaces (APIs) of RADAR-Backend make it easier to integrate additional topics and wearable devices. It currently provides MongoDB as hot storage and HDFS as cold storage; both can be tuned using property files. The stream monitors watch topics and notify users (e.g. via email) under given circumstances.

Dependencies

The following are the prerequisites to run RADAR-Backend on your machine:

Java 8
Confluent Platform 5.0.0 (running instances of the Zookeeper, Kafka broker(s), Schema Registry and Kafka REST Proxy services)
An SMTP server to send notifications from the monitors

Installation

Install the dependencies mentioned above.

Clone the radar-backend repository.

git clone https://github.com/RADAR-base/radar-backend.git

Build the project from the project directory.

# Navigate to project directory
cd radar-backend/

# Clean
./gradlew clean

# Build
./gradlew distTar

# Unpack the binaries into /usr/local
sudo mkdir -p /usr/local && \
sudo tar -xzf build/distributions/*.tar.gz --strip-components 1 -C /usr/local

Now the backend is available as the /usr/local/bin/radar-backend script.

Usage

The RADAR command-line interface has three subcommands: stream, monitor and mock. The stream command starts all streams, the monitor command starts all monitors, and the mock command sends mock data to the backend. Before issuing any of these commands, start the Confluent Platform with the zookeeper, kafka, schema-registry and rest-proxy components. Put build/libs/radarbackend-1.0.jar and radar.yml in the same folder, and then modify radar.yml:

RADAR-Backend streams

In radar.yml, specify the mode in which to run the application. There are two alternatives: standalone and high_performance. The standalone mode starts one thread for each stream without checking the priority, whereas the high_performance mode starts as many threads per stream as its priority value.
If auto.create.topics.enable is false in your Kafka server.properties, you must create the topics manually before starting. The stream server will print which topics to create.
Run radar-backend with the configured radar.yml and the stream argument:
```
radar-backend -c path/to/radar.yml stream
```
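
The difference between the two modes can be illustrated with a minimal sketch. The class StreamThreadPlanner, its Mode enum and the threadsFor method are hypothetical names used only for illustration and are not part of RADAR-Backend. The sketch also assumes that a priority below 1 is treated as 1, which the description above does not state.

```java
/** Hypothetical helper for illustration; not part of the RADAR-Backend code base. */
final class StreamThreadPlanner {
    enum Mode { STANDALONE, HIGH_PERFORMANCE }

    /**
     * Returns how many threads to start for a single stream.
     * Assumption: a priority below 1 is treated as 1.
     */
    static int threadsFor(Mode mode, int priority) {
        if (mode == Mode.STANDALONE) {
            return 1; // the priority is ignored in standalone mode
        }
        return Math.max(1, priority); // as many threads as the priority value
    }

    public static void main(String[] args) {
        System.out.println(threadsFor(Mode.STANDALONE, 4));
        System.out.println(threadsFor(Mode.HIGH_PERFORMANCE, 4));
    }
}
```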

The phone usage event stream uses an internal cache of 1 million elements, which may take about 50 MB of memory. Adjust org.radarcns.stream.phone.PhoneUsageStream.MAX_CACHE_SIZE to change it.
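
With the figures above, each cached element accounts for roughly 50 bytes (50 MB divided by 1 million elements). To show what a size-bounded cache looks like, here is a minimal sketch built on java.util.LinkedHashMap. The class BoundedCache is hypothetical, and the least-recently-used eviction policy is an assumption: the actual cache used by PhoneUsageStream may be implemented differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded cache; the real PhoneUsageStream cache may differ. */
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    BoundedCache(int maxSize) {
        // accessOrder = true: the least recently used entry becomes the eldest
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict once the bound is exceeded
        return size() > maxSize;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("com.example.a", "GAME");
        cache.put("com.example.b", "TOOLS");
        cache.put("com.example.c", "MUSIC"); // evicts the least recently used entry
        System.out.println(cache.keySet());
    }
}
```

A larger bound keeps more entries in memory and avoids repeated lookups; a smaller bound trades memory for more cache misses.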

RADAR-backend monitors

To get email notifications for Empatica E4 battery status, set up an email server without a password, for example on localhost.

For the battery status monitor, configure the following:

battery_monitor:
  # level of battery you want to monitor
  level: CRITICAL 
  # list of email addresses to be notified   
  notify:
    - project_id: s1
      email_address:
        - test@thehyve.nl
    - project_id: s2
      email_address:
        - radar@thehyve.nl
  # host name of your email server     
  email_host: localhost
  # port of email server   
  email_port: 25
  # notifying email account   
  email_user: noreply@example.com
  # list of topics to be monitored (related to monitor behavior)
  topics:
    - android_empatica_e4_battery_level

For the device connection monitor, configure the following:

disconnect_monitor:
  # timeout in milliseconds -> 5 minutes
  timeout: 300000
  email_host: localhost
  email_port: 25
  email_user: no-reply@example.com
  notify:
    - project_id: s1
      email_address:
        - test@thehyve.nl
    - project_id: s2
      email_address:
        - radar@thehyve.nl
  # temperature readings are sent very regularly, but
  # not too often.
  topics:
    - android_empatica_e4_temperature
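
The timeout setting above can be read as: a source counts as disconnected once no message has arrived for longer than the timeout. The following is a minimal sketch of that idea only. The class DisconnectTracker and its methods are hypothetical and do not reflect how DisconnectMonitor is actually implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of timeout-based disconnect detection. */
final class DisconnectTracker {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    DisconnectTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that a message from the given source arrived at the given time. */
    void seen(String sourceId, long nowMillis) {
        lastSeen.put(sourceId, nowMillis);
    }

    /** Returns the sources whose last message is older than the timeout. */
    List<String> disconnected(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        DisconnectTracker tracker = new DisconnectTracker(300_000L); // 5 minutes
        tracker.seen("source-a", 0L);
        tracker.seen("source-b", 200_000L);
        System.out.println(tracker.disconnected(300_001L));
    }
}
```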

For the source statistics monitors, configure which source topics to monitor in order to output basic statistics (such as the last time a source was seen):

stream:
  statistics_monitors:
    # Human readable monitor name
    - name: Empatica E4
      # topics to aggregate. This can take any number of topics that may
      # lead to slightly different statistics
      topics:
        - android_empatica_e4_blood_volume_pulse_1min
      # Topic to write results to. This should follow the convention
      # source_statistics_[provider]_[model] with provider and model as
      # defined in RADAR-Schemas
      output_topic: source_statistics_empatica_e4
      # Maximum batch size to aggregate before sending results.
      # Defaults to 1000.
      max_batch_size: 500
      # Flush timeout in milliseconds. If the batch does not reach
      # max_batch_size within this amount of time, the current batch is
      # forcefully flushed to the output topic.
      # Defaults to 60000 = 1 minute.
      flush_timeout: 15000
    - name: Biovotion VSM1
      topics:
        - android_biovotion_vsm1_acceleration_1min
      output_topic: source_statistics_biovotion_vsm1
    - name: RADAR pRMT
      topics:
        - android_phone_acceleration_1min
        - android_phone_bluetooth_devices
        - android_phone_sms
      output_topic: source_statistics_radar_prmt
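
The interplay of max_batch_size and flush_timeout can be sketched as follows. The class BatchBuffer is hypothetical and simplified: it only checks the timeout when a new record arrives, whereas a real monitor would presumably also flush from a timer. It is meant to illustrate the two flush conditions, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of size- and time-based batch flushing. */
final class BatchBuffer {
    private final int maxBatchSize;
    private final long flushTimeoutMillis;
    private final List<String> batch = new ArrayList<>();
    private long batchStartMillis; // arrival time of the first record in the batch

    BatchBuffer(int maxBatchSize, long flushTimeoutMillis) {
        this.maxBatchSize = maxBatchSize;
        this.flushTimeoutMillis = flushTimeoutMillis;
    }

    /** Adds a record; returns the flushed batch, or an empty list if nothing was flushed. */
    List<String> add(String record, long nowMillis) {
        if (batch.isEmpty()) {
            batchStartMillis = nowMillis;
        }
        batch.add(record);
        boolean full = batch.size() >= maxBatchSize;
        boolean timedOut = nowMillis - batchStartMillis >= flushTimeoutMillis;
        if (full || timedOut) {
            List<String> flushed = new ArrayList<>(batch);
            batch.clear();
            return flushed;
        }
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        BatchBuffer buffer = new BatchBuffer(500, 15_000L);
        System.out.println(buffer.add("record-1", 0L));
        System.out.println(buffer.add("record-2", 20_000L)); // flushed by timeout
    }
}
```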

Run radar-backend with the configured radar.yml and the monitor argument:
```
radar-backend -c path/to/radar.yml monitor
```

Send mock data to the backend

Configure the REST proxy setting in radar.yml:

rest-proxy:
  host: radar-test.thehyve.net
  port: 8082
  protocol: http

To send pre-made data, create a mock_data.yml YAML file with the following contents:
```
data:
  - topic: topic1
    file: topic1.csv
    key_schema: org.radarcns.kafka.ObservationKey
    value_schema: org.radarcns.passive.empatica.EmpaticaE4Acceleration
```
Each entry specifies a topic to send the data to, a file containing the data, a schema class for the key and a schema class for the value. Also create a CSV file for each of these entries:
```
userId,sourceId,time,timeReceived,acceleration
a,b,14191933191.223,14191933193.223,[0.001;0.3222;0.6342]
a,c,14191933194.223,14191933195.223,[0.13131;0.6241;0.2423]
```
Note that for array entries, use brackets ([ and ]) to enclose the values and use ; as a delimiter.
To generate data on some backend_mock_empatica_e4_<> topic with a number of devices, run (substitute <num-devices> with the needed number of devices):
```
radar-backend -c path/to/radar.yml mock --devices <num-devices>
```
Press Ctrl-C to stop.
To send the file data configured in mock_data.yml, run
```
radar-backend -c path/to/radar.yml mock --file mock_data.yml
```
The process stops automatically once the data has been sent.
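
The array notation used in the CSV files above (brackets around the values, ; as a delimiter) can be parsed with a few lines of code. This is a minimal sketch under the assumption that all array elements are numeric. The class MockCsvArrays and the method parseArray are hypothetical names and not the parser used by the mock command.

```java
/** Hypothetical parser for array entries such as [0.001;0.3222;0.6342]. */
final class MockCsvArrays {
    /** Parses a bracketed, semicolon-delimited field into a double array. */
    static double[] parseArray(String field) {
        String trimmed = field.trim();
        if (!trimmed.startsWith("[") || !trimmed.endsWith("]")) {
            throw new IllegalArgumentException(
                    "Array entries must be enclosed in [ and ]: " + field);
        }
        String inner = trimmed.substring(1, trimmed.length() - 1).trim();
        if (inner.isEmpty()) {
            return new double[0];
        }
        String[] parts = inner.split(";");
        double[] values = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Double.parseDouble(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        double[] acceleration = parseArray("[0.001;0.3222;0.6342]");
        System.out.println(acceleration.length);
    }
}
```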

Docker image

The backend is published to Docker Hub. Mount a /etc/radar.yml file to configure either the streams or the monitor.

This image requires the following environment variables:

KAFKA_REST_PROXY: a valid REST Proxy instance.
KAFKA_SCHEMA_REGISTRY: a valid Confluent Schema Registry.
KAFKA_BROKERS: number of brokers expected (default: 3).

For a complete use case scenario, see the RADAR-base docker-compose file.

Contributing

Code should be formatted using the Google Java Code Style Guide. If you want to contribute a feature or fix, browse our issues and please make a pull request.

There are currently two APIs in RADAR-Backend: one for streaming data (RADAR-Stream) and one for monitoring topics (RADAR-Monitor). To contribute to those APIs, please mind the following.

Extending RADAR-Stream

RADAR-Stream is a layer on top of Kafka Streams. Topics are processed by streams in two phases. First, a group of sensor streams aggregates sensor data into predefined time windows (e.g., 10 seconds). Next, internal topics aggregate and transform data that has already been processed by an earlier stream.
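
To make the first phase concrete, the sketch below assigns timestamped values to fixed 10-second windows and computes one aggregate per window. It uses plain Java collections instead of Kafka Streams, and the average is only an example aggregate. The class WindowAverager and its methods are hypothetical and not part of RADAR-Stream.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of aggregating values into fixed time windows. */
final class WindowAverager {
    /** Returns the start of the fixed-size window that contains the timestamp. */
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, windowSizeMillis);
    }

    /** Averages values per window; timestamps[i] belongs to values[i]. */
    static Map<Long, Double> averagePerWindow(
            long[] timestamps, double[] values, long windowSizeMillis) {
        Map<Long, double[]> sums = new TreeMap<>(); // window start -> {sum, count}
        for (int i = 0; i < timestamps.length; i++) {
            long start = windowStart(timestamps[i], windowSizeMillis);
            double[] acc = sums.computeIfAbsent(start, k -> new double[2]);
            acc[0] += values[i];
            acc[1] += 1;
        }
        Map<Long, Double> averages = new TreeMap<>();
        for (Map.Entry<Long, double[]> entry : sums.entrySet()) {
            averages.put(entry.getKey(), entry.getValue()[0] / entry.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        long[] timestamps = {1_000L, 9_000L, 12_000L};
        double[] values = {1.0, 3.0, 5.0};
        System.out.println(averagePerWindow(timestamps, values, 10_000L));
    }
}
```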

KafkaStreams currently communicates using a master-slave model. StreamMaster defines the stream master, while StreamWorker represents the stream slave. The stream master creates, starts and stops the list of stream slaves registered with it. While the classical Kafka Consumer requires two implementations to support standalone and group executions, StreamWorker provides both behaviors with one implementation.

To extend the RADAR-Stream API, follow these steps (see the org.radarcns.passive.empatica package as an example):

For each topic, create a StreamWorker or, more conveniently, extend SensorStreamWorker.
Add the stream class to the stream: streams: [{class: MyClass}] configuration.

Empatica E4

Currently, RADAR-Backend provides implementations to stream, monitor and store Empatica E4 topic data produced by the RADAR-AndroidApplication. It defines the following streams:

E4Acceleration aggregates data coming from the accelerometer
E4BatteryLevel aggregates battery level information
E4BloodVolumePulse aggregates blood volume pulse data
E4ElectroDermalActivity aggregates electrodermal activity information
E4InterBeatInterval aggregates inter-beat-interval data
E4Temperature aggregates data coming from the temperature sensor

And one internal topic:

E4HeartRate: starting from the inter-beat-interval, this aggregator computes the heart rate
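
The usual conversion from an inter-beat interval to a heart rate is 60 divided by the interval in seconds. The sketch below shows only that conversion, assuming the interval is given in seconds. How E4HeartRate combines the values within a time window is not described here, and the class HeartRate is a hypothetical name.

```java
/** Hypothetical sketch of the inter-beat-interval to heart rate conversion. */
final class HeartRate {
    /** Converts one inter-beat interval in seconds to beats per minute. */
    static double fromInterBeatInterval(double ibiSeconds) {
        if (ibiSeconds <= 0) {
            throw new IllegalArgumentException("Interval must be positive: " + ibiSeconds);
        }
        return 60.0 / ibiSeconds;
    }

    public static void main(String[] args) {
        System.out.println(fromInterBeatInterval(0.5));
    }
}
```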

DeviceTimestampExtractor implements a TimestampExtractor that, given a generic Apache Avro object as input, extracts the field named timeReceived. DeviceTimestampExtractor works with all sensor schemas currently available.
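
The extraction step can be sketched without the Kafka and Avro libraries by using a plain map as a stand-in for the Avro record. Two assumptions are made here and are not confirmed by the text above: timeReceived holds seconds since the Unix epoch as a number, and a caller-supplied fallback is used when the field is missing. The class TimeReceivedExtractor is hypothetical and is not the actual DeviceTimestampExtractor.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for extracting timeReceived from a record. */
final class TimeReceivedExtractor {
    /**
     * Returns timeReceived in milliseconds.
     * Assumption: the field holds seconds since the Unix epoch.
     */
    static long extractMillis(Map<String, Object> record, long fallbackMillis) {
        Object value = record.get("timeReceived");
        if (!(value instanceof Number)) {
            return fallbackMillis; // field missing or not numeric
        }
        return Math.round(((Number) value).doubleValue() * 1000.0);
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("timeReceived", 1.5);
        System.out.println(extractMillis(record, -1L));
    }
}
```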

Android Phone

For the Android Phone, there is a stream that looks up the Google Play Store category of an app for app usage events.

Extending RADAR-Monitor

Monitors can be used to evaluate the status of a single stream, for example whether each device is still online, has acceptable values and is transmitting at an acceptable rate. To create a new monitor, extend AbstractKafkaMonitor. To use the monitor from the command-line, modify KafkaMonitorFactory. See DisconnectMonitor for an example.

NOTE

Another path to the YAML configuration file can be given with the -c flag:
```
# Custom
java -jar radarbackend-1.0.jar -c path/to/radar.yml
```
The default log path is the folder that contains the jar.
