RADAR-Backend is a Java application based on Confluent Platform to standardize, analyze and persist data collected by RADAR-CNS data sources. It supports the backend requirements of the RADAR-CNS project. Data is produced and consumed in Apache Avro format using the schemas stored in the RADAR-CNS schema repository.

RADAR-Backend provides an abstraction layer to monitor and analyze streams of wearable data and to write data to hot or cold storage. The Application Programming Interfaces (APIs) of RADAR-Backend make it easier to integrate additional topics and wearable devices. It currently provides MongoDB as the hot storage and HDFS as the cold storage; both can be tuned using property files. The stream monitors watch topics and notify users (e.g. via email) under given circumstances.
The following are the prerequisites to run RADAR-Backend on your machine:
- Java 8
- Confluent Platform 5.0.0 (running instances of Zookeeper, Kafka-broker(s), Schema-Registry and Kafka-REST-Proxy services)
- SMTP server to send notifications from the monitors.
- Install the dependencies mentioned above.
- Clone the radar-backend repository:

  ```shell
  git clone https://github.com/RADAR-base/radar-backend.git
  ```

- Build the project from the project directory:

  ```shell
  # Navigate to the project directory
  cd RADAR-Backend/
  # Clean
  ./gradlew clean
  # Build
  ./gradlew distTar
  # Unpack the binaries
  sudo mkdir -p /usr/local && \
    sudo tar --strip-components 1 -C /usr/local -xzf build/distributions/*.tar.gz
  ```

Now the backend is available as the `/usr/local/bin/radar-backend` script.
The RADAR command-line has three subcommands: `stream`, `monitor` and `mock`. The `stream` command will start all streams, the `monitor` command will start all monitors, and the `mock` command will send mock data to the backend. Before any of these commands are issued, start the Confluent platform with the zookeeper, kafka, schema-registry and rest-proxy components. Put `build/libs/radarbackend-1.0.jar` and `radar.yml` in the same folder, and then modify `radar.yml`:
- In `radar.yml`, specify in which `mode` you want to run the application. There are two alternatives: `standalone` and `high_performance`. The `standalone` mode starts one thread for each stream without checking the priority, whereas the `high_performance` mode starts as many threads as the related priority value.
- If `auto.create.topics.enable` is `false` in your Kafka `server.properties`, you must create the topics manually before starting. The stream server will print what topics to create.
- Run `radar-backend` with the configured `radar.yml` and the `stream` argument:

  ```shell
  radar-backend -c path/to/radar.yml stream
  ```
The phone usage event stream uses an internal cache of 1 million elements, which may take about 50 MB of memory. Adjust `org.radarcns.stream.phone.PhoneUsageStream.MAX_CACHE_SIZE` to change it.
To get email notifications for Empatica E4 battery status, set up an email server without a password, for example on `localhost`.
- For the battery status monitor, configure the following:

  ```yaml
  battery_monitor:
    # level of battery you want to monitor
    level: CRITICAL
    # list of email addresses to be notified
    notify:
      - project_id: s1
        email_address:
          - test@thehyve.nl
      - project_id: s2
        email_address:
          - radar@thehyve.nl
    # host name of your email server
    email_host: localhost
    # port of email server
    email_port: 25
    # notifying email account
    email_user: noreply@example.com
    # list of topics to be monitored (related to monitor behavior)
    topics:
      - android_empatica_e4_battery_level
  ```
- For the device connection monitor, configure the following:

  ```yaml
  disconnect_monitor:
    # timeout in milliseconds -> 5 minutes
    timeout: 300000
    email_host: localhost
    email_port: 25
    email_user: no-reply@example.com
    notify:
      - project_id: s1
        email_address:
          - test@thehyve.nl
      - project_id: s2
        email_address:
          - radar@thehyve.nl
    # temperature readings are sent very regularly, but
    # not too often.
    topics:
      - android_empatica_e4_temperature
  ```
- For the source statistics monitors, configure which source topics to monitor in order to output some basic statistics (like the last time a source was seen):

  ```yaml
  stream:
    statistics_monitors:
      # Human readable monitor name
      - name: Empatica E4
        # topics to aggregate. This can take any number of topics that may
        # lead to slightly different statistics
        topics:
          - android_empatica_e4_blood_volume_pulse_1min
        # Topic to write results to. This should follow the convention
        # source_statistics_[provider]_[model] with provider and model as
        # defined in RADAR-Schemas
        output_topic: source_statistics_empatica_e4
        # Maximum batch size to aggregate before sending results.
        # Defaults to 1000.
        max_batch_size: 500
        # Flush timeout in milliseconds. If the batch size is not larger than
        # max_batch_size for this amount of time, the current batch is
        # forcefully flushed to the output topic.
        # Defaults to 60000 = 1 minute.
        flush_timeout: 15000
      - name: Biovotion VSM1
        topics:
          - android_biovotion_vsm1_acceleration_1min
        output_topic: source_statistics_biovotion_vsm1
      - name: RADAR pRMT
        topics:
          - android_phone_acceleration_1min
          - android_phone_bluetooth_devices
          - android_phone_sms
        output_topic: source_statistics_radar_prmt
  ```
- Run `radar-backend` with the configured `radar.yml` and the `monitor` argument:

  ```shell
  radar-backend -c path/to/radar.yml monitor
  ```
1. Configure the REST proxy setting in `radar.yml`:

   ```yaml
   rest-proxy:
     host: radar-test.thehyve.net
     port: 8082
     protocol: http
   ```
2. To send pre-made data, create a `mock_data.yml` YAML file with the following contents:

   ```yaml
   data:
     - topic: topic1
       file: topic1.csv
       key_schema: org.radarcns.kafka.ObservationKey
       value_schema: org.radarcns.passive.empatica.EmpaticaE4Acceleration
   ```

   Each value has a topic to send the data to, a file containing the data, a schema class for the key and a schema class for the value. Also create a CSV file for each of these entries:

   ```csv
   userId,sourceId,time,timeReceived,acceleration
   a,b,14191933191.223,14191933193.223,[0.001;0.3222;0.6342]
   a,c,14191933194.223,14191933195.223,[0.13131;0.6241;0.2423]
   ```
   Note that for array entries, use brackets (`[` and `]`) to enclose the values and use `;` as a delimiter.
3. To generate data on some `backend_mock_empatica_e4_<>` topic with a number of devices, run (substitute `<num-devices>` with the needed number of devices):

   ```shell
   radar-backend -c path/to/radar.yml mock --devices <num-devices>
   ```

   Press `Ctrl-C` to stop.
4. To generate the file data configured in point 2, run

   ```shell
   radar-backend -c path/to/radar.yml mock --file mock_data.yml
   ```

   The data sending will automatically be stopped.
The backend is published to Docker Hub. Mount a `/etc/radar.yml` file to configure either the streams or the monitor.

This image requires the following environment variables; a minimal compose sketch follows the list:

- `KAFKA_REST_PROXY`: a valid Rest-Proxy instance
- `KAFKA_SCHEMA_REGISTRY`: a valid Confluent Schema Registry
- `KAFKA_BROKERS`: number of brokers expected (default: 3)
For a complete use case scenario, check the RADAR-base `docker-compose` file available here.
Code should be formatted using the Google Java Code Style Guide. If you want to contribute a feature or fix, browse our issues and please make a pull request.
There are currently two APIs in RADAR-Backend: one for streaming data (RADAR-Stream) and one for monitoring topics (RADAR-Monitor). To contribute to those APIs, please mind the following.
RADAR-Stream is a layer on top of Kafka Streams. Topics are processed by streams in two phases. First, a group of sensor streams aggregates data from sensors into predefined time windows (e.g., 10 seconds). Next, internal topics aggregate and transform data that has already been processed by an earlier stream.
KafkaStreams currently communicates using a master-slave model. The StreamMaster defines the stream master, while StreamWorker represents the stream slave. The master stream creates, starts and stops a list of stream slaves registered with the corresponding master. While the classical Kafka Consumer requires two implementations to support standalone and group executions, the StreamWorker provides both behaviors with one implementation.
To extend the RADAR-Stream API, follow these steps (see the `org.radarcns.passive.empatica` package as an example, and the sketch after these steps):
- For each topic, create a StreamWorker or more conveniently extend SensorStreamWorker.
- Add the stream topic to the `stream: streams: [{class: MyClass}]` configuration.
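As a rough illustration of the kind of windowed aggregation such a stream performs, here is a minimal sketch using the plain Kafka Streams API rather than the RADAR-Backend classes. The topic names, String serdes and count aggregation are illustrative assumptions, not the actual SensorStreamWorker API.

```java
import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class SensorWindowCountSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Count records per device key in fixed 10-second windows, mirroring the
        // "aggregate sensor data into predefined time windows" phase described above.
        // Real RADAR topics carry Avro records; String values are used here only to
        // keep the sketch self-contained.
        builder.<String, String>stream("android_empatica_e4_temperature")
                .groupByKey()
                .windowedBy(TimeWindows.of(Duration.ofSeconds(10)))
                .count()
                .toStream()
                .map((windowedKey, count) -> KeyValue.pair(windowedKey.key(), count))
                .to("android_empatica_e4_temperature_10sec_count",
                        Produced.with(Serdes.String(), Serdes.Long()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sensor-window-count-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        new KafkaStreams(builder.build(), props).start();
    }
}
```

In the actual workers, the Avro serdes, window sizes and output topics are wired up through the RADAR-Backend configuration rather than hard-coded values.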
Currently, RADAR-Backend provides implementations to stream, monitor and store Empatica E4 topic data produced by the RADAR-AndroidApplication. It defines the following streams:

- E4Acceleration aggregates data coming from the accelerometer
- E4BatteryLevel aggregates battery level information
- E4BloodVolumePulse aggregates blood volume pulse data
- E4ElectroDermalActivity aggregates electrodermal activity information
- E4InterBeatInterval aggregates inter-beat interval data
- E4Temperature aggregates data coming from the temperature sensor

And one internal topic:

- E4HeartRate: starting from the inter-beat interval, this aggregator computes the heart rate
DeviceTimestampExtractor implements a TimestampExtractor such that, given a generic Apache Avro object as input, it extracts a field named `timeReceived`. DeviceTimestampExtractor works with the entire set of sensor schemas currently available.
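For reference, a custom extractor along these lines could look like the following minimal sketch. This is not the actual DeviceTimestampExtractor source; the seconds-to-milliseconds conversion and the fallback behavior are assumptions.

```java
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

public class TimeReceivedTimestampExtractor implements TimestampExtractor {
    @Override
    public long extract(ConsumerRecord<Object, Object> record, long partitionTime) {
        Object value = record.value();
        if (value instanceof GenericRecord) {
            Object timeReceived = ((GenericRecord) value).get("timeReceived");
            if (timeReceived instanceof Number) {
                // RADAR schemas express time as a double in seconds since the Unix epoch;
                // Kafka Streams expects milliseconds.
                return (long) (((Number) timeReceived).doubleValue() * 1000d);
            }
        }
        // Fall back to the record's own timestamp when the field is absent.
        return record.timestamp();
    }
}
```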
For the Android Phone, there is a stream to get an app category from the Google Play Store categories for app usage events.
Monitors can be used to evaluate the status of a single stream, for example whether each device is still online, has acceptable values and is transmitting at an acceptable rate. To create a new monitor, extend AbstractKafkaMonitor. To use the monitor from the command-line, modify KafkaMonitorFactory. See DisconnectMonitor for an example.
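To illustrate the kind of evaluation a monitor performs, here is a minimal sketch using a plain KafkaConsumer loop; a real monitor should extend AbstractKafkaMonitor instead, and the topic name, field name, threshold and notification hook below are illustrative assumptions.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class BatteryLevelMonitorSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "battery-monitor-sketch");
        // Assumes the Confluent Avro deserializer and a local Schema Registry.
        props.put("key.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");

        try (KafkaConsumer<GenericRecord, GenericRecord> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("android_empatica_e4_battery_level"));
            while (true) {
                ConsumerRecords<GenericRecord, GenericRecord> records =
                        consumer.poll(Duration.ofSeconds(10));
                for (ConsumerRecord<GenericRecord, GenericRecord> record : records) {
                    // Assumed field name; check the Empatica E4 battery schema in RADAR-Schemas.
                    float level = ((Number) record.value().get("batteryLevel")).floatValue();
                    if (level < 0.05f) {
                        // A real monitor would send an email notification here.
                        System.out.println("Battery critical for key " + record.key());
                    }
                }
            }
        }
    }
}
```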
- Another path to the YAML configuration file can be given with the `-c` flag:

  ```shell
  # Custom
  java -jar radarbackend-1.0.jar -c path/to/radar.yml
  ```

- The default log path is the jar folder.