dorion / CO2

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Video conference GHG emission savings calculator, based on session logs processing

Introduction

This project is a GNU Gatekeeper CDR log messages processor that calculates GHG emission savings of using video conferencing

Features

  • filters the original log messages from MySQL database

  • separates the conferences and the participants from the CDR messages

  • resolves participant locations by GDS number or IP address

  • resolves distance and possible travel time (by air or car) with Google distance API between participants' locations

  • creates table based GHG emission savings report with monthly and yearly resolution

  • shows participants' GHG emission saving per conference session

  • shows the particiapts' locations and the calculated route (by car) on Google Map

How does it work?

Install the croner scripts, which process the logs by the algorithm. Filter out the unprocessable data and move data into the temp table. Then scripts from the CDR log message separate and write into database the conferences and the conferences participants. Than resolve the participants' and the conference locations. Finally resolve the distance and the possible travel time between the conference and the participants.

After that the Drupal module shows the GHG emission saved in yearly and monthly reports.

Algorithm

As the first step, we collect the conference log entries from various databases and consolidate them into a common data structure into a temporary database table. After pre filtering the logs/conferences as the second step we restructure the data into a structure that can be handled much more easier later while preparing the reports.

Here the CONFERENCE_ID and the PARTICIPANT_ID are unique database scoped own identifiers. In the process almost all personal data.

While filtering the data, we use the following algorithm to calculate the derived values:

  1. Collect every information in a common format;

  2. Drop all conference shorter than 5 minutes assuming that these were test calls and also narrow log records of only one gatekeeper.

  3. Before processing drop log records of the VCR (Video Content Recorder)-

  4. Than separate the peer-to-peer, MCU conferences and the participants and create connection between them. While (process every conference assuming that CONF_ID is a unique identifier): if (the conference do not have more than one participant): We can forget about this conference

  5. Assign the location of the participants: if (the IP or GDS of participant is known): if (we know really specific location from our own register): store location information into the participant record else: Find out the most exact location information and store it into the participant record (City or Country level information depending on the source: IP or GDS) else: We have no location information about the participant so we can forget about it. Remove Participant and the related confernece-participant connection from the database

  6. Set the location, start_time and duration of the conference to a randomly chosen participant

  7. Calculate the distance of every participants of this conference to its location and the time of the travel needed using Google Maps API

  8. Since the frequency of this query is limited the results are cached.

  9. Create reports using mainly "Conference-Participant" table for the time slot requested by the user by summarizing the distances not done.

Input format

At our organization we collect CDRs into MySQL tables. Therefore it is easy to filter and convert to the input format which is the same as templ_log table schema below: * conf_id char (100): identifier of a conference * start_datetime date time: when the conference was start * duration integer (10): conference duration in second * caller_GDS char (100): caller’s GDS number if it exists * called_GDS char (100): called’s GDS number if it exists * caller_IP char (100): caller’s IP address if it exists * called_IP char (100): called’s IP address if it exists

Afterwards we process these database records, and not the original ones. For this reason, if someone wants to use this application he/she should convert the available log entries into this format and insert them into temp_log table.

There is no import API or interface yet. This conversion must be done manually, we do not provide any converter tool.

Filtering

Before parsing the log, we have to filter the input data: 1. First select record of only one gatekeeper 2. Then drop the short conference (less then 5 minutes) 3. After that drop the VCR (Video Content Recorder) records 4. Finally insert the records into the temp_log table.

In practice this is only two steps: an SQL SELECT and an SQL INSERT operation.

Conference and participant unstick

In the beginning separate the MCU conference and the participants of this type of conference. The procedure is as follows: 1. Select all records from temp_log table where the called IP, called GDS, the caller IP or the caller GDS belong to MCU 2. Then group these records by date and GDS number, these are the conferences, store them into conf table 3. The groups contain the participant, so we can store the info into the participant table 4. After storing the participants, we can create connection between the conferences and the participants 5. Finally delete all the processed records

After that process the peer-to-peer video conference. It is very simple because if a record does not belong to an MCU conference that refers to a peer-to-peer conference. These conferences represented by only one record in the database so we do not have to group them. 1. Select records that are not MCU conferences 2. Create the participants and the conference per record information 3. Connect them into the conf_part_trans table 4. Clean this record from the temp_log table

Create the participants per conference info and add a conference. This time we only know participant’s GDS number or/and IP number. About the conference we generally know only the start date and time and the conference duration what we calculate from the first participant’s connection time and the last’s disconnection time.

Location resolver

When we have participants we need to resolve their location: 1. Select participants which have got unresolved location 2. In this action we use our own GDS number-location database if there is a domestic (Hungarian) GDS number (our database contains only Hungarian data) 3. If the GDS number is not domestic we resolve the location by the GDS country prefix or if we only have IP addresses we use the freegeoip.com provided IP-to-geocoordinate API. 4. Update the participant’s location

The location accuracy is very diverse in many case, so it is only an approximate data!

Location accurancy improvement

At our NREN we have a precise database about our VC endpoints location information. So the domestic participant location is very precise. If your have a similar database you can improve your GHG emission report.

In this case you have to modify some PHP code, like the following example:

#cron/location_resolvel_croner.php
...

class myNRENLocationResolver extends LocationResolver {
  public function   public function ownLocationByGDS($GDS) {
    #your code here
  }
}

#LocationResolver::resolver();
myNRENLocationResolver::resolver();

...

Distance and duration resolver

After that, or parallel with what is described in the previous two paragraphs, we can start resolve distance and the duration. The assumed conference venue will be the location of a randomly choosed connected participant.

We use Google Maps Distance matrix API to provide distance and the duration of the trip. This web-service provide distance and the trip duration between two point on the map on the road network. The API can calculate distance in only three mode walking, bicycling and driving (with car). We use only the third (driving) mode to calculate the distance and duration. Since the web-servise cannot plan the route with train or other means of transport, we use this data when we calculate the GHG emission of the travel. Be aware, that this is a simplification.

In some cases we do not get back valid values, for example if the origin and the destination are not on the same continent or the map is incomplete. In this case we figure out the two points absolute geographical distance from each other and we try to estimate the duration of the journey.

The Google API has many limitations: It supports only 100 elements per query. It resolves only 100 elements per 10 seconds. 2 500 elements per 24 hour period.

For the above limitation and for more efficient operation and speed-up, we reuse the previously resolved A to B and B to A points reach, stored them in a cache.

GHG savings calculator

On a more precise way, the saved emission is equal to sum(calculated distance * vehicle specific GHG emission) - (time of conference * (conference type specific GHG emission + GHG emission of an average HD video conference endpoint device + GHG emission of the average display) * number of participants)

When calculating vehicle dependent emission we are assuming that the following vehicles are used on different distances: distance vehicle 0 - X car X - Y train Y - 800 km air (short distance) 801 km - infinity air (long distance)

X, Y and other constants are tunable parameters of the algorithm.

Conference type depending GHG emission: * MCU conference GHG emission = (MCU emission + gatekeeper emission) / number of participants * peer-to-peer conference GHG emission = gatekeeper emission / number of participants

Visualizer

This is a very simple Drupal module which provides two resolutions of the GHG saving report: monthly and yearly. In the first case we have to give the year and the month and in the second case we have to give only the year. In both two cases it a page is generated with a list of the conferences and on the top of this list there is a summary of the specified period. The list shows the conference ID, conference start date an time, conference duration and the GHG emission saved in kg or ton.

Clicking on the conference ID we can show the participants of the conference and their detailed data such as: * participant ID * Estimated GHG emission * GHG emission saving * Working hours savings and with special access permission * coordinates * display of a map with locations

The third part of the interface is the configuration display, where we can check the constants' (parameter) values.

List of the available constants: * CO2_EMISSION_CAR: average car GHG emission in g/km * CO2_EMISSION_TRAIN: average Hungarian train GHG emission in g/km * CO2_EMISSION_AEROPLANE_800: average aeroplane GHG emission under 800km in g/km * CO2_EMISSION_AEROPLANE_800_PLUS: average aeroplane GHG emission over 800km g/km * CO2_EMISSION_MCU: MCU GHG emission in g/s * CO2_EMISSION_GATEKEEPER: gatekeeper GHG emission in g/s * CO2_EMISSION_VIDCONF_ENDPOINT: video conference endpoint GHG emission in g/s * CO2_EMISSION_VIDCONF_ENDPOINT_DISPLAY: video conference endpoint display GHG emission in g/s * CO2_DISTANCE_CAR: maximal distance value when we calculate with car GHG emission value in km * CO2_DISTANCE_TRAIN: maximal distance value when we calculate with train GHG emission value in km * CO2_DISTANCE_AEROPLANE: maximal distance value when we calculate with aeroplane GHG emission value in km * CO2_AVERAGE_SPEED_AEROPLANE: average aeroplane speed in m/s to the duration calculation when we calculate the absolute distance

Install

Install the Drupal CO2 module

cd /your/drupal_root/sites/all/modules
git clone git://github.com/dorion/CO2.git

Add the following lines to the Drupal settings.php:

 $db_url = array(
             'default' => 'mysqli://drupal_db_user:password@sqlserver/drupal_database',
             'co2'     => 'mysqli://co2_db_user:password@sqlserver/co2_database',
           );

Obviously you have to change the db_user names, password, sql server and the database names.

Then go to http://yourdrupal.com/admin/modules and enable the "CO2 emission reporter" module.

Configure cron jobs

The processor script is written in PHP but they have not got any output. The scripts run with cli PHP in the background. PHP is a script program language so we have to run manually or can be automatically launch from cron. See the crontab file content below:

crontab -e

MAILTO=youremail@example.com
*/5 * * * * /var/www/drupal_6/sites/all/modules/CO2/cron/ghg_calculator_cron

Create database structure

Here is the database schema:

    CREATE TABLE conf (
      cid int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Conference new ID',
      start_datetime datetime DEFAULT NULL COMMENT 'Conference start date and time',
      duration int(10) unsigned DEFAULT NULL COMMENT 'Conference duration',
      latitude double DEFAULT NULL COMMENT 'Latitude',
      longitude double DEFAULT NULL COMMENT 'Longitude',
      PRIMARY KEY (cid)
    ) DEFAULT CHARSET=utf8;

    -- --------------------------------------------------------
    CREATE TABLE conf_part_trans (
      cid int(10) unsigned NOT NULL COMMENT 'Conference ID',
      pid int(10) unsigned NOT NULL COMMENT 'Participant ID',
      distance double unsigned DEFAULT NULL COMMENT 'Distance between the confernce palce and the participant',
      period int(10) unsigned DEFAULT NULL COMMENT 'The calculated journey duration.',
      PRIMARY KEY (cid,pid)
    ) DEFAULT CHARSET=utf8;

    -- --------------------------------------------------------
    CREATE TABLE participant (
      pid int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Participant ID',
      GDS varchar(20) COLLATE utf8_hungarian_ci DEFAULT NULL COMMENT 'GDS number if it exists',
      IP varchar(100) COLLATE utf8_hungarian_ci DEFAULT NULL COMMENT 'IP number is ot exists',
      latitude double DEFAULT NULL COMMENT 'Latitude',
      longitude double DEFAULT NULL COMMENT 'Longitude',
      PRIMARY KEY (pid)
    ) DEFAULT CHARSET=utf8;

    -- --------------------------------------------------------
    CREATE TABLE temp_log (
      ID int(10) unsigned NOT NULL AUTO_INCREMENT,
      conf_id varchar(100) COLLATE utf8_unicode_ci NOT NULL,
      start_datetime datetime NOT NULL,
      duration int(10) unsigned NOT NULL COMMENT 'Duration in second',
      caller_GDS varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
      called_GDS varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
      caller_IP varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
      called_IP varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
      PRIMARY KEY (ID)
    ) DEFAULT CHARSET=utf8;

Rename and fill in the config files

cd yourDrupalInstallation/module/CO2
mv cron/CO2.conf.php.default cron/CO2.conf.php
mv cron/db.inc.php.default cron/db.inc.php

Then fill them with the right data.

Future development possibility

  • Report narrow to regions or organisations

  • Optional cost calculating (need more data)

  • Export reports to PDF, csv, xls format

Implementation Remarks

About


Languages

Language:PHP 94.2%Language:Shell 5.8%