unispeech / unimrcp

Open source cross-platform implementation of MRCP protocol

Home Page:http://www.unimrcp.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

demo_recog_engine.c leaks unmasked rfc4733 DTMF input

michaelplevy opened this issue · comments

I know it is just a demonstration module, but this is an issue I think is worthy of reporting here rather than the forums.

In plugins/demo-recog/src/demo_recog_engine.c, in demo_recog_stream_write() there is the code:

		if(recog_channel->recog_request) {
			if((frame->type & MEDIA_FRAME_TYPE_EVENT) == MEDIA_FRAME_TYPE_EVENT) {
				if(frame->marker == MPF_MARKER_START_OF_EVENT) {
					apt_log(RECOG_LOG_MARK,APT_PRIO_INFO,"Detected Start of Event " APT_SIDRES_FMT " id:%d",
						MRCP_MESSAGE_SIDRES(recog_channel->recog_request),
						frame->event_frame.event_id);
				}
				else if(frame->marker == MPF_MARKER_END_OF_EVENT) {
					apt_log(RECOG_LOG_MARK,APT_PRIO_INFO,"Detected End of Event " APT_SIDRES_FMT " id:%d duration:%d ts",
						MRCP_MESSAGE_SIDRES(recog_channel->recog_request),
						frame->event_frame.event_id,
						frame->event_frame.duration);
				}
			}
		}

If you receive DTMF digits as rfc2833/rfc4733 input, they arrive as events. The log code above will log DTMF input events even when conf/logger.xml is set as:
<masking>COMPLETE</masking>

Any plugin that follows the example from the demo recognizer will leak rfc2833/rfc4733 DTMF data in the logs even when the system is configured for user input to be masked.

As an example, I entered the DTMF digit sequence 87654321. I have masking set to complete. My unimrcp log shows:

2020-10-22 13:47:28:651839 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:8
2020-10-22 13:47:29:451433 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:7
2020-10-22 13:47:30:251464 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:6
2020-10-22 13:47:30:881451 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:5
2020-10-22 13:47:31:831466 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:4
2020-10-22 13:47:32:721474 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:3
2020-10-22 13:47:33:541462 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:2
2020-10-22 13:47:34:351517 10508 [INFO]   Detected Start of Event <cc1b347f96dcc84d@speechrecog> id:1

Sensitive data is supposed to be masked using apt_log_data_mask() first and only then passed to the logger.

Yes. My suggestion is that before writing to the log, frame->event_frame.event_id must be masked. The demo code in demo_recog_engine.c does not show this correctly.

Edited-

Sorry, I had added to this a description of a similar problem. This was debug logging code put in our own source, not from yours. mpf_dtmf_detector_get_frame() does not log the event_id.

Agree that the detected digits must be masked, if masking is enabled. Not sure, though, whether using apt_log_data_mask() would make sense for logging a single digit. Instead, apt_log_masking_get() can be used to check whether masking is enabled and replace the digit with a '*' sign.

Or simply do not log those lines if any masking is enabled.

    if(recog_channel->recog_request && APT_LOG_MASKING_NONE == apt_log_masking_get()) {
      if((frame->type & MEDIA_FRAME_TYPE_EVENT) == MEDIA_FRAME_TYPE_EVENT) {
        if(frame->marker == MPF_MARKER_START_OF_EVENT) {
          apt_log(RECOG_LOG_MARK,APT_PRIO_INFO,"Detected Start of Event " APT_SIDRES_FMT " id:%d",
            MRCP_MESSAGE_SIDRES(recog_channel->recog_request),
            frame->event_frame.event_id);
        }
        else if(frame->marker == MPF_MARKER_END_OF_EVENT) {
          apt_log(RECOG_LOG_MARK,APT_PRIO_INFO,"Detected End of Event " APT_SIDRES_FMT " id:%d duration:%d ts",
            MRCP_MESSAGE_SIDRES(recog_channel->recog_request),
            frame->event_frame.event_id,
            frame->event_frame.duration);
        }
      }
    }

We started our custom plug-in by copying demo_recog_engine.c. We never questioned that logging code and never bothered to remove it. Of course, we are now doing a careful security and PII data review and I was surprised to stumble on my DTMF keys showing up in the logs.