Daily Archives: May 28, 2014

Security implications of WebRTC (part 2): End to end encryption. Well, no…

The problem

Some time ago I was watching an episode of the VoIP Users Conference (http://vuc.me) about the Jitsi WebRTC video bridge. The video bridge automagically maximizes the video of the currently active speaker without processing the audio of each participant. Instead, it uses an RTP header extension (RFC 6464) to find out the audio level of each participant.

WebRTC media streams are encrypted with SRTP, and the SRTP key exchange is performed end-to-end with DTLS-SRTP. This ensures that any kind of man-in-the-middle attack can be detected and the two WebRTC endpoints can be sure that nobody can spy on their conversation.

This is great! Except for the fact that SRTP only encrypts the payload part of RTP packets. The RTP header (and all RTP header extensions, like RFC 6464 audio levels) is NOT encrypted. This means that anybody who can see your SRTP packets (e.g. your ISP or “some three letter agency recording the whole internet with a datacenter in Utah”) knows when you are speaking, when you are silent, or even whether you have muted your microphone. Considering what can be done with traditional telephony metadata alone, this is a bit scary.

Chrome enables the “ssrc-audio-level” RTP header extension by default for every call (by inserting “a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level” into the SDP offer), although Chrome does not use the data from received SRTP packets (__insert_conspiracy_theory_here__).

See for yourself

To verify my crazy claims (remember, I am just somebody on the internet!) you just need to use one of the gazillion WebRTC demos (make sure you only share your microphone so it’s easier to find the SRTP audio session) and capture the packets with Wireshark. Then tell Wireshark to decode those UDP packets as RTP (“Decode As…”) and have a look. No decryption is needed, because the header is sent in the clear:


The first 2 of the 4 marked bytes (the other 2 are just padding) are 0x10 (the upper 4 bits are the extension ID, the lower 4 bits are the length of the extension data minus 1) and 0xff (the least significant 7 bits are the audio level expressed in -dBov). The audio level in this example is -127 dBov, the audio level of a digitally muted source (I muted my microphone).
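To make the bit layout concrete, here is a minimal sketch that decodes those two bytes by hand. The function name decodeAudioLevelElement is made up for this illustration, and it assumes the one-byte header extension format described above. It also reads the top bit of the second byte, which RFC 6464 defines as the voice activity flag.

```javascript
// Sketch only: decode one RTP header extension element (one-byte header form)
// that carries an RFC 6464 audio level. `bytes` holds the element's raw bytes.
function decodeAudioLevelElement(bytes) {
  var header = bytes[0];
  var data = bytes[1];
  return {
    id: header >> 4,                    // upper 4 bits: extension ID
    length: (header & 0x0f) + 1,        // lower 4 bits: data length minus 1
    voiceActivity: (data & 0x80) !== 0, // most significant bit: V flag
    level: data & 0x7f                  // lower 7 bits: level in -dBov (0..127)
  };
}

// The two bytes from the capture discussed above.
var decoded = decodeAudioLevelElement([0x10, 0xff]);
console.log("ID " + decoded.id + ", " + decoded.length + " byte(s), level -" +
            decoded.level + " dBov, V flag " + decoded.voiceActivity);
```

Anyone sitting on the path can do the same arithmetic on every packet, which is the whole point of this post.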

How to fix this?

There is an RFC for encrypting RTP header extensions (RFC 6904). Chrome should implement this or at least not enable RFC 6464 by default. I have created an issue for this on the WebRTC Google Code project (issue 3411).
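RFC 6904 signals an encrypted header extension in SDP by putting the URI “urn:ietf:params:rtp-hdrext:encrypt” in front of the real extension URI on the extmap line. As a rough sketch (the function name listHeaderExtensions and the sample SDP are my own, not taken from any browser), you could check what an offer advertises like this:

```javascript
// Sketch only: list the RTP header extensions offered in an SDP blob and
// mark whether each one is wrapped in the RFC 6904 "encrypt" URI.
var ENCRYPT_URI = "urn:ietf:params:rtp-hdrext:encrypt";

function listHeaderExtensions(sdpStr) {
  var result = [];
  var lines = sdpStr.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].indexOf("a=extmap:") !== 0) continue;
    var parts = lines[i].substring("a=extmap:".length).split(" ");
    var encrypted = parts[1] === ENCRYPT_URI;
    result.push({
      id: parseInt(parts[0], 10),            // also copes with "1/sendonly"
      uri: encrypted ? parts[2] : parts[1],  // the real extension URI
      encrypted: encrypted
    });
  }
  return result;
}

// Hand-written sample, not a real browser offer.
var sample = [
  "m=audio 9 RTP/SAVPF 111",
  "a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level",
  "a=extmap:2 urn:ietf:params:rtp-hdrext:encrypt urn:ietf:params:rtp-hdrext:ssrc-audio-level",
  ""
].join("\r\n");

console.log(listHeaderExtensions(sample));
```

If every ssrc-audio-level entry comes back with encrypted set to false, the audio level goes over the wire in the clear.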

Given the success I had with reporting my STUN gun (issue 2172, still open after more than a year...), I suggest that WebRTC devs fix it today with a bit of SDP mangling in JavaScript. Whenever you get an SDP description, feed it through a function to remove the offending RTP header extension:

function processSdp(sdp_str) {
  var sdp = [];
  var lines = sdp_str.split("\r\n");
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].indexOf("urn:ietf:params:rtp-hdrext:ssrc-audio-level") != -1) {
      /* drop it like it's hot */
    } else {
      /* keep the rest */
      sdp.push(lines[i]);
    }
  }
  return sdp.join("\r\n");
}
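How you wire this into your app depends on your signalling code, so take the following as a sketch under assumptions rather than a recipe. The helper mangleDescription and the stand-in filter dropAudioLevel are invented for this example; in a real page you would pass processSdp instead, and `pc` would be your RTCPeerConnection.

```javascript
// Sketch only: return a plain {type, sdp} object whose SDP text has been
// passed through `filter` (for example the processSdp function above).
function mangleDescription(desc, filter) {
  return { type: desc.type, sdp: filter(desc.sdp) };
}

// Stand-in filter so this sketch runs on its own outside a browser.
function dropAudioLevel(sdpStr) {
  return sdpStr.split("\r\n").filter(function (line) {
    return line.indexOf("urn:ietf:params:rtp-hdrext:ssrc-audio-level") === -1;
  }).join("\r\n");
}

// Hand-written, heavily shortened offer for demonstration purposes.
var offer = {
  type: "offer",
  sdp: "v=0\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=mid:audio\r\n"
};
var cleaned = mangleDescription(offer, dropAudioLevel);
console.log(JSON.stringify(cleaned.sdp));

// In a browser (not run here), assuming `pc` is an RTCPeerConnection:
// pc.createOffer().then(function (offer) {
//   return pc.setLocalDescription(mangleDescription(offer, processSdp));
// });
```

Do the same for descriptions you receive from the other side before handing them to setRemoteDescription, so the extension is not negotiated in either direction.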