Category Archives: VoIP security

Security implications of WebRTC (part 2): End to end encryption. Well, no…

The problem

Some time ago I was watching an episode of the VoIP Users Conference about the Jitsi WebRTC video bridge. The video bridge automagically maximizes the video of the currently active speaker without processing the audio of each participant. Instead it uses an RTP header extension (RFC 6464) to find out the audio level of each participant.

WebRTC media streams are encrypted with SRTP, and the SRTP key exchange is performed end-to-end with DTLS-SRTP. This ensures that any kind of man-in-the-middle attack can be detected and the two WebRTC endpoints can be sure that nobody can spy on their conversation.

This is great! Except for the fact that SRTP only encrypts the payload part of RTP packets. The RTP header (and all RTP header extensions, like RFC 6464 audio levels) are NOT encrypted. Which means that anybody who can see your SRTP packets (e.g. your ISP or “some three letter agency recording the whole internet with a datacenter in Utah”) knows when you are speaking, when you are silent or even if you have muted your microphone. Considering what can be done with traditional telephony metadata alone, this is a bit scary.

Chrome is enabling the “ssrc-audio-level” RTP header extension by default (by inserting “a=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level” into the SDP offer) for every call although Chrome does not use the data from received SRTP packets (__insert_conspiracy_theory_here__).
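To check whether a given offer or answer negotiates the extension, you can look for the extmap line yourself. The following is only a minimal sketch: the function name findAudioLevelExtmapIds is made up for this example, and it assumes the SDP is available as a plain string (for instance the sdp property of a session description).

```javascript
// Return the extmap IDs under which an SDP blob negotiates the
// RFC 6464 audio level extension (empty array if it is absent).
function findAudioLevelExtmapIds(sdp_str) {
  var uri = "urn:ietf:params:rtp-hdrext:ssrc-audio-level";
  var ids = [];
  var lines = sdp_str.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    // General shape: a=extmap:<id>[/<direction>] <uri> [<attributes>]
    var m = /^a=extmap:(\d+)(?:\/\S+)?\s+(\S+)/.exec(lines[i]);
    if (m && m[2] === uri) {
      ids.push(parseInt(m[1], 10));
    }
  }
  return ids;
}
```

An empty result means the extension is not being offered in that description.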

See for yourself

To verify my crazy claims (remember, I am just somebody on the internet!) you just need to use one of the gazillion WebRTC demos (make sure you only share your microphone so it’s easier to find the SRTP audio session) and capture the packets with Wireshark. Then tell Wireshark to decode those UDP packets as RTP and have a look:


The first 2 of the 4 marked bytes (the other 2 are just padding) are 0x10 (upper 4 bits: the extension ID; lower 4 bits: the length of the extension minus 1) and 0xff (the least significant 7 bits are the audio level expressed in -dBov). The audio level in this example is -127 dBov, the audio level of a digitally muted source (I muted my microphone).
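The decoding of those two bytes can be sketched in a few lines. This is an illustration only, not what Wireshark does internally. The function name is made up, and the voice activity flag in the top bit of the data byte comes from RFC 6464 rather than from the capture described here.

```javascript
// Decode one element of an RFC 5285 one-byte header extension that
// carries an RFC 6464 client-to-mixer audio level.
function decodeAudioLevelElement(headerByte, dataByte) {
  return {
    id: headerByte >> 4,                     // upper 4 bits: extension ID
    length: (headerByte & 0x0f) + 1,         // lower 4 bits hold length minus 1
    voiceActivity: (dataByte & 0x80) !== 0,  // top bit: RFC 6464 "V" flag
    levelDbov: -(dataByte & 0x7f)            // lower 7 bits: level in -dBov
  };
}

// The two bytes from the capture described above.
var decoded = decodeAudioLevelElement(0x10, 0xff);
console.log(decoded.id, decoded.length, decoded.levelDbov); // 1 1 -127
```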

How to fix this?

There is an RFC for encrypting RTP header extensions (RFC 6904). Chrome should implement this or at least not enable RFC 6464 by default. I have created an issue for this on the WebRTC Google Code project (issue 3411).

Given the success I had with reporting my STUN gun (issue 2172 still open for over a year..), I am suggesting that WebRTC devs should fix it today with a bit of SDP mangling in JavaScript. Whenever you get an SDP description, feed it through a function to remove the offending RTP header extension:

function processSdp(sdp_str) {
  var sdp = [];
  var lines = sdp_str.split("\r\n");
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].indexOf("urn:ietf:params:rtp-hdrext:ssrc-audio-level") != -1) {
      /* drop it like it's hot */
    } else {
      /* keep the rest */
      sdp.push(lines[i]);
    }
  }
  return sdp.join("\r\n");
}


A wakeup call from the darknet

This morning I woke up at 6 am to the sound of my own music on hold coming through the speaker of my SNOM SIP phone. For some reason it had called my cellular phone, which forwarded the call back to my PSTN number (on which I use the standard Asterisk music on hold). Being still 50% asleep I just hung up and went back to bed.

A few minutes later I heard somebody’s voicemail greeting playing through the speaker of the phone. I got up, hung up and powered up my laptop to take a look at the phone’s logs. While doing so it automagically dialed another number.

Having been introduced to the world of “unwanted automatic call origination”, I was surprised that the phone wasn’t calling premium numbers in the British Indian Ocean Territory. It was calling numbers that were either on my dialed numbers list or on my missed calls list, indicating that the calls were most likely placed through clicking links on the SNOM’s web interface.

But who would be able to access my phone’s web interface? It is on the local network of a wifi AP (not even on the actual local network where everything else is). There was no other machine on the wifi and I have a pretty long WPA2 password. Unlike my neighbours I am also not using WPS. And I had set up a username and password for the web interface.

After my “asleepness level” went down to 25%, I remembered that I had cross-compiled Tor for the phone and had set up a Tor hidden service on the phone (for research purposes). When you access the web interface through the hidden service, the SNOM’s LCS process (the one handling the phone’s GUI and web interface) sees this as a connection coming from localhost and does not require authentication (probably because the SNOM’s XML minibrowser is using this to load XML menus from the phone).

But who would know the .onion address of my phone’s hidden service? I guess nobody does. But I suspect that somebody is scanning the .onion address space and is scraping all the content from each service. By following each and every link on my phone’s web interface the scraper would also trigger the callback links in the missed calls and dialed numbers lists.

So, there must be somebody who is scanning the .onion address space. Unfortunately it’s not possible (at least for me) to find out who this is, as the requests are (of course) coming from the Tor network.

Lesson learned: Do not assume that your Tor hidden service cannot be found.


Security implications of WebRTC (part 1): The STUN gun

WebRTC compatible browsers use STUN / ICE mechanisms to establish peer to peer connections between endpoints (where possible).

To determine if a direct connection is possible, ICE connectivity checks are used. This functionality is encapsulated inside the RTCPeerConnection JavaScript API. A JavaScript application does not require any special permission or consent by the user to use the PeerConnection API (only access to the user’s microphone and camera requires the user’s consent).

Receive-only PeerConnections (e.g. for receiving streaming video) use exactly the same STUN / ICE mechanisms as full-duplex connections.

Section 18.5.2 of RFC 5245 describes a STUN amplification attack which works by inserting ICE candidates with the victim’s IP address into an SDP offer or answer. Back in the days when this RFC was crafted, the attack required the attacker to be in the signaling path to trick ICE agents into flooding the victim with ICE connectivity checks.

The beauty of WebRTC is that there is no defined signaling mechanism, meaning you need to build your own utilizing the PeerConnection JavaScript API. That means now the attacker only needs to be able to inject JavaScript code (e.g. through cross-site scripting (XSS)) to flood arbitrary hosts with ICE connectivity checks. Appendix B.1 of RFC 5245 defines a pacing mechanism limiting the bandwidth used for connectivity checks to the bandwidth that the actual media flow would be using (which can be quite significant for video streams).

Tests with Google Chrome have shown that one PeerConnection usually generates around 20 to 25 kbit/s of ICE connectivity checks (no matter how many ICE candidates are provided). By “massaging” the session description passed to the PeerConnection in the SDP answer (e.g. by replacing the short generated ICE credentials with really really really long credentials) this bandwidth generator can be amplified by a factor of 10. After 10 seconds the ICE connectivity checks are stopped by Chrome (Firefox does this a lot earlier, creating other problems for regular use cases).

Chrome has a limit of 10 active PeerConnections per browser tab. That means by creating 10 concurrent PeerConnections with long ICE credentials every 10 seconds a malicious script can generate a permanent stream of 2.0 to 2.5 Mbit/s of UDP traffic out of thin air, which can be directed at any host (without being bound to any constraints like the usual same origin constraint). The throttling of ICE connectivity checks (in Chrome) is done on each PeerConnection separately, which means that all 10 concurrent connections can direct their traffic to the same IP and port.
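The figure follows directly from the numbers quoted above: per-connection rate, times the amplification factor, times the number of concurrent connections. A small sketch of that arithmetic (the function name is made up for this example):

```javascript
// Back-of-the-envelope estimate using the figures quoted in the text.
function estimateFloodKbit(perConnectionKbit, amplification, connections) {
  return perConnectionKbit * amplification * connections;
}

var low = estimateFloodKbit(20, 10, 10);   // 2000 kbit/s
var high = estimateFloodKbit(25, 10, 10);  // 2500 kbit/s
console.log((low / 1000) + " to " + (high / 1000) + " Mbit/s per browser tab");
```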

Fortunately Chrome does not allow ICE candidates with ports less than 1024, so you cannot point your STUN gun towards DNS servers. But there are still some interesting UDP based services running on higher ports (e.g. SIP, OpenVPN, …) and apart from targeting particular services the accumulated bandwidth of a few thousand clients can cause some problems (“plugin-free denial of service”). Firefox does not have any constraints regarding the port of ICE candidates.



By using even longer ICE credentials the amount of generated DoS bandwidth can be scaled, e.g. an ICE username of 5000 characters and an ICE password of 5000 characters result in around 10 Mbit/s of UDP traffic. Shouldn’t there be a length limit for ICE credentials???
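For what it’s worth, the ICE SDP grammar in RFC 5245 (section 15.4) does bound these attributes, as far as I read it: 4 to 256 characters for ice-ufrag and 22 to 256 for ice-pwd. A receiver could enforce that before handing a remote description to the PeerConnection. The following is a minimal sketch of such a check; the function and constant names are made up, and it only looks at lengths, not at the allowed character set.

```javascript
// Length bounds taken from the ICE SDP grammar in RFC 5245, section 15.4.
var ICE_LIMITS = {
  "ice-ufrag": { min: 4, max: 256 },
  "ice-pwd": { min: 22, max: 256 }
};

// Return a list of human-readable problems; an empty list means every
// ICE credential in this SDP blob is within the bounds above.
function checkIceCredentialLengths(sdp_str) {
  var problems = [];
  var lines = sdp_str.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var m = /^a=(ice-ufrag|ice-pwd):(.*)$/.exec(lines[i]);
    if (!m) continue;
    var limit = ICE_LIMITS[m[1]];
    var len = m[2].length;
    if (len < limit.min || len > limit.max) {
      problems.push(m[1] + " has " + len + " characters (allowed: " +
                    limit.min + " to " + limit.max + ")");
    }
  }
  return problems;
}
```

A signaling layer could refuse any description for which the returned list is not empty.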

Reversing SIP digest authentication in JavaScript (node.js)

SIP digest authentication (RFC 3261) is based on HTTP digest authentication (RFC 2617).
The purpose of digest authentication is to avoid exchanging the password in plaintext.

When a client issues a request that needs to be authenticated (e.g. a SIP REGISTER request) the server will respond with a “401 Unauthorized” response, which includes a “WWW-Authenticate” header, for example:

WWW-Authenticate: Digest algorithm=MD5, realm="asterisk", nonce="40787c47"

The client will go through the digest authentication procedure and resend the request, including an “Authorization” header:

Authorization: Digest username="myusername", realm="asterisk", nonce="40787c47", uri="", response="3bc3cedaa7ee0f0b9bec12c50c8827cb", algorithm=MD5

The client has calculated the “response” like this:

HA1 = MD5("myusername:asterisk:password")

HA2 = MD5("")

response = MD5(HA1+":40787c47:"+HA2);

On the server side exactly the same computation is performed and the result is compared to the “response” transmitted by the client. If both hashes are equal then the request has been authenticated.
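In node.js the same computation can be sketched with the built-in crypto module. This covers only the simple case used in the example above (MD5, no qop). The function names and the URI "sip:example.invalid" are placeholders invented for this sketch, not values from a real exchange.

```javascript
var crypto = require("crypto");

// Hex-encoded MD5 of a string.
function md5(str) {
  return crypto.createHash("md5").update(str).digest("hex");
}

// RFC 2617 digest response without qop.
function digestResponse(username, realm, password, method, uri, nonce) {
  var ha1 = md5(username + ":" + realm + ":" + password);
  var ha2 = md5(method + ":" + uri); // request method and request URI
  return md5(ha1 + ":" + nonce + ":" + ha2);
}

// Placeholder values only; the URI is hypothetical.
console.log(digestResponse("myusername", "asterisk", "password",
                           "REGISTER", "sip:example.invalid", "40787c47"));
```

The server runs the same function with the password it has stored and compares the two hex strings.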

If an attacker can get access to the “Authorization” header of the client request (e.g. by sniffing packets on the client or server LAN) then the password can be recovered by bruteforcing it. Transport layer security (TLS) can protect against packet sniffing. However there are mechanisms for getting access to the “Authorization” header without needing to sniff packets.

Bruteforcing the password is a pretty straightforward process. You generate a list of all possible passwords (depending on which characters you are expecting), run it through the digest authentication procedure and compare the hashes.

The HA2 part of the calculation is constant (it is a hash of the request type and the SIP URI) and can be precomputed. Trying each password involves two MD5 hash operations.
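The loop itself can be sketched as follows. This is not the application mentioned below, just a minimal illustration of the idea under a few assumptions: the alphabet is passed as an array of strings, only candidates of one fixed length are tried, and all names are made up for this example.

```javascript
var crypto = require("crypto");

function md5(str) {
  return crypto.createHash("md5").update(str).digest("hex");
}

// Try every combination of `length` symbols from `alphabet` (an array of
// strings). `ha2` is precomputed once; each candidate costs two MD5 calls.
// Returns the matching password, or null if no candidate matches.
function bruteforce(username, realm, nonce, ha2, target, alphabet, length) {
  var prefix = username + ":" + realm + ":";
  var suffix = ":" + nonce + ":" + ha2;
  var idx = [];
  for (var i = 0; i < length; i++) idx.push(0);
  while (true) {
    var candidate = "";
    for (var j = 0; j < length; j++) candidate += alphabet[idx[j]];
    if (md5(md5(prefix + candidate) + suffix) === target) return candidate;
    // Advance the counter like an odometer; a full wrap means we are done.
    var pos = length - 1;
    while (pos >= 0 && ++idx[pos] === alphabet.length) {
      idx[pos] = 0;
      pos--;
    }
    if (pos < 0) return null;
  }
}
```

Including the empty string as one of the alphabet symbols makes the same loop cover shorter passwords too, at the cost of some duplicate candidates.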

I made another simple node.js application to which you feed the parameters obtained from a sniffed “Authorization” header (username, realm, nonce, SIP URI, response hash) and the length of the password to be bruteforced, e.g.:

node sipdigest-bruteforce.js myuser asterisk 40787c47 ac4d7f3684ec6c176ced252917af449f 5

Depending on the length of the password you can now go and grab a coffee (or go on vacation):

Alphabet: ",a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z"
Password length: 5
Number of permutations: 14348907
Precomputed HA2: 2c207d3d673349fe58bb06d3d5f7ae67
found password: keins
real    0m58.892s

It took less than a minute on my laptop to find my super secret password. Please note that I have replaced all values in this example with dummy values. That means if you try to run the above example you will not find the password.

On my 2 GHz i7 CPU I get around 200k passwords / second (when utilizing just 1 CPU core). You can easily adapt the application to use multiple cores by splitting different regions of permutations over several threads.
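These numbers fit together: the alphabet shown above has 27 symbols (the empty string plus a to z), so 5 positions give 27^5 candidates, and at roughly 200k candidates per second the whole space takes a bit over a minute. A small sketch of that estimate (the function name is made up):

```javascript
// Size of the search space and the worst-case time at a given rate.
function worstCaseSeconds(alphabetSize, length, passwordsPerSecond) {
  var candidates = Math.pow(alphabetSize, length);
  return { candidates: candidates, seconds: candidates / passwordsPerSecond };
}

// 27 symbols, 5 positions, about 200k candidates per second.
var estimate = worstCaseSeconds(27, 5, 200000);
console.log(estimate.candidates);          // 14348907
console.log(Math.round(estimate.seconds)); // 72
```

Each extra position multiplies the time by 27, which is where the vacation comes in.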

The ability to parallelize the computation makes it very suitable for running on a GPU (or several GPUs). Current off-the-shelf GPUs (even ones two or three years old) can achieve crazy performance for MD5 calculation, e.g. an AMD Radeon HD5790 can deliver up to 4 GHash/sec (MD5), which translates to 2 billion passwords per second.

A nice upgrade path exists, as node.js has support for OpenCL (through WebCL).

You can download my node.js application here.

Resetting SNOM HTTP credentials for the lazy

Sometimes somebody locks your SNOM phone down to user mode and also sets credentials for the web interface. You can revoke those changes either by using the HTTP provisioning mechanism or by flashing a recovery image via TFTP.

If you already have a working provisioning mechanism (e.g. by setting the option in your DHCP server) to unlock your phone then it wasn’t a problem in the first place.

Flashing a recovery image via TFTP is no fun either, as you need physical access to the phone (for setting up a static IP address, etc.) and it also deletes all settings (especially those you might be interested in…).

Fortunately SNOM phones have a plug and play mode which can be (ab)used by somebody on the local network. By default, a phone will send a SIP Subscribe message to the multicast IP address after each boot. A provisioning server can then send a SIP Notify message containing an HTTP URL for provisioning.

I made a small node.js application that listens for those SIP Subscribe messages and feeds them an XML configuration that resets the HTTP username and password to “admin” and also switches from user mode to admin mode.

Just start it with “node snomreset.js <yourIpAddress>” and have fun:

SIP listening
Received SUBSCRIBE from MAC 0004134001F4
Sending Ok
Sending NOTIFY with provisioning URL
Resetting HTTP user and password to “admin”, enabling admin mode.
Resetting HTTP user and password to “admin”, enabling admin mode.
Resetting HTTP user and password to “admin”, enabling admin mode.

You can download the snomreset.js application here.