Category Archives: BRIstuff (* 1.4)

Things I have learned from being hit by a SIP bruteforce attack…

The attack:

This week it happened to me, too. Somebody (with an IP address from Belize) managed to bruteforce the password of a SIP account on one of our internal Asterisk servers. I noticed it purely by accident. I normally never look at the CLI of that machine (except when things fail, which is pretty rare). Fortunately, luck is with the stupid, so I noticed it just two hours after the first calls were made.

Calls were made to a single number in the British Indian Ocean Territory. The same number seems to be used by a few malicious Android applications, too. I have learned that the username “droid” and password “android” are not secure enough for a SIP account. ;-)

I immediately redirected all calls from that account to one of my SIP phones, which started ringing very soon. Unfortunately there was no one at the other end when I answered the call. The other side (which claimed to be an Asterisk) was only sending silence (in G.711 RTP). So, our BRI lines were not abused for call termination but probably for a premium number scam. I can’t wait to get the invoice next month. :-(

Things I have learned:

  • Do not use a dialplan pattern like “_X.” in your context for external calls. Set up an extension for every country you need to call, e.g. “_0049.”. I had this on all of our ITSP machines, but of course not on our internal box (“The shoemaker got the worst shoes.”).
  • Use good usernames and passwords. Try to avoid numeric usernames.
  • Protect your accounts from being bruteforced by using something like fail2ban.
  • When possible use a SIP domain instead of IP addresses. Make sure the domain cannot be guessed from the IP address!
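
To illustrate the first point, here is a small Python sketch of how a catch-all pattern differs from a per-country pattern. It is a simplified re-implementation of Asterisk's pattern syntax written for this post (only X, Z, N, "." and "!" are handled, no bracket sets), not Asterisk's own matcher, and the dialed numbers are made up:

```python
import re

# Simplified subset of the Asterisk pattern alphabet (assumption: no [..] sets).
CLASSES = {"X": "[0-9]", "Z": "[1-9]", "N": "[2-9]"}

def pattern_to_regex(pattern):
    """Translate a dialplan extension into a compiled regular expression."""
    if not pattern.startswith("_"):
        # Without the leading underscore the extension is a literal string.
        return re.compile(re.escape(pattern) + r"\Z")
    parts = []
    for ch in pattern[1:]:
        if ch in CLASSES:
            parts.append(CLASSES[ch])
        elif ch == ".":
            parts.append(".+")   # one or more of any character
        elif ch == "!":
            parts.append(".*")   # zero or more of any character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z")

def matches(pattern, dialed):
    return pattern_to_regex(pattern).match(dialed) is not None

# Hypothetical numbers: one international premium destination, one German number.
for dialed in ("002461234567", "004930123456"):
    print(dialed, "_X.:", matches("_X.", dialed), "_0049.:", matches("_0049.", dialed))
```

The catch-all pattern lets both numbers through, while the per-country pattern only accepts the destination you actually meant to allow.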

The new setup:

Instead of setting up fail2ban on our Asterisk box, I decided to use kamailio in front of Asterisk. All authentication is done by kamailio and all calls are forwarded to Asterisk (even calls between local subscribers). The “antiflood” feature of kamailio keeps our fellow bruteforcers outside.
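
To make the antiflood idea a bit more concrete, here is a minimal Python sketch of per-source-IP rate limiting with a temporary ban. This is not kamailio's implementation or configuration; the class name FloodGuard, its parameters and the thresholds are made up for illustration:

```python
from collections import defaultdict, deque

class FloodGuard:
    """Hypothetical sliding-window limiter: too many requests -> temporary ban."""

    def __init__(self, max_requests, window_seconds, ban_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.ban = ban_seconds
        self.hits = defaultdict(deque)   # source IP -> timestamps of recent requests
        self.banned_until = {}           # source IP -> time at which the ban ends

    def allow(self, src_ip, now):
        """Return True if a request from src_ip at time `now` should be processed."""
        if self.banned_until.get(src_ip, float("-inf")) > now:
            return False
        recent = self.hits[src_ip]
        recent.append(now)
        # Forget requests that have fallen out of the window.
        while recent and recent[0] <= now - self.window:
            recent.popleft()
        if len(recent) > self.max_requests:
            self.banned_until[src_ip] = now + self.ban
            recent.clear()
            return False
        return True

guard = FloodGuard(max_requests=3, window_seconds=2.0, ban_seconds=300.0)
attacker = "198.51.100.7"   # documentation address, not a real attacker
print([guard.allow(attacker, t) for t in (0.0, 0.1, 0.2, 0.3, 10.0, 301.0)])
```

A bruteforcer needs thousands of attempts, so even generous limits slow the attack down to the point where it is no longer practical.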

The authentication between Asterisk and kamailio is done on a trusted IP basis. There are no SIP accounts on the Asterisk box. A very nice side effect is that you can now register the same SIP account on multiple SIP phones without any effort! Before, I had to have one SIP account for each of my phones.

Kamailio’s multi-domain support is enabled. All automatic aliases have been removed. It is only listening to requests for the domains configured in the database. The SIP domain we use cannot be guessed from the IP address of the box. This feature alone would probably be sufficient to protect the accounts against bruteforcing!

Update:

I have just made some tests with the metasploit SIP options scanner and SIP enumerator. With kamailio’s multidomain support configured to a non-guessable domain it does not even respond to the SIP OPTIONS message from the scanner. That way our fellow bruteforcers don’t even recognize the kamailio server as a SIP server and leave it alone. :-)
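
The effect can be sketched in a few lines of Python. This only illustrates the idea (stay silent unless the request URI names a domain we serve); it is not how kamailio does it internally. The function names and the domain are invented, and the URI parsing is deliberately naive (no IPv6 literals):

```python
def request_domain(request_line):
    """Extract the host part of the request URI from a SIP request line."""
    _method, uri, _version = request_line.split()
    _scheme, rest = uri.split(":", 1)          # drop "sip:" / "sips:"
    hostport = rest.rsplit("@", 1)[-1]         # drop the user part, if any
    hostport = hostport.split(";", 1)[0].split("?", 1)[0]   # drop params/headers
    return hostport.split(":", 1)[0].lower()   # drop the port

def should_answer(request_line, served_domains):
    """Answer only requests addressed to one of our configured domains."""
    return request_domain(request_line) in served_domains

served = {"voip-7f3k.example.net"}   # hypothetical non-guessable domain

# A scanner only knows the IP address, so that is what ends up in the URI.
print(should_answer("OPTIONS sip:100@192.0.2.10 SIP/2.0", served))
print(should_answer("REGISTER sip:voip-7f3k.example.net SIP/2.0", served))
```

A phone that has been configured with the real domain gets an answer, while the scanner's probe addressed to the bare IP is silently dropped.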


Asterisk memtesting reloaded

It was about time to re-test up-to-date Asterisk versions for memory leakage. As these tests take a rather long time, I will update this post when the 1.4 and 1.6 versions have gone through the test.
The test procedure is the same as in my previous test: Asterisk receives incoming SIP calls, sends a Ringing, waits a little and hangs up (no call is ever connected, no RTP is flowing).

Asterisk 1.8.4-rc2

Since my last test, things have improved a lot! Asterisk no longer consumes 200m+ right from the start and it does not leak that much any more. However, it still leaks a bit. After 32M+ calls I stopped the test.

Memory consumption before the test: virt 513m res 16m 5940 S

Memory consumption after the test: virt 712m res 192m 6432 S

Asterisk 1.6.2.18-rc1

After pushing 37M+ calls Asterisk stopped processing calls because it ran out of RTP ports (it was using the default RTP port range from 10000 to 20000). “netstat -l -n -p  | grep asterisk -c” shows that 10009 ports are in use by asterisk. Calls fail with:

[Mar 27 12:42:03] ERROR[25983] rtp.c: No RTP ports remaining. Can’t setup media stream for this call.
[Mar 27 12:42:03] WARNING[25983] chan_sip.c: Unable to create RTP audio session: Address already in use

Memory consumption before the test: virt 476m res 14m 5412 S

Memory consumption after the test: virt 742m res 197m 5868 S

Asterisk 1.4.41-rc1

After pushing 10M+ calls Asterisk is processing calls very slowly. The machine has a high load because it is constantly swapping pages to and from disk. Retransmissions start to fail:

[Mar 30 10:58:50] WARNING[10667]: chan_sip.c:2070 retrans_pkt: Maximum retries exceeded on transmission a3621613-d54e-122e-43b7-001aa0314ced for seqno 10401673 (Critical Response) — See doc/sip-retransmit.txt.

Memory consumption before the test: virt 392m res 12m 4396 S

Memory consumption after the test: virt 1472m res 358m 1324 S

Asterisk 1.2.40

During the test Asterisk was complaining about “avoiding initial deadlocks” and “avoiding deadlocks” a lot, but it did not deadlock or go into the famous 100% cpu loop. It just works. :-)

After processing 65M+ calls, I decided to stop the test as I could not see an increasing memory consumption. The 1.2 branch of Asterisk keeps its place as my favourite branch (when it comes to stability).

Memory consumption before the test: virt 276m res 11m 3464 S

Memory consumption after the test: virt 460m res 79m 3768 S
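
To compare the four runs on one scale, the growth in resident memory can be divided by the number of calls. The short Python sketch below does that with the figures from this post. It assumes that the “m” in the numbers means MiB and it uses the call counts as given; since those are lower bounds (“32M+”), the results are upper bounds. Not all growth is necessarily a leak, so treat this as a rough indicator only:

```python
MIB = 1024 * 1024

# (version, resident MiB before, resident MiB after, calls processed in millions)
RUNS = [
    ("1.8.4-rc2", 16, 192, 32),
    ("1.6.2.18-rc1", 14, 197, 37),
    ("1.4.41-rc1", 12, 358, 10),
    ("1.2.40", 11, 79, 65),
]

def resident_growth_per_call(before_mib, after_mib, million_calls):
    """Average growth of resident memory in bytes per processed call."""
    return (after_mib - before_mib) * MIB / (million_calls * 1_000_000)

for version, before, after, calls in RUNS:
    growth = resident_growth_per_call(before, after, calls)
    print(f"{version:14s} {growth:6.1f} bytes/call")
```

That works out to roughly 6, 5, 36 and 1 bytes per call for 1.8, 1.6, 1.4 and 1.2 respectively.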


Asterisk incontinence, leaking all that memory…

During the last week I spent quite some time testing several Asterisk versions in terms of memory usage. My test scenario involves an Asterisk server which receives SIP calls from a call generator (based on sofia-sip) and runs a very simple dialplan:

exten => _+.,1,Wait(15)
exten => _+.,n,Ringing
exten => _+.,n,Wait(10)
exten => _+.,n,Hangup

The call generator will terminate the call right after receiving the Ringing. It is configured to use up to 1000 SIP channels and has a limit of 100 call attempts per second (caps). With this dialplan, however, only about 66 caps can be reached, presumably because every call occupies a channel for the 15 seconds of the first Wait. All tests were done “signalling only” without any RTP being transmitted.
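
The figure of about 66 caps is consistent with a simple back-of-the-envelope calculation: with a fixed number of channels, the sustainable call rate is the channel count divided by the time each call occupies a channel. The Python sketch below assumes that a call lasts roughly the 15 seconds of the first Wait and ignores signalling delays:

```python
def max_call_rate(channels, seconds_per_call):
    """Sustainable call attempts per second when every call blocks one channel."""
    return channels / seconds_per_call

rate = max_call_rate(channels=1000, seconds_per_call=15)
configured_limit = 100   # caps limit of the call generator

print(f"channel-bound rate: {rate:.1f} caps")
print(f"effective rate:     {min(rate, configured_limit):.1f} caps")
```

So the channel count, not the configured caps limit, is the bottleneck in this setup.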

This test generates about 1 to 2 MBit/s of IP traffic. Here are the results:

Asterisk 1.4.37-rc1:

Asterisk starts dropping calls and fails to respond to numerous INVITEs after processing about 3.5M calls. SIP registrations time out.

Memory consumption before the test:  virt 392m  res 12m 4396 S
Memory consumption after the test: virt 1125m res 320m 1472 S

Asterisk 1.6.2.14-rc1

After processing more than 5M calls Asterisk is still running fine; the memory usage is not increasing beyond virt 1094m res 131m 5412 S. This is a good sign that it is not leaking memory.

Memory consumption before the test: virt 455m  res 13m 4940 S
Memory consumption after the test: virt 637m  res 94m 5412 S

Asterisk 1.8.0

Right after the start it is using 204m of resident memory! That is more than 15x as much as 1.4 (12m) or 1.6 (13m) used!

After around 300k calls Asterisk segfaulted and left a 1.2 GB core file, because it could not allocate memory:

#2  0x0000000000523131 in __ast_str_helper (buf=0x7faca3d93948, max_len=8192, append=<value optimized out>, fmt=0x552938 "Memory Allocation Failure in function %s at line %d of %s\n", ap=0x7faca3d93900) at strings.c:72

Memory consumption before the test: virt 717m res 204m 5700 S
Memory consumption after the test: none (killed by signal 11)

Executive summary (memory consumption):

Asterisk 1.4 --> BAD

Asterisk 1.6 --> GOOD

Asterisk 1.8 --> WORST


Some things I noticed while testing BRIstuff for * 1.4.31

During the last week I spent a lot of time debugging Asterisk 1.4.31 (BRIstuffed and vanilla). The locking “model” is really … well … let’s say “interesting”. Different threads lock mutexes in different order at quite a few places, which would normally result in insta-deadlocks. This is where the deadlock avoidance (lock.h) kicks in.

When * fails to acquire a lock (with “ast_mutex_trylock”) while it is holding another lock, it will unlock the held lock, sleep for 1 microsecond and re-lock the lock it held before. This loops until it finally manages to acquire the lock it was waiting for:

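/* lock2 is already held by this thread; lock1 is the lock it still needs */
while (ast_mutex_trylock(&lock1)) {
    /* lock1 is busy: give up lock2 briefly so its other user can make progress */
    ast_mutex_unlock(&lock2);
    usleep(1);
    ast_mutex_lock(&lock2);
}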

Oh, yes, sometimes it NEVER acquires the lock it is waiting for! Which turns this “avoided deadlock” into a 100% cpu hog! And it makes it so much harder to actually find a race condition in the code…

If you are wondering why your D channels sometimes go down for no reason (but might recover after some time) or why your Asterisk process is eating all your CPU cycles although it is only pushing very few calls, you might have hit an “avoided deadlock”. You will most likely experience degrading voice quality in this case, too.

If you happen to run Asterisk with realtime scheduling priority then your userspace will most likely be gone! You can still ping your machine but cannot log in, either remotely or locally. No, your kernel did not crash, and neither your RAM nor your shiny quadBRI card is dodgy. ;-)

I managed to “fix” some places in chan_dahdi which used the wrong locking order, but there are still a few remaining places which will need much testing after the locking order has been resolved, for example:

If a DAHDI channel receives events from the ISDN, it will have the pri->lock acquired and then will try to acquire the pvt->lock and channel->lock. On the other hand, if the Asterisk core calls a function (ast_answer, ast_hangup…) on a DAHDI channel it will call it with the channel->lock locked and will try to lock the pvt->lock and then the pri->lock.
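
Conflicts like this one can be spotted mechanically once the locking order of each code path is written down. The following Python sketch is only an illustration of that idea; the path names and the helper function are made up, and the two lock sequences are the ones described above:

```python
from itertools import combinations

def conflicting_pairs(paths):
    """Return the pairs of locks that two code paths acquire in opposite order."""
    seen = set()        # (first, second) orderings observed so far
    conflicts = set()
    for order in paths.values():
        # combinations() keeps the input order, so (a, b) means "a before b".
        for a, b in combinations(order, 2):
            seen.add((a, b))
            if (b, a) in seen:
                conflicts.add(frozenset((a, b)))
    return conflicts

paths = {
    "ISDN event":         ["pri->lock", "pvt->lock", "channel->lock"],
    "core call (hangup)": ["channel->lock", "pvt->lock", "pri->lock"],
}

for pair in sorted(sorted(p) for p in conflicting_pairs(paths)):
    print("opposite order:", " / ".join(pair))
```

Here every pair of locks is taken in opposite order by the two paths, which is why resolving it means picking one global order and changing one of the paths completely.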

When both things happen at the same time we would have a deadlock (without the “deadlock avoidance”). With the “deadlock avoidance” there is a chance to create an infinite loop. If the Asterisk system is not pushing many calls then the probability of such a loop increases significantly because there will not be many other threads “disturbing” the steady timing of the two threads (they both always sleep for 1 microsecond and most likely will always be scheduled in the same order!). On a loaded Asterisk system the probability of such an event is much lower.

It might be a good idea to add a little randomness to the sleeping interval. For development and testing I will remove the “while-try-lock” loops and use the deadlocking regular ast_mutex_lock function instead.
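
Why randomness should help can be shown with a toy model in Python. This is a strongly simplified simulation and not Asterisk code: time advances in ticks, a trylock takes exactly one tick, both threads start in step, and a thread's own lock is free while it sleeps. The function name and the tick model are assumptions made for this example:

```python
import random

def ticks_until_progress(sleep_a, sleep_b, max_ticks=10_000):
    """Return the tick at which one thread gets both locks, or None if none does.

    sleep_a and sleep_b are callables returning a sleep length in ticks (>= 1).
    """
    sleeper = {"A": sleep_a, "B": sleep_b}
    next_try = {"A": 0, "B": 0}   # tick at which each thread calls trylock next
    for tick in range(max_ticks):
        trying = [name for name in ("A", "B") if next_try[name] == tick]
        if len(trying) == 1:
            # The other thread is asleep and has released its lock: success.
            return tick
        if len(trying) == 2:
            # Both hold their own lock right now, so both trylocks fail and
            # both threads release their lock and go to sleep.
            for name in trying:
                next_try[name] = tick + 1 + sleeper[name]()
    return None

fixed = lambda: 1                      # like usleep(1) in both threads
rng = random.Random(42)
jitter = lambda: rng.randint(1, 3)     # randomised sleep interval

print("fixed sleep: ", ticks_until_progress(fixed, fixed))
print("random sleep:", ticks_until_progress(jitter, jitter))
```

With identical sleep intervals the two threads stay in step and never make progress within the simulated time, while the randomised interval lets one of them slip through as soon as the two draws differ.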