Linux 2.4.27 SECURITY BUG - TCP Local and REMOTE(verified) Denial of Service Attack
Wolfpaw - Dale Corse
2004-09-12 15:45:38 UTC
Hi Willy,

No problem :) I run the following, against SSH as the target, and I
can also kill it. (using telnet as the other side of the attack)

***@magik:/etc# telnet 0.0.0.0 22
Trying 0.0.0.0...
Connected to 0.0.0.0.
Escape character is '^]'.
Connection closed by foreign host.
***@magik:/etc# telnet 0.0.0.0 23
Trying 0.0.0.0...
Connected to 0.0.0.0.
Escape character is '^]'.

Magik login: Connection closed by foreign host.
***@magik:/etc#

And from a remote host:

***@maximus:/home/admin# telnet XXXXXXXXXXXXXXX 22
Trying XXXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXXXXXX.
Escape character is '^]'.
Connection closed by foreign host.
***@maximus:/home/admin#

And it gets worse now..:

***@avalon:/root# telnet XXXXXXXXXXXXXXX
Trying XXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXX.
Escape character is '^]'.
telnetd: All network ports in use.
Connection closed by foreign host.
***@avalon:/root# telnet XXXXXXXXXXXXXX 22
Trying XXXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXXXXXX.
Escape character is '^]'.
Connection closed by foreign host.
***@avalon:/root#

Well well.. We have ourselves a Remote Denial of Service tool.

Now.. Do you really want me to post the source code for it?

I wouldn't want to upset David again. This has basically
disabled interactive administration on that machine, by taking
out both ssh and telnet at the same time. If you want to demo
it, I can send it to you privately.. I'm a bit apprehensive
about releasing a 'ready-to-rock' remote DoS exploit on a
list though :(

Just think here a moment.. Let's say I modify it a bit more,
and turn it into a DDOS utility, so now you have (pardon my
language) .. An assload of these coming at your server, how
are you going to stop it? Simply - you can't, and your server
will run out of sockets long before all the remote hosts do.

This one in essence takes a nice working tcp connection application,
and removes the close statements, which as you mentioned will cause
the sockets to end up in that state. What I am attempting to demonstrate
here, is the fact it can relatively easily take out any tcp based app,
and simply saying, something should be done about it. What the actual
bug is, I think I know (and have said), but I will leave that determination
to the actual kernel developers.

I was not attempting to say you should break TCP with the timeouts, or
even make them short, I just would tend to think no timeout at ALL is
a bit of a design flaw, because if the other end is no longer there,
the work will never get done, but one of the ends expects the response
indefinitely, which to me looks like an assumption. I think we have all
learned these days that you can't make any more assumptions, people use
those to break things.

I also would like to thank you for engaging in a discussion about the
bug with me, in a polite manner, instead of simply writing me off as
some loud mouth neophyte.

Anyway - from my view, this is a bug in the OS, because it should not
occur; if it does, we need to find a way to ensure it doesn't. I know
a few firewall tricks that might stop it, but I'm not sure. If a regular
user can invoke this kind of response so easily, I would say it's a bad
thing.

Regards,
Dale.
-----Original Message-----
Sent: Sunday, September 12, 2004 8:58 AM
To: Dale
Subject: FW: Linux 2.4.27 SECURITY BUG - TCP Local (probable
Remote) Denial of Service
-----Original Message-----
Sent: Sunday, September 12, 2004 4:36 AM
To: Wolfpaw - Dale Corse
Subject: Re: Linux 2.4.27 SECURITY BUG - TCP Local (probable
Remote) Denial of Service
This is the odd part, try the exploit,
I have nothing to try it right here.
they are detached in the list, but it appears apache isn't aware of
that. If you run the code, and do multiple telnets from another window,
you will see that there are occurrences where a connection can't be
established, and this is where the problem is. I used a stock version
of Mysql 3 (latest stable), stock apache, on an unmodified Linux box
(except it had GrSecurity) and I was able to see a noticeable slowdown
in web transactions with a browser. I was also the only person hitting
the machine.
How can you be sure that your problem is not simply related
to either apache or mysql not freeing the connection fast
enough ? Apache is very limited in terms of simultaneous
connections, and it is trivial for anyone to block an apache
server by establishing as many connections as it can handle,
sending the start of a request and doing nothing more (and it
has a very long default time out BTW). It might be the same
with mysql.
I am not saying you are incorrect, I'm simply clarifying what seems to
be occurring with the issue I found.
Do you happen to know of any solution for sockets stuck in CLOSE_WAIT?
They seem to stick around forever.
Yes, the only solution is to debug the process and make it
sanely close the socket once it does not need it anymore.
Usually, in such circumstances, you'll find that an strace on
the process shows one of two things:
- a select loop with your socket in the list of the active FDs, but
nothing in the process will do anything on this FD and the process
will go back to the select loop => bug in FD handling
- a select loop which does not include the FD while it has
not been released
=> bug in FD releasing code (usually a missing close).
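
The second case (a descriptor that drops out of the select set without
being closed) is the one that leaves a socket sitting in CLOSE_WAIT. As
a minimal sketch, not taken from inetd or mysql, the Python loop below
shows the handling one would expect: when recv() returns no data the
peer has sent its FIN, so the server closes the descriptor right away.
The name serve_until_idle and its parameters are made up for this
illustration.

```python
import select
import socket


def serve_until_idle(listener, timeout=1.0, max_rounds=50):
    """Accept clients and close each one as soon as the peer sends EOF.

    Returns (number of connections closed, number still open).
    Hypothetical helper for illustration only.
    """
    clients = set()
    closed = 0
    for _ in range(max_rounds):
        # Every descriptor we still own stays in the read set until
        # we have closed it ourselves.
        readable, _, _ = select.select([listener, *clients], [], [], timeout)
        if not readable:
            break  # nothing happened for `timeout` seconds
        for sock in readable:
            if sock is listener:
                conn, _addr = listener.accept()
                clients.add(conn)
                continue
            try:
                data = sock.recv(4096)
            except OSError:
                data = b""  # treat a reset like EOF
            if not data:
                # Peer is done: release the FD so the socket can leave
                # CLOSE_WAIT instead of lingering there.
                clients.discard(sock)
                sock.close()
                closed += 1
    return closed, len(clients)
```

Dropping the sock.close() call (or removing the socket from clients
without closing it) reproduces the stuck CLOSE_WAIT entry that netstat
shows on the server side.
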
This bug may be more Mysql than kernel, I don't know - I still would
tend to think these connections should not be clogging up the
application's connection queue, and that CLOSE_WAIT should have a
settable timeout, regardless of what the RFC says about it.
No, CLOSE_WAIT means that the application still has some work
to do. Under no circumstances should the kernel destroy its
ability to work normally!
I did experience more CLOSE_WAIT's stuck at one point with Mysql.. we
had an issue wherein after calling mysql_close with the C API it was
still leaving the sessions established, so I had moved the timeout on
that sql daemon to 20 seconds (it's all fast transactions) .. This
caused a lot of CLOSE_WAIT issues for some reason.
So you've just demonstrated that it's mysql_close which is
the culprit. If it does not really close the connection while
you expected it to, that is the real problem. If you lower the
mysql timeout, mysql will close on its end, but as long as
the code using mysql_close() does not close, of course the
socket will remain in CLOSE_WAIT. And to be clear, even if you
had a short CLOSE_WAIT time-out, it would not help
because you would still be running out of file-descriptors
after a moment.
We then added something that would go through and use 'close' on the
fd of the Mysql connection, after mysql_close was called. This had the
odd effect of the fd being reused by a connection, before it was out of
CLOSE_WAIT and actually closed, so it would close the new connection,
and also the old one :P which led us to this discovery that connect()
appears to reuse FD's before they are actually fully closed.. This is
how it appears anyway. Thus my use of specifically mysql and connect
in the PoC code.
If you manage to write PoC code which involves
neither apache nor mysql, and which still exhibits the
described behaviour, then perhaps kernel developers will
listen a bit more, but at the moment, you only showed us how
you could trigger a DoS by connecting to a buggy application.
Cheers,
Willy
Petri Kaukasoina
2004-09-12 16:47:57 UTC
Post by Wolfpaw - Dale Corse
No problem :) I run the following, against SSH as the target, and I
can also kill it. (using telnet as the other side of the attack)
Trying XXXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXXXXXX.
Escape character is '^]'.
Connection closed by foreign host.
Now.. Do you really want me to post the source code for it?
With default sshd_config you can DOS sshd trivially by opening ten
connections using ten times "telnet XXXXXXXXXXXXXXX 22".
Wolfpaw - Dale Corse
2004-09-12 17:29:55 UTC
Post by Petri Kaukasoina
Post by Wolfpaw - Dale Corse
No problem :) I run the following, against SSH as the target, and I
can also kill it. (using telnet as the other side of the attack)
Trying XXXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXXXXXX.
Escape character is '^]'.
Connection closed by foreign host.
Now.. Do you really want me to post the source code for it?
With default sshd_config you can DOS sshd trivially by
opening ten connections using ten times "telnet XXXXXXXXXXXXXXX 22".
A fair comment :) But look at it this way:

- The TCP RFC was last updated when?
- What is the average time for a tcp packet to fly even across
the world these days? Maybe 300 ms? 1 second? 5?
- It is not a secret that the TCP protocol has flaws, take for
example the RST bug, which required among other things, BGP4
to use MD5 encryption to avoid being potentially attacked.

So this brings me to:

A) Why are the timeouts so long?
B) CLOSE_WAIT having _no timeout at all_ is still using the
assumption the other side is honest, and will actually
reply. This is a very bad assumption.
C) Socket still re-uses an FD before it is actually completely
closed. This is bad, because by calling a second close in
the case of mysql, you can get the connection to go away,
but in that case, it closes whatever else is on that FD
too. (A more likely analysis is that it closes the current
connection, and then cleans the CLOSE_WAIT on that FD out
of the other pool)
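
On point C, one thing that can be checked from user space: a
descriptor number is only an index in the process's FD table, and POSIX
hands out the lowest free number on the next socket()/open(). So the
number becomes available the moment close() returns, whatever state the
kernel still tracks for the old TCP connection. A minimal sketch in
Python, assuming a single-threaded process where nothing else opens or
closes descriptors in between (the function name is made up for this
note):

```python
import socket


def fd_number_is_reused():
    """Return True if a new socket gets the FD number just closed.

    Illustrative only: assumes no other thread touches the FD table.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    old_fd = first.fileno()
    first.close()  # the one legitimate close; old_fd is now free

    # The kernel picks the lowest free descriptor number, which is the
    # one we just released.
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    new_fd = second.fileno()

    # A stray second close on old_fd at this point would hit `second`,
    # an unrelated connection, which matches the symptom described.
    second.close()
    return old_fd == new_fd
```

This suggests the "closes whatever else is on that FD" effect comes
from closing the same number twice in the application, not from
connect() handing out a half-closed socket.
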

All I am trying to point out is that the Internet in general, and
the Open Source movement, have survived and evolved because of innovation,
and the ability to meet upcoming threats quickly. TCP has some issues
which are blindingly obvious, and they are issues that, in my view (flawed
as it may be) can be at least somewhat minimized by a few simple changes.

I realize daemons have connection queues, and timeouts for a reason, but
really, if a daemon wishes to close a connection, for whatever reason,
sending something to the other side is required, but I can't see why having
the other side send something back is part of the protocol. This could be
implemented with KEEPALIVE much easier, and would avoid the flaws.. No reply
from the host in say 10 seconds, then drop the connection. You could still
clog queues, to which I would say the application needs to cope with one
client filling the whole queue as best it can, and it wouldn't stop a DDOS
(not much does), but it might help some at least.
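
For reference, keepalive is something an application opts into per
socket. The sketch below is an assumption-laden illustration in Python:
the helper name and the 10/5/3 values are invented here, and the
TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are Linux-specific,
so they are only set when the platform exposes them. Note that
keepalive only detects a dead peer on an idle connection; the
descriptor is still held until the application itself calls close().

```python
import socket


def enable_keepalive(sock, idle=10, interval=5, count=3):
    """Turn on TCP keepalive probes for one socket.

    idle:     seconds of silence before the first probe (Linux only)
    interval: seconds between probes (Linux only)
    count:    unanswered probes before the connection is dropped
    Hypothetical helper; the default values are illustrative.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket tuning is not portable, so guard each option.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    # Read the flag back so the caller can confirm it took effect.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```
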

Anyway.. That's my 2 cents. I will continue my conversation with Peter from
mysql in regards to mysql_close, which was really, the entire point. It is
sad however to see a maintainer come across in the manner which David has
during the course of this discussion. It doesn't bode well for the future
of open source to tell someone off, who likely has a valid point.. whether
or not it is a repairable fault.

Regards,
Dale.
--------------------------------
Dale Corse
System Administrator
Wolfpaw Services Inc.
http://www.wolfpaw.net
(780) 474-4095
Alan Cox
2004-09-12 17:04:53 UTC
Post by Wolfpaw - Dale Corse
- The TCP RFC was last updated when?
About 2 months ago. RFC 793 itself isn't updated; instead new ones are added
for the additional features/discoveries.
Post by Wolfpaw - Dale Corse
- What is the average time for a tcp packet to fly even across
the world these days? Maybe 300 ms? 1 second? 5?
- It is not a secret that the TCP protocol has flaws, take for
example the RST bug, which required among other things, BGP4
to use MD5 encryption to avoid being potentially attacked.
This is not a TCP flaw, it's a combination of poor design by certain
vendors, poor BGP implementation and a lack of understanding of what TCP
does and does not do. See IPSec. TCP gets stuff from A to B in order and
knowing to a reasonable degree what arrived. TCP does not provide a
security service.

(The core of this problem arises because certain people treat TCP
connection down on the peering session as link down)
Post by Wolfpaw - Dale Corse
A) Why are the timeouts so long?
So you don't get random corruption
Post by Wolfpaw - Dale Corse
C) Socket still re-uses an FD before it is actually completely
Pardon ?
Post by Wolfpaw - Dale Corse
sending something to the other side is required, but I can't see why having
the other side send something back is part of the protocol. This could be
Because packet sizes are finite and not doing so requires an infinite
sequence space and thus infinite packet sizes. Reread the TCP
specifications more carefully, also look at RFC1337 which discusses some
of the real world cases of getting this wrong.
Toon van der Pas
2004-09-12 19:23:31 UTC
Post by Alan Cox
This is not a TCP flaw, it's a combination of poor design by certain
vendors, poor BGP implementation and a lack of understanding of what TCP
does and does not do. See IPSec. TCP gets stuff from A to B in order and
knowing to a reasonable degree what arrived. TCP does not provide a
security service.
(The core of this problem arises because certain people treat TCP
connection down on the peering session as link down)
Alan, could you please elaborate on this last statement?
I don't understand what you mean, and am very interested.

Thanks,
Toon.
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
Paul Jakma
2004-09-13 03:18:21 UTC
Post by Toon van der Pas
Post by Alan Cox
knowing to a reasonable degree what arrived. TCP does not provide a
security service.
(The core of this problem arises because certain people treat TCP
connection down on the peering session as link down)
Alan, could you please elaborate on this last statement?
I don't understand what you mean, and am very interested.
I think he means that BGP treating TCP connections as if they could
reliably and securely indicate link/path status (i.e. connection
reset/timeout == link down) was, in retrospect, a very dumb
idea on the part of BGP.
Post by Toon van der Pas
Thanks,
Toon.
regards,
--
Paul Jakma ***@clubi.ie ***@jakma.org Key ID: 64A2FF6A
Fortune:
This restaurant was advertising breakfast any time. So I ordered
french toast in the renaissance.
- Steven Wright, comedian
Paul Jakma
2004-09-13 03:30:36 UTC
Post by Paul Jakma
I think he means that BGP treating TCP connections as if they could
reliably and securely indicate link/path status (i.e. connection
reset/timeout == link down) was, in retrospect, a very dumb
idea on the part of BGP.
More specifically, BGP should have treated TCP resets as a transient
error, to be expected (indeed, they /can't/ be a sign that a link is
down - if you can receive a RST the link or path is patently quite
ok). The BGP state machine should instead, in normal operation, have
only treated Hold time expired as the definitive sign of "peer is
down" and allowed reconnects.
Post by Paul Jakma
regards,
regards,
--
Paul Jakma ***@clubi.ie ***@jakma.org Key ID: 64A2FF6A
Fortune:
You will attract cultured and artistic people to your home.
Willy Tarreau
2004-09-13 04:18:47 UTC
Post by Paul Jakma
More specifically, BGP should have treated TCP resets as a transient
error, to be expected (indeed, they /can't/ be a sign that a link is
down - if you can receive a RST the link or path is patently quite
ok).
The application level does not always distinguish between a TCP RST and
an error generated by the local system because of a "network unreachable"
due to a link down and a lost route.
Post by Paul Jakma
The BGP state machine should instead, in normal operation, have
only treated Hold time expired as the definitive sign of "peer is
down" and allowed reconnects.
It should not necessarily wait for the time-out, but at least wait for
a few reconnect errors.

Regards,
willy
Paul Jakma
2004-09-13 04:25:00 UTC
Post by Willy Tarreau
It should not necessarily wait for the time-out, but at least wait for
a few reconnect errors.
No, it should wait for the timeout. (how many reconnects? maybe use a
time for that? well, you already have one, so use that. if you want
to timeout quicker, lower it.)

Alas though, it wouldn't be BGP.
Post by Willy Tarreau
Regards,
willy
regards,
--
Paul Jakma ***@clubi.ie ***@jakma.org Key ID: 64A2FF6A
Fortune:
You will be called upon to help a friend in trouble.
Tonnerre
2004-09-13 19:07:41 UTC
Salut,
Post by Willy Tarreau
Post by Paul Jakma
The BGP state machine should instead, in normal operation, have
only treated Hold time expired as the definitive sign of "peer is
down" and allowed reconnects.
It should not necessarily wait for the time-out, but at least wait for
a few reconnect errors.
Problem there: you can fake connection errors almost as easily as
sending an RST packet, so the DoS might reappear, might it not?

Tonnerre
Willy Tarreau
2004-09-13 19:18:10 UTC
Post by Tonnerre
Salut,
Post by Willy Tarreau
Post by Paul Jakma
The BGP state machine should instead, in normal operation, have
only treated Hold time expired as the definitive sign of "peer is
down" and allowed reconnects.
It should not necessarily wait for the time-out, but at least wait for
a few reconnect errors.
Problem there: you can fake connection errors almost as easily as
sending an RST packet, so the DoS might reappear, might it not?
No, as long as you don't keep the routes from the old session until the
new one establishes and fills up (or you reach the timeout). And when I
spoke about "connection errors", I really spoke about connection
establishment. I bet you'll have more difficulties trying to send the
right RST just after a SYN (or an ICMP unreachable with the right payload)
than sending them once the session is already established. It does make
a big difference.

Willy
Paul Jakma
2004-09-13 19:25:35 UTC
Problem there: you can fake connection errors almost as easily as
sending an RST packet, so the DoS might reappear, might it not?
Sure, but TCP just isn't going to solve this for you.
Tonnerre
regards,
--
Paul Jakma ***@clubi.ie ***@jakma.org Key ID: 64A2FF6A
Fortune:
All progress is based upon a universal innate desire of every organism
to live beyond its income.
-- Samuel Butler, "Notebooks"
Ville Hallivuori
2004-09-13 20:11:13 UTC
Post by Paul Jakma
More specifically, BGP should have treated TCP resets as a transient
error, to be expected (indeed, they /can't/ be a sign that a link is
Actually you can treat TCP session failure as a transient error. Just
use BGP graceful restart (which basically allows re-opening the TCP
connection without losing routing tables).

http://www.ietf.org/internet-drafts/draft-ietf-idr-restart-10.txt
--
[Ville Hallivuori][***@iki.fi][http://www.iki.fi/vph/]
[ID 8E1AD461][FP16=C9 50 E2 DF 48 F6 33 62 5D 87 47 9D 3F 2B 07 5D]
[ID 58543419][FP20=8731 941D 15AB D4A0 88A0 FC8F B55C F4C4 5854 3419]
[ID 8061C24E][FP20=C722 12DA 841E D811 DBFE 2FB3 174C E291 8061 C24E]
Paul Jakma
2004-09-14 14:55:06 UTC
Post by Ville Hallivuori
Actually you can treat TCP session failure as a transient error. Just
use BGP graceful restart (which basically allows re-opening the TCP
connection without losing routing tables).
http://www.ietf.org/internet-drafts/draft-ietf-idr-restart-10.txt
Hmm, yes, I hadn't thought of the attack-mitigating aspects of
graceful restart. Though, without other measures, the session is
still open to abuse (send RST every second).

regards,
--
Paul Jakma ***@clubi.ie ***@jakma.org Key ID: 64A2FF6A
Fortune:
Wit, n.:
The salt with which the American Humorist spoils his cookery
... by leaving it out.
-- Ambrose Bierce, "The Devil's Dictionary"
Alan Cox
2004-09-14 15:10:36 UTC
Post by Paul Jakma
Hmm, yes, I hadn't thought of the attack-mitigating aspects of
graceful restart. Though, without other measures, the session is
still open to abuse (send RST every second).
It's more than that given port randomization, quite a lot more. Of course
it's much easier to just send "must fragment, size 68" icmp replies and
guess them that way. This is spectacularly more effective and various
vendors' highly invalid rst acking crap won't save you.
Paul Jakma
2004-09-14 16:26:24 UTC
Post by Alan Cox
guess them that way. This is spectacularly more effective and
various vendors' highly invalid rst acking crap won't save you.
Ah, well, I don't care about various vendors. I only care about Linux,
BSD and SunOS kernel behaviour ;)

That said, TCP-MD5 signature renders this mostly moot, and deployment
of TCP-MD5 has increased a lot since the last round of "BGP TCP is
insecure!" non-issues came up. Many IXes and peers now require
TCP-MD5.

The rights and wrongs of TCP-MD5 notwithstanding, it'd be nice if
Linux could support this. Anyone running BGP on Linux at the moment must
patch their kernel - or else just switch to Free/Open BSD.

regards,
--
Paul Jakma ***@clubi.ie ***@jakma.org Key ID: 64A2FF6A
Fortune:
It looks like it's up to me to save our skins. Get into that garbage chute,
flyboy!
-- Princess Leia Organa
Paul Jakma
2004-09-14 17:17:53 UTC
TCP-MD5 has no effect on ICMP based attacks.
Hmm, good point. Which attacks, and what could be done about them?
(other than IPsec protect all traffic between peers).

regards,
--
Paul Jakma ***@clubi.ie ***@jakma.org Key ID: 64A2FF6A
Fortune:
"You can't get very far in this world without your dossier being there first."
-- Arthur Miller
Florian Weimer
2004-09-20 22:02:46 UTC
Post by Paul Jakma
TCP-MD5 has no effect on ICMP based attacks.
Hmm, good point. Which attacks, and what could be done about them?
(other than IPsec protect all traffic between peers).
You just filter ICMP packets, in the way RST packets are already
filtered (i.e. rate limit).

The only TCP desynchronization attack that has a chance of working in
practice is the SYN-based one. The rate limit for RST processing on
Cisco routers is far too low.

(Mixed Cisco/Quagga environments are a different matter, but rather
unusual and easily DoSed anyway, most of the time.)
Herbert Xu
2004-09-21 02:14:48 UTC
Post by Florian Weimer
Post by Paul Jakma
TCP-MD5 has no effect on ICMP based attacks.
Hmm, good point. Which attacks, and what could be done about them?
(other than IPsec protect all traffic between peers).
You just filter ICMP packets, in the way RST packets are already
filtered (i.e. rate limit).
Rate-limiting has no effect on ICMP attacks unless your limit is such
that you're effectively dropping them all. But then you get PMTU
problems...
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <***@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Florian Weimer
2004-09-21 18:32:12 UTC
Post by Herbert Xu
Post by Florian Weimer
Post by Paul Jakma
TCP-MD5 has no effect on ICMP based attacks.
Hmm, good point. Which attacks, and what could be done about them?
(other than IPsec protect all traffic between peers).
You just filter ICMP packets, in the way RST packets are already
filtered (i.e. rate limit).
Rate-limiting has no effect on ICMP attacks unless your limit is such
that you're effectively dropping them all.
Yes, that's the idea. Keep in mind that all this is about traffic
destined to a router interface address, not about forwarded traffic.
Post by Herbert Xu
But then you get PMTU problems...
PMTU discovery is not an issue because it's turned off anyway, at
least by default.
Florian Weimer
2004-09-21 20:04:41 UTC
On Tue, 21 Sep 2004 20:32:12 +0200
Post by Florian Weimer
Post by Herbert Xu
But then you get PMTU problems...
PMTU discovery is not an issue because it's turned off anyway, at
least by default.
It's on by default, for both TCP and UDP in the kernel,
and has been so for a long time.
Linux is not the reference TCP/IP stack for routers. 8-)
Why would it be off by default?
Probably because PMTUD is just a DRAFT STANDARD, and these router
folks are usually extremely conservative. Switching the default is
dangerous because it's likely to break existing setups, as Herbert
noted.
If you disable PMTU discovery, say goodbye to TCP performance.
Indeed. On those platforms, the CPU impact is also significant, and
the overall increase in BGP convergence time is measurable.
David S. Miller
2004-09-21 20:25:16 UTC
On Tue, 21 Sep 2004 22:04:41 +0200
Post by Florian Weimer
Why would it be off by default?
Probably because PMTUD is just a DRAFT STANDARD,
RFC1191 doesn't look like a draft to me.
Florian Weimer
2004-09-21 20:51:40 UTC
Post by David S. Miller
On Tue, 21 Sep 2004 22:04:41 +0200
Post by Florian Weimer
Why would it be off by default?
Probably because PMTUD is just a DRAFT STANDARD,
RFC1191 doesn't look like a draft to me.
It's not a draft document, but it's still a DRAFT STANDARD in the IETF
standards track, see RFC 3700. (I wasn't shouting, I was using IETF
keywords. 8-)

David S. Miller
2004-09-21 19:56:45 UTC
On Tue, 21 Sep 2004 20:32:12 +0200
Post by Florian Weimer
Post by Herbert Xu
But then you get PMTU problems...
PMTU discovery is not an issue because it's turned off anyway, at
least by default.
It's on by default, for both TCP and UDP in the kernel,
and has been so for a long time.

Why would it be off by default?

If you disable PMTU discovery, say goodbye to TCP
performance.
Alan Cox
2004-09-14 16:09:35 UTC
Post by Paul Jakma
That said, TCP-MD5 signature renders this mostly moot, and deployment
of TCP-MD5 has increased a lot since the last round of "BGP TCP is
insecure!" non-issues came up. Many IXes and peers now require
TCP-MD5.
TCP-MD5 has no effect on ICMP based attacks.
Willy Tarreau
2004-09-14 19:41:41 UTC
Hi Alan,
Post by Paul Jakma
Hmm, yes, I hadn't thought of the attack-mitigating aspects of
graceful restart. Though, without other measures, the session is
still open to abuse (send RST every second).
Of course it's much easier to just send "must fragment, size 68" icmp
replies and guess them that way. This is spectacularly more effective
and various vendors' highly invalid rst acking crap won't save you.
Just wondering, I have not checked. Isn't the "must fragment" message
supposed to embed part of the packet it couldn't send in return ? If
this is the case (and if the victim processes it correctly), it would
need to guess a recent valid content. If it's not the case, I suspect
it would simply update the path mtu in the route cache, thus giving
spectacular effects :-)

Cheers,
Willy
Alan Cox
2004-09-14 18:56:41 UTC
Post by Willy Tarreau
Just wondering, I have not checked. Isn't the "must fragment" message
supposed to embed part of the packet it couldn't send in return ?
You need to guess no more than for an RST attack, and furthermore in
some cases (buggy stacks) IPsec doesn't save you because the error is
from an untrusted midpoint. The proper response to such messages is to
turn off DF usage, but not all stacks get it right.
Florian Weimer
2004-09-20 22:03:18 UTC
Post by Alan Cox
Post by Paul Jakma
Hmm, yes, I hadn't thought of the attack-mitigating aspects of
graceful restart. Though, without other measures, the session is
still open to abuse (send RST every second).
It's more than that given port randomization, quite a lot more. Of course
it's much easier to just send "must fragment, size 68" icmp replies and
guess them that way.
Is this attack documented anywhere?
Alan Cox
2004-09-20 23:12:55 UTC
Post by Alan Cox
It's more than that given port randomization, quite a lot more. Of course
it's much easier to just send "must fragment, size 68" icmp replies and
guess them that way.
Is this attack documented anywhere?
Bugtraq years ago, and also in the discussions of the IPsec protocol
design flaws when it was being specified.
Willy Tarreau
2004-09-12 17:59:46 UTC
Hi Dale,

I've tried your code right here.
The "attacker" was 10.0.3.1, and the victim 10.0.3.2.

I could successfully generate 1 CLOSE_WAIT on the victim with your program.
It was on port 23 and attached to inetd as fd #3. So I killed inetd, the
connection was then freed, and restarted it.

I changed the code slightly to be able to pass IP/ports as arguments.
On the victim, I straced inetd (pid 1013), and captured all TCP traffic
on port 23.

attacker> ./tcpnclose2 10.0.3.2 22 10.0.3.2 23

I stopped it when it was shouting at me :
socket failed.Connecting to 10.0.3.2:22 (FD: -1)... FAILED: UNKNOWN ERROR.
socket failed.Connecting to 10.0.3.2:23 (FD: -1)... FAILED: UNKNOWN ERROR.

Then, on the victim :

victim> sudo netstat -atnp|grep -v LISTEN
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 1 0 10.0.3.2:23 10.0.3.1:34058 CLOSE_WAIT 1013/inetd

victim> tcpdump -Svnr capture-victim.cap tcp port 34058
reading from file capture-victim.cap, link-type EN10MB (Ethernet)
19:05:10.360728 IP (tos 0x0, ttl 64, id 8168, offset 0, flags [DF], length: 48) 10.0.3.1.34058 > 10.0.3.2.23: S [tcp sum ok] 2882867180:2882867180(0) win 15920 <mss 7960,nop,nop,sackOK>
19:05:10.360764 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], length: 48) 10.0.3.2.23 > 10.0.3.1.34058: S [tcp sum ok] 2614211278:2614211278(0) ack 2882867181 win 5840 <mss 1460,nop,nop,sackOK>
19:05:10.360863 IP (tos 0x0, ttl 64, id 8169, offset 0, flags [DF], length: 40) 10.0.3.1.34058 > 10.0.3.2.23: . [tcp sum ok] ack 2614211279 win 15920
19:06:17.668670 IP (tos 0x0, ttl 64, id 8170, offset 0, flags [DF], length: 40) 10.0.3.1.34058 > 10.0.3.2.23: F [tcp sum ok] 2882867181:2882867181(0) ack 2614211279 win 15920
19:06:17.671102 IP (tos 0x0, ttl 64, id 11127, offset 0, flags [DF], length: 40) 10.0.3.2.23 > 10.0.3.1.34058: . [tcp sum ok] ack 2882867182 win 5840

==> We see that the victim (10.0.3.2) did not send the FIN in return.

Now let's take a closer look at inetd :

victim> cat /proc/net/tcp
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
16: 0203000A:0017 0103000A:850A 08 00000000:00000001 00:00000000 00000000 0 0 6420 1 d5dac400 1500 20 0 2 -1

==> The socket (state 8 = CLOSE_WAIT) is bound to inode 6420.
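
For readers not used to the /proc/net/tcp format, the fields above can
be decoded mechanically. The following Python sketch is an illustration
added here, not part of the original analysis; the function names are
made up, and it assumes a little-endian host (as on x86), where the
kernel prints the IPv4 address as a raw 32-bit value in host byte
order.

```python
import socket
import struct

# State numbers as printed in the "st" column (hex).
TCP_STATES = {
    0x01: "ESTABLISHED", 0x02: "SYN_SENT", 0x03: "SYN_RECV",
    0x04: "FIN_WAIT1", 0x05: "FIN_WAIT2", 0x06: "TIME_WAIT",
    0x07: "CLOSE", 0x08: "CLOSE_WAIT", 0x09: "LAST_ACK",
    0x0A: "LISTEN", 0x0B: "CLOSING",
}


def decode_addr(field):
    """Turn 'HEXIP:HEXPORT' into (dotted quad, port).

    Assumes a little-endian machine wrote the file.
    """
    hex_ip, hex_port = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def decode_line(line):
    """Return (local, remote, state name, inode) for one entry."""
    parts = line.split()
    local = decode_addr(parts[1])
    remote = decode_addr(parts[2])
    state = TCP_STATES.get(int(parts[3], 16), "UNKNOWN")
    inode = int(parts[9])  # 10th whitespace-separated column
    return local, remote, state, inode
```

Fed the entry shown above, decode_line() yields 10.0.3.2:23 as the
local side, 10.0.3.1:34058 as the remote side, CLOSE_WAIT as the state
and 6420 as the inode, which agrees with the netstat output.
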

victim> sudo ls -l /proc/1013/fd/|grep 6420
lrwx------ 1 root root 64 Sep 12 19:28 3 -> socket:[6420]

==> Again, it's FD #3.

I restarted strace on inetd, and noticed that fd#3 was not in the select fd
list anymore (remember one of the two cases I spoke about a few hours ago ?) :
victim> strace -p 1013
select(22, [4 5 6 7 8 9 11 12 13 14 15 16 17 18 19 20 21], NULL, NULL, NULL <unfinished ...>

Then, I took a look at the strace capture (184 MB !), into which I inserted
line numbers for better readability :

1:1013 accept(10, 0, NULL) = 3
2:1013 fcntl64(10, F_SETFL, O_RDONLY) = 0
3:1013 rt_sigprocmask(SIG_BLOCK, [HUP ALRM CHLD], NULL, 8) = 0
4:1013 fork() = 1108
5:1013 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
6:1013 close(3) = 0

This was the last but one connection assigned to fd #3. As you see, it's
finally closed. But a few lines later :

7:1013 select(22, [4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21], NULL, NULL, NULL) = 1 (in [10])
8:1013 fcntl64(10, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
9:1013 accept(10, 0, NULL) = 3
10:1013 fcntl64(10, F_SETFL, O_RDONLY) = 0
11:1013 rt_sigprocmask(SIG_BLOCK, [HUP ALRM CHLD], NULL, 8) = 0
12:1013 gettimeofday({1095008773, 685550}, NULL) = 0

The FD gets re-used, but is never scanned anymore, so never closed either :

35:1013 select(22, [4 5 6 7 8 9 11 12 13 14 15 16 17 18 19 20 21], NULL, NULL, NULL <unfinished ...>

Conclusion :
============

The problem is within inetd. In my case it could be because it was a bit
old (1999), but since you have it too, it might indicate an old bug. The
fact that it affects mysql too does not prove that the problem is in the
kernel, and I suspect that for whatever reason, there are some race
conditions in these two programs if the connection is either reused or
closed very quickly.

To demonstrate this, I've run your program against my reverse-proxy,
haproxy, which I fortunately happen to know better than these other
programs. I could not manage to get even a CLOSE_WAIT session after
several attempts. All connections are closed normally, and as you'll
see with this extract from strace, the polled file-descriptors are
active once you kill the attacker :

(...)
close(593) = 0
select(684, [3 5 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617
618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636
637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655
656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674
675 676 677 678 679 680 681 682 683], NULL, NULL, {4, 835000}) = 81 (in [603
604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622
623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641
642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660
661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679
680 681 682 683], left {4, 836000})
gettimeofday({1095011266, 506966}, NULL) = 0
recv(603, "", 4096, 0x4000) = 0
recv(604, "", 4096, 0x4000) = 0
recv(605, "", 4096, 0x4000) = 0
(...)
close(605) = 0
close(604) = 0
close(603) = 0
select(6, [3 5], NULL, NULL, NULL <unfinished ...>
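
The pattern visible in that trace (select() reports the socket readable, recv() returns 0, the socket is closed) is what keeps connections from piling up in CLOSE_WAIT. A minimal sketch of it in Python, not haproxy's actual code; the function name reap_closed and the socketpair() demo are assumptions made for illustration:

```python
import select
import socket

def reap_closed(conns, timeout=0.1):
    """Close every socket in conns whose peer has shut down; return the rest."""
    if not conns:
        return []
    readable, _, _ = select.select(conns, [], [], timeout)
    alive = []
    for sock in conns:
        if sock in readable:
            try:
                # MSG_PEEK so that data on live connections is not consumed.
                data = sock.recv(4096, socket.MSG_PEEK)
            except OSError:
                data = b""
            if data == b"":
                # recv() == 0 means the peer sent FIN: close our side too,
                # otherwise the socket stays in CLOSE_WAIT.
                sock.close()
                continue
        alive.append(sock)
    return alive

# Demo with local socket pairs standing in for client connections.
a, b = socket.socketpair()
c, d = socket.socketpair()
b.close()                      # the peer of "a" goes away
remaining = reap_closed([a, c])
print(len(remaining))
```

The point of the design is simply that every accepted descriptor stays in the polled set until it has been closed; a descriptor that is neither polled nor closed is exactly the leak seen with inetd above.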

So I believe you'll have to dig into some programs because at least you found
a vulnerability in both inetd and mysql :-)

Regards,
Willy
Willy Tarreau
2004-09-12 18:18:43 UTC
Hi again, Dale,

I forgot to say that you don't need to fear releasing your exploit. I
developed its equivalent 4 years ago to stress-test web servers and
proxies, and if I launch it against victim:23, I get the exact same
result within seconds : a CLOSE_WAIT socket :

attacker> ./connectdata 10.0.3.2 23 200 1
ERROR: connect()=-1, nbconn=134 : Connection refused
ERROR: connect()=-1, nbconn=135 : Connection refused
ERROR: connect()=-1, nbconn=136 : Connection refused
ERROR: connect()=-1, nbconn=137 : Connection refused

The program connects 200 sockets to the same IP:port, and sends the beginning
of an HTTP request.

victim> sudo netstat -atnp|grep -v LISTEN
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 17 0 10.0.3.2:23 10.0.3.1:38214 CLOSE_WAIT 1333/inetd

It's not even necessary to send data, and it's then even faster to block my
very old inetd :

attacker> ./connectdata-nb 10.0.3.2 23 200
200 connections established.
Press any key so exit.

This time, it sends 200 non-blocking connect() calls without any data. It
takes a fraction of a second with the same result. Hopefully, it will
help Peter and you reproduce the problem faster on mysql.

Both programs have been freely available here for two years ; I didn't think
they would be useful again !

http://w.ods.org/tools/connect/

Regards,
Willy
Alan Cox
2004-09-12 17:17:21 UTC
Post by Willy Tarreau
The problem is within inetd. In my case it could be because it was a bit
old (1999), but since you have it too,
Ancient inetd had several fd leak bugs fixed over time and some other
problems with built-in services. Not much of a surprise that a 1999 inetd
has it.

Alan
Wolfpaw - Dale Corse
2004-09-12 19:42:26 UTC
Hey,

I'm not Alan, just trying to save him some typing :) The
issue being referenced is that BGP4 as you know uses TCP
communication to check link status. If the TCP session is
severed in any way, BGP assumes the link is down, and drops
the advertisements, or the table, depending on which side
(at least in Cisco's implementation).

This basically leaves you dead in the water for a few
seconds while the BGP session is re-established, and
the advertisements are sent out, and the table rebuilt. It
can also cause other routers on the net to see you as
"flapping", and dampen your routes.. Which again leaves
you dead in the water (at least from things behind them).

This is accomplished by guessing the correct TCP Sequence
number, and sending RST packets to drop the TCP connection.

BGP does not actually check the layer 2 status of the
connection to make sure the link is still UP before
it assumes you have dropped. I believe this is the
poor implementation he is referring to.

MD5 encryption was added to the sessions between
routers to make hijacking the stream more difficult
(if not next to impossible)

D.
-----Original Message-----
Sent: Sunday, September 12, 2004 1:36 PM
To: 'Dale'
Subject: FW: Linux 2.4.27 SECURITY BUG - TCP Local and
REMOTE(verified) Denial of Service Attack
-----Original Message-----
Sent: Sunday, September 12, 2004 1:24 PM
To: Alan Cox
Kernel Mailing List
Subject: Re: Linux 2.4.27 SECURITY BUG - TCP Local and
REMOTE(verified) Denial of Service Attack
Post by Alan Cox
This is not a TCP flaw, it's a combination of poor design by certain
vendors, poor BGP implementation and a lack of understanding of what
TCP does and does not do. See IPSec. TCP gets stuff from A to B in
order and knowing to a reasonable degree what arrived. TCP does not
provide a security service.
(The core of this problem arises because certain people treat TCP
connection down on the peering session as link down)
Alan, could you please elaborate on this last statement?
I don't understand what you mean, and am very interested.
Thanks,
Toon.
--
"Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as
possible, you are, by definition, not smart enough to debug
it." - Brian W. Kernighan
Willy Tarreau
2004-09-12 19:53:31 UTC
Post by Wolfpaw - Dale Corse
MD5 encryption was added to the sessions between
routers to make hijacking the stream more difficult
(if not next to impossible)
Correction : MD5 *signature* was added from the beginning, since the problem
was identified from the start, but seeing that certain people did not implement
it, others found it interesting to turn this into a "generic TCP vulnerability"
to get some credit, or perhaps to make them react positively.
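
To make the signature-versus-encryption distinction concrete, here is a rough sketch in Python of a keyed MD5 digest in the spirit of the TCP MD5 signature option (RFC 2385): the segment stays readable, but a digest computed over it plus a shared secret travels with it, so a blindly injected RST cannot carry a valid one. This is a simplification, not a real implementation: it assumes a 20-byte header without options (a real segment carrying the option has a longer header and segment length), and the addresses, ports, key and function name are made up for illustration.

```python
import hashlib
import hmac
import socket
import struct

def tcp_md5_digest(src_ip, dst_ip, tcp_header, payload, key):
    """Digest over pseudo-header, header with zeroed checksum, data, then key."""
    # Zero the checksum field (bytes 16-17 of the TCP header) before hashing.
    header = tcp_header[:16] + b"\x00\x00" + tcp_header[18:20]
    pseudo = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
        0, socket.IPPROTO_TCP, len(header) + len(payload),
    )
    return hashlib.md5(pseudo + header + payload + key).digest()

# Hypothetical RST segment towards a BGP session (port 179).
rst_header = struct.pack(
    "!HHIIBBHHH",
    34058, 179,        # source port, destination port
    2882867182, 0,     # sequence number, acknowledgment number
    5 << 4, 0x04,      # data offset (5 words), flags (RST)
    0, 0xBEEF, 0,      # window, checksum (ignored by the digest), urgent
)

genuine = tcp_md5_digest("10.0.3.1", "10.0.3.2", rst_header, b"", b"shared-secret")
forged = tcp_md5_digest("10.0.3.1", "10.0.3.2", rst_header, b"", b"guess")

# The receiver recomputes the digest with its own copy of the key and compares.
print(hmac.compare_digest(genuine, forged))
```

An attacker who guesses the sequence number still has to guess the key as well; without it the segment is dropped before TCP ever looks at the RST flag.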

Regards,
Willy