We’ve recently installed a couple of Juniper J-Series routers that
have the new JUNOS with Enhanced Services installed on them. During the
transition from our existing Linux routers, we started moving internal
subnets to the new routers, but when we moved the first subnet, we
discovered a problem with hosts that had addresses on two different
subnets. Connections would either work for about a minute and then get
a connection reset, or packets would reach the server and leave again,
only to be swallowed in the ether.
I spent quite a bit of time this week reading about the security
features of the new routers, and finally came up with a solution. The
first clue was that something on the network was sending connection
resets, but packet sniffing on our existing routers and the end points
showed that they weren’t generating them. I eventually
found the tcp-rst option, which generates a reset packet for
any non-SYN packet that doesn’t match an existing flow session. JUNOS ES
does stateful packet inspection by creating a session when it sees an
initial SYN packet and then does filtering and routing based on that
flow session, so it doesn’t have to evaluate every packet individually.
When I turned off the tcp-rst option on the trust zone, the connection
still worked for only a minute, but this time it just hung rather than
dying with a connection reset. This confirmed that the Juniper routers
were the cause.
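As a rough sketch of that test, assuming the zone is named trust as above and that tcp-rst had been set directly under that zone, turning the option off looks something like this:

    [edit]
    user@host# delete security zones security-zone trust tcp-rst
    user@host# commit

Using set in place of delete should restore the reset behaviour afterwards.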
It turned out that the problem was asymmetric routing. Packets were
coming in to the server on 10.0.0.1/24, but the server was also on
10.0.1.1/24 and its default route nexthop was 10.0.1.2, so replies left
via the other subnet. The resulting behaviour depended on which subnet
we moved.
If we moved 10.0.0.0/24 to the new routers, they would only see the
incoming side of the conversation. If we moved 10.0.1.0/24, they would
only see the outgoing side of the conversation. Now consider how this
interacts with session-based routing and firewalling. In the first
case, the router would see the initial SYN packet but never the
returning SYN-ACK, so after an initial timeout it would decide the flow
was never established and destroy the session info, causing further
incoming packets to be dropped. In the second case, it would never see
the SYN packet, only the SYN-ACK. This packet wouldn’t belong to an
existing session, so it would be blocked. The solution is to turn off
the SYN check, using:
    [edit]
    user@host# set security flow tcp-session no-syn-check
After committing that, sessions work correctly, even without the
router seeing both sides of the connection:
    user@host> show security flow session destination-prefix 10.0.0.1
    Session ID: 201341, Policy name: default-permit/4, Timeout: 1798
      In: 192.168.0.1/61136 --> 10.0.0.1/22;tcp, If: ge-0/0/0.7
      Out: 10.0.0.1/22 --> 192.168.0.1/61136;tcp, If: ge-0/0/1.7

    1 sessions displayed
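To confirm that the option is in the active configuration, something like the following should work. The output shown is the form expected from the set command above, assuming nothing else is configured under security flow. It is not a capture from a live router:

    user@host> show configuration security flow
    tcp-session {
        no-syn-check;
    }

If the tcp-session stanza is missing, the change has probably not been committed yet.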
Sadly, I couldn’t find much information on the no-syn-check
option, so hopefully people will find this explanation useful.