Many of us in the networking world use IPSEC VPNs over the Internet. The ISP connection is, or at least can be, cheaper than alternatives like MPLS, and of course we all need to connect our networks to the Internet (unless you’re the DoD, CIA, or some other secretive organization with a classified network). This mystery begins with a VPN outage.
Problem: IPSEC VPN Down
At 2:44am CT the primary 10Mbps IPSEC VPN went down, but the 3Mbps MPLS worked flawlessly after route reconvergence. As the day progressed, traffic between the two sites increased on the much smaller MPLS link and began causing performance problems for users at Site B.
As we continued to troubleshoot what had happened, we found this syslog entry in Splunk that came from FW A:
Oct 14 02:44:33 fw.fw.fw.21 Oct 14 2010 02:44:33: %ASA-4-106023: Deny protocol 47 src inside:a.a.a.1 dst outside:b.b.b.254 by access-group "inside_access_in"
(Note: IP addresses have been changed here for security reasons.)
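For readers who want to pull the same fields out of their own logs, here is a minimal Python sketch that parses a 106023 deny message like the one above. The helper name `parse_deny`, the regular expression, and the small protocol table are illustrative assumptions, not anything from the firewall or Splunk. It also assumes the message uses plain ASCII hyphens and straight quotes.

```python
import re

# Hypothetical helper (not from the original post): extracts the key fields
# from an ASA 106023 "Deny" message. Assumes ASCII hyphens and straight quotes.
DENY_RE = re.compile(
    r"%ASA-4-106023: Deny protocol (?P<proto>\d+) "
    r"src (?P<src_if>\w+):(?P<src>\S+) "
    r"dst (?P<dst_if>\w+):(?P<dst>\S+) "
    r'by access-group "(?P<acl>[^"]+)"'
)

# A few IP protocol numbers from the IANA registry; 47 is GRE.
PROTO_NAMES = {1: "icmp", 6: "tcp", 17: "udp", 47: "gre", 50: "esp"}


def parse_deny(line):
    """Return a dict of fields from a 106023 message, or None if it does not match."""
    m = DENY_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    number = int(fields["proto"])
    # Fall back to the raw number for protocols not in the small table above.
    fields["proto"] = PROTO_NAMES.get(number, str(number))
    return fields


line = (
    "Oct 14 02:44:33 fw.fw.fw.21 Oct 14 2010 02:44:33: %ASA-4-106023: "
    "Deny protocol 47 src inside:a.a.a.1 dst outside:b.b.b.254 "
    'by access-group "inside_access_in"'
)
fields = parse_deny(line)
print(fields["proto"], fields["src"], "->", fields["dst"], "via", fields["acl"])
```

The point of the exercise is the protocol number: 47 is GRE, so the firewall was dropping GRE traffic from a.a.a.1 to b.b.b.254.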
Nobody had made any changes at 2:44am. So what changed? After digging some more into our change management system, we found this change to FW A that was made back on 9/23:
Last Month – 9/23/2010 12:00:18 AM (ADDS 0, DELETES 0, CHANGES 1)
BEFORE: access-list inside_access_in extended permit gre host a.a.a.1 host b.b.b.254
AFTER: access-list inside_access_in extended permit gre host a.a.a.254 host b.b.b.254
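The effect of that one-line change can be sketched in a few lines of Python. This is a deliberately simplified model, assuming the entry with a.a.a.1 is the BEFORE line and the one with a.a.a.254 is the AFTER line, and looking only at this single host-to-host entry. The real inside_access_in list almost certainly has other entries, and the function names `parse_ace` and `permits` are made up for illustration.

```python
def parse_ace(line):
    """Parse one host-to-host extended ACL entry into a dict.

    Only handles the exact shape quoted in the post:
    access-list NAME extended ACTION PROTO host SRC host DST
    """
    parts = line.split()
    if (len(parts) != 9 or parts[0] != "access-list"
            or parts[2] != "extended"
            or parts[5] != "host" or parts[7] != "host"):
        raise ValueError(f"unsupported ACL entry: {line!r}")
    return {"action": parts[3], "proto": parts[4],
            "src": parts[6], "dst": parts[8]}


def permits(entries, proto, src, dst):
    """First matching entry wins; no match falls through to the implicit deny."""
    for ace in entries:
        if (ace["proto"], ace["src"], ace["dst"]) == (proto, src, dst):
            return ace["action"] == "permit"
    return False


before = [parse_ace(
    "access-list inside_access_in extended permit gre host a.a.a.1 host b.b.b.254")]
after = [parse_ace(
    "access-list inside_access_in extended permit gre host a.a.a.254 host b.b.b.254")]

# The flow from the deny message: GRE from a.a.a.1 to b.b.b.254.
print("before:", permits(before, "gre", "a.a.a.1", "b.b.b.254"))
print("after: ", permits(after, "gre", "a.a.a.1", "b.b.b.254"))
```

Under those assumptions the flow is permitted before the change and denied after it, which matches the 106023 message naming a.a.a.1 as the denied source.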
- Oct 14 02:44:26 fw.fw.fw.21 Oct 14 2010 02:44:26: %ASA-3-713123: Group = [FW B InternetIP], IP = [FW B InternetIP], IKE lost contact with remote peer, deleting connection (keepalive type: DPD)
- Oct 14 02:44:26 fw.fw.fw.21 Oct 14 2010 02:44:26: %ASA-5-713259: Group = [FW B InternetIP], IP = [FW B InternetIP], Session is being torn down. Reason: Lost Service
- Oct 14 02:44:26 fw.fw.fw.21 Oct 14 2010 02:44:26: %ASA-4-113019: Group = [FW B InternetIP], Username = [FW B InternetIP], IP = [FW B InternetIP], Session disconnected. Session Type: IPsec, Duration: 21d 15h:00m:15s, Bytes xmt: 181785169, Bytes rcv: 3049561298, Reason: Lost Service
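The Duration field in the last message ties the two findings together. Here is a rough Python sketch that subtracts it from the teardown time to estimate when the session came up. It assumes the syslog timestamp and the change management timestamp are in the same time zone, which the post does not state. The optional "Nd " prefix in the pattern is also an assumption about how shorter sessions are printed, and `parse_duration` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta

# Matches "Duration: 21d 15h:00m:15s"; the day part is treated as optional
# (an assumption about how sessions shorter than a day are logged).
DURATION_RE = re.compile(r"Duration: (?:(\d+)d )?(\d+)h:(\d+)m:(\d+)s")


def parse_duration(message):
    """Return the session duration from a 113019 message as a timedelta."""
    m = DURATION_RE.search(message)
    if m is None:
        raise ValueError("no Duration field found")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


message = ("Session disconnected. Session Type: IPsec, "
           "Duration: 21d 15h:00m:15s, Bytes xmt: 181785169, "
           "Bytes rcv: 3049561298, Reason: Lost Service")

teardown = datetime(2010, 10, 14, 2, 44, 26)   # from the syslog timestamp
acl_change = datetime(2010, 9, 23, 0, 0, 18)   # 9/23/2010 12:00:18 AM

session_start = teardown - parse_duration(message)
print("session started:", session_start)       # 2010-09-22 11:44:11
print("started before ACL change:", session_start < acl_change)  # True
```

By this estimate the session came up roughly twelve hours before the ACL was edited, so it had been running on the old rule the whole time.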
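Mystery solved! The IPsec session had been up for more than 21 days, so it predated the 9/23 change. The edited ACL apparently had no visible effect until the tunnel dropped at 2:44am and the GRE traffic from a.a.a.1 had to pass inside_access_in again.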