Tuesday, May 6, 2014

Fortigate VIPs ate my packets.

We traded our Cisco ASAs for Fortinet Fortigates (FGT). So far, the trade-off seems to be a pretty, usable interface in exchange for the rock-solid (albeit annoying) functionality of the Cisco. One major issue (for us) that came up during our rollout was related to Virtual IPs (VIPs), which is essentially Fortinet parlance for destination NAT.

We have a very odd NAT situation. For a particular service we offer, we have clients that are incapable of connecting to the listening port (more accurately, changing a port number in their script would require enough red tape to fill hundreds of hours of meetings and many thousands of developer hours).

As a result, we support these clients with a port redirection, but only for certain source addresses, because the port they MUST connect to is in use by another application (confused yet? I am). On the Fortigate, we solved this by creating a pair of VIPs: one broad, for all ports; the other specific to the goofy awfulness. Here is an example of how we made it work.

config firewall vip
    edit "srv-v4-redir"
        set src-filter "" ""
        set extip
        set extintf "any"
        set portforward enable
        set mappedip
        set extport 77
        set mappedport 10077
    next
    edit "srv-v4"
        set extip
        set extintf "any"
        set mappedip
    next
end

As horrible as it looks, it actually works. The result is that clients matching the src-filter connect to port 77, but they actually get DNATed to port 10077. Anyone else connecting to port 77 goes to port 77.

It is worth noting that our original configuration was more awful, and broken. When I originally configured this bit of NAT, I was still learning my way around the FGT. I mistakenly configured the "srv-v4-redir" VIP with an extintf of "vlan7", our outside interface. This broke other services using the broader "srv-v4" VIP, but in amazingly random ways. All traffic from the outside worked fine. However, we discovered breakage for some users who connect to another service on that VIP from "vlan4", but only for users coming from some source networks (networks unrelated to the "redir" sources).

In these cases, the traffic would just vanish into the FGT, as confirmed by the sniffer and flow traces. In the flow traces, the traffic would fail with the following cryptic trace messages.

fortinet-1a # id=12 trace_id=26 msg="vd-root received a packet(proto=6,> from vlan4."
id=12 trace_id=26 msg="allocate a new session-00e3addb"
id=12 trace_id=26 msg="find SNAT: IP- IPPOOL), port-0"
id=12 trace_id=26 msg="use addr/intf hash, len=13"
id=12 trace_id=26 msg="pre_route_auth check fail(id=0), drop"
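For anyone chasing a similar disappearing act, a trace like the one above can be gathered from the FGT CLI with something along these lines. This is only a sketch: the client address 192.0.2.10 is a made-up documentation address, the port and interface are borrowed from this post, and the exact diagnose syntax varies between FortiOS versions, so check it against your own release.

# Hypothetical client address; substitute a source that is actually failing.
diagnose debug reset
diagnose debug flow filter clear
diagnose debug flow filter addr 192.0.2.10
diagnose debug flow filter port 77
diagnose debug flow trace start 20
diagnose debug enable

# ...reproduce the failing connection, then turn the noise off again.
diagnose debug flow trace stop
diagnose debug disable

# The sniffer view of the same traffic (verbosity 4, stop after 20 packets).
diagnose sniffer packet vlan4 'host 192.0.2.10 and port 77' 4 20

If the sniffer shows the packet arriving on the inside interface and never leaving, and the flow trace ends in a "drop" message, the FGT itself is eating the traffic rather than something downstream.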

After escalating the ticket several times with Fortinet, and two weeks of broken connections (I'm calling you out here, Fortinet: two weeks for an answer is unacceptable), we finally got assigned to a foul-mouthed engineer (the best kind). He identified the extintf problem in between bouts of telling me what a kludgy setup this is (yes, I know; I don't like it either), and offered a number of possible solutions, with the caveat of "I can't guarantee it, because nobody does this." We tested the change, which annoyingly required us to remove all references to the "redir" VIP, and it worked. Life goes on, and I'm pretty satisfied with the FGT.
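
For the curious, the extintf change can be sketched roughly as follows. This is not a transcript of what we typed: the policy ID 42 is hypothetical, and it assumes a single firewall policy whose dstaddr lists both VIPs. The VIP has to come out of every policy that references it before the FGT will accept the change.

# Hypothetical policy 42: temporarily drop the reference to the redir VIP.
config firewall policy
    edit 42
        set dstaddr "srv-v4"
    next
end

# With no references left, change the external interface on the VIP.
config firewall vip
    edit "srv-v4-redir"
        set extintf "any"
    next
end

# Put the VIP back into the policy.
config firewall policy
    edit 42
        set dstaddr "srv-v4" "srv-v4-redir"
    next
end

The redirect is out of service between the first and last step, so this is one for a maintenance window.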

