When designing Active Directory Federation Services (AD FS), my involvement with the networking guys who handle the load balancer configuration is generally limited to a few calls and emails. We provide some requirements in the form of availability and persistence (or stickiness) and they do what needs to be done. Truthfully, you only get a high-level architectural view of the various solutions, but that still helps with subsequent designs because you pick up some experience of the pitfalls from past engagements. This post is not prescriptive. I’m no expert with BIG-IP, Cisco or Brocade. But I do know AD FS and I do want to talk about a nuance that can change the basic persistence recommendations.
AD FS. Generally you’ll have an internal (corpnet) virtual IP (VIP) for the federation service (FS) servers and an external (Internet-facing) VIP for the Federation Service Proxy (FS-P). The VIP often supports both HTTP and HTTPS, although most deployments are 95-100% HTTPS. The persistence or stickiness setting is generally NONE. AD FS is stateless and does not require client session affinity. The load balancer needs to do what it does to keep the TLS connection alive, but beyond that there is nothing application-specific that requires affinity.
MFA Server. This topology can vary a little, but I typically have two corpnet VIPs – one for the SDK WS, one for the user portal; and one external VIP for the user and mobile portals. These VIPs only really need to be HTTPS but I like to use both HTTP and HTTPS and redirect HTTP to HTTPS. These VIPs require session-based affinity, and generally we use source-IP.
Let me draw what an AD FS IdP with MFA Server looks like to summarise the above.
What am I trying to convey here?
- DC#1-WAP and DC#1-PTL(EXT) are the external (Internet-facing) VIPs – WAP is the FS-P and PTL(EXT) is the User and Mobile portals
- DC#1-FS, DC#1-MFA and DC#1-PTL(INT) are the internal (corpnet) VIPs for AD FS, MFA (SDK WS) and User portal
- The MFA-related VIPs have persistence/affinity/stickiness; the AD FS VIPs do not
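The baseline described above can be written down as a small data structure, which is sometimes handy when handing requirements to the network team. This is only a sketch: the VIP names come from the diagram, while the `Vip` class, its field names and the `sticky_vips` helper are my own illustrative inventions and not part of any load balancer product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vip:
    name: str         # VIP label as used in the diagram
    facing: str       # "external" (Internet-facing) or "internal" (corpnet)
    backend: str      # what sits behind the VIP
    persistence: str  # "none" or "source-ip"


# Baseline recommendation: MFA-related VIPs are sticky, AD FS VIPs are not.
BASELINE_VIPS = [
    Vip("DC#1-WAP", "external", "FS-P (WAP)", "none"),
    Vip("DC#1-PTL(EXT)", "external", "User and Mobile portals", "source-ip"),
    Vip("DC#1-FS", "internal", "AD FS", "none"),
    Vip("DC#1-MFA", "internal", "MFA SDK WS", "source-ip"),
    Vip("DC#1-PTL(INT)", "internal", "User portal", "source-ip"),
]


def sticky_vips(vips):
    """Return the names of the VIPs that need persistence configured."""
    return [v.name for v in vips if v.persistence != "none"]
```

Calling `sticky_vips(BASELINE_VIPS)` lists the three MFA-related VIPs and leaves out both AD FS VIPs, which is the typical starting point before the deviation discussed next.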
Deviation from typical in a complex deployment
This is all well and good for most deployments. However, I have recently seen an intermittent issue with SMS OTP in this topology that forced us to introduce source-IP based stickiness at the corpnet FS VIP. Here’s why…
In my deployments there are too many FS servers to consider installing MFA Server on each of them, so I always use the MFA Server secondary authentication provider, a.k.a. the “AD FS Adapter”. The adapter talks to the SDK WS, which is load balanced.
The FS-P (WAP) talks to any AD FS server via the FS VIP. A payload is sent, a response is received and the job is done. FS-P uses connection pooling and can and will talk to any one of several FS servers; with round-robin distribution and no stickiness, the FS handling a given request can and will change.
Now consider a user who is authenticating from the Internet and is therefore interacting with the FS-P. MFA is invoked and the user’s chosen MFA mechanism is one-way SMS, so there can be a perfectly valid delay between the initial authentication with username and password and the OTP being entered. What we saw was that the FS-P would sometimes send the user-entered OTP to a different FS than the one that handled the primary authentication. This is not at all desirable. The FS that performed the primary authentication invoked MFA, and it is the only FS that can accept the OTP, because only the MFA Server that requested the OTP from the cloud service holds it; the OTP is not written to the data file. When the FS-P uses a different connection to another FS, that FS sends the OTP to a different MFA Server, which rejects it because it has no knowledge of that OTP, so authentication fails.
The solution is to turn on stickiness at the FS VIP so that each FS-P server keeps talking to the same FS. The FS servers already have consistent, persistent sessions with the MFA Servers.
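To make the failure mode concrete, here is a toy simulation of the two-step sign-in. It is a sketch under several assumptions of mine: each FS is pinned to exactly one MFA Server (standing in for the source-IP affinity on the SDK WS VIP), the sticky mode picks a server by hashing the source IP, and the class, function names, IP address, user and OTP value are all hypothetical. It does not model connection pooling, so round-robin fails every time here, whereas in production the failure was intermittent.

```python
import itertools
import zlib

# Assumption: source-IP affinity on the SDK WS VIP pins each FS to one MFA Server.
FS_TO_MFA = {"FS01": "MFA01", "FS02": "MFA02", "FS03": "MFA03"}


class FsVip:
    """Toy FS VIP: round-robin when not sticky, source-IP hash when sticky."""

    def __init__(self, servers, sticky):
        self.servers = list(servers)
        self.sticky = sticky
        self._round_robin = itertools.cycle(self.servers)

    def pick(self, source_ip):
        if self.sticky:
            # The same source IP always maps to the same FS.
            index = zlib.crc32(source_ip.encode()) % len(self.servers)
            return self.servers[index]
        return next(self._round_robin)


def sign_in(vip, wap_ip, user="alice", otp="123456"):
    """Run primary auth, then OTP entry; return (first FS, second FS, accepted)."""
    # Each MFA Server keeps pending OTPs in memory only.
    pending = {mfa: {} for mfa in FS_TO_MFA.values()}

    # Step 1: primary authentication; the chosen FS's MFA Server issues the OTP.
    first_fs = vip.pick(wap_ip)
    pending[FS_TO_MFA[first_fs]][user] = otp

    # Step 2: the user submits the OTP; the FS-P request may land on another FS.
    second_fs = vip.pick(wap_ip)
    accepted = pending[FS_TO_MFA[second_fs]].get(user) == otp
    return first_fs, second_fs, accepted
```

With `FsVip(FS_TO_MFA, sticky=False)` the two requests land on FS01 and then FS02, so the OTP is rejected. With `sticky=True` both requests from the same FS-P address reach the same FS and the OTP is accepted.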
Let me try and draw this just to be super clear.
- First, the blue arrow shows our on-premises federated user authenticating via FS-P (WAP01), which talks to FS01, which in turn talks to MFA01, which invokes the one-way OTP.
- Secondly, the red arrow shows our user entering her OTP at WAP01, which communicates with FS03, which in turn communicates with MFA03, which has no knowledge of the OTP.
If you run AD FS 2012 R2 or AD FS 2016 with Azure MFA Server, use the MFA Server secondary authentication provider (the AD FS Adapter), and allow one-way SMS as a means of authentication, then you have to ensure SSL/TLS stickiness/affinity/persistence between FS-P (WAP) and FS (AD FS), in addition to stickiness on the MFA SDK web services VIP.
I first encountered this issue with a hybrid IdP (Azure AD Premium and AD FS as described above) with a significant user base (~100,000 active users across multiple geographies) and a medium to large number of federated applications (~100), during the post go-live phase of rolling out the MFA Server and Conditional Access Policies. Our concern at the time was that implementing stickiness on the FS VIP could result in multiple FS-P servers getting a sticky session to a single FS, which would result in an uneven load distribution. We weren’t worried that the FS servers couldn’t handle the load; they can. We just didn’t want to see big spikes on a small set of the servers. Our concerns were unfounded. We had the luxury of Azure AD Connect Health giving us a rich set of easily digestible performance data, and we monitored it closely when we made the change to the VIP and for a week or so after. We saw no ill effects. That is why I’m happy to post this recommendation: I’ve since deployed another similar hybrid IdP, am now designing another, and have to commit this recommendation to those designs.
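If you want to reason about the uneven-load concern before making the change, a quick tally of how many FS-P source addresses would stick to each FS can help. The sketch below assumes a simple hash-modulo mapping from source IP to FS. Real load balancers use their own persistence tables and algorithms, and the `fs_load` helper and any addresses you feed it are hypothetical, so treat the result as a thought experiment rather than a prediction.

```python
import zlib
from collections import Counter


def fs_load(wap_ips, fs_servers):
    """Count how many FS-P source IPs would stick to each FS server."""
    counts = Counter({fs: 0 for fs in fs_servers})
    for ip in wap_ips:
        # Assumption: persistence behaves like a hash of the source IP.
        index = zlib.crc32(ip.encode()) % len(fs_servers)
        counts[fs_servers[index]] += 1
    return counts
```

For example, `fs_load(["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.0.14"], ["FS01", "FS02", "FS03"])` returns a count per FS that sums to four. With only a handful of FS-P servers some skew is always possible, which is why watching real performance data after the change, as we did, is the more reliable check.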
I hope this helps. (And I hope it makes sense.)