microsoft / service-fabric-issues

This repo is for the reporting of issues found with Azure Service Fabric.

Azure AD Service Fabric Explorer authentication via AAD suddenly stopped working - Azure Gov

sladeedmonds opened this issue

We've had a cluster running in Azure Gov for quite some time (over a year), and it was deployed with the AAD integration so that users could authenticate to SFX with AAD credentials. Cluster version is 6.4.658.9590. We are using the same CA-acquired SSL certificate for the cluster and the reverse proxy.
AAD authentication to SFX is now failing, and users are prompted for a client certificate instead. It had worked fine since the cluster was originally deployed. The AAD applications that handle authentication were created with SetupApplications.ps1 from the AADTool repository. Those apps still exist in our Gov AAD tenant and have been untouched. According to MS documentation, this issue happens when users haven't been granted rights to the application and SF falls back to client certificates; however, that is not our problem, as all of the users, including me, still have rights to the application.

To better understand what might be happening, I deployed a new cluster into my sandbox environment in Gov. I have used this same template numerous times in the past without issue. This sandbox deployment also has AAD authentication configured, and I used the same SetupApplications.ps1 script (with parameters for the sandbox cluster) to create the AAD applications. The applications specific to my sandbox deployment have existed for roughly as long as the ones created for our live cluster, and I had never encountered issues logging in to SFX in my sandbox environment. Now, though, when I deploy my sandbox cluster, I experience the same issue: I am presented with a dialog to select a client certificate.

I then destroyed my sandbox cluster and the sandbox applications used to authenticate to AAD. I created new AAD applications for the sandbox cluster using SetupApplications.ps1 and redeployed the SF sandbox cluster. Same issue.

Next, I modified my ARM deployment to force the oldest version of SF that is allowable (6.4.617.9590), thinking perhaps an upgrade somewhere along the way has broken AAD authentication. Same issue.

Unfortunately, I am not sure when the issue started because we don't actually need to log in to SFX very often. Our best guess is that we last logged in to SFX on the production cluster sometime around June 1, 2019.

I have a gut feeling something has changed specific to the Azure Gov back-end environment that is causing this. I have raised a ticket, but so far there has been little progress. What is causing this to suddenly fail in Azure Gov?

Also, I have confirmed that the issue still exists after upgrading to 6.5.641.9590. I also deployed a new sandbox environment that skipped 6.4.658.9590 completely, and the issue still exists there too.

For anyone else who may encounter the issue: something on the AAD Gov backend apparently broke the functionality. I have a case open with Microsoft and was told that 6.5.658.9590 would resolve the issue. Unfortunately, that is not the case. The issue exists with 6.5.658.9590, and I have not received any updates regarding the status of the bug. The workaround for now is to add the AADTokenEndpointFormat and AADCertEndpointFormat parameters to the Security section of the cluster's fabricSettings:

"fabricSettings": [
    {
        "parameters": [
            {
                "name": "ClusterProtectionLevel",
                "value": "[parameters('clusterProtectionLevel')]"
            },
            {
                "name": "AADTokenEndpointFormat",
                "value": "https://login.microsoftonline.us/{0}"
            },
            {
                "name": "AADCertEndpointFormat",
                "value": "https://login.microsoftonline.us/{0}/federationmetadata/2007-06/federationmetadata.xml"
            }
        ],
        "name": "Security"
    },

You can also add these parameters using the Custom fabric settings blade in the portal.
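
If you maintain the ARM template in source control, the same change can be scripted. The following is only a rough sketch in Python, assuming the `fabricSettings` array has already been loaded from the template as a list of dicts. The function name `add_gov_aad_endpoints` is hypothetical, not part of any Service Fabric tooling, and the two endpoint values are copied verbatim from the workaround above.

```python
import copy
import json

# Endpoint formats copied from the workaround above; the "{0}" placeholder
# is left untouched so the cluster can fill it in.
GOV_AAD_SETTINGS = {
    "AADTokenEndpointFormat": "https://login.microsoftonline.us/{0}",
    "AADCertEndpointFormat": (
        "https://login.microsoftonline.us/{0}"
        "/federationmetadata/2007-06/federationmetadata.xml"
    ),
}


def add_gov_aad_endpoints(fabric_settings):
    """Return a copy of fabric_settings with the Gov AAD endpoint
    parameters set in the Security section (hypothetical helper)."""
    settings = copy.deepcopy(fabric_settings)  # leave the caller's data alone

    # Find the Security section, or create it if the template lacks one.
    security = next((s for s in settings if s.get("name") == "Security"), None)
    if security is None:
        security = {"name": "Security", "parameters": []}
        settings.append(security)

    params = security.setdefault("parameters", [])
    for name, value in GOV_AAD_SETTINGS.items():
        existing = next((p for p in params if p.get("name") == name), None)
        if existing is None:
            params.append({"name": name, "value": value})
        else:
            existing["value"] = value  # overwrite a stale endpoint
    return settings


if __name__ == "__main__":
    original = [
        {
            "name": "Security",
            "parameters": [
                {
                    "name": "ClusterProtectionLevel",
                    "value": "[parameters('clusterProtectionLevel')]",
                }
            ],
        }
    ]
    print(json.dumps(add_gov_aad_endpoints(original), indent=4))
```

The helper works on a copy and overwrites existing values rather than appending duplicates, so running it twice on the same template gives the same result.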

Just to share: 6.5.664.9590 also does not resolve the issue.

They've corrected a backend provider-level issue in the Gov cloud and the issue has been resolved. I confirmed with a new deployment.