eIAM Geo-Redundancy
Geo-Redundancy of eIAM
The control over which data center serves eIAM-Core is managed via DNS entries. Normally, the hostnames (FQDNs) of eIAM-Core point to the load balancers in Primus; after a switch, they point to Campus. A switch therefore leads to an interruption for the duration of the DNS cache validity period (DNS TTL); currently, an interruption of about 30 minutes is to be expected during a switch.

Under eIAM-Core, services 1–10 and 12 are grouped according to eIAM Services. The service eIAM RP-PEP (Service 11) is treated separately because, from a network perspective, the RP-PEP must lie in the communication path between the browser and the application backend.
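The roughly 30-minute window follows directly from the TTL on those DNS records. As a rough illustration, the following minimal sketch (Python with the dnspython library) resolves a hostname and reports the TTL of the answer; the hostname shown is a placeholder, not an actual eIAM-Core FQDN.

```python
# Minimal sketch: read the DNS TTL of a hostname to estimate how long clients
# may keep resolving the old data center after an eIAM-Core switch.
# The hostname is a placeholder, not a real eIAM-Core FQDN.
import dns.resolver  # pip install dnspython

EIAM_CORE_FQDN = "idp.example.admin.ch"  # placeholder

def answer_ttl(fqdn: str) -> int:
    """Return the TTL (in seconds) of the current A-record answer for fqdn."""
    answer = dns.resolver.resolve(fqdn, "A")
    return answer.rrset.ttl

if __name__ == "__main__":
    ttl = answer_ttl(EIAM_CORE_FQDN)
    print(f"{EIAM_CORE_FQDN}: old target may stay cached for up to "
          f"{ttl} s (~{ttl // 60} min) after a switch.")
```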
Geo-Redundancy of Applications
Depending on the eIAM failure scenario, different effects arise for applications. For geo-redundant applications, eIAM assumes that they have been set up according to the specifications in Chapters 4.3 and 5 of the IT guideline on implementing availability classes and geo-redundancy (German) (Perma-Link: ).
Integration Pattern BTB-Direct
Applications connected via STS-BTB access eIAM through defined hostnames, which point to components in Primus (by default) or Campus (if Primus has failed completely as a data center), depending on operational status. A switch of eIAM-Core is therefore fundamentally independent of, and transparent to, a possible switch of a geo-redundant application. Whether an application should or must also switch depends not on eIAM but on the incident itself (e.g. a data center failure).
(Figure: STS-BTB integrated geo-redundant application, active in Campus)
| Scenario | Application in Primus | Application in Campus | Geo-redundant Application (Primus + Campus) | Application outside Primus / Campus (other DC / Public Cloud) | Application outside Primus / Campus, geo-redundant (in 2 Cloud Regions) |
|---|---|---|---|---|---|
| Failure of DC Primus (eIAM-Core switches to Campus) | Application fails | Logins still possible | Logins still possible. Application: if the application was active in Primus, it must also switch as part of application operations. | Logins still possible | Logins still possible |
| Application failure | Application fails | Application fails | Logins still possible. Application: switches to the other DC. | Application fails | Logins still possible. Application switches to another region (or is already active there). |
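The table distinguishes "application fails" from "logins still possible". One way for application operations to tell the two apart during an incident is to probe the application and the eIAM login endpoint separately; the sketch below assumes hypothetical health-check URLs and is only meant to illustrate the decision logic.

```python
# Sketch: separate "application down" from "eIAM-Core down" by probing both
# endpoints independently. Both URLs are illustrative placeholders.
import requests

APP_URL = "https://app.example.admin.ch/health"   # placeholder application health endpoint
EIAM_URL = "https://sts.example.admin.ch/"        # placeholder eIAM endpoint

def reachable(url: str, timeout: float = 5.0) -> bool:
    """True if the endpoint answers with a non-5xx status within the timeout."""
    try:
        return requests.get(url, timeout=timeout).status_code < 500
    except requests.RequestException:
        return False

app_ok, eiam_ok = reachable(APP_URL), reachable(EIAM_URL)
if app_ok and eiam_ok:
    print("Application and eIAM reachable: no action needed.")
elif not app_ok and eiam_ok:
    print("Logins still possible, application fails: switch the application.")
elif app_ok and not eiam_ok:
    print("Application reachable, eIAM not: wait for the eIAM-Core switch.")
else:
    print("Both unreachable: possible DC failure, coordinate with incident management.")
```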
Integration Pattern STS-PEP
| Scenario | Application in Primus | Application in Campus | Geo-redundant Application (Primus + Campus) | Application outside Primus / Campus (other DC / Public Cloud) | Application outside Primus / Campus, geo-redundant (in 2 Cloud Regions) |
|---|---|---|---|---|---|
| Failure of DC Primus (eIAM-Core switches to Campus) | Application fails | Logins no longer possible (*) | Logins no longer possible (*) | Logins no longer possible (*) | Logins no longer possible (*) |
| Application failure | Application fails | Application fails | Logins still possible. Application: switches to the other DC. | Application fails | Logins still possible. Application switches to another region (or is already active there). |
Integration Pattern RP-PEP
Also note: Migration RP-PEP to STS-PEP

In contrast to the STS-BTB integration, the RP-PEP integration creates a dependency between eIAM and the application, because the RP-PEP is an eIAM component placed in the communication path between the browser and the application. The RP-PEP must therefore always run in the same data center as the application itself. For geo-redundant applications with RP-PEP, eIAM keeps the RP-PEPs of both data centers active so that an application can switch independently of eIAM. RP-PEPs are not offered outside of Primus / Campus.
(Figure: RP-PEP integrated geo-redundant application, active in Campus)
| Scenario | Application in Primus | Application in Campus | Geo-redundant Application (Primus + Campus) |
|---|---|---|---|
| Failure of DC Primus (eIAM-Core switches to Campus) | Application fails | Logins still possible | Logins still possible. Application: if the application was active in Primus, it must also switch as part of application operations. |
| RP-PEP or load balancer failure | Application fails | Application fails | Logins still possible. Application: switches to the other DC as part of application operations. |
| Application failure | Application fails | Application fails | Logins still possible |
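Because the RP-PEPs of both data centers are kept active, application operations can check the application through each data center's front load balancer before switching. A minimal sketch, assuming per-DC CNAMEs along the lines of the example in the next section and a hypothetical /health endpoint:

```python
# Sketch: probe the application via the Primus and Campus RP-PEP front load
# balancers separately. The CNAMEs and the /health path are assumptions based
# on the example scheme below, not values provided by eIAM.
import requests

DC_ENDPOINTS = {
    "Primus": "https://app.primus.amt.admin.ch/health",
    "Campus": "https://app.campus.amt.admin.ch/health",
}

def healthy(url: str) -> bool:
    try:
        return requests.get(url, timeout=5).status_code == 200
    except requests.RequestException:
        return False

for dc, url in DC_ENDPOINTS.items():
    print(f"{dc}: {'healthy' if healthy(url) else 'unreachable'}")
```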
Fundamentally, application operations are responsible for deciding, based on a situational assessment and in coordination with eIAM operations (or, during a major incident, with incident management), whether to switch or to wait until the issue in the data center has been resolved.
Switch by application operations for geo-redundant applications with RP-PEP
During integration, front load balancers are ordered by the eIAM SIE and their FQDN / CNAMEs are handed over to the application operations. Example scenario:
- FQDN: app.amt.admin.ch
- CNAME Primus: app.primus.amt.admin.ch
- CNAME Campus: app.campus.amt.admin.ch
- To switch: open a Critical Incident in Remedy requesting that the FQDN be repointed (mutated) to the Campus CNAME; the result can be verified as sketched below.
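After the mutation has been carried out, the switch can be verified by checking that the FQDN now resolves via the Campus CNAME. A minimal sketch using dnspython and the example names from this scenario; whether a given resolver already returns the new target still depends on the DNS TTL:

```python
# Sketch: verify that the FQDN points to the Campus CNAME after the mutation.
# Uses the example names from this section; adapt them to the real records.
import dns.resolver  # pip install dnspython

FQDN = "app.amt.admin.ch"
EXPECTED_CNAME = "app.campus.amt.admin.ch."

def current_cname(fqdn: str) -> str:
    """Return the CNAME target the resolver currently sees for fqdn."""
    answer = dns.resolver.resolve(fqdn, "CNAME")
    return answer[0].target.to_text()

cname = current_cname(FQDN)
state = "switch visible" if cname == EXPECTED_CNAME else "old record still cached"
print(f"{FQDN} -> {cname} ({state})")
```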