Tech Brief: Local Host Cache / High Availability mode for Citrix Desktops-as-a-Service (DaaS)

Overview

Local Host Cache (LHC), in the context of Citrix Desktops-as-a-Service (DaaS), can be thought of as an insurance policy. This insurance policy comes into play when, for whatever reason (outages, connection issues, internet blackouts, and so on), the Citrix Cloud Connectors are not able to communicate with the Citrix brokering service (part of Citrix DaaS and henceforth referred to as the cloud broker). A communication breakdown between a resource location and the cloud broker can affect end users; Local Host Cache is designed to mitigate that impact.

Local Host Cache is a combination of several services and components which come together to take over the brokering responsibilities until the connection to the cloud broker can be reestablished.

Citrix Virtual Apps and Desktops - Normal Operations

Figure 1: Conceptual representation of Citrix DaaS showcasing components relevant for HA mode

Citrix DaaS Local Host Cache Requirements

The following checklist outlines the components, settings, and configurations required for Local Host Cache (LHC) to function in outage mode, including a number of sizing and communication considerations. The checklist focuses on a single resource location deployment; however, the same steps should be applied to all Citrix DaaS resource locations.

Cloud Connectors

  • Cloud Connector Resources: Each Cloud Connector has at least 4 vCPUs and 6 GB of RAM.
  • Elected Connector: The elected Cloud Connector for brokering registrations and sessions is the Cloud Connector whose name (hostname) comes first in alphabetical order within the resource location. If that connector is not available, the next Cloud Connector in alphabetical order becomes the elected Cloud Connector.
  • Outage Mode: During outage mode, administrators can run a limited range of Broker module PowerShell Remote SDK commands pointed at localhost on the elected Cloud Connector. This allows administrators to determine registration status and session brokering for that resource location while it is in outage mode.
  • Event Logs: Administrators should verify that no Error or Warning events are logged for the Citrix High Availability Service and the Citrix Configuration Sync Service on the Cloud Connectors.
    • Verify that Events 503 and 504 (Config Sync start and end) are logged at regular intervals. To check this, place a machine in maintenance mode; this triggers a configuration sync within 1 minute.

Citrix NetScaler Gateway

  • When configuring Secure Ticket Authority (STA) servers in the Gateway configuration, it is recommended that all Cloud Connectors that will act as brokers during outage mode are added to the STA configuration of the gateway.
  • The STA servers listed in the Gateway configuration should also be configured as STA servers in the Citrix StoreFront Gateway configuration; the two lists should be identical.
  • If there are multiple tenants, Cloud Connectors from only one tenant may be configured in Citrix Gateway and Citrix StoreFront.
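
The requirement that the Gateway and StoreFront STA lists match can be checked mechanically. The following is a minimal sketch in Python, assuming the two lists have already been exported from each component by some other means. The function name, the normalization rules, and the sample URLs are all hypothetical and are not part of any Citrix tooling.

```python
def sta_list_differences(gateway_stas, storefront_stas):
    """Return STA entries that appear in only one of the two lists.

    Assumption for this sketch: entries are compared case-insensitively,
    ignoring surrounding whitespace and a trailing slash.
    """
    def normalize(url):
        return url.strip().rstrip("/").lower()

    gateway = {normalize(u) for u in gateway_stas}
    storefront = {normalize(u) for u in storefront_stas}
    return {
        "only_in_gateway": sorted(gateway - storefront),
        "only_in_storefront": sorted(storefront - gateway),
    }


# Hypothetical example values, not taken from a real deployment.
gateway_list = ["http://cc01.example.local", "http://cc02.example.local"]
storefront_list = ["http://CC01.example.local/", "http://cc03.example.local"]
diff = sta_list_differences(gateway_list, storefront_list)
```

Both result lists being empty indicates that the two configurations agree; any entry in either list points to a connector that should be added to the other component.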

Load Balancers

The Citrix ADC appliance provides a built-in monitor, CITRIX-XD-DDC, which monitors the Citrix Virtual Apps and Desktops Delivery Controller servers. In the context of Citrix DaaS, the Cloud Connectors are equivalent to the Delivery Controller servers. The monitor sends a probe, in the form of an XML message, to the configured controller/connector servers. If the server responds to the probe with the identity of the farm, the probe is considered successful and the server’s status is marked as UP. If the response does not have a success code, or the identity of the server farm is not present in the response, the probe is considered a failure and the server’s status is marked as DOWN.

  • When using Load Balancers for XML Services (Citrix StoreFront Delivery Controllers), it is recommended that the CITRIX-XD-DDC Monitor is configured instead of the default TCP-SYN Monitor; this will allow the load balancer to mark all other non-elected Cloud Connectors DOWN when outage mode is in effect.

More information about the CITRIX-XD-DDC monitor is available in the Citrix ADC documentation.
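
The UP/DOWN decision described above can be illustrated with a short Python sketch. It does not reproduce the real XML probe or the ADC implementation. The response is modeled as a hypothetical dictionary with a `success` flag and a `farm` field, and treating a missing reply as DOWN is an assumption of this sketch.

```python
def probe_status(response):
    """Classify one probe result as "UP" or "DOWN".

    `response` is a hypothetical, already-parsed reply: a dict with a
    boolean "success" flag and an optional "farm" identity string.
    None models a server that did not reply (assumed to count as DOWN).
    """
    if response is None:
        return "DOWN"
    if not response.get("success"):
        return "DOWN"  # no success code in the response
    if not response.get("farm"):
        return "DOWN"  # farm identity missing from the response
    return "UP"


# Hypothetical probe results for three connectors during an outage,
# where only the elected connector answers the probe successfully.
responses = {
    "cc01": {"success": True, "farm": "farm-a"},
    "cc02": {"success": False},
    "cc03": None,
}
statuses = {name: probe_status(reply) for name, reply in responses.items()}
```

This is the behavior that lets the load balancer send traffic only to the elected connector, which a plain TCP-SYN check cannot distinguish.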

Citrix StoreFront

  • All StoreFront servers (standalone or cluster) should be configured with the Advanced Health Check setting for zone availability; this allows catalogs in different zones that are configured under a single delivery group to be brokered to the correct resource location, even during outage mode events.
  • The STA servers listed in the Gateway configuration should also be configured as STA servers in the Citrix StoreFront Gateway configuration; the two lists should be identical.
  • If multiple tenants exist, Cloud Connectors from only one tenant may be configured in Citrix Gateway and Citrix StoreFront.

Citrix DaaS

  • Pooled Delivery Groups: For security reasons, Pooled Delivery Groups are not usable in HA mode by default. If you are comfortable with the security tradeoff, this behavior can be overridden so that machines do not automatically shut down during an outage mode event. Because power-on-for-launch is not supported in outage mode, machines should remain powered on even after the user logs off.

    • This can be configured as part of Broker Site and Delivery Groups settings.
    • Configure Broker Site “ReuseMachinesWithoutShutdownInOutageAllowed” to True.
    • Configure Delivery Group “ReuseMachinesWithoutShutdownInOutage” to True.
    • To enable the setting for all new Delivery Groups, you can set Broker Site setting “DefaultReuseMachinesWithoutShutdownInOutage” to True.
    • These configurations should be performed before an outage mode event or testing begins, to allow the configurations to be propagated to the Cloud Connectors.
  • Static Delivery Groups: Only assigned desktops are supported during an outage mode event; auto-assignment of desktops for new users is not supported.
  • Delivery Group Tag Restrictions: Tag restrictions where tags are used to designate resource locations are not supported for session launches. When such tag restrictions are configured, and a StoreFront store’s advanced health check option is enabled, sessions might intermittently fail to launch.

Testing Local Host Cache

Local Host Cache is designed to work without any user intervention; it is fully autonomous. You can, however, verify that all the Cloud Connectors are correctly synced and ready to take over. The following steps are recommended:

  • Every connector performs synchronization of site configuration independently. The results of the sync are available in the Event Viewer. Refer to the Event Logs section of the product documentation for details of the events.
  • An outage can be simulated to test the Local Host Cache solution in an environment. Guidance on how to Force an outage is available in the product documentation. When forcing an outage, take special care to set all the connectors in a resource location to the forced outage mode.

Cloud Connector Components

There are several components within the Citrix Cloud Connector which are required for the Local Host Cache operations.

  • Configuration Synchronizer Service: The Configuration Synchronizer Service (CSS) checks with the cloud broker every 60 seconds to see if any configuration changes were made. The changes can be administrator-initiated (such as changing a delivery group property) or system actions (such as machine assignments). When changes are detected, CSS synchronizes the changes from the cloud broker to the connector machines.
  • LocalDB: The CSS imports the configuration data into a Microsoft SQL Server Express LocalDB database. A new instance of the database is created for every sync operation. Once the sync is completed successfully, the latest DB instance replaces the prior DB instance.
  • High Availability Service: The High Availability Service (HA Service) is a specialized Broker Service that provides the runtime brokering functionality during an outage. The HA Service is also referred to as the secondary broker.
  • Remote Broker Provider: The Remote Broker Provider performs several important functions:
    • It acts as a proxy that relays communication between the Citrix Virtual Delivery Agent (VDA) and the cloud broker
    • It acts as a proxy that relays communication between an on-premises StoreFront or an on-premises ADC and the various Citrix Cloud services
    • It determines when to switch a resource location between HA mode and normal operation


Figure 2: Connector components and services that play a part with HA mode

Proper sizing of the Cloud Connector machines is an important step to ensure that appropriate resources are available for the services when in High Availability mode. Review the scale and size considerations article to learn more.

High Availability Mode

Citrix Cloud Connectors are capable of entering or exiting HA mode automatically without administrator intervention. HA mode can be triggered by any of the following:

  • Failure of StoreFront enumerations or launch requests
  • Failure to relay communications between the VDA and the cloud broker
  • Failure to present Secure Ticket Authority (STA) requests to Citrix DaaS on behalf of an on-premises ADC during a launch

During HA mode, the HA Service takes over several important brokering functions: it enumerates resources, brokers session launches, and accepts VDA registrations. In addition, the HA Service acts as an STA provider. In a resource location with multiple Cloud Connectors, the HA Services communicate with one another as part of an election process. This election process determines which instance of the HA Service takes over if HA mode is triggered.

Citrix Virtual Apps and Desktops - High Availability Mode

Figure 3: Resource Location operating in HA mode

Entering/Exiting High Availability Mode

The decision to transition to HA mode depends on enumeration and launch traffic flowing through a given Cloud Connector instance. Only connector machines that have been configured as Delivery Controllers in StoreFront support HA mode detection and transition. This optimization is necessary to prevent unnecessary VDA registrations.

There are several states during the entire cycle of entering and exiting HA mode. During the Working Normally state, all components are healthy and all brokering transactions are handled by the cloud broker. The CSS is actively replicating the configurations from the cloud broker to the connector machines.

If some of the components fail to report healthy, the connector transitions to the Pending HA state. In this state, a comprehensive health check is initiated to determine the next course of action. The connectors interact with other connectors in the resource location to determine their health status. The decision to move from Pending HA to Initial HA is based on the health status of all the connectors in a given resource location. If the health checks are successful, the connectors transition back to the Working Normally state. Alternatively, if the health checks continue to fail, the connectors transition to the Initial HA state.

LHC State Diagram

Figure 4: Connector states for entering/exiting HA mode

During the Initial HA state, the High Availability Service on the connector takes over brokering responsibilities. All VDAs in the current resource location that were registered with the cloud broker register with the HA Service / secondary broker on the connector. At the end of Initial HA, health checks are initiated. If all health checks succeed, the state transitions to Pending Recovery; otherwise, the state transitions to Extended HA.

Health checks continue during the Extended HA period, and when all the health checks succeed, the state transitions to Pending Recovery. There is no maximum time duration for a connector to remain in the Extended HA state.

Pending Recovery serves as a waiting period, during which all components are healthy, before brokering is handed back to the cloud broker. If any of the health checks fail during Pending Recovery, the state transitions back to Extended HA. If all the health checks succeed for the entirety of the Pending Recovery period, the state transitions to Working Normally. With this transition, HA mode has exited, and all the VDAs in the resource location that were registered with the secondary broker now re-register with the cloud broker.
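
The state cycle described in this section can be summarized as a small transition table. The Python sketch below is illustrative only: it collapses each state's health checks into a single pass/fail result per step, ignores the timing of each period, and uses hypothetical names (`State`, `next_state`, `run`). It is not how the connector services are implemented.

```python
from enum import Enum


class State(Enum):
    WORKING_NORMALLY = "Working Normally"
    PENDING_HA = "Pending HA"
    INITIAL_HA = "Initial HA"
    EXTENDED_HA = "Extended HA"
    PENDING_RECOVERY = "Pending Recovery"


# (current state, health checks passed?) -> next state
TRANSITIONS = {
    (State.WORKING_NORMALLY, True): State.WORKING_NORMALLY,
    (State.WORKING_NORMALLY, False): State.PENDING_HA,
    (State.PENDING_HA, True): State.WORKING_NORMALLY,
    (State.PENDING_HA, False): State.INITIAL_HA,
    (State.INITIAL_HA, True): State.PENDING_RECOVERY,
    (State.INITIAL_HA, False): State.EXTENDED_HA,
    (State.EXTENDED_HA, True): State.PENDING_RECOVERY,
    (State.EXTENDED_HA, False): State.EXTENDED_HA,
    (State.PENDING_RECOVERY, True): State.WORKING_NORMALLY,
    (State.PENDING_RECOVERY, False): State.EXTENDED_HA,
}

# States in which the secondary broker, not the cloud broker, is brokering.
HA_STATES = {State.INITIAL_HA, State.EXTENDED_HA, State.PENDING_RECOVERY}


def next_state(state, healthy):
    """Return the state that follows `state` for one health check result."""
    return TRANSITIONS[(state, healthy)]


def run(health_results, start=State.WORKING_NORMALLY):
    """Replay a sequence of health check results and return every state visited."""
    states = [start]
    for healthy in health_results:
        states.append(next_state(states[-1], healthy))
    return states


# Three failed checks followed by two successful ones: a full outage cycle.
cycle = run([False, False, False, True, True])
```

Replaying the example sequence walks through Pending HA, Initial HA, Extended HA, and Pending Recovery before returning to Working Normally.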

Citrix DaaS instance with multiple Resource Locations

The cloud broker is designed to have a view of the whole deployment – across multiple resource locations. However, when in HA mode, each resource location becomes its own independent pod, and the elected secondary broker in each resource location will manage the brokering transactions only for the VDAs within that resource location. This design is a critical reason to ensure that the StoreFront is configured to include all the Cloud Connectors from all the resource locations that contain VDA workloads. The StoreFront can then distribute launch requests and effectively load balance users across multiple resource locations.

VDA Registrations

When the outage begins, the elected secondary broker (read the section on multiple connectors in a resource location to learn more about the election process) does not have current VDA registration data, but when a VDA communicates with it, a registration process is triggered. During that process, the elected secondary broker also gets current session information for that VDA. The VDA communicates with the broker at least every 5 minutes. Depending on when the last heartbeat was completed, it may take a VDA up to 5 minutes to realize the change from the cloud broker to the elected secondary broker and trigger the registration with the elected secondary broker.

While the elected secondary broker is handling connections, the remote broker provider monitors the connection to Citrix Cloud. When the connection is restored, the remote broker provider instructs the elected secondary broker to stop listening for connection information, and resumes conveying brokering operations to the cloud broker. The next time a VDA communicates with the remote broker provider, another registration process is triggered. The elected secondary broker removes any remaining VDA registrations from the previous outage. The CSS resumes synchronizing information when it learns that configuration changes have occurred in Citrix Cloud.

Multiple connectors in a Resource Location

Citrix recommends a minimum of 2 connectors in every resource location / zone. In each zone, there is an election process constantly running to make sure the HA Services know which connector machine would take over brokering responsibilities if there is an interruption. This election always happens – both during normal operations and when running in HA mode.

The CSS routinely provides the secondary broker with information about all Cloud Connectors in the resource location. Having that information, each connector knows about all peer connectors running in the resource location. The secondary brokers communicate with each other on a separate channel. Those services use an alphabetical list of the FQDNs of the machines they’re running on to determine the elected secondary broker in the zone if an outage occurs. When in HA mode, the elected secondary broker takes over brokering responsibilities while the other secondary brokers in the zone actively reject incoming connection and VDA registration requests.
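
The alphabetical election rule can be sketched in a few lines of Python. This is an illustration of the rule as described above, not the HA Service's actual code. The function name, the case-insensitive ordering, and the sample FQDNs are assumptions.

```python
def elect_secondary_broker(connector_fqdns, available=None):
    """Return the FQDN of the elected secondary broker, or None.

    The winner is the first connector in alphabetical order among those
    currently available. `available` is an optional collection of FQDNs
    that are reachable; if omitted, every connector is considered available.
    Assumption for this sketch: ordering and matching ignore letter case.
    """
    candidates = sorted(connector_fqdns, key=str.lower)
    if available is not None:
        reachable = {name.lower() for name in available}
        candidates = [c for c in candidates if c.lower() in reachable]
    return candidates[0] if candidates else None


# Hypothetical connectors in one resource location.
connectors = ["cc-b.example.local", "cc-a.example.local", "cc-c.example.local"]

elected = elect_secondary_broker(connectors)
# If the first connector fails during the outage, the next one is elected.
fallback = elect_secondary_broker(
    connectors, available=["cc-b.example.local", "cc-c.example.local"]
)
```

Because the rule is deterministic, every connector that sees the same list of available peers arrives at the same result without needing a coordinator.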

If an elected secondary broker fails during an outage, another secondary broker is elected to take over, and VDAs register with the newly elected secondary broker. During HA mode, if a connector is restarted:

  • If that connector is not the elected secondary broker, the restart has no impact.
  • If that connector is the elected secondary broker, a different Cloud Connector is elected, causing VDAs to register with the new elected secondary broker. After the restarted Cloud Connector powers on, it automatically takes over brokering, which causes VDAs to register again. In this scenario, performance can be affected during the registrations.

The event log provides information about elections. For more information on the associated events, review the event logs article in the product documentation.

Local Host Cache with Multiple Resource Locations

Load balancing across connectors in a resource location

The on-premises StoreFront sends a heartbeat message to all the Cloud Connectors configured in its store every 60 seconds by default. Only healthy Cloud Connectors (that respond successfully to the heartbeat) are considered for load balancing app enumeration and launch requests. The same heartbeat request to the Cloud Connectors also activates the connector to participate in the HA mode algorithm described in the preceding sections. To ensure that all resource locations are enabled to perform in HA mode, it is critical to ensure that the on-premises StoreFront has all the Cloud Connectors identified as Delivery Controllers in the StoreFront configuration. Failure to have appropriate StoreFront configurations might result in loss of capacity when the site enters HA mode.
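
A simple way to reason about this requirement is to compare the connectors in each resource location against the Delivery Controllers listed in StoreFront. The Python sketch below assumes both inventories are available as plain lists. The function name and the connector names are hypothetical, and the sketch does not query StoreFront or Citrix Cloud.

```python
def missing_from_storefront(connectors_by_location, storefront_controllers):
    """Return {resource location: [connectors not listed in StoreFront]}.

    Only resource locations with at least one missing connector appear in
    the result. Names are compared case-insensitively (an assumption).
    """
    configured = {name.lower() for name in storefront_controllers}
    missing = {}
    for location, connectors in connectors_by_location.items():
        absent = sorted(c for c in connectors if c.lower() not in configured)
        if absent:
            missing[location] = absent
    return missing


# Hypothetical inventory: RL2's connectors were never added to StoreFront.
inventory = {
    "RL1": ["cc1a.example.local", "cc1b.example.local"],
    "RL2": ["cc2a.example.local", "cc2b.example.local"],
}
storefront = ["cc1a.example.local", "CC1B.example.local"]
gaps = missing_from_storefront(inventory, storefront)
```

A resource location that appears in the result receives no StoreFront heartbeats on the listed connectors, so it may not take part in HA mode and its capacity may be lost during an outage.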


Figure 5: Deployment with multiple Resource Locations where one RL is not HA ready due to missing configurations

HA mode for resource locations publishing the same apps/desktops

One of the Citrix DaaS deployment models includes multiple resource locations, all publishing identical applications and desktops. For example, a deployment containing applications from a single multi-session image or pooled VDI desktops may be deployed uniformly across all resource locations.

When such a deployment is operating in HA mode, users may be directed to any of the VDAs in the various configured resource locations. In this scenario, the StoreFront load balances requests to all configured Cloud Connectors across various resource locations.

HA Mode for resource locations publishing different apps/desktops

A Citrix DaaS deployment may also have certain applications available only in a specific subset of resource locations. For example, a Japanese OS desktop may be available only on the VDAs running in Japan. Another example is with static/assigned desktops that are user specific and tied to a specific resource location after assignment.

When such a deployment operates in HA mode, the application or desktop launch requests need to be routed to the appropriate Cloud Connector in the specific resource locations where the apps and desktops reside, since cross-zone brokering is not available in HA mode. The AdvancedHealthCheck feature offered by StoreFront 1912 LTSR Cumulative Update 1 or later facilitates such deployments, as described in the following paragraph.

The StoreFront enumerates applications and desktops from Cloud Connectors in any region. The enumeration information now contains a mapping between the resource (an application or a desktop) and the resource locations where the application/desktop resides. This mapping is used to direct the user launch requests to specific resource locations. Review the configuration steps listed in the product documentation to enable the StoreFront to utilize this functionality.
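
The effect of the resource-to-location mapping can be illustrated with a small Python sketch. The data structures, function name, and sample values below are hypothetical; the real mapping is carried in the enumeration data and handled internally by StoreFront.

```python
def connectors_for_launch(resource, resource_locations, connectors_by_location):
    """Return the connectors that can broker a launch of `resource`.

    `resource_locations` maps a published resource to the resource
    locations where it resides; `connectors_by_location` maps each
    resource location to its Cloud Connectors. Both are hypothetical
    stand-ins for the mapping carried in the enumeration data.
    """
    eligible = []
    for location in resource_locations.get(resource, []):
        eligible.extend(connectors_by_location.get(location, []))
    return eligible


# Hypothetical deployment: one desktop exists only in the Japan location.
resource_map = {
    "Japanese Desktop": ["Japan"],
    "Office Apps": ["Japan", "US"],
}
connector_map = {
    "Japan": ["cc-jp-1", "cc-jp-2"],
    "US": ["cc-us-1", "cc-us-2"],
}
japan_only = connectors_for_launch("Japanese Desktop", resource_map, connector_map)
anywhere = connectors_for_launch("Office Apps", resource_map, connector_map)
```

A resource published in a single location is launched only through that location's connectors, while a resource published everywhere can still be spread across all of them.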

Architectures involving Citrix ADC

Citrix ADC for resource locations publishing different apps/desktops

For architectures involving Citrix ADC with resource locations publishing different apps and desktops, the following configurations need to be performed.

  • Aggregate the Cloud Connectors in each resource location to a unique VIP in the ADC load balancer.
  • Enable the StoreFront AdvancedHealthCheck feature as described in the product documentation.
  • Map each zone / resource location to an ADC Virtual IP (VIP).
  • Add all ADC VIPs as Delivery Controllers to the StoreFront.
  • Set up the ADC load balancer to monitor the Cloud Connectors in each resource location via the CITRIX-XD-DDC monitor.


Figure 6: Deployment with multiple Resource Locations and Citrix ADC

Pooled Desktop VDA Workload Considerations

When a user logs off a pooled desktop VDA, the VDA’s image is reset to remove any user-specific data on the VDA. When a site is running in HA mode, the reset operation is not available. Hence, when a user logs off from a pooled desktop VDA, the machine is placed into maintenance mode. This behavior prevents a tainted image from being made available to another user.

Depending on the security requirements of the implementation, this behavior can be modified by applying a site-wide and a per-delivery-group update. More information about how to override the default behavior is available in the product documentation.
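
The logoff behavior and its override can be summarized as a small decision function. The Python sketch below is a model of the behavior described in this brief, not Citrix code. The parameters mirror the Broker Site setting ReuseMachinesWithoutShutdownInOutageAllowed and the Delivery Group setting ReuseMachinesWithoutShutdownInOutage, and the sketch assumes that both must be True for the override to take effect.

```python
def pooled_vda_after_logoff(in_ha_mode, site_allows_reuse=False, group_reuse=False):
    """Describe what happens to a pooled desktop VDA after a user logs off.

    `site_allows_reuse` models ReuseMachinesWithoutShutdownInOutageAllowed
    on the Broker Site; `group_reuse` models
    ReuseMachinesWithoutShutdownInOutage on the Delivery Group.
    Assumption for this sketch: both must be True to override the default.
    """
    if not in_ha_mode:
        return "image reset"  # normal operations
    if site_allows_reuse and group_reuse:
        return "reused without reset"  # accepted security tradeoff
    return "maintenance mode"  # default during HA mode


normal = pooled_vda_after_logoff(in_ha_mode=False)
default_outage = pooled_vda_after_logoff(in_ha_mode=True)
overridden = pooled_vda_after_logoff(
    in_ha_mode=True, site_allows_reuse=True, group_reuse=True
)
```

The override trades isolation for capacity: machines stay available during an outage, but the next user may receive a desktop that has not been reset.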