Failover
Configure failover to provide backup user request processing and activity logging.
Configure backup request processing
To configure backup request processing, specify the backup policy server hosts by using the submitmasters keyword in /etc/pb.settings on the submit host. Hosts are tried in the order given in the submitmasters keyword list, in the order specified by DNS, or in random order if the randomizesubmitmasters keyword is set.
Failover for submitmasters means that if a host does not respond within masterdelay milliseconds, a second connection is attempted to the next host in the (possibly randomized) list. Once a connection is made, the EPM-UL protocol is negotiated. Those negotiations include SSL/Kerberos/networkencryption and other protocol information.
- If those negotiations succeed, that connection is used for request approval and further failover does not happen.
- If the connection is not answered, or the negotiations do not succeed, more hosts are tried in succession until one is successful or the list is exhausted.
Connections are attempted in order. However, depending on the masterdelay value and on the speed and load of the network and hosts, the connection that finally succeeds may be to a server listed later in the list.
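The connect-then-negotiate sequence described above can be sketched in Python as follows. This is a simplified illustration, not the product's implementation: it tries hosts strictly one at a time and ignores the overlapping attempts that masterdelay allows. The function name and the connect and negotiate callbacks are hypothetical stand-ins.

```python
from typing import Callable, Optional, Sequence


def pick_policy_server(
    hosts: Sequence[str],
    connect: Callable[[str], bool],
    negotiate: Callable[[str], bool],
) -> Optional[str]:
    """Return the first host that both answers and completes negotiation.

    `connect` and `negotiate` are hypothetical callbacks standing in for
    the real connection attempt and protocol negotiation.
    """
    for host in hosts:
        if not connect(host):
            # Connection was not answered: move on to the next host.
            continue
        if negotiate(host):
            # Negotiation succeeded: use this host, no further failover.
            return host
        # Negotiation failed: keep trying the remaining hosts.
    # The list is exhausted without a usable policy server.
    return None
```

For example, with hosts ["a", "b", "c"], where "a" never answers and only "c" completes negotiation, the function returns "c".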
Configure backup logging
To configure backup logging, specify the backup log hosts by using the logservers keyword in /etc/pb.settings on the policy server host.
- EPM-UL v22.3 introduces a second level of failover for logservers. It takes place when the protocol negotiations are successful but an error then occurs (for example, disk full or insecure log file/directory). This mechanism tries the listed logservers in succession until a logserver reports that it has successfully logged the event. When randomizelogservers is used, only the first logserver attempt is randomized. Diagnostic messages indicating the failed servers are logged. The transparentfailover keyword controls whether the end user sees the diagnostic messages.
- Previous versions behave similarly to the policy server connection: connection attempts are made until the protocol negotiations succeed. The relevant keywords are logservers, logserverdelay, and randomizelogservers.
For more information on failover, see submitmasters and logservers.
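The second-level logserver failover can be sketched as below. This is an illustrative model only. It assumes that "only the first logserver attempt is randomized" means one server is picked at random to go first while the rest keep their listed order; that reading, the function names, and the write_event callback are assumptions.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def attempt_order(
    servers: Sequence[str],
    randomize: bool,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Order in which log servers are tried.

    Assumption: with randomization on, only the first attempt is random;
    the remaining servers keep their listed order.
    """
    order = list(servers)
    if randomize and order:
        rng = rng or random.Random()
        first = order.pop(rng.randrange(len(order)))
        order.insert(0, first)
    return order


def log_with_failover(
    servers: Sequence[str],
    write_event: Callable[[str], bool],
    randomize: bool = False,
    rng: Optional[random.Random] = None,
) -> Tuple[Optional[str], List[str]]:
    """Try servers until one reports success.

    Returns (server that logged the event or None, servers that failed).
    `write_event` is a hypothetical callback that returns True on success.
    """
    failed: List[str] = []
    for server in attempt_order(servers, randomize, rng):
        if write_event(server):
            return server, failed
        # For example: disk full, or an insecure log file or directory.
        failed.append(server)
    return None, failed
```

The list of failed servers corresponds to the diagnostic messages mentioned above.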
Fine tuning policy server and failover connection timing
masterdelay
- Version 4.0.0 and later: masterdelay setting available.
When a request is submitted, the policy server hosts that are listed in the submitmasters line are tried in the order they appear, from left to right. The masterdelay setting enables the administrator to adjust the amount of time between failover attempts.
Without a specified time-out, the client tries the first policy server host on the submitmasters line. If it does not receive a response within 500 milliseconds, then the client adds the second policy server host. If neither responds in the next 500 milliseconds, then the client adds the third policy server host, and so on. By specifying a masterdelay, you can change the 500 millisecond waiting period before the client goes on to the next policy server host.
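The staggered schedule described above amounts to simple arithmetic. The following is a minimal sketch, with a hypothetical function name, of when each policy server host is added to the set of connection attempts, assuming that no earlier attempt has completed in the meantime.

```python
from typing import List, Sequence, Tuple


def attempt_start_times(
    hosts: Sequence[str], masterdelay_ms: int = 500
) -> List[Tuple[str, int]]:
    """Pair each host with the time (ms after submission) at which the
    client starts trying it, if nothing has answered before then."""
    return [(host, index * masterdelay_ms) for index, host in enumerate(hosts)]
```

With the default of 500 milliseconds and three hosts, the attempts start at 0, 500, and 1000 milliseconds. With masterdelay 200, they start at 0, 200, and 400 milliseconds.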
If each connection fails to connect within the masterdelay number of milliseconds, then the connect code uses select() to wait for any of the connections to complete. The amount of time allowed for this wait depends on whether the client is a standard client or a cached client.
- Standard client: The total time allowed for the select() is masterdelay*5.
- Cached client: The time allowed is masterdelay*3. The availability of the cached client means that a backup method is in place to prevent the client from failing; therefore a smaller wait for the select() following failure of individual connection attempts is reasonable.
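The two multipliers above can be expressed as a small helper. This is only a sketch of the arithmetic; the function name and the boolean parameter are hypothetical.

```python
def final_wait_ms(masterdelay_ms: int, cached_client: bool = False) -> int:
    """Time allowed for the final select() wait, in milliseconds.

    Standard clients wait masterdelay*5; cached clients wait masterdelay*3.
    """
    multiplier = 3 if cached_client else 5
    return masterdelay_ms * multiplier
```

With the default masterdelay of 500, a standard client waits up to 2500 milliseconds and a cached client up to 1500 milliseconds.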
With a masterdelay of 0 milliseconds, you get the fastest possible connection, but the policy server you connect to may not be predictable. You might also increase network traffic, depending on the number of connections that are opened.
With a larger masterdelay, you can increase the predictability, but you might also increase the time needed to form a failover connection. The longer the delay, the more predictable the sequence is.
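The trade-off between speed and predictability can be illustrated with a toy model. It assumes each host has a fixed connection latency, that host number i is tried i*masterdelay milliseconds after submission, and that no new attempt starts once an earlier one has completed. The function name and the latency figures are hypothetical, and the final select() wait limit is ignored.

```python
from typing import Optional, Sequence, Tuple


def first_to_answer(
    latencies_ms: Sequence[Tuple[str, Optional[int]]],
    masterdelay_ms: int,
) -> Optional[str]:
    """Return the host whose connection completes first.

    `latencies_ms` lists (host, latency) in submitmasters order; a latency
    of None means the host never answers.
    """
    best: Optional[Tuple[int, str]] = None  # (completion time, host)
    for index, (host, latency) in enumerate(latencies_ms):
        start = index * masterdelay_ms
        if best is not None and start >= best[0]:
            # An earlier attempt has already completed, so this host
            # is never tried.
            continue
        if latency is None:
            continue
        done = start + latency
        if best is None or done < best[0]:
            best = (done, host)
    return best[1] if best else None
```

For hosts m1 (300 ms), m2 (50 ms), and m3 (50 ms), a masterdelay of 0 selects m2, the fastest responder, while a masterdelay of 500 selects m1, the first host in the list.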
Example
masterdelay 200
Default
masterdelay 500
Used on
Submit hosts
masterprotocoltimeout
- Version 4.0.0 and later: masterprotocoltimeout setting available.
After a connection is established, the programs perform some protocol checks to verify a proper and working connection. Some types of protocol failures could take a long time to determine (for example, wrong service running on the policy server port, or mismatched encryption types/keys).
The masterprotocoltimeout setting enables the administrator to control the maximum time to wait for protocol completion. If a protocol step does not complete within the specified number of milliseconds, then the client continues to try the next policy server host in sequence. A value of -1 indicates no protocol timeout.
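As an illustration of the -1 convention, the sketch below converts a millisecond setting into a timeout value in seconds, using None for "no timeout" (the same convention that Python's socket.settimeout() accepts). The function name is hypothetical, and rejecting other negative values is an assumption, since the source does not say how they are handled.

```python
from typing import Optional


def protocol_timeout_seconds(setting_ms: int = 500) -> Optional[float]:
    """Convert a protocol timeout setting (ms) to seconds.

    -1 means no protocol timeout and is returned as None.
    """
    if setting_ms == -1:
        return None
    if setting_ms < 0:
        # Assumption: other negative values are treated as invalid.
        raise ValueError("timeout must be -1 or a non-negative number of ms")
    return setting_ms / 1000.0
```

With masterprotocoltimeout 2000, each protocol step would be allowed 2.0 seconds before the client moves on to the next policy server host.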
Example
masterprotocoltimeout 2000
Default
masterprotocoltimeout 500
Used on
- Policy server hosts
- Run hosts
- Submit hosts
Fine tuning log servers and failover connection timing
logserverdelay
- Version 4.0.0 and later: logserverdelay setting available.
When a log request is processed, the log servers that are listed in the logservers line are tried in the order they appear, from left to right. The logserverdelay setting enables the administrator to adjust the amount of time between failover attempts.
Without a specified time-out, the logging program (for example, pbrun, pbmasterd, or pblocald) tries the first log server on the logservers line. If it does not receive a response within 500 milliseconds, then it adds the second log host. If neither responds in the next 500 milliseconds, then it adds the third log host, and so on. By specifying a logserverdelay, you can change the 500 millisecond waiting period before the logging program goes on to the next log server.
With a logserverdelay of 0 milliseconds, you get the fastest possible connection, but the log server that you connect to may not be predictable. You might also increase network traffic, depending on the number of connections that are opened.
With a larger logserverdelay you can increase the predictability, but you might also increase the time needed to form a failover connection. The longer the delay, the more predictable the sequence is.
Example
logserverdelay 2500
Default
logserverdelay 500
Used on
- Policy server hosts
- Run hosts
- Submit hosts (by pbksh and pbsh, when a policy server host is not available)
logserverprotocoltimeout
- Version 4.0.0 and later: logserverprotocoltimeout setting available.
After a connection is established, the programs perform some protocol checks to verify a proper and working connection. Some types of protocol failures can take a very long time to determine (for example, the wrong service running on the log server port, or mismatched encryption types/keys).
The logserverprotocoltimeout setting enables the administrator to control the maximum time to wait for protocol completion. If a protocol step does not complete within the specified number of milliseconds, then the logging program continues to try the next log server in sequence. A value of -1 indicates no protocol timeout.
If the iologack setting is used, then the logserverprotocoltimeout setting also controls how long a submit host should wait for an acknowledgment from the log host.
Example
logserverprotocoltimeout 2000
Default
logserverprotocoltimeout 500
Used on
- Log hosts
- Policy server hosts
- Run hosts
- Submit hosts (by pbksh and pbsh, when a policy server host is not available)
For more information, see iologack.
randomizelogservers
- Version 9.2.0 and earlier: randomizelogservers setting not available.
- Version 9.3.0 and later: randomizelogservers setting available.
The randomizelogservers setting forces the policy server/submit host/run host to choose a log server host at random, rather than choosing the first available log server host that is specified in the logservers setting. This feature balances the load among multiple log server hosts.
Note
The use of randomizelogservers can cause accept and finish events to be located on different log servers if the log servers are configured with eventdestinations set to a flat file (authevt=) or an SQLite Database (authevt=db).
However, if eventdestinations is set to authevt= (same ODBC Oracle or MySQL database on all the log servers), then the accept and finish events are stored on the same Oracle or MySQL server. The default randomizelogservers setting is no.
Note
Do not use the randomizelogservers keyword together with DNS SRV lookups. The randomizelogservers keyword can cause accept and finish events to be logged on different log servers, which makes it necessary to merge iologs.
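The effect described in the notes can be illustrated with a toy model in which the accept event and the finish event each choose a log server independently. This is a sketch under that assumption, not the product's selection logic; the function name is hypothetical.

```python
import random
from typing import Dict, Optional, Sequence


def place_events(
    servers: Sequence[str],
    randomize: bool,
    rng: Optional[random.Random] = None,
) -> Dict[str, str]:
    """Return which log server receives the accept and finish events.

    Assumption: each event picks its server independently. `servers`
    must not be empty.
    """
    rng = rng or random.Random()

    def pick() -> str:
        # Without randomization, the first listed server is used.
        return rng.choice(list(servers)) if randomize else servers[0]

    return {"accept": pick(), "finish": pick()}
```

With randomization off, both events land on the first listed server. With randomization on and two or more servers, the two events can land on different servers, which is the case that requires merging.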
Example
randomizelogservers yes
Default
randomizelogservers no
Used on
- Submit hosts
- Run hosts
- Policy servers
Acknowledge failovers
transparentfailover
- Version 5.1.1 and earlier: transparentfailover setting not available.
- Version 5.1.2 and later: transparentfailover setting available.
A transparent failover occurs when an initial connection to a policy server or log server host fails and the program fails over to another available policy server or log server host in the list. To notify the user that a failover has occurred, error messages from the failed connection are displayed to the user.
The transparentfailover setting enables you to suppress the following failover error messages:
- Any Kerberos initialization error
- 3084 initMangle failure during startup
- 3089 Could not send initial protocol header to Policy Server
- 3090 Did not receive initial protocol header from Policy Server
- 8534 Policy Server on %s is not SSL enabled
- 1913 Invalid Policy Server daemon on Policy Server host %s
When transparentfailover is set to yes, the failover error messages listed above are suppressed. To display failover error messages, set transparentfailover to no in the pb.settings file.
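The suppression rule can be sketched as a simple filter. The numeric codes come from the list above; the FailoverMessage type, its kerberos_init flag, and the function name are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Message codes that the list above names as suppressible.
SUPPRESSIBLE_CODES = frozenset({3084, 3089, 3090, 8534, 1913})


@dataclass(frozen=True)
class FailoverMessage:
    """Hypothetical representation of one diagnostic message."""
    text: str
    code: Optional[int] = None
    kerberos_init: bool = False  # True for any Kerberos initialization error


def messages_to_display(
    messages: Iterable[FailoverMessage],
    transparentfailover: bool = True,
) -> List[FailoverMessage]:
    """Return the messages the end user would see."""
    if not transparentfailover:
        # transparentfailover no: nothing is suppressed.
        return list(messages)
    return [
        m for m in messages
        if not (m.kerberos_init or m.code in SUPPRESSIBLE_CODES)
    ]
```

In this model, a message with code 3089 is hidden when transparentfailover is yes and shown when it is no, while a message with a code outside the list is always shown.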
Example
transparentfailover yes
Default
transparentfailover yes
Used on
Submit hosts