Diffstat (limited to 'contrib/ntp/html/prefer.htm')
-rw-r--r-- | contrib/ntp/html/prefer.htm | 332 |
1 files changed, 332 insertions, 0 deletions
diff --git a/contrib/ntp/html/prefer.htm b/contrib/ntp/html/prefer.htm new file mode 100644 index 0000000000000..edb51520dcbc8 --- /dev/null +++ b/contrib/ntp/html/prefer.htm @@ -0,0 +1,332 @@ +<HTML> +<HEAD> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> + <META NAME="GENERATOR" CONTENT="Mozilla/4.01 [en] (Win95; I) [Netscape]"> + <TITLE>Mitigation Rules and the ``prefer'' Keyword +</TITLE> +</HEAD> +<BODY> + +<H3> +Mitigation Rules and the <TT>prefer</TT> Keyword</H3> + +<HR> +<H4> +Introduction</H4> +The mechanics of the NTP algorithms which select the best data sample from +each available peer and the best subset of the peer population have been +finely crafted to resist network jitter, faults in the network or peer +operations, and to deliver the best possible accuracy. Most of the time +these algorithms do a good job without requiring explicit manual tailoring +of the configuration file. However, there are times when the accuracy can +be improved by some careful tailoring. The following sections explain how +to do this using explicit configuration items and special signals, when +available, that are generated by some radio clocks. + +<P>In order to provide robust backup sources, primary (stratum-1) servers +are usually operated in a diversity configuration, in which the server +operates with a number of remote peers in addition to one or more radio +or modem clocks operating as local peers. In these configurations the suite +of algorithms used in NTP to refine the data from each peer separately +and to select and weight the data from a number of peers is used with +the entire ensemble of remote peers and local peers. As the result of these +algorithms, a set of <I>survivors</I> is identified which can presumably +provide the most reliable and accurate time. Ordinarily, the individual +clock offsets of the survivors are combined on a weighted average basis +to produce an offset used to control the system clock.
+ +<P>However, because of small but significant systematic time offsets between +the survivors, it is in general not possible to achieve the lowest jitter +and highest stability in these configurations. This happens because the +selection algorithm tends to <I>clockhop</I> between survivors of substantially +the same quality, but showing small systematic offsets between them. In +addition, there are a number of configurations involving pulse-per-second +(PPS) signals, modem backup services and other special cases, so that a +set of mitigation rules becomes necessary to select a single peer from +among the survivors. These rules are based on a set of special characteristics +of the various peers and reference clock drivers specified in the configuration +file. +<H4> +The <TT>prefer</TT> Peer</H4> +The mitigation rules are designed to provide an intelligent selection between +various peers of substantially the same statistical quality. They are designed +to provide the best quality time without compromising the normal operation +of the NTP algorithms. The mitigation scheme in its present form is not +an integral component of the NTP Version 3 specification RFC-1305, but +is to be included in the version 4 specification when it is published. +The scheme is based on the concept of a <I>prefer peer</I>, which is specified +by including the <TT>prefer</TT> keyword with the associated <TT>server</TT> +or <TT>peer</TT> command in the configuration file. This keyword can be +used with any peer or server, but is most commonly used with a radio clock. +While the scheme does not forbid it, it does not seem useful to designate +more than one peer as preferred, since the additional complexities to mitigate +among them do not seem justified from on-air experience. + +<P>The prefer scheme works on the set of peers that have survived the sanity +checks and intersection algorithms of the clock selection procedures. 
Ordinarily, +the members of this set can be considered <I>truechimers</I> and any one +of them could in principle provide correct time; however, due to various +error contributions, not all can provide the most accurate and stable time. +The job of the clustering algorithm, which is invoked at this point, is +to select the best subset of the survivors providing the least variance +in the combined ensemble average, compared to the variance in each member +of the subset separately. The detailed operation of the clustering algorithm, +which is given in the specification, is not important here, other than +to point out it operates in rounds, where a survivor, presumably the worst +of the lot, is discarded in each round until one of several termination +conditions is met. + +<P>In the prefer scheme the clustering algorithm is modified so that the +prefer peer is never discarded; on the contrary, its potential removal +becomes a termination condition. If the original algorithm were about to +toss out the prefer peer, the algorithm terminates right there. The prefer +peer can still be discarded by the sanity checks and intersection algorithms, +of course, but it will always survive the clustering algorithm. If it does +not survive or for some reason it fails to provide updates, it will eventually +become unreachable and the clock selection will remitigate to select the +next best source. + +<P>Along with this behavior, the clock selection procedures are modified +so that the combining algorithm is not used when a prefer peer is present. +Instead, the offset of the prefer peer is used exclusively as the synchronization +source. 
In the usual case involving a radio clock and a flock of remote +stratum-1 peers, and with the radio clock designated a prefer peer, the +result is that the high quality radio time disciplines the server clock +as long as the radio itself remains operational and with valid time, as +determined from the remote peers, sanity checks and intersection algorithm. +<H4> +Peer Classification</H4> +In order to understand the effects of the various intricate schemes involved, +it is necessary to understand some arcane details on how the algorithms +decide on a synchronization source, when more than one source is available. +This is done on the basis of a set of explicit mitigation rules, which +define special classes of remote and local peers as a function of configuration +declarations and reference clock driver type: +<OL> +<LI> +The prefer peer is designated using the <TT>prefer</TT> keyword with the +<TT>server</TT> or <TT>peer</TT> commands. All other things being equal, +this peer will be selected for synchronization over all other survivors +of the clock selection procedures.</LI> + +<BR> +<LI> +When a PPS signal is connected via the PPS Clock Discipline driver (type +22), this is called the <I>PPS peer</I>. This driver provides precision +clock corrections only within one second, so is always operated in conjunction +with another peer or reference clock driver, which provides the seconds +numbering. The PPS peer is active only under conditions explained below.</LI> + +<BR> +<LI> +When the Undisciplined Local Clock driver (type 1) is configured, this +is called the <I>local clock peer</I>. 
This is used either as a backup +reference source (stratum greater than zero), should all other synchronization +sources fail, or as the primary reference source (stratum zero) in cases +where the kernel time is disciplined by some other means of synchronization, +such as the NIST <TT>lockclock</TT> scheme, or another synchronization +protocol, such as the Digital Time Synchronization Service (DTSS).</LI> + +<BR> +<LI> +When a modem driver such as the Automated Computer Time Service driver +(type 18) is configured, this is called the <I>modem peer</I>. This is +used either as a backup reference source, should all other primary sources +fail, or as the (only) primary reference source.</LI> + +<BR> +<LI> +Where support is available, the PPS signal may be processed directly by +the kernel, as described in the <A HREF="kern.htm">A Kernel Model for Precision +Timekeeping</A> page. This is called the <I>kernel discipline</I>. The +PPS signal can discipline the kernel in both frequency and time. The frequency +discipline is active as long as the PPS interface device and signal itself +are operating correctly, as determined by the kernel algorithms. The time +discipline is active only under conditions explained below.</LI> +</OL> +Reference clock drivers operate in the manner described in the <A HREF="refclock.htm">Reference +Clock Drivers</A> page and its dependencies. The drivers are ordinarily +operated at stratum zero, so that as the result of ordinary NTP operations, +the server itself operates at stratum one, as required by the NTP specification. +In some cases described below, the driver is intentionally operated at +an elevated stratum, so that it will be selected only if no other survivor +is present with a lower stratum. 
In the case of the PPS peer or kernel +time discipline, these sources appear active only if the prefer peer has +survived the intersection and clustering algorithms, as described below, +and its clock offset relative to the current local clock is less than a +specified value, currently 128 ms. + +<P>The modem clock drivers are a special case. Ordinarily, the update interval +between modem calls to synchronize the system clock is many times longer +than the interval between polls of either the remote or local peers. In +order to provide the best stability, the operation of the clock discipline +algorithm changes gradually from a phase-lock mode at the shorter update +intervals to a frequency-lock mode at the longer update intervals. If +remote or local peers are operated together with a modem peer in the same +configuration, the clock selection algorithm +can first select one or more remote/local peers and the clock discipline algorithm +will optimize for the shorter update intervals. Then, the selection algorithm +can select the modem peer, which requires a very different optimization. +The intent in the design is to allow the modem peer to control the system +clock either when no other source is available or, if the modem peer is +marked as prefer, whenever it +passes the sanity checks and intersection algorithm. There still is room +for suboptimal operation in this scheme, since a noise spike can still +cause a clockhop either way. Nevertheless, the optimization function is +slow to adapt, so that a clockhop or two does not cause much harm. + +<P>The local clock driver is another special case. Normally, this driver +is eligible for selection only if no other source is available. When selected, +vernier adjustments introduced via the configuration file or remotely using +the <TT><A HREF="ntpdc.htm">ntpdc</A> </TT>program can be used to trim +the local clock frequency and time. 
However, if the local clock driver +is designated the prefer peer, this driver is always selected and all other +sources are ignored. This behavior is intended for use when the kernel +time is controlled by some means external to NTP, such as the NIST <TT>lockclock</TT> +algorithm or another time synchronization protocol such as DTSS. +In this case the only way to disable the local clock driver is to mark +it unsynchronized using the leap indicator bits. In the case of modified +kernels with the <TT>ntp_adjtime()</TT> system call, this can be done automatically +if the external synchronization protocol uses it to discipline the kernel +time. +<H4> +Mitigation Rules</H4> +The mitigation rules apply in the intersection and clustering algorithms +described in the NTP specification. The intersection algorithm first scans +all peers with a persistent association and includes only those that satisfy +specified sanity checks. In addition to the checks required by the specification, +the mitigation rules require either the local-clock peer or modem peer +to be included only if marked as the prefer peer. The intersection algorithm +operates on the included population to select only those peers believed +to represent the correct time. If one or more peers survive the operation, +processing continues in the clustering algorithm. Otherwise, if there is +a modem peer, it is declared the only survivor; otherwise, if there is +a local-clock peer, it is declared the only survivor. Processing then continues +in the clustering algorithm. + +<P>The clustering algorithm repeatedly discards outliers in order to reduce +the residual jitter in the survivor population. As required by the NTP +specification, these operations continue until either a specified minimum +number of survivors remain or the minimum select dispersion of the population +is greater than the maximum peer dispersion of any member. 
The mitigation +rules require an additional terminating condition which stops these operations +at the point where the prefer peer is about to be discarded. + +<P>The mitigation rules establish the choice of <I>system peer</I>, which +determines the stratum, reference identifier and several other system variables +which are visible to clients of the local server. In addition, they establish +which source or combination of sources controls the local clock. +<OL> +<LI> +If there is a prefer peer and it is the local-clock peer or the modem peer; +or, if there is a prefer peer and the kernel time discipline is active, +choose the prefer peer as the system peer and its offset as the system +clock offset. If the prefer peer is the local-clock peer, an offset can +be calculated by the driver to produce a frequency offset in order to correct +for systematic frequency errors. In case a source other than NTP is controlling +the system clock, corrections determined by NTP can be ignored by using +the <TT>disable pll</TT> command in the configuration file. If the prefer peer +is the modem peer, it must be the primary source for the reasons noted +above. If the kernel time discipline is active, the system clock offset +is ignored and the corrections handled directly by the kernel.</LI> + +<LI> +If the above is not the case and there is a PPS peer, then choose it as +the system peer and its offset as the system clock offset.</LI> + +<LI> +If the above is not the case and there is a prefer peer (not the local-clock +or modem peer in this case), then choose it as the system peer and its +offset as the system clock offset.</LI> + +<LI> +If the above is not the case and the peer previously chosen as the system +peer is in the surviving population, then choose it as the system peer +and average its offset along with the other survivors to determine the +system clock offset. 
This behavior is designed to avoid excess jitter due +to clockhopping, when switching the system peer would not materially improve +the time accuracy.</LI> + +<LI> +If the above is not the case, then choose the first candidate in the list +of survivors ranked in order of synchronization distance and average its +offset along with the other survivors to determine the system clock offset. +This is the default case and the only case considered in the current NTP +specification.</LI> +</OL> + +<H4> +Using the Pulse-per-Second (PPS) Signal</H4> +Most radio clocks are connected using a serial port operating at speeds +of 9600 bps or higher. The accuracy using typical timecode formats, where +the on-time epoch is indicated by a designated ASCII character, like carriage-return +<TT>&lt;cr&gt;</TT>, is limited to a millisecond at best and a few milliseconds +in typical cases. However, some radios produce a PPS signal which can be +used to improve the accuracy with typical workstation servers to the order +of a few tens of microseconds. The details of how this can be accomplished +are discussed in the <A HREF="pps.htm">Pulse-per-second (PPS) Signal Interfacing</A> +page. The following paragraphs discuss how the PPS signal is affected by +the mitigation rules. + +<P>First, it should be pointed out that the PPS signal is inherently ambiguous, +in that it provides a precise seconds epoch, but does not provide a way +to number the seconds. In principle and most commonly, another source of +synchronization, either the timecode from an associated radio clock, or +even one or more remote NTP servers, is available to perform that function. +In all cases, a specific, configured peer or server must be designated +as associated with the PPS signal. This is done using the <TT>prefer</TT> +keyword as described previously. The PPS signal can be associated in this +way with any peer, but is most commonly used with the radio clock generating +the PPS signal. 
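As a rough illustration of the association just described, a configuration file might pair a prefer peer with the PPS Clock Discipline driver (type 22) along the following lines. This is only a sketch under stated assumptions: `ntp.example.org` is a hypothetical placeholder for whichever server or radio clock driver numbers the seconds, and the PPS driver is assumed to be addressed as unit 0 using the usual `127.127.t.u` reference clock form (driver type `t`, unit `u`).

```
# Source that numbers the seconds; ntp.example.org is a hypothetical placeholder.
# The prefer keyword associates this peer with the PPS signal.
server ntp.example.org prefer

# PPS Clock Discipline driver (type 22); unit 0 is an assumption.
server 127.127.22.0
```

Without a line carrying `prefer`, the text above implies that the PPS signal would have no associated peer and so would not become active.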
+ +<P>The PPS signal can be used in two ways to discipline the local clock, +one using a special PPS driver described in the <A HREF="driver22.htm">PPS +Clock Discipline</A> page, the other using PPS signal support in the kernel, +as described in the <A HREF="kern.htm">A Kernel Model for Precision Timekeeping</A> +page. In either case, the signal must be present and within nominal jitter +and wander error tolerances. In addition, the associated prefer peer must +have survived the sanity checks and intersection algorithms and the dispersion +settled below 1 s. This ensures that the radio clock hardware is operating +correctly and that, presumably, the PPS signal is operating correctly as +well. Furthermore, the absolute offset of the local clock from that peer must +be less than 128 ms, or well within the 0.5-s unambiguous range of the +PPS signal itself. In the case of the PPS driver, the time offsets generated +from the PPS signal are propagated via the clock filter to the clock selection +procedures just like any other peer. Should these pass the sanity checks +and intersection algorithms, they will show up along with the offsets of +the prefer peer itself. Note that, unlike the prefer peer, the PPS peer +samples are not protected from discard by the clustering algorithm. These +complicated procedures ensure that the PPS offsets developed in this way +are the most accurate and reliable available for synchronization. + +<P>The PPS peer remains active as long as it survives the intersection +algorithm and the prefer peer is reachable; however, like any other clock +driver, it runs a reachability algorithm on the PPS signal itself. If for +some reason the signal fails or displays gross errors, the PPS peer will +either become unreachable or stray out of the survivor population. In this +case the clock selection remitigates as described above. 
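The PPS activation conditions and the five mitigation rules listed earlier can be pictured with a short Python sketch. It is not the daemon's code: the `Peer` record, the function names and the `kind` labels are hypothetical, the 128 ms and 1 s limits are the values quoted in the text, and rules 4 and 5 use a plain mean where the daemon is described as using a weighted average.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_PPS_OFFSET = 0.128      # seconds; the 128 ms limit quoted in the text
MAX_PPS_DISPERSION = 1.0    # seconds; prefer-peer dispersion must settle below 1 s


@dataclass
class Peer:                 # hypothetical record, for illustration only
    name: str
    offset: float           # seconds, relative to the local clock
    distance: float         # synchronization distance, used for ranking
    dispersion: float = 0.0
    prefer: bool = False
    kind: str = "remote"    # "remote", "pps", "local" or "modem"


def pps_usable(prefer: Optional[Peer], signal_ok: bool = True) -> bool:
    """PPS is active only with a healthy signal and a settled prefer peer."""
    return (prefer is not None and signal_ok
            and prefer.dispersion < MAX_PPS_DISPERSION
            and abs(prefer.offset) < MAX_PPS_OFFSET)


def choose_system_peer(survivors: List[Peer],
                       previous: Optional[Peer] = None,
                       kernel_pps_active: bool = False,
                       signal_ok: bool = True) -> Tuple[Peer, float]:
    """Return (system peer, system clock offset) following rules 1 to 5."""
    prefer = next((p for p in survivors if p.prefer), None)
    if not pps_usable(prefer, signal_ok):
        # An inactive PPS peer takes no part in the selection.
        survivors = [p for p in survivors if p.kind != "pps"]
    if not survivors:
        raise ValueError("no survivors to choose from")
    pps = next((p for p in survivors if p.kind == "pps"), None)

    # Rule 1: prefer peer is the local-clock or modem peer, or the kernel
    # time discipline is active (the kernel then ignores this offset).
    if prefer is not None and (prefer.kind in ("local", "modem")
                               or kernel_pps_active):
        return prefer, prefer.offset
    # Rule 2: an active PPS peer.
    if pps is not None:
        return pps, pps.offset
    # Rule 3: any other prefer peer.
    if prefer is not None:
        return prefer, prefer.offset
    # Rules 4 and 5: offsets are averaged (plain mean here, a simplification).
    mean = sum(p.offset for p in survivors) / len(survivors)
    if previous is not None and previous in survivors:
        return previous, mean            # rule 4: avoid a needless clockhop
    return min(survivors, key=lambda p: p.distance), mean   # rule 5
```

For example, with a radio clock marked `prefer` showing a 4 ms offset alongside a PPS peer, the sketch selects the PPS peer; if the radio's offset grows past 128 ms, the PPS peer drops out and the radio itself is selected under rule 3.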
+ +<P>When kernel support for the PPS signal is available, the PPS signal +is interfaced to the kernel serial driver code via a modem control lead. +As the PPS signal is derived from external equipment, cables, etc., which +sometimes fail, a good deal of error checking is done in the kernel to +detect signal failure and excessive noise. The way in which the mitigation +rules affect the kernel discipline is as follows. + +<P>In order to operate, the kernel support must be enabled by the <TT>enable +pll</TT> command in the configuration file and the signal must be present +and within nominal jitter and wander error tolerances. In the NTP daemon, +the PPS discipline is active only when the prefer peer is among the survivors +of the clustering algorithm, and its absolute offset is within 128 ms, +as in the PPS driver. Under these conditions the kernel disregards updates +produced by the NTP daemon and uses its internal PPS source instead. The +kernel maintains a watchdog timer for the PPS signal; if the signal has +not been heard or is out of tolerance for more than some interval, currently +two minutes, the kernel discipline is declared inoperable and operation +continues as if it were not present. +<HR> +<ADDRESS> +David L. Mills (mills@udel.edu)</ADDRESS> + +</BODY> +</HTML>