             TTiimmeedd IInnssttaallllaattiioonn aanndd OOppeerraattiioonn GGuuiiddee


         _R_i_c_c_a_r_d_o _G_u_s_e_l_l_a_, _S_t_e_f_a_n_o _Z_a_t_t_i_, _J_a_m_e_s _M_. _B_l_o_o_m
                 Computer Systems Research Group
                    Computer Science Division
    Department of Electrical Engineering and Computer Science
               University of California, Berkeley
                       Berkeley, CA 94720

                           _K_i_r_k _S_m_i_t_h
                  Engineering Computer Network
              Department of Electrical Engineering
                        Purdue University
                    West Lafayette, IN 47906



IInnttrroodduuccttiioonn

     The clock synchronization service for the UNIX 4.3BSD
operating system is composed of a collection of time daemons
(timed) running on the machines in a local area network. The
algorithm implemented by the service is based on a master-slave
scheme. The time daemons communicate with each other using the
Time Synchronization Protocol (TSP), which is built on the DARPA
UDP protocol and is described in detail in [4].

     A time daemon has a twofold function. First, it supports
the synchronization of the clocks of the various hosts in a local
area network. Second, it starts (or takes part in) the election
that occurs among slave time daemons when, for any reason, the
master disappears. The synchronization mechanism and the
election procedure employed by timed are described in other
documents [1,2,3]. The next paragraphs give a brief overview of
how the time daemon works; the rest of this document is mainly
concerned with the administrative and technical issues of running
timed at a particular site.

     A _m_a_s_t_e_r _t_i_m_e _d_a_e_m_o_n measures the time  differences  between
the  clock of the machine on which it is running and those of all
-----------
This  work  was  sponsored  by  the  Defense  Advanced
Research Projects Agency (DoD), monitored by the Naval
Electronics   Systems   Command   under  contract  No.
N00039-84-C-0089, and  by  the  CSELT  Corporation  of
Italy.   The  views  and conclusions contained in this
document are those of the authors and  should  not  be
interpreted  as representing official policies, either
expressed or implied, of the Defense Research Projects
Agency, of the US Government, or of CSELT.









SMM:11-2                         Timed Installation and Operation


other machines.  The master computes  the  _n_e_t_w_o_r_k  _t_i_m_e  as  the
average of the times provided by nonfaulty clocks.1 It then sends
to each _s_l_a_v_e _t_i_m_e _d_a_e_m_o_n the correction that should be performed
on  the  clock of its machine.  This process is repeated periodi-
cally.  Since the correction is expressed as  a  time  difference
rather  than  an absolute time, transmission delays do not inter-
fere with the accuracy of the synchronization.   When  a  machine
comes  up  and  joins  the network, it starts a slave time daemon
which will ask the master for the correct time and will reset the
machine's  clock  before  any  user activity can begin.  The time
daemons are able to maintain a single network time  in  spite  of
the  drift of clocks away from each other.  The present implemen-
tation keeps processor clocks synchronized  within  20  millisec-
onds.
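
     The averaging step described above can be illustrated with a
minimal sketch in Python. It is not the timed implementation:
the function name, the data layout, and the use of the low median
as the reference for deciding which clocks are faulty are all
assumptions made for illustration. The actual fault-tolerant
averaging is specified in [1,2].

```python
from statistics import median_low


def compute_corrections(deltas, max_spread):
    """Return (network_offset, corrections) from measured clock differences.

    deltas maps a host name to (host clock - master clock) in seconds,
    as measured by the master; the master's own entry is 0.0.
    Clocks farther than max_spread from the low median are treated as
    faulty. This is an assumed stand-in for the majority rule of the
    real algorithm, not the rule itself.
    """
    reference = median_low(deltas.values())
    good = [d for d in deltas.values() if abs(d - reference) <= max_spread]
    # Network time, expressed relative to the master's clock.
    network_offset = sum(good) / len(good)
    # Every host, faulty or not, is told how far to move its own clock.
    corrections = {host: network_offset - d for host, d in deltas.items()}
    return network_offset, corrections


# Hypothetical measurements: host "c" is far away from the others.
deltas = {"master": 0.0, "a": 0.5, "b": -0.5, "c": 100.0}
offset, corrections = compute_corrections(deltas, max_spread=1.0)
```

Because each correction is a difference to be applied locally, a
slave can apply it to its own clock however long the message took
to arrive.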

     To  ensure that the service provided is continuous and reli-
able, it is necessary to implement an election algorithm to elect
a new master should the machine running the current master crash,
the master terminate (for example, because of a run-time  error),
or  the  network be partitioned.  Under our algorithm, slaves are
able to realize when the master has stopped  functioning  and  to
elect  a  new  master  from among themselves.  It is important to
note that, since the failure of the  master  results  only  in  a
gradual  divergence  of clock values, the election need not occur
immediately.

     The machines that are gateways between distinct local area
networks require particular care. A time daemon on such a
machine may act as a submaster. This artifact is a consequence
of the current inability of transmission protocols to broadcast a
message on a network other than the one to which the broadcasting
machine is connected. The submaster appears as a slave on one
network, and as a master on one or more of the other networks to
which it is connected.

     A submaster classifies each network as one of three types.
A slave network is a network on which the submaster acts as a
slave. There can be only one slave network. A master network is
a network on which the submaster acts as a master. An ignored
network is any other network that already has a valid master.
The submaster tries periodically to become master on an ignored
network, but gives up immediately if a master already exists.
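
     The three-way classification can be sketched as follows.
This is an illustration only: the function name, the inputs, and
in particular the rule that the first network found with a master
becomes the slave network are assumptions, since the text does not
say how timed chooses among several networks that already have
masters.

```python
def classify_networks(networks, has_master):
    """Label each attached network as 'slave', 'master', or 'ignored'.

    networks:   network names, in the order the daemon examines them.
    has_master: dict mapping a network name to True if a master
                already answers on that network.
    """
    roles = {}
    slave_chosen = False
    for net in networks:
        if has_master.get(net, False):
            if not slave_chosen:
                roles[net] = "slave"    # only one slave network is allowed
                slave_chosen = True
            else:
                roles[net] = "ignored"  # a valid master already exists here
        else:
            roles[net] = "master"       # no master found: act as master
    return roles


# Hypothetical gateway attached to three networks.
roles = classify_networks(["x", "y", "z"], {"x": True, "y": False, "z": True})
```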

GGuuiiddeelliinneess

     While  the  synchronization  algorithm is quite general, the
election one, requiring a broadcast mechanism,  puts  constraints
on  the  kind of network on which time daemons can run.  The time
daemon will only work on networks with broadcast capability  aug-
mented   with  point-to-point  links.   Machines  that  are  only
-----------
  1 A  clock is considered to be faulty when its value
is more than a small specified interval apart from the
majority of the clocks of the other machines [1,2].









Timed Installation and Operation                         SMM:11-3


connected to point-to-point, non-broadcast networks may  not  use
the time daemon.

     If we exclude submasters, there will normally be at most one
master time daemon in a local area internetwork. During an
election, only one of the slave time daemons will become the new
master. However, a slave can be prevented from becoming the
master because of the characteristics of its machine, so a
subset of machines must be designated as potential master time
daemons. A master time daemon requires CPU resources
proportional to the number of slaves and, in general, more than a
slave time daemon does, so it may be advisable to limit master
time daemons to machines with more powerful processors or lighter
loads. Also, machines with inaccurate clocks should not be used
as masters. This is a purely administrative decision: an
organization may well allow all of its machines to run master
time daemons.

     At the administrative level, a time daemon on a machine with
multiple network interfaces may be told to ignore all but one
network or to ignore one network. This is done at start-up time
with the -n network and -i network options, respectively.
Typically, the time daemon would be instructed to ignore all but
the networks under local administrative control.
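
     The effect of the two options on a multi-homed machine can
be modelled with a short sketch. The function name and the
network names are hypothetical, and the behavior when both
options are given together is an assumption; timed(8) remains the
authority on the exact semantics.

```python
def networks_to_serve(attached, only=(), ignore=()):
    """Return the attached networks a time daemon would take part in.

    only   models networks named with -n: if any are given, every
           other network is dropped.
    ignore models networks named with -i: these are dropped.
    """
    selected = [n for n in attached if not only or n in only]
    return [n for n in selected if n not in ignore]


# Hypothetical gateway attached to three networks.
attached = ["cs-net", "ee-net", "arpa"]
only_cs = networks_to_serve(attached, only=["cs-net"])
no_arpa = networks_to_serve(attached, ignore=["arpa"])
```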

     There are some limitations to the current implementation of
the time daemon. It is expected that these limitations will be
removed in future releases. The constant NHOSTS in
/usr/src/etc/timed/globals.h limits the maximum number of
machines that may be directly controlled by one master time
daemon. The current maximum is 29 (NHOSTS - 1). The constant
must be changed and the program recompiled if a site wishes to
run timed on a larger (inter)network.

     In addition, there is a pathological situation, to be
avoided at all costs, that might occur when time daemons run on
multiply-connected local area networks. In this case, as we have
seen, time daemons running on gateway machines will be submasters
and will act as master time daemons on some of those networks.
Consider machines A and B that are both gateways between networks
X and Y. If time daemons were started on both A and B without
constraints, it would be possible for submaster time daemon A to
be a slave on network X and the master on network Y, while
submaster time daemon B is a slave on network Y and the master on
network X. This loop of master time daemons will neither
function properly nor guarantee a unique time on both networks,
and it will cause the submasters to use large amounts of system
resources in the form of network bandwidth and CPU time. In
fact, this kind of loop can also be generated with more than two
master time daemons, when several local area networks are
interconnected.
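
     An administrator planning gateway roles can check a proposed
configuration for such a loop before starting any daemons. The
sketch below is a planning aid, not part of timed: the function
name and the input format are assumptions. It treats each
submaster as passing time from its slave network to its master
networks and searches the resulting directed graph for a cycle.

```python
def find_master_loop(gateways):
    """Return a list of networks forming a loop, or None if there is none.

    gateways maps a gateway name to (slave_network, master_networks).
    Time flows from a submaster's slave network to each of its master
    networks, so every gateway contributes edges slave -> master.
    """
    edges = {}
    for slave_net, master_nets in gateways.values():
        edges.setdefault(slave_net, set()).update(master_nets)

    def visit(net, path):
        if net in path:  # came back to a network already on the path
            return path[path.index(net):] + [net]
        for nxt in sorted(edges.get(net, ())):
            loop = visit(nxt, path + [net])
            if loop:
                return loop
        return None

    for start in sorted(edges):
        loop = visit(start, [])
        if loop:
            return loop
    return None


# The A/B example from the text: each gateway masters the other's
# slave network.
looped = find_master_loop({"A": ("X", ["Y"]), "B": ("Y", ["X"])})
# A chain X -> Y -> Z has no loop.
chain = find_master_loop({"A": ("X", ["Y"]), "B": ("Y", ["Z"])})
```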

IInnssttaallllaattiioonn

     In order to start the time daemon on a given machine, the
following lines should be added to the local daemons section in
the file /etc/rc.local:


          if [ -f /etc/timed ]; then
               /etc/timed flags & echo -n ' timed' >/dev/console
          fi


These lines must appear after the network has been configured
with ifconfig(8).

     Also, the file /etc/services should contain the following
line:


          timed          525/udp        timeserver


The _f_l_a_g_s are:

-n network   to consider the named network.

-i network   to ignore the named network.

-t           to place tracing information in  _/_u_s_r_/_a_d_m_/_t_i_m_e_d_._l_o_g.

-M           to  allow  this  time  daemon to become a master.  A
             time daemon run without this option will  be  forced
             in the state of slave during an election.

DDaaiillyy OOppeerraattiioonn

     _T_i_m_e_d_c_(_8_)  is used to control the operation of the time dae-
mon.  It may be used to:

+o    measure the differences between machines' clocks,

+o    find the location where the master _t_i_m_e_d is running,

+o    cause election timers on several machines to expire  at  the
     same time,

+o    enable  or   disable   tracing   of   messages  received  by
     _t_i_m_e_d.

See the manual page on _t_i_m_e_d(8) and _t_i_m_e_d_c(8) for  more  detailed
information.

     The _d_a_t_e_(_1_) command can be used to set the network date.  In
order to set the time on a single machine, the  _-_n  flag  can  be
given to date(1).

RReeffeerreenncceess

1.   R. Gusella and S. Zatti, TEMPO: A Network Time Controller
     for Distributed Berkeley UNIX System, USENIX Summer
     Conference Proceedings, Salt Lake City, June 1984.

2.   R. Gusella and S. Zatti, Clock Synchronization in a Local
     Area Network, University of California, Berkeley, Technical
     Report, to appear.

3.   R. Gusella and S. Zatti, An Election Algorithm for a
     Distributed Clock Synchronization Program, University of
     California, Berkeley, CS Technical Report #275, Dec. 1985.

4.   R. Gusella and S. Zatti, The Berkeley UNIX 4.3BSD Time
     Synchronization Protocol, UNIX Programmer's Manual, 4.3
     Berkeley Software Distribution, Volume 2c.
