Wednesday, October 24, 2012

NTP security cisco

A lot of network engineering teams like to configure NTP. NTP is great for time sync when used right. First, let's understand what NTP does.

   NTP provides the protocol mechanisms to synchronize time in principle
   to precisions in the order of nanoseconds while preserving a
   non-ambiguous date, at least for this century.  The protocol includes
   provisions to specify the precision and estimated error of the local
   clock and the characteristics of the reference clock to which it may
   be synchronized.  However, the protocol itself specifies only the
   data representation and message formats and does not specify the
   synchronizing algorithms or filtering mechanisms. 
   (RFC 958)
 
NTP Network Time Protocol
Protocol UDP
Port 123



Okay, so NTP runs over connectionless UDP, which means that without some way to validate clients, peers, and servers, you are open to spoofed packets, whether they come from misbehaving hosts or from intentional attackers.

Okay, NTP typically uses a stratum level as a measure of accuracy. How much accuracy does one need on a UNIX, Windows, or network host? (That's a topic for some later debate.)

Unless you're working two places to the right of the decimal, a source at stratum 2-5 is more than adequate.
The lower the stratum number, the more accurate the time source is considered, btw.

e.g

stratum 1 is more accurate than 2
stratum 2 is more accurate than 4
stratum 3 is more accurate than 5
stratum 4 is more accurate than 6
and so on.....

Let's have some fun with NTP & TIME


Okay, so how many levels of stratum do we have? That's a good question. Per the NTP protocol it's 16: stratum 0 for the reference clocks themselves and 1 through 15 for hosts synced down the chain.

At stratum level 16, you're considered an unsynchronized clock.

Stratum 0 clock sources are typically an atomic clock or another high-accuracy reference such as LORAN, like the one on Okinawa that I worked at when I was stationed there in the USAF :)

Some systems (military, aviation, missile/space programs, maritime, etc.) need highly accurate clocks. In some cases we are talking about nanoseconds vs. milliseconds. LORAN systems, btw, are slowly being phased out now that GPS satellites and receivers are more common.

Okay, so now we have an understanding of stratum levels and that the lowest numbers are the most accurate. Each device, regardless of whether it has a local reference or is timed from another device, uses that time source as its reference for time. Since stratum time sources are hierarchical, a device synchronized to a source is rated at that source's stratum +1.

What that means: if I'm synced to a stratum 3 source, a peer synced directly with me would see me as a stratum 4. A peer synced with my peer would see him as a stratum 5, and so on. You can use UNIX tools like ntptrace to trace NTP paths.

e.g
 ntptrace  a1.dca1
Unknown option: v
a1.dca1: stratum 4, offset -0.000614, synch distance 0.047490
m1.dca1: stratum 3, offset -0.000342, synch distance 0.046768
198.82.1.202: timed out, nothing received
***Request timed out

ntptrace also allows an unethical attacker to identify which routers are open for NTP access and what you're using as time sources. For example, a traceroute from my office to terremark.com lists all of the routers in the path. Running ntptrace against those routers will identify possible hosts that are enabled as NTP servers/peers.

e.g (output truncated)
 4  96.120.37.49 (96.120.37.49)  28.997 ms  42.195 ms  28.961 ms
 5  xe-0-1-0-0-sur01.hallandale.fl.pompano.comcast.net (68.85.228.85)  19.081 ms  49.646 ms  13.833 ms
 6  162.151.2.101 (162.151.2.101)  54.124 ms
    te-0-9-0-1-ar03.northdade.fl.pompano.comcast.net (68.87.162.185)  24.058 ms
    te-0-9-0-3-ar03.northdade.fl.pompano.comcast.net (69.139.181.177)  22.514 ms
 7  he-0-8-0-0-cr01.miami.fl.ibone.comcast.net (68.86.93.85)  24.122 ms  30.657 ms  37.148 ms
 8  xe-8-0-0.edge2.miami1.level3.net (4.59.85.45)  17.289 ms  13.370 ms  15.901 ms
 9  ae-1-51.edge2.miami2.level3.net (4.69.138.77)  35.879 ms
    ae-2-52.edge2.miami2.level3.net (4.69.138.109)  42.095 ms
    ae-1-51.edge2.miami2.level3.net (4.69.138.77)  73.704 ms
10  data-return.edge2.miami2.level3.net (4.71.212.66)  25.518 ms  29.417 ms  20.224 ms
11  t9-1.gw1.mia.terremark.net (66.165.161.94)  17.071 ms  17.627 ms  32.947 ms
12  66.165.170.13 (66.165.170.13)  20.253 ms  22.208 ms  16.415 ms
13  72.46.239.73 (72.46.239.73)  15.549 ms  15.146 ms  25.279 ms
14  app.terremark.com (208.39.96.115)  22.889 ms  26.611 ms  17.418 ms

In the NTP daemon, a host that uses a local time source or has a clock input typically references a 127.x.x.x address. Well, Cisco does the same thing in its implementation, as the show ntp associations output further down shows.

In the above traceroute, the data-return edge router at hop 10 is open as a possible NTP server:


e.g 
kfelix-waffen01:~ kfelix$ ntptrace 4.71.212.66
data-return.edge2.miami2.level3.net: stratum 3, offset -0.000380, synch distance 0.067326
ntp.terrenap.net: stratum 2, offset -0.000003, synch distance 0.043791
10.1.2.18: timed out, nothing received
***Request timed out

So what this means:

If I wanted to orchestrate an attack on that router, a slew of hosts executing NTP queries against it could impact the router's performance or memory consumption. That router should be closed and filtered against external queries, and should only allow queries from a trusted segment. Its operators should also deploy NTP authentication and ACLs.
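
A minimal sketch of that kind of edge filtering on IOS. Everything named here is an assumption for illustration: the ACL name EDGE-IN, a trusted upstream server at 192.0.2.10, the router's own address 198.51.100.1, and GigabitEthernet0/0 as the outside interface. None of it comes from the router traced above.

```
! Hypothetical names and addresses, for illustration only
ip access-list extended EDGE-IN
 remark allow NTP to this router only from the one upstream we trust
 permit udp host 192.0.2.10 host 198.51.100.1 eq ntp
 remark drop NTP aimed at this router from everybody else
 deny   udp any host 198.51.100.1 eq ntp
 remark leave all other traffic alone
 permit ip any any
!
interface GigabitEthernet0/0
 ip access-group EDGE-IN in
```

The deny line only matches packets addressed to the router itself, so NTP passing through the router to other hosts is untouched. A router with several addresses would need one pair of lines per address.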

To reconfirm it's open, I can use any of my Cisco routers, add it as an ntp server entry, and monitor my association to that host:

e.g

config t
   ntp server 4.71.212.66
end
anonymous-waffen-rtr1#sh ntp ass | i 4.71.212.66
+~4.71.212.66      66.165.160.189    3     0    64  377     3.4    1.28     0.6
anonymous-waffen-rtr1#


As you can see, I gained an association to the open NTP server.


Moving on, let's look at Cisco and its NTPv3 setup.

Here's my CCIE lab router setup:


 e.g
ccie01#show ntp status
Clock is synchronized, stratum 1, reference is .LOCL.
nominal freq is 250.0000 Hz, actual freq is 250.0000 Hz, precision is 2**18
reference time is D432C93C.D7A64D3C (20:08:28.842 UTC Wed Oct 24 2012)
clock offset is 0.0000 msec, root delay is 0.00 msec
root dispersion is 0.02 msec, peer dispersion is 0.02 msec
ccie01#show ntp ass  

      address         ref clock     st  when  poll reach  delay  offset    disp
*~127.127.7.1      .LOCL.            0    52    64  377     0.0    0.00     0.0
 * master (synced), # master (unsynced), + selected, - candidate, ~ configured
ccie01#


Okay, this router thinks it is, and is acting as, a local clock source at stratum 1. Or is it?
Well, yes, but it's very much a no as well.

We told it to be an NTP master and to announce itself as stratum 1 if a host should query it, but in reality it's far from being an accurate clock source, much less one at a stratum 1 level. The onboard CPU, chip, and clock functions are not very accurate, and the clock's ticks will drift and sway.

We used the following command to make it a master;
:)

ntp master 1
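
If the router has no real reference clock behind it, one hedged alternative is to advertise a higher stratum number, so that clients which also have a genuine source will favor that one. A sketch, with the value 8 picked only for illustration (it is also what IOS uses when ntp master is entered with no stratum):

```
! advertise the local clock as stratum 8 instead of claiming stratum 1
ntp master 8
```

The lab output that follows was still taken with ntp master 1.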

Okay so how would a peer that's associated with the router see the clock source?

      address         ref clock     st  when  poll reach  delay  offset    disp
*~1.1.1.1          .LOCL.            1   179    64  374     1.6    1.88     0.7


Notice the reference clock says LOCL? Okay, so fine, we are associated with a guy that has a LOCL clock. Since it can't query or ntptrace any further toward the source of the clock, it's listed as a local reference.

Okay, how about if we were synced to somebody else, as in another NTP source?

mia-sm01>sh ntp ass

      address         ref clock     st  when  poll reach  delay  offset    disp
+~10.100.100.201   38.17.88.1       4    38   512  377     1.0  -40.54     7.3
+~10.100.100.202   38.17.88.1       4    58   512  377     1.1  -41.10     7.3
*~38.17.88.1      38.104.95.25      3   214   512  377    15.3  -49.84    10.5
 ~38.17.88.3      0.0.0.0          16     -  1024    0     0.0    0.00  16000.
 * master (synced), # master (unsynced), + selected, - candidate, ~ configured

This host, sm01 in Miami, has 4 NTP associations. Three of the four have valid clock sources; the fourth, host 38.17.88.3, is not valid and is running in the wild. We can determine that from the ref clock field being 0.0.0.0, the stratum being 16, and the reach being 0.

Remember when we said earlier that a stratum 16 clock is not a confirmed or valid clock?

Okay, how about if we were not synced to a master? Notice how the "*" is missing on any peer that is not the sync source?

  address         ref clock     st  when  poll reach  delay  offset    disp
 ~1.1.1.1          .LOCL.            1   992    64    0     1.6    1.88  16000.

A router can only be synced to ONE master at a time. That's typically a low-stratum source that wins NTP's selection process or, if defined, the one named in your prefer statement. This allows you to favor one peer over another if one host is considered better, more accurate, etc.
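
A minimal sketch of the prefer statement, assuming two hypothetical upstream servers at 192.0.2.1 and 192.0.2.2:

```
! 192.0.2.1 is favored when both servers are otherwise acceptable
ntp server 192.0.2.1 prefer
ntp server 192.0.2.2
```

After a few poll intervals, show ntp associations should put the "*" on the preferred server, provided it is reachable and sane.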

So okay, we got that far. Now let's look at the security aspect of NTP. We use NTP to control our time, or what we think is time. First off, time is relative. What that means:

Time was created to help human beings place some value on the passing of events and on our purpose on planet Earth. Yes, that means:

"hey Eric I will meet you at the train station in 10 minutes"

or

"I'm 44 years 2 months and 1 week old on June 1 2014"

or

"500 BC"

Okay have I lost you ? :)

Time in the universe could be different, and is different, depending on what we are measuring and what we use it for. I'm an FCC-licensed radio operator, and we always joke that we use time to measure distance.

For example, if I wanted to send a radio transmission to the Sun, it would have to travel about 0.0000158 light years, or roughly 8.3 light minutes, give or take depending on where we are in our orbit around the Sun :)

Okay, see what I did? We used a time measurement to measure distance. Are you really confused now? :)

We even use time in calendar events: the Chinese calendar, or the Arabic and Indian calendars that record the lunar cycle. How about the Julian calendar? Or the fiscal calendar? Then we can compare absolute and relative dating.

Okay, back on track. We use time within our Cisco routers/switches for a host of reasons. Here are some of those reasons:

time based ACLs
system logging & timestamp
md5 key expiration for dynamic routing protocols
determining system or peer uptime
etc.....


Okay, so now back to NTP and time. We need to evaluate the time accuracy and the security of NTP. The latter is easily controlled by following best practices and by using the security features within the Cisco IOS code set.

1st we always enable ntp authentication on the server

e.g

ntp authenticate

and create the keys

 e.g
ntp authentication-key 10 md5 ciscocasio

And if we are configuring a client, the key must be set as a trusted key

 e.g
ntp trusted-key 10

With an MD5 key, you now have a shared secret for authenticating NTP packets between peers. Keys are easy to set up. The simple key above could be deployed to your peers to control which sources you accept NTP updates from. Then you reference that key on all of your server or peer statements. A peer statement allows synchronization in both directions (you can sync to the peer and the peer can sync to you), whereas a server statement means you synchronize to that source only.

Here are a few samples:



e.g
config t
  ntp server 1.2.3.4 key 10

or 
  ntp peer 1.2.3.4 key 10

end



2nd, we control who can associate with us by using NTP access-groups

ntp access-group ?
  peer        Provide full access
  query-only  Allow only control queries
  serve       Provide server and query access
  serve-only  Provide only server access

There are several options. In short, peer covers peer operations and the serve options cover server operations.

(stole this from an INE blog ;) )

1) Peer – permits router to respond to NTP requests and accept NTP updates. NTP control queries are also accepted. This is the only class which allows a router to be synchronized by other devices.
2) Serve – permits router to reply to NTP requests, but rejects NTP updates (e.g. replies from a server or update packets from a peer). Control queries are also permitted.
3) Serve-only – permits router to respond to NTP requests only. Rejects attempts to synchronize local system time, and does not accept control queries.
4) Query-only – only accepts NTP control queries. No response to NTP requests is sent, and no local system time synchronization with remote system is permitted.

That pretty much determines what ACLs we can apply and how they impact the NTP interaction.
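
A minimal sketch of two access-groups working together, assuming a hypothetical upstream source at 192.0.2.1 and a hypothetical client subnet 10.10.10.0/24. The ACL numbers 20 and 30 are arbitrary.

```
! ACL 20: the upstream source this router is allowed to sync from
access-list 20 permit 192.0.2.1
! ACL 30: the client subnet that may ask this router for time
access-list 30 permit 10.10.10.0 0.0.0.255
!
ntp access-group peer 20
ntp access-group serve-only 30
```

The idea is that only the upstream source gets the peer class, which is the one class that can change this router's clock, while the clients can only request time.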

Next, how about disabling NTP on a per-interface basis? Simple: we issue the ntp disable statement on interfaces where we expect no NTP traffic. This is a simple method to reject external NTP packets arriving on that interface.

e.g
 interface Serial0/0/0
 ip address 1.1.1.1 255.255.255.252
 encapsulation ppp
 ntp disable
 ipv6 address 2002:100::1/64
 ipv6 enable
 ipv6 flow ingress
 ipv6 flow egress
end

Next, MD5 keys. These simple keys are very easy to deploy, and they must match between the two NTP hosts. You can use different keys between different pairs of hosts, but typically we deploy a universal key and place some type of life-cycle on these keys throughout our domain.
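
A sketch of the different-keys-per-host option, assuming two hypothetical servers at 192.0.2.1 and 192.0.2.2. The key numbers and key strings are made up for illustration.

```
ntp authenticate
! one key per upstream server
ntp authentication-key 10 md5 KeyForServerA
ntp authentication-key 20 md5 KeyForServerB
ntp trusted-key 10
ntp trusted-key 20
! tie each server to its own key
ntp server 192.0.2.1 key 10
ntp server 192.0.2.2 key 20
```

The same layout lends itself to a life-cycle: add the new key number on both ends, move the server statement over to it, then remove the old key.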

If you're using keys and have authentication problems, you can issue a "debug ntp authentication" to monitor them. The key needs to match on all parties that are using that specific key.

So to recap;

NTP can be configured as a server, peer, or client function
NTP supports simple MD5 authentication
NTP can be disabled on a per-interface basis
NTP source interfaces can be selected
NTP keys should be expired on a regular cycle
NTP max associations should be set to ensure nobody can DoS a router/switch time-server
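
Two of the recap items have no example earlier in the post, so here is a minimal sketch. Loopback0 and the limit of 20 are assumptions chosen for illustration.

```
! source all NTP packets from a stable address
ntp source Loopback0
! cap the number of associations this device will hold
ntp max-associations 20
```

Sourcing from a loopback keeps the address your peers see (and permit in their ACLs) the same no matter which physical interface the packet leaves on.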

One more tidbit of information that applies if you ever deploy an access-group for the server function and your router is configured as a master: you will find that your own router peers with itself using a virtual IPv4 loopback address, 127.127.x.x.

So you will need to permit that address in your ACLs or you will have problems:

r1#sh run | sec access-list 55
access-list 55 permit 127.127.7.1
access-list 55 permit 10.0.0.2

r1#sh run | sec ntp          
ntp logging
ntp authentication-key 1 md5 00090A0D0142 7
ntp authenticate
ntp trusted-key 1
ntp source Loopback0
ntp access-group serve 55
ntp master 1



So r1 is a master clock source and uses ACL 55, which has 2 entries: one for an external host at 10.0.0.2 and the other for r1 itself:

r1#show ntp status
Clock is synchronized, stratum 1, reference is .LOCL.
nominal freq is 250.0000 Hz, actual freq is 250.0000 Hz, precision is 2**24
reference time is D49C7477.43835520 (23:47:03.263 UTC Sat Jan 12 2013)
clock offset is 0.0000 msec, root delay is 0.00 msec
root dispersion is 0.02 msec, peer dispersion is 0.02 msec
 

r1#show ntp ass det
127.127.7.1 configured, our_master, sane, valid, stratum 0
ref ID .LOCL., time D49C7477.43835520 (23:47:03.263 UTC Sat Jan 12 2013)
our mode active, peer mode passive, our poll intvl 64, peer poll intvl 64
root delay 0.00 msec, root disp 0.00, reach 377, sync dist 0.015
delay 0.00 msec, offset 0.0000 msec, dispersion 0.02
precision 2**18, version 3
org time D49C7477.43835520 (23:47:03.263 UTC Sat Jan 12 2013)
rcv time D49C7477.43835520 (23:47:03.263 UTC Sat Jan 12 2013)
xmt time D49C7477.43835520 (23:47:03.263 UTC Sat Jan 12 2013)
filtdelay =     0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
filtoffset =    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00
filterror =     0.02    0.99    1.97    2.94    3.92    4.90    5.87    6.85
Reference clock status:  Running normally
Timecode:


I spent over an hour debugging why my master lost NTP sync to itself when I deployed ACLs. I hope this tip saves you from having to do the same.

I hope you find this information helpful and useful, now or later.

Ken Felix

Freelance Security & Network Engineer
kfelix " at "  hyperfeed.com


 
