Posts Tagged ‘monitoring’

Security Lessons from Nature – Status monitoring

October 13th, 2009 Josh

I weigh between 150 and 155 pounds. What's interesting is that, under ideal conditions, my weight stays within that range. I weigh myself regularly, and I have noticed that if my weight ever drops below 150, I get sick within a day. The same applies if it holds steady over 155 for more than a couple of days. Similarly, I have a normal temperature range, and any significant variance typically bodes ill(ness).

The human body (really, the body of any mammal) has many such metrics. In addition to weight and temp, there is an average heart rate, a normal EKG, bone density and typical levels of vitamins, minerals and hormones. These can be measured in many ways, but the measurements generally fall into two categories. Some things can be measured at a surface level (weight and temp); others require special equipment, a tolerance of invasive procedures and significant amounts of time. Of course, the more time you devote to it, the better the data you get, but because of the cost, these scans are generally only done when a problem is suspected.

The same applies to IT systems. Certain metrics are easily determined, and if they vary from their normal range, that can indicate a problem. Some, like weight and temperature, can be gathered easily; gathering others can impact the system; and some require the system to be down before they can be gathered.

Just like we generally don't send people in for a full body scan on a regular basis, we aren't in the habit of shutting down servers for a day each week and performing precautionary forensic analysis upon them. Instead, we prefer to check surface-level data: disk, CPU and RAM usage, and network connection statistics. If one of these indicates a problem, then and only then do we begin to dig more deeply and run scans that might impact system performance.
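To make the surface-level check concrete, here is a minimal sketch in Python. The metric names, the baseline ranges and both function names are my own assumptions for illustration, not values from any particular system; real baselines would come from watching the server over time.

```python
import os
import shutil

# Hypothetical baseline ranges (assumed for illustration only).
BASELINES = {
    "disk_used_pct": (0.0, 85.0),
    "load_per_cpu": (0.0, 1.5),
}

def surface_metrics(path="/"):
    """Gather cheap, non-invasive readings: the 'weight and temp' of a server."""
    usage = shutil.disk_usage(path)
    metrics = {"disk_used_pct": 100.0 * usage.used / usage.total}
    try:
        # 1-minute load average; not available on every platform.
        one_minute_load = os.getloadavg()[0]
    except (AttributeError, OSError):
        one_minute_load = None
    if one_minute_load is not None:
        metrics["load_per_cpu"] = one_minute_load / (os.cpu_count() or 1)
    return metrics

def out_of_range(metrics, baselines=BASELINES):
    """Return the names of metrics that fall outside their baseline range."""
    flagged = []
    for name, value in metrics.items():
        low, high = baselines.get(name, (float("-inf"), float("inf")))
        if not (low <= value <= high):
            flagged.append(name)
    return flagged
```

Only when `out_of_range(surface_metrics())` comes back non-empty would you move on to the slower, more invasive scans.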

The key, just like my regular monitoring of my weight and temp, is to regularly monitor system performance metrics. Otherwise, you only catch problems after they've already impacted the system. Just as it's easiest to deal with a cold before it really sets in, it's easier to identify an attack at the beginning of the process.
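The rules I use for my weight (act at once on a drop below the range, act on a high reading only if it persists) carry over to any regularly sampled metric. A rough sketch in Python, where the function name and the `max_high_streak` parameter are assumptions made up for this example:

```python
def needs_attention(readings, low, high, max_high_streak=2):
    """Apply the two weight-style rules to a series of regular readings.

    - any reading below `low` is flagged at once
    - readings above `high` are flagged only when they persist for more
      than `max_high_streak` consecutive samples
    """
    high_streak = 0
    for value in readings:
        if value < low:
            return True
        # Count consecutive high readings; a normal reading resets the count.
        high_streak = high_streak + 1 if value > high else 0
        if high_streak > max_high_streak:
            return True
    return False
```

This only works if the readings are taken on a regular schedule; a gap in the data hides exactly the early warning you were hoping to catch.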


Real Life Lessons: Monitoring

January 29th, 2008 Josh

The second lesson to learn from my incident is the importance of monitoring. The concept behind monitoring is that a service periodically checks the status of your resource and, if there is a problem, lets you know. Monitoring is commonly seen in physical security (where you have a device that knows when doors/windows open or if there is movement where there should not be) and in I.T. (where you periodically look at a web or email server and make sure that things are running properly).
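In I.T. terms, that check-and-notify loop can be sketched in a few lines of Python. Everything named here (`monitor`, `check`, `notify`, `max_checks`) is hypothetical and chosen for illustration; `check` stands in for whatever test tells you the resource is healthy, and `notify` for however you want to be told.

```python
import time

def monitor(check, notify, interval_seconds=60, max_checks=None, sleep=time.sleep):
    """Call `check` periodically and call `notify` whenever it fails.

    `check` returns True when the resource is healthy; an exception raised
    by `check` counts as a failure. `max_checks=None` means run until
    interrupted. Returns the number of failed checks.
    """
    failures = 0
    done = 0
    while max_checks is None or done < max_checks:
        try:
            healthy = check()
            detail = "check returned False"
        except Exception as exc:
            healthy = False
            detail = f"check raised {exc!r}"
        if not healthy:
            failures += 1
            notify(detail)
        done += 1
        # Wait before the next check, but not after the last one.
        if max_checks is None or done < max_checks:
            sleep(interval_seconds)
    return failures
```

For example, `monitor(lambda: False, print, max_checks=1)` prints `check returned False` once and returns 1. Note that the sketch only covers the first half of the job: a notification that nobody acts on is no better than no monitoring at all.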

In my case, I had three monitoring systems. My security system is aware of when doors or windows open, and if that occurs, it sounds an alarm and notifies the security company. This is highly (99%) reliable when it is active. The fatal flaw in the system is that it does this whether a criminal comes into the house or I leave the house. Thus, it is easy to leave it off when I am home. The second monitoring system is that of my watch cats. In theory, if someone enters the house, the watch cats will start hissing and clawing and otherwise alert me to the individual's presence. In practice, the proper operation of watch cats is inversely proportional to how tired they are... and to how likely the intruder is to give them yummy food.

They're not 100% reliable.

The third monitoring system was me. On some level I was aware that something wasn't right, and the smell of cigarette smoke did wake me. However, while the monitoring was effective (I woke up), the monitor was not (I ignored the problem and went back to sleep).

Thus, all three of my monitoring systems failed, largely due to operational problems. I have corrected this by making sure that my security system is on, even when I am home. As with many operational challenges, the hard part is taking the same action often enough for it to become a habit. Once you reach that point, the operational costs are effectively zero.

My questions to you:

  1. What are your primary resources that need protection?
  2. How do you ensure that you know when they are affected?