Hardware monitoring

Abstract

Whilst all computer hardware will eventually fail, there are many measures which can be taken to reduce the likelihood of a failure or to predict when one is imminent. Most modern components, such as motherboards, CPUs, memory modules, graphics adapters and hard disks, have some inbuilt capability to monitor their own health. These capabilities vary, but usually consist of measuring some combination of voltage, temperature, vibration or rotation speed in order to predict when a device requires routine maintenance or is reaching the end of its life and needs replacing.

In this section we shall look at the hardware monitoring capabilities of the Linux kernel, then install and configure software to monitor and record voltages, temperatures and fan speeds. We shall also examine the use of applications to monitor and log SMART (Self-Monitoring, Analysis and Reporting Technology) data from hard disk drives. Finally, we shall explain how to configure applications to use the Simple Network Management Protocol (SNMP) to make this information available over a network, and how to monitor and log it.

Contents