Diagnose Linux Load Average

What is a "load average"?

On Linux machines, the "top" command is one of the most frequently used commands for checking machine performance. It shows CPU usage, RAM usage and the load average of the machine. The load average represents the average system load over a period of time: roughly, the average number of processes that are running or waiting to run (on Linux it also counts processes in uninterruptible sleep, usually waiting for I/O). In normal cases it should stay under 1.0 per CPU core. You can get the load average by typing "top" or "uptime" in your shell; both print 3 numbers, which represent the load over the last 1, 5 and 15 minutes. An idle CPU results in a load of 0, while a load of 2.5 on a single-core machine means the CPU was overloaded by 150% over the last minute.
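As a minimal sketch of reading the three numbers from a script: on Linux the kernel exposes them in /proc/loadavg, and dividing by the number of cores gives a per-core figure. The function names below (parse_loadavg, read_loadavg, per_core_load) are illustrative and not part of any standard tool.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1-min, 5-min, 15-min) floats."""
    # The file looks like "0.50 1.25 2.00 1/123 4567"; only the first
    # three fields are load averages.
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def read_loadavg(path="/proc/loadavg"):
    """Read the current load averages (Linux only)."""
    with open(path) as f:
        return parse_loadavg(f.read())


def per_core_load(loads, cores):
    """Divide each figure by the core count, so 1.0 means 'all cores busy'."""
    return tuple(round(x / cores, 2) for x in loads)
```

On a Linux machine, per_core_load(read_loadavg(), os.cpu_count() or 1) would give the per-core view; a load of 2.5 on 2 cores, for example, comes out as 1.25 per core.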

How do I diagnose the load average?

You can often tell that your computer is dealing with a heavy load simply by noticing that it is slow and that response times are longer than usual. To find out what is causing the problem, type the "top" command in your shell and focus on the first 3 lines of the output. The important thing in the 1st line is the load average; notice the 3 numbers. As a rough rule of thumb (the right thresholds depend on how many CPU cores you have), if they are between 0 and 5, your computer is fine. If they are between 5 and 10, this is a high load, and if they are more than 10, you have a problem that needs more investigation. If your load is more than 10, take a look at the 3rd line, "Cpu(s):". This is what an ideal 3rd line of the "top" output looks like, with the CPU fully idle:

Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
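The two checks above can be sketched in a few lines of Python. This is only an illustration: classify_load and parse_cpu_line are hypothetical helper names, the 5 and 10 cut-offs are the rule of thumb from this article, and how the exact boundary values are treated is an assumption.

```python
import re


def classify_load(load):
    """Apply the article's rule of thumb to a single load figure."""
    if load < 5:
        return "fine"
    if load <= 10:
        return "high"
    return "investigate"


def parse_cpu_line(line):
    """Turn top's 'Cpu(s):' line into a dict such as {'us': 0.0, 'id': 100.0}."""
    # Fields look like '0.0%us' in older versions of top and '0.0 us' in
    # newer ones, so the percent sign and the spacing are both optional.
    pairs = re.findall(r"(\d+(?:\.\d+)?)\s*%?\s*([a-z]{2})", line)
    return {label: float(value) for value, label in pairs}
```

With the sample line above, parse_cpu_line(...)["id"] is 100.0 and every other field is 0.0.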

1- Check the first 2 numbers (us and sy). They represent the percentage of time the CPU spends running user processes and the kernel, respectively. If together they are constantly at 99-100%, the CPU is the bottleneck, and you have to reduce the workload or upgrade your CPU.

2- Check the 5th number (wa). It represents the percentage of time the CPU spends waiting for I/O. If the number is above 90%, the CPU is spending most of its time waiting for I/O. This could be due to a hard disk problem, a network problem, or an application trying to access data at a higher rate than the network or hard disk is designed for. To find out which process is causing the problem, run this command in your shell: "ps faux". It lists all the processes on your system. Look at the STAT column, where you should see letters like these: R = running, S = sleeping, D = uninterruptible sleep (usually waiting for I/O). Look for any process in state D and investigate what that process is waiting on.
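The search for processes in state D from step 2 can be sketched as follows. Note that this sketch runs "ps -eo pid,stat,comm" instead of "ps faux", only because three fixed columns are easier to parse; find_blocked and list_blocked are hypothetical helper names.

```python
import subprocess


def find_blocked(ps_output):
    """Return (pid, command) pairs for processes whose state starts with 'D'."""
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, stat, rest of line (command name)
        if len(parts) == 3 and parts[1].startswith("D"):
            blocked.append((int(parts[0]), parts[2].strip()))
    return blocked


def list_blocked():
    """Run ps and report processes currently in uninterruptible sleep."""
    out = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_blocked(out)
```

State D is often short-lived, so calling list_blocked() a few times in a row and looking for PIDs that keep reappearing is likely to be more telling than a single snapshot.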

Written By:

Mohamed Sanad

Comments

Nice article about top.
top also has a lot of siblings, including ntop, which shows network monitoring data in the same form that top shows its results.
