Having recently purchased a high-end PC with an overclocked ATI 4870 video board, I discovered something very strange: every 2 or 3 minutes, the on-board fan would start up at full speed, then slow down and stop after 20 secs. Very annoying. More surprisingly, I discovered that many users had the same problem. I know that in the PC world, the user is not the center of interest of the designers - the hardware is - but still, come on, it's a high-end board, with everything needed to regulate itself automatically. You would expect some minimal user-friendly behavior for this quality. Who would want to have a B747 take off in his office every couple of minutes?... It seems the ATI engineers don't mind. Nothing in the Catalyst Control Center can be found to control this.
And on top of this, it creates thermal cycles! It is not a good idea to go back and forth between two temperatures. I don't think this is the best way to improve your equipment's life expectancy.
Anyway, I found the nice RivaTuner to do the regulation job in Windows. Kind of ironic, when you think of it, to end up using a utility made for nVidia boards to control my ATI board. Oh well...
Now, what about my Linux Ubuntu session? Hmmm, nothing in aticonfig. Or very little. After some research, I found a nice article on the French Ubuntu forum with the start of a Perl script. Cool. Now I can play. The result is the atifan script, which I wrote to get to the bottom of this.
I hope this will be useful to other users as well.
After reading the legal notice at the bottom of the page, download the atifan file:
Download atifan (last update: June 7, 2009)
Decompress it (gzip -d atifan.gz) and you are ready to play.
As you will see if you read the script, it simply uses the aticonfig tool, and performs some stats around it. No rocket science there.
Basically, the tool gives you 3 main features:
Let's get the status of the fan and the temperature:
$ ./atifan --status
Fan: 0% - Temperature: 56.50 C - GPU: 2%
Now, let's watch our fan going crazy every two minutes by using the monitoring option:
$ ./atifan --monitoring
Fan: 0% - Temperature: 57.50 C - GPU: 2%
Fan: 0% - Temperature: 57.00 C - GPU: 2%
Fan: 0% - Temperature: 56.50 C - GPU: 2%
Fan: 0% - Temperature: 56.50 C - GPU: 0%
Fan: 0% - Temperature: 58.00 C - GPU: 2%
Fan: 0% - Temperature: 57.50 C - GPU: 2%
Fan: 0% - Temperature: 58.00 C - GPU: 2%
Fan: 0% - Temperature: 58.00 C - GPU: 2%
Fan: 0% - Temperature: 59.00 C - GPU: 2%
Fan: 0% - Temperature: 59.50 C - GPU: 2%
Fan: 0% - Temperature: 59.50 C - GPU: 2%
Fan: 0% - Temperature: 60.50 C - GPU: 2%
Fan: 0% - Temperature: 60.50 C - GPU: 1%
Fan: 0% - Temperature: 60.50 C - GPU: 2%
Fan: 0% - Temperature: 61.00 C - GPU: 2%
Fan: 0% - Temperature: 61.50 C - GPU: 0%
Fan: 0% - Temperature: 62.00 C - GPU: 2%
Fan: 0% - Temperature: 62.50 C - GPU: 2%
Fan: 0% - Temperature: 62.50 C - GPU: 2%
Fan: 0% - Temperature: 63.00 C - GPU: 0%
Fan: 0% - Temperature: 63.50 C - GPU: 2%
Fan: 100% - Temperature: 64.00 C - GPU: 2%
Fan: 100% - Temperature: 63.50 C - GPU: 2%
Fan: 31% - Temperature: 62.50 C - GPU: 2%
Fan: 31% - Temperature: 61.50 C - GPU: 2%
Fan: 31% - Temperature: 60.50 C - GPU: 2%
Fan: 13% - Temperature: 60.50 C - GPU: 1%
Fan: 13% - Temperature: 59.50 C - GPU: 2%
Fan: 13% - Temperature: 59.50 C - GPU: 2%
Fan: 13% - Temperature: 58.50 C - GPU: 2%
Fan: 0% - Temperature: 57.00 C - GPU: 2%
Fan: 0% - Temperature: 57.50 C - GPU: 2%
Fan: 0% - Temperature: 57.00 C - GPU: 0%
Here we get a sample every 2 secs. As you can see, with the fan at 0%, the temperature increases steadily. Then at 64 C, the board safety kicks in and blows the fan at 100%, then slows it down as the temperature drops: 31%, then 13%, then 0%. Until the temperature goes back up again, etc.
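If you would rather not read the log by eye, the cycle can be picked out automatically. Here is a minimal sketch, in Python rather than the script's Perl, that assumes only the line format shown above. The names parse_line and fan_changes are hypothetical helpers of mine, not part of atifan.

```python
import re

# Matches lines such as "Fan: 100% - Temperature: 64.00 C - GPU: 2%".
LINE_RE = re.compile(
    r"Fan:\s*(\d+)%\s*-\s*Temperature:\s*([\d.]+)\s*C\s*-\s*GPU:\s*(\d+)%"
)

def parse_line(line):
    """Return (fan_percent, temperature_c, gpu_percent), or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2)), int(m.group(3))

def fan_changes(lines):
    """Return (old_fan, new_fan, temperature) for each change in fan speed."""
    changes = []
    previous = None
    for line in lines:
        sample = parse_line(line)
        if sample is None:
            continue  # skip anything that is not a status line
        fan, temp, _gpu = sample
        if previous is not None and fan != previous:
            changes.append((previous, fan, temp))
        previous = fan
    return changes

# A few lines taken from the monitoring session above.
sample_log = [
    "Fan: 0% - Temperature: 63.50 C - GPU: 2%",
    "Fan: 100% - Temperature: 64.00 C - GPU: 2%",
    "Fan: 100% - Temperature: 63.50 C - GPU: 2%",
    "Fan: 31% - Temperature: 62.50 C - GPU: 2%",
    "Fan: 13% - Temperature: 60.50 C - GPU: 1%",
    "Fan: 0% - Temperature: 57.00 C - GPU: 2%",
]

for old, new, temp in fan_changes(sample_log):
    print(f"fan {old}% -> {new}% at {temp:.2f} C")
```

Run on a longer saved log, the temperature reported next to each jump to 100% tells you where your own board's safety threshold sits.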
OK, I'm tired of hearing this noise. Let's set the fan speed to a low 10% hum:
$ ./atifan 10
Much better. Do another ./atifan --monitoring to watch your temperature. If you are not doing anything fancy, you should see your temperature stabilize and then cool down. On my well-cooled PC, it goes down to 45.50 C after a while. Nice.
Our next task is to learn a little bit more about our machine. Every PC has different cooling and a different video board. So we are going to find out what the "usual" temperature levels are for each fan speed. We are going to do this with an idle machine and then with a busy machine.
$ ./atifan --analyze
Fan: 100% - Temperature: ...........................42.075 C - GPU: 0%
Fan: 90% - Temperature: ....................41.475 C - GPU: 2%
Fan: 80% - Temperature: ....................41.8 C - GPU: 2%
Fan: 70% - Temperature: ....................41.875 C - GPU: 0%
Fan: 60% - Temperature: ....................42.05 C - GPU: 0%
Fan: 50% - Temperature: ....................42.75 C - GPU: 0%
Fan: 40% - Temperature: ....................44 C - GPU: 0%
Fan: 30% - Temperature: .....................45.325 C - GPU: 0%
Fan: 20% - Temperature: ....................45.925 C - GPU: 2%
Fan: 10% - Temperature: ....................46.025 C - GPU: 2%
This took some time (about 14 minutes)! This is because for each fan speed, the script waits for the temperature to stabilize over a span of at least 20 samples taken every 4 seconds. So you will wait at least 1 minute and 20 secs for each fan speed.
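To make the waiting time concrete, here is a small Python sketch (again, not the script's own Perl). The stability test is an assumption on my part, based on the sample_spread setting described at the end of this page: a window is taken as stable when every sample lies within a given percentage of the window's average. The function names are hypothetical.

```python
from statistics import mean

def is_stable(samples, spread_percent):
    """True if every sample lies within +/- spread_percent of the mean."""
    average = mean(samples)
    tolerance = abs(average) * spread_percent / 100.0
    return all(abs(s - average) <= tolerance for s in samples)

def minimum_wait_seconds(sample_count, sample_pause):
    """Shortest possible wait per fan speed: one full window of samples."""
    return sample_count * sample_pause

# Figures quoted in the text: 20 samples, 4 seconds apart, 10 fan speeds.
per_speed = minimum_wait_seconds(20, 4)
print(per_speed)               # seconds per fan speed
print(per_speed * 10 / 60.0)   # minutes for the whole sweep, at best
```

With 80 seconds per speed and 10 speeds, the sweep cannot finish in much less than 13 minutes. Any speed that needs extra samples before it settles adds to that, which fits the 14 minutes or so measured above.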
As you can see, the various temperatures are quite similar (at least in my personal case). The temperature goes up as the fan slows down, but not by much. This means that if you are doing regular desktop tasks, you can leave the fan at only 10% and not worry about it.
Now, let's try under some stress conditions. To do this, we are going to launch in the background the atiode application, which is ATI's stress utility. Let's set it to run for 20 minutes:
$ atiode -P 1200 -H localhost:0 &
Now run the analysis again. (Note: if during this test the temperature runs out of control and goes too high, the script will immediately abort the analysis and set the fan to 100% as an emergency measure.)
$ ./atifan --analyze
Fan: 100% - Temperature: .....................47.625 C - GPU: 36%
Fan: 90% - Temperature: ....................47.625 C - GPU: 37%
Fan: 80% - Temperature: ....................48.025 C - GPU: 37%
Fan: 70% - Temperature: ....................48.375 C - GPU: 36%
Fan: 60% - Temperature: ....................49 C - GPU: 37%
Fan: 50% - Temperature: ....................49.975 C - GPU: 37%
Fan: 40% - Temperature: .....................51.825 C - GPU: 37%
Fan: 30% - Temperature: .....................53.4 C - GPU: 36%
Fan: 20% - Temperature: ....................54.1 C - GPU: 37%
Fan: 10% - Temperature: ....................54.475 C - GPU: 36%
Interesting: at 36% of GPU load, we are able to keep the temperature to reasonable values even with low fan speeds. Then again, maybe this is because my PC has lots of fans and cooling? Surely, your own values will vary. But it still gives you some numbers to think about. Too bad atiode does not allow us to make more GPU-intensive tests.
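To put a number on "reasonable", the two analysis runs can be compared speed by speed. This Python sketch only does the subtraction on the averages copied from my two runs above. The function name temperature_rise is a hypothetical helper, and your own figures will differ.

```python
# Average temperatures (degrees C) per fan speed (%), copied from the two runs.
idle = {100: 42.075, 90: 41.475, 80: 41.8, 70: 41.875, 60: 42.05,
        50: 42.75, 40: 44.0, 30: 45.325, 20: 45.925, 10: 46.025}
load = {100: 47.625, 90: 47.625, 80: 48.025, 70: 48.375, 60: 49.0,
        50: 49.975, 40: 51.825, 30: 53.4, 20: 54.1, 10: 54.475}

def temperature_rise(idle, load):
    """Return {fan_speed: load_temp - idle_temp} for speeds present in both."""
    return {speed: load[speed] - idle[speed] for speed in idle if speed in load}

for speed, rise in sorted(temperature_rise(idle, load).items(), reverse=True):
    print(f"fan {speed:3d}%: +{rise:.3f} C under load")
```

On my machine the extra heat from the 36% load costs about 5.5 C at full fan speed and about 8.5 C at 10%, so the slow fan pays a visible but modest penalty.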
Time to do some serious stuff and have some fun: let's use the Phoronix Test Suite!
The Phoronix Test Suite runs too fast for the analyze mode to stabilize, so let's manually set the fan to 100% and run the test. Then run it again at 75%. Then at 50%. Every time, watch the temperature and the GPU load to get a sense of the relationship between fan speed, temperature and GPU load. For example, in my case, at fan=50%:
Fan: 50% - Temperature: 55.50 C - GPU: 72%
Fan: 50% - Temperature: 55.50 C - GPU: 62%
Fan: 50% - Temperature: 55.50 C - GPU: 75%
Fan: 50% - Temperature: 56.00 C - GPU: 78%
Fan: 50% - Temperature: 57.00 C - GPU: 87%
Fan: 50% - Temperature: 57.00 C - GPU: 95%
Fan: 50% - Temperature: 57.00 C - GPU: 94%
Fan: 50% - Temperature: 57.50 C - GPU: 97%
Fan: 50% - Temperature: 57.50 C - GPU: 99%
Fan: 50% - Temperature: 55.50 C - GPU: 73%
Fan: 50% - Temperature: 55.50 C - GPU: 58%
Fan: 50% - Temperature: 56.50 C - GPU: 61%
Fan: 50% - Temperature: 57.00 C - GPU: 78%
Fan: 50% - Temperature: 57.00 C - GPU: 88%
Fan: 50% - Temperature: 58.00 C - GPU: 91%
Fan: 50% - Temperature: 56.50 C - GPU: 87%
Fan: 50% - Temperature: 54.50 C - GPU: 61%
Great. We now have a good idea of the kind of temperatures we get when we push the GPU.
We should now be able to draw a profile: for a given temperature and GPU load, I need to run the fan at a given speed.
For example, in my personal case:
0 C: 10%
50 C: 20%
53 C: 30%
56 C: 50%
64 C: 70%
68 C: 100%
(don't use these values as is, your case will vary!)
Edit the atifan file (sorry: I was not able to find a clean way in Perl to set a table in a configuration file, so you have to hack the atifan file directly. If anyone can tell me how to do it, I'd be very grateful).
At the beginning of the file, change these two lines:
@regulation_temp = (0, 50, 53, 56, 64, 68);
@regulation_speed = (10, 20, 30, 50, 70, 100);
The first array is the list of temperature steps, and the second one the fan speed. Very easy.
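To check a table before you trust it, it helps to see which speed each temperature would select. The Python sketch below is my reading of how the two arrays pair up, not the script's actual Perl code: the speed used is the one attached to the highest temperature step that has been reached. The name speed_for is hypothetical.

```python
from bisect import bisect_right

# Example tables from the text; adjust them to your own measurements.
regulation_temp = [0, 50, 53, 56, 64, 68]       # thresholds in degrees C
regulation_speed = [10, 20, 30, 50, 70, 100]    # fan speed in percent

def speed_for(temperature):
    """Return the speed paired with the highest threshold <= temperature."""
    index = bisect_right(regulation_temp, temperature) - 1
    index = max(index, 0)  # below the first threshold: use the first speed
    return regulation_speed[index]

for t in (45.5, 52.0, 57.5, 70.0):
    print(t, speed_for(t))
```

Both lists must have the same length, and the temperature list must be sorted in increasing order for the lookup to make sense.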
Now test your regulation:
$ ./atifan -r -v
Fan: 20% - Temperature: 52.00 C - GPU: 2%
Fan: 20% - Temperature: 52.00 C - GPU: 1%
Fan: 20% - Temperature: 52.00 C - GPU: 0%
Fan: 20% - Temperature: 52.00 C - GPU: 2%
Do some GPU testing while keeping an eye on the output of atifan, and you will see the speed adapting itself automatically as the temperature goes up. Nice. With some experience, you might have to adjust some values.
Happy with it? Now it is time to run it automatically in the background and forget about it. Here I ran into a small difficulty with my Ubuntu configuration. For some reason, I could not make atifan run properly as a daemon. It seems that aticonfig will work only if the current user session has been launched and initialized properly. If someone has some ideas about this problem, it would be nice to tell me! So here is what I did:
Congratulations, you're done!
Here are the options for atifan:
-s or --status : give the current fan speed, temperature and GPU load
-m or --monitoring : give the status every 2 seconds. Ctrl-C to stop
-a or --analyze : analyze mode: start at full fan speed, then reduce speed and check temperature
-r or --regulate : regulation mode: using the internal tables, adapt fan speed to temperature
-v or --verbose : use if you want to see status output while regulating
You can change some default settings by editing the /etc/atifan file:
max_temp = 64 # Maximum allowed temperature during an analysis or a monitoring. Go to 100% in emergency.
sample_count = 20 # Number of samples to use for computing an average temperature
sample_pause = 2 # Number of seconds to pause between each sample
sample_spread = 1.5 # Relative error allowed around the average value to consider a value as stable. E.g 1.5 means +/-1.5%
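The format above is simple enough to read from other tools. This Python sketch assumes the file holds only "key = value" lines with optional # comments, as in the listing. How atifan itself reads the file is not shown here, and parse_settings is a hypothetical name.

```python
def parse_settings(text):
    """Parse 'key = value # comment' lines into a dict of floats."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop any trailing comment
        if not line or "=" not in line:
            continue                          # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = float(value)
    return settings

example = """\
max_temp = 64 # Maximum allowed temperature
sample_count = 20 # Number of samples
sample_pause = 2 # Seconds between samples
sample_spread = 1.5 # Allowed relative error, in percent
"""

settings = parse_settings(example)
print(settings)
```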
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, BUT WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
IMPORTANT NOTICE: THIS WORK IS PROVIDED FOR EXPERIMENTAL PURPOSES ONLY. THERE IS ABSOLUTELY NO WARRANTY WHATSOEVER THAT COMES WITH THIS SOURCE CODE AND THE INFORMATION DESCRIBED IN THIS PAGE. YOU MAY LOSE YOUR DATA AND/OR SUFFER COMPUTER HARDWARE OR SOFTWARE DAMAGE, OR SECURITY BREACHES. YOUR COMPUTER COULD BURN OUT, SET YOUR HOUSE ON FIRE, THEN BURN YOUR NEIGHBORHOOD, WOUND OR KILL YOU, YOUR FAMILY, YOUR FRIENDS AND YOUR PETS. YOU ARE USING ALL THIS AT YOUR OWN RISK. YOU HAVE BEEN WARNED!