Message boards : Number crunching : Does overclocking adversely affect Rosetta?
Author | Message |
---|---|
dondee Joined: 8 Dec 05 Posts: 5 Credit: 4,233,251 RAC: 0 |
I am new at Rosetta and have seen some mention of overclocking affecting the credits for the work done. But does overclocking affect Rosetta results in a negative way? I have not overclocked because there seems to be a consensus that it negatively impacted classic SETI. But I would like to overclock while running Rosetta as long as there are no problems with the results. |
Joined: 17 Sep 05 Posts: 161 Credit: 162,253 RAC: 0 |
"I am new at Rosetta and have seen some mention about overclocking affecting the credits for the work done." I don't recall seeing it mentioned, but two of my AMD CPUs are overclocked and are doing fine. It's a matter of finding the right balance - I have had some problems with work units crashing when I took the overclocking too far. *** Join BOINC@Australia today *** |
Joined: 25 Oct 05 Posts: 576 Credit: 4,700,566 RAC: 3 |
dondee, there is "overclocking", and there is "OVERclocking". Most CPUs sold wind up in PCs put together on an assembly line using the cheapest possible parts. That same CPU, put into a carefully-built machine with quality parts, can be overclocked a great deal without raising the temperature to dangerous levels, and without increasing the number of RAM failures. If done correctly, overclocking can turn a $1000 PC into a box capable of turning out credits as fast or faster, and with the same accuracy, as a $3000 PC. On the other hand, when people get carried away by getting that 'last couple of megahertz' out of a CPU, even though it may not _crash_ every five minutes, it _is_ throwing error after error into the calculations that it's doing. THAT is the negative effect you hear about from overclocking. (Well, that and melting the CPU...) If you look at my computers, the Windows box is an AMD 3700+, which normally runs at 2.2GHz. Mine is running at 2.53GHz, and if I had actually bought decent RAM instead of using RAM that I already had lying around, it could go a _lot_ farther, as it's only running at 47C. The rational temperature limit for this CPU is "under 60C", and my goal was to run it as fast as I could (without causing errors) but stay under 50C, without using any fancy liquid-cooling, etc. However, when I go much higher than 2.53GHz, I start getting RAM failures, so until I can replace the RAM, this is where I sit. Very easy to do, and I've run SETI, Einstein, Predictor, CPDN, Rosetta, and SZTAKI on this PC with zero errors that I can attribute to the overclocking. But it is true that I've seen some overclocked PCs that returned error after error after error. They sure did it real quick though! |
dondee Joined: 8 Dec 05 Posts: 5 Credit: 4,233,251 RAC: 0 |
I am looking at replacing one or two of my units and the budget is a little tight right now. So I thought to get a "slower", less expensive CPU and tweak it for speed a little to make up for not having a "faster", more expensive CPU. I have overclocked before, but it has been a while. There are quite a few forums and boards on the Internet that I have been checking out to get up to par. With all the choices it is hard to pick one. But AMD is the only way to fly. |
Joined: 25 Oct 05 Posts: 576 Credit: 4,700,566 RAC: 3 |
"With all the choices it is hard to pick one. But AMD is the only way to fly." The 3700+ "San Diego" was the one I wound up with, and I have zero complaints. (Okay, one... don't save money on the power supply, get a good one. Lost a week's crunching and a video card.) |
Joined: 17 Sep 05 Posts: 161 Credit: 162,253 RAC: 0 |
"The 3700+ 'San Diego' was the one I wound up with, and I have zero complaints." I'll second that. They have 1MB of L2 cache and can be overclocked to 2.5GHz or more without much effort. I have one running at 2.7GHz with the stock heatsink and fan - halfway between an FX-55 and an FX-57. If the 3700+ is beyond your reach, look for a cheaper Socket 939 CPU. That way you still have an upgrade path to an Athlon X2 later. If you want cheaper than that, maybe look at the Socket 754 Sempron 2800+ or 3100+. *** Join BOINC@Australia today *** |
Joined: 3 Nov 05 Posts: 24 Credit: 2,005,763 RAC: 0 |
"The 3700+ 'San Diego' was the one I wound up with, and I have zero complaints." Most of my problems have been with memory. twin xeon [email protected] |
Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
Then there is Paul's contrarian opinion... If you want a faster CPU, buy one. If, again this is *MY* personal opinion, you are really serious about accurate science you would not be risking "soft-errors" by running the system past posted speeds. Historically, IBM's reputation for reliability came from never running parts anywhere near design limits. Note, I am not making accusations or demonizing ... I have listened to all the arguments about how "safe" it is ... but, I did 20 years in Avionics maintenance and I have some slight amount of knowledge in electronics (training equivalent to a college degree in electrical engineering - granted it was from the 1970s, but, I have not forgotten it all yet), and I am not convinced. And the "proof" of results by redundancy simply means that all the computers in the quorum returned the same story. Which could be a lie ... :) If you want to convince me that the "testing" is valid you have to follow the same logic used in calibration of aircraft components. The testing of each part can be traced through each test instrument all the way back to the NIST standards (or whatever they are calling themselves this year). We don't test the systems on known problems, we merely see if all the lies match ... Anyway, just Paul's contrarian opinion ... Oh, do you really want to take a drug that was raced through the testing process? :) |
Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
"I am new at Rosetta and have seen some mention about overclocking affecting the credits for the work done." The problem is that occasional random errors creep in, and you do not really know at what stage they start to matter. By the time errors crash a WU you have gone way too far. Once Rosetta starts running redundancy checks there will be a fair test - if your results fail validation more than once a month you are overdoing the overclocking (assuming everyone else's results are going through OK). My advice is, out of fairness to the project, don't overclock on a project that does not apply redundancy. At present this implies don't overclock on Rosetta. Just my two pence worth, we will have to wait till New Year for the official project policy on this. River~~ |
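The redundancy check mentioned in the posts above can be sketched roughly as follows. This is only an illustration of the idea of a quorum, not BOINC's actual validator: the function name `validate_quorum`, the rounding to 6 decimal places, and the minimum agreement of 2 are all assumptions made for the example.

```python
from collections import Counter

def validate_quorum(results, min_agree=2, places=6):
    """Return the value that at least `min_agree` hosts agree on
    (after rounding to `places` decimals), or None if no such value exists.

    Hypothetical helper for illustration; real validators are project-specific.
    """
    rounded = [round(r, places) for r in results]
    value, count = Counter(rounded).most_common(1)[0]
    return value if count >= min_agree else None

# Two hosts agree to 6 places, one host returned garbage: the quorum holds.
agreed = validate_quorum([1.2345671, 1.2345672, 9.9])

# No two hosts agree: the work unit would have to be reissued.
no_quorum = validate_quorum([1.0, 2.0, 3.0])

# Paul's objection: agreement says nothing about truth. If every host is
# wrong in the same way, the quorum still "validates".
all_wrong_together = validate_quorum([5.0, 5.0, 5.0])
```

A host whose results keep falling outside the quorum is the kind of signal River describes as "overdoing the overclocking", while the last case shows why agreement alone is not a calibration against a known answer.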
dondee Joined: 8 Dec 05 Posts: 5 Credit: 4,233,251 RAC: 0 |
I realise that an increase in performance beyond designed specs can introduce random anomalies, but there is also always a "safety factor" engineered into a product. AMD seems to have a 20-30% overclockability in a majority of their CPUs. With this in mind I would assume (you know what that means) that one could mildly massage the speed of a CPU without getting into an area of "false production". Just a thought, I will do what is best for the project. I am a stickler for accuracy, but if I can do it a little faster and stay within an acceptable level of accuracy then I am all for it. It would be interesting to know how much overclocking affects the results and whether there is a point at which one could speed up a little without sacrificing quality. I am not a programmer, but if a person can overclock a computer and run programs and play games without "lockups", "freezes" or any other bad results, why would there be an effect on Rosetta (or other similar programs)? This is just reasoning on the level of an average user. I am not implying or encouraging the use of overclocking, just thinking out loud. If Rosetta comes out with an official policy on overclocking, I will respect it. I appreciate the responses to this thread and hope that I may get some other "thinking out loud". |
Joined: 25 Oct 05 Posts: 576 Credit: 4,700,566 RAC: 3 |
This is interesting... I can count the number of times I've disagreed with Paul on one hand. River I haven't "known" nearly as long or well, but I've come to respect his opinion a great deal just from what he's said on these various boards. (Paul, I'm going to get long-winded here, and you may well want this for the Wiki somewhere, although I'm sure you'll have a rebuttal. :-) ) Let me discuss my perspective, and rather than being "general", I'll tie it specifically to Rosetta, and the redundancy issue. First, computing is inherently chaotic. That may sound strange when everything is supposedly "binary" and deterministic, but at the levels of speed and size (both physical in the electronics, "very small", and in software complexity, "very large") we're dealing with, random effects make a difference in any system whether overclocked or not. The very fact that you can run the same benchmarks on the same system ten times in a row and get ten different results shows this. The _best_ we can hope for is some statistical level of "certainty" in a result. The way we improve "certainty" is scaling, both in the software and in the processes. Integer mathematics is inherently more "stable" than floating point, because there are many decimal fractions that cannot be represented exactly in binary floating point. This is why any _good_ accounting package will deal solely in "cents" and will stick the decimal point in _after_ all the mathematics is complete. Scientific computing doesn't have this luxury, and it is obvious from what we've seen overall in BOINC that steps are necessary to counter the uncertainties involved; thus both general redundancy and heterogeneous redundancy, to counter differences created by running even the same software on different platforms. The experience of the optimizers on SETI points this out as well; changing a compiler switch can make the difference between the output being "strongly similar" or "weakly similar". 
Even the existence of the _terms_ points out that no two systems can be guaranteed to produce the exact same result. From this perspective, overclocking simply adds one more variable to the equation; but it's not really a _new_ variable, it's just a change to the value of an _existing_ variable, and that is the stability of the CPU/bus/RAM "system". We don't know the value of this "stability" variable. We can attempt to measure it, by running diagnostics like Prime95, and RAM tests, and so on. When we reach a "failure point" with these diagnostics, we know the system is _unstable_ at that speed. This could be at some extreme overclock level, or it could be at "stock" speeds - there are too many physical variables, too many "parts" involved in a modern computer for it to be any other way. Below this unstable point, we cannot say that the system is "stable", we can only say that it is "less unstable". If my PC shows no visible errors during testing at 2.5GHz, and your PC shows no visible errors during testing at 2GHz, we have no way of knowing which is more likely to insert a minor error into a calculation tomorrow. This is where the robustness and design of the software has to take over. The "easy way out" is to have redundancy. Have three or four or twenty-seven computers do the same calculation, and accept the "majority vote" as being correct. (That's a whole other topic, that I won't get into...) With this approach, the software "system" itself is irrelevant, you're basing your ability to certify the accuracy of the results on the redundancy. Rosetta's system however is enough "different" from, say, SETI's, that redundancy simply is not necessary to be able to certify the accuracy. Rosetta is not analyzing a set of data off of some radio telescope tape. They are running an algorithm thousands of times with a set of _random_ inputs, in an attempt to locate the "best" input values. 
The output is not a "yes/no" "this signal exists" result, it is a statistical plot of "this input graphs to be here, _this_ input graphs to be there". Effectively, _every_ WU issued in a series is part of one giant "WU", and we are _wanting_ every returned value to be different, because each is based on a different random number input. It is not necessary to test every single random number in the input range to have a useful outcome. Once a few thousand have been tested, the ones clustering at the "lower left" are known to be the most "likely candidates". From those, the project can either re-run the algorithm with a tighter "range" of random input values, or move on to some other method of analysis. Some input values will simply not be tested, just by the nature of a random number generator. Some input values that _would_ have been tested, will not be returned by the host assigned to test it. Some input values that are tested will be returned with the "wrong" result by the host assigned, for whatever reason. If that "wrong" result is "high and right" instead of "low and left", then it won't be part of the valuable set of results, and effectively it might as well not have been returned at all. Which is okay, because the system isn't needing _every_ value to be tested. This leaves two possible concerns for an overclocked system (or any other system that might not return a "perfect" answer). If there are _enough_ systems returning "wrong" answers to be statistically significant, so the graphs no longer show the expected clustering, but instead are returning effectively random results, then something is seriously wrong, either with the programs being used, or with our entire computer industry. If a _particular_ system returns a result that is _THE_ lowest-left value for that run, then if no further checking was done, it would be a significant problem for that result to be wrong. The project has this covered. 
They re-run the algorithm with that particular input value, on their own systems, and see if they get the same result. If not, they throw it out and pick another. Thus, returning a wrong result that happens to be in "just the right spot" might cost the project some computer time, but will not affect the accuracy of the project's outcome. And if a result is "wrong", the chance that it would be "wrong" in just such a way that it would be "the answer" is vanishingly small. So. If your computer, overclocked or not, is "acceptably functional" - i.e., doesn't crash, doesn't fail Prime95, etc. - then its results running Rosetta are just as likely to be useful as those from any _other_ system that is equally functional. Might overclocking cause you to return a "wrong" result twice out of a thousand instead of once in a thousand? Sure. But the value of returning a thousand results, 998 of which are "perfect", instead of returning 800 results, 799 of which are "perfect", in the same time period, is greater than the "cost". The project has (MUST have) the necessary systems in place to deal with a small number of incorrect results. If your PC is unstable enough to return a _large_ number of incorrect results, or even any more than a very small percentage of incorrect results, then it is likely to be unstable enough to cause "errors" in the processing, instead of successfully running but giving the wrong output value. (A comment: _one_ way to test the stability of your system is to run one of the other BOINC projects that _does_ use redundancy. If you have more than one "successful but invalid" result in a given month, then you know you are past the "unstable" point. I recommend this in _addition_ to periodic "local" tests such as Prime95, RAM check, and so forth.) |
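The argument in the post above (random inputs, keep the lowest-scoring results, re-run the best candidate on trusted hardware) can be sketched as a toy simulation. Everything here is an assumption made for illustration: `energy` is a made-up stand-in for whatever score the real application computes, the 0.2% silent error rate is just the figure floated in the post, and the re-check step is a guess at the general shape of such a procedure rather than a description of the project's actual pipeline.

```python
import random

def energy(x):
    """Hypothetical stand-in for the score of one trial; lower is better."""
    return (x - 0.3) ** 2

def volunteer_result(x, rng, error_rate):
    """Return (input, reported score). A small fraction of scores are
    silently corrupted, as an unstable host might do."""
    if rng.random() < error_rate:
        return x, rng.uniform(-1.0, 1.0)  # wrong answer, no crash
    return x, energy(x)

def best_verified(results, tol=1e-12):
    """Walk candidates from the lowest reported score upward, re-run each on
    a trusted machine, and return the first one whose score is reproduced."""
    for x, reported in sorted(results, key=lambda r: r[1]):
        if abs(energy(x) - reported) <= tol:
            return x, reported
    return None

rng = random.Random(42)
results = [volunteer_result(rng.random(), rng, error_rate=0.002)
           for _ in range(5000)]
best_x, best_score = best_verified(results)
```

In this toy model a corrupted score that lands at the very bottom of the ranking only costs the trusted machine one extra re-run before it is thrown out, which is the "might cost the project some computer time" case. What the sketch does not capture is River's concern below: errors inside the deterministic part of a trial that shift many scores slightly rather than a few scores wildly.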
Joined: 15 Dec 05 Posts: 761 Credit: 285,578 RAC: 0 |
An interesting argument, Bill. My point, and if I understand correctly Paul's as well, is that if you run a CPU faster its results get more random. Your point, if I understand it, is that we are running a random method anyway, so a bit more randomness doesn't make it much worse. This feels wrong to me. Yes, Rosetta is a 'Monte Carlo' program; it uses random numbers to try to cover a representative sample of possibilities. However, the randomised choices are only part of the code. At the other points in the code you want the result to be predictable. A casino uses randomised equipment to select whether you win or not. As we know, the odds are fixed in favour of the House. The wheel needs to be random or they are cheating. If we simulate the roulette wheel with slight random errors it does not affect the overall odds at all. But if the calculation of the payout at 30:1 contains random errors, that spoils the stats. If they randomly pay out at 33:1 too often the house goes broke. That is because the game demands that the calculation of the payout is deterministic even though the decision to make a payout is not. Similarly, if we look at the way biology does it - the immune system is an intricate blend of deterministic and randomised components, as are the attacking viruses and bacteria.
Tempting. Certainly I'd prefer that to no tests at all. I'm still not happy though, because I don't know (and I don't think anyone actually knows) just how similar the effects on one lot of code are to those on another. How the random jiggles propagate differs from one set of code to another. A non-chaotic system can tolerate much bigger random errors than a chaotic one. But I don't actually know. I'd leave it to Jack or David K to say, as they know their own code, they get the benefits of a good call, and they get to grapple with any red herrings introduced if either of us calls it wrong. River~~ PS - thanks for the nice things you said at the top of your posting. I respect your views greatly too, and any disagreement we have doesn't change that at all. And it is an interesting argument you put forward. |
Joined: 25 Oct 05 Posts: 576 Credit: 4,700,566 RAC: 3 |
"Your point if I understand it is that we are running a random method anyway, so a bit more randomness doesn't make it much worse." That's a part of it; the other point is that we don't know if overclocking _really_ makes any given system worse than some _other_ system that _isn't_ overclocked. Yes, it may (or may not!) increase the number of errors on the specific overclocked system - but even that increased number may be lower than the number of errors given by a different PC that's running "stock". I guess the "short version" of my point would be that if the Rosetta system isn't already robust enough to deal with PCs that throw out errors, then the project needs to change something. (I think it already deals with this quite well, but I'm not 'project staff', so I can't swear to that.) If the Rosetta system _is_ already robust enough, then simply raising the number of errors your particular PC is going to throw from, say, 0.1% to 0.2%, by overclocking, or turning on the heat in the room for the winter, or whatever, is irrelevant. I think reasonable testing is enough to ensure that your PC isn't going to be throwing 0.5% errors, and I think it would crash long before the numbers got any higher than that. I don't trust ANY computer's accuracy. And hopefully the project doesn't either. I would be _more_ likely to trust my overclocked AMD64 3700+ than I would someone else's non-overclocked "Wal-Mart special". Would I trust mine more if it wasn't overclocked? Only slightly. It's still running Windows. :-P |
Joined: 17 Sep 05 Posts: 68 Credit: 3,565,442 RAC: 0 |
The only project that I can recollect that actually came out with a complaint about overclockers was LHC@Home. This was a year or so ago, and they asked all overclockers to detach from the project because their results were way off at roughly 6 or 7 places to the right of the decimal point. I also remember them having issues with different processors returning differing results for the same workunit in this 7-places-to-the-right realm. Anyway, just a different angle on the discussion..... Ciao...... |
Joined: 25 Oct 05 Posts: 576 Credit: 4,700,566 RAC: 3 |
"LHC@Home. <snip> roughly 6 or 7 places to the right of the decimal point <snip> issues with different processors" Yep! LHC never got their Mac client working, because the values were "different" and didn't match to the level of accuracy they needed. Different projects need different levels of precision, and LHC needs a lot. |
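One well-known way that correct, non-overclocked hardware can still disagree in the far decimal places is that floating-point addition is not associative, so a different compiler, library, or instruction ordering can change the last digits. The snippet below only demonstrates that general effect in Python; it makes no claim about what actually caused the mismatches at LHC@Home.

```python
import math

# Accumulating 0.1 ten times drifts away from 1.0, because 0.1 has no
# exact binary representation and each addition rounds.
total = 0.0
for _ in range(10):
    total += 0.1

# math.fsum tracks the lost low-order bits and returns the correctly
# rounded sum of the same ten values.
careful_total = math.fsum([0.1] * 10)

# The same three numbers, added in two different orders. Near 1e16 the gap
# between adjacent doubles is 2, so adding 1.0 twice is lost to rounding,
# while adding 2.0 once is not.
big_first = (1e16 + 1.0) + 1.0
small_first = 1e16 + (1.0 + 1.0)

print(total, careful_total)
print(big_first, small_first)
```

When a long calculation feeds such differences back into itself, they can grow well beyond the last bit, which is why projects that compare results between hosts usually compare within a tolerance rather than bit-for-bit.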
Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
We had this debate lo these many years ago on the SETI@Home message boards. And, I promised back then that I would capture the arguments in one place. Well, one down. See the BOINC-Wiki for the result. The original SETI@Home thread can also be read for more. I also have a link in the Wiki to the Wikipedia entry, which should also be read ... For those on both sides of the debate, I will take suggestions on changes. The pro part is weak, and one of the reasons I have trouble with the idea is that the only purported advantage is that the calculations are done faster. Gospel in - Instability != Gospel out ... Anyway, I am not saying you absolutely should not and if you do I will consider you evil. *I* just don't buy the argument... One of the lines in the write-up is from JKeck, where he points out that to run CPDN some people have to underclock ... YMMV. So, if I did not state something fairly, let me know. Oh, and screaming at me is not going to help make the point that I should change the article ... but, if I truly did misrepresent the Pro side, I am willing to listen to rational suggestions. |
Joined: 25 Oct 05 Posts: 576 Credit: 4,700,566 RAC: 3 |
"For those on both sides of the debate, I will take suggestions on changes." Looks good to me - I did make some spelling corrections while I was there... otherwise, I have no problem with any of it. |
Joined: 17 Sep 05 Posts: 815 Credit: 1,812,737 RAC: 0 |
For those on both sides of the debate, I will take suggestions on changes. And I checked spelling 23 times ... :) |
j2satx Joined: 17 Sep 05 Posts: 97 Credit: 3,670,592 RAC: 0 |
I'm of the opinion that overclocking does not necessarily cause any accuracy problems. I overclock some systems because they will take it and still pass diagnostics. Would the counter argument be to "underclock" all computers to make them more accurate? |
Joined: 17 Sep 05 Posts: 211 Credit: 4,246,150 RAC: 0 |
"A Casino uses randomised equipment to select whether you win or not. As we know the odds are fixed in favour of the House. The wheel needs to be random or they are cheating. If we simulate the roulette wheel with slight random errors it does not affect the overall odd at all." I work in a casino and either of those ratios on the roulette wheel would get us in trouble: too much profit, not enough chance for the player. You mean 35:1 and 38:1 or 39:1 most likely. 38:1 is a dead even American (double zero) wheel, 35:1 is the standard payout. Edit: A biased wheel is a bigger deal than you would think as well. The advantage can quickly drop from 5.26% to zero or worse if a number is more favorable and a player notices. Quick hint to roulette players: bet the numbers that have hit recently rather than numbers that have not hit. You have better odds of finding a biased wheel or a sloppy dealer than of guessing which number that has not hit is coming up next. Back to the topic. I am biased against overclocking in general but not adamantly against it. I think this comes from history, most notably the early Pentiums from ~100MHz to ~300MHz. There was no physical difference in these chips; they merely failed quality control tests at different levels. This was standard operating procedure before then as well. I know this has changed these days and it is more likely that a chip is underclocked to meet demand for a specific speed, but I cannot shake this early training. BOINC WIKI BOINCing since 2002/12/8 |
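The 5.26% figure quoted above can be checked with a few lines of exact arithmetic. The helper name `house_edge` is made up for this example, and the calculation assumes a single-number bet with payouts quoted as "N to 1" (the stake is returned on a win).

```python
from fractions import Fraction

def house_edge(pockets, payout_to_one):
    """Average fraction of each unit staked on a single number that the
    house keeps, for a wheel with `pockets` equally likely pockets."""
    p_win = Fraction(1, pockets)
    expected_return = p_win * payout_to_one - (1 - p_win)
    return -expected_return

american = house_edge(38, 35)    # double-zero wheel, standard 35:1 payout
break_even = house_edge(38, 37)  # payout at which the edge vanishes
rivers_30 = house_edge(38, 30)   # the 30:1 figure from the earlier post

print(float(american))    # about 0.0526
print(float(rivers_30))   # about 0.184
```

Under this "to 1" convention the break-even payout on a 38-pocket wheel is 37 to 1, which is the same thing as 38 for 1 if the returned stake is counted as part of the payout; that may be the sense in which 38:1 is called "dead even" in the post. A 30:1 payout would give the house roughly an 18% edge, which fits the remark about too much profit.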
©2025 University of Washington
https://www.bakerlab.org