Putting Networked PCs On The Buddy System
A longstanding problem in maintaining large computer networks is keeping every machine up to date on the status of all its cohorts. If computer A has failed, for instance, other machines in the net must be alerted quickly so they don't waste time and resources trying to seek its collaboration. Assigning one machine to watch over the others is one obvious solution, but in huge networks that leads to a deluge of status messages clogging the network. Ronald Bianchini Jr., a professor of computer engineering at Carnegie Mellon University, has worked out an "adaptive" computer program, or algorithm, that cuts that message overload to a minimum.
The program runs on all computers in the network and gets each one to monitor a bunch of others. Should one computer fail, the program instantly reassigns the machines under that one's care to another group. As a result, one network of 150 Unix workstations at Carnegie Mellon can check itself out in just 30 seconds, instead of the normal hour and a half. The university plans to commercialize the program with the help of Houston-based TechSource Inc. A TechSource official says IBM is already negotiating a license to use and sell the technology.
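The reassignment idea can be illustrated with a small sketch. This is not the Carnegie Mellon program itself, whose details the article does not give. It assumes, purely for illustration, that the machines are arranged in a fixed ring and that each working machine watches the next working machine in that ring. The function name `assign_monitors` and the machine labels are hypothetical.

```python
def assign_monitors(nodes, failed):
    """Map each working machine to the next working machine in ring order.

    Assumption for illustration only: one watcher per machine, ring ordering.
    """
    alive = [n for n in nodes if n not in failed]
    if len(alive) < 2:
        return {}  # nothing left to watch
    return {n: alive[(i + 1) % len(alive)] for i, n in enumerate(alive)}


nodes = ["A", "B", "C", "D", "E"]

# All machines healthy: B watches C, and C watches D.
before = assign_monitors(nodes, failed=set())

# C fails: B skips over it and takes on D, the machine C used to watch.
after = assign_monitors(nodes, failed={"C"})

print(before)
print(after)
```

In this simplified arrangement, each working machine checks only one neighbor per round, and a failure changes the duties of only the machine that was watching the failed one.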