Availability management is important to insure that the system is up and ready to go. A very important metric and a very important feature is that you have predictable recovery time whether it’s building access, compute storage, cooling, ...
View more
Availability management is important to insure that the system is up and ready to go. A very important metric and a very important feature is that you have predictable recovery time whether it’s building access, compute storage, cooling, network, etc.
The recovery time is known to be 12 minutes plus or minus 1 minute from any failure. If people know this , then it is established how quickly they will be back up and running. If you can measure these criteria, you can now make the 12 minutes. Again, recovery times of plus and minus 1 minute are not done with human running around trying to figure out how to bring it back; it is done by the computer.