What is a word for the number of failures of a system?

Normally systems are rated with an availability metric, which is used to describe downtime, as opposed to a count of failures (a count of failures alone does not communicate how long the system is out of action).

Here is a snippet from the Wikipedia entry for availability.

In reliability theory and reliability engineering, the term availability has the following meanings:

  • The degree to which a system, subsystem or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, i.e. a random, time. Simply put, availability is the proportion of time a system is in a functioning condition. This is often described as a mission capable rate. Mathematically, this is expressed as 100% minus unavailability.
  • The ratio of (a) the total time a functional unit is capable of being used during a given interval to (b) the length of the interval.

For example, a unit that is capable of being used 100 hours per week (168 hours) would have an availability of 100/168. However, typical availability values are specified in decimal (such as 0.9998). In high availability applications, a metric known as nines, corresponding to the number of nines following the decimal point, is used. With this convention, "five nines" equals 0.99999 (or 99.999%) availability.
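The two ideas in that snippet, the availability ratio and the "nines" convention, can be sketched in a few lines of Python. This is only an illustration: the function names `availability` and `nines` are made up for this example, and `nines` takes the value as a decimal string (an assumption made here to sidestep floating-point rounding).

```python
def availability(usable_hours: float, interval_hours: float) -> float:
    """Ratio of the time a unit is usable to the length of the interval."""
    if interval_hours <= 0:
        raise ValueError("interval_hours must be positive")
    return usable_hours / interval_hours


def nines(availability_text: str) -> int:
    """Count the consecutive 9s that directly follow the decimal point.

    The value is passed as a string such as "0.99999" (hypothetical
    convention for this sketch) so that no float rounding is involved.
    """
    _, _, fraction = availability_text.partition(".")
    count = 0
    for digit in fraction:
        if digit != "9":
            break
        count += 1
    return count


if __name__ == "__main__":
    # A unit usable 100 hours out of a 168-hour week.
    print(availability(100, 168))
    # "Five nines" availability.
    print(nines("0.99999"))
```

Note that a value such as 0.9998 counts as three nines under this convention, since only the leading run of 9s is counted.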


I'm not sure if it would fit your needs. However, you may use failure rate.

Failure rate is the frequency with which an engineered system or component fails, expressed in failures per unit of time.

Source: Wikipedia
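As a rough sketch of that definition, a failure rate is just a failure count divided by the time over which the failures were observed. The function name and the choice of hours as the unit are assumptions made for this example.

```python
def failure_rate(failure_count: int, operating_hours: float) -> float:
    """Failures per hour of operation (the unit of time is an assumption)."""
    if operating_hours <= 0:
        raise ValueError("operating_hours must be positive")
    return failure_count / operating_hours


if __name__ == "__main__":
    # 3 failures observed over 1500 hours of operation.
    print(failure_rate(3, 1500))
```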


If you mean the total effort required to correct the failures, this could be design debt or technical debt.

Physical systems have a service life, with the related concepts of "mean time to failure" and "mean time between failures". This functional assessment of the overall fitness of a system is more meaningful than the count of components at risk of failure.
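For illustration, mean time between failures can be estimated as the average of the observed operating periods between consecutive failures. The function name and the sample figures below are invented for this sketch.

```python
from statistics import mean
from typing import Sequence


def mean_time_between_failures(uptimes_hours: Sequence[float]) -> float:
    """Average length of the operating periods between failures, in hours."""
    if not uptimes_hours:
        raise ValueError("need at least one observed operating period")
    return mean(uptimes_hours)


if __name__ == "__main__":
    # Hypothetical operating periods (hours) between successive failures.
    print(mean_time_between_failures([100.0, 200.0, 300.0]))
```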

Any system can be assessed for defects, giving a count of items that need to be corrected to put the system in a state of good repair.


Finally, something I have a unique perspective on.
Failure modes are a countable method of failure identification and classification. The term can have different meanings depending on viewpoint. For a missile launch officer in a silo, failure modes basically correspond to the list of fault indications on their fault panel. To missile maintenance Job Control, there is finer granularity, because fault combinations are understood as separate modes and generate particular responses. To the guy wrench-bending in the silo, there are thousands of failure modes. Maintenance Technical Orders (a wall full of books in the case of Minuteman) provide an ordered list based on fault indications and statistical analysis of past occurrences. If that didn't work, they would contact me. I could call up every maintenance action ever performed on the system since it was built; every one had been reviewed and assigned a failure code after the repair had been made.

See here - https://en.wikipedia.org/wiki/Failure_mode_and_effects_analysis


retries

After a predetermined number of failures, the safety chain on equipment such as furnaces will cause an "ignition lockout due to retries." This will be indicated by the diagnostic indicator (an LED), giving you an error code.
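The lockout behaviour can be sketched as a simple counter: keep retrying until the attempt succeeds or the failure count reaches a limit. This is a generic illustration, not how any particular furnace controller is implemented; `LockoutError`, `run_with_lockout`, and the default limit of 3 are all hypothetical.

```python
from typing import Callable


class LockoutError(RuntimeError):
    """Raised when the number of failed attempts reaches the limit."""


def run_with_lockout(attempt: Callable[[], bool], max_failures: int = 3) -> int:
    """Call attempt() until it returns True.

    Returns the number of failed tries before success, and raises
    LockoutError once max_failures failed tries have accumulated.
    """
    failures = 0
    while failures < max_failures:
        if attempt():
            return failures
        failures += 1
    raise LockoutError(f"lockout after {failures} failed attempts")


if __name__ == "__main__":
    # Hypothetical ignition sequence: fails twice, then lights.
    outcomes = iter([False, False, True])
    print(run_with_lockout(lambda: next(outcomes)))
```

Here the return value is the retry count, which is the number the lockout logic compares against its limit.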

retry

(of a system) transmit data again because the first attempt was unsuccessful. –Google