Errors can occur for several reasons when performing an IO operation. A disk read, for example, can fail because:
- an earlier write operation failed
- the sector has gone bad due to age or use
Data can also be corrupted during transmission, especially over a long network path. To detect this, a parity bit is added to a series of bits so that the total number of 1s in the string is even (even parity) or odd (odd parity). The receiver recounts the 1s, and a mismatch means the data was corrupted along the way.
Parity is not foolproof. The parity bit itself can be corrupted, and a single parity bit only catches an odd number of flipped bits, so some errors go unnoticed. When an error is detected, the data has to be transmitted again, which slows down the data transfer.
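As a minimal sketch of the parity scheme described above (the function names `parity_bit` and `check` are made up for this illustration), the following shows a single flipped bit being caught and a double flip slipping through:

```python
def parity_bit(bits, even=True):
    """Return the bit that makes the total number of 1s even (or odd)."""
    ones = sum(bits) % 2
    return ones if even else 1 - ones

def check(frame, even=True):
    """True if the frame (data bits plus parity bit) has the expected parity."""
    return sum(frame) % 2 == (0 if even else 1)

data = [1, 0, 1, 1, 0, 0, 1]           # four 1s
frame = data + [parity_bit(data)]      # even parity, so the parity bit is 0

one_flip = frame.copy()
one_flip[2] ^= 1                       # one bit corrupted in transit

two_flips = one_flip.copy()
two_flips[5] ^= 1                      # a second bit corrupted as well

print(check(frame))      # True: frame arrived intact
print(check(one_flip))   # False: the single error is detected
print(check(two_flips))  # True: two errors cancel out and go unnoticed
```

Note that parity only says that something is wrong. It cannot say which bit changed, which is why the usual response is to send the data again.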
An error correcting code (ECC) includes multiple parity bits to permit the detection and automatic correction of some number of erroneous bits. The number of bits that can be detected and corrected depends on the number of parity bits and the encoding scheme employed.
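To make the idea of multiple parity bits concrete, here is a small sketch using the classic Hamming(7,4) code. The text above does not name a particular encoding scheme, so this choice is an assumption made purely for illustration. It adds three parity bits to four data bits and can correct any single flipped bit.

```python
def hamming74_encode(d):
    """Encode 4 data bits as 7 bits laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Fix at most one flipped bit; return (data bits, error position or 0)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3      # the failed checks spell out the position
    if pos:
        c[pos - 1] ^= 1             # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], pos

word = hamming74_encode([1, 0, 1, 1])
damaged = word.copy()
damaged[4] ^= 1                     # corrupt position 5
fixed, where = hamming74_decode(damaged)
print(fixed, where)                 # [1, 0, 1, 1] 5
```

Each parity bit watches a different subset of positions, so the combination of failed checks points at the exact bit that changed. Two flipped bits in the same word would be miscorrected by this particular code, which is the trade-off the paragraph above describes.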
Bad Block Detection and Handling
When a sector is determined to be bad, it can be flagged as bad and ignored by the OS from then on. If too many sectors go bad, the system should alert the user.
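Once a sector has gone bad, there are two common ways to replace it: sector forwarding (the bad sector is remapped, via a pointer, to a spare sector elsewhere on the disk) or sector slipping (the bad sector is skipped and each following sector moves down one position, up to the next spare).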
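The difference between the two approaches is easier to see on a toy logical-to-physical sector map. This is only a sketch: the functions `forward` and `slip` are hypothetical, and it assumes a single spare sector sitting right after the last sector in use.

```python
def forward(mapping, bad_physical, spare):
    """Sector forwarding: only the bad sector is redirected to the spare."""
    return [spare if p == bad_physical else p for p in mapping]

def slip(mapping, bad_physical, spare):
    """Sector slipping: skip the bad sector and shift the rest down by one,
    so the last logical sector ends up on the spare."""
    i = mapping.index(bad_physical)
    return mapping[:i] + mapping[i + 1:] + [spare]

mapping = [0, 1, 2, 3, 4, 5]        # logical sector n lives at physical n
print(forward(mapping, 3, 6))       # [0, 1, 2, 6, 4, 5]
print(slip(mapping, 3, 6))          # [0, 1, 2, 4, 5, 6]
```

Forwarding touches a single entry but leaves one sector far away from its neighbours. Slipping keeps the sectors in physical order at the cost of moving every sector after the bad one.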
Stable Storage
This is the idea that some data must never be lost or corrupted. The book suggests “banking data” as an example, but a lot of data falls into this category: health records, company finances, and so on.
Stable storage is an approach to data management that uses redundancy, along with a strict protocol for reading, writing, and error recovery, to guarantee that all data remains consistent in the presence of media and crash failures.
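One way such a protocol can look is sketched below. It assumes the common textbook arrangement of two copies of every block, written one after the other, with a checksum used to spot a damaged copy. The in-memory list `copies` and all function names are stand-ins invented for this example, not a real storage API.

```python
import zlib

def make_block(data):
    """Store the data together with a checksum so damage can be detected."""
    return {"data": data, "crc": zlib.crc32(data)}

def is_valid(block):
    return block is not None and zlib.crc32(block["data"]) == block["crc"]

def stable_write(copies, data):
    """Write copy 0 first, then copy 1; the write is done only after both."""
    copies[0] = make_block(data)
    copies[1] = make_block(data)

def recover(copies):
    """After a crash, bring the two copies back into agreement."""
    a_ok, b_ok = is_valid(copies[0]), is_valid(copies[1])
    if a_ok and not b_ok:
        copies[1] = dict(copies[0])      # repair the damaged second copy
    elif b_ok and not a_ok:
        copies[0] = dict(copies[1])      # repair the damaged first copy
    elif a_ok and b_ok and copies[0]["data"] != copies[1]["data"]:
        copies[1] = dict(copies[0])      # crash hit between the two writes
    return copies

def stable_read(copies):
    for block in copies:
        if is_valid(block):
            return block["data"]
    raise IOError("both copies are damaged")

copies = [None, None]
stable_write(copies, b"balance=100")
copies[0] = make_block(b"balance=250")   # simulate a crash after the first write
recover(copies)
print(stable_read(copies))               # b'balance=250'
```

Because the two copies are never written at the same moment, a crash can damage at most one of them, and recovery always has one intact copy to restore from.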
A common way to provide this redundancy is a RAID (Redundant Array of Independent Disks) system. Depending on the RAID level used, it offers faster access, duplication of data for safety, or both.
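As a small illustration of how redundancy lets a RAID array survive a lost disk, the sketch below uses XOR parity across three data blocks, in the style of the parity-based RAID levels. The text above does not say which RAID level is meant, so treat this as one possible example. The byte strings stand in for disk blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data_disks = [b"ABCD", b"EFGH", b"IJKL"]
parity_disk = xor_blocks(data_disks)

# Disk 1 fails: rebuild its block from the surviving disks plus parity.
rebuilt = xor_blocks([data_disks[0], data_disks[2], parity_disk])
print(rebuilt)   # b'EFGH'
```

This is the same parity idea used for transmission errors, applied across disks instead of across bits in a message.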
Error Handling with IO was originally found on Access 2 Learn