Electronics : Fault Tolerance




The threat model in this chapter consists of EMP, natural radiation, lightning, short circuits caused by electrically conductive liquids and dust, temperature-related malfunction, and loss of contact due to mechanical issues (shock waves, PCB bending, low-quality solder joints, bullet holes, bomb/grenade fragments, etc.).


Shielding

Sea-floor-tolerance

To make electronics sea-floor-tolerant, they must be immersed in a non-conductive liquid that has low viscosity and that neither expands on freezing, as water ice does, nor shrinks. It is important that the liquid gets underneath the microchips, because otherwise the water pressure will press the microchips against the PCB, possibly breaking contacts or the die inside the package. If the liquid contained aluminium powder, it might provide a little more radiation shielding and might also help against EMP and conduct heat. It would be nice if the liquid stayed liquid at all operating temperatures and pressures, because then convection would help with cooling. A low-viscosity liquid should not contain aluminium flakes, however, because they would just settle to the bottom; a glue-like liquid that hardens might contain various flakes.
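To get a feel for the water pressure mentioned above, here is a minimal sketch that estimates the absolute pressure at a given depth from the hydrostatic formula p = p0 + rho * g * h. The constants and the function name pressure_at_depth are assumptions chosen for illustration, and the density of seawater is treated as constant, which is only an approximation.

```python
# Assumed constants, not taken from the text above.
SEAWATER_DENSITY = 1025.0        # kg/m^3, typical value; varies with salinity and temperature
GRAVITY = 9.80665                # m/s^2, standard gravity
ATMOSPHERIC_PRESSURE = 101325.0  # Pa, at the sea surface


def pressure_at_depth(depth_m):
    """Absolute pressure in pascals at depth_m metres, treating the density as constant."""
    return ATMOSPHERIC_PRESSURE + SEAWATER_DENSITY * GRAVITY * depth_m


if __name__ == "__main__":
    for depth in (100, 1000, 4000):
        print(depth, "m:", round(pressure_at_depth(depth) / 1e6, 2), "MPa")
```

Under these assumptions the pressure at 1000 m is roughly 10 MPa, that is, about a hundred atmospheres pressing on any air-filled cavity under a chip.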


Mineral oil of the kind sold by pharmacies, perhaps mixed with ethanol, might be a fine cooling liquid that does not conduct electricity, stays liquid at low temperatures and is non-toxic. Toxic options include "all-climate mineral oil" from car spare parts dealers. For air transport, avionics and space applications it is important that the liquid does not boil in a vacuum at the operating temperature of the circuit. To remove air bubbles from the liquid and to increase the likelihood that the liquid fills all cavities of the PCB, the circuit should be immersed in the liquid in a near-vacuum during manufacturing.


Radiation

The second layer should be lead or some other material that stops gamma radiation. The heavier the atoms in the second layer, the lighter the shield, because it takes that much less material to get the same amount of shielding.

The third, outermost layer must consist of very light atoms, preferably hydrogen, in practice maybe helium, because lead is almost transparent to neutrons, which should be slowed down by collisions with something very light. The second layer also shields against the gamma radiation that is created when the neutrons slow down in the third layer.


Mad Ideas to Evaluate

Maybe, in the case of cars, the control computer might reside in an acid-resistant container filled with gearbox oil, and the container might reside inside the lead-acid battery. That way the lead of the battery acts as a radiation shield, and the heat from the computer also helps to warm up the battery a bit. To increase the heat exchange between the oil in the container and the liquid in the battery, the surface area of the container might be increased by making the container walls mimic the pleated shape of vacuum cleaner HEPA filters.


Digital Electronics

It all boils down to some form of redundancy, be it the duplication of channels or time multiplexing, where the same unit calculates the same thing multiple times and the most frequent answer is considered to be the "correct" one. Instead of having the same unit execute a program segment multiple times, multiple units can perform the operation in parallel, and the most frequent result is then considered to be the correct one. The units might reside in different regions of the robot/device so that a single bullet hits only one of the computing units. Given the relatively low price of microcontrollers in 2016, that kind of duplication can be financially feasible, provided that the application tolerates the signal propagation delay, limited by the speed of light, from one "corner" of the robot or "pyramid door control unit" to another.
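The voting idea described above can be sketched in a few lines. This is only an illustration: the function names majority_vote and run_redundantly are made up for this example, and the sketch demands a strict majority (more than half of the results), which is a stricter rule than "the most frequent answer" and is chosen here so that ties are reported instead of guessed.

```python
from collections import Counter


def majority_vote(results):
    """Return the value that more than half of the results agree on, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    # Strict majority: a tie or a three-way split yields None.
    return value if count > len(results) / 2 else None


def run_redundantly(compute, argument, runs=3):
    """Time redundancy: run the same computation several times, then vote."""
    return majority_vote([compute(argument) for _ in range(runs)])


if __name__ == "__main__":
    # Space redundancy: three units computed in parallel, one of them is wrong.
    print(majority_vote([42, 42, 7]))
    # Time redundancy: the same unit computes the value three times.
    print(run_redundantly(lambda x: x * 2, 21))
```

The same majority_vote function serves both variants: for parallel units the list holds one result per unit, for time multiplexing it holds one result per run.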


Low-level Error Correction

Instead of a single wire or fibre-optic channel there are several, and at the end points there are error correction units. The error correction is totally transparent from the assembly programmer's point of view.

With the exception of long-distance communication (radio links, cables, laser links, etc.), that form of error correction is usually available only to those who can develop their own microchips, which does not include the "lay developer" because of the project setup costs at silicon foundries. Even if the money were available, there would still be the issue of lack of flexibility, because the error correction would have to be designed into the die, and that rules out the use of available, very elaborate, commercial digital electronics components.
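In real hardware the units at the channel end points would be logic gates, not software, but the principle of the multi-channel scheme can be shown with a small software model. The sketch below assumes exactly three parallel channels and a bitwise 2-of-3 majority; both the channel count and the function name vote_bits are assumptions made for illustration, and other schemes are possible.

```python
def vote_bits(a, b, c):
    """Bitwise 2-of-3 majority of the words received on three channels."""
    # A result bit is 1 exactly when at least two of the three inputs have it set.
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    sent = 0b10110100
    # Each channel may flip bits; here two channels are hit at different positions.
    channel_a = sent ^ 0b00000001
    channel_b = sent ^ 0b10000000
    channel_c = sent
    print(vote_bits(channel_a, channel_b, channel_c) == sent)
```

The vote recovers the word as long as no bit position is corrupted on more than one channel at the same time.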


Error Correction Codes

...has its own chapter


Redundancy by Repeated Run-time Testing

There are multiple units, preferably made by different manufacturers at different silicon foundries, and there is a slow, primitive signal multiplexer that is radiation tolerant because it is assembled from primitive, HUGE, discrete components. There is some form of primitive, radiation-tolerant read-only memory (fuse memory, a PCB with plotter-applied conductive paint at junctions, a PCB with drilled holes instead of burnt fuses, etc.) that contains test data and the expected test results. A system built from discrete transistors pseudo-randomly selects a test case and feeds it to all of the units. The results are checked by a primitive AND-array that is built from the same HUGE and SLOW, radiation-resistant, discrete transistors. The units that fail are switched off, and the multiplexer directs the output signals from one of the functional units to the control lines.


The inputs to the units can do without any DEMUX, because there might be a discrete resistor, relatively huge both in value and in physical size, at every input of every unit, and the current on the input bus might be provided by radiation-resistant discrete transistors. That way it is guaranteed that if a faulty computation unit, maybe a microcontroller, switches its input pins to output pins and starts to sink or source current through those pins, it still cannot interfere with the signal on the input bus.
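The effect of the series resistors can be checked with a simple voltage-divider model. The sketch below treats the bus driver and the faulty pin as two voltage sources with series resistance that meet at the bus node, and ignores the tiny input currents of the healthy units. All component values in the example are assumptions picked for illustration, as is the function name bus_voltage.

```python
def bus_voltage(v_drive, r_drive, r_series, r_pin, v_fault=0.0):
    """Bus node voltage when one faulty pin drives v_fault through r_series + r_pin."""
    r_fault_path = r_series + r_pin
    # Two sources with series resistance joined at one node (Millman's theorem).
    return (v_drive / r_drive + v_fault / r_fault_path) / (1 / r_drive + 1 / r_fault_path)


if __name__ == "__main__":
    # Assumed values: 5 V logic, 50 ohm driver, 25 ohm pin driver, pin stuck at 0 V.
    with_resistor = bus_voltage(5.0, 50.0, 100e3, 25.0)
    without_resistor = bus_voltage(5.0, 50.0, 0.0, 25.0)
    print(round(with_resistor, 3), "V with a 100 kilo-ohm series resistor")
    print(round(without_resistor, 3), "V without a series resistor")
```

With these assumed values the bus stays within a few millivolts of 5 V when the resistor is present, while without it the faulty pin pulls the bus down to about a third of the supply voltage.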


Analog Electronics

Electrostatic Discharge (ESD)

ESD References


References