Distributed Hash Table

Distributed Hash Table #

It is similar to a normal HashMap, but distributed over several nodes. You could try to build a distributed cloud storage with it. One product for this is DynamoDB. When a new node is added, only a small part of the data should be moved and not all of it. Keyword for this is consistent hashing. But you could also build a distributed cache. A product for this is called Memcached.

The whole system is based on a common address space. All objects and all nodes use the same address space and are logically arranged in a ring. The node with the smallest key k, which is greater or equal to the key of the object, is responsible for the object. This is also called (consistent hashing). In the diagram below, node 4 would be responsible for all objects from 2 to 4 inclusive.

Chord Diagramm

The table next to the ring is the finger table, which ensures a logarithmic access time to all nodes.

But this hash table has some problems. If the nodes, usually computers, can choose the key themselves, a malicious user can delete objects from the hash table. All that is needed is to choose a key that makes the malicious user responsible for these objects, which causes the old responsible to transfer its data to the new node. Another attack results in the disruption of a node by placing the malicious user directly in front of it, so that the old responsible is no longer responsible for any objects. In both cases, incorrect information can be returned.

The data of the “successor” and the “predecessor” must always be up-to-date. Whether the “successor” is still current can be found out with a trick. The node asks for the “predecessor” of the “successor” and if this is larger than the node itself but smaller than the current “successor”, the node can take it over as “successor”.

The “predecessor” can keep the “successor” up to date via regular notifications. The successor takes it over if it is larger than the current predecessor but smaller than the node itself.

If nodes disappear unplanned, the “successor” might not be correct anymore. The affected node must then simply switch to the next successor by searching for the responsible node for the object with “Node-Key + 1”.

Calendar September 21, 2021