WaDeD

Network and algorithm

Network and algorithm

Network

A WaDeD network finds its use whenever one finds it too costly, or too difficult, to deploy a stronger kind of network, such as cellphones repeaters or wifi. It offers a way to send short text messages, carried by the users of the network and fixed radios. Deployment is immediate (just give a WaDeD radio to everyone and leave a few at some places to serve as relay) and a few batteries are enough to make the network work for a long time. It makes no sound and the users can send and recieve messages simply by plugging their devices into a host computer and running the client, so it is particularily useful when discretion is required.

Message delivery works on an inundation principle: whenever a device meets another, they exchange all of their messages automatically, so that at the end the two devices hold the same messages. When plugged to a host, the embedded application automatically sends him all the messages that are intended for him, and will continue to send every rightful message it picks up until unplugged.

The network can also connect a peripheric to the network, allowing it to regularily send messages by using the movements of the people passing by, such as a thermometer in a remote meteo station using a WaDeD network to send temperature records to a twitter application somewhere else.

Storing messages

Radio communication consumes energy. A lot of it. In designing the way that two WaDeD boards communicate, we needed to make sure they did as few radio exchanges as the possibly could. To achieve this, we had to think of a clever way to store data so that two devices could quickly compare the state of their memories.

To store data, we use an open hash table, augmented by a hash tree. Messages in the network are identified by a unique 64-bits ID. It is use to determine the place of insertion of the message in the hash table, so that every device stores the same messages at the same places (inside the collision list of the hash table, messages are sorted by ID).

The buckets of the hash table are then viewed as the leaves of a hash tree. A hash tree is a simple data structure in which data is found in the leaves, and every internal node of the tree contains the hash of the concatenation of all of its sons. This means that two sets of messages are identical if and only if the hashes stored in the roots of the correponding trees are identical. This property is still valid for any subtree, making it extremely fast to identify which messages differ in two different memories. Here is a picture illustrating our structure:

Storage of the messages inside WaDeD

The algorithm

Say two devices meet. We want to design a procedure which, at the end, leave the two devices with the same messages in memory, while making them communicate as few as possible, and we can use the augmented hash trees that we have implemented.

In the following description, we need a bit of terminology. The nodes of the tree are divided in two categories: the leaves, which have no sons, and the internal nodes, which have sons. Among those, the penultimate layer (that is, the internal nodes whose sons are leaves) will deserve special treatment. When we talk about broadcasting an internal node, we mean transmitting an information about its position in the tree, and its hash. When we talk about broadcasting a leaf, we mean transmitting an information about its position, and the IDs of all the elements in the list that corresponds to the leaf.

Here are the rules that drive the behaviour of a WaDeD in its communication:

  1. If nothing has happened over a certain period of time, broadcast your root.
  2. If getting a root, compare it with your own. If the hash match, do nothing, otherwise broadcast all the sons of the root.
  3. If getting a node, compare it with the corresponding nodes in your own tree. If they match, do nothing. Otherwise:
    1. If the node is not on the penultimate layer, broadcast its sons.
    2. If it is on the penultimate layer, its sons are leaves. For each son, broadcast the leaf.
  4. If getting a leaf, compare it with the matching list in your own tree. For every ID that you hold and is not in the distant list, broadcast the corresponding message. If no message was broadcasted, broadcast your own corresponding leaf.
  5. If getting a message, check whether a message with this ID already exists in memory. If it does, do nothing, otherwise insert it in memory and update the tree.

This algorithm has good properties: when two devices are in a conversation, they are strictly moving down their respective trees, so there can never be any loop in the conversation. A typical exchange will appear as this:

    WaDeD A sends a ROOT
    root: 1D5DD10E

    WaDeD B sends a ROOT
    root: B04424B1

    WaDeD A sends a NODE N0
    sons: 21E5338F 21E5338F 21E5338F 21E5338F 21E5338F 21E5338F 21E5338F 21E5338F

    WaDeD B sends a NODE N3
    sons: EEDFA82 EEDFA82 3D7B7AC4 EEDFA82 EEDFA82 EEDFA82 EEDFA82 EEDFA82

    WaDeD A sends a NODE N11
    sons: 0 0 0 0 0 0 0 0

    WaDeD B sends a LIST N144
    list size: 1
    Element 1: 1804A114

    WaDeD A sends a LIST N144
    list size: 0

    WaDeD B sends a MESSAGE
    I'm a WaDeD!

    WaDeD A sends a ROOT
    root: B04424B1

    WaDeD B sends a ROOT
    root: B04424B1

This is an actual dump of two WaDeD synchronizing. You can see that the trees have eight sons. The synchronization process is extremely fast if the trees are similar to begin with, and not remarkably longer than a list comparison if the trees are hugely different.

Welcome to the Jungle

But another strong property of this algorithm is that it is stateless: everything is broadcast to everybody, and any device arriving in the middle of a synchronization can benefit from hearing a message. Potentially this leads to conversations dividing and many, many messages being exchanged–hence the name Jungle–, but the WaDeD always act upon messages it gets in a way that leads to increasing its knowledge of the environment. After a few seconds, the jungle calms down: all devices share the same memory state.

In conclusion, we have an algorithm that is extremely efficient when synchronizing two sets that are close and still very good when the sets differ a lot. It deals well when a lot of devices are communicating at the same time, and, since a given device can enter a conversation at any point, missed packets cannot do more harm than having to start the conversation a few messages before.