|## page was renamed from FreedomBox/MeshNetwork|
A mesh could theoretically connect all neighbours together, but a node cannot transmit and receive simultaneously: a wifi radio can only do one or the other, not both at the same time.
1. Vanishing bandwidth
Usable bandwidth shrinks with each hop, so a mesh might never be able to replace an infrastructure network.
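As a rough illustration of why bandwidth vanishes, here is a minimal sketch. It assumes a single-radio mesh in which every hop shares one channel, so a packet occupies the channel once per hop and a chain of n hops carries at most about 1/n of the single-link rate. The 1/n model, the function name and the 20 Mbit/s link rate are assumptions for illustration, not measurements.

```python
def end_to_end_throughput(link_rate_mbps: float, hops: int) -> float:
    """Idealised throughput over a chain of `hops` links on one shared channel.

    Assumption: only one node in the chain can transmit at a time, so
    every packet uses the channel once per hop it travels.
    """
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return link_rate_mbps / hops


if __name__ == "__main__":
    # Hypothetical 20 Mbit/s link rate; print the idealised rate per hop count.
    for hops in range(1, 5):
        print(hops, "hop(s):", end_to_end_throughput(20.0, hops), "Mbit/s")
```

Real networks usually do worse than this model, since it ignores collisions, retransmissions and routing overhead.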
2. The dense mesh problem
To connect neighbours without Internet access, the nodes need to be pretty close to each other, like one in every apartment. That many radios will then make so much noise that communication might not be possible over more than 2-4 hops.
The OLPC experience
OLPC networks used to basically melt down over 1 hop, much less several, due to the dense mesh problem.
(see John Gilmore's comments on the mailing list) The One Laptop per Child project also wanted to satisfy this goal. They wanted kids all over a village to be able to reach each other and to reach the Internet via a gateway at their school. They had the advantage of designing and building both the hardware and software. But they failed, partly due to system integration issues.
They were using a buggy implementation of 802.11 meshing that preceded 802.11s. But they also used higher level user interface software, which used multicast packets to find and communicate with other nearby laptops. The mesh software worked poorly with multicast. Not only did it send multicasts at the slowest speed (1 megabit), which took up a lot of airtime, but the various nodes would repeat the multicasts to make sure that every node had heard them. This limited the size of the network that they could scale to. They did not discover this unfortunate interaction until very late in the hardware/software/firmware integration process (when the multicast-based application sharing software started working).
2. Reproducing Problems
Another major problem was that mesh network failures are very hard to reproduce. If the network does something unexpected or suboptimal, the developers can't just teleport themselves to the part of the world where that particular physical configuration of radio nodes, physical antennas, software versions, and firmware versions exists. In many cases they can't even reach into the nodes of that network over the Internet while the problem is happening, to debug it. Many, many OLPC mesh problems occurred in the field which could not be replicated in the lab, which made them 10x or 100x harder to fix. This meant that buggy mesh network firmware and software didn't improve at the usual rate (of the rest of their software).
3. Mesh in a Classroom
The result was that despite a lot of work addressing bugs and performance in the mesh firmware, they never got their automatic mesh network working with more than a handful of XO laptops. If you put 30 laptops in a classroom, they would burn up 100% of the radio bandwidth (and chew up their batteries) merely with overhead packets ("Hi, I'm here." "Hi you, I'm me; have you heard about Joe and Alice over there?" "In case you want to send a message to Joe, send it via me to Alice; I can hear Alice just fine."). There was no bandwidth left for the users to actually communicate; connections would time out, nodes would appear and disappear from the mesh, etc. So OLPC stopped using the mesh and recommended that each classroom install one or more 802.11 access points, which has worked OK. They also switched to supporting ad-hoc 802.11 without meshing, for automatically networking "a few students sitting around under a tree", which also works OK.
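The way repeated multicasts at the slowest rate can eat a classroom's airtime can be sketched with a toy model. It assumes every node sends one announcement per interval and every other node rebroadcasts it once, all at 1 Mbit/s. The packet size, interval and rebroadcast rule are hypothetical values chosen for illustration; they are not OLPC's actual protocol parameters.

```python
def overhead_airtime_fraction(
    nodes: int,
    packet_bytes: int = 100,      # assumed announcement size
    rate_bps: int = 1_000_000,    # slowest 802.11 rate, as in the text
    interval_s: float = 1.0,      # assumed announcement period
) -> float:
    """Fraction of airtime consumed by announcement traffic alone.

    Toy model: each of `nodes` announcements is transmitted once by its
    sender and once more by each of the other nodes, giving nodes * nodes
    transmissions per interval. Preambles, contention and collisions are
    ignored, so real overhead would be higher.
    """
    airtime_per_packet_s = packet_bytes * 8 / rate_bps
    transmissions = nodes * nodes
    return transmissions * airtime_per_packet_s / interval_s


if __name__ == "__main__":
    for n in (5, 10, 30, 40):
        print(n, "nodes:", round(overhead_airtime_fraction(n) * 100, 1), "% of airtime")
```

A value above 1.0 means the overhead alone no longer fits in the channel. The point of the sketch is the quadratic growth: doubling the number of nodes quadruples the overhead.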
There ARE some mesh networks that I hear are working on a larger scale, such as B.A.T.M.A.N. I suspect that the large scale meshes are in largely static networks that are tuned by humans to work well (just as the broader Internet's routing system is tuned by humans to work; it's not automatic). I do not know if other meshes support multicast (or other portable ways for high level software to find what nodes are on the network), nor whether they work in a network of mobile nodes with limited battery life. All I can report on is the one project I was involved in (OLPC), in which their mesh implementation failed to accomplish its goals, and was dropped from the next generation hardware and software.
5. The advantage of wired connections
This is part of why I recommend using wired connections wherever possible. For FreedomBox to succeed, it needs to succeed at scale. A FreedomBox network that can't route packets for more than 500 nodes worldwide wouldn't be worth building. (Clue: this is why the Internet exists today: it scaled up and kept working, while the proprietary networks that preceded it didn't scale up to worldwide scale.) In a substantial network, your mesh and dynamic routing protocol could require a few megabits of traffic at all times on each node, just keeping track of everything. Over a 100-megabit Ethernet that's just 2 or 3% of the bandwidth. But over 802.11, that burns up most of the available bandwidth. Every connection you move off wireless onto a wire makes more radio bandwidth available for the folks who truly can't run a wire.
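The arithmetic behind that comparison can be written out as a small sketch. The 3 Mbit/s overhead figure stands in for "a few megabits", and the 5 Mbit/s usable wireless throughput is an assumption for illustration only; real 802.11 throughput varies widely with hardware and conditions.

```python
def overhead_share(overhead_mbps: float, capacity_mbps: float) -> float:
    """Share of a link's capacity consumed by routing overhead."""
    if capacity_mbps <= 0:
        raise ValueError("capacity must be positive")
    return overhead_mbps / capacity_mbps


if __name__ == "__main__":
    routing_overhead = 3.0   # "a few megabits" (assumed value)
    ethernet = 100.0         # 100-megabit Ethernet, from the text
    wifi_usable = 5.0        # assumed usable 802.11 throughput

    print("Ethernet:", overhead_share(routing_overhead, ethernet))
    print("802.11:  ", overhead_share(routing_overhead, wifi_usable))
```

Under these assumptions the same overhead takes a few percent of the wire but more than half of the radio link.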
Many mesh routing protocols have been developed:
open80211s (included with Linux kernel)
batman-adv (included with Linux kernel)
Yan Hetges has been actively deploying a wifi ad-hoc mesh network with OpenWrt and derivatives since 2006 (it has ~15 nodes and covers an area of maybe 15 km^2).
Dave Taht has been working on mesh networks for a long time, with things like OLPC, etc. I used Google Earth with GPS to locate places and their height above ground, which makes it possible to estimate beam paths - zoom in below San Juan Del Sur, Nicaragua (requires an Earth browser).
Babel implements "diversity routing": it tries to choose a non-competing wifi channel for forwarding packets from one box to another. This does of course mean you need two radios, on different channels, for it to work. This holds great promise for making mesh networks scale. Some info on that: http://www.pps.univ-paris-diderot.fr/~jch/software/babel/wbmv4.pdf
The other one is that the CoDel work holds great potential to help establish a congestion-aware metric, which could lead to dynamic load balancing working semi-sanely across multiple paths. We're not there yet though; I'm not happy with any of the routing metrics currently being used.
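To illustrate the diversity routing idea described above, here is a minimal sketch of preferring routes whose consecutive hops use different channels. This is not Babel's actual metric or code; the penalty function, the route names and the channel numbers are all hypothetical.

```python
from typing import Dict, List


def interference_penalty(channels: List[int]) -> int:
    """Count consecutive hop pairs that reuse the same channel.

    Each such pair means a node receives and forwards on one channel,
    so its two links compete for the same airtime.
    """
    return sum(1 for a, b in zip(channels, channels[1:]) if a == b)


def pick_route(routes: Dict[str, List[int]]) -> str:
    """Prefer the fewest same-channel hand-offs, then the fewest hops."""
    return min(
        routes,
        key=lambda name: (interference_penalty(routes[name]), len(routes[name])),
    )


if __name__ == "__main__":
    # Each route is the list of channels used on its successive hops.
    candidates = {
        "single-radio": [1, 1, 1],
        "dual-radio": [1, 6, 1, 6],
    }
    print(pick_route(candidates))
```

In this toy example the longer route wins because none of its adjacent hops share a channel. A real metric would also have to weigh link quality and, as noted above, congestion.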