June 22, 2006: A work party visited Mississippi today. The cisco in the SE corner of the Mississippi Ballroom building was re-energized for the first time since roughly last summer. To expedite the job, a section of cat5 was attached with an RJ45 coupler, and the connection was wrapped with tacky tape and then electrical tape. The connection was slightly flaky, but was on and stable when we left. Next, we visited Quirks and Quandries and installed another repeater (the third deployed so far on the network). Funds generated by this purchase can go toward several new NetgearWgt634u units.

May 28, 2006: I modified the DHCP configuration to allocate 15 IPs, from 10.11.105.1 to 10.11.105.15, to the MississippiNetworkRepeater devices based on their USB radio MAC addresses. Only a couple of them are allocated at the moment, for those repeaters deployed or about to be deployed. For example, 10.11.105.13 is the Amnesia Brewing repeater. It was still pingable about 6 or 7 hours after it was rebooted, which is a nice change.
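
For reference, a static allocation of this sort looks roughly like the following (a sketch assuming ISC dhcpd; the MAC address is a placeholder, not the actual radio's):

{{{
# Hypothetical host entry in dhcpd.conf for the Amnesia Brewing repeater:
host amnesia-repeater {
    hardware ethernet 00:11:22:33:44:55;   # USB radio MAC (placeholder)
    fixed-address 10.11.105.13;
}
}}}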

May 26, 2006: Metrix-commons had an outage from about 5pm until 7pm, during which it stopped passing traffic on its backhaul network, cutting it and metrix-west off from the rest of the network and the internet. I stopped by Commons and power cycled the metrix. Traffic returned to normal thereafter. --RussellSenior

May 10, 2006: Metrix-west was offline for about 24 hours (starting Tuesday, a little before noon), apparently due to a displaced power cord. Power was reconnected and the link tested; it is back online. --RussellSenior

April 9, 2006: The whole network was knocked offline this afternoon for about 2 hours (from a little after 2pm until about 4pm) due to a flaky ethernet connection in the NAYA wiring closet to "naya", our primary router. We could connect to it from the outside, but its connection to the entire local network was down until I could arrive on-site and jiggle it. This is a problem needing fixing. Flaky ethernet connections suck. --RussellSenior

April 2, 2006: The northern leg of the network was knocked offline this afternoon (from around 1pm until 9pm) due to a flaky ethernet connection in the NAYA wiring closet to metrix-naya-nw. Someone had plugged an access point into the 8-port switch; perhaps they bumped our crappy crimp job. --RussellSenior

We are currently experimenting with NetgearWgt634u repeaters, using OpenWgt and a Hawking USB radio. They seem to have an annoying and thus far unexplained tendency to lose their upstream association after a while. I stopped by Amnesia to reboot its repeater this evening as well. --RussellSenior

March 28, 2006: The core of the network was down for a few hours in the afternoon, apparently due to a power outage in the area. The FreshPot network was also offline during this period.

February 7, 2006: RussellSenior stopped in at NodeFreshPot to power-cycle the edimax client. Our link to FreshPot had been out since approximately 4pm on Sunday (February 5). It seems to be back now.

Interestingly, I can't ping the edimax from either the FreshPot or Missnet side. I think (unconfirmed) this has to do with MAC cloning at the bridge. It is confusing, because I thought I remembered pinging it successfully before.

February 4, 2006: RussellSenior replaced the 5-port netgear switch he had loaned with an 8-port netgear switch that the project had purchased. This will give us a place to jack in for testing after we reconnect the third AP on the roof. We reused the existing wall-wart transformer after confirming it was also a 7.5V 1A device.

January 24, 2006: RussellSenior visited the Center for Self Enhancement at N. Kerby (three blocks east of Mississippi) and Failing today, and was given a tour of the roof by Facilities Manager David Proby. See [http://www.personaltelco.net/~russell/photos/2006-01-24b/ photos]. There are some tree issues. We apparently can't get onto the white part of the roof, which slopes down to the east. A [http://www.tessco.com/products/displayProducts.do?groupId=341&subgroupId=50 taller mast] might compensate.

January 14, 2006: RussellSenior and I traveled to the Naya Building today to install a second server: "chevy". The addition of chevy will take some stress off naya, which had been doing everything until now. We also took some time to organize and label the cables around the server "shelf". With the addition of zip ties, staples, and a few custom-cut cat-5e cables, we conquered the madness. Finally, we took a trip over to metrix-west to test download speed at the far end of the network with the new load balancing installed. The result: load balancing is awesome. --CalebPhillips

January 11, 2006: RussellSenior, CalebPhillips, TroyJaqua, BenjaminJencks and DavidJencks convened a WeeklyMeeting at NodeFreshPot at approximately 6:45 pm. At approximately closing time, we configured a serial console on the FreshPot nucab and shut it down. We opened the box, installed the ISA NIC, a 3Com 3c509B, and rebooted. The nucab came back up without the new interface. Modprobing 3c509 installed the kernel module driver and we had an eth2 interface. However, we wanted to use the ISA card to connect to the DSL circuit (it is the slower card), so we inserted 3c509 into /etc/modules and rebooted. From the serial console (ttyS0, 19200, N81) we could see that the ISA card came up as eth0, so we rearranged the cat5 so that eth0 remained the DSL, eth1 remained the local FreshPot AP, and eth2 attached to the edimax. We added a corresponding stanza to /etc/network/interfaces for eth2 and gave it an IP of 10.11.104.20 (when we decommission metrix-naya-sw, we'll give the FreshPot eth2 10.11.104.2 instead, in order to keep the gateways together in the IP space).
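
The new pieces of configuration amounted to roughly the following (a sketch; the netmask is an assumption about the 10.11.104.0/24 subnet):

{{{
# /etc/modules: load the ISA driver at boot so the 3c509B claims eth0
3c509

# /etc/network/interfaces: new stanza for the edimax-facing interface
auto eth2
iface eth2 inet static
    address 10.11.104.20
    netmask 255.255.255.0    # assumed /24
}}}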

After packing up at FreshPot, we headed over to NAYA to see if we could get load balancing working.

TroyJaqua worked on configuring a Linksys WRT54GS and an edimax in WDS repeater mode, and did so successfully. This may be a solution for businesses wanting a booster at their locations. The downside is that it requires a static WDS link to be configured on the upstream radios; the upside is that it works. Troy added a WDS link on metrix-naya-nw's b/g radio to connect to his Linksys, then connected an edimax to the Linksys, and then several people (Troy, Ben and David) associated their laptops with the edimax, got DHCP resolution, and connected to the rest of the world.
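
The static WDS pairing looks roughly like this on each side (a sketch from memory: the nvram variable assumes a Broadcom nvram-based Linksys firmware, the wlanconfig/iwpriv names varied across madwifi-ng revisions, and all MAC addresses are placeholders):

{{{
# On the Linksys WRT54GS (Broadcom nvram-based firmware):
nvram set wl0_wds=00:02:6F:AA:BB:CC    # metrix-naya-nw b/g radio (placeholder)
nvram commit && reboot

# On metrix-naya-nw (madwifi-ng): a WDS VAP pointed back at the Linksys
wlanconfig ath2 create wlandev wifi0 wlanmode wds
iwpriv ath2 wds_add 00:0F:66:DD:EE:FF  # Linksys radio MAC (placeholder)
ifconfig ath2 up
}}}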

Made some progress on load-balancing, and adjourned around 9pm. Ben is continuing to work on the load-balancing remotely.

January 10, 2006: TylerBooth, MichaelWeinberg, RussellSenior, BenjaminJencks and CalebPhillips worked at NodeFreshPot to install a wireless link to the Mississippi Network in order to facilitate load balancing. We found a suitable location for the edimax device, on top of the shelves behind the counter, and found a way to run ethernet from there to the nucab in the back room, and power to the nearest practical outlet. We had a little difficulty getting a high-quality crimp on the ethernet cable; this was attributed to the outdoor-rated cat5 being perhaps a bit thicker than typical, and perhaps not completely compatible with the tips available. We were making good progress when we discovered that the FreshPot nucab had only two PCI slots and thus would not accommodate another PCI NIC. Russell went home to find one, but failed. We packed up and went home.

January 7, 2006: RussellSenior, BenjaminJencks and DavidJencks spent some time today on Mississippi. First, NoCat was enabled on the web ports (80 and 443). We tested from NodeFreshPot and it seems to function properly. It took a little hacking because of current DNS inadequacies (e.g. nodemississippi doesn't resolve). We spoke briefly with the NodeFreshPot counter people about the wireless there, and they referred us to the manager. Need to pursue that through Tyler, most likely.

We visited the BlackRoseCollective, just north of NAYA across the small community park, and spoke to BHT. We told him what we had in mind, and he was okay with it. We plugged in an edimax to test the signal to naya-nw. They have no nucab there, so using the edimax there will be more challenging. His only expressed concern was that they have a house full of people and he didn't want missnet traffic to impinge too much on their bandwidth.

We visited Commons and tried to connect to the cisco-commons from inside (it was pouring rain outside), but were unsuccessful. We could connect to metrix-commons. Given that no one appeared to be connecting to the cisco, we decided to decommission it for the time being, possibly to live again on NAYA. Russell and Ben climbed up on the roof and removed the cisco and its sector antenna. We clipped off the ethernet and wrapped the end in tacky tape. Check with Russell for the old tip so we can be sure to crimp the new one appropriately.

Ben investigated the edimax and determined that it won't do routing in its normal client mode. We figured out a way to use it as a bridge at NodeFreshPot, where we can install another NIC in the nucab there, run ethernet to the edimax in the front of the store, and do the appropriate routing on the nucabs. It appeared that we couldn't get DHCP resolution through the edimax client-bridge, but that won't be an issue at FreshPot, as we can assign a static IP. Need to coordinate with StephouseNetworks on the NodeFreshPot wiring.

We stopped in to check with the Dog Shop, and the woman at the counter reported they'd seen the splash screen. I told her we'd just enabled it and that they'd only see it once a day (if they stay connected). We don't have any cacti data on that edimax, so we don't know how heavily it is being used.

Earlier in the day, I'd worked with CalebPhillips on checking whether the edimax repeater mode would work on the Mississippi Network. We had some partial success, but had trouble with DHCP; we have not thus far succeeded in getting DHCP resolution through the repeater. --RussellSenior

December 16, 2005: As of about 3:30pm, the southern branch of the Mississippi Network was converted to a WDS configuration, and simultaneously, the problematic metrix-west (N Missouri and Failing) began to function properly. RussellSenior visited the neighborhood and confirmed the ability to get DHCP resolution from metrix-west and was able to roam seamlessly to the nodes at Mississippi Commons and the NAYA building (Mississippi and Shaver). Will need to convert the northern branch as well now. Thanks for everyone's patience as we sorted through the problem. We are now poised to further grow the network with much less turmoil and delay.

At about 7:00pm, the northern branch of the Mississippi Network was also converted to the WDS configuration. This will allow better monitoring of performance (particularly seeing if people are connecting), and will allow us to transition Ed's roof from the metrix-naya-nw connection to the metrix-commons, assuming that is ultimately considered desirable.

December 12, 2005: RussellSenior has autogenerated /etc/network/interfaces for each of the metrixes to use a WDS configuration. On the test rig, he has been running a ping for the last 20 hours from one client to another (as described in the December 4 entry) and sees a consistent 3.4% ping loss rate. We are still seeing a kernel panic after ifdown/ifup'ing the interface (as described [http://madwifi.org/ticket/222 here]), but believe the problem is tolerable since the metrixes reboot themselves on panic. The goal is to get the WDS configurations installed this week, possibly on Thursday.
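
The autogenerated stanzas look roughly like the sketch below (a reconstruction, not the actual generated file: the address, netmask, and bridge layout are assumptions, and the athN VAPs are created beforehand as sketched in the December 4 entry below):

{{{
# /etc/network/interfaces sketch for a WDS metrix (hypothetical):
# eth0, the 11b/g AP VAP (ath0) and the 11a WDS VAP (ath1) share one bridge.
auto br0
iface br0 inet static
    pre-up brctl addbr br0
    pre-up brctl addif br0 eth0
    pre-up brctl addif br0 ath0
    pre-up brctl addif br0 ath1
    address 10.11.104.3          # placeholder address
    netmask 255.255.255.0        # assumed /24
    post-down brctl delbr br0
}}}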

December 9, 2005: Last night, MichaelWeinberg, JenSedell, and RussellSenior distributed flyers at the Mississippi Art Walk. We may have located another willing roof host at the furniture shop on Mississippi down near Fremont. Russell continues to work on a metrix configuration that will work reliably. Current status: WDS is working, bridging works (slightly lossy), and it panics on ifdown/ifup, but at least it is rebooting itself and coming back up in good shape. An interim solution may be to simply always reboot in order to ifup interfaces.

December 4, 2005: I have had partial success using WDS bridging on a test bed consisting of two metrixes and a router/AP using the madwifi-ng drivers and a multiple-VAP configuration. I am able to ping from a client-11g -> WDS-11a -> WDS-11a -> WDS-11a -> client-11b, which is essentially what wasn't working before. Pings aren't without a few dropped packets, but relatively few (~3%). The most significant problem now is that I am having trouble getting the backhaul radios to consistently come up in 11a mode; perhaps it is some timing issue. Also, I've seen some oopses, not always fatal. I should probably sync everyone up to the latest rev of madwifi-ng. Anyway, hopeful news! With luck, this will get ironed out in the next few days and we'll be able to deploy it.
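
For anyone reproducing the test bed, the multiple VAPs are created along these lines (a sketch from memory: exact wlanconfig/iwpriv names varied across madwifi-ng revisions, and the ESSID, channel, and MAC address are placeholders):

{{{
# AP VAP for clients on the b/g radio (wifi0):
wlanconfig ath0 create wlandev wifi0 wlanmode ap
iwconfig ath0 essid backhaul-test channel 1

# WDS VAP for the 11a backhaul on the second radio (wifi1):
wlanconfig ath1 create wlandev wifi1 wlanmode wds
iwpriv ath1 wds_add 00:02:6F:11:22:33   # peer backhaul radio (placeholder)
ifconfig ath0 up && ifconfig ath1 up
}}}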

December 2, 2005: Became aware in the late afternoon that the nucab's DHCP server was not running. The connection was fine, but clients weren't getting configured, which, uh, reduced utility. AaronBaer patched up the deficiencies and as of about 3:40pm the DHCP server appears to be running again. We are talking about ways to facilitate more expeditious outage reports. --RussellSenior

December 1, 2005: Buick was replaced with a nucab box. TroyJaqua and RussellSenior fixed a small bug (a missing /etc/network/nat.sh script) and it started working. The network is functioning again. We also modified ebtables on metrix-naya-sw to reflect the new gateway (substituting the nucab's MAC address for that of buick's eth1).
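
That ebtables change amounted to re-keying the drop rule described in the November 20 entry below (a sketch: both MAC addresses are placeholders, and the chain assumes the rule form sketched in that entry):

{{{
# Replace the rule keyed to buick's old eth1 MAC with one for the nucab:
ebtables -t nat -D PREROUTING -i ath0 -s 00:40:05:AA:BB:CC -j DROP  # buick (placeholder)
ebtables -t nat -A PREROUTING -i ath0 -s 00:0D:B9:DD:EE:FF -j DROP  # nucab (placeholder)
}}}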

November 30, 2005: Metrix-naya-sw hung at about 3:00am when I ran "athdebug +recv" in order to collect information on MAC addresses going through each node. It is not passing traffic, so metrix-commons and metrix-west are currently unreachable. So, metrix-naya-sw needs a power cycle as soon as possible. --RussellSenior

November 29, 2005: Buick got sick and was rebooted. In fact, it is still sick and will be replaced, hopefully on Wednesday evening, with a nucab, at least temporarily. We also power cycled the Edimax AP in the dog shop in Mississippi Commons. It appears to be functioning now.

November 21, 2005: RussellSenior built a freshened kernel (2.6.14.2) and madwifi-ng (rev 1329), installed them on metrix-naya-sw, metrix-commons, and metrix-west, and rebooted. The new madwifi-ng rev was built in a metrix-compatible chroot environment, so the madwifi-utils in /usr/local/bin are now linked properly. The other two metrixes, metrix-naya-nw and metrix-ed, are still running the original 2.6.12.3-metrix kernel and the WDS-branch madwifi drivers from late July. It is possible to connect with essentially zero packet loss from buick to metrix-west if you simultaneously "ping -f 10.11.104.2" from buick; metrix-west was apt-get upgraded that way.

November 20, 2005: RussellSenior thinks he's figured out what is going wrong. It is an effect of client-node to client-node traffic when that traffic needs to pass through one of the client bridges. As mentioned earlier, when a client-node sends to a client-node, it sees the traffic twice: once when it sends it, and once (in promiscuous mode) when the master rebroadcasts it. When the traffic originates from the other side of the bridge (say, from buick via eth0), and the bridge sees the rebroadcast packet it just sent arrive with that SRC MAC on ath0, it reassigns that MAC to the bridge port associated with ath0, not eth0. When packets return headed for that MAC, they get to the bridge, and the bridge fails to deliver them to the port where that MAC actually lives. Boom. This problem does not occur when communicating client-to-master (or master-to-client), because those packets are not rebroadcast. The problem doesn't occur when the communication is strictly client-to-client either, because even though the client still sees the rebroadcast packet, the bridge is smart enough not to reassign local MAC addresses to a different port.
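
One way to watch this happen (a diagnostic suggestion, not from the original notes; the bridge name and MAC are placeholders) is to poll the bridge's forwarding table on metrix-naya-sw while traffic flows, and watch buick's MAC flap between the eth0-facing and ath0-facing ports:

{{{
watch -n 1 'brctl showmacs br0 | grep -i 00:40:05:aa:bb:cc'
}}}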

RussellSenior tested this model this morning by ping flooding from buick to metrix-commons (thus keeping metrix-naya-sw's bridge refreshed with where buick's MAC should properly live) while pinging the problematic metrix-west. Still some lossage, but far less than the usual 98%, only about 17%.

Now the question is, what is the solution? One temporary solution might be to use ebtables filtering to drop packets at metrix-naya-sw where buick's MAC shows up on ath0 as a SRC MAC. But there are other situations where we'll see the same phenomenon, e.g. 11b/g clients of the 11a client nodes. The real solution is to get the sending bridges to ignore the rebroadcasts altogether.
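
That temporary ebtables fix would look something like this (a sketch: the MAC is a placeholder, and using the nat PREROUTING chain, so the frame is discarded before the bridge's learning step, is my assumption about hook ordering):

{{{
# On metrix-naya-sw: drop frames arriving on ath0 that claim buick's MAC
# as source, so the bridge never re-learns that MAC on the wrong port.
ebtables -t nat -A PREROUTING -i ath0 -s 00:40:05:AA:BB:CC -j DROP
}}}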

November 19, 2005: DonPark and RussellSenior climbed up on Mississippi Commons and collected some more data. Russell collected some kismet data during the ping tests, but a misconfiguration of his kismet rig reduced its utility to near zero; a careful examination of the tcpdumps from the metrixes, however, yielded the insight needed to figure out what is going wrong. The failure in the ping request/reply loop between metrix-west and buick always occurs in the delivery of packets from metrix-naya-sw to buick. ARP traffic gets delivered fine in both directions, but ICMP and other IP traffic disappears after it arrives at naya-sw on its way to buick... buick never sees it. Why?

November 17, 2005: TroyJaqua and RussellSenior visited Cecily's to try to recover metrix-west from its misconfigured network. Unfortunately, we were unable to connect via ethernet either, so at about 3pm we came back, equipped with a ladder generously loaned by a neighbor, and Troy climbed up and swapped out the metrix motherboard with one configured with Ben's firmware. This was deemed the most practical solution, given the difficulties of getting the serial cable onto the DB9 pins and logging in while balancing on the crest of the roof. The radios on metrix-west remain the same as before; just the motherboard and its flash and ethernet have changed. See the updated MAC address in the table below. We also did some testing from metrix-west. One interesting result was that pinging from metrix-west to buick disrupted a ping from metrix-commons to buick. We still need to get onto the Commons roof to collect some over-the-air packets, hopefully tomorrow. Ebtables may be our salvation. We are getting closer, but still haven't cracked it yet. The 11b/g radio was put on essid notyet.personaltelco.net to indicate it isn't actually working yet; I said we'd switch back to www when it was active and working. Talked to a few residents who were enthusiastic about dumping their $60/month broadband.

November 11, 2005: I think I've figured out the "received packet with own address as source address" messages. They are only appearing on metrix-west and metrix-naya-sw. I think they are a consequence of having bridges on nodes in managed-mode. The master-mode node rebroadcasts frames sent via it, and bridging puts the interfaces in promiscuous mode, so the sender is hearing the rebroadcast. The messages are therefore, presumably, innocuous. --RussellSenior
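
A quick way to confirm that hypothesis (my suggestion; the MAC below is a placeholder for the interface's real address) is to watch for our own frames coming back over the air:

{{{
# On metrix-west: if the master's rebroadcasts are being heard, frames
# bearing our own source MAC will show up on the client radio.
tcpdump -n -e -i ath0 ether src 00:02:6f:aa:bb:cc
}}}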

RussellSenior, MichaelWeinberg, and I got together today and did some more discussion and testing regarding the problems at hand. This discussion continued into the evening on IRC. As of now, we have some unanswered questions, the most pertinent being: what is naya-sw doing with packets coming from metrix-west headed to buick, and why isn't it doing the obvious thing... sending them to buick? --CalebPhillips

November 10, 2005: RussellSenior rebooted metrix-commons and metrix-naya-sw to the new kernel, and magically, traffic started to flow between metrix-west and metrix-naya-sw for the first time. However, oddly, connectivity from buick (10.11.104.1) to metrix-west was still severely lossy. Log messages reading "received packet with own address as source address" are appearing in metrix-naya-sw's /var/log/messages. Some progress, but some bugs still need straightening out.

November 9, 2005: RussellSenior got into the basement and was able to recover metrix-west via ethernet. The problem had to do with modules not loading. Patched that problem in a somewhat kludgy way by adding "pre-up modprobe ath-pci" to the athN stanzas in /etc/network/interfaces. The ath-pci module should have loaded from /etc/modules, but wasn't for some reason. Applied the same fix to metrix-commons and metrix-naya-sw, but haven't rebooted them. Can ping metrix-commons from metrix-west, but not all the way to metrix-naya-sw. Hoping a reboot to the new kernel will correct that.
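
The kludge looks like this in an interfaces stanza (a sketch; everything except the pre-up line is illustrative):

{{{
# Force the Atheros driver to load before the interface is configured;
# modprobe treats "ath-pci" and "ath_pci" as the same module name.
auto ath0
iface ath0 inet static
    pre-up modprobe ath-pci
    address 10.11.104.4          # placeholder address
    netmask 255.255.255.0        # assumed /24
}}}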

November 8, 2005: RussellSenior has copied a new kernel, modules and utilities for use with the madwifi-ng drivers over to metrix-west, metrix-commons, and metrix-naya-sw. The /boot/grub/menu.lst file is modified but still pointing at the 2.6.12.3-metrix kernel, except in the case of metrix-west, which, because it was already disconnected, we decided to use as a test case. It rebooted to 2.6.14-metrix (with the madwifi-ng drivers) and is associated with metrix-commons with a nice strong signal on 802.11a, but for some reason its network is not functioning. Same thing with the 802.11b/g radio: association and a nice strong signal from the street, but no network. It isn't pingable from either radio. Going to try to get inside to test from the ethernet tomorrow.

November 7, 2005: RussellSenior is hacking on a metrix image with a new kernel and madwifi-ng drivers, using the metrix we pulled off of Cecily's as a testbed.

October 28, 2005: CalebPhillips, RussellSenior, and MichaelWeinberg replaced the metrix on Cecily's rooftop and re-attached the equipment with real chimney mounting hardware. This node seems to be working just fine now, with good connectivity to Commons. However, there currently seems to be a problem with the switch at Naya that is preventing the network from handing out DHCP or access to the intarweb. The todo list below reflects the current state.

Russell and I came back to fix the switch issue and found Buick DOA. We also replaced the switch with one Russell had on hand. Everything seems to work now... except that Cecily's cannot connect to anything past Commons in the direction of Naya. Specifically, it seems like clients of the Commons 802.11a radio (with the omni) cannot see each other (naya-sw and metrix-west are both clients of Commons). We are working on an explanation. At this point the network is entirely functional everywhere except metrix-west. - CalebPhillips

October 27, 2005: CalebPhillips and RussellSenior managed to get Ed's roof online. Without access to a ladder we had only one option for making progress, and that was to see whether the apparently non-functioning metrix on Ed's roof was actually powered on and accessible from the ethernet. Ed graciously let us in to check. This possibility was suggested by Russell's experience with metrix-naya-sw, where the radios did not initially come up after a reboot. It turns out the metrix was on. Russell was able to connect via the ethernet, and got a weak radio signal on 11g, roughly 7 dB SNR. So the problem wasn't a bad POE connection, and it was not a failure to load ath_pci either. Caleb suggested that we might have the antennas backwards. Twice, trying to "ifdown ath0" and then "ifdown ath1" froze the metrix. Russell tried swapping ath0 and ath1 in /etc/network/interfaces, rebooted, and bingo, 11b/g started working! SNR in the attic jumped to about 30. However, the 11a backhaul was weak: pinging to Commons worked, but with about 40% packet loss. Maybe we need to repoint the antenna (isn't there a distance tweak for 11a having to do with an ACK timeout or something? I thought that was for longer distances than we're talking about here).

Caleb and Russell retreated to FreshPot to report success and think. Russell, looking out the window at the backfire antenna already pointing at Ed's from NAYA NW, realized there was a chance that Ed's radio might be able to hit NAYA NW, even though it wasn't pointed directly at it, because it was only 600 or so feet away instead of the 1500 feet to Commons. Russell changed the metrix-naya-nw ath0 radio to ESSID backhaul-nw and master mode on channel 161 and turned it on, then walked up the block near Ed's, logged in via 802.11g, reconfigured its ath1 (connected to the 11a antenna) to backhaul-nw, and rebooted. Bingo! SNR of about 30 dB. Kind of a chewing gum and baling wire solution, but it is up and passing traffic. - RussellSenior
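
On the ACK-timeout aside: madwifi shipped a helper for tuning timing parameters to link distance, if memory serves (the tool name, its flags, and the underlying sysctls here are from memory, so verify against the installed driver before relying on this):

{{{
# Derive slottime/ACK/CTS timeouts for a ~450 m (1500 ft) link:
athctrl -i wifi1 -d 450
# ...which boils down to sysctl tweaks along the lines of:
#   sysctl -w dev.wifi1.slottime=...
#   sysctl -w dev.wifi1.acktimeout=...
#   sysctl -w dev.wifi1.ctstimeout=...
}}}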