Personal Telco VPN
Overview
This is a place to tie together all of the information about the ["VPN"] aspects of ["PTPnet"].
In the summer of 2006, JimmySchmierbach and KeeganQuinn spent several weeks planning and testing a design for a system that could allow all PersonalTelco nodes to be logically interconnected. The design is flexible enough to take advantage of any type of connection, although we focused mostly on IP-over-IP tunnels created with software such as ["OpenVPN"], with glue provided by ad-hoc IP routing protocols like OptimizedLinkStateRouting.
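To make the glue concrete: olsrd can simply be told to treat a tunnel interface as one more link, alongside the radio. A minimal sketch (the interface names and timer values here are illustrative, not our actual settings):
{{{
# /etc/olsrd.conf fragment - hypothetical values
# Announce both the wireless interface and the OpenVPN tap device:
Interface "ath0" "tap0"
{
    HelloInterval      2.0
    HelloValidityTime  20.0
    TcInterval         5.0
    TcValidityTime     30.0
}
}}}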
The Good
Aside from being a cool idea and a fascinating problem domain from a technical perspective, there are a couple of practical benefits to this:
Maintenance - Some nodes are trapped behind unfriendly routers doing cone NAT, which prevents the NetworkOperationsTeam from working on them in the usual way. One example is NodeLuckyLab, but there are more than a couple of these out there. (Is there a list of these?)
Universal connectivity - it would be ideal if all of our different locations were connected, via radio, laser, fiber, Ethernet, frame relay circuit or whatever you like, forming one big ["PTPnet"] cloud. Unfortunately, the fiber-backed wireless dream mesh isn't quite blanketing the world yet. However, in the meantime we can achieve a similar effect with tunnels.
One related idea would be allowing more users from outside our network to tunnel in, participating as VPN clients. Think PicoPeer.
- Technically, quite easy to do, with the foundation that is already in place. It could be set up a number of ways.
- Several tunnels exist but currently they are all between nodes and specific designated servers; anyone can access the network but only if they are physically present at one of the integrated nodes, which needlessly limits the potential usefulness of the network.
- What about connecting networks that are not nodes? For example, a home with no wireless network, or a block of servers at a company. One way this might work is sketched below.
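A hedged sketch of how a non-node network could hang off the VPN, assuming a routed (tun) OpenVPN setup rather than our bridged tap one; the client name and addresses are purely illustrative:
{{{
# Server side: /etc/openvpn/ccd/somecompany (hypothetical client name).
# Tell OpenVPN that the 192.0.2.0/24 LAN lives behind this client:
iroute 192.0.2.0 255.255.255.0
}}}
The server config would also need a matching route statement (route 192.0.2.0 255.255.255.0) so the kernel sends that traffic into the tunnel.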
Redundancy - More connections mean more bandwidth for everyone. Even tunnels have the potential to supplement direct links. For example:
- Additional bandwidth could be gained in situations where multiple routes through different interfaces are available.
- Fault tolerance is also possible; traffic can be redirected to another path if one interface fails, reducing or even eliminating service interruptions (see the sketch below).
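For instance, Linux can already split traffic across two paths with nothing but iproute2. A minimal sketch, with made-up gateway addresses:
{{{
# Hypothetical node with two usable paths out: a direct Ethernet
# link and a tunnel. Split the default route across both:
ip route replace default \
    nexthop via 192.168.1.1 dev eth0 weight 1 \
    nexthop via 10.11.255.1 dev tap0 weight 1

# If one path dies, collapse back to the survivor:
ip route replace default via 10.11.255.1 dev tap0
}}}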
IPv6 deployment - Tunnel brokers are effectively the only way that most people in this area can obtain significant IPv6 connectivity. That's better than none at all, but broker tunnels tend to be unreliable and often suffer from high latency. In contrast, we actually have something of a bandwidth surplus, especially when it comes to just getting around town. It follows that with a bit of serious effort, we could end up with a faster and more useful IPv6 network of our own, especially if we establish a BGP peering relationship or two at the Internet border rather than falling back on yet another broker tunnel.
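For comparison, a broker-style 6in4 tunnel takes only a few commands to bring up; a sketch using documentation addresses rather than any real broker endpoint:
{{{
# Hypothetical 6in4 tunnel to a broker's point of presence.
ip tunnel add tun6 mode sit remote 198.51.100.1 local 203.0.113.2 ttl 255
ip link set tun6 up
ip addr add 2001:db8:0:1::2/64 dev tun6
ip route add ::/0 dev tun6
}}}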
Education - Tunnels give us an opportunity to start acquiring practical knowledge about how to deal with increasing scale in a wide area network, which is going to be invaluable as we begin facing those problems with physical networks.
- Similarly, we could potentially get a head start on exploring and building potential applications to run on these networks.
Not So Good
An unattributed comment was left in the midst of the ideas in the previous section. It stated that "some people are opposed to the idea of using tunnels" and that "they think that it is just a result of us becoming complacent and lazy about our idealistic goals". While rewriting the list above, I thought about turning it into a bullet item, or maybe even putting together a separate list of negative points. I finally decided against it because there's no context, not enough information for me to formulate a reaction without incorporating a lot of my own assumptions about what may have been meant. I could turn it into a discussion of the pros and cons of IP-over-IP tunneling, but in response to a statement deriding the very standards and ideals of everyone who has worked on the project, that would have been completely irrelevant. Try as I might, I just couldn't figure out a way to integrate the little chunk of malice back in with my newly-revised list of ideas.
I may not like what it says, but just as I couldn't drop it back into the middle of the list, I couldn't bring myself to censor it away completely. Apparently these people feel pretty strongly about this issue, and they have a right to express that opinion. If there's an explanation behind it, I'd really like to know what it is; maybe they hold the secret of the one fundamental design flaw in my idea that I haven't thought of yet. I'd really like to know about that as soon as possible, rather than spending any more time on a project doomed to fail. However, even if my colleagues and I are a band of complete klutzes working on a project which will never be relevant even to PTP, that still wouldn't be an appropriate way of letting us know.
Otherwise - if there is no reasonable explanation to be found - I would ask that the author kindly remove what they wrote about us. I would also appreciate it if that person would add their name somewhere in this section. I would consider it a token of good will, which would make me feel a lot better about this whole thing - even if you disagree with me in every way, we're just twiddling bits here; there's no need to make it personal. When you said my colleagues and I were lazy and complacent, you most certainly knew who you were talking about: my name was placed on this page without my knowledge, before I thought to look at what facets this venerable wiki had cut in our coal lump of a plan. While you're out here, anonymously making slanderous public statements about us, I am doing the best I can to actually make it possible to build a sustainable large-scale community network - and I know I'm not alone.
-- KeeganQuinn
Reference
Here are the places on this wiki where there is currently information on VPNs:
Goals
Vague and not too ambitious
The near-term goal would be to actually complete the implementation that has been attempted so many times, then look into applying Jimmy's ideas to whatever extent is possible, and finally document everything here so that it can be maintained and expanded with relative ease.
The polished brass version
Every time it comes up, the idea is always that we should start with NodesBehindNat. It's an easy decision to reach by committee, since it allows you to completely overrule any naysayer Scrooge types by playing the security card, and the folks who just really want a network get told it'll happen sometime soon. Everyone's happy. It's happened that way probably a dozen times with a different group of people each time; rather than listing all of the names or even the most recent batch, just give yourself a pat on the back if you've ever been one of them.
The rationale is generally that those systems stand to gain the most at first. While this is completely true, and very noble, I'm afraid it has actually slowed the progress of VPN deployment overall, as a result of the very factor that is always expected to speed it along: these nodes aren't accessible except to a person with a laptop who physically travels to each one. So it's not just a matter of a couple of hours with a terminal - it's a couple of days or more traveling all over town trying to make the right things happen. They're always done first, so there are always problems which of course don't manifest until later that night, and so these poor nodes get visited over and over by the equally poor folks who are doing their best to make things work.
Anyway, I'm not saying you should refrain from treading that well-traveled path, if you're upwardly mobile (read: car owner) and have the will to get out there and do it. Go for it. Send me (KeeganQuinn) an email; I'll help out. What I am saying is that there is really no reason for all of the nodes that have good connectivity to wait patiently on the back burner while the second-class citizen nodes get emancipated. In fact, it seems to me that it makes more sense if the hard-to-reach nodes get their wings later on in the process; if something doesn't go quite right in the beginning, which has happened every single time, it's no trouble to fix it if the node was accessible anyway. If instead that botch means someone has to drive all the way across town again, the amount of time spent goes way out of proportion to the benefit realized, as does the frustration level of our good friend the example volunteer.
The skinny
We've got a whole bunch of nodes that need to get hooked up. A list needs to be made; NodeAudit is probably a good starting point. To my knowledge, there is not a single node anywhere in the city that doesn't need at least one of these things done to it, although relatively few will need the full-service deal. Every node is different, which is real cute, but a huge pain in the ass when you want to start thinking about them as a group. So, I'm not going to say it's required, but the process of hooking up to PTPnet would be a great time to make them all at least a little less unique.
With that in mind, this is a rough description of things that need to happen to each node before that new tunnel goes hot. I will start doing these tasks myself and fill in more details as I go. Be patient - we'll be lucky if we get them all done before 2008. Don't even think about doing any of this just yet unless you are prepared to fix whatever you break. Actually, it's fine by me if your answer to that is going down to the location, ripping the box out of the dusty hole (nearly all nodes live in dusty holes), bringing it to someone who can fix it, watching them do it, then bringing it back to the dusty hole. It's a great learning experience, and you're not likely to make the same mistake twice after it costs you.
They don't need to be done in NodeAudit list order, or in the middle of the night, or according to any other particular plan. A fair percentage tend to break from time to time: because we're neglecting them, because their operational environment is partially submerged and the water is moonlighting as part of a 220V circuit, or just because some of them are really crappy old computers - you'd be amazed. Anyway, if you're going to attempt the process at all, don't be timid; just beat the thing up by running through the steps until you're pretty sure there are no more steps - then it'll either be really properly broken or working perfectly. Either way, you'll get three cheers from me.
Computers are pretty good at remembering stuff; if you end up on a node that's been done already, it's just going to tell you that, and usually with this type of stuff, it will refuse to do it again. It will also refuse to do stuff at all if it's something that doesn't seem like a good idea to it. As a final word of warning, some of these systems were not configured with major upgrades in mind, and some were only barely adequate to begin with, so keep on the look-out for completely filled hard disks and partitions. If you get stuck with a full disk, do the best you can to get the system to at least keep serving up Internet access and make a note of it on this page. A few of them could use new disks... Anyway, just dive in and have fun, and you'll have the quirks figured out in no time.
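Since full disks are the most likely snag, it's worth a quick check before starting; a sketch of the usual triage:
{{{
# Look for full filesystems before (and during) an upgrade:
df -h

# Reclaim space from apt's package cache, the usual offender:
apt-get clean

# If that isn't enough, hunt down the big directories (sizes in MB):
du -sm /var/* | sort -n | tail
}}}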
1. Most nodes will need to be upgraded to etch first.
- Some of them will need to be upgraded to sarge before they can be upgraded to etch. Believe it.
- Read the upgrade docs in the release notes and just follow the instructions. Seriously.
2. Install the etch kernel after all of the other software updates and get it running.
- Technically this is part of the etch upgrade, but it is worth repeating. It's important.
- Some nodes will appear to be upgraded to etch already but are actually running sarge or woody kernels. Always check.
- We really want our shiny new VPN to work properly after we set it up, and having a dozen different kernel versions trying to interact sanely does not help at all. It will seem like it works fine at first, but eventually things will start acting weird, so just take care of it before it's a problem.
The same goes for OpenVPN and olsrd versions; use only the stuff in etch or you'll wish you had.
Last but not least, it would be really nice to get them hooked into the Osiris server. It doesn't all have to happen at once but it will all need to get done eventually, and it's really pretty quick and easy to do it all in one session. You can move quickly; nodes don't get upset or anything if every detail isn't checked and double-checked. We spend a little more time now to save a lot of time later.
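A hedged sketch of what one full upgrade session might look like on a sarge box headed to etch; the package names are real etch ones, but trust the release notes over this outline:
{{{
# Point apt at etch and pull everything up to date:
sed -i 's/sarge/etch/g' /etc/apt/sources.list
apt-get update
apt-get dist-upgrade

# Install the etch kernel; on most of our old boxes that means:
apt-get install linux-image-2.6-486
reboot

# After the reboot, confirm the node really runs the new kernel:
uname -r
dpkg -l 'linux-image*'

# Then pull in the etch versions of the VPN pieces:
apt-get install openvpn olsrd
}}}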
After those nodes are connected, work can continue on interconnecting other nodes as time allows.
These are some pretty mundane goals, and they rather make it sound like someone is paying us or something. Some more interesting (although also more long-term) ideas are described in the section about benefits, above.
Methodology
For now, we are using one central server with several clients connecting to the server, in a classic hub pattern. At some point, we're going to reach an upper limit with this design and we will have to re-evaluate our options given the technology available at that point.
JimmySchmierbach has done some fairly significant planning in anticipation of this eventuality, including some detailed documentation of a design based on a hierarchy with two tiers, referred to as supernodes and nodes. The basic idea was that all of the supernodes would be connected together with tunnels to form a full mesh pattern, then each node would connect to one or more of the supernodes. The original drawings specified the three core servers as the supernodes: cornerstone, bone and alitheia.
Someone once wrote here that Jimmy's plan involved the supernodes each being connected to a master node (e.g. donk), resulting in a hierarchy with three tiers. That is not correct. donk was never a functional part of the original design. It didn't have to be; the idea was that with three supernodes in a mesh, all bases were covered as long as only one server was ever down at a time. All of the routes would still work because everything was supposed to be redundant.
It has been roughly one year since that design was dreamed up; unfortunately, during that time, alitheia has been completely removed from service and the Subversion repository (which was our most important organizational tool) has been destroyed. bone has also seen some hard times; although it ended up with an upgraded mainboard and some other components, it is now hosted virtually in the same location as cornerstone. This quickly forces the conclusion that, of the three supernodes in the original design, only cornerstone retains even a chance of being effective in that role. Even cornerstone loses considerable appeal compared with a system that includes proper redundancy features, and its admittedly interesting location becomes far less compelling compared with a facility furnished with redundant Internet connectivity, power and environmental control.
Jimmy's design is really impressive in theory, but I don't recall that we ever actually got it to work with all of the good parts. The VPN clients kept taking naps and the dynamic routing daemons got confused about the fact that we were running a mesh on a layer over their heads. Sometimes machines on the same switch, side by side, would decide that they'd prefer to talk to each other through a big chunk of Internet. It might work better with current software but I'm not really in a big hurry to try it; there's a lot to be said for simplicity, like for example a big fat server that handles everything and actually just works. You want redundancy? Get another big fat server, and just double everything or let it do round robin failover. Simple. Works. Doesn't confuse the software or the people configuring it.
This brings us right back around to the first paragraph in this section: we're running the whole show from donk, until it starts to break, at which point we need to look at our options. I propose we add capacity the same way I suggested we add redundancy: more big fat servers. However, that's probably a moot point, since we're nowhere near any kind of capacity limit. I don't expect we will even catch a glimpse of a limit until real users start generating traffic on the tunnel network, which is not likely given our modest selection of near-term goals, outlined in the previous section. At that point, my bet is that the pipe hits a red line before the box does, anyway.
Configuration
Oh, you actually want to set up a VPN tunnel?
To make a new client key, do something like this:
{{{
ssh you@donk
sudo -s
cd /etc/ssl/easy-rsa
. vars
./build-key thenode
cp keys/thenode.crt /etc/openvpn/keys/
cp keys/thenode.crt ~
mv keys/thenode.key ~
exit
}}}
Then, do the configuration on the server side - add a file in /etc/openvpn/ccd with a name like thenode.personaltelco.net. The contents should be something like (replacing 10.11.255.X with an unused IP within 10.11.255.0/24 from the NetworkAddressAllocations page):
{{{
ifconfig-push 10.11.255.X 255.255.255.0
}}}
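For context, this assumes donk's OpenVPN instance runs in server mode with client-config-dir pointed at that ccd directory. donk's real config isn't reproduced here, but a minimal server setup along those lines would look roughly like this (the file names and the dh parameter file are assumptions):
{{{
# /etc/openvpn/server.conf on donk - hypothetical sketch, not the real file
mode server
tls-server
port 1195
proto udp
dev tap0
ifconfig 10.11.255.1 255.255.255.0
client-config-dir /etc/openvpn/ccd
ca /etc/openvpn/keys/ca.crt
cert /etc/openvpn/keys/donk.crt
key /etc/openvpn/keys/donk.key
dh /etc/openvpn/keys/dh1024.pem
comp-lzo
keepalive 10 60
}}}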
Finally, you must configure the client. Do something like:
{{{
ssh you@thenode
sudo apt-get update
sudo apt-get install openvpn
cd /etc/openvpn
sudo scp you@donk:thenode.* .
sudo scp you@donk:/etc/openvpn/keys/ca.crt .
}}}
Create the client's configuration file at /etc/openvpn/client.conf:
{{{
client
remote donk.personaltelco.net 1195
proto udp
dev tap
ca /etc/openvpn/ca.crt
cert /etc/openvpn/thenode.crt
key /etc/openvpn/thenode.key
comp-lzo
}}}
And finally, start openvpn on the client-side:
{{{
sudo /etc/init.d/openvpn restart
}}}
Now, you should be able to get to 10.11.255.1 (donk) from the client, or to 10.11.255.X (where X is whatever you assigned) from donk to reach the client.
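A quick way to confirm the tunnel is actually passing traffic, from the client end:
{{{
# The tap0 interface should exist and carry the pushed address:
ip addr show tap0

# And donk's VPN address should answer:
ping -c 3 10.11.255.1
}}}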
Address Allocation
Servers
|| Server || 10.11.255.? || Port || Proto || Compression || Dev ||
|| donk || 1 || 1195/udp || OpenVPN || lzo || tap0 ||
Clients
|| Node || Client || Tunnel To || 10.11.255.? ||
|| || luckylab || donk || 5 ||
|| || chevy || donk || 6 ||
|| || afterthought || donk || 7 ||
|| || dryrot || donk || 8 ||
|| || star || donk || 9 ||
|| || cantos || donk || 10 ||
|| NodeTB151 || beast || donk || 11 ||
DNS
Each client and server should have a DNS entry for its VPN IP as a subdomain of vpn.ptp (e.g. donk.vpn.ptp). But this isn't always as up to date as it should be...
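Assuming a BIND-style zone file for vpn.ptp, the entries are one-liners; a sketch using allocations from the tables above:
{{{
; vpn.ptp zone fragment (hypothetical layout)
donk.vpn.ptp.     IN A 10.11.255.1
luckylab.vpn.ptp. IN A 10.11.255.5
beast.vpn.ptp.    IN A 10.11.255.11
}}}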