
Another Hotel Fails To Support Skype - Here's Why Skype's P2P Connection Model Breaks Their System

UPDATE: When I stayed at this same hotel in August 2010, I no longer had the issue with Skype being blocked. Presumably they got a smarter network monitoring system. While this specific hotel now works with Skype, the same issue will undoubtedly be out there for many other hotels and locations.


Summary: Hotels restricting the number of simultaneous network connections per user may wind up blocking legitimate usage of Skype. Skype's peer-to-peer network model uses a high number of network connections to synchronize multi-party group chats.

Read on for the full story, network diagrams, etc....


Two weeks ago on a visit to Voxeo's corporate headquarters in Orlando, FL, I stayed at the Grand Bohemian Hotel, conveniently located only a block or so away. Arriving in the early evening, I checked in, got to my room and immediately plugged my laptop into the Ethernet port to catch up on what had happened while I'd been offline traveling. As is the case in many hotels, I was asked to log in and pay through a system from "Nomadix". I did so... and very quickly started to see Skype coming online, my other IM client (Adium) coming online, email starting to flow in and a website coming up.

Then it all stopped.

No Internet connection. Offline. I did all the standard things... disconnect and reconnect the cable. Stop/start the network interface. Nada. Nothing. Dead.

I'd only been on a minute or two but it had seemed to be fine, so I naturally called the front desk who put me through to the tech support team which turned out to be an external company. (Note: Nomadix makes the gateway that is used by this external company to operate the hotel's network. The network is not operated by Nomadix.) They, too, ran through the typical checks, found nothing, and then checked the list of blocked IP addresses - there I was.

The technician unblocked my IP address... I saw that I was online again... and then after a minute or so I wasn't. I called back in, spoke with a different technician, had the same experience and stayed on the phone for a good bit investigating the matter.

It turns out that:

they automatically block IP addresses that generate over 200 simultaneous network connections.

And here is my dilemma:

I am a heavy user of Skype, particularly Skype's persistent group chats.

Every time I connected, Skype was initiating hundreds of network connections to update all of the group chats that I had open. This, of course, was triggering the rules in the Internet gateway and landing me on the blocked list. If the technician unblocked me (or, as later testing seemed to show, the block was released automatically every 15 minutes or so), I would wind up blocked again after only a minute or so.

The support technicians were all very pleasant and explained that unfortunately the 200-connection-limit was hard-coded into the gateway system and there was nothing they could do (at their level, anyway) to change or set aside that limit.

As a security guy, I do understand some of what the company is trying to do here with the limit. They do have to treat the hotel network as "hostile" to a certain degree. Someone with malicious intent might connect to the network and try to execute a Denial-of-Service attack or send out spam. Or someone might unwittingly be infected with a bot that is commanded to execute some attack. They also want to prevent someone from sucking up all the network bandwidth so that other hotel users receive poor service. Limiting the number of network connections is one way to try to deal with these types of attacks. Unfortunately, the limits also restrict the legitimate usage of Skype.
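To make the behavior I ran into concrete, here is a minimal Python sketch of a gateway rule like the one the technicians described. This is my own toy model, not Nomadix's actual logic: the class name, the example IP address and the burst of 300 connections are all made up for illustration, and the 15-minute release is only what my testing seemed to show.

```python
from collections import defaultdict

CONNECTION_LIMIT = 200    # threshold reported by the hotel's technicians
BLOCK_SECONDS = 15 * 60   # assumption: the block seemed to lift after roughly 15 minutes


class ConnectionLimitGateway:
    """Toy model of a gateway that blocks an IP once it exceeds a connection limit."""

    def __init__(self, limit=CONNECTION_LIMIT, block_seconds=BLOCK_SECONDS):
        self.limit = limit
        self.block_seconds = block_seconds
        self.open_connections = defaultdict(int)  # client IP -> open connection count
        self.blocked_until = {}                   # client IP -> time the block expires

    def is_blocked(self, ip, now):
        return self.blocked_until.get(ip, 0) > now

    def open_connection(self, ip, now):
        """Return True if the connection is allowed, False if the IP is blocked."""
        if self.is_blocked(ip, now):
            return False
        self.open_connections[ip] += 1
        if self.open_connections[ip] > self.limit:
            # Over the limit: forget the connections and block the whole IP.
            self.open_connections[ip] = 0
            self.blocked_until[ip] = now + self.block_seconds
            return False
        return True


gateway = ConnectionLimitGateway()
# Hypothetical guest whose client opens 300 short connections right after login.
results = [gateway.open_connection("10.0.0.5", now=0) for _ in range(300)]
allowed = sum(results)
print(allowed, gateway.is_blocked("10.0.0.5", now=60))
```

The point of the sketch is that the rule only counts connections. It has no idea whether they are large or small, or what application opened them.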

To explain how Skype plays a role here, let's dig a bit more into the way in which Skype's P2P architecture works...


THE BASICS OF PEER-TO-PEER (P2P) ARCHITECTURE

In any given peer-to-peer service (Skype or otherwise), there is a fundamental design difference from "typical" services:

there are NO centralized servers.

If you look at a "standard" instant messaging (IM) service, all of the IM clients connect in to a central server (or group of servers). From a network point of view, it looks like what you see in this image:

[Image: imservice-central-2.jpg]

This could be for AOL/AIM, MSN/WLM, Yahoo!Messenger, Jabber, IRC or pretty much any of the other IM services out there outside of Skype.

ALL traffic goes through the central server. If you want to send an IM message to the person sitting next to you, your IM messages are still going through a central server somewhere.

From a network traffic point-of-view, each IM client is opening up a small number of connections to the central IM service. It might only be a single connection, or it might be a couple... but it's only a few. All traffic, both control messages and the messages themselves, travel across those few connections - and all through those central servers.

Over in peer-to-peer land, though, where there are no central servers all of the communication occurs through what is typically referred to as an "overlay network" (or "P2P overlay network" or "P2P overlay"). The overlay network is essentially a "virtual network" between all the nodes in a network. From an architecture perspective, it looks like this:

[Image: imservice-P2P-1.jpg]

It's a "mesh" network where all of the service clients are interconnected with all the other clients. (In fact, the term "peer" is typically used instead of "client".) In a P2P network this "overlay network" is often a "distributed hash table" (DHT) using a protocol like Chord or the DHT used by BitTorrent. It sits on top of a system's existing IP connection to the Internet - hence the term "overlay". So this is a P2P network sitting on top of the regular IP network.
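A quick way to see the difference between the two models is to count links. The Python sketch below compares the client/server picture with a fully connected mesh. The function names are mine, and the full mesh is the worst case: real DHT overlays usually have each peer keep links to only a subset of the other peers, so treat these numbers as an upper bound.

```python
def star_links(clients):
    """Client/server: each client holds one connection to the central service."""
    return clients


def full_mesh_links(peers):
    """Full mesh: every peer is linked to every other peer exactly once."""
    return peers * (peers - 1) // 2


def links_per_peer(peers):
    """In a full mesh, each peer talks directly to all of the others."""
    return peers - 1


for n in (6, 20, 200):
    print(n, star_links(n), full_mesh_links(n), links_per_peer(n))
```

In the client/server case each machine sees one connection (or a few) no matter how big the service gets. In the mesh case the number of connections each machine sees grows with the number of peers it needs to talk to.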


SKYPE'S P2P - AND PERSISTENT GROUP CHATS


Caveat: I am not an employee of Skype and have no real connection with Skype other than having been a user for something like 5 years now. The material below is based on educated guesses - and could be entirely wrong. (And I'd love it if someone from Skype would confirm or deny any of the info here...)


Skype uses some type of P2P network for communication between Skype clients. When your Skype client comes online, it connects to the Skype P2P overlay network. I don't know personally what protocol Skype uses for its overlay network, but odds are that it is some kind of DHT or similar system. Now, Skype's network is not entirely a P2P network in that there are some centralized services that do, for instance, authentication (which were part of the outage back in August 2007), but for the most part it's a big P2P cloud.

Now let's talk about Skype's "persistent group chats". The strength of Skype's multi-person group chats is that you can shut down your computer, travel somewhere, open your computer back up, have Skype reconnect...

and receive ALL communication that occurred in the group chat while you were offline.

This is a huge benefit to an IM-centric organization. If you shut your computer down at the end of the day, or if you are traveling, you simply reconnect and have the complete history of all communication that occurred while you were offline. It works fantastically well for globally distributed teams. I use it heavily within Voxeo and for external teams as well.

The question naturally is... when you are offline, and if there are no central servers...

where are the chat messages stored that you get when you come back online?

The answer, of course, is:

the messages are stored in Skype's P2P overlay network.
(Do you see yet where this is going and the problem it's going to create?)

So if I am in a Skype group chat with five other people, all the text we type is stored in the Skype overlay network and specifically in a mini-network or "ring" between our 6 nodes. Now... we don't know Skype's P2P protocol to know precisely how it is stored across the nodes... but some parts of the text are probably found in all the members of our little ring.

When you have been offline and come back online, your Skype client has to connect out to the others in your mini-network to update your local client with all the messages that occurred when you were offline. Because your Skype client may not know if the 5 other clients were all online during the entire time you were off, your client seems to need to check with all of the other nodes in your mini-network. So it looks something like this:

[Image: imservice-skypechat-1.jpg]

Now, if you look at it from a network traffic point-of-view, 1 Skype groupchat generated 5 network connections. (And that assumes only a single connection is made per Skype node.) From various discussions and research over the years, the rule seems to be:

Each Skype groupchat is going to generate a number of network connections equal to the number of other participants in the groupchat, up to a limit of 15 per chat.

If you have an open chat to one other person, when your Skype client comes online it generates 1 network connection to that person. If you have an open chat to 2 other people, Skype opens 2 network connections. For 5 others in a chat, that's five connections. Ten others in a chat, 10 connections... and so on.

This obviously doesn't scale when you get into very large group chats - I'm in one public chat that has been around for years and has 200+ participants. In those cases, a Skype developer a few years back said in a public chat that in group chats larger than 15 people, your Skype client would connect out to 15 other nodes in the chat to get updates. I don't know if that is still true today, but it seems a logical way to address the scaling issue. Odds are that if you connect out to 15 other nodes, some number of those are going to be online and have enough of the chat history to get your client up-to-date.

For the purpose of this post, let's assume that is still accurate... and so any chat with more than 15 users generates 15 network connections when your Skype client comes online.
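That working rule is simple enough to write down as a one-line Python function. This is only a restatement of my assumption, not anything confirmed by Skype: the function name is mine, and the cap of 15 comes from that old public-chat comment and may no longer hold.

```python
PER_CHAT_CAP = 15  # assumption: cap mentioned by a Skype developer years ago


def connections_for_chat(other_participants, cap=PER_CHAT_CAP):
    """Connections opened for one chat when the client comes back online."""
    return min(other_participants, cap)


for others in (1, 2, 5, 10, 200):
    print(others, connections_for_chat(others))
```

So a 1-to-1 chat costs one connection, a chat with five others costs five, and a chat with 200+ participants costs 15.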


THE CHALLENGE OF THE HEAVY SKYPE CHAT USER

So here's the problem that I believe nailed me at the hotel. If I look at my Skype client right this moment, I have 56 chats open in one window and 20 chats open in another - total of 76. That's actually down a bit because I went through and closed a bunch recently. In scanning through the list, probably 15 of them are 1-to-1 chats that I either keep open because they are people I frequently communicate with or are new chats that I haven't closed - and keeping the chats open lets me very easily see their presence. The rest are multi-person group chats that range from 3-5 people up to 150 or in one case 200 people. Most of them seem to be in the 10-20 person range. Some are long-term chats that I keep open because there is frequent traffic in them, others are short-term chats that have been set up for specific projects or events and will then be closed when the project or event is over.

Without actually going through and calculating the precise number, I'm going to guess that the average number of participants across the 61 group chats is probably around 10-20.

For the sake of keeping the math simple, let's just assume the average number is 10 users. Multiply that by 61 and then add in the 15 one-to-one chats and you have:

625 network connections

Oops.

If the hotel blocks a user at 200 connections, I'm obviously triggering limits. Even if my average is off, or if Skype does something to space out connections over time (which it doesn't seem to do) or to otherwise make connections between users more efficient, I am still probably going to run over that 200 connection limit. My network traffic profile at a high level is going to look like a spammer or DoS attack.
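For anyone who wants to plug in their own numbers, here is the same back-of-the-envelope estimate as a small Python sketch. The function name is mine, the inputs are the rough figures from my own chat list, and the whole thing rests on the guess that Skype opens one connection per other participant.

```python
CONNECTION_LIMIT = 200  # the hotel gateway's limit


def estimated_connections(group_chats, avg_others_per_group_chat, one_to_one_chats):
    """One connection per other participant in each group chat, plus one per 1-to-1 chat."""
    return group_chats * avg_others_per_group_chat + one_to_one_chats


total = estimated_connections(group_chats=61, avg_others_per_group_chat=10, one_to_one_chats=15)
print(total)  # 625

# Largest whole-number average that keeps the same chat list at or under the limit.
max_safe_average = (CONNECTION_LIMIT - 15) // 61
print(max_safe_average)  # 3
```

In other words, with my list of chats the average group chat would have to involve only about three other people for me to stay under the limit, and most of mine are far larger than that.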

Keep in mind that these are short, quick connections just to sync with the other nodes in the ring created for each chat and get any messages - so we are not necessarily looking at a large amount of bandwidth, but we are looking at a large quantity of connections.

[NOTE: The next step someone needs to do is to take some wireshark captures and generate some pretty graphs of network connections.]


WHAT TO DO?

So now what? What should I as a user do?

  1. DON'T USE SKYPE - I'm sure someone will suggest this. However, Skype estimates that over 30% of their traffic is business usage. Outside of my own usage, I know many people who use Skype as a significant part of their business communication. Not using Skype obviously is a solution, but not the desired outcome.

    This isn't only about Skype. While Skype is the issue in this post, this is a general concern with P2P architectures. As an open standards supporter, I'd love to see someone come along with a solution based on P2PSIP that provides similar features - but guess what, it's going to run into similar issues. It's an architecture issue - the idea of blocking on some number of connections is based on the old-fashioned client/server model where local clients make only a small number of connections out to dedicated servers. For that model, it may work... but that doesn't reflect evolving usage of P2P networks that are a mesh between nodes.

    Today the issue hits Skype... tomorrow it may hit some other cool application that uses P2P for communication.

  2. HAVE FEWER OPEN GROUP CHATS - It's not clear to me how quickly Skype checks the status of "closed" group chats, i.e. ones that you are still a member of but are not currently displaying. It has to check at some point in case someone typed messages there, but does it do it on initial connection or launch? (I don't know.)

  3. REDUCE THE NUMBER OF GROUP CHATS - Obviously this can help address the issue... simply "leave" (versus "close") many of the chats you are in. However, the persistent group chats are one of Skype's great features and enable very powerful collaboration between globally distributed teams. Not really an option.

  4. CHANGE HOTELS - Of course we as users have the option to find other hotels that don't place the same restrictions... but sometimes we don't have that option.

What can hotels do?

  1. RAISE THE CONNECTION LIMIT - An obvious solution is to raise the number of simultaneous allowed connections... to what number, I don't know... there are trade-offs in trying to block the illegitimate traffic that may be on the network.

  2. PERFORM MORE INTELLIGENT LIMITING - Applying a hard limit on the raw number of network connections is a rather brute-force approach. Instead the software should look at the quality of the network traffic. Is there a large number of high-bandwidth connections? Perhaps someone is downloading software or movies via some P2P network... in that case maybe they need to be throttled back or limited. Are they smaller connections that may be okay? Can they identify the actual Skype traffic and allow it but block other traffic?

  3. THROTTLE/LIMIT VERSUS BLOCK - The rules I ran afoul of block your entire Internet access. Too many connections and your link goes dead. Why not truly limit or throttle back the connection instead of terminating it entirely? There's technology out there that can do this type of thing. (Consider, for instance, the idea behind good old ICMP source quench.)

  4. ALLOW TECHNICIAN OVERRIDES - When I spoke with the technicians and explained what I was doing, the technicians had no options other than to momentarily unblock my IP address. Why not allow them to have a "white list" to which they could add the addresses of certain guests who request special access? It doesn't solve the issue, but it at least would keep certain guests happy.

  5. GET A NEW SOFTWARE SOLUTION - Obviously to do these steps the hotel may need to look at new software... or a new Internet provider.

  6. _____________ - What else do you suggest they do?
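To illustrate what suggestions 3 and 4 might look like together, here is a rough Python sketch of a per-guest token bucket with a technician-managed allow list. It is one possible approach, not how any real gateway product works: the class, the rate and burst values and the IP addresses are all hypothetical.

```python
class ConnectionThrottle:
    """Token bucket per client: excess new connections are refused until tokens
    refill, but the client is never blocked outright."""

    def __init__(self, rate_per_second, burst, allow_list=()):
        self.rate = rate_per_second        # tokens added per second
        self.burst = burst                 # maximum tokens a client can hold
        self.allow_list = set(allow_list)  # guests a technician has exempted
        self.tokens = {}                   # client IP -> tokens currently available
        self.last_seen = {}                # client IP -> time of the previous request

    def allow(self, ip, now):
        """Return True if a new connection from this IP may proceed right now."""
        if ip in self.allow_list:
            return True
        elapsed = now - self.last_seen.get(ip, now)
        self.last_seen[ip] = now
        tokens = min(self.burst, self.tokens.get(ip, self.burst) + elapsed * self.rate)
        if tokens >= 1:
            self.tokens[ip] = tokens - 1
            return True
        self.tokens[ip] = tokens
        return False


throttle = ConnectionThrottle(rate_per_second=10, burst=50, allow_list={"10.0.0.9"})

# A burst of 300 connection attempts at login: only the first 50 get through at once.
allowed_now = sum(throttle.allow("10.0.0.5", now=0) for _ in range(300))
# Five seconds later the bucket has refilled, so the guest is still online.
allowed_later = throttle.allow("10.0.0.5", now=5)
# A guest on the technician's allow list is never throttled.
allowed_exempt = sum(throttle.allow("10.0.0.9", now=0) for _ in range(300))
print(allowed_now, allowed_later, allowed_exempt)
```

The guest with a lot of chats would see a slower sync after login, since refused connections have to be retried, but the link never goes dead. A bot trying to open thousands of connections is still held to the same steady rate.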

It's 2010 - the reality is that Skype isn't going away... and P2P architectures are continuing to evolve and provide interesting ways to solve communication challenges. The fully-meshed P2P overlay network will continue to be a feature of proprietary networks like Skype as well as standards-based solutions. Travelers want to use communications solutions like Skype... and hotels and their Internet providers need to figure out how to allow the legitimate usage of these tools and services while still keeping their controls in place to block malicious network usage.

What do you suggest? What would you do as a user or as a hotel?

P.S. This issue has been around for quite some time... I wrote about another hotel blocking Skype back in 2007. Same issue... blocking on *quantity* of connections versus actual network impact.

