Mitigating Bandwidth Problems on a Budget

Some schools have it good. I toured ASDubai last year and saw their server room, where they aggregated 10 x 100Mbps internet lines to provide wicked fast service for their campus. I’ve heard that ASBombay has phenomenal internet.

The American International School of Bamako – located in the capital of one of the poorest countries in the world – is not quite there. We’ve got 2.5Mbps of bandwidth. But that doesn’t mean I’m not trying to create an environment where teachers can effortlessly integrate technology into learning.

Spoiler alert: we chose pfSense to provide firewall services, WAN aggregation, bandwidth throttling, and captive portal. The price? Gratis.

Since August, I’ve wanted to open up access to our network as much as possible to encourage students to bring their own devices. Our school is still using dedicated computer labs to give students access to technology. While we have a favorable ratio of computers to students, it’s still too hard for teachers to integrate technology into their practice when it’s a pull-out activity that requires transition time to and from the labs. The layout of the labs isn’t conducive to collaborative learning, either. The whole setup implies that technology is something that happens apart from everyday learning, not embedded in it.

At the same time, I faced very real constraints in the level of service I wanted to offer. Our school has 2.5Mbits of available bandwidth that we pay dearly for, and it’s very easy for just two small classes to consume that when doing web searches or any Web 2.0 activity; Google Drive is unusable. So I had to be creative in how I managed our limited resources.

I wanted to:

  • Manage access to the network. I wanted each student to be able to access the network, but not to abuse it by connecting two or more devices. I was concerned that the automatic updates and push notifications of smartphones and tablets would slow down everyone. At the same time, I wanted to prioritize internet access for the finance and front offices and teachers over that for students.
  • Manage bandwidth and enforce fair use policies. I wanted to prioritize Skype traffic (used by our director for interviews) over web browsing, which in turn should receive priority over p2p. I also wanted to make sure that one user couldn’t hog all the bandwidth with large downloads.
  • Improve reliability and speed. With such limited bandwidth I wanted a robust caching solution. We had a bandwidth manager called NetEqualizer that very cleverly penalized the heaviest network users, but it sat between the squid proxy and the network, which meant that even cached downloads were throttled. Reversing the situation would remove the ability to enforce fair use policies, since all web traffic would look like it was coming from the proxy server. Furthermore, I needed to aggregate our two internet connections (a 2Mbit dedicated line and 512Kbps line) and load balance and ensure failover between them.
  • Minimize manual labor for the IT department. The Wifi system in place required us to manually register the MAC addresses of students and parents who wanted to get on the network. Even with a small user base it was cumbersome to fill out paperwork, record the MAC, and register it with our firewall, and it was a process that wouldn’t scale well.

We looked at three solutions we felt were affordable:

  • IPCop (free)
  • Untangle (~$1500 annually for our user base)
  • pfSense (free)

We decided to implement pfSense since it met nearly all of our requirements. It was also free, compared to a lot of commercial appliances like NetEqualizer, Bluecoat, iBoss, and CyberRoam that run from $5000 to tens of thousands of dollars. Our new setup lets us:

  • Balance traffic between our two connections
  • Prioritize/block internet traffic the way we want, and block inappropriate sites. p2p is severely limited, and I could block it if I wanted
  • Guarantee Skype QoS so that the director can do Skype interviews even at peak hours
  • Throttle web traffic on a per-user basis to ensure fair use in a way that lets casual/research-based web browsing function normally while penalizing heavy downloaders

By December, it will also create an authenticated campus-wide Wifi network that lets students log on with their OpenDirectory credentials (limiting them to one device per person) and lets parents log in using a voucher system – even though our WiFi is basically a consumer-grade network with individually managed access points (although I’m working on fixing that, too).

More detail – almost step-by-step – after the break.

pfSense installed as a virtual machine on Apple XServe 2,1 running VMware ESXi 5.0 update 3

I put two extra NICs into our old XServe and assigned them to pfSense as the WAN links. pfSense had stability issues with the NICs (one of our gateways would go down and not come back up by itself, as described here) which were fixed by bridging the connections with a virtual E1000 NIC instead of dedicating them to the VM using DirectPath. I installed the SquidGuard, Squid, and openvm-tools packages.

Load Balancing and Failover

First, in System > Routing > Gateways > Edit Gateway > Advanced, I set the weight of each gateway. For example, our faster connection (2Mbit) was given a weight of four (since it’s four times faster) as opposed to our slower one (512Kbit), which was given a weight of 1. Then, in System > Routing > Groups, I created a group, added both gateways, and assigned them to the same tier. This means that pfSense load balances and does failover for the two, but it prefers the faster connection, which was earlier given a higher weight.
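
To make the effect of the weights concrete, here is a rough Python sketch. It models gateway selection as a weighted random choice per new connection, which is a simplifying assumption and not a description of pfSense's internals. The gateway names are made up; the weights 4 and 1 come from the setup above.

```python
import random
from collections import Counter

# Weights as configured above: the 2 Mbit line gets 4, the 512 Kbit line gets 1.
# The gateway names are hypothetical, chosen only for this sketch.
GATEWAYS = {"WAN1_2Mbit": 4, "WAN2_512Kbit": 1}


def expected_share(weights, up):
    """Expected fraction of new connections per gateway that is currently up."""
    total = sum(w for name, w in weights.items() if name in up)
    return {name: w / total for name, w in weights.items() if name in up}


def pick_gateway(weights, up, rng):
    """Pick a gateway for one new connection, in proportion to its weight.

    Gateways that are down are skipped, which models failover between
    members of the same tier.
    """
    names = [name for name in weights if name in up]
    if not names:
        raise RuntimeError("no gateway is up")
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    both_up = set(GATEWAYS)
    print("expected share, both up:", expected_share(GATEWAYS, both_up))
    picks = Counter(pick_gateway(GATEWAYS, both_up, rng) for _ in range(10_000))
    print("simulated picks:", dict(picks))
    print("expected share, WAN1 down:", expected_share(GATEWAYS, {"WAN2_512Kbit"}))
```

With both links up, the faster line should carry about four out of every five new connections; if it goes down, everything moves to the slower line.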

This seems to work except for Squid – if the default gateway goes down, Squid does not know to use the other gateways. This means that web browsing stops working until you either change the default gateway or fix the broken one. There appears to be a fix, but I haven’t had the time to test it.

Traffic Shaping using HFSC

I first set up queues in the traffic shaper as follows:

  • LAN
    • qLink > for traffic between pfSense (including the web proxy) and local network. Bandwidth 97.5 Mbps (the LAN interface is 100Mbps, minus the bandwidth we’re devoting to the Internet queue below)
    • qInternet > for traffic to the WAN. Bandwidth set to 2.42 Mbps (97% of our total download bandwidth, to ensure that packet queueing happens at our router and not upstream)
      • qVOIP > for Skype. Bandwidth 200Kbps, Link Share 200Kbps.
      • qACK > for interactive traffic like SSH as well as DNS and ICMP. Bandwidth 10%, Link Share 10%
      • qDefault > used for HTTP, HTTPS, SMTP-S, and IMAP-S. Bandwidth 20%, Link Share 10%
      • qp2p > for everything else. I don’t explicitly block anything; it just gets put in here. Bandwidth 5%, Link Share 5%, default queue.
  • WAN1. Bandwidth 1.94 Mbps (our upload bandwidth for this link)
    • qVOIP > for Skype. Bandwidth 10%, Link Share 10%.
    • qACK > for interactive traffic like SSH as well as DNS and ICMP. Bandwidth 10%, Link Share 10%
    • qDefault > used for HTTP, HTTPS, SMTP-S, and IMAP-S. Bandwidth 20%, Link Share 10%
    • qp2p > for everything else. I don’t explicitly block anything; it just gets put in here. Bandwidth 5%, Link Share 5%, default queue.
  • WAN2. Bandwidth 124Kbits (our upload bandwidth for this link)
    • qVOIP > for Skype. Bandwidth 10%, Link Share 10%.
    • qACK > for interactive traffic like SSH as well as DNS and ICMP. Bandwidth 10%, Link Share 10%
    • qDefault > used for HTTP, HTTPS, SMTP-S, and IMAP-S. Bandwidth 20%, Link Share 10%
    • qp2p > for everything else. I don’t explicitly block anything; it just gets put in here. Bandwidth 5%, Link Share 5%, default queue.
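
The arithmetic behind the LAN-side numbers can be checked with a short Python sketch. It assumes that the percentage figures on the child queues are taken relative to the parent queue (qInternet); that reading is an assumption of this sketch, so treat the absolute figures as estimates.

```python
# Figures from the queue list above, in Kbps.
LINK_DOWNLOAD_KBPS = 2500   # total download bandwidth (2.5 Mbps)
HEADROOM_PERCENT = 97       # shape slightly below line rate so queueing happens locally
Q_INTERNET_KBPS = 2420      # value entered for qInternet (2.42 Mbps)

# Child queues of qInternet: either a fixed Kbps figure or a percentage.
CHILD_QUEUES = {
    "qVOIP": ("kbps", 200),
    "qACK": ("percent", 10),
    "qDefault": ("percent", 20),
    "qp2p": ("percent", 5),
}


def shaped_rate(link_kbps, percent):
    """Rate to configure when shaping to a percentage of the line rate."""
    return link_kbps * percent / 100


def child_rates(parent_kbps, queues):
    """Convert each child queue's bandwidth setting to Kbps.

    Assumption: percentages are relative to the parent queue.
    """
    rates = {}
    for name, (unit, value) in queues.items():
        rates[name] = value if unit == "kbps" else parent_kbps * value / 100
    return rates


if __name__ == "__main__":
    print("97% of the line:", shaped_rate(LINK_DOWNLOAD_KBPS, HEADROOM_PERCENT), "Kbps")
    rates = child_rates(Q_INTERNET_KBPS, CHILD_QUEUES)
    for name, rate in rates.items():
        print(f"{name}: {rate:g} Kbps")
    committed = sum(rates.values())
    print("committed:", committed, "Kbps; left to share:", Q_INTERNET_KBPS - committed, "Kbps")
```

Under that assumption the configured minimums add up to well under the 2.42 Mbps parent, so there is spare capacity for whichever queue is busy.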

Layer7 shaping: Created rules for skypetoskype and skypeout that put them into the VOIP queue. I don’t do torrent shaping here because it can’t detect encrypted torrent traffic. Rather, since torrent traffic won’t match the firewall rules below, it gets put into the p2p queue.

Make Firewall Rules

After setting up the traffic shaper, I had to create firewall rules (Firewall > Rules) to actually put traffic into the right queues. I made most of my rules on the LAN interface – I had trouble getting floating rules to work correctly with load balancing because I just don’t understand enough about how rules work. But here are some examples:

[Screenshot: example firewall rules]

As you can see, HTTP and HTTPS traffic (ports 80 and 443) gets put into the default queue. It’s also assigned to the Gateway Group I set up earlier, so the traffic gets load balanced with failover. You can see some rules at the top to put admin traffic for pfSense into qLink so it doesn’t get throttled – this is very important since if you don’t have the rules, your access to the admin portal will be SLOW.

The last rule in the list is a catchall. It takes all traffic that hasn’t been matched to one of the rules on the other interfaces and puts it into the p2p queue, but it also inspects that traffic for Skype and if it finds it, it puts it into the VOIP queue. This means that our director can hold a Skype interview during peak hours and be guaranteed that her traffic is prioritized.
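
The rule logic described above can be sketched in Python as a first-match classifier. This is only an illustration: the rule list is a guess at the kind of rules involved (it is not a copy of the actual rule set), the port numbers are the standard ones for each protocol, and `looks_like_skype` stands in for the Layer7 inspection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    description: str
    protocol: Optional[str]            # "tcp", "udp", "icmp", or None for any
    dst_ports: frozenset               # empty set means any port
    queue: str
    skype_queue: Optional[str] = None  # queue used if Layer7 inspection sees Skype


# Hypothetical rule list, ordered top to bottom; the first match wins.
RULES = [
    Rule("DNS", "udp", frozenset({53}), "qACK"),
    Rule("SSH", "tcp", frozenset({22}), "qACK"),
    Rule("ICMP", "icmp", frozenset(), "qACK"),
    Rule("Web and secure mail", "tcp", frozenset({80, 443, 465, 993}), "qDefault"),
    Rule("Catchall", None, frozenset(), "qp2p", skype_queue="qVOIP"),
]


def classify(protocol, dst_port, looks_like_skype=False):
    """Return the queue for a connection, using the first matching rule."""
    for rule in RULES:
        if rule.protocol is not None and rule.protocol != protocol:
            continue
        if rule.dst_ports and dst_port not in rule.dst_ports:
            continue
        if looks_like_skype and rule.skype_queue is not None:
            return rule.skype_queue
        return rule.queue
    return "qp2p"  # unreachable while the catchall rule is present


if __name__ == "__main__":
    print(classify("tcp", 443))                            # web browsing
    print(classify("udp", 53))                             # DNS lookup
    print(classify("tcp", 6881))                           # unmatched traffic
    print(classify("udp", 40000, looks_like_skype=True))   # Skype via the catchall
```

Because nothing matches encrypted torrent traffic explicitly, it falls through to the catchall and lands in the p2p queue, which is the behavior the setup relies on.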

Transparent proxy/web caching with Squid

If you’re on a fast connection, then web caching can actually be slower than just fetching the web page from a remote server. But when you’ve got 20 students trying to access IXL or Raz-Kids at the same time, caching can be a lifesaver because you only need to download the same website once. The problem is that your bandwidth manager is going to limit traffic from the cache, so there’s no apparent speed difference to the user. But you can configure pfSense so that traffic from the web cache gets put into a faster queue. I followed these instructions, and tested them by downloading a 20MB file multiple times. Squid logs verified that the file was coming from the cache, and the download speed was faster than the bandwidth allotted to qInternet, so I know it was working. I also changed the Squid settings so that it stores files up to 50 or 100MB in size – you want to set it to at least 2MB, which covers the images that pop up in Google Image search (once again, I’m imagining large classes coming in and doing similar searches, hitting the same content and multiple large image files).
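
A back-of-the-envelope Python sketch shows why the cache matters at this bandwidth. The 2 MB page weight is an illustrative assumption, and the calculation ignores protocol overhead and competing traffic.

```python
def transfer_seconds(size_megabytes, rate_mbps):
    """Seconds to move size_megabytes over a link of rate_mbps (megabits/s)."""
    return size_megabytes * 8 / rate_mbps


def class_wan_seconds(students, page_megabytes, wan_mbps, cached):
    """WAN time needed for a whole class to load the same page.

    With a cache, the page crosses the WAN once and later requests are
    served locally; without one, every student pulls a separate copy.
    """
    fetches = 1 if cached else students
    return transfer_seconds(page_megabytes * fetches, wan_mbps)


if __name__ == "__main__":
    students = 20         # class size used in the example above
    page_megabytes = 2    # assumed page weight, for illustration only
    wan_mbps = 2.42       # the qInternet rate
    print(f"no cache:   {class_wan_seconds(students, page_megabytes, wan_mbps, False):.0f} s")
    print(f"with cache: {class_wan_seconds(students, page_megabytes, wan_mbps, True):.0f} s")
```

Under these assumptions the class ties up the WAN for a couple of minutes without a cache and for a few seconds with one, provided the cached copy is not throttled on its way to the browser.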

Per-User Bandwidth Limits

I’m not completely finished with this. pfSense is queueing traffic the way I want it to, but it’s not doing per-user fairness. p2p isn’t really an issue since it’s limited to no more than 5% of bandwidth, but since HTTP and HTTPS are the majority of our traffic, I need to keep any one user from soaking up all the bandwidth. Luckily, Squid has a feature called Delay Pools that lets you limit bandwidth per user. You can even set it up so that a user gets a burst of high speed, but after a certain amount of data it’s throttled back down to a slower speed – a scenario that favors people browsing and reading webpages but penalizes downloaders of big files or many small ones (as in the case of Dropbox). This is why I’ve implemented bandwidth throttling in Squid rather than through Captive Portal (below). pfSense’s squid package isn’t set up for this by default, so you need to make some changes manually before it will work. The relevant section in my /usr/local/pkg/squid.inc looks like this:

delay_pools 1
delay_class 1 2
#Set the overall bandwidth pool to $overall and per-host throttling to a refill rate of $perhost (bytes per second) with a burst bucket (max) of 51200 bytes (50KB). The $overall and $perhost values are set in the WebGUI.
delay_parameters 1 $overall/$overall $perhost/51200
#delay_parameters 1 $overall/$overall $perhost/$perhost
delay_initial_bucket_level 100

and our Squid config in the WebGui looks like:

[Screenshot: Squid settings in the pfSense WebGUI]
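
To see what the per-host bucket does, here is a minimal Python simulation of a single host's bucket in a class 2 delay pool. It steps in whole seconds and ignores the aggregate bucket, so it is a simplified model and not Squid's actual code. The 8192 bytes/s refill rate is a made-up stand-in for $perhost; the 51200-byte bucket size comes from the config above.

```python
def simulate_bucket(restore_bytes_per_s, max_bytes, demand_bytes_per_s,
                    seconds, initial_percent=100):
    """Return the bytes delivered to one host in each second.

    The bucket starts at initial_percent of max_bytes (compare
    delay_initial_bucket_level), drains as the host downloads, and is
    refilled by restore_bytes_per_s each second, capped at max_bytes.
    """
    level = max_bytes * initial_percent // 100
    delivered = []
    for _ in range(seconds):
        sent = min(level, demand_bytes_per_s)
        level -= sent
        delivered.append(sent)
        level = min(max_bytes, level + restore_bytes_per_s)
    return delivered


if __name__ == "__main__":
    PER_HOST = 8192      # hypothetical refill rate in bytes/s
    BUCKET_MAX = 51200   # 50 KB burst, as in the config above

    # A heavy downloader wants far more than the refill rate.
    print("downloader:", simulate_bucket(PER_HOST, BUCKET_MAX, 10**9, 5))
    # A casual browser stays under the refill rate and is never throttled.
    print("browser:   ", simulate_bucket(PER_HOST, BUCKET_MAX, 4096, 5))
```

The downloader gets the first 50 KB at full speed and is then held to the refill rate, while the browser never empties the bucket, which is the asymmetry described above.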

Still to Do: Captive Portal

The last piece is to set up Captive Portal, which pfSense allows you to activate by physical interface or VLAN. We don’t have VLANs set up yet, so school-owned computers will be added to a whitelist by MAC address, but everyone else will be redirected to a login page where they will have to log in with their OpenDirectory credentials (authenticated against a RADIUS server). This will allow us to associate MAC addresses with actual users and limit them to one device per username. Additionally, pfSense will generate a list of one-time-use vouchers that we can give out to parents visiting campus, especially PTA members. There are a lot of enterprise Wifi solutions that take care of this for you, but we just don’t have the funds for that right now – we can integrate this with our existing network of small business- and consumer-grade access points.
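
The two policies described here (one device per username, one use per voucher) can be sketched in Python as follows. The class and method names are invented for illustration, the credential check against RADIUS is assumed to have already succeeded, and pfSense's own behavior may differ (for example, it could drop the older session when a second device logs in).

```python
class PortalSessions:
    """Toy model of the captive portal's session bookkeeping."""

    def __init__(self, vouchers):
        self._device_by_user = {}              # username -> MAC address
        self._unused_vouchers = set(vouchers)  # codes not yet redeemed

    def login_user(self, username, mac):
        """Admit a user unless they already have a different device online."""
        mac = mac.lower()
        current = self._device_by_user.get(username)
        if current is not None and current != mac:
            return False
        self._device_by_user[username] = mac
        return True

    def logout_user(self, username):
        """End the user's session so another device can log in."""
        self._device_by_user.pop(username, None)

    def redeem_voucher(self, code):
        """Accept a voucher code once; later attempts are refused."""
        if code in self._unused_vouchers:
            self._unused_vouchers.remove(code)
            return True
        return False


if __name__ == "__main__":
    portal = PortalSessions(vouchers={"PTA-0001", "PTA-0002"})  # made-up codes
    print(portal.login_user("student1", "AA:BB:CC:00:00:01"))   # laptop
    print(portal.login_user("student1", "AA:BB:CC:00:00:02"))   # phone, refused
    print(portal.redeem_voucher("PTA-0001"))
    print(portal.redeem_voucher("PTA-0001"))                     # already used
```

Keeping the username-to-MAC mapping in one place is also what makes it possible to tie a device on the network back to an actual person.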

The downside is that setting all of this up required a LOT of reading. See my Diigo bookmarks (tags for squid and pfsense) to see what articles I found relevant.
