Getting gigglebits to gamers — How we scale PFSense at temporary events

Here at Multiplay, we run the biggest local area network (LAN) events in the UK, which we call the Insomnia Gaming Festival. For anyone who has not experienced this, picture 3000+ gamers who bring their computers along to the National Exhibition Centre in Birmingham for a whole weekend of 24/7 gaming. 3000 computers in one building presents some interesting challenges, but when you start to add gaming into the mix, it brings a whole new set of problems.

When you play a multiplayer game, every other player needs to know your movements, and this works by constantly sending these information updates via a server. If this information does not arrive, or arrives late, it could be the difference between narrowly ducking behind cover and being shot square in the head!

To make sure our gamers have the best experience, we need to ensure our network is performing as well as possible and we use PFSense to help us with part of that problem.

PFSense is something that nearly every network admin will know of, or will have dealt with. For anyone who is not inducted in the ways of PFSense, it’s an open source router/firewall which has many, many uses; all of which can be found here: https://www.pfsense.org/

It’s a very popular product for a small branch office or home router, but how does it deal with scale?

We are avid users of PFSense and have been using it for many years for a variety of applications (Shoutout to those 1.2.3 power users!). We’ve watched it grow from a small, quirky open-source product to a solid, stable platform which we use to serve terabytes of data to thousands of clients.

Scooting back 18 years to the start of our Insomnia Gaming Festival, we needed a way to add some firewall rules and to do some basic NAT (Network Address Translation) at our very first events. For 20 people in a room sharing a 56k dial-up connection, we used a FreeBSD server running IPFW with a ‘block all’ stance for everything but IRC (Internet Relay Chat)! A few events later, we stumbled across PFSense, which used PF (Packet Filter — the default firewall for most BSD-based systems) under the hood and actually had a user interface. It was easy, simple and worked like a charm.
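The actual ruleset is long lost, but just for flavour, a ‘block all but IRC’ stance in IPFW really is only a handful of rules (the rule numbers and port are illustrative, not the original config):

    ipfw add 100 allow tcp from any to any 6667    # outbound IRC
    ipfw add 110 allow tcp from any 6667 to any    # IRC replies coming back
    ipfw add 65000 deny ip from any to any         # everything else: blocked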

As our event grew, so did our PFSense boxes: we used newer and newer x86 hardware, which allowed us to keep the flexibility we needed at our events without having to buy super expensive firewalls. We soon broke the 1000 player mark and our sponsored 100Mb circuit became rather full! We used PFSense’s excellent traffic shaper and some heavily customised rules to ‘squash’ web traffic whilst allowing the teeny tiny UDP gaming traffic to go through. Anyone who knows online gaming will be wincing reading this; latency, jitter and packet loss lead to gaming’s enemy number one: LAG. It’s the difference between the headshot of glory and an epic fail.
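We won’t reproduce those heavily customised rules here, but as a rough flavour of the idea: PFSense’s shaper is built on ALTQ, and a minimal priority-queue setup looks something like this (the interface, bandwidth, ports and queue names are all made up for illustration):

    altq on em0 priq bandwidth 100Mb queue { q_games, q_web, q_default }
    queue q_games   priority 7
    queue q_web     priority 2
    queue q_default priority 1 priq(default)

    pass out on em0 proto udp from any to any port 27000:27050 queue q_games
    pass out on em0 proto tcp from any to any port { 80, 443 } queue q_web

Small, latency-sensitive UDP packets jump the queue; bulk web transfers get whatever is left.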

For anyone who hasn’t taken a step into the world of online gaming, its traffic pattern is similar to that of VoIP (Voice over Internet Protocol); small UDP packets which NEED to arrive in order and on time. Packet loss is not an option. In gamer terms, losing packets or increasing latency is the equivalent of taking the spark plugs out of your car engine or the chain off your bike. It ruins everything.

Given that we are all nerds at Multiplay, we grew up battling this problem. In our houses, we made sure we had the best networking gear and the lowest latency ADSL lines. Doing this at scale is a different thing. When you turn up to a gaming festival with your computer, you expect the internet connection to be flawless and better than your home line.

Well, if you think about it, that’s quite a big ask nowadays! Lots of UK houses now have access to 40Mb/s FTTC (Fibre to the cabinet). Take that and extrapolate to 4000 users and you start to deal with some pretty big numbers! But the sheer volume of traffic is only half the story.

Typically, firewalls are mortal enemies of computer power users. They get in the way, disrupt traffic and, more importantly to the gamers, they slow everything down. However, you’ve got to admit that they serve a purpose: securing traffic where needed and giving you a nice point in the network to perform network functions like NAT and traffic shaping. We think NAT is evil (If you are interested, ask me why!), so we soon threw that out the window, but that leaves us with the two remaining pieces: security and traffic shaping.

We’ve covered traffic shaping above. To summarise that point again: Latency = Bad, and playing games at a video game convention is more important than cat videos (although we like cats too). So that brings us on to the last point: security. If we had our way, we’d have a liberal, hippy internet where nothing was blocked, but let’s face it; that doesn’t work in real life. PFSense helps us to block some nasties both in and out of the network (Yes, not everyone who turns up to these festivals is an internet good guy).

So for various reasons, the firewalls are here to stay. Why PFSense? Well, I’ll give you three pretty simple reasons:

  1. Reliability — FreeBSD (the operating system on which PFSense is built) is known for its reliable network stack. It’s made to do this, we’ve been using it for years, and we trust it.
  2. Cost — PFSense is free; just throw some hardware at it and off it goes. This is REALLY important when you are building temporary networks. It’s a very short conversation when you ask your boss for a five-figure sum to implement a vendor firewall solution that will only be live for 3 weeks a year!
  3. Speed — Man, I just keep harking back to this point, don’t I? If it ain’t fast, it’s no good! You can’t 360 no-scope if your firewall is adding a few ms of latency to every packet.

On the face of it, PFSense seems like it sits in the middle of one of those Good, Fast, Cheap Venn diagrams. You know, the one where they always make you pick two. So what’s the catch? Ah… Well.

[Image: the Good / Fast / Cheap ‘pick two’ Venn diagram. Thanks to Jeannel King (jeannelking.com) for the image which perfectly illustrates my point!]

Unlike unicorns, PFSense is not 100% perfect and does have a few quirks. We’ve tried a few things over the years to counter these, so here they are in no particular order:

Mbufs — Memory Buffers (or Nmbclusters)

This might be quite familiar to Unix admins out there. Mbufs are nothing more than buffers in memory that are directly allocated to the networking stack. They are used for storing packet data whilst the hardware and network stack are processing it. Really, all you are looking for here is to make sure you are not hitting the configured limits. Typically, if you have a system with more memory, or bigger ports (10G+), you are going to want to increase these to ensure you are not hitting those limits. This is especially important when operating in gaming environments with lots of small packets. The change is a simple one; just add “kern.ipc.nmbclusters” to your system tunables list, or loader.conf (Note: loader.conf is the old-school PFSense way of doing things; this should always be done in the webUI now).
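As a rough sketch of what that looks like (the value here is purely an example; size it to your RAM and NIC count rather than copying it):

    # System > Advanced > System Tunables in the webUI, or /boot/loader.conf.local
    kern.ipc.nmbclusters="1000000"

    # Then keep an eye on usage; any 'denied' counters mean the limit is too low
    netstat -m | grep -E 'mbuf clusters|denied'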

NICs — Network Interface Cards

When you start pushing the limits of the NICs you are using, things start to get fun. We’ve experimented with several types of NICs over the years, but the higher-end ones which seem to work better almost always have their own hardware offload built into the card. We first used ‘em’, then migrated to ‘igb’, and then worked through various different cards; on our last revision of x86 hardware we used Chelsio NICs. Bottom line is, when you start to push higher PPS traffic, you need to think about more than just CPU and memory in your machines.
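A couple of quick FreeBSD-side checks help here (igb0 is just an example interface name; swap in your own):

    # What offloads the driver has enabled (checksum offload, TSO, LRO, etc.)
    ifconfig igb0 | grep options

    # Per-queue interrupt rates, a rough proxy for how packet load spreads across queues
    vmstat -i | grep igb0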

Moving to 10G

Ok, so this one is just a simple case of ‘my pipe is too small’. For several events, we had moments where we touched maximum capacity through our 1G NICs. The solution was to upgrade our PFSense boxes with 10G NICs. Back in the old days of PFSense, this meant custom settings and special drivers, but nowadays, pretty much all the common 10G NICs are supported right out of the box.

Spreading the load

After upgrading our NICs, we started to see a natural point of resistance on the firewalls. Whenever we hit the ~1.5Gb/s throughput mark, we would start seeing higher latency for traffic traversing the firewall. On our x86 hardware, we had to balance traffic to ensure the CPU did not hit its maximum. We theorised that this was due to full NIC buffers and some traffic being punted to the CPU during busy times. As mentioned above, we tried lots of different NICs and eventually settled on Chelsio, which helped alleviate the problem but did not remove it. At this point, we were running the firewalls active/standby, which relied on CARP (Common Address Redundancy Protocol) to fail over a VIP (Virtual IP) if/when the active firewall failed. Obviously, this meant that we had a whole firewall sitting there not doing anything, so it was time to make some bigger changes!

We made the firewalls active / active and used our routing infrastructure to load balance traffic between the firewalls. PFSense has a few handy features to make this easier:

  1. Config sync, where changes in firewall rules are synced between firewalls (a single point of management).
  2. State sync, which does what it says on the tin and syncs states between firewalls, allowing return packets from WAN -> LAN to hit either firewall and be correctly processed (there’s a rough sketch of the underlying plumbing just after this list). This worked really well for a few events, but we then started to hit problems.
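Under the hood, config sync is PFSense’s own XML-RPC mechanism set up in the webUI, while the VIP failover and state sync come from CARP and FreeBSD’s pfsync. A hand-rolled sketch of the equivalent plumbing looks roughly like this (the interfaces, VHID, password and addresses are all made up):

    # Shared WAN address that floats between the two firewalls via CARP
    ifconfig igb0 vhid 10 advskew 0 pass s3cret alias 203.0.113.1/24

    # Push state updates to the peer over a dedicated back-to-back sync link
    ifconfig pfsync0 syncdev igb3 syncpeer 169.254.0.2 up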

State Sync

So it turns out, when you have 4000+ clients and lots of tiny sessions, you end up hitting some state limits. There’s a surprise! No problem — raise the limits and try again. Well, we ended up with a very interesting issue; the dedicated 1Gbps cable between the two firewalls used for state syncing was actually saturating! Yes, you heard right; 1Gbps of state syncing updates. Time for a rethink! Initially, it seemed obvious that you would just upgrade it to 10G, but given the elevated CPU and memory usage we saw when syncing states, we would have just hit similar issues with the CPU.
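Raising the limit itself is the easy bit: it’s the Firewall Maximum States setting in the webUI, which maps straight onto PF’s state limit underneath. A quick sketch of checking and setting it by hand (the number is just an example):

    # How many states are in use right now, and what the hard limits are
    pfctl -si | grep current
    pfctl -sm

    # The webUI setting corresponds to this pf.conf directive
    set limit states 4000000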

We eventually settled on a bit more of a clunky solution, which we intend to change in the future. We split our traffic in half and route it via either firewall. As we know which firewall our traffic is leaving the network from, we know where to redirect it when it comes back in, which solves the issue of packets hitting a firewall which they did not leave from (and which therefore has no state for them). We think this is a little bit of a kludge and have some ideas on how to re-engineer the solution with a little bit more finesse — more on that when we’ve done it!
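Conceptually it’s nothing cleverer than keeping the routing symmetric: half the client address space goes out (and comes back in) via one firewall, and the other half via the second. Expressed as plain static routes purely for illustration (the prefixes and next-hops are invented, and our actual routing gear isn’t configured with FreeBSD route commands):

    # LAN side: split outbound traffic between the two firewalls by client subnet
    route add -net 203.0.113.0/25   198.51.100.1   # via firewall A
    route add -net 203.0.113.128/25 198.51.100.2   # via firewall B

    # WAN side: make sure return traffic comes back through the same firewall
    route add -net 203.0.113.0/25   192.0.2.1
    route add -net 203.0.113.128/25 192.0.2.2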

The move to appliances

Around a year ago, our faithful x86 servers were due for replacement and we decided we wanted to try something new. We did some research and made the leap to PFSense appliances. We’ve not looked back!

After lots of talking with our good friends over at Amica tech, we settled on the XG-2758. This is not the biggest, baddest box in the PFSense appliance arsenal, but we picked it for several key features:

  • Small compact size (Great for transporting to and from events)
  • Onboard 10G, with space for an expansion slot
  • Decent throughput figures
  • 4 x 1G onboard

This makes it SUPER flexible for the world of events: it can be used as a small 1G router or as a bigger 10G one when needed. We run them active/active depending on the show (and number of clients) and we have been super happy with them. Note — these have recently gone end-of-sale, so we are excited to see what comes next!

P.S We really do recommend Amica tech; they are smart guys with a real passion for PFSense like us! Check them out at: https://amicatech.co.uk/

The wrap up

All in all, we’ve been on a pretty massive journey with PFSense and it’s 100% here to stay in our networks. Take the time to learn its quirks and you will be rewarded with a platform that gives you the best of all three worlds: Good, Cheap and Fast! Please take this article as a helpful learning tool about our journey, not a “You should totes go and spend all your money on new NICs”. We are in a very odd situation when it comes to building pop-up / temporary low-latency networks very quickly at scale, so this won’t be for everyone! I’m also not 100% sure we are even doing it ‘correctly’, if there even is a correct way to do this. But hey, who actually needs five 9’s anyway? (Plot twist: Gamers really do)
