The WAF Swiss-Knife

This blogpost showcases some not-so-commonly-advertised benefits and features that modern WAFs have, and how they can be used in an average company to gain benefits ranging from technical to political. It also provides some tips and tricks that I would have liked to know a few years ago.

This was originally presented as a talk at B-Sides São Paulo 2024’s Cloud Village. It has been translated and adapted into this blogpost to further spread it.

Special thanks to the following humans for reviewing the original talk and suggesting improvements!

Douglas Oliveira (nullniverse)
Felipe Espósito (Pr0teus)
Lucas Santos (LI: “lucas-ds-santos”)
Natalia Sampaio (NullNats)

1 – Intro

It turns out a lot of people don’t really get what a WAF does.

And it also turns out that lots of companies kind of suck at managing their WAFs.

So, it seems prudent to talk a little bit about WAF management. Specifically, I want to focus on the features and benefits of a WAF that are not commonly talked about, the creative ways in which you can use them, lots of little tips and tricks that my past self would have liked to know, and the “hidden” benefits that a WAF can bring you, ranging from technical to political.

So that’s where the title comes from. You’ll learn to use your WAF as a Swiss knife. Not the absolute best at anything, but very versatile, quick and life-saving in some situations.

And I hope that this knowledge will make your WAF a little better, and thus, the internet a little bit safer.

Which WAFs are we talking about?

Just before we go any further, I’d like to make it clear that we’re mostly talking about SaaS WAFs with plenty of other features. One of these, basically:

We are NOT talking about:

“Old-school”/“Traditional”/“On-prem” WAFs that are “just WAFs” (i.e. not many extra features)
Cloud-native WAFs (like AWS’s, GCP’s, etc.), which usually also don’t have many extra features.

For most content we’ll be using Cloudflare as the reference point, simply because that’s what I have the most experience with, but the content itself is as agnostic as possible.

By the way, do note that there has been some “convergent evolution” among WAFs and CDNs – WAFs have adopted CDN features, and CDNs have adopted WAF features. So you may also find these tips usable in solutions that are more labeled as CDNs.

2 – Knowledge Leveling

Before we get into the meat of the content, let’s take a step back and review a few things about infrastructure architecture and about how WAFs work. This will be useful for later sections.

2.1 – What does a modern infrastructure look like?

Let’s start from the ground up. First, a developer will produce an application. This may be delivered as an executable, a script, a docker container or other methods. Let’s assume he’s producing a container.

But a container by itself does nothing. Someone working in the Infra/Cloud/DevOps/WhateverOps team will have to create something to run the container, and then make it reachable through the internet.

Let’s assume he decided to use a single instance, hosted in a cloud provider, that has a public IP.

This works! But it has a few problems.

First, no one uses an app by typing its raw IP address into the browser. So the infra guy will register a domain and configure that domain to point to the actual IP. Now the users can just memorize the domain name.

Much better! But we’re just getting started. Here are more problems that the infra guy will have to solve:

1 – Security: How do you encrypt traffic so that sensitive information is not going around in cleartext?

Encrypting the traffic is easy in theory, but quite some work in practice: he will have to generate TLS certificates for all servers, install them, and then maintain them. There are solutions that make this easier btw, but we’re not talking about them today.
(By the way, you can generate free TLS certificates using Let’s Encrypt. They are, mathematically, just as safe as any paid TLS certificate. If you’re paying for TLS certificates, then don’t.)

2 – Scalability: How do you ensure the service will not fail if too many users try to access it?

At this point the infra guy has two options. Vertical scaling essentially means adding more capacity to existing machines; it is generally less preferred, but can actually work wonderfully. Horizontal scaling means spinning up additional machines/instances to handle the additional traffic; it is usually harder to configure, but is seen as “more scalable”, since there is a hard limit on how much RAM and how many CPUs you can fit into a single machine, while there is no comparable limit on how many instances you can create. In practice it’s usually a mix of both, by the way.

Anyway, moving on, let’s take the more common path and assume he decided to use horizontal scaling.

Now he has multiple instances. How does he balance the traffic between them? Well, by using a load balancer! Note also that the public IP is now the load balancer’s IP, and that TLS certificates are installed and being used.

However, it’s pretty rare for a company to only host a single service. To accommodate these multiple services under a single domain, we need to introduce something to route requests to their designated services. Think of something like:

tigrinho.app/blog -> Direct to the ‘blog’ service
tigrinho.app/app -> Direct to the ‘game’ service
etc…
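
As a rough illustration of that routing idea, here is a minimal Python sketch of longest-prefix path routing. The route table only mirrors the two examples above, and `route_request` is a hypothetical helper, not the API of any particular gateway or load balancer.

```python
# Hypothetical route table: path prefix -> backend service name.
ROUTES = {
    "/blog": "blog",
    "/app": "game",
}


def route_request(path: str, default: str = "not-found") -> str:
    """Return the service for the longest matching path prefix."""
    best_prefix = ""
    for prefix in ROUTES:
        # Match "/blog" and "/blog/...", but not "/blogger".
        if path == prefix or path.startswith(prefix + "/"):
            if len(prefix) > len(best_prefix):
                best_prefix = prefix
    return ROUTES.get(best_prefix, default)
```

Real gateways add things like host matching, rewrites and weights on top, but the core decision is usually this kind of prefix lookup.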

This can be done using something that may be called an API Gateway, an L7 Load Balancer, or something else if the marketing team is up to it. So now our design looks like this (the endpoints have been omitted):

Note that, again, the public IP has changed.

Now, finally, someone comes along with the idea of introducing a WAF. So the WAF is introduced! Yay!

The WAF stands at the very outermost layer of the infrastructure, and it will be responsible for intercepting all requests, analyzing/processing them, and then blocking or allowing them. No IPs in the original infrastructure have changed, only the DNS record.

The WAF is basically a benign Man-In-The-Middle (MiTM). Note that, as your DNS points to it, it can get valid TLS certificates issued for your domain – and that’s exactly what it does. It is able to decrypt all traffic to analyze it, modify it if required, then re-encrypt it and send it forward to your origin server. It’s transparent, yet powerful.

Another thing: Don’t forget that, just because you have a WAF set up, that doesn’t mean people can’t bypass it!

To do so, all an attacker would have to do is connect directly to your infrastructure’s real IP. “What about the TLS certificates?” – nothing some magic using /etc/hosts and telling Chrome to trust untrusted certificates can’t fix. Finding the real IP may be tricky, but it is often possible.

Anyway, to prevent this, you should remember to block source IPs that do not belong to your WAF in your cloud’s firewall. It’s the only certain way to do it.

You may be forced into creating exceptions for some very special cases, but it should be very rare, and you should be very careful about it (document all exceptions, don’t allow a bigger range than you should, review every once in a while, etc.)
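
To make the allowlisting logic concrete, here is a minimal Python sketch of the check your cloud firewall would effectively perform. The ranges below are placeholders (reserved documentation networks), not any vendor's real ranges, and `is_from_waf` is a hypothetical helper; in practice you would configure this in your firewall or security group using the ranges your WAF vendor publishes.

```python
import ipaddress

# Assumption: placeholder ranges (reserved documentation networks).
# Replace them with the ranges published by your WAF vendor.
WAF_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_from_waf(source_ip: str) -> bool:
    """Return True if the source IP belongs to one of the WAF's ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in WAF_RANGES)
```

Anything for which `is_from_waf` returns False should be dropped before it reaches your load balancer.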

2.2 – How to set up a WAF

There are basically two ways to configure your WAF.
You can either point an existing domain to it, typically using a CNAME record:

Or you can turn your WAF into your authoritative DNS server:

The second method is a little bit more complex, but not harder: you change your domain’s authoritative nameservers to the ones your WAF provides, and it starts managing all your DNS records. You also have to import existing DNS records, but your WAF should have something to do this for you. Just be careful, because changing the nameservers can take a long time (up to 1 day) to propagate around the globe, and during this period any DNS changes you make may not propagate at all.

I highly recommend using the second method, simply because it gives you (the WAF administrator) the power to manage DNS records. If another team is responsible for managing DNS, you will always have headaches pointing new (sub)domains to your WAF, or preventing said team from turning off the WAF (which is something they’ll want to do every time any bug is found “just in case”). In other words, you’ll always be reactive, never proactive.

If DNS management is in your hands, YOU can ensure that the WAF is enabled for all domains, and you can also make sure that things are OK before a new subdomain is published, for example (but more on that later). Also, using the first method may disable or restrict certain features of your WAF.

3 – The reverse honeymoon

Very well, so you’ve installed your WAF. Here’s how it usually goes:

“Wow, what a nice, new WAF. It’s blocking all SQLi’s from Havij, and XSS’es – well, as long as no one uses a bypass, at least. It’s gonna fuck something up every once in a while and the other teams (and my boss) will scream at me. Oh look, now I’m being called into every single warroom because every time something goes wrong the WAF is always a suspect. Damn, that sucks. Is this thing really worth it? Well, compliance requires it… Ah, but I can just set it to ‘monitoring mode’! This way no one will scream at me, I won’t risk bringing prod down again, and compliance is happy! I’m sure we’ll properly analyze and respond to all alerts the WAF generates.”

And that’s how a WAF dies. How do we prevent that?
I hope the following tips will help you manage your WAF more efficiently, and maximise the value you’re getting out of it. And I hope that this will mean you’ll have stronger security and a stronger relationship with other teams in your organization.

4 – Using your WAF as a Swiss Knife

Here are a few things your WAF can do that you may not know about:

DNS management
Load Balancing
Transparent certificate management
Remote infrastructure access (the folks at Gartner call it Zero Trust Network Access sometimes)
Customizable Rate Limiting
Support security operations and incident responses
Performance tuning for pages
Global Caching
Virtual Patching
Deception Operations
And much more!

The caveat is that your WAF may not be excellent at these jobs, or even the recommended tool for them, but choose your battles. This will be a central theme from now on. Sometimes you need a quick fix, or you need to deliver something but don’t have other resources to do it.

4.1 – Digital certificates

Your WAF is able to “swap” digital certificates for your domains. In fact, that’s what it’s doing for this blog right now.

How does it do that? We’ve already seen in section 2 that it is basically a benign MiTM, able to decrypt and re-encrypt all traffic. So, it can issue and use a valid certificate to guarantee a secure connection between the end user and the WAF, and then it can do whatever for the connection between your WAF and your backend servers.

Want to use an expired certificate? Or even plain HTTP? It can “hide” all this from the end user. They’ll only see the valid WAF certificate.
Obviously, this is not recommended. It would be better to do things properly and use valid, rotating certificates in all your backend servers. But if you have 300 legacy apps without TLS, and you don’t have enough manpower or political influence to adapt, configure and maintain everything… Then the WAF is a viable alternative. Again, pick your battles.

As a bonus, you can also mess around with some other TLS settings. “Always use HTTPS” is particularly awesome for removing all traces of HTTP from your apps, and enabling TLS 1.3 and restricting old TLS versions is also useful for fixing some bugs, improving security and meeting compliance requirements.

4.2 – Redirecting Requests

You can use your WAF to easily redirect requests from any page to any other pages based on parameters such as the path, a cookie being present, the region of the caller, etc… Here are a few examples:

Redirecting users based on their home region:

Redirecting users away from an old path:

There are a thousand other cases that will pop up. Just keep this functionality in mind and communicate it to other teams, and they’ll be asking you to configure redirects pretty often.
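
The two examples above (a region-based redirect and a redirect away from an old path) boil down to logic like the following Python sketch. The paths, country codes and the `redirect_target` helper are all hypothetical; in a real WAF you would express the same conditions in its rule editor.

```python
from typing import Optional

# Hypothetical rules, for illustration only.
REGION_HOMES = {"BR": "/br/", "US": "/us/"}
MOVED_PATHS = {"/old-pricing": "/pricing"}


def redirect_target(path: str, country: str) -> Optional[str]:
    """Return the path to redirect to, or None to let the request through."""
    # Old path -> new path.
    if path in MOVED_PATHS:
        return MOVED_PATHS[path]
    # Visitors landing on the root go to their regional home page.
    if path == "/" and country in REGION_HOMES:
        return REGION_HOMES[country]
    return None
```

Returning None for everything else matters: a redirect rule that matches too broadly is an easy way to create redirect loops.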

“But shouldn’t this be done directly on the server?” – Again, it should, but choose your battles. Sometimes no one even knows where the dang server is, and the only dev that knew started getting messages from the fifth dimension and parted in a journey to find the flat earth’s border.

There IS a legitimate case though: sometimes your subdomains may be related to third-party services which you do not have complete control over (think stuff like FreshDesk, managed WordPresses and other SaaS apps). In these cases, you may not be able to create the redirect in the tool itself, and you are forced to do it a layer above. Doing this in the WAF works fine, and has saved my behind a few times in the past.

4.3 – Load Balancing, Request Routing and Health Checks

Some WAFs also offer options for Load Balancing, Request Routing and Health Checks.

These are features that are very commonly offered by clouds natively, and probably with more configuration flexibility than your WAF’s offering. However, there may still be some benefits to doing this in your WAF:

If you’re using multiple clouds, it’s sometimes easier to configure balancing/failover between them directly on the WAF instead of routing everything through one cloud to send it to other clouds.
- My experience in using them to switch to a DR environment during incidents or outages has also been pretty positive – If something went wrong in your cloud, it’s possible you won’t be able to do this sort of switching there. So your WAF ends up acting like an isolated control layer, which is very nice.
If you have users around the globe, your WAF will probably have optimized configurations for them in a more “transparent” manner, while in the clouds you may have to worry about replicating your load balancer across regions.
If you don’t have many different apps/services, or if the routing between them is relatively simple, then you can try configuring your entire routing logic inside the WAF. This saves you the effort of using a separate service for this, or of becoming committed to a single cloud (e.g.: it may not be easy to port the configurations of your AWS LB to other clouds).
Health Checks configured in your WAF usually work VERY well, since the WAF has endpoints all across the globe and most closely mimics what your customers are seeing. You’ll easily be able to see if an outage is regional or global, too.

Again, probably not the best tool for this, but worth considering in specific scenarios.
The exception is health checks – you should absolutely take some time to configure them for your main pages.
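
To show why multi-region health checks are handy, here is a small Python sketch that classifies an outage as regional or global from per-region results. The region names and the `classify_outage` helper are made up for illustration; your WAF will present this in its own dashboard or API.

```python
from typing import Dict


def classify_outage(results: Dict[str, bool]) -> str:
    """Classify per-region health-check results (region -> check passed)."""
    failed = sorted(region for region, ok in results.items() if not ok)
    if not failed:
        return "healthy"
    if len(failed) == len(results):
        return "global outage"
    return "regional outage: " + ", ".join(failed)
```

For example, `classify_outage({"sa-east": False, "us-east": True})` points at a regional problem, which usually means a very different war room than a global one.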

4.4 – Rate Limiting

Another very important thing: Rate Limiting!

This is something that WAFs are very well positioned and well-equipped to handle, so no guilt about doing this on your WAF.

Another thing: rate limiting is rarely considered when an app is being configured or developed, but it is extremely good to have and will prevent many other issues, ranging from security to cost. So this is something where you can be proactive and help your engineering buddies.

Properly tuning your rate limits will allow your users to use the application normally, but will give headaches to any attackers trying to enumerate pages, brute force credentials, mass download resources and other shady things.

BROTIP: Have a basal rate limit for all pages (e.g.: 500 requests/min) and smaller or bigger limits for specific pages (e.g.: 3/min for your login pages, 1000/min for your static resources folder, etc.).
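
Here is a minimal Python sketch of that tip: a baseline limit plus per-path overrides, implemented as a simple fixed-window counter. The paths are hypothetical, the numbers are the ones from the tip, and a real WAF will use its own (usually smarter) counting algorithm.

```python
from collections import defaultdict

# (path prefix, max requests per minute); more specific prefixes come first.
# The paths are hypothetical, the numbers come from the tip above.
LIMITS = [
    ("/login", 3),
    ("/static/", 1000),
    ("/", 500),  # baseline for everything else
]

WINDOW_SECONDS = 60


class RateLimiter:
    def __init__(self):
        # (client, prefix, window number) -> request count.
        # Sketch only: old windows are never evicted.
        self.counters = defaultdict(int)

    def allow(self, client: str, path: str, now: float) -> bool:
        """Count the request; return False once the limit is exceeded."""
        prefix, limit = next(
            ((p, lim) for p, lim in LIMITS if path.startswith(p)),
            ("/", 500),
        )
        window = int(now // WINDOW_SECONDS)
        key = (client, prefix, window)
        self.counters[key] += 1
        return self.counters[key] <= limit
```

With these limits, the fourth login attempt from the same client within a minute is rejected, while normal browsing stays far below the baseline.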

“But how can I properly tune this without blocking a bunch of users in the process?”, you may ask. The next section has the answer!

4.5 – How to fine-tune parameters without impacting availability

This is valid not only for rate limiting, but also for any other configuration in your WAF. Some WAFs have a “DEV” environment which provides a similar feature, but if yours doesn’t, no worry.

The trick is as follows:

Every time you’re going to tune a parameter, create a blocking rule, or anything of the sort, instead of creating it directly, create a LOG ONLY rule with the new parameter in place. Then you can look at the logs and compare it to the current configuration (is it blocking more users? Are these blocks justified?).

After you’ve verified the configuration is OK, you can migrate the rule to BLOCK (if it’s a new rule) or adjust the parameters on the existing rule to match the ones you were testing (if you were fine-tuning something).
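
The comparison step can be as simple as the following Python sketch. It assumes you have already extracted, from the LOG ONLY rule's logs, each client's busiest minute (requests per minute); the `would_block` and `compare_limits` helpers and the data shape are hypothetical.

```python
from typing import Dict, Set


def would_block(peak_per_minute: Dict[str, int], limit: int) -> Set[str]:
    """Clients whose busiest minute exceeds the given limit."""
    return {client for client, peak in peak_per_minute.items() if peak > limit}


def compare_limits(peak_per_minute: Dict[str, int], current: int, candidate: int) -> Set[str]:
    """Clients the candidate limit would block that the current one does not."""
    return would_block(peak_per_minute, candidate) - would_block(peak_per_minute, current)
```

The set returned by `compare_limits` is exactly the list of clients you need to look at before switching the rule to BLOCK: are they attackers, or legitimate users who would start getting errors?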

4.6 – Caching

Caching is very similar to rate-limiting in the sense that WAFs are great for it, nobody thinks about it when developing something, and you can be proactive about it to help others.

Some very common problems when it comes to caching are:

Your costs are too high due to a lack of proper caching
The cache is storing things it shouldn’t (especially resources from logged in users!), or not storing things it should (like a very popularly accessed file).
Cache takes too long to refresh, causing issues or making devs unable to check deploys

When using a WAF, caching will be divided in two or more “layers”:

The cache from the WAF itself (using its CDN and etc.)
The cache from users’ browsers
OTHER caches along the way, depending on the end user’s network (e.g.: a local cache in their network).

The content goes from your server to your WAF, is cached there, and then is sent to additional caches along the way, and finally to browsers, which may also cache the data. Knowing which cache the data is coming from can be very helpful when debugging issues.

Caches are affected by the Cache-Control header, which is set by the origin server; but again, many times it isn’t set properly, and overriding the cache settings with your WAF may be a quick and easy solution. One exception is the WAF’s own cache settings, which can’t be controlled by the origin server.

Your WAF is (probably) able to override the amount of time that data is cached by the WAF’s CDN and by user browsers (by modifying the Cache-Control header). Here are the fields that Cloudflare can use to fine-tune caches, for example:

You can create rules using these fields to solve the common problems we discussed earlier, and also to optimize your traffic in general. It’s generally worth looking into! From my experience, the default settings are good but there are always corner cases that can be optimized for significant cost reduction.

Use your WAF’s cache analytics page to find issues and fix them. Be wary of large, popular and static files that aren’t being cached because of their extension or some other issues. Communicate to your infra and dev teams that you can solve caching issues – it’s gonna cause an issue some day, I guarantee it, and you’ll be the hero.

Oh, and a more specific tip: NEVER cache data from authenticated sessions, or you may expose sensitive information from a single user to all other users. You can prevent this by creating a rule that disables caching when a session cookie is present (the specific name of the cookie will depend on your application). This shouldn’t happen by default, but…
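
As a sketch of that last rule, the Python below picks a Cache-Control value based on whether a session cookie is present. The cookie name `sessionid` and the default caching policy are assumptions; use whatever your application actually sets.

```python
# Assumption: the application's session cookie is called "sessionid".
SESSION_COOKIE = "sessionid"


def has_session_cookie(cookie_header: str) -> bool:
    """Check a raw Cookie header for the session cookie by name."""
    names = {part.split("=", 1)[0].strip() for part in cookie_header.split(";")}
    return SESSION_COOKIE in names


def cache_control_for(cookie_header: str, default: str = "public, max-age=3600") -> str:
    """Never cache responses that belong to an authenticated session."""
    if has_session_cookie(cookie_header):
        return "private, no-store"
    return default
```

In a WAF this usually becomes a single cache rule of the form "if the cookie exists, bypass cache", placed ahead of any rule that forces caching.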

4.7 – Observability using your WAF

Another cool thing: since your WAF is MiTM’ing all your traffic and can see all data in plaintext, it becomes a pretty handy observability tool.

There are major caveats though:

Your WAF may not be able/willing to log all request data (e.g.: Cloudflare doesn’t log request bodies)
It obviously cannot see what is going on inside the origin server

So it definitely should NOT be the only observability tool you have. But, again, maybe people didn’t think about observability before deploying the app, so the WAF can at least give you an overview of what’s going on, which for some cases is enough for troubleshooting.

A few real world examples:

I’ve used LOG rules to monitor the behaviour of a suspected IP. You can easily tell if it’s a bot, a human, or a DoS attempt by the order, type and content of requests.
I’ve used LOG rules to fine tune DDoS protection. Sometimes you can see patterns in ASNs, headers, user-agents or paths being used, so you can block a huge chunk of the attack.
Lots and lots of troubleshooting in PROD where no other tools were available:
- Client complained about something. What paths was he accessing? When? Where was he accessing from?
- Someone reported the app is failing. When did it start, exactly? Is it still ongoing? Is it global or regional?
- Lots of people are getting 404’s from a deactivated page. Why? Where are they coming from?
- etc etc etc…
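
Spotting those patterns is mostly a matter of counting. Here is a minimal Python sketch that ranks the most common values of a field across exported log entries; the field names (`asn`, `path`) and the `top_values` helper are assumptions about how your logs look once exported.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_values(logs: Iterable[Dict[str, str]], field: str, n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most common values of a field across log entries."""
    counts = Counter(entry.get(field, "unknown") for entry in logs)
    return counts.most_common(n)
```

Running it over fields like ASN, user-agent or path during an attack quickly shows whether one value accounts for most of the traffic, which is exactly the kind of thing you can then turn into a blocking rule.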

Again, communicate with other technology teams that you can do these things, and you’ll be able to help them.

OBS: This is also a great place to extract statistics about page views, traffic, usage of certain apps/pages, etc. Communicate this to your higher ups, and you may also be able to help them to extract important metrics.

Here are a few examples of fields you can observe in Cloudflare:

PROTIP: If there is no additional cost for doing so, I recommend creating a rule to log EVERYTHING. WAFs usually don’t do this by default, and always having logs can be useful for issues that you can’t replicate (so it wouldn’t be possible to create a specific logging rule).

4.8 – Gambiarras with custom rules

Still keeping in mind the previous image, let’s talk about some gambiarras you can create using custom rules.

Do you have a page that shouldn’t be exposed to the internet, but is? Make a custom rule that blocks it for everyone except those coming from your VPN/Office’s source IP ranges.
Do you have a page that only specific users, services or clients should access? Again, restrict them by source IP ranges.
- This is pretty common for webhooks!
Any internal/administrative pages that need to be exposed? Restrict them to your company’s operating countries to reduce the attack surface.
Need to block your mobile app’s old version from working? Block its user-agent! (If it changes with each version, of course).
Devs need to test something in PROD and using a real domain, but you don’t want to expose the page yet? Tell them to set a specific cookie on their browsers, and then block that page for anyone who doesn’t have that cookie set.
- This is like a really basic, really silly form of “password protection” for your page.
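
Two of the gambiarras above (the cookie gate and the country restriction) can be sketched in Python as follows. Every value here (paths, cookie name and value, country list) is a hypothetical placeholder, and `should_block` only mimics what you would write in your WAF's rule language.

```python
from typing import Dict

# All values below are hypothetical placeholders.
PREVIEW_PATH = "/new-feature"
PREVIEW_COOKIE = ("preview", "let-me-in")
ADMIN_PATH = "/admin"
ALLOWED_COUNTRIES = {"BR"}


def should_block(path: str, country: str, cookies: Dict[str, str]) -> bool:
    """Evaluate two of the custom rules described above."""
    # Rule 1: the unreleased page is only visible with the agreed cookie.
    if path.startswith(PREVIEW_PATH):
        name, value = PREVIEW_COOKIE
        return cookies.get(name) != value
    # Rule 2: admin pages are only reachable from operating countries.
    if path.startswith(ADMIN_PATH):
        return country not in ALLOWED_COUNTRIES
    return False
```

Neither rule is real authentication, so treat them as a way to shrink the attack surface, not as a replacement for a login.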

Creativity is the limiting factor here. There’s a lot of cool stuff you can do to help the life of your fellow colleagues or to mitigate risk inside your company. Keep your eyes open for opportunities!

4.9 – Incident Response using your WAF

Combining the logging and custom rules we talked about, you get some IR capabilities as well:

You can see all the requests from an IP involved in an incident
You can temporarily block or redirect traffic away from a service under attack. You can do this for only specific IPs/regions/ASNs as well
You can redirect users to a “maintenance” page while something is going on
You can impose aggressive throttling / rate limiting on a page that is being brute-something’ed.

There are many more things you can do. Again, creativity is the limit.
The world is your oyster, and the WAF is a dirty shucking knife you found in the sink.

4.10 – Virtual Patching

Another cool thing related to IR:

Custom WAF rules can be used to patch some vulnerabilities. This will not fix the root problem, but it will at least prevent the problem from being exploited.

Most of the time these rules come pre-packaged with the WAF (see below), but if the WAF doesn’t have it, you can just create it yourself (sometimes!).

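For an example of something you can patch, think of an SQLi vulnerability in an application that uses a specific query parameter. Using a custom rule, you can just block requests with that specific query parameter (or, if legitimate users need the parameter, only the requests carrying unexpected values in it).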
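
Here is a minimal Python sketch of such a virtual patch. It assumes a hypothetical endpoint `/report` whose `id` parameter is injectable and should only ever hold a number, so the rule allows plain numeric values and blocks everything else. This is one possible variant, not a general SQLi filter.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical vulnerable endpoint and parameter.
VULNERABLE_PATH = "/report"
VULNERABLE_PARAM = "id"
# Assumption: the app only ever needs a short numeric ID here.
SAFE_VALUE = re.compile(r"[0-9]{1,10}")


def virtual_patch_blocks(url: str) -> bool:
    """Block requests whose vulnerable parameter is not a plain number."""
    parts = urlsplit(url)
    if parts.path != VULNERABLE_PATH:
        return False
    values = parse_qs(parts.query, keep_blank_values=True).get(VULNERABLE_PARAM, [])
    return any(SAFE_VALUE.fullmatch(v) is None for v in values)
```

Allowlisting the expected format tends to be more robust than trying to enumerate attack strings, but it only works when you know what legitimate values look like.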

4.11 – Deception

Another cool trick, this one tipped by Pr0teus:

Since your WAF can intercept and modify requests and responses, why not use it to freak out some pentesting tools? Create responses that make it look like you have SQLis, SSRFs, IDORs, etc… This will turn your WAF into a “mini-honeypot”!

You can do this using redirects, blocks, or even by creating complete responses (for example, you could do this in Cloudflare using workers).
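
A very small Python sketch of the idea: serve decoy content on paths that only scanners tend to request. The paths and fake bodies are invented for illustration, and `decoy_response` is a hypothetical helper; in practice this logic would live in your WAF's rules or edge functions.

```python
from typing import Dict, Optional, Tuple

# Hypothetical decoys: path -> (status code, fake body).
DECOYS: Dict[str, Tuple[int, str]] = {
    "/.env": (200, "DB_PASSWORD=not-a-real-password\n"),
    "/phpmyadmin/": (200, "<title>phpMyAdmin</title>"),
}


def decoy_response(path: str) -> Optional[Tuple[int, str]]:
    """Return a fake response for scanner-bait paths, or None for real traffic."""
    return DECOYS.get(path)
```

Since legitimate users should never hit these paths, any request that gets a decoy is also a cheap, high-signal alert worth logging.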

4.12 – Remote Network Access

Since your WAF is already in front of your entire application, it can also serve as a zero trust access point. There are innumerable insufferable buzzwords for similar things, but it’s usually called SASE, CASB, ZTNA or something else.

A basic thing you can do, for example, is put an OAuth login page in front of pages that should be internal, but are public. Again, something that should be fixed at the root, but that the WAF can duct tape over.

There are usually other features that aim to replace traditional VPNs or even endpoint managers, but those are usually not really WAF features, more like “tools that take advantage of existing WAF installations”, and can use additional agents, products, etc.

The point is: when you have pains with remote access, remember the WAF. It’s worth looking into what your supplier offers, as you probably already have part of the requirements set up, and maybe something will be a good fit.

5 – Soft Power using your WAF

Let’s step back and talk about some more soft advantages of having a properly configured WAF.

As we’ve seen, having a tuned WAF gives you many opportunities to help developers, DevOps, and even other areas. This is GOOD, because it allows you to improve your image with other areas, and shows that security can be something that helps other areas, not only blocks them. It is also good because it justifies the WAF’s existence, which matters for the main point of this entire post:

When you have control over a configured WAF, you have the keys to the kingdom. You have the supreme power to block or allow ANY request coming into your app.

Here are some things you can do, for example:

Vulnerability Management
- If a critical vulnerability is not fixed in x days, the path is completely blocked!
Domain Management
- Any new subdomain must be configured by you, which gives you the chance of ensuring all security procedures and reviews have been followed (which is something folks love to ignore or dodge)
SPF/DKIM/DMARC Management
- Since you control DNS, you can tune these to your heart’s content. If this is managed by IT, they usually just configure it to “have it but not ever block anything”.
Domain Verification
- Lots of tools require you to verify domain ownership using DNS (Google Workspace, for example). That means these tool admins will have to ask you for help, and that’s another opportunity for you to check if everything is OK before allowing the process to continue.
Etc!

To conclude: if you have a WAF and it is of enough help to other teams, it’s likely it won’t be going away soon. And then you can use this in your favor to be included in talks where people would rather you not be included, or to pressure teams into fixing things they would rather not fix 😉

Obviously, take care when using the WAF as leverage, or you will just be hated and may lose it entirely.

6 – A Final Warning

A serious warning – be mindful about how you’re using your WAF.

We’ve talked a lot about 1000 things you can do with your WAF, but the point isn’t that you should do all of them! Again, there are probably tools that are much more suited for your goals than your WAF.

It’s OK to use your WAF when these tools are not available or viable, when the fight isn’t worth picking, etc etc etc. But do NOT use your WAF as a crutch, or you will become too dependent on it.

“So what? Is there a problem in that?”

Yes, multiple!

When you need more customization, more features, more power, etc… the gambiarra you made using your WAF will not be able to keep up with the demand, and then you will have to install an entirely new tool from scratch to do something that should be simple.
If the cost of your WAF increases too much and management considers switching it, you are screwed. You will absolutely not be able to migrate configurations as-is. You will probably have to install additional tools, run like hell, and the entire migration process will be infernal.
If your WAF goes kaput, the impact may be much bigger than if you were just using it for simple things.

Be careful!

7 – Conclusion

To conclude and review, here are the main takeaways from this post:

Having a control layer in front of your entire application is ridiculously, extremely, Super Saiyan-ly powerful
- BUT, you need to truly understand the application and be creative to use this power 😉
WAFs that are not properly cared for don’t usually last long
- If you’re going to use a WAF, then at least tune it properly!
WAFs have lots of other capabilities that you can use to help yourself and others. Get to know them and use them when possible.
- …but you should treat it as duct tape! Do not use it as a crutch! There are usually better-suited tools with better solutions.
- Still, duct tape can usually be pretty helpful and save people in a pinch!
If your WAF is tuned and being used to help others, it will be well regarded
- This will grant you some soft powers. Use them, but wisely!