Web Performance
Web performance is ultimately about what happens between a user's request and the moment the page becomes usable — and how much of your server's capacity goes to real users versus traffic that delivers no value. A site that wastes half its bandwidth on bots has less capacity for visitors. A server that re-compresses the same static asset on every request burns CPU on work it has already done. The performance gains that matter most in production often come not from framework-level optimisation but from infrastructure decisions about how the server handles every request.
This topic page connects content from across the site that addresses web performance from the operator's perspective — compression configuration, bot traffic management, and content protection. The coverage focuses on the operational layer where independent site operators have direct control: web server configuration, response encoding, header policies, and traffic filtering. This is where decisions made once affect every subsequent request, where compression choices compound across millions of responses, and where unmanaged non-human traffic means paying for load that contributes nothing.
Compression and encoding
Response compression is the highest-leverage performance optimisation at the server layer — reducing transfer sizes by 60–90% for text-based resources without any application code changes. Algorithm choice, compression levels, and content-type negotiation all affect the ratios achieved in production.
Apache mod_brotli
Brotli offers meaningfully better ratios than gzip — typically 15–25% smaller responses for HTML, CSS, and JavaScript. Apache's mod_brotli is not a drop-in replacement for mod_deflate; it has its own directives, content negotiation behaviour, and performance characteristics at different compression levels.
This page documents the practical configuration: directives controlling compression level, content types that benefit, the mod_deflate fallback for non-Brotli clients, and the trade-off between compression ratio and CPU cost. Static pre-compression versus dynamic on-the-fly compression is a meaningful architectural choice — pre-compressed assets can use higher quality settings without per-request CPU cost. For anyone running Apache, Brotli is one of those optimisations where the effort-to-benefit ratio is exceptionally favourable.
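The shape of that configuration can be sketched as a minimal vhost fragment — the content types and quality level here are illustrative assumptions, not the page's recommended values; each filter removes itself when the client's Accept-Encoding does not match, which is what makes the gzip fallback work:

```apache
# Requires mod_brotli and mod_deflate to be loaded.
<IfModule mod_brotli.c>
    # Quality 0-11; mid-range values trade ratio for per-request CPU.
    BrotliCompressionQuality 5
    AddOutputFilterByType BROTLI_COMPRESS text/html text/plain text/css \
        application/javascript application/json image/svg+xml
</IfModule>

<IfModule mod_deflate.c>
    # gzip fallback for clients that do not advertise "br".
    AddOutputFilterByType DEFLATE text/html text/plain text/css \
        application/javascript application/json image/svg+xml
</IfModule>
```

For static pre-compression, the same types would instead be compressed at build time at a higher quality and served via content negotiation, avoiding the per-request CPU cost entirely.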
Bot and crawler management
Non-human traffic is a significant and growing fraction of total web traffic for most sites. Search engine crawlers, AI training scrapers, monitoring services, SEO analysis tools, and outright malicious bots collectively account for a substantial share of requests — and unlike human visitors, they rarely respect rate limits, often ignore caching, and frequently make requests that are computationally expensive to serve. Managing this traffic is a performance concern as much as a security concern.
Excessive AppleNewsBot Requests
This page documents a specific case of Apple's news crawler generating request volumes disproportionate to any reasonable crawling purpose. The observation illustrates a pattern affecting many sites: a recognisable, legitimate bot consuming meaningful server resources without corresponding benefit.
The analysis covers the request patterns from server logs, volume relative to actual content, and available management approaches. The broader lesson: bot management is not just about blocking malicious crawlers. Legitimate bots from major companies can be the largest source of unwanted load. The distinction between "legitimate" and "well-behaved" matters — a bot can identify itself correctly and still degrade performance for actual visitors.
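One way to act on that distinction, sketched here as an assumption rather than the page's prescribed fix, is to deny a specific user agent at the server level using Apache 2.4's authorisation directives:

```apache
# Tag requests from the offending crawler by User-Agent substring.
BrowserMatchNoCase "AppleNewsBot" unwanted_bot

<Directory "/var/www/html">
    <RequireAll>
        Require all granted
        # Return 403 for tagged requests; all other traffic is unaffected.
        Require not env unwanted_bot
    </RequireAll>
</Directory>
```

A softer alternative is rate limiting rather than outright denial, since the bot itself is legitimate; the directory path above is a placeholder.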
Modern Hotlink Protection
Hotlinking — where external sites embed your images, stylesheets, or other assets directly by referencing your URLs — is one of the oldest bandwidth-theft patterns on the web. The traditional referer-based blocking approach has not aged well: HTTPS-to-HTTP transitions strip the referer header, privacy features in modern browsers reduce referer precision, and CDN architectures complicate the enforcement point. The result is that hotlink protection methods from the early 2000s are largely ineffective against modern embedding patterns.
This page examines the current state of hotlink protection, covering what works, what has broken, and what approaches remain effective given modern browser behaviour and CDN architecture. For sites serving substantial static assets — images, fonts, large CSS files — unmanaged hotlinking is a performance drain that scales with the popularity of whatever external site is embedding the resources. The bandwidth cost is yours; the benefit accrues entirely to someone else.
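A minimal referer check, shown here as an illustrative sketch with `example.com` as a placeholder, makes the limitation concrete: empty referers must be allowed through because modern privacy features strip the header, which is exactly why this technique alone is no longer sufficient:

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Allow requests with no Referer at all (privacy modes, direct loads).
    RewriteCond %{HTTP_REFERER} !^$
    # Block asset requests whose Referer is some other site.
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
    RewriteRule \.(png|jpe?g|gif|webp|woff2?)$ - [F,NC]
</IfModule>
```

Anything stronger — signed URLs, token-based asset delivery, CDN-level rules — trades this simplicity for enforcement that does not depend on a header browsers increasingly withhold.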
Defacing Content Scrapers
Content scraping sits at the intersection of performance and content protection. Automated scrapers consume server resources during copying and, if they republish, create duplicate content problems. This page examines approaches for detecting and responding to scraping — practical strategies for imposing costs on automated copying while preserving the normal reading experience.
The approaches range from server-level pattern detection to content-level techniques that degrade scraped copies. The performance relevance is direct: aggressive scrapers that spider an entire site in minutes generate load spikes affecting real visitors. Rate limiting, pattern analysis, and selective response modification reduce the resource cost without affecting legitimate access.
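The rate-limiting idea can be sketched as a per-client sliding window — a hypothetical illustration, not the page's implementation; the thresholds and the choice of keying on client IP are assumptions:

```python
from collections import deque
from time import monotonic
from typing import Deque, Dict, Optional

class SlidingWindowLimiter:
    """Flag clients whose request rate exceeds a threshold.

    Keeps a deque of recent request timestamps per client and
    rejects requests once the window fills up.
    """

    def __init__(self, max_requests: int = 60, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = monotonic() if now is None else now
        hits = self._hits.setdefault(client_ip, deque())
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: candidate for throttling
        hits.append(now)
        return True
```

In practice this would sit in front of expensive handlers, with rejected requests receiving a 429 response; legitimate readers never approach the threshold.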
What readers usually need
Readers arriving at this topic page typically have one of these questions:
- How to configure Brotli compression on Apache → Apache mod_brotli covers the directives, content types, compression levels, and the interaction with gzip fallback
- A specific bot is hammering the server → Excessive AppleNewsBot Requests documents the pattern and management approaches for disproportionate crawler traffic
- How to handle hotlinking in a modern HTTPS environment → Modern Hotlink Protection assesses which traditional techniques still work and what has replaced them
- How to deal with content scrapers → Defacing Content Scrapers covers detection, response modification, and practical mitigation strategies
Navigating related content
This topic page is one of several cross-section hubs on the site. The topics index provides the full list of available topic clusters. If you are looking for content organised by editorial type rather than by subject, the section hubs are:
- Web Development for the full collection of server and front-end notes
- How-To Guides for practical walkthroughs and configuration guides
- Tech Notes for behavioural observations and subsystem documentation
- Security for investigations and protocol analysis
- Development for scripting, tooling, and extension work
- Reviews for product and service assessments
- Journal for reflective commentary
Web performance touches multiple sections — server configuration lives in web development, some performance-relevant security hardening appears in security investigations, and caching behaviour observations surface in technical notes. The pages linked from this topic hub are the ones where performance is the central concern. As new content addressing compression, caching, traffic management, or delivery optimisation is published, it will appear here alongside the existing coverage.