Apple News User Agent Strings: Identification and Behaviour
Apple News sends multiple bot variants to fetch web content, and distinguishing between them in server logs requires more than grepping for a single substring — the preview fetcher, the full content crawler, and the image prefetcher each present different user agent strings with different behavioural characteristics. This technical note documents every observed Apple News user agent variation, the crawl patterns each variant produces, the IP ranges they originate from, rate-of-request profiles, and the practical differences between the preview fetch that fires when a link is shared and the sustained indexing crawl that runs against published content. The analysis draws from access log examination across multiple server configurations over extended observation periods. This note sits within the tech notes section and connects to the excessive AppleNewsBot requests investigation and the modern hotlink protection guide on managing unwanted resource consumption.
The short version
Apple News uses at least three distinct user agent strings to fetch content from your server. The primary crawler identifies as AppleNewsBot and behaves like a traditional web crawler: it fetches HTML, parses metadata, and re-visits pages on a schedule. A separate preview fetcher fires when someone shares a link in Messages or other Apple apps, presenting a Safari-like user agent that sometimes carries an AppleNewsBot suffix and sometimes omits it. A third variant fetches images and Open Graph assets independently, sometimes with a truncated user agent that omits the browser-like prefix. All three originate from Apple-owned IP address space. Filtering on the substring AppleNewsBot catches the primary crawler and the image prefetcher, but it misses preview fetches that omit the suffix; those need IP-based identification.
The primary crawler: AppleNewsBot
The main Apple News crawler uses a user agent string that follows the pattern of a full browser identification with an appended bot identifier:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Safari/605.1.15 AppleNewsBot/1.0
The version numbers within the string change over time. The Version/16.0 component has been observed as Version/15.0, Version/16.0, Version/17.0, and Version/18.0 across different crawl periods. The AppleNewsBot/1.0 suffix has remained consistent — the bot version has stayed at 1.0 even as the embedded Safari version numbers have incremented.
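To see which embedded Safari versions appear in your own logs, a tally along the following lines can help. This is a minimal sketch: the function name `safari_version_counts` is hypothetical, and it assumes the user agent follows the full-browser pattern shown above, with `Version/x.y` immediately before `Safari/`.

```shell
# Tally the embedded Safari version across AppleNewsBot requests.
# stdin:  access log lines
# stdout: "count version" pairs, one per line, sorted by version string
safari_version_counts() {
    # Only lines containing AppleNewsBot are considered; the capture
    # group keeps the digits and dots that follow "Version/".
    sed -n '/AppleNewsBot/ s|.*Version/\([0-9][0-9.]*\) Safari/.*|\1|p' \
        | sort \
        | uniq -c \
        | awk '{print $1, $2}'
}
```

Usage would look like `safari_version_counts < /var/log/apache2/access.log`. Requests using the short-form `AppleNewsBot/1.0` user agent carry no `Version/` token and are simply skipped.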
This is the crawler responsible for the sustained indexing activity that generates the bulk of Apple News bot traffic. It fetches HTML pages, parses Open Graph metadata, follows internal links, and re-fetches content on a schedule that does not tightly correlate with content changes.
The primary AppleNewsBot crawler targets HTML pages first, then independently fetches resources referenced in Open Graph tags — particularly og:image URLs. A single article page visit typically generates two to four log entries: one for the HTML document, one for the OG image, and sometimes additional requests for favicons or Apple touch icons. The crawler does not execute JavaScript, which means it reads server-rendered HTML and meta tags but does not interact with client-side-rendered content.
Version number progression
The embedded Safari version number in the user agent string has tracked Apple's Safari release cadence, which gives a rough indication of when the bot's underlying engine was last updated. The version does not correspond to a separate AppleNewsBot release cycle. It appears to reflect the Safari/WebKit release that the bot uses to parse HTML and evaluate CSS.
| Observation period | Embedded Safari version | WebKit version |
|---|---|---|
| 2022–2023 | Version/15.x – 16.x | 605.1.15 |
| 2023–2024 | Version/16.x – 17.x | 605.1.15 |
| 2024–2025 | Version/17.x – 18.x | 605.1.15 |
The WebKit build number (605.1.15) has remained static across all observed variants. This is the same build string used in Safari's own user agent and does not indicate an outdated engine — Apple uses this static string across their WebKit-based products.
The preview fetcher
A distinct user agent appears when someone shares a URL in iMessage, Apple Mail, or other Apple applications that generate link previews. This fetcher fires once — at share time — and retrieves the page and its OG image to construct the preview card shown in the conversation.
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko)
This variant sometimes omits the AppleNewsBot suffix entirely, as in the form shown above, which makes it harder to filter alongside the primary crawler. In that form it appears in logs as a standard Safari-like request, distinguishable primarily by its IP origin (Apple-owned ranges) and its request pattern: a single page fetch followed immediately by one or two asset fetches, with no subsequent re-visits.
The preview fetcher's user agent is close enough to a genuine Safari desktop browser that naive bot detection rules will miss it. If you are building bot-vs-human traffic segmentation, relying solely on user agent string matching will miscategorise preview fetcher requests as human visits. Cross-referencing with IP ranges is the reliable approach — Apple publishes its bot IP ranges, and the preview fetcher consistently originates from within them.
Distinguishing preview fetches from crawler fetches
The behavioural pattern is the clearest differentiator:
- Preview fetch: single request to the page URL, one request to the OG image, no follow-up requests. Typically arrives within seconds of someone sharing the link.
- Crawler fetch: repeated requests to the same URL over hours or days, following internal links, re-fetching images independently of content changes. Arrives on the bot's schedule, not in response to a user action.
In access logs, the preview fetch appears as an isolated cluster of two to three requests for a specific URL that was recently published or shared. The crawler's requests appear as sustained periodic hits against the same URLs over extended periods.
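The per-URL difference described above can be approximated by counting how often each path was fetched. The sketch below is a heuristic only: the function name `label_fetch_pattern`, the labels, and the default threshold of one hit are assumptions for illustration, and the input is assumed to be combined-format log lines already narrowed to Apple-origin requests.

```shell
# Label each requested path by how often it was fetched.
# stdin:  combined-format log lines (path in field 7)
# $1:     optional maximum hit count still treated as a one-off fetch
#         (default 1; an assumed threshold, tune it for your site)
# stdout: "label hits path", sorted by path
label_fetch_pattern() {
    awk -v max="${1:-1}" '
        { hits[$7]++ }
        END {
            for (path in hits) {
                label = (hits[path] <= max) ? "one-off" : "repeated"
                print label, hits[path], path
            }
        }' | sort -k3
}
```

Paths labelled `one-off` are candidates for preview fetches, while `repeated` paths match the crawler's re-visit pattern. The count ignores timing, so a page the crawler has only reached once so far will also show up as `one-off`.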
The image prefetcher
A third variant handles image fetching independently of the primary page crawl. This fetcher requests images — OG images, article body images, and occasionally CSS background images referenced in inline styles — with a user agent that may be truncated:
AppleNewsBot/1.0
This short-form user agent omits the Mozilla/5.0 prefix entirely. It appears almost exclusively on image resource requests (.jpg, .png, .webp). The image prefetcher operates on its own schedule and does not respect Cache-Control or ETag headers with the same fidelity as the primary crawler. Images that returned 304 Not Modified to the primary crawler may be re-fetched in full by the image prefetcher hours later.
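With all three user agent shapes described, a rough classifier can be sketched as a shell `case` statement. The function name `classify_ua` and the labels are hypothetical. A Safari-like string without the suffix is only flagged as needing an IP check, since genuine browsers also match that shape. A preview fetch that does carry the suffix cannot be told apart from the primary crawler by user agent alone.

```shell
# Classify a user agent string into the shapes described in this note.
# $1: the raw user agent string
classify_ua() {
    case "$1" in
        # Short form with no browser-like prefix.
        AppleNewsBot/*)
            echo "image-prefetcher" ;;
        # Full browser identification with the bot suffix appended.
        Mozilla/5.0*AppleNewsBot/*)
            echo "primary-crawler" ;;
        # Safari-like string without the suffix: could be a preview
        # fetch or a real browser, so the source IP has to decide.
        Mozilla/5.0*AppleWebKit/*)
            echo "needs-ip-check" ;;
        *)
            echo "other" ;;
    esac
}
```

For example, `classify_ua 'AppleNewsBot/1.0'` prints `image-prefetcher`.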
The image prefetcher generates a disproportionate share of total AppleNewsBot bandwidth consumption. On a site with twenty articles, each with a 200 KB Open Graph image, the image prefetcher can account for 60–70% of total bot bandwidth while representing only 30–40% of total bot request count. The images are fetched at full resolution regardless of the device context — there is no Accept header negotiation for WebP or AVIF, and no indication of viewport size.
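To measure the request share and bandwidth share on your own server, something like the following could be used. It is a sketch under stated assumptions: the function name `image_share` is hypothetical, the log is assumed to be in combined format (path in field 7, response bytes in field 10), and image requests are recognised purely by file extension.

```shell
# Report what share of AppleNewsBot requests and bytes went to images.
# stdin: combined-format access log lines
image_share() {
    awk '
        /AppleNewsBot/ {
            # A "-" in the bytes field means no body was sent.
            bytes = ($10 ~ /^[0-9]+$/) ? $10 : 0
            path = $7
            sub(/\?.*/, "", path)          # drop any query string
            total_req++
            total_bytes += bytes
            if (tolower(path) ~ /\.(jpe?g|png|webp)$/) {
                img_req++
                img_bytes += bytes
            }
        }
        END {
            if (total_req == 0 || total_bytes == 0) {
                print "no AppleNewsBot traffic with a response body"
                exit
            }
            printf "requests: %.0f%% images\n", 100 * img_req / total_req
            printf "bytes: %.0f%% images\n", 100 * img_bytes / total_bytes
        }'
}
```

Running `image_share < /var/log/apache2/access.log` gives two percentages to compare against the figures quoted above. Because the filter is the AppleNewsBot substring, requests from fetchers that omit it are not counted.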
IP address ranges
All observed Apple News bot variants originate from Apple-owned IP address space. Apple maintains a published list of bot IP ranges, and cross-referencing access log source IPs against these ranges is the most reliable method to confirm that a request genuinely comes from Apple rather than a scraper spoofing the user agent.
grep "AppleNewsBot" /var/log/apache2/access.log | awk '{print $1}' | sort -u
The resulting IP addresses should belong to Apple-owned address space. A quick reverse DNS check of the first five:
grep "AppleNewsBot" /var/log/apache2/access.log | awk '{print $1}' | sort -u | head -5 | while read -r ip; do echo "$ip -> $(dig -x "$ip" +short)"; done
Reverse DNS entries for genuine Apple bot IPs typically resolve to hostnames within Apple's infrastructure domains. If you see AppleNewsBot in the user agent but the source IP resolves to a residential ISP or a cloud hosting provider, the request is almost certainly spoofed.
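Where reverse DNS is slow or unavailable, source IPs can be checked against CIDR ranges directly. The sketch below does the comparison for IPv4 in plain shell arithmetic. The function names `ip_to_int` and `ip_in_cidr` are hypothetical, and `17.0.0.0/8` is used only as an illustrative Apple-held block; the authoritative list should come from Apple's published ranges.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) when IPv4 address $1 falls inside CIDR block $2.
ip_in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")      # network part before the slash
    bits=${2#*/}                    # prefix length after the slash
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}
```

For example, `ip_in_cidr 17.58.1.2 17.0.0.0/8 && echo "inside"` prints `inside`. The sketch does not validate its input and does not handle IPv6.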
Rate characteristics across variants
The three bot variants differ meaningfully in their request timing:
| Variant | Typical rate | Caching behaviour | Trigger |
|---|---|---|---|
| Primary crawler | 50–500 req/hour during active crawl | Partially respects Cache-Control; issues conditional requests | Scheduled, not event-driven |
| Preview fetcher | 1–5 req per share event | Does not re-visit; single fetch per share | User shares a URL |
| Image prefetcher | 20–200 req/hour | Largely ignores cache directives | Follows primary crawler |
The primary crawler's rate varies significantly between sites. Sites that publish frequently and have been picked up by Apple News see higher sustained rates. Sites with stable, infrequently updated content see lower rates but still experience periodic re-crawl bursts where the bot re-validates its entire index of your content over a short window.
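To compare your own traffic against the rates in the table, requests can be bucketed by hour. This is a minimal sketch: the function name `requests_per_hour` is hypothetical, and it assumes the combined log format, where field 4 holds a timestamp such as `[10/Oct/2024:13:55:36`.

```shell
# Count AppleNewsBot requests per clock hour.
# stdin:  combined-format access log lines
# stdout: "dd/Mon/yyyy:hh count", in order of first appearance
requests_per_hour() {
    awk '
        /AppleNewsBot/ {
            # Skip the leading "[" and keep date plus hour (14 chars).
            hour = substr($4, 2, 14)
            if (!(hour in count)) order[++n] = hour
            count[hour]++
        }
        END {
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }'
}
```

Running `requests_per_hour < /var/log/apache2/access.log` makes re-crawl bursts visible as hours whose counts stand well above the baseline.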
Earlier versions of AppleNewsBot operated at lower request rates and more reliably honoured HTTP caching headers. The bot's scheduling appeared to incorporate Last-Modified and ETag signals when determining re-fetch intervals. Re-crawl frequency correlated roughly with content publication frequency — sites that published daily were crawled more aggressively than static sites.
Current AppleNewsBot behaviour shows higher baseline request rates and weaker correlation between cache signals and re-fetch timing. The bot re-crawls content regardless of caching directives, particularly for image assets. The decoupling of the image prefetcher from the primary crawler's cache state means that even perfectly configured cache headers do not prevent duplicate image fetches. Rate management requires server-side controls rather than relying on the bot to self-throttle based on your response headers.
Filtering Apple News traffic in log analysis
For filtering on user agent alone, the substring AppleNewsBot remains the most reliable single filter. It catches every variant that identifies itself:
grep -i "applenewsbot" /var/log/apache2/access.log | awk '{print $1, $7, $9}' | sort | uniq -c | sort -rn | head -30
This gives you a count of unique IP + URL + status code combinations, sorted by frequency. The output reveals which pages the bot is hitting most aggressively and whether your server is returning 200, 304, or error codes.
For analytics platforms that import access logs, filtering on the AppleNewsBot substring before import prevents the bot traffic from inflating page view counts. For real-time monitoring, a tail -f with a grep exclusion keeps bot traffic out of your live view:
tail -f /var/log/apache2/access.log | grep -v "AppleNewsBot"
The preview fetcher's user agent does not always contain the AppleNewsBot substring. If complete Apple bot traffic accounting is required — for billing, capacity planning, or traffic classification — supplement user agent filtering with IP-based identification using Apple's published bot ranges. User agent filtering alone will undercount Apple-originated traffic by the volume of preview fetches that use the truncated Safari-like string.
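One way to estimate the size of that undercount is to split requests from Apple address space by whether they carry the substring. The sketch below is an assumption-laden illustration: the function name `apple_ua_coverage` is hypothetical, and the default prefix `17.` (equivalent to `17.0.0.0/8` for IPv4) stands in for Apple's published ranges, which should be used instead where accuracy matters.

```shell
# Split requests from an IP prefix by presence of the AppleNewsBot substring.
# stdin:  access log lines with the client IP in field 1
# $1:     optional literal IP prefix (default "17.", an illustrative
#         stand-in for the published Apple ranges)
apple_ua_coverage() {
    awk -v prefix="${1:-17.}" '
        index($1, prefix) == 1 {
            if ($0 ~ /AppleNewsBot/) tagged++
            else untagged++
        }
        END { printf "tagged=%d untagged=%d\n", tagged, untagged }'
}
```

The `untagged` figure is an upper bound on preview fetches rather than an exact count, since it also includes any other traffic from the same address space, such as people browsing from Apple's own network.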
Practical implications for server configuration
Knowing which variant is hitting your server determines the appropriate response. The primary crawler is the one worth managing through robots.txt and rate controls — it generates the sustained traffic. The preview fetcher is harmless at scale and actively useful (link previews drive engagement). The image prefetcher is the bandwidth concern that targeted .htaccess rules can address without affecting content crawling.
The excessive AppleNewsBot requests investigation covers mitigation in depth: robots.txt directives, Apache rate limiting, cache header strategy, and the trade-offs between controlling bot traffic and maintaining Apple News distribution. This technical note provides the identification layer — knowing what you are looking at in the logs before deciding what to do about it.
Apple News bot identification has grown more complex as Apple has expanded the service. The original single-variant crawler with a consistent user agent string has been supplemented by specialised fetchers for previews and images, each with its own user agent pattern and behavioural profile. Server administrators who set up AppleNewsBot filtering rules several years ago may find that their rules no longer capture the full range of Apple-originated traffic, particularly the preview fetcher variant that omits the AppleNewsBot substring. Periodic review of bot filtering rules against current access log data is the only reliable way to maintain accurate traffic classification.