
Contributing to Mozilla Location Service WiFi Data

Mozilla Location Service (MLS) was Mozilla's open alternative to proprietary geolocation databases: a crowdsourced system that mapped WiFi access point identifiers to physical locations, providing geolocation without GPS and without depending on Google's or Apple's closed datasets. Contributing WiFi data to MLS involved scanning for nearby access points, recording their signal characteristics alongside a GPS fix, and uploading the observations to Mozilla's servers. This entry documents how the contribution process worked in practice, the tools involved (particularly Mozilla Stumbler on Android), the privacy dynamics of crowdsourced WiFi scanning, the data quality challenges that affected the service, what happened when Mozilla discontinued MLS, and what that means for open geolocation. The assessment draws on direct participation as a contributor over multiple years. This is part of the journal section, and the privacy dimensions are covered further in the privacy and security topic hub.


What Mozilla Location Service was

→ Short Answer

Mozilla Location Service was a geolocation provider that used crowdsourced data about WiFi access points and cell towers to estimate a device's physical location without GPS. When a device queried MLS with the WiFi networks and cell towers it could see, the service looked up those identifiers in its database and returned an estimated latitude and longitude. The service was free, open-source on the server side, and accepted data contributions from anyone. It was used by Firefox, Firefox OS, and third-party applications as an alternative to Google's geolocation API.

The geolocation principle

WiFi-based geolocation works because most wireless access points do not move. A coffee shop's router sits at the same physical location for years. If you know the router's MAC address (BSSID) and you have previously recorded that MAC address alongside a GPS fix from that location, you can later estimate someone's position just by seeing which routers their device can detect — no GPS required.

The technique is powerful because WiFi signals penetrate buildings where GPS does not, and because every smartphone and laptop already has a WiFi radio. A device that cannot get a GPS fix indoors — which is most devices, most of the time — can still be located to within tens of metres using WiFi data. The accuracy depends on the density of mapped access points in the area. In a well-surveyed city centre, WiFi geolocation can match or exceed GPS accuracy. In a rural area with sparse coverage, it may be useless.
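The lookup principle described above can be sketched in a few lines. This is a minimal illustration, not MLS's actual server logic: the `KNOWN_APS` table, the sample BSSIDs and coordinates, and the signal-weighted centroid are all assumptions chosen to show the idea.

```python
from typing import Optional

# Hypothetical stand-in for the server-side database:
# BSSID -> (latitude, longitude) of the access point's estimated position.
KNOWN_APS: dict[str, tuple[float, float]] = {
    "aa:bb:cc:00:00:01": (52.5200, 13.4050),
    "aa:bb:cc:00:00:02": (52.5204, 13.4058),
    "aa:bb:cc:00:00:03": (52.5196, 13.4046),
}


def dbm_to_weight(signal_dbm: float) -> float:
    """Convert a dBm reading to a linear weight (stronger signal, larger weight)."""
    return 10 ** (signal_dbm / 10.0)


def estimate_position(scan: dict[str, float]) -> Optional[tuple[float, float]]:
    """Estimate (lat, lon) from a scan mapping BSSID -> signal strength in dBm.

    Unknown BSSIDs are ignored. Returns None when nothing in the scan is mapped,
    which is the sparse-coverage case where WiFi geolocation cannot help.
    """
    total = lat_sum = lon_sum = 0.0
    for bssid, dbm in scan.items():
        position = KNOWN_APS.get(bssid.lower())
        if position is None:
            continue
        weight = dbm_to_weight(dbm)
        total += weight
        lat_sum += weight * position[0]
        lon_sum += weight * position[1]
    if total == 0.0:
        return None
    return lat_sum / total, lon_sum / total
```

Averaging latitude and longitude directly is only reasonable over the short distances a single WiFi scan covers. The more mapped access points a scan contains, the more the estimate is pulled toward the device's true position, which is why accuracy tracks survey density.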

Google and Apple both maintain massive WiFi geolocation databases, built from the scanning that Android and iOS devices perform continuously. These databases are proprietary, accessible only through their APIs, and opaque in their data handling. MLS was Mozilla's attempt to create a comparable dataset under open terms — community-contributed, publicly documented in its data handling policies, and available to any application or service that wanted to use it.


How WiFi data collection worked

Contributing to MLS required collecting "observations" — records that associated WiFi access point identifiers with physical locations. Each observation contained:

  • BSSID — the MAC address of the access point (the unique hardware identifier broadcast in every WiFi beacon frame)
  • SSID — the network name (optional; MLS accepted observations without it)
  • Signal strength — the received signal level in dBm
  • Channel/frequency — which WiFi channel the access point was operating on
  • GPS coordinates — the contributor's latitude and longitude at the time of the scan
  • GPS accuracy — the estimated precision of the GPS fix
  • Timestamp — when the observation was recorded

The system required a GPS fix alongside each WiFi scan because the entire point was to associate wireless identifiers with physical locations. An observation without a GPS coordinate was useless.
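The fields listed above map naturally onto a small record type. The sketch below is illustrative only: the class names, field names, and JSON layout are assumptions made for this example and are not the schema MLS or Stumbler actually used.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class WifiSighting:
    bssid: str                      # MAC address of the access point
    signal_dbm: int                 # received signal level in dBm
    channel: Optional[int] = None   # WiFi channel, if known
    ssid: Optional[str] = None      # network name, optional


@dataclass
class Observation:
    latitude: Optional[float]       # contributor's GPS latitude
    longitude: Optional[float]      # contributor's GPS longitude
    accuracy_m: Optional[float]     # estimated GPS precision in metres
    timestamp_ms: int               # when the scan was recorded (Unix ms)
    wifi: list[WifiSighting] = field(default_factory=list)

    def is_usable(self) -> bool:
        """An observation needs both a GPS fix and at least one sighting."""
        has_fix = self.latitude is not None and self.longitude is not None
        return has_fix and bool(self.wifi)


def to_json(observation: Observation) -> str:
    """Serialise an observation, including nested sightings, to JSON."""
    return json.dumps(asdict(observation))
```

The `is_usable` check encodes the rule stated above: a scan with no GPS coordinate carries no location information and can be discarded before upload.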

Mozilla Stumbler

The primary contribution tool was Mozilla Stumbler, an Android application that ran WiFi and cell tower scans in the background while the contributor's device had GPS enabled. Stumbler was purpose-built for MLS contribution: it scanned continuously, batched observations, and uploaded them to Mozilla's servers over WiFi (to avoid consuming mobile data with large uploads).

⬡ Observed Behaviour

Mozilla Stumbler performed WiFi scans approximately every two seconds while active and moving. When stationary (detected via accelerometer), the scan frequency decreased to conserve battery. Each scan captured every visible access point — typically 5 to 30 in urban environments — and tagged the set with the current GPS position. A typical walking session through a suburban neighbourhood generated several thousand observations per hour. The upload process batched observations and submitted them as compressed JSON payloads when a WiFi connection was available.
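The batching and compression step can be sketched as follows. This is a hedged illustration of the general technique (split into batches, serialise to JSON, gzip the result); the `items` envelope key, the function names, and the batch size are assumptions for the example, not Stumbler's actual upload format.

```python
import gzip
import json
from typing import Iterator


def batches(observations: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` observations."""
    for start in range(0, len(observations), size):
        yield observations[start:start + size]


def encode_batch(batch: list[dict]) -> bytes:
    """Serialise one batch as compact JSON and gzip-compress it."""
    body = json.dumps({"items": batch}, separators=(",", ":")).encode("utf-8")
    return gzip.compress(body)


def decode_batch(payload: bytes) -> list[dict]:
    """Inverse of encode_batch, useful for checking a payload locally."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))["items"]
```

Scan records are highly repetitive (the same keys and often the same BSSIDs recur from one scan to the next), so they compress well, which matters when an hour of walking produces thousands of records to upload.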

[Figure: Mozilla Stumbler interface showing active WiFi scanning session with access point counts and GPS track]

The contribution experience was straightforward but tedious. You installed Stumbler, enabled GPS, and went about your day. The application ran in the background, draining battery faster than normal (the combination of continuous GPS and frequent WiFi scans consumed meaningful power). Periodically, you checked the statistics screen to see your observation count and the coverage area you had surveyed.

The gamification was minimal. Stumbler showed a leaderboard of contributors ranked by observation count, which motivated some people and was irrelevant to others. The real motivation for most contributors was ideological: building an open alternative to Google's location database.


The privacy dynamics of WiFi scanning

WiFi-based geolocation creates a specific set of privacy tensions that MLS handled carefully but could not entirely resolve.

What contributors exposed

A contributor running Stumbler was uploading their precise GPS track to Mozilla's servers. Every observation included the contributor's location at a specific time, so over a sustained contribution period the uploads amounted to a detailed movement history for each contributor. Mozilla's privacy policy stated that contributor location data was processed for database purposes and not retained in identifiable form after aggregation, but the raw upload necessarily contained that information in transit and during processing.

Every WiFi router scanned by a Stumbler contributor had its BSSID recorded and mapped to a physical location without the router owner's knowledge or consent. This is the foundational privacy tension of WiFi geolocation: the technique works by cataloguing hardware identifiers that are broadcast involuntarily. Your home router's MAC address, mapped to your home's GPS coordinates, becomes a location reference point that anyone querying the geolocation database can use indirectly.

⚠ Common Pitfall

The privacy concern with WiFi geolocation databases is not that someone can look up your router's MAC address and find your home address directly — the databases are queried in reverse (submit observed MACs, receive estimated coordinates). The concern is that the existence of the database means any device that can see your router can determine its own location, which means it can determine that it is near your home. For most people, this is a benign consequence. For people with stalking, harassment, or physical safety concerns, the involuntary inclusion of their home router in a public geolocation database is a non-trivial exposure.

MLS addressed this partially through an opt-out mechanism: if a network's SSID ended with _nomap, MLS would exclude it from the database. This followed a convention established by Google's WiFi scanning practices. The mechanism required the access point owner to know about it, understand WiFi geolocation, and modify their network name — a bar that the vast majority of router owners never cleared.
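The opt-out rule is simple enough to express directly. The sketch below assumes sightings are plain dictionaries with an optional `ssid` key; the function names and record shape are hypothetical and only the `_nomap` suffix check comes from the convention itself.

```python
from typing import Optional

NOMAP_SUFFIX = "_nomap"


def is_opted_out(ssid: Optional[str]) -> bool:
    """True when the network name ends with the _nomap opt-out suffix."""
    return ssid is not None and ssid.endswith(NOMAP_SUFFIX)


def filter_sightings(sightings: list[dict]) -> list[dict]:
    """Drop sightings whose SSID signals an opt-out.

    A sighting with no SSID cannot signal an opt-out, so it passes
    through unchanged in this sketch.
    """
    return [s for s in sightings if not is_opted_out(s.get("ssid"))]
```

The check depends entirely on the network name being present and correctly suffixed, which mirrors the weakness described above: the mechanism only works for owners who know to rename their network.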

Cell tower data

MLS also collected cell tower observations — the tower identifiers and signal strengths visible to the contributor's device. Cell tower geolocation is less precise than WiFi (tower locations are known to hundreds of metres rather than tens) but provides coarse positioning in areas without mapped WiFi access points. The privacy dynamics for cell tower data are somewhat less sensitive because tower locations are not tied to individual households, but the contributor's own location history was equally exposed.


Data quality challenges

Crowdsourced geolocation data has inherent quality problems that MLS worked to mitigate but could not eliminate.

Mobile access points

Buses, trains, and mobile hotspots create access points that move. A contributor who scans a bus WiFi network at a GPS coordinate creates an observation that maps that BSSID to a specific location — but the next time someone queries MLS while near that same bus at a different location, the database returns the original (wrong) coordinates. MLS used algorithms to detect mobile access points based on observations from widely separated locations, but the detection was imperfect and relied on having enough observations to identify the movement pattern.
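One simple way to flag a moving access point is to check how far apart its observations are. The sketch below is an assumption about how such a check could look, not MLS's actual algorithm; the 1,000 m threshold and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def looks_mobile(positions: list[tuple[float, float]], max_spread_m: float = 1000.0) -> bool:
    """True if any two sightings of one BSSID are farther apart than max_spread_m.

    WiFi range is far below the threshold, so a fixed access point should
    never be seen from two places this far apart.
    """
    for i, (lat1, lon1) in enumerate(positions):
        for lat2, lon2 in positions[i + 1:]:
            if haversine_m(lat1, lon1, lat2, lon2) > max_spread_m:
                return True
    return False
```

A BSSID with a single observation can never be flagged by this test, which illustrates why detection depended on having enough observations to reveal the movement.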

GPS inaccuracy

Contributors in urban canyons, inside buildings near windows, or in areas with poor GPS coverage submitted observations with imprecise or incorrect GPS coordinates. A WiFi scan at a true location of 51.5074° N, 0.1278° W recorded with a GPS fix of 51.5080° N, 0.1290° W places the access point 100+ metres from its actual position. Individual observation errors average out over many contributions, but access points in areas with few contributors retained their initial (potentially inaccurate) positions indefinitely.
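The size of the error in that example can be checked with the haversine formula. The snippet below is a small worked calculation using the two coordinate pairs from the paragraph above; the spherical Earth radius of 6,371 km is the usual approximation.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# West longitudes are negative in decimal degrees.
true_fix = (51.5074, -0.1278)
recorded_fix = (51.5080, -0.1290)

error_m = haversine_m(*true_fix, *recorded_fix)
print(f"GPS error: {error_m:.0f} m")
```

The latitude difference contributes roughly 67 m and the longitude difference roughly 83 m at this latitude, which combine to a little over 100 m.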

Stale data

Access points are replaced, moved, or decommissioned. A router mapped in 2015 may have been replaced with new hardware (new BSSID) by 2018, leaving the old mapping as dead weight in the database. MLS applied age-based weighting to prefer recent observations, but the database inevitably accumulated stale records that degraded positioning accuracy in areas without active contributors refreshing the data.
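Age-based weighting can take many forms. The sketch below uses exponential decay with a one-year half-life purely as an illustration; the decay function, the half-life, and the function names are assumptions, not the weighting MLS actually applied.

```python
def age_weight(age_days: float, half_life_days: float = 365.0) -> float:
    """Weight that halves every `half_life_days`; fresh observations weigh 1.0."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def weighted_position(
    observations: list[tuple[float, float, float]],
) -> tuple[float, float]:
    """Combine (lat, lon, age_days) observations into one position estimate.

    Recent observations dominate; old ones still count but fade out.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    total = lat_sum = lon_sum = 0.0
    for lat, lon, age_days in observations:
        weight = age_weight(age_days)
        total += weight
        lat_sum += weight * lat
        lon_sum += weight * lon
    return lat_sum / total, lon_sum / total
```

Weighting only helps where newer observations exist to outweigh the old ones. An access point seen once, years ago, keeps its original position no matter how the weights are chosen, which is the failure mode described above for areas without active contributors.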

[Figure: Coverage map showing MLS data density variation between well-surveyed urban areas and sparse rural regions]

What happened to MLS

↻ What Changed

Mozilla announced the shutdown of Mozilla Location Service in 2024. The service stopped accepting new contributions, the API was deprecated, and the infrastructure was decommissioned. Mozilla cited the cost of maintaining the service relative to its usage and the difficulty of competing with proprietary databases maintained by companies (Google, Apple) that have billions of devices contributing data passively. Firefox transitioned to using other geolocation providers. Unlike the cell tower exports, which Mozilla did publish, the WiFi dataset was never released as a public download despite being open in principle: the privacy implications of publishing a global WiFi-to-location mapping were considered too significant.

The shutdown was not sudden. MLS had been in a slow decline for years before the formal announcement. Mozilla Stumbler had been removed from the Google Play Store. The contributor community had shrunk. The data coverage was increasingly stale in many regions as fewer active contributors refreshed the observations. The service worked well in the specific cities and countries where motivated contributors had surveyed intensively, but it never achieved the global, continuously refreshed coverage that makes a geolocation database reliable enough for mainstream use.


The contribution experience in retrospect

Then

Active MLS era (2013–2020s): Contributing to MLS felt meaningful. You walked or drove through your neighbourhood with Stumbler running, watched your observation count grow, and knew you were adding coverage to an open geolocation database that Firefox and other applications could use. The contribution process was simple, the purpose was clear, and the open-source ethos of building a commons-based alternative to Google's proprietary dataset was genuinely motivating. The gamification through leaderboards created a community of dedicated contributors who systematically surveyed their cities.

Now

Post-MLS landscape: No open, community-maintained WiFi geolocation database of comparable scope exists. The alternatives are Google's geolocation API (proprietary, requires API key, rate-limited), Apple's location services (proprietary, device-locked), and commercial providers like Skyhook. The open-source geolocation space has fragmented into smaller projects with limited coverage. For most applications, the practical choice is to use the platform vendor's built-in location services and accept the proprietary dependency. The dream of a community-maintained alternative to corporate location databases did not survive contact with the economics of data maintenance.

What contributors actually achieved

The honest assessment is mixed. MLS contributors collectively built a dataset that was genuinely useful in well-surveyed areas. In cities where active contributors lived and worked, MLS provided geolocation accuracy competitive with Google's service. Firefox users in those areas got accurate positioning without any data flowing to Google. That was a real achievement with real value for users who cared about the privacy implications of location services.

But the coverage was fundamentally uneven. A geolocation service that works well in Berlin and Portland but poorly in São Paulo and Lagos is not a viable alternative to a service that works everywhere. Google and Apple achieve global coverage because every Android and iOS device contributes data passively — the scanning happens invisibly, continuously, and at a scale that no volunteer effort can match. MLS required active, intentional contribution from technically motivated people, and that contributor pool was always small relative to the global coverage requirement.


Privacy lessons that outlast the service

MLS is gone, but the privacy questions it surfaced remain relevant because WiFi geolocation itself is not going away — it has simply retreated entirely behind proprietary walls.

Passive scanning is universal. Smartphones routinely scan for WiFi networks and, under default location settings, report the results to their platform vendor. The scanning that MLS contributors did intentionally with Stumbler happens automatically on Android and iOS devices. The data collection is identical in nature; only the consent model differs. Google and Apple disclose this in their terms of service. Most users do not read those terms, and the settings that limit the scanning are obscure enough that few users ever change them.

The _nomap opt-out is a polite fiction. The convention of appending _nomap to your SSID to request exclusion from WiFi geolocation databases has no legal or technical force. It is a voluntary gesture by database operators, and compliance is unverifiable. Whether your renamed network is actually excluded from any particular database is something you can never confirm. The mechanism was designed to address a legitimate concern — involuntary inclusion in location databases — but it placed the burden on the party with the least knowledge and the least power in the interaction.

Crowdsourced data creates collective privacy costs. When you contribute location data to any geolocation database, you are not just sharing your own location. You are mapping other people's infrastructure — their home routers, their workplace networks, their personal hotspots — into a system that enables location determination by third parties. The ethics of this are not resolved, and the discourse about WiFi geolocation privacy has not advanced significantly since MLS was active.

The privacy and security topic hub covers related themes around the intersection of technically convenient data collection and its downstream privacy implications. The open web hub explores the broader pattern of open alternatives struggling to compete with proprietary platforms that benefit from massive passive data collection.