crankycoder1975's comments

crankycoder1975 · on March 18, 2024

You can't.

Skyhook basically owns triangulation of rf. It's absurd.

crankycoder1975 · on Oct 2, 2020

I used to work at Mozilla until recently and what you're talking about is nonsense. The telemetry data is handled extremely carefully. Any downstream use of that data is also carefully managed so that creating re-identifiable data is not possible.

Could Mozilla reidentify people with raw telemetry - sure. Is it easy or done in practice - no. Shut the hell up unless you know better than someone who has handled that data.

Seriously folks - Mozilla has problems - but proper handling of your private data is not one of them.

crankycoder1975 · on Feb 2, 2018

Does Bruce still want you guys to use Zorba for Compass?

crankycoder1975 · on Feb 2, 2018

Ah, nevermind. I'll just read your emails. #ExGapperJokes

silent1mezzo · on Feb 2, 2018

:slow-clap: Luckily not possible now (and hasn't been for quite some time).

crankycoder1975 · on Oct 29, 2014

We're going to put some more information there on why we don't publish wifi at this point.

https://github.com/mozilla/ichnaea/issues/330

In a nutshell though, we don't know how to do this yet without creating a privacy nightmare.

crankycoder1975 · on Oct 29, 2014

YES.

We caught that bug just before release. :)

https://github.com/mozilla/MozStumbler/issues/1137

crankycoder1975 · on Oct 29, 2014

I'm going to try getting an F-Droid build out today, but we've got some build issues with the fdroidserver. In any case - it will get to F-Droid real-soon-now.

lgierth · on Oct 29, 2014

Cool, very good to hear!

crankycoder1975 · on Oct 29, 2014

Hi all!

We cut a 1.0 release of the Mozilla Stumbler finally.

Have at it. File the bugs. Complain about battery life.

Help us make this thing not suck and build out a proper open location service.

hackuser · on Oct 29, 2014

Congratulations; I hope this gives us a safe, effective, open location service.

The privacy policy[1] could be clarified for less technical readers, and even for others. I infer that collected data is anonymous because you write,

1) We receive publicly observable data about WiFi access points and cell towers around you, your estimated latitude and longitude, and the date -- Not associated with anything else, that may be anonymous data -- though you could guess my home network or home location by the most common/strongest wifi signals. If you track data by submitter, you also would have a good idea of their travels.

2) we may receive certain temporary data such as your IP address. This data is deleted after being used as follows ... -- You seem to be implying that you do receive non-anonymous data, and delete it after innocuous uses.

3) You can send us data anonymously or under a nickname -- Which implies anonymity is possible.

If what I infer is correct, why not restate it directly and unequivocally with something like the following:

Unless you choose otherwise, the data you send will be anonymous and not associated with you in any way. We will not record who you are or what phone sent the data. We do receive some non-anonymous data, but we delete it within X hours/days after using it as follows ...

And add more detail after that.

[1] https://location.services.mozilla.com/privacy

EDIT: Clarify a bit, and a correction to #1

cpeterso · on Oct 29, 2014

Good questions! The stumbler reports Wi-Fi and cell tower locations and an optional nickname. The location data is stored anonymously. The nickname and just the number of reported networks is stored separately, solely for display on the leaderboard [1] or other gamification in the future.

The IP addresses are just a fact of life of web server logging. They are not stored in the location or leaderboard databases.

[1] https://location.services.mozilla.com/leaders

hackuser · on Oct 29, 2014

> Good questions!

Thanks! My post's intention was to suggest that Mozilla revise the privacy policy to clarify it for everyone. What are your thoughts?

justcommenting · on Oct 29, 2014

"fact of life of web server logging" = screw you, we're not even going to consider deleting our logs even as we talk a good line about how much we respect your privacy

edit after downvote: also, mozilla engineering PMs will intimate on hackernews that it won't internally correlate and potentially sell any of the location and other information it most obviously could correlate about people, even though it has already announced its intention to advertise.

crankycoder1975 · on Oct 29, 2014

We don't correlate your location data to ads. As a Canadian, that would actually be illegal and a violation of the Privacy Act.

We never got authorization from individuals to do that correlation.

We aren't perfect, but I think we do a pretty good job of respecting and protecting your privacy at Mozilla.

justcommenting · on Oct 29, 2014

thank you for these clarifications.

one industry norm that makes these things tough (again, not Mozilla's fault) is that at least under US law, Mozilla could change its privacy policies at some point in the future and do a lot more than it currently does.

and... my parent comment was brash and probably deserved the downvote it received.

lxt · on Oct 30, 2014

Selling user data would be completely against our mission and values, and I think it would be extraordinarily hard for such a change to make it through the internal immune system for such things. I think Mozilla is less likely to do bad things with your data than just about any other company (or government for that matter) out there.

(Disclosure: I work for Mozilla. I am helping write an updated set of privacy guidelines for engineering teams, to be as explicit as possible about how careful and respectful we need to be with data.)

sp332 · on Oct 29, 2014

"2) we may receive certain temporary data such as your IP address. This data is deleted after being used as follows". So yes, they do say they delete it. Also, Mozilla has a better track record with respecting user privacy than anyone else in this space. (And where is their intention to advertise?)

justcommenting · on Oct 29, 2014

i agree that Mozilla has a better track record than most large tech companies in most areas, but that also sets a pretty low bar. i'm more of the opinion that if Mozilla really were as committed to user privacy as they claim to be, they might not respond so flippantly to questions about server logs. If it wanted to, Mozilla could even stop logging "certain temporary data such as your IP address."

regarding Mozilla's intention to advertise: http://www.zdnet.com/mozilla-clarifies-defends-firefox-ad-po...

crankycoder1975 · on Oct 29, 2014

Monitoring server logs is how we detected and implemented protection from a botnet scouring the database for SSID information.

justcommenting · on Oct 29, 2014

there are indeed many useful ways that server logs can positively contribute to improving user privacy; i just thought the attitude of "well of course...that's what everyone else does" (even though that's true!) was dismissive of good-faith privacy concerns.

mike-cardwell · on Oct 29, 2014

Google provides an opt-out. They will ignore your AP if you append "_nomap" to the end of your SSID.

Does Stumbler also support this? If not, why not?

Ref: https://support.google.com/maps/answer/1725632?hl=en

edit: Found the answer to my own question. Yes, they do support "nomap": https://github.com/mozilla/MozStumbler/issues/149

paulojreis · on Oct 29, 2014

Is this really a thing? WTF?

I feel very ashamed, as someone who works in IT, every time this happens. I mean, people can opt-out, of course - but, in order to do that, they need to know what an SSID is, and how to change it. What about people who don't? Will we just assume that they don't care or that their opinion doesn't matter?

DCKing · on Oct 29, 2014

As someone who works in IT, I always feel ashamed to see outrage over this. We somehow want both privacy as well as a freaking radio beacon spreading out a signal to hundreds of meters away. Let there be no mistake: using a Wi-Fi router in your house means you are voluntarily broadcasting an identifier to anyone within hundreds of meters. There can be no honest expectation of privacy there.

If you don't want people obtaining information from a radio beacon in your house then do not put a radio beacon in your house. But don't pester companies for opting out of the passive database of radio signals you are voluntarily sending into the world. You cannot have your cake and eat it too.

Furthermore, there is nothing intrinsically revealing about an SSID. If your SSID tells people information about you, the problem is the SSID and not the collection of that information. It is trivial to change your SSID to a pseudonymous one.

I know that a lot of people are not aware of the privacy consequences, but those people are not the ones making a point out of this. Once you educate yourself about the privacy consequences of using a Wi-Fi router, do not blame people for collecting information that you are actively and voluntarily broadcasting!

mike-cardwell · on Oct 30, 2014

As you walk around, your phone broadcasts a list of wifi access points that you have connected to.

The existence of these databases means that anyone with unrestricted access to query them can probably figure out, completely passively, where anyone who enters their vicinity lives and works.

fulafel · on Oct 29, 2014

Well, having your curtains open also broadcasts an image of your living room on EM spectrum for hundreds of meters for anyone with optics... Same for eavesdropping (laser mic). Easy to listen maybe but you will still get convicted in both cases.

DCKing · on Oct 29, 2014

The difference being that having a Wi-Fi router means actively powering a device that sends a signal beyond the perimeter and privacy of your home. A signal that, as evidenced by this app, can be passively [1] picked up and processed by any casual passer-by.

Having a Wi-Fi router with an SSID is the equivalent of installing a speaker on the top of your house and have it constantly spell a uniquish name to the neighborhood. It might be useful for you to have that, but you might want to think a bit about what it means for your privacy.

[1]: Not having to aim or target anything, not having to have exotic instruments, but being able to be picked up by anyone at all by just listening.

makmanalp · on Oct 29, 2014

One could argue that the main purpose of the device (or the main reason users use the device) is not to broadcast identity, it is to let the user connect to the internet within the perimeter of their domicile.

Just like you can argue that the main purpose of windows is not so that people can look in, it's so that people can look out, and light comes in.

I agree partially with what you're saying, but there is a mismatch between user expectation and what the technology actually does. I don't think the fact that the user used it implies they consented to the technical side effects.

fulafel · on Oct 29, 2014

Having the lights on in your living room or exercising your vocal cords still fit your description.

DCKing · on Oct 29, 2014

Neither of these have either:

1) The same accessibility for a passerby outside of your house.

2) The same constant, location identifying properties or information content.

The things you mention cannot be described as beacons.

justcommenting · on Oct 29, 2014

I can also passively collect plenty of WEP traffic being broadcasted over public property and decrypt it on my computer (but I don't).

Mozilla's not aiming to do anything remotely as invasive as that, but I still don't find "anything that can be picked up passively from public property is fair game" a very compelling ethical standard, especially for an organization like Mozilla.

DCKing · on Oct 29, 2014

> I still don't find "anything that can be picked up passively from public property is fair game" a very compelling ethical standard

This is a strawman.

Any public information that can be picked up passively from public property is fair game is the real argument. Decrypting WEP, easy enough as it might be, is still unethical as the information was meant to be private. Making a database of public SSID broadcasts is completely ethical as there should be nothing private about an SSID.

unshure · on Oct 29, 2014

It's not the SSIDs but the BSSIDs that end up in the database, isn't it?

hannosch · on Oct 29, 2014

Yep. These services only store and transmit the BSSID (which is most often the mac address of the network card).

The only place the SSID (clear text name) is used is in filtering out things on the client end. Both looking for "no SSID" / hidden networks and the _nomap suffix. The SSID is never sent to any service.
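The client-side filtering described above can be sketched roughly as follows. This is a minimal illustration, not MozStumbler's actual code: the `ScanResult` type, its field names, and `reportable_bssids` are hypothetical, and an empty SSID is assumed to stand for a hidden network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    """Hypothetical Wi-Fi scan entry; the field names are assumptions."""
    ssid: str   # clear-text network name, only inspected on the client
    bssid: str  # MAC-style identifier of the access point


def reportable_bssids(scan):
    """Return the BSSIDs that may be submitted; SSIDs are never included."""
    kept = []
    for ap in scan:
        if not ap.ssid:
            # Hidden network / no SSID: skip it.
            continue
        if ap.ssid.endswith("_nomap"):
            # The owner opted out via the _nomap suffix: skip it.
            continue
        kept.append(ap.bssid.lower())
    return kept
```

The SSID is only read to decide whether to drop an entry; the returned list holds BSSIDs alone, which matches the claim that the name is never sent to the service.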

justcommenting · on Oct 29, 2014

you're arguing that there's a clearly defined category of broadcasted signals that can be clearly defined as public; i'm arguing that at least in ethical terms, what matters is whether the person behind the device knows and understands that their signal is leaking, where, and how that information could be used. for most people most of the time, i don't think that's the case. maybe we should agree to disagree :-)

DCKing · on Oct 29, 2014

> you're arguing that there's a clearly defined category of broadcasted signals that can be clearly defined as public;

This is another strawman.

I'm not arguing for a particular clear-cut definition of "public" and "private" at all. I'm arguing that the distinction public and private can be made for some forms of communication, and that a radio broadcast to your neighborhood means it is public, and encrypting your traffic means it is private. In addition to that there is also a greyer area like unencrypted traffic over a wire, that should mostly be considered private from an ethical perspective.

I agree that most people don't really know what they're doing, and I agree that it is a problem. I also think that most people don't really care, and considering no information is contained in most SSIDs, rightly so. Lastly I think that education is important for this, not regulation (legislative or internal) for the collecting companies or individuals. But all of that is not what I was arguing against.

paulojreis · on Oct 29, 2014

> I know that a lot of people are not aware of the privacy consequences, but those people are not the ones making a point out of this.

Of course they are not making a point - they are not aware. How would you expect them to make a point?

What you're saying is: if they don't know enough about the subject to decide if a point should be made, then we should ignore the right to give (or not) an informed consent (because you can decide for them if the SSID is "intrinsically revealing" or not).

DCKing · on Oct 29, 2014

How about we stop being so condescending, educate people to make an informed choice, and stop asking Google, Mozilla and anyone with a smartphone to think for them?

paulojreis · on Oct 29, 2014

I totally agree, and that's not the issue at stake. The issue is: what should we do while people are not educated enough to make an informed decision?

Mozilla, Google (and some people in this thread) assume that it's right for them to decide if there are privacy concerns and advance with their initiatives. I don't. And they are, by marking this as opt-out, thinking for them.

DCKing · on Oct 29, 2014

It's still someone's own choice to install a Wi-Fi router and power it on. The fact that many of them don't exactly understand that it might be a privacy issue (if and only if they put identifying information in the SSID) does not mean that Mozilla and Google are thinking for them. The assumption that the router owner does not mean the SSID to be public is also not warranted.

If the SSID was mandated to be identical to someone's name (or any other identifying information), I'd say the problem you describe was real. But since the information broadcast is mostly pseudonymous, I think it's quite a small thing you are arguing. If people are including personal information in their SSID, by all means tell them!

justcommenting · on Oct 29, 2014

Would you say the same thing if I set up an IMSI catcher at your home and geolocated the other radio beacons broadcasting from your home, or would that be creepy?

You might jump to say "stingrays are illegal so that's different" and in some ways, you'd be right. But it's also the case that the average user's expectations about how their wireless devices will be systematically located by third parties are better codified into law and policy in that case than in this one.

DCKing · on Oct 29, 2014

I don't understand your comparison. An SSID broadcast is meant to be public information. An IMSI catcher actively exploits weaknesses of implementations to MITM non-public connections. IMSI catchers do not catch public information at all, they break into meant-to-be-private connections.

justcommenting · on Oct 29, 2014

the only thing most people most of the time mean when they set up wi-fi is that they want to be able to connect their ipads and chromebooks to the internet at home.

IMSI catchers intercept signals broadcasted from radios that commonly transit across public property. my point was that we routinely consider things other than protocol specs in determining whether and when signals should be collected.

DCKing · on Oct 29, 2014

> the only thing most people most of the time mean when they set up wi-fi is that they want to be able to connect their ipads and chromebooks to the internet at home.

These are not the people I'm arguing against, and I mentioned that in my first post. People should definitely be educated about the privacy consequences of their equipment. I'm arguing against people who do know that an SSID broadcast is a public radio signal they themselves transmit, and are still arguing that other parties (Google, Mozilla) should be responsible for their privacy regarding that signal instead of themselves.

> my point was that we routinely consider things other than protocol specs in determining whether and when signals should be collected.

A radio signal that is explicitly meant to be public should be public information. A radio signal that is meant to be private, but can be made public by exploitation or specialized instrumentation, should not be public information almost all of the time. If the meant-to-be-public signal can be collected en masse by an app such as Mozilla's, then there's really no way people should feel any expectation of privacy in this regard.

justcommenting · on Oct 29, 2014

Unless Google or Mozilla affirmatively knows that a given user understands the implications of broadcasting their SSID, I don't think it's reasonable to assume that everyone still broadcasting their SSID is doing so deliberately in the informed-consent for mapping sense of the word. That doesn't make Google or Mozilla bad...I just don't think it's a reasonable assumption for organizations to make.

It's hard for me to think of ways these organizations could reliably know whether people don't mind their SSID being mapped or used for related purposes without asking them.

JackC · on Oct 29, 2014

Can you say more about your privacy concern here? I'm not seeing it.

As far as I know, the sole use of this database is to say, "if you can see this set of wifi networks, then you are probably at this GPS location." It's literally the same thing, except at a different electromagnetic frequency, as saying "if you can see houses with these addresses, you are probably at this GPS location." Kind of like a street map.

I definitely think that privacy concerns can emerge when you aggregate public data -- is there something I'm missing here?
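The "set of visible networks to location" lookup described above can be sketched as a toy example. Everything here is an assumption for illustration: `estimate_position` and the `known_positions` mapping are hypothetical, and averaging the known coordinates is just the simplest possible estimator, not how any real location service is confirmed to work.

```python
def estimate_position(visible_bssids, known_positions):
    """Estimate (lat, lon) as the average of known access point positions.

    known_positions is a hypothetical dict mapping BSSID -> (lat, lon).
    Returns None when none of the visible BSSIDs is known.
    """
    points = [known_positions[b] for b in visible_bssids if b in known_positions]
    if not points:
        return None
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return (lat, lon)
```

Note that the lookup only goes one way: the caller supplies what it can see and gets back a position, without being able to ask where a given access point is.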

bajsejohannes · on Oct 29, 2014

The privacy concern as I understand it is about access points moving in time, not about the snapshot of the data at a certain point.

So you can use my access point to find your location, but if I bring it to my next home, please don't record that in public data.

JackC · on Oct 29, 2014

This is a great example -- thanks.

So, is it fair to say that there's no privacy concern if the API only exposes a one-way lookup? I.e. "here are the access points I can see -- where am I?"

That also addresses the other concern raised below, that the database could be used to search for known-vulnerable routers.

hackuser · on Oct 29, 2014

> is it fair to say that there's no privacy concern if the API only exposes a one-way lookup?

It helps, but no. The data is still there to use. The API or Mozilla policy may change, or security may fail.

From what I can tell, there's no need to record either the devices gathering data or the devices looking up their location. Just don't store that data and everything is fine.

JackC · on Oct 29, 2014

Oh, another example that affects even the one-way lookup is stalking -- if I've been over to Joe's house before, and then he goes into hiding, I can say, "hey, I see Joe's access point, where am I?"

That could be mitigated by requiring at least two access points for a query.

hannosch · on Oct 29, 2014

Both the Mozilla API and Google have this "you need to know two" protection. At Mozilla we go a bit further and also make sure the two BSSIDs you are sending aren't almost identical. That happens with a lot of modern access points that are set up with separate 2.4 and 5GHz networks, or that have a guest network.
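A rough sketch of the "you need to know two" check, under stated assumptions: the comment does not say how "almost identical" is decided, so this example treats two BSSIDs as the same device when their first five octets match. That heuristic, and the names `normalize`, `same_device`, and `query_allowed`, are hypothetical and not taken from the Mozilla service.

```python
def normalize(bssid):
    """Lower-case a BSSID and strip common separators."""
    return bssid.lower().replace(":", "").replace("-", "")


def same_device(a, b, prefix_hex=10):
    """Assumed heuristic: equal first five octets (10 hex digits) means one device."""
    return normalize(a)[:prefix_hex] == normalize(b)[:prefix_hex]


def query_allowed(bssids):
    """Allow a lookup only if at least two clearly distinct devices are named."""
    distinct = []
    for bssid in bssids:
        if not any(same_device(bssid, seen) for seen in distinct):
            distinct.append(bssid)
    return len(distinct) >= 2
```

With this check, naming one router's 2.4GHz and 5GHz BSSIDs (or its guest network) does not count as knowing two access points, which is the stalking scenario the rule is meant to blunt.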

paulojreis · on Oct 29, 2014

Your point is wrong from the beginning: I don't have to explain my privacy concerns, and neither do the others who don't know what an SSID is. "Privacy" should be the default and without need for justification, not the other way around.

antsar · on Oct 29, 2014

How do you feel that privacy is being violated by scanning SSIDs and pinning those SSIDs to GPS coordinates? I believe JackC's point is that there isn't a privacy concern here. The consumer's router is blasting out the SSID for everyone to hear, just like if you were standing on your roof shouting, or had a poster on the outside of your house. There's nothing wrong with those things being recorded, what makes SSIDs different?

publicfig · on Oct 29, 2014

I understand the sentiment but really feel like it falls apart when looking at the actual situation. It's checking on things that are broadcast outside of your own property, and even offering a way to opt out. It's like transmitting radio waves from your property and asking that no one listens. You have control over the distribution method or whether or not it even exists.

justcommenting · on Oct 29, 2014

so everyone should have to choose between having their home router's info added to large, aggregated databases and reconfiguring/not operating a router?

i know plenty of people for whom that's not a choice they're likely to know about. perhaps mozilla/google shouldn't be able to dictate my SSID or its visibility just because they don't want to incur the cost/complexity of obtaining affirmative, informed consent.

publicfig · on Oct 29, 2014

Yes, everyone should have to choose that. This should be a choice to make when you are broadcasting a signal out beyond your property. This would be like arguing that your wireless network shouldn't show up in the dropdown list you see when trying to connect to a wifi network. If it's a major concern, then you always have the possibility of using ethernet, but this information is publicly available.

justcommenting · on Oct 29, 2014

I understand the spirit of your comment, but the number of non-technical people, especially in cities, that even know when signals are being broadcast outside their homes is likely quite small. And it's probably almost never deliberate.

If technology perfectly reflected people's intentions for their devices, I think we'd see relatively few people deliberately broadcasting wi-fi outside of their homes intentionally and most people's SSIDs wouldn't show up on any dropdown outside their home.

I agree that this information is often available from public places, but I was getting at whose priorities should dictate whether/how the information gets collected and how it's used--people who paid for devices they may not fully understand or be able to control, or organizations that want to systematically exploit signals from them for different purposes that may be different from those of the person who owns the device?

Someone1234 · on Oct 29, 2014

Here's an analogy:

Everyone who travels past your home can see if the lights are on in the evening. They can also see which lights are on in the front of the house.

So I'm going to give you three scenarios and I want you to tell me when exactly it becomes a privacy issue:

1) A single person travels past your house and happens to notice which lights are on.

2) Someone travels past your house and records, on a piece of paper, which lights are on.

3) A Google car travels past your house and records, electronically, which lights are on.

Same thing with WiFi SSIDs here. It is like you standing on the roof of your home and shouting your ATM pin using a bullhorn, then complaining when someone else hears or records the information.

You want people to stop "monitoring" your SSID? Stop freaking broadcasting it at all.

JoeAltmaier · on Oct 29, 2014

That solution is suboptimal. If you don't broadcast it, then properly provisioned clients have to probe for it. Which they do, on every channel. So you go from one device beaconing the SSID (your AP) to all client devices advertising it, on every channel.

justcommenting · on Oct 29, 2014

I think the difference we're talking about here between #1 and #3 is that #3 makes it much easier/cheaper to (for example) predict when you'll be out of town if they want to break into your house (router)...potentially even without ever traveling past it.

Just because this information is legal to collect, doesn't mean people think a nonprofit that claims to be committed to user privacy should be moving the center of gravity closer to your third scenario.

But maybe more importantly, we're not talking about "someone else" recording the information or just a few "people" "monitoring" an SSID. We're questioning the wisdom of an organization building software to systematically collect, store, and make an SSID far more readily available to far larger numbers of people.

unshure · on Oct 29, 2014

It's the BSSID that is made far more readily available, not the SSID.

paulojreis · on Oct 29, 2014

Analogies are analogies because they're similar, not identical.

> You want people to stop "monitoring" your SSID? Stop freaking broadcasting it at all.

This is technocentrical BS, washing the hands to justify doing what you want.

1) Most people don't know that their SSIDs are being recorded (with position), so how do you expect them to make an informed decision? It's not like the information is readily available (I work in IT and I did not know about appending "_nomap" to the SSID).

2) Everyone has a router, broadcasting the SSID. Do you really and honestly expect everyone to know how to disable it?

Someone1234 · on Oct 29, 2014

I don't think it is a privacy violation AT ALL. And nobody in this thread has even tried to explain why it is.

Just hand waving and "we don't have to explain ourselves, privacy is the default state!"

I gave an analogy above, you didn't even answer it. When does it become a privacy issue exactly?

paulojreis · on Oct 29, 2014

> I gave an analogy above, you didn't even answer it. When does it become a privacy issue exactly?

You see, that's the problem - and that's the point. I did not answer because:

a) I don't really care about my SSID privacy. I do, however, care about other people's right to know what's happening and to make informed (not implicit, by Google or Mozilla rules) decisions; and

b) I really don't (shouldn't) have to. It's not your concern when or how I feel my privacy being violated. I don't have to answer that, and it's a sad, sad society where this happens.

untog · on Oct 29, 2014

But SSIDs are not private. At all. Should what the outside of your house look like be private information? How would that work? What about when I appear in the background of a photo someone took on the street?

paulojreis · on Oct 29, 2014

> Should what the outside of your house look like be private information?

Everyone knows that someone can record the outside of your house; not everyone knows that SSIDs with location can be (and actually are!) registered. Do you notice the difference? You can't assume that WLAN specifics are as tacit as knowing that people can look at my house!

untog · on Oct 29, 2014

Even if you did explain to everyone that their SSIDs are being indexed - what would you tell them is actually being indexed? What personal information are they giving up? Your address, age, other residents of your house are already listed publicly. The name you gave your wireless network pales in comparison.

function_seven · on Oct 29, 2014

You don't have to justify them, but the actual privacy-violating mechanism is worth explaining, no?

What is it about SSID-based geolocation that compromises the AP owner’s privacy?

justcommenting · on Oct 29, 2014

see my other comment for a hypothetical: https://news.ycombinator.com/item?id=8527229

through no fault of mozilla's, most home routers are ridiculously, pathetically insecure. this is not a situation that would be improved by making it easier to geolocate routers from specific vendors. if vulnerable routers become easier to find, my communications passing through that router could quickly become a lot less private. would mozilla be responsible? no. but that doesn't mean mozilla sharing my probably-vulnerable router's location wouldn't play a role in compromising my privacy.

zz1 · on Oct 29, 2014

I guess you have a better suggestion, and am curious to hear about it.

paulojreis · on Oct 29, 2014

Of course I do. Make this "opt-in".

Using SSID naming conventions to do this is just dishonest: most of the people who will be scanned won't know what an SSID is. Even if they do, and do have the competence to change it, how many of them will know about this convention? More than this: how many "home network owners" know that their networks are being scanned and georeferenced? This opt-out scheme is ridiculous and plain "hand-washing" - they obviously don't expect people to use it.

zz1 · on Oct 29, 2014

"Make this "opt-in"" is a statement, not a procedure. How, and why? Would the cons outweigh the pros?

paulojreis · on Oct 29, 2014

> "Make this "opt-in"" is a statement, not a procedure. How, and why?

I don't really care about the procedure; Snailmail, if need be. Convenience is not a valid argument for breaking privacy.

> Would the cons outweigh the pros?

The answer to this is dependent on one's stance. As you may imagine, from where I stand, they do (clearly). I can't see any "logistical inconvenience" justify breaking privacy by default.

anonymousDan · on Oct 29, 2014

Why not simply require the SSID to end in _map? If Google think it's so easy for users to do, then surely this shouldn't present a problem?

justcommenting · on Oct 29, 2014

the obvious option is to request consent instead of violating people's privacy and make a dragnet data collection effort like this opt-in instead of opt-out.

but that's clearly not the intention here, because how dare anyone question someone else's motives/objectives/priorities for collecting data about devices they don't own. in fact, we're supposed to think this is the "nice" version because a google-funded nonprofit is doing it instead of google doing it unlawfully with cars or through waze.

seeing mozilla move in this direction while talking about how much they respect everyone's privacy is a strategic stumble indeed.

zz1 · on Oct 29, 2014

Did you ever count APs during a short walk in a relatively low density neighbourhood? You have hundreds in under 100m. Tell me, how do you plan to ask each and every one of them? Ring on every doorbell?

-Sorry, is FritzBox!239?

-No, here is YouMakeTooMuchNoiseWTF

-Oh, I see, could you please pass a message to your neighbour? I'd very much appreciate if he could please fill in this form and send it back through paper mail to Mozilla…

justcommenting · on Oct 29, 2014

that would be one respectful way to do it, and yes, challenging.

but the premise of your comment is that of course my device's SSID and related location should be collected in someone else's database because a google-funded nonprofit wrote an app for people to go wardriving with.

just because SSIDs can be legally observed and collected doesn't mean i have to be happy about it. I wasn't talking about this as a technical problem as much as an ethical/political one for an organization that claims to be committed to my privacy...except when it's not.

crankycoder1975 · on Oct 29, 2014

Hey, we're happy to hear about privacy concerns and ways that these might be addressed.

As for collecting your SSID information - devices are already storing SSIDs to do an active scan.

If you're not happy that the Mozilla Stumbler can record that SSID, you should probably also be unhappy about every WiFi device capable of doing a probe request - which is basically all WiFi devices.

As far as the ethics concern - I'll bite.

This is one of the privacy reasons why we do not publish the wifi database yet. We haven't figured out a way to do this without exposing too much personal data yet.

We've got some rough ideas on how to do this, but nothing good enough yet that we'd be willing to expose our users to this risk.

justcommenting · on Oct 29, 2014

"devices are already storing SSIDs to do an active scan" - Not mine, although I would readily acknowledge that I'm in the minority and this is generally a truism.

And thank you for acknowledging privacy concerns over publishing the wifi database, although I'm personally still concerned whenever that information gets aggregated systematically, even if it's internal to Mozilla.

One way I think about privacy for data like this is respecting people's intentions. When most people set up wi-fi, I would argue that their intent is almost never to help Mozilla or Google precisely locate phones or IP addresses; it's to connect wirelessly to the internet. More to the point, it's hard to find out someone's intention without asking them. Kudos to Mozilla for getting people to wardrive consensually; but that may still not make me feel much better if I'm just someone with wi-fi.

cpeterso · on Oct 29, 2014

Just to clarify, the Mozilla Stumbler app looks at SSIDs (to filter out "_nomap" and known mobile phone and transportation networks), but the SSIDs are not reported to the Mozilla Location Service. The BSSID/MAC addresses are, though.
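
A minimal sketch of the filtering described above, in Python rather than the Stumbler's own code. The `ScanResult` type, its field names, and the `reportable_bssids` helper are assumptions made up for illustration; only the `_nomap` suffix check and the fact that BSSIDs, not SSIDs, get reported come from the comment. The extra filtering of known mobile phone and transportation networks is left out.

```python
from dataclasses import dataclass

# Opt-out suffix mentioned in the comment above.
NOMAP_SUFFIX = "_nomap"


@dataclass
class ScanResult:
    """One access point seen in a scan (hypothetical structure)."""
    ssid: str   # network name, only inspected locally
    bssid: str  # access point MAC address


def reportable_bssids(scan):
    """Return BSSIDs of networks that have not opted out.

    The SSID is used only for the opt-out check and is not part
    of the returned data.
    """
    return [ap.bssid for ap in scan if not ap.ssid.endswith(NOMAP_SUFFIX)]
```

For example, a scan containing `HomeNet` and `Cafe_nomap` would yield only the BSSID of `HomeNet`.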

zz1 · on Oct 29, 2014

Since you are underlining the provenance of Mozilla's budget, I guess that when Google stops financing Mozilla everything will change for you. Otherwise you are just lining up words to make a big impression but without any meaning or clue at all.

Don't you want everyone to observe your SSID? Hide it. You are cluttering the public's ether, so you are subject to public scrutiny. Don't you want to add "no_map" to the end of it? Shut up.

Or just do what Buckminster Fuller told you to do: do not criticize a system but build a new and better one to obsolete the one that doesn't work. I promise to print your form if you start with a better approach. Unless you are not a complete idiot and understand that it is a theoretically possible way to deal with the problem but not a feasible one. Anyway, go on, just complain and talk nonsense: it will help. A lot.

justcommenting · on Oct 29, 2014

i don't think i was the first or the only person to point out the similarity of this data collection program to google's street view program and related legal/policy/privacy issues that arose with it.

as engineers, we often end up offering people choices that aren't really choices. for my grandmother's ISP-provided wi-fi access point, adding no_map to her SSID isn't a choice she's prepared to make, and i don't think those are reasonable expectations for the average user.

when people suggest otherwise, i think that part of what they seem to be arguing is that the technical problem they're trying to solve--often for commercial gain--is more important than being respectful of other people. people shouldn't have to know how to hide their SSID or add "no_map" to their SSID to stay out of large databases by default.

my view is that the world is a better place when information sharing is consensual, even when it's otherwise legal to obtain that information. i think that's a better world than one in which we tell people to hide their SSIDs or add "no_map" to them. i'm interested in building software and systems that respect people and their devices.

Ygg2 · on Oct 29, 2014

How is this any different than robots.txt?

I don't see your point. If you are ignorant enough to not know how to secure against such measly attempts at privacy breach, how will you secure against a more determined hacker?

Furthermore, the SSID is publicly broadcast, so that any device you authorized can identify and connect.

justcommenting · on Oct 29, 2014

i didn't say i didn't know how to secure against something like this or that it was not legal.

my point was that this approach to data collection, consent, and privacy sharply and directly contradicts claims mozilla makes to users about being committed to their privacy. i think this reflects the opposite.

maybe a better analogy would be someone from the ACLU photographing everyone they saw in public: legal and easy to defend against, but hypocritical/not cool in my opinion and it might make me question the organization's priorities.

fooqux · on Oct 29, 2014

I understand what you're saying, but you have to draw the line between privacy and common sense at some point.

It has been understood for a while now that you have no expectation of privacy in public, at least as far as not being photographed, talked to, etc. Most people would probably agree that the paparazzi taking sneaky pictures of celebrities buying milk at Kroger aren't being very classy, but they'll also probably say it's fair game at that point.

Likewise, I would argue that broadcasting your SSID over the electromagnetic spectrum is public. As far as privacy is concerned (I have a slightly different opinion when it comes to security) I still haven't seen any compelling argument explaining how having your SSID mapped to a location is in any way a violation of privacy. Maybe you have one?

justcommenting · on Oct 29, 2014

Sleazy paparazzi can exist in the world without breaking the law, but I expected more than that from Mozilla.

One hypothetical example: SSIDs often betray vendor names out of the box, and home routers are typically embedded devices that don't frequently receive security updates. Suppose Mozilla makes its database public and lists my SSID--or more likely, some weakly-secure hash of my SSID--in a public database that later gets compromised (e.g. plenty of people know their own SSIDs). Then, through no fault of Mozilla's, there's some 0day announced for my router. Now, every script kiddie in the neighborhood's using metasploit against a pre-selected list of vulnerable routers, potentially even remotely depending on their ability to integrate information from other sources. Maybe that sounds like more of a security issue than a privacy issue, but at some point, the effect is the same.

fooqux · on Oct 29, 2014

As you said, that's not a privacy issue but a security one. Also, in your example I'd argue it would just be easier to attack every single IP address and/or WAP rather than attempt to figure out which ones are Linksys and running a vulnerable firmware. It would take less time and also solves the case of non-default SSID names.

I'm still interested in seeing an example of how linking SSIDs to physical locations is a violation of privacy. Especially compared to, say, linking my full legal name to my house address which is already treated as public knowledge.

justcommenting · on Oct 29, 2014

I don't think you'll like my answer, but I think it was Schneier who said that it's not necessarily any one thing: it's having easy access to a bunch of different things, together.

Ygg2 · on Oct 30, 2014

I believe that according to law, the onus is on the owner in question to make sure their WiFi router is secure. If a hacker takes control of your router, and downloads pirated material, you are considered responsible if you didn't take even necessary steps to protect yourself. Then you sue the manufacturer, and all routers come with a set of different passwords and _no_map by default. That is the most likely logical course of action.

Everything else is idealizing. Same as with video and with DRM. Mozilla could take a principled stance and say no to patented codecs and no to DRM, and then Google says yes to both of those things and reaps the benefits, while end consumers abandon Mozilla because it doesn't play YouTube or Netflix, and then Mozilla is no more.

If you don't like it, you can fork Firefox and/or choose not to trust Mozilla. The situation is super sad, but what else can you do? Be principled and disappear? Or compromise and survive?

Touche · on Oct 29, 2014

Your SSID is being broadcast on public property. You have no claims of privacy there.

justcommenting · on Oct 29, 2014

i never disputed the lawfulness of doing this--in fact i explicitly acknowledged it in my last comment.

i'm not making any legal claims to privacy--just pointing out that collecting everything that's lawful to collect runs counter to mozilla's policy stance of being committed to users' privacy.

paulojreis · on Oct 29, 2014

> seeing mozilla move in this direction while talking about how much they respect everyone's privacy is a strategic stumble indeed.

Yup. I'm as heartbroken as I can be with a company.

A "hand-washing" attitude towards privacy from Google, Facebook or a telco is expectable. But from Mozilla? This saddens me, way more than the support of DRM in the web.

justcommenting · on Oct 29, 2014

agree completely--it's one thing to do this sort of thing (it's probably legal, etc.), but to do it while claiming to be fighting for user privacy is really galling to me.

walterbell · on Oct 29, 2014

An option during router setup when SSID is named? An organized promotion of this approach, e.g. list of routers that make it simple, list of services that honor it. A cool name other than "Do Not Track".

zz1 · on Oct 29, 2014

That would require an action on the router-side, not on the mapping one. I think it could be a good option, but this doesn't apply for this case.

Also, what if I agree to put my SSID into an open database but not in a locked one? Apple's and Google's location databases do not compare, for me, to Mozilla's: I am more than glad to be in the latter, but not to be in the former two.

therealunreal · on Oct 29, 2014

I used to use it a lot a few months back "mapping" my town and I had noticed that it would go at full strength even when I was staying in the same place for hours. Would it be possible to gradually slow down the collections when it detects that the user isn't moving?

zz1 · on Oct 29, 2014

There used to be a mode to geo-fence data collection, i.e. harvest only outside of a certain area, but I don't see it anymore.
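
For readers unfamiliar with the idea, "harvest only outside of a certain area" can be sketched roughly as below. This is not the removed Stumbler feature; the function names, the circular fence, and the parameters are all assumptions for illustration, using the standard haversine great-circle distance.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def should_collect(lat, lon, fence_lat, fence_lon, fence_radius_m):
    """Collect only when the current position lies outside the fenced circle."""
    return distance_m(lat, lon, fence_lat, fence_lon) > fence_radius_m
```

A user would set the fence centre to, say, their home and pick a radius; scans taken inside that circle would simply be skipped.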

crankycoder1975 · on Oct 29, 2014

geo-fencing was extremely rarely used. I think we had single digit numbers of people who enabled the feature and understood what it meant.

We decided to cut out geofencing because it was confusing to most people and reducing the number of knobs and dials on a program is usually a good thing.

We've got a bug filed to hook the accelerometers though: https://github.com/mozilla/MozStumbler/issues/1107
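
The gradual slow-down asked about upthread could be sketched as a simple backoff on the scan interval. This is only an illustration, not what the linked bug implements: the `is_moving` flag stands in for whatever the accelerometer reports, and the function name and default values are invented here.

```python
def next_scan_interval(current_s, is_moving, base_s=5.0, max_s=300.0):
    """Pick the delay before the next scan.

    While the device is moving, scan at the base rate. While it is
    stationary, double the delay each time, up to a ceiling.
    """
    if is_moving:
        return base_s
    return min(current_s * 2, max_s)
```

Calling this after every scan means a phone left on a desk quickly settles at the ceiling and returns to the base rate as soon as movement is detected.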

towelguy · on Oct 29, 2014

I see there is a leaderboard, so users have a nick, but can we contribute anonymously without signing up?

Also, why do you need access to my photos?

cpeterso · on Oct 29, 2014

Yes, anonymous is the default mode. :) The nickname is optional and only used to record leaderboard "points". The nickname is not tied to any reported Wi-Fi or location data.

Mozilla Stumbler does not need to access your photos; it just needs to read/write to your sdcard (to cache map tile graphics and export KML data). "Access your photos" is Google's unfortunate explanation of accessing the sdcard.

raesene4 · on Oct 29, 2014

Cool stuff, always interesting to see a new entrant to the location services field, especially an open one.

One question though. Will ordinary users be able to directly query the database via the API? i.e. if you want to geolocate a set of Access points and you have their MAC addresses will this service provide a direct supported API for retrieving their location?

The documentation on the API page implies that this isn't really a supported application?

Specifically

"If you are developing a native application or library for a desktop operating system or Android, you can in principle use this service via its HTTPS API. Please refer to the development documentation for the details. On most other operating systems, you cannot access the required cell and WiFi network information required to call the service API. "

and

"At this stage the service is open to anyone, who wants to contribute back to the service and applications supporting the Mozilla mission. "

seem a bit vague.

cpeterso · on Oct 29, 2014

btw, the Mozilla location database is currently used to power geolocation on the well-publicized Firefox OS "$25 smartphone"! That device does not have GPS hardware so all geolocation requests rely on Mozilla's location service and GeoIP.

yzzxy · on Oct 29, 2014

Any plans to gamify/add user incentives for "Stumbling?"

ThePinion · on Oct 29, 2014

This would be an interesting addition. I realized that Google is showing off Ingress as a game, but most technical users know it's about data-mining for maps and other location based products. Mozilla is openly showing this off as something that is about gathering data.

cpeterso · on Oct 29, 2014

We have leaderboards showing the top stumblers overall and for the last seven days. We have lots of gamification ideas inspired by Ingress or the Nintendo DS game "Treasure World", but nothing in the works yet.

https://location.services.mozilla.com/leaders

https://location.services.mozilla.com/leaders/weekly

https://en.wikipedia.org/wiki/Treasure_World

crankycoder1975 · on May 2, 2013

we need Louis CK here: "You're in a BROWSER!"

crankycoder1975 · on April 30, 2013

One of the driving motivations was simplicity for developers and getting a reasonable out-of-the-box experience.

This comes from a couple things.

Go compiles to a single static binary, so you don't have to worry about having dozens of "the right" libraries installed on your machine. Grab the heka binary and run with it.

This greatly eases our operations work as we have fewer dependency conflicts to deal with when we push things to production.

mapleoin · on April 30, 2013

That doesn't make a lot of sense. You don't have to write monitoring software from scratch just because you want statically compiled bundled libraries. You can do that with any programming language.

tptacek · on April 30, 2013

How do you run Python, Java, Perl, Ruby, or any JVM language without an installed runtime?

coldtea · on May 1, 2013

Quite easily. All offer options for building standalone programs that don't need a pre-installed runtime.

You just copy them to some directory, run them and they work.

And some of them even support building native binaries (e.g. Java through gcc).

tptacek · on May 1, 2013

Virtually nobody in practice uses any of these.† Java binaries are in practice JVM bytecode in classfiles. Python programs are run by the Python interpreter.

Go compiles to native code. Not only do you not need a preinstalled Go runtime on a target system, but there's very little advantage to even having one. The normal way of installing a Golang program is simply to copy the binary and run it. That's powerfully simpler than most other modern programming languages, with the obvious exception(s) of C/C++/ObjC.

† Commenter downthread says the same thing, but let me add that we look at other people's Python/Java/Ruby programs professionally, and I can't recall a single client ever doing anything like this.

coldtea · on May 1, 2013

>Virtually nobody in practice uses any of these.† Java binaries are in practice JVM bytecode in classfiles. Python programs are run by the Python interpreter.

The "virtually nobody" is because the main use cases for Python and Java are as server-side languages (both) and scripting languages (Python). In those cases people are expected to have or to set up the appropriate runtime beforehand.

But for people who want to ship apps to end users (customers and consumers) with Java and Python, the bundling thing is very very common.

People using them in the end user space, regularly do it this exact way. For most of them, you don't even get to know what they use underneath.

Some examples:

- Dropbox (uses and bundles Python in the app).

- Vuze torrent client (previously Azureus and very popular in its prime) bundles a JRE (for when you don't have an installed one).

- LightTable is just a JS runtime bundled with Webkit as a standalone app.

aphyr · on May 1, 2013

Huh. Most of the JVM shops I've worked at deploy apps as a monolithic fat .jar, built by their CI system, rather than trying to manage libraries on classpath.

mh- · on May 1, 2013

I think the GP comment was referring to the JVM 'runtime', which, to me, is less of a complication than the Python scenario.

On that note, Python packaging/deployment/repeatability is still a disaster. If you have code with dependencies on compiled C extensions, there are few good ways to deal with this in prod.

In summary, I think a lot of us find the idea of monolithic binaries appealing (perhaps even to an irrational degree, speaking for myself) because of issues suffered in the past. :)

lucian1900 · on May 1, 2013

With the gigantic disadvantage of security updates requiring recompiling everything :(

marshray · on May 1, 2013

I've never seen an organization that didn't do a full rebuild of every build product contained in each release anyway. Usually it's just faster and less error-prone to do a full rebuild than to recompile the minimal set of source files and relink.

lucian1900 · on May 1, 2013

If one uses dynamic linking, one can use (some) system-provided libraries, which will get security updates in the usual manner.

marshray · on May 1, 2013

It looks like Go supports dynamic linking to "system" libraries. At least on MS Windows this https://code.google.com/p/go/codesearch#go/src/cmd/dist/wind... call to FormatMessageW http://msdn.microsoft.com/en-us/library/windows/desktop/ms67... would be to an implementation in Kernel32.dll that would receive security updates.

On Linux there's a large gray area for things like libexpat.so.1 that may or may not be linked dynamically. But libc is LGPL, so I expect it too would be linked dynamically.

tptacek · on May 2, 2013

Yeah, true, each time that happens, it'll be 450ms of your life you'll never get back.

rdw · on April 30, 2013

Technically, you could use something like PyInstaller[1] to bundle the runtime and all libraries into an executable package. Practically, no one ever does that. :)

[1] http://www.pyinstaller.org/

dkhenry · on April 30, 2013

Interesting, so it's more of an ease-of-use than a performance issue? The numbers you quoted for performance seemed impressive.

crankycoder1975 · on April 30, 2013

We needed performance as well as simplicity.

We started by extending logstash, but our needs were more "we need a router" and logstash isn't meant to be a router.

Statically linking the world isn't trivial. For our existing Python code bases - how are you going to deal with third party libraries from PyPI?

Come by on #heka on irc.mozilla.org, we're kicking around in there.

mapleoin · on April 30, 2013

> Statically linking the world isn't trivial. For our existing Python code bases - how are you going to deal with third party libraries from PyPI?

Depends on what you want?

You could freeze the pip-requires to always install the same version and use a virtualenv per application. This is basically the same as bundling everything together, it has all the benefits with the least amount of work.

You could use distribution packages for security, correctness and stability or even roll your own repository inside your infrastructure to absolutely control everything.

Finally you could just bundle everything manually by fooling around with the PYTHONPATH and putting all the dependencies in a single directory. This is kind of like improvising your own virtualenv, it's very hacky, but it can work.
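
As a rough illustration of that last option, the same effect as setting PYTHONPATH can be had from inside the program by putting a bundled directory at the front of `sys.path`. The `vendor` directory name and the `add_vendor_dir` helper are assumptions for this sketch, not a convention from the thread.

```python
import os
import sys


def add_vendor_dir(app_root, dirname="vendor"):
    """Put a bundled-dependency directory at the front of the import path.

    Modules copied into <app_root>/<dirname> are then found before
    any system-wide installs of the same name.
    """
    vendor = os.path.join(os.path.abspath(app_root), dirname)
    if vendor not in sys.path:
        sys.path.insert(0, vendor)
    return vendor
```

An application would call `add_vendor_dir(os.path.dirname(__file__))` before importing its third-party dependencies. Note that this only covers pure-Python packages cleanly; compiled C extensions still have to match the target machine.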

nonsequitarian · on April 30, 2013

Another member of the Heka team here. Yes, there are a lot of options for managing Python deployments. But none of Python's stories are as nice as "Here's a single binary, put this on every machine."

jonastryggvi · on May 1, 2013

Exactly! I started playing with Go a few days ago, and immediately started thinking that this would be the perfect language to create something like LogStash, Flume or SplunkAgents in - and install to the machines that need to forward data to our centralised logging system.

It really bugs me that I have to have a Python interpreter on the frontend web machines (cause I would prefer not to have a C compiler there)..

mapleoin · on May 1, 2013

> It really bugs me that I have to have a Python interpreter on the frontend web machines (cause I would prefer not to have a C compiler there)..

You don't need to have a C compiler installed for the Python interpreter to work. I hope you're joking...