One good reason to use a list like this is to filter suggestions. If a user types 'cun' into a search box, almost all websites will not want to suggest you-know-what as a completion.
Edit: and alas, at Blekko they never let me ship an April Fools' "Did you mean: Mother trucking son of a blintz?" module. No sense of humor.
He's talking about text autocomplete, which is a suggestion mechanism. Autocorrect knows full well that "fucking" is a word, and doesn't put a squiggle under it or anything. But autocomplete (the thing that happens if you write "fuc" and then move the insertion cursor) will skip right past that word in the Autocorrect database and give you "duck" as the best fuzzy match.
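For what it's worth, the split described above (a word counts as correctly spelled but is never offered as a suggestion) can be sketched in a few lines. This is only an illustration, not how any real keyboard works: the word sets, the `is_misspelled` and `suggest` names, and the use of Python's `difflib` for fuzzy matching are all assumptions, and the blocked word is a mild stand-in.

```python
import difflib

# Hypothetical data: a tiny dictionary plus a set of words never to suggest.
DICTIONARY = {"duck", "dock", "darn", "damn", "dam", "damp"}
NEVER_SUGGEST = {"damn"}


def is_misspelled(word):
    """Spell check: blocked words are still real words, so no squiggle."""
    return word.lower() not in DICTIONARY


def suggest(fragment, limit=3):
    """Suggestions: fuzzy-match against the dictionary minus the blocked words."""
    allowed = sorted(DICTIONARY - NEVER_SUGGEST)
    return difflib.get_close_matches(fragment.lower(), allowed, n=limit, cutoff=0.5)


if __name__ == "__main__":
    print(is_misspelled("damn"))  # the blocked word is still a known word
    print(suggest("dam"))         # but it is never offered as a completion
```

The point of keeping two separate sets is that typing the blocked word yourself is never "corrected", while the suggestion side quietly acts as if the word did not exist.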
So true. People will always find a way to talk euphemistically. And on the other end of the spectrum, you can't really filter out borderline-offensive phrases.
Reminds me of a system I worked on where we generated temporary passwords for tens of thousands of invitations to a big event. Somebody tried to be clever and generate "user-friendly" passwords by combining words from a public 1st grade vocab list. It seemed like overkill for a temporary password, which had to be changed upon initial login, but whatevs, we had extra time on the project. It looked good in testing.
Within hours of sending out the first batch of invitations, we started getting complaints of people "not comfortable" with their passwords. I don't remember all of them, but some great examples were things like "donkeybanana1014", "drunkgod9488", "devilboy4593". It wasn't a huge PR problem, mostly just caused some laughs and a little support scrambling, but I filed it away as yet another example of someone getting burnt by trying to be clever.
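A rough sketch of the word-plus-word-plus-digits scheme described above, with an avoid list bolted on. Everything here is hypothetical (the `VOCAB` and `AVOID` lists, the `temp_password` name, the four-digit suffix format inferred from the examples); it is not the code from that project.

```python
import secrets

# Hypothetical word lists; a real system would need much larger, human-reviewed ones.
VOCAB = ["donkey", "banana", "river", "drunk", "god", "devil", "boy", "green", "table"]
AVOID = {"drunk", "god", "devil"}  # words someone might object to in a password


def temp_password(vocab=VOCAB, avoid=AVOID):
    """Build word + word + 4 digits, using only words that are not on the avoid list."""
    safe = [w for w in vocab if w not in avoid]
    first = secrets.choice(safe)
    second = secrets.choice(safe)
    digits = secrets.randbelow(10000)
    return f"{first}{second}{digits:04d}"


if __name__ == "__main__":
    print(temp_password())
```

Note that a per-word filter like this would still happily produce "donkeybanana": the individual words are innocent and only the pairing raises eyebrows. That is a decent argument for random characters over clever word combinations.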
Tangential but related: Toontown Online had two communication systems: "speed chat" and "secret friends". Speed chat is basically safe, prefabricated messages. Secret friends is unfiltered text chat, but only between people who have swapped friend codes. Since there was no way to swap friend codes inside the game, they were trying to make sure you only connected with people you knew.
However, players figured out a language to encode friend codes using speed chat phrases. So you would send a series of speed chat messages, the other player would send some back, and then you'd have each other's friend codes.
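The trick above works because any fixed menu of messages can carry arbitrary data once both sides agree on a mapping. A minimal sketch, assuming friend codes are strings of digits and using ten made-up phrases (these are not the actual Toontown speed chat entries, and the real player scheme may have looked quite different):

```python
# Hypothetical: ten prefabricated phrases, one per decimal digit.
PHRASES = [
    "Hello!", "Good game!", "Follow me.", "Let's go!", "Thanks!",
    "Sorry!", "Nice hat.", "Wait here.", "See you later!", "Bye!",
]


def encode(code):
    """Turn a string of digits into the list of phrases to send, in order."""
    return [PHRASES[int(digit)] for digit in code]


def decode(messages):
    """Recover the digits from the received phrases."""
    return "".join(str(PHRASES.index(message)) for message in messages)


if __name__ == "__main__":
    sent = encode("4093")
    print(sent)          # ['Thanks!', 'Hello!', 'Bye!', "Let's go!"]
    print(decode(sent))  # 4093
```

No word filter can catch this, since every message sent is on the approved list.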
Made me snigger. Incidentally, you can't use that word on many forums: snigger.
I live near a town called Scunthorpe. Scunthorpe residents tend to use the nickname Scunny when they're on-line, because the full name gets blocked so often.
I made a reasonable whack at solving the Scunthorpe problem by building a whitelist of terms containing those 4 letters, and a similar list for the Hiroshita problem. Alas, now IBM owns all that code. As I build a search engine with autocomplete for the Wayback Machine, I'm really missing having it.
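For anyone wanting to try the whitelist approach, here is a minimal sketch of the general idea, not the code mentioned above. The `BLOCKED` and `ALLOWED` sets and the `flagged_words` name are assumptions, and a mild three-letter term stands in for the four letters in question.

```python
import re

# Hypothetical lists: a mild stand-in blocked term, plus innocent words containing it.
BLOCKED = {"ass"}
ALLOWED = {"class", "passport", "assume", "bass"}


def flagged_words(text):
    """Return the words that contain a blocked term and are not on the whitelist."""
    flagged = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in ALLOWED:
            continue  # known-innocent word, skip the substring check
        if any(term in word for term in BLOCKED):
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    print(flagged_words("Bring your passport to class"))  # []
    print(flagged_words("a massive problem"))             # ['massive']
```

The second call shows the catch: the whitelist is only as good as its coverage, so every innocent word you forgot still gets flagged.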
Yes. I used to frequent a message board that implemented a really dumb profanity filter. It got to the point that its users would greet one another in face-to-face meetings with "Hecko!"
Today I learned that Herbie Hancock is a jazz musician who made an album called Mr Hands. I also learned that there is a video of a man being buggered by a horse called Mr Hands.
Thank you for enriching my life.
> - a James Bond film
And at this point I lose all respect for the compiler of this list. Octopussy is a James Bond film. That is all. Any other usages are just silly.
> - the surname of a US presidential candidate for the 2012 republican nomination
Santorum! Santorum! Do people actually use that word, or is it just a running joke?
> - the surname of a US presidential candidate for the 2012 republican nomination
That one is a bit of a special case. His name is a swear word because it's his name. As a protest against him, people decided to make his name into a dirty word, and seem to have succeeded.
"Huge tits" but not "huge melons", "nigga" but not "niggaz"? Who wrote this thing? I think this is about 0.1% of the "naughty" words out there, and it's futile anyway (former school sys admin here, I know what I'm talking about ;) This is before we get onto the desirability of blindly blocking words like "nigger" which have different meanings depending on who is using them, ref. "my nigger", or "tits", ref. "blue tits are eating the nuts again".
Well it's still a very basic effort. If the job it's trying to do is annoy users, it's fine, but if it's trying to limit use of bad language, it mostly fails.
There are so many words here about sex, but none that I can see about violence. Sometimes I am so puzzled by American culture (assuming this list was compiled by an American).
My favorite story on this: back in the BBS days, one board would change the f word to "gently caressing", and it really changed the tone of heated arguments.
Why was the swastika included in the Japanese list? While it has pretty bad connotations, particularly in the west, in Japan it retains a pre-World War II meaning of holy and sacred.
Protip: if your list of words doesn't include Carlin's Seven, you're not trying hard enough.
Seriously: "splooge moose" gets an entry but "cocksucker" doesn't? Even Urban Dictionary, usually a canonical source for profane euphemisms, doesn't have a definition for that first one...
I wouldn't blindly use this as a blacklist for spam etc. Lots of words here, like "vagina", might be OK on a medical site, and even some racial slurs might be OK if a news agency is reporting a quote.
Definitely a good list to use as a starting point.
I'm not sure, but the inclusion of "vagina" would not seem to indicate it, as "penis" is included in the list as well. This seems to be a list of sex related words, without additional considerations.
Escort really shouldn't be on there, nor "jelly donut," and words like "hardcore" and "neonazi" are really quite questionable. And yet it's missing words like condom, scrotum, and labia.
Yeah, serious question: Is this really not safe for work?
Are there workplaces where you'd be reprimanded for looking at a list of vulgar words, especially in this context? (More than you would for any other "unproductive" activity, like say, reading HN.) And if it's just about getting flagged by some monitoring software, I'd think this thread would be just about as likely to get you in trouble.
I get that NSFW is inherently subjective and context-dependent, but for me personally this is not NSFW, i.e., totally SFW.
“I want to stick my long-necked Giraffe up your fluffy white bunny.”
http://habitatchronicles.com/2007/03/the-untold-history-of-t...