Hacker News

Michael Tremante here. I'd like to address some points openly as I'm personally mentioned in the forum. I reached out to the Pale Moon community on behalf of the team to try and resolve the issue with the Pale Moon browser.

- We sent our standard NDA to speed things up. I explicitly said in the message that it may not be required, but in the interest of moving fast we sent it to them so they could review it just in case.

- We are committed to making our challenge system work on all browsers by clearly documenting which APIs need to be supported. For example, part of the issue with Pale Moon is that it does not correctly support Content Security Policy (CSP).

- Notwithstanding the above, to resolve the issue quickly we are willing to lower some of our checks, if and only if we find the right approach. Of course, this would introduce some security issues that bot developers may quickly leverage.

- Contrary to what many have said in this forum, our challenge has no logic that relies on user agent strings. We rely on browser APIs. We don't have any special checks for any specific browser.

- To address this longer term, we are discussing internally a program for browser developers to have a direct channel with our team and we hope to have something to share soon with the browser developer community

I am happy to answer any constructive questions.



Just because some evil is standard policy does not mean it's excused. Sending a broad NDA just to address a problem with Cloudflare itself is Cloudflare throwing its weight around again, à la:

"I woke up this morning in a bad mood and decided to kick them off the Internet. … It was a decision I could make because I’m the CEO of a major Internet infrastructure company. ... Literally, I woke up in a bad mood and decided someone shouldn’t be allowed on the Internet. No one should have that power." - Cloudflare CEO Matthew Prince

Requiring every web browser to support every bleeding-edge feature to be allowed to access websites is not the status quo of how the web has worked for its entire existence. Promoting this radical ideology as the status quo is also seemingly shady, but perhaps the corporate rep above has been in so deep for so long that they've forgotten they're underwater. Corporate use cases are not the entire web's use cases. And a monopoly like Cloudflare has to take such things into consideration.

But they keep forgetting. And they keep hurting people. The simple solution is for Cloudflare to make its defaults much less dependent on bleeding-edge features for the captchas. If sites need those extra levels of insulation from the bandwidth/CPU time required to fulfill HTTP requests, it should be opt-in, not opt-out.

The solution for the rest of us humans who can no longer read bills on congress.gov or play the nationstates.net game we've been playing for the last 20 years is to contact the site owners when we get blocked by Cloudflare and hopefully have them add a whitelist entry manually. It's important to show them, through tedious whitelist maintenance, that Cloudflare is no longer doing its job.


There is no intent on our part to throw our weight around. The team is challenged with the very hard task of balancing protecting web assets versus ensuring that those same assets remain accessible to everyone. It's not an easy problem.

The features you refer to are not bleeding edge, and moreover, they are security features. We are still discussing internally, but I hope we can publish the details soon so that point can be addressed.

Last but not least, this only affects our challenge system, which is never issued by us as a blanket action across Internet traffic. It's normally a configuration a Cloudflare user implements in response to an ongoing issue they have (like a bot problem). We do report challenge pass rates and error rates, but we can certainly always improve that feedback loop.


If you can't see how Cloudflare is throwing its weight around, I can only assume the traditional Upton Sinclair quote applies.

The vast majority of sites operate without a CSP (only 7% of Alexa's top 1 million sites had a valid CSP circa 2020, and in the long tail it's much, much less). It's a niche thing, and the kind of reliance Cloudflare places on it can be considered bleeding edge in practice compared to the rest of the web. For most sites on the web, CSP is more of a burden than a benefit.
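For reference, CSP is delivered as an ordinary HTTP response header; an illustrative policy (not one specific to Cloudflare) looks like this:

```http
Content-Security-Policy: default-src 'self'; script-src 'self'; frame-ancestors 'none'
```

A browser that "supports CSP correctly" has to parse directives like these and refuse to load scripts or frames the policy doesn't allow, which is the behaviour at issue with Pale Moon above.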

The crashing and freezing of many browsers only affects your challenge system. Your blocking that's impossible to pass with many browsers is either the default or so commonly set that it makes no difference. You should try using a non-Chrome/non-Safari/non-Edge/non-Firefox browser through a non-residential IP sometime and see how many places you can no longer access because of your employer.


"Contrary to what many have said in this forum, our challenge has no logic that relies on the user agent strings."

If that were true, then it would be possible to satisfy the challenge without sending a User-Agent header. But omitting this header results in blocking. Perhaps the user agent string is being collected for other commercial purposes, e.g., as part of a "fingerprint" used to support a CDN/cybersecurity services business.


We expect the user agent string to be present, yes. We don't have any logic based on its contents though (except blocking known bad ones), and we don't have any exceptions for the major browsers.

No commercial uses around this.


> We don't have any logic based on its contents

> blocking known bad ones

These contradict. Blocking "bad ones" is logic. Also, such claims are disingenuous without defining what "bad ones" are... For all I know (and it surely seems so), you could be defining "bad ones" as "anything that is not the latest Chrome without an adblocker and with JavaScript on".


That's what the word "except", which you quoted, means.


Yes, and I’m pointing out that phrasing it that way makes the whole statement meaningless. E.g.: I don’t eat foods, except some that I consider edible. I don’t kill kittens, except those I think are evil. See how it works? Adding a vague “except” to an absolute-sounding sentence destroys its very meaning.


Those are different products. BIC prevents requests such as those with empty UAs or corrupted HTTP requests from passing CF without a challenge.

Turnstile/Challenges per se don't rely on the UA at all.


According to a company representative, CF requires a UA header, checks the contents of the UA header and blocks access to websites based on strings contained in the UA header that match "known bad ones" as part of its commercial "bot protection" services.

None of this implies that using a string that is a "known good one" is enough to satisfy the CF challenge. But CF still requires people to send CF a UA string. Right.
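Taken at face value, the UA handling described in this thread reduces to something like the following sketch (hypothetical code, not Cloudflare's actual implementation; the blocklist entries are made-up examples):

```python
# Hypothetical sketch of the UA handling described above, not Cloudflare's code.
# The "known bad" UA substrings here are made-up examples.
KNOWN_BAD = ("sqlmap", "nikto", "masscan")

def allow_request(headers: dict) -> bool:
    """Return True if the request passes the UA check."""
    ua = headers.get("User-Agent")
    if not ua:
        # A missing User-Agent header is blocked outright.
        return False
    # "Blocking known bad ones": any blocklisted substring blocks the request.
    return not any(bad in ua.lower() for bad in KNOWN_BAD)
```

Even in this minimal form, requiring the header and matching its contents against a blocklist is logic based on the user agent string, which is the point being made here.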

It seems that CF wants to mandate exclusive use of certain clients to access the web, "as a service", presumably ones that are preferred by so-called "tech" companies that sell advertising services.

Imagine if this type of restriction was imposed on CF itself and some third party blocked CF's access to the www unless CF used the software chosen by the third party or the third party's clients.

The www is supposed to be a means of accessing public information. From what I've seen, many if not most of the websites blocked by CF "bot protection" are in fact composed of public information.


The purpose of a system is what it does.

We speak of an arms race between Cloudflare and bad actors that results in unintended consequences for end users and independent browsers ... and we need to stop.

There is an arms race: between end users and cloudflare.

The fact that a human chimes in on an HN discussion carries no information.


We continuously scrape a sizable number of ecommerce sites and have had no trouble whatsoever bypassing Cloudflare's anti-bot technologies.

Cloudflare representatives often defend user-hostile behaviour with the justification that it is necessary to stop bad actors, but considering how ineffective Cloudflare is at that goal in practice, it seems like security theatre.


I disagree.

We’ve worked across a number of equivalent anti-bot technologies, and Cloudflare _is_ the AWS of 2016. Kasada and Akamai are great alternatives and are certainly more suitable for some organisations and industries - but by and large, Cloudflare is the most effective option for the majority of organisations.

That being said, this is a rapidly changing field. In my opinion, regardless of where you stand as a business, ensure an abstraction layer over each of these providers where possible - onboarding and migrating between them should be table stakes for any project or business adopting them.

As we’ve seen over the last 3 years, platform providers are turning the revenue dial up on their existing clientele.


Its success as a business aside, at a technical level neither Cloudflare nor its competitors provide any real protection against large-scale scraping.

Bypassing it is quite straightforward for most software engineers of average competency.

I'm not saying that Cloudflare is any better or worse at this than Akamai, Imperva, etc. I'm saying that in practice none of these companies provide an effective anti-bot tool, and as far as I can tell, as someone who does a lot of scraping, the entire anti-bot industry is selling a product that simply doesn't work.


In practice they only lock out "good" bots. "Bad" bots have their residential proxy botnets and run real browsers in virtual machines, so there's not much of a signature.

This often suits businesses just fine, since "good" bots are often the ones they want to block. A bot that would transcribe comments from your website to RSS, for example, reduces the ad revenue on your website, so it's bad. But the spammer is posting more comments and they look like legit page views, so you get more ad revenue.


I don't believe that distinction really exists anymore.

These days everyone is using real browsers and residential/mobile proxies, regardless of whether they are a spammer, a Fortune 500, a retailer doing price comparison, or an AI company looking for training data.


Random hackers making a website-to-RSS bridge aren't using residential/mobile proxies and real browsers in virtual machines. They're doing the simplest thing that works, which is curl, then getting frustrated and quitting.

Spammers are doing those things because they get paid to make the spam work.



