If they were scraping the information/pictures off the supermarket sites, then I'd have expected the cease and desist to come from the supermarkets.
Given that the letter came from "a company" then I presume they were taking their information from an aggregator - and seems entirely fair that you should "pay the aggregator" (as there's clearly a company out there doing something similar to what they're doing - but they were taking their data).
If this company was just a data-hose, then I'd have thought you could monetize your consumer-focussed product, simply by flogging anonymized data from your users back to the supermarkets.
"Customer X, dropped product Y from their weekly basket (or swapped supermarket), when you raised the price of Z (or other supermarket dropped it)"
> ...then I presume they were taking their information from an aggregator - and seems entirely fair that you should "pay the aggregator"
This seems to be what they are doing. From their FAQ:
"Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask. We appreciate that the data company is absolutely entitled to request compensation for their work."
You can see that the aggregator has said that Trolley scraped the images and data in question from the supermarket sites.
The implication seemingly being that the supermarket sites are licensing the images and prices of their own products from this third party (potentially only at a peppercorn price).
But this is only implied and not said, which in legal documents is an important distinction. I would suspect the supermarkets license the info on their sites to this aggregator, who seems put out somebody scraped the same information and is not licensing it from them instead, and is implying trolley needs to get a license, while in actuality perhaps they don’t need to.
I think that your interpretation is slightly off and Nielsen might have actual teeth here (again to remind everyone this is UK, in the US Nielsen would be in the wrong). The supermarket owns the database rights to their online store, which it then exclusively licensed (as far as I know) to Nielsen for re-use of third parties. Regardless of your emotions here (and personally I'm disappointed that this is happening), this is UK law and Nielsen and the supermarkets are correct here. If Trolley instead walked physically in stores and then manually collect the data and bought products to shoot photos of* then they have teeth as they have exerted effort into independently create their own database.
* This is now plain copyright law which raises questions, like if the manufacturer-in-question can stop Trolley from publishing those photos (while Trolley owns the copyright on the photo itself, there are de minimis considerations when it comes to the packaging design which is copyrighted by the manufacturer). It might pass fair dealing though.
Then the onus is still on Nielsen to demonstrate an exclusive license. Otherwise they have no standing when trolley scrapes the sites directly, because only the original copyright holder (the stores) may enforce that.
The pictures are the real red flag. Unless a photograph is bundled with a cc license one always needs to assume somebody will need to grant permission for its reuse, and often that will be for compensation. Trolley should have known this.
I very much doubt the price of the product is coming from this company, everything else sure, trolley.co.uk should explicitly ask them if they data aggregator provides the price to the website.
It doesn't matter if they also store that information, (might be scraping that themselves), if they don't provide that to the website, its not infringing their database rights, it is infringing the supermarkets.
Yes - the feed likely consists of things like the canonical product name, description, weight/size, recommended price, pack shots, and nutritional info.
Trolley.co.uk might be able sidestep this demand by displaying ONLY the product names and prices, but it'd be a much less visually-appealing site as a result.
IANAL but given the unreasonableness of the premise as I read it (? scraped data from public websites of multiple companies is "owned" by some mysterious third-party) this sounds more like a consequence of the UK's legal system, where minnows have to shut down when challenged by sharks, because the costs make it impossible to resist, regardless of the strength of the case.
Also, its all a bit vague, especially the reason for the donations - to pay these jokers off? That just enables and encourages the bullying. They should provide more transparency as to who is claiming ownership, and what evidence is being given and legal arguments for paying.
You generally don't own the copyright to images that you scrape from the web, even public websites. A number of companies handle consumer product information internationally (such as GS1 for barcodes) and shops license product information (images, EAN Codes, nutritional information) from them, so they don't have to deal with each manufacturer individually. The manufacturers also work with these organizations.
If they went into supermarkets and took pictures of all the products it would be another story, but as is, they are using pictures on the internet in a way that they aren't licensed to. This is not particular to the UK, just because some site display a picture publically does not mean you are allowed to display the same picture on your site.
The cease and desist letter stated: "proprietary images and data", I doubt a commercial/sponsored database of prices exists. That would have grave anti-cartell consequences. But you're right: for most reasonable people reading the article, the _only_ relevant data on trolley.co.uk _would_ be the price information. (And I don't have any insider knowledge, I was only guessing ...)
For the sake of 30k I wouldn't be surprised if Asda/Lidl/Tescos step in with the cash for a PR piece. Along the lines of "Being so confident they're the cheapest supermarket, we'll back a price comparison site" and helping 500,000 people to save some money.
I'd be bloody surprised as well, Tesco's even charge some of their suppliers to access an online portal to upload supplier invoices! They had two portals for uploading invoices, one that was easier to use but cost, the other was a web browser interface that was clunky. I think one of them was called Marrakesh or something like that.
Having been part of their new store opening team in the past and the way they undercut all other rivals in the area to get people in, which typically runs for a few weeks, depots drop everything to make sure everything is in stock at the expensive of other stores, even running dedicated lorries to the store which is costly but that came out of the Depot Transport budgets, and then Transport would argue with Warehouse if the Warehouse had failed to load cages properly because Wave1/Wave2 picks never go smoothly because goods-in is never to schedule as no one can predict holdups on the road networks, it really is a cut throat business in so many ways. I'm surprised at how many companies put up with Tesco's demands, but thats what you need to do to become a stock market leader! Likewise the public are really just price sensitive before brand loyalty. Life in the UK is very much, you get what you are given, and I've never worked for an entity that didnt have criminal elements either, which is shockingly two faced of the public, but people keep their mouth shut to keep a wage coming in and every community needs an Emmanual Goldstein figure. Massacres are a community's dirty little secret. So when is a simple reminder about general jobs losses, maybe a smattering of innuendo, not a veiled form of intimidation and harassment? Story telling can be so triggering.
Seems very off and unsustainable (e.g. what happens in 2023 is unclear), along with hiding the licensor's name - that seems unreasonable. Who is the end recipient of this donation? I don't buy the story.
Setup a shell company in Bermuda that does the scrapping for you and then claim you buy the data from them. When they approach you just tell them you have a NDA and can't disclose the company. That'll keep them at bay for awhile... then when you're forced to disclose, they'll spend the rest of their time chasing a shell company in Bermuda.
OR.....
Just create a firefox extension that scrapes the pages for your users.
There's many ways to keep this going while giving them the middle finger and serving your users.
Also, laws only matter when the outcome doesn't to the people in charge. Just because "the law is on your side" doesn't mean you'll win even if you could afford it, because laws don't magically rule on their own. It is judges, with all their pettiness and cravenness and nepotism, which rule.
If your data infringes copyright they will go after you not your fictitious shell company because it doesn't matter where you got the copyright infringing material it matters whether you illegally distributed it.
While pictures would be under copyright (and it's possible copyright is with a third party specialised in this), I can't see how the price listed for a product on, say, Tesco's website and collected by yourself on Tesco's website could be subject to licensing by a third party, even considering database rights.
I am also puzzled by those guys claim that they got a very "generous offer" from that company.
If they are unsure a better first step would be to crowdfund legal advice.
They could then crowdsource taking pictures of all the products from their users.
Unless there is a third party which is taking photos of all grocery products and then licensing that to the supermarkets, to avoid them all having to take their own photos?
I used to work for one of the major supermarkets and am aware that there are providers of product images, but don't know the specifics in enough detail.
Taking a photo only gives you copyright over that photo, it doesn’t stop other people taking a similar photo.
A lot of business try and claim copyright of a subject (like a tourist attraction) and try and prevent photography of it, but that’s legal BS in any jurisdiction I’ve heard of.
No, I mean they take the actual photographs and then license them to the retailers. Plus they can also capture other product data in a consistent format etc.
Right, but if you're making a price-comparison website with scraped data, you can't scrape the product image from a supermarket website legally.
You could of course get images from some other source - but you need meticulous organisation when a single brand of tea might have 40, 80, 160, 240 and 600 bag packages.
Well it seems they're not scraping Tesco's website but some aggregator middleman. Like making a package tracking website using Parcels instead of FedEx, UPS, etc individually. Sure it's convenient to write code that only scrapes one website, but now that website is mad because it has a business selling that data.
It looks like a startup that's hit a brick wall viz a viz licensing pictures.
One way of getting around this, albeit perhaps at the expense of user experience, is to insert stock photos. E.g. I imagine 'carrots' is not particularly unique. Where stock photos can't be used, i.e. it's a specialist/one-off product, perhaps the classic white-box-black-text approach?
If this is a genuine site with good intentions, staffed by volunteers then it would be a shame to see it stamped down by the supermarkets.
They've got this data from somewhere, so why wouldn't they have looked through the license before using it?
Anyway, speaking of MySupermarket, what happened to them? All I know was one day they decided to shut down, without clear reasons. Does anyone have more insight into why it happened?
Yeah, I'm scratching my head at this. On the one hand, the UK does have some weird database trolls (a copyright troll called FootballDataCo claimed licence fees from anybody publishing football fixture lists irrespective of where they sourced the information from, and probably can do again since the EU court judgement against them presumably no longer applies)
On the other hand it makes no sense to accompany a supposed order to c&d screen scraping of multiple (independently maintained) websites with thanks for their generosity and "this company is absolutely entitled to request compensation for their work". Either they're a valuable data provider or an licensing obstacle to using perfectly adequate screen scraping techniques, not both. Not sure why a price comparison website would screen scrape the supermarkets with APIs either...
They're not "independently maintained" websites, they all license their data from the same 3rd party data company and re-publish it on their own websites.
You can't copyright a list of facts. I'm confused about what they have run in to? Maybe they should take some of the money they have raised and talk to a lawyer.
The cease & desist mentions Asda [1] and [2] says "Asda chose NielsenIQ Brandbank, their existing digital content provider for 13 years"
So I'd wager the C&D came from Brandbank - who presumably supply product photos and product data (barcode, pack size, and all the other data like nutritional information you'd find on the packet if you were browsing in store)
Nobody said it was copyright specifically. However:
"A database right is a sui generis property right, comparable to but distinct from copyright, that exists to recognise the investment that is made in compiling a database, even when this does not involve the "creative" aspect that is reflected by copyright.[1] Such rights are often referred to in the plural: database rights."
Yes but it doesn't prevent you to build your the exact same copy as long as you don't use the other DB as a source. Scraping the info yourself is perfectly fine. That's also the reason why people add "mistakes" to their DB as a kind of watermark.
But they are (unwittingly) using the other DB as a source. The data they're scraping turns out not to have been created by the supermarkets in question but sourced from a third party database and published under license on the supermarket websites.
Which raises a very interesting legal question: If the data is recompiled, despite having come from another source via an intermediary (who are using that data to describe the products they are selling), is it still subject to the database rights?
How about we try out a reductio ad absurdam: I'm going to start a startup. Our business model is to license the information on customers' websites, and then license it back to them in turn. This then means that anyone who uses our services in the UK can sue for copyright on lists of facts obtained by screen scrapers, as they've been through our database copyright washer. Think flight prices, hotels, cars, anything.
You absolutely can copyright lists of facts in most jurisdictions outside the US
In the US it is pretty hard[1], but most other jusdictions find that if the list has some "creative" or "work" element (ie, it isn't just a list of everything) then it can be under copyright (eg [2]).
Different situation. You don't buy a terminal, you license it. The license can dictate what you're allowed to do with the data. Rebroadcast the data and you may be just fine on copyright or any other IP, but you violate the license so Bloomberg cancels it.
Delayed prices for publicly traded stocks are generally available freely. Real time prices and order book though are not, not to mention prices for anything traded OTC like many bonds, swaps, etc.
Yeah, things may differ b/w countries. But the Wikipedia article you linked mentions this:
> In regard to collections of facts, O'Connor wrote that copyright can apply only to the creative aspects of collection: the creative choice of what data to include or exclude, the order and style in which the information is presented, etc.—not to the information itself.
So, a list of facts still can be creative and valuable to the society. INAL, but such works should be protected by copyright, or what's the point of copyright in the first place.
Copyright is not like patents. If you arrive at the same result, by a different method, then even if the result is identical, there is no copyright violation.
So "the creative choice of what data to include or exclude, the order and style in which the information is presented, etc." is very weak. You can copy all the data, and then make your own choice on what to include (for example "everything I can find"), and you would not be violating the copyright.
The quote you gave is for things like making artwork based on certain patterns or letters in the data. But it won't help you if your creative choice is "everything" because there is no creativity there.
> So, a list of facts still can be creative and valuable to the society
A list of facts is not creative, it's simply work. And yes, it might be valuable, but that's not how we decide if something is copyrightable.
> or what's the point of copyright in the first place.
What that means is that if you write a list of e.g. “Top 10 tennis players in my opinion”, then you can hold copyright over that. But if you write “Top 10 tennis players by rating” then you can’t as that’s not a creative expression.
I don't think you can own rights to something other people collected and curated though. Though you can own rights to the individual components, but in this case that would be the specific price of something, which doesn't seem like it should be subject to copyright.
The images are obviously going to be copyrighted, I suspect the product information is also likely sourced from third party, and thus you would be infringing database rights by scraping.
I find it very unlikely that the price of the products is provided by thirdparty data provider, and think the actual price is not covered by them despite them suggesting it might be.
> The images are obviously going to be copyrighted
Copyrights are inadvertent and sometimes unintuitive. If the images are created by the product manufacturers, then they hold the copyrights. If the supermarkets edit the images, then they hold copyrights to these edits.
Likewise, if trolley edits the images (undoing the supermarket edits, like cropping around some text) then trolley together with the manufacturer holds the copyright for this new image, not the supermarket.
It doesn't matter where it was scraped from or who was hosting the image, just as it doesn't matter which camera was used.
Yes, if they scrape prices from supermarkets' websites then prices come from supermarkets, they are the ones setting those prices after all.
Really, they don't need product data beyond the name. They could drop pictures and detailed data and build their own database by crowdsourcing pictures and data from their users.
Only relevant if they are actually duplicating the "data company's" database; which is not what web-scraping is.
For example; creating a map from air photographs is not "duplicating" another geo coded database. The geo data exists independently of any map.
"Database rights" are not the same as "data rights", and if your "database" is inadvertently and routinely recreated as by product of other processes, then "database rights" can not apply. Otherwise we are in absurd world.
(In this absurd world, a phone book company could charge license fees to everyone with a phone because their phone books would necessarily contain a subset of the global phone book.)
The UK has FootballDataCo which historically "owned" the fact a football match between two clubs will be held on a particular date, irrespective of how the publisher found out that fact. So UK law has not alway been sensible about this.
The EU Court of Justice threw that out, but we're not in the EU any more...
This use case seems very similar to a search index (aka google) type use case. Crawling and indexing vs scraping and storing, and then publishing in a searchable format for end users to utilise? Surely there's plenty of precedent for this?
It is, but arguably it's closer to the Google News model (in that more than just the titles are shared) which recently saw some pushback in the EU [1]
In general, if the data owner says you must take it down, you rarely have a leg to stand on unless you can claim some kind of protected use (e.g. reporting or criticism)
The title is misleading. Never do they mention the price data. After doing some research it appears that what is really the cause for concern are the product images.
The "data" they mention are product descriptions. The issuer of the cease and desist appears to be NielsenIQ. And they don't provide price data.
Their main product is to provide standardised product images with a white/transparent background, and small product descriptions. Their customers being big E-Commerce stores.
As others have pointed out, it seems sketchy as hell, the desperate last-minute plea for a considerable sum of cash, the weird reasoning about data gotten through a web scraper being somehow subject to fees, and especially in how they want to "protect" the name of the data company.
Regardless, if it was legit then I'd strongly agree with the suggestion of setting up a shell company to not only avoid litigation but to set a trend in teaching these bully companies how the internet works since it seems a few still have to be taught. Anyone using a c&d against a smaller company for data that can be gotten through a web scraper really should fuck off. The UK does seem to be the place where this kind of shitty IP litigation takes place more than is typical.
From their FAQ I think these are the key bits of information:
>What does the data company do?
>The data company digitises and maintains the product information for 98% of grocery products in the UK. They’ve generously worked with us to bring this cost as low as possible.
>Why are you not giving the name of the data company?
>Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask. We appreciate that the data company is absolutely entitled to request compensation for their work. We know that providing the name of the data company could bring negative implications for them which we wouldn’t want - especially with how generous they’ve already been.
A lot of E-commerce software have data feeds that are ingested by parties like pricespy. These feeds are typically either public or made available after mutual agreements. Scraping a large amount of sites for data is just too much labor in the long run.
My guess is that they were circumventing free-tier API limits for the data-aggregation company. It’s not like this data aggregation is one big dump that they downloaded once and forgot the where it came from. Prices change daily. They’d need to be refreshing via an API (data feed) at least daily.
Trolley scraped the supermarket websites, and the supermarkets got their data (product images, nutrition info, etc.) from Brandbank (the data aggregation company).
> Maintaining this data for over 200,000 products is a costly endeavour and asking them to offer this for free would be a huge ask.
No. Whoever (as someone pointed out, probably Brandbank, they supply most of the supermarkets with data so if it isn't them, then it's likely a scam) has already received full pay for the data they've supplied - if they're charging you again for the data, that's just greed (sorry, I mean "perfectly valid capitalism") - even at 14p/entry or whatever they're charging.
Some people here have mentioned that facts can't be copyrighted - it reminds me of a problem in 2011 that the people distributing timezone data to a bunch of open source projects were sued because a 3rd party software house "owned" the rights to that data[1]. I don't know where that landed (timezones still work in linux, so I think they found a workaround) but most projects like this capitulate rather than risk the cost of court.
But yes, the product descriptions and images may be copyrightable - certainly, Brandbank do all their imagery in-house, and any typos in the descriptions will be owned by them, because some person being paid peanuts - likely somewhere in the Philippines - is having to sit down with a photo of the can of Campbells soup, reading and re-typing the descriptions word for word.
It seems to be related to the product images, or the images were easiest to identify and fight over https://www.trolley.co.uk/imgs/cease-and-desist-letter.png
While a product name, price could be public (a person can see it in a store), the picture is very specific and trolley didn't take the pictures.
It seems they could also just cease use of the 'offending' informationa nd use generic imagery or descriptions instead. For bandwidth/processing reasons, I'd prefer as much text and as few graphics as possible anyway.
[1] is a report of Tesco reducing the range of distinct products they stock from ~90,000 to ~63,000. Some of which will be seasonal items like easter eggs. Some others will only be available in certain stores.
I suspect taking 63,000 product photos, and keeping them all matched up to the right products is probably more than a week's work.
They could but at this point the (claimed) damage is already done.
The price is high but I'd imagine that company guarantees high resolution professional photos of every product. Chasing the last remaining 10% products might be hard. At 160.000 products, let's say spending 1 minute per product (finding it, post production etc) at 12h/day it takes one person 9 months.
If they were scraping the information/pictures off the supermarket sites, then I'd have expected the cease and desist to come from the supermarkets.
Given that the letter came from "a company" then I presume they were taking their information from an aggregator - and seems entirely fair that you should "pay the aggregator" (as there's clearly a company out there doing something similar to what they're doing - but they were taking their data).
If this company was just a data-hose, then I'd have thought you could monetize your consumer-focussed product, simply by flogging anonymized data from your users back to the supermarkets. "Customer X, dropped product Y from their weekly basket (or swapped supermarket), when you raised the price of Z (or other supermarket dropped it)"