
dmkii · 2025-10-18T06:49:28

It’s interesting you mention linguistics, because I feel a lot of the discussions around AI come back to the early 20th century linguistics debates between Russell, Wittgenstein and later Chomsky. I tend to side with (later) Wittgenstein’s view that language is inherently a social construct. He gives the example of the word “game”: there’s no meaningful overlap between e.g. the Olympic Games and Monopoly, yet we understand very well which kind of game is meant because of our social constructs. I would argue that LLMs are highly effective at understanding (or at least emulating) social constructs because of their training data. That makes them excellent at language even without a full understanding of the world.

dmkii · 2025-09-29T22:01:43

Most of our jobs consist of working with tools. Yet it’s very hard to get insight into which tools are most in demand, which are growing in your area, etc. So I decided to track the tools and technologies mentioned in job openings in the data space over the last two years. Now I’ve opened up that data set. Here’s an analysis of jobs per data warehouse: https://selectfrom.work/insights/data_warehouses

dmkii · 2025-09-24T20:57:30

By far the stupidest version of this to me has been Snowflake’s implementation of previews. This is a database, where people preview the content of a table, not in an app, not on a phone, and someone thought it was a good idea to make that an image. I have no idea who ever thought this was a good idea, but here I am, constantly tricked into thinking I can select some preview data, only to realise I have to go on a diversion of 10 clicks and a SQL query to get it done.

dmkii · on April 27, 2024

I agree that there is a line at using someone else’s data to make a profit, but it is kind of ironic that you mention Google, because their exact business model is scraping websites to feed their search results and littering them with ads to make a profit. For me there is a big line between aggregating publicly available data (search results, reviews, news, job postings, etc.) and intentionally violating terms of service, like signing up with fake accounts and harvesting user data. So entitled, maybe not (sites can try to prevent you from scraping), but if you make something publicly available you shouldn’t be surprised when people use it in ways you may not originally have intended (within legal boundaries, of course).

dmkii · on Oct 26, 2022

Am I missing something? This “hack” requires you to go to his site first, then use the back button and then click on a (fake) competitor link. How is he ever going to get people to his site in the first place? And if it’s through paid ads, why not create a fake paid ad that directs you straight to his fake site? It all sounds very much like a marketer who uses the veil of “security researcher” to hide a scam.

dmkii · on Sept 7, 2022

All you have to do is start a new country called Http, convince ICANN to adopt it as a new TLD (will need a lot of persuasion) and serve “http” as a dotless domain. But, you know, anything for a beer… (fyi: the host name is the part after @ and before the port number indicated by “:”)
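To illustrate the host name rule above, here is a minimal sketch using Python's standard `urllib.parse`. The URL is a made-up example (not the one from the original challenge): `user` is hypothetical userinfo and `http` stands in for the dotless host name.

```python
from urllib.parse import urlsplit

# Hypothetical URL for illustration only:
# "user" is the userinfo, "http" is a dotless host name, 80 is the port.
url = "http://user@http:80/path"
parts = urlsplit(url)

print(parts.username)  # the part before "@"
print(parts.hostname)  # the part after "@" and before ":"
print(parts.port)      # the part after ":"
```

Whether such a host actually resolves depends on the dotless TLD existing, which is the hard part.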

dmkii · on June 5, 2022

You’re right, but only if the company doesn’t track whether you’ve seen or even received that message. So yes, general or even contextual messages would be allowed, but “You haven’t seen X in 9 days” would imply processing personal data for marketing purposes.

scarface74 · on June 5, 2022

With iOS, the app’s backend sends the message to Apple’s servers, Apple sends the message to the user’s device, and the device decides whether to display the pop-up based on the user’s settings. Neither the third-party app nor Apple knows whether the message has been seen unless the user taps the message, which opens the app.

dmkii · on Jan 7, 2022

The current state of browser tracking prevention also means that you’re unlikely to be able to attribute a conversion to the same user who saw your experiment after a week, or sometimes even after 24 hours.
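A rough sketch of the effect, assuming the experiment assignment is stored in a client-side cookie whose lifetime the browser caps. The function name `can_attribute` and the dates are made up for illustration; the 7-day and 24-hour caps are the figures mentioned above, not values read from any particular browser.

```python
from datetime import datetime, timedelta

def can_attribute(exposed_at, converted_at, cookie_lifetime):
    """True if the conversion happens while the experiment cookie still exists.

    Simplification: assumes the cookie is set once at exposure and never refreshed.
    """
    return converted_at - exposed_at <= cookie_lifetime

# Hypothetical user: sees the experiment, converts three days later.
exposed = datetime(2022, 1, 1, 12, 0)
converted = exposed + timedelta(days=3)

within_week_cap = can_attribute(exposed, converted, timedelta(days=7))
within_day_cap = can_attribute(exposed, converted, timedelta(hours=24))
print(within_week_cap, within_day_cap)
```

With the shorter cap the same user shows up as a new visitor at conversion time, so the conversion cannot be tied back to the experiment variant.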

aeternum · on Jan 7, 2022

Yes, browser tracking prevention is one of those things that seems like a good idea at first but likely makes the internet slightly worse overall.

Sites can only optimize for what they can see and we've made it so they can only see short-term engagement.

Another is all the annoying cookie popups as a result of GDPR.

chunkyks · on Jan 7, 2022

You haven't convinced me that preventing browser tracking is making the internet "slightly worse overall".

If sites are having trouble converting me, perhaps it's not me that's the problem.

aeternum · on Jan 7, 2022

The issue is that most sites can no longer tell if they are converting you.

chunkyks · on Jan 8, 2022

It's not obvious to me that that is a problem for me, or that it makes the internet worse.

def_true_false · on Jan 7, 2022

The popups are a result of tracking, not GDPR. Websites without tracking don't need to have them.

It's somewhat amusing that the overlap of garbage content farms and sites with annoying consent popups is almost perfect. I wonder if it could be used for search engine ranking.

dmkii · on June 2, 2021

Most, or at least a lot, of the prefetching is for third-party libraries (think jQuery, Google Fonts, Facebook Pixel, etc.). There’s a general speed advantage for users from caching commonly used libraries and fonts across sites. Nonetheless, I believe prefetch will still have a speed advantage even when the cache is segregated.
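A toy model of the difference, assuming a segregated cache keys each entry by the top-level site as well as the resource URL. The `ToyCache` class, the site names and the library URL are all hypothetical; real browsers use more elaborate keys.

```python
class ToyCache:
    """Toy HTTP cache; 'partitioned' adds the top-level site to the cache key."""

    def __init__(self, partitioned):
        self.partitioned = partitioned
        self.store = set()

    def fetch(self, top_level_site, url):
        key = (top_level_site, url) if self.partitioned else url
        if key in self.store:
            return "hit"
        self.store.add(key)  # simulate downloading and caching the resource
        return "miss"

LIB = "https://cdn.example/jquery.js"  # hypothetical shared library URL

shared = ToyCache(partitioned=False)
shared_results = [shared.fetch("a.example", LIB), shared.fetch("b.example", LIB)]

split = ToyCache(partitioned=True)
split_results = [
    split.fetch("a.example", LIB),  # first visit to site A
    split.fetch("b.example", LIB),  # site B can no longer reuse site A's copy
    split.fetch("b.example", LIB),  # a later page on site B still benefits
]
print(shared_results, split_results)
```

In this model the cross-site reuse disappears, but anything prefetched on a site is still served from cache for later requests on that same site.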

dmkii · on March 24, 2021

There’s definitely high demand for “technical” web analysts at the moment. That usually means someone who isn’t afraid of HTML and JavaScript and can help less technical marketers and analysts implement their measurement requirements and analytics tools, either through a tag manager like Google Tag Manager or directly in code.