simonw · 2026-04-01T04:34:44 1775018084

You can run this model on an iPhone via the latest update to this Locally AI app: https://apps.apple.com/us/app/locally-ai-local-ai-chat/id674...

For its size (1.2GB download) it's very impressive.

Here's a pelican it drew me running on my phone - the SVG comments are good, the image not so much: https://tools.simonwillison.net/svg-render#%3Csvg%20width%3D...

newman314 · 2026-04-01T07:56:49 1775030209

One thing I discovered tonight is that it appears smaller models are remarkably bad at converting time between timezones.

I tested the following using almost all available models on Locally and did not get a single model that got the right answer.

"What is 9:30 am (Taiwan Standard Time, TST) in US Pacific?"
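
For reference, the conversion in that prompt can be checked mechanically. This is a minimal sketch, assuming the question is asked for 2026-04-01 (when US Pacific is on daylight time, UTC-7) and using fixed UTC offsets rather than a timezone database. The constant and function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets (assumptions for this sketch): Taiwan is UTC+8 year-round;
# US Pacific is UTC-7 on daylight time (PDT) and UTC-8 on standard time (PST).
TAIWAN = timezone(timedelta(hours=8), "TST")
PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")


def taiwan_to_pacific(year, month, day, hour, minute, pacific=PDT):
    """Convert a Taiwan wall-clock time to US Pacific using fixed offsets."""
    local = datetime(year, month, day, hour, minute, tzinfo=TAIWAN)
    return local.astimezone(pacific)


# The answer depends on the date, which the prompt does not state.
summer = taiwan_to_pacific(2026, 4, 1, 9, 30, PDT)
winter = taiwan_to_pacific(2026, 1, 15, 9, 30, PST)

print(summer.strftime("%Y-%m-%d %H:%M %Z"))  # 2026-03-31 18:30 PDT
print(winter.strftime("%Y-%m-%d %H:%M %Z"))  # 2026-01-14 17:30 PST
```

The result lands on the previous calendar day, and it shifts by an hour depending on whether daylight time is in effect, which may be part of why the question trips models up. Python's zoneinfo module (with "Asia/Taipei" and "America/Los_Angeles") handles the daylight-time switch automatically when a timezone database is available.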

voxelghost · 2026-04-01T05:19:47 1775020787

    <!-- Bicycle wheels -->
    <circle cx="285" cy="130" r="5" fill="#81c784" />
    <circle cx="315" cy="130" r="5" fill="#81c784" />
    <circle cx="285" cy="160" r="5" fill="#81c784" />
    <circle cx="315" cy="160" r="5" fill="#81c784" />

Did you ask for a pelican with a bicycle, or was that just an added bonus?

IshKebab · 2026-04-01T09:07:50 1775034470

It's a well known LLM test. Google "SVG pelican bicycle".

simonw · 2026-04-01T03:52:22 1775015542

How many of those competitive services would also lock the account if something like this happened?

ryandrake · 2026-04-01T04:00:19 1775016019

Yes, these stories should scare you off of cloud services in general, not one particular vendor. The root problem is that you're storing valuable information on "someone else's computer." And that someone can decide to stop serving you for any or no reason at all, and you are without recourse. This should be totally unacceptable, but somehow the world has normalized it.

Don't keep anything in a cloud service that you couldn't live with losing, unless you keep a local backup. Including and especially your identity (E-mail) which unlocks all your accounts.

eviks · 2026-04-01T04:26:28 1775017588

No, the root problem is that you put all your eggs in one basket, ignoring folk wisdom that predates anything digital.

> Don't keep anything in a cloud service that you couldn't live with losing, unless you keep a local backup.

Translated: so do keep everything in a cloud service, just back it up, at a fraction of the effort, insecurity, unreliability, and unavailability of relying on your own computer.

ryandrake · 2026-04-01T04:42:08 1775018528

Yes, and, importantly, have a plan to be able to log in to and reset your passwords through e-mail, on all your other services, if you suddenly lose you@yourcloudemail.com

I consider “cloud” to be a single (unreliable) basket. If you have your online stuff spread across 5 cloud providers, then any of them locking you out will disrupt you in some way.

eviks · 2026-04-01T05:08:24 1775020104

This broad reclassification makes no sense. If you put literal eggs in 5 baskets, then any of them falling down will disrupt your eggs in some way. You're missing the whole point of the principle, which is that it will not disrupt you in the same big way of blocking all your digital life like in the example from the post!

eviks · 2026-04-01T04:16:31 1775016991

None: your me@NonSelf-HostedMail.com simply wouldn't know about your account at AI.com, so you'd only have your AI account banned

simonw · 2026-03-31T20:02:53 1774987373

This is really cool. I've built things on PostgreSQL tsvector FTS in the past, which works well but doesn't have whole-index ranking algorithms so can't do BM25.

It's a bit surprising to me that this doesn't appear to have a mechanism to say "filter for just documents matching terms X and Y, then sort by BM25 relevance" - it looks like this extension currently handles just the BM25 ranking but not the FTS filtering. Are you planning to address that in the future?

I found this example in the README quite confusing:

  SELECT * FROM documents
  WHERE content <@> to_bm25query('search terms', 'docs_idx') < -5.0
  ORDER BY content <@> 'search terms'
  LIMIT 10;

That -5.0 is a magic number which, based on my understanding of BM25, is difficult to predict in advance since the threshold you would want to pick varies for different datasets.
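
A small sketch of why a fixed cutoff is hard to pick: the same document and query get different BM25 scores depending on the rest of the corpus. This is a plain textbook BM25 in Python with the common defaults k1=1.2 and b=0.75, not the extension's implementation. The function name and toy corpora are made up for illustration, and the scores here are positive, whereas the README example compares against a negative value.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` against `query` with textbook BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # IDF depends on the whole corpus, not just this document.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization depends on the corpus-wide average length.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


doc = "postgres full text search"
# Corpus A: the query terms are common. Corpus B: they are rare.
corpus_a = [doc, "postgres index tuning", "postgres search ranking"]
corpus_b = [doc, "cooking pasta at home", "gardening tips for spring",
            "weekend hiking routes"]

score_a = bm25_scores("postgres search", corpus_a)[0]
score_b = bm25_scores("postgres search", corpus_b)[0]
print(f"corpus A: {score_a:.3f}  corpus B: {score_b:.3f}")
```

The identical document scores several times higher in corpus B, because both the IDF term and the average document length change with the surrounding data. A threshold tuned on one dataset therefore does not carry over to another.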

tjgreen · 2026-03-31T20:24:04 1774988644

I actually don't love this example either, for the reasons you mention, but at some point we had questions about how to filter based on numeric ranking. Thanks for the reminder to revisit this.

Re filtering, there are often reasonable workarounds in the SQL context that caused me to deprioritize this for GA. With your example, the workaround is to apply post-filtering to select just matches with all desired terms. This is not ideal ergonomics since you may have to play with the LIMIT that you'll need to get enough results, but it's already a familiar pattern if you're using vector indexes. For very selective conditions, pre-filtering by those conditions and then ranking afterwards is also an option for the planner, provided you've created indexes on the columns in question.

All this is just an argument about priorities for GA. Now that v1.0 is out, we'll get signal about which features to prioritize next.
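
The post-filtering workaround described above can be sketched roughly as follows. This is an illustration in Python, not the extension's API: `ranked_docs` stands in for rows already ordered by BM25 relevance, and `post_filter` and `search_with_widening` are hypothetical helper names. It shows why the LIMIT may need adjusting: the first fetch may not contain enough rows that match every term.

```python
def post_filter(ranked_docs, required_terms, want, fetch_limit):
    """Keep docs from the top `fetch_limit` that contain every required term."""
    # Stands in for: ORDER BY <bm25 rank> LIMIT fetch_limit
    candidates = ranked_docs[:fetch_limit]
    required = {t.lower() for t in required_terms}
    matches = [d for d in candidates if required <= set(d.lower().split())]
    return matches[:want]


def search_with_widening(ranked_docs, required_terms, want, start_limit=2):
    """Double the fetch limit until enough matches survive the filter."""
    limit = start_limit
    while True:
        hits = post_filter(ranked_docs, required_terms, want, limit)
        if len(hits) >= want or limit >= len(ranked_docs):
            return hits
        limit *= 2


# Toy data, assumed to be already sorted by relevance.
ranked = [
    "postgres search",
    "postgres tuning",
    "search engines",
    "fast postgres search index",
    "postgres bm25 search",
]

first_try = post_filter(ranked, ["postgres", "search"], want=2, fetch_limit=2)
widened = search_with_widening(ranked, ["postgres", "search"], want=2)
print(first_try)  # ['postgres search']
print(widened)    # ['postgres search', 'fast postgres search index']
```

With a fetch limit of 2, only one row survives the filter. Doubling the limit to 4 yields the two requested matches, which is the same over-fetch pattern familiar from filtered vector-index queries.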

mbreese · 2026-03-31T21:14:02 1774991642

While we’re talking about filtering: is there a way to set a WHERE clause when you’re setting up the index? I’ve been working on this a lot recently for a hybrid vector search in pg. One of the things that I’m running up against is setting a good BM25 index for a subset of a table (the where clause). I have document subsets with very different word frequencies, so I’m trying to make sure that the search works on a given subset.

I think I can also set up partitions for this, but while you’re here… I’m very excited to start to roll this out.

tjgreen · 2026-03-31T21:24:34 1774992274

Partitions would be one option, and we've got pretty robust partitioned table support in the extension. (TimescaleDB uses partitioning for hypertables, so we had to front-load that support.) Expression indexes would be another option, not yet done but there is a community PR in flight: https://github.com/timescale/pg_textsearch/pull/154

simonw · 2026-03-31T16:50:49 1774975849

It's great that this is Apache 2.0 licensed - several of Cohere's other models are licensed free for non-commercial use only.

simonw · 2026-03-31T11:43:37 1774957417

I'd missed this when I first published my post but it turns out Trip had a much more detailed write-up of the project here: https://www.estragon.news/mr-chatterbox-or-the-modern-promet...

simonw · 2026-03-31T00:06:07 1774915567

That's like saying there's no point in attending a lecture on "how to get the best out of your time at University" because University courses are taught in spoken language so you could just ask the professors.

simonw · 2026-03-31T00:04:39 1774915479

One of the things you can learn is how to get consistently useful results out of it despite it being a non-deterministic black box.

simonw · 2026-03-30T15:40:00 1774885200

Which product called Copilot did you ask?

lexh · 2026-03-30T16:44:48 1774889088

https://copilot.microsoft.com/shares/RviQKF1om6oxZY2pENL6c

Sample size is 2 now!

bundie · 2026-03-30T15:41:51 1774885311

Maybe this one?

https://copilot.microsoft.com/

ffsm8 · 2026-03-30T16:30:06 1774888206

Maybe, but Microsoft has a lot of products which they branded Copilot. Pretty sure that was his point.

neilcar · 2026-03-30T17:33:56 1774892036

Microsoft loves to do this with brand names -- a friend who's still there said they stopped counting at 30 different "Defender for ______" products.

simonw · 2026-03-30T15:39:31 1774885171

In case people missed it in the other thread, GitHub have now disabled this: https://twitter.com/martinwoodward/status/203861213108446452...

> We've disabled it already. Basically it was giving product tips which was kinda ok on Copilot originated PR's but then when we added the ability to have Copilot work on _any_ PR by mentioning it the behaviour became icky. Disabled product tips entirely thanks to the feedback.

pinkmuffinere · 2026-03-30T15:44:53 1774885493

I’m grateful they disabled it, but their response still feels a bit tone deaf to me.

> Disabled product tips entirely thanks to the feedback.

This sounds like they are saying “thanks for your input!”, when really it feels more like “if you didn’t go out of your way to complain, we would have left it in forever!”

johnnyanmac · 2026-03-30T19:52:00 1774900320

Of course they would have. The squeaky wheel gets the grease. Why do you think governments spend billions upon trillions trying to get their citizens to essentially "shut up" instead of improving their conditions?

joegibbs · 2026-03-31T05:17:04 1774934224

But why run free advertising in the first place?

da_grift_shift · 2026-03-30T15:53:52 1774886032

Accepting the megacorp euphemisms without critique ("product tips") is how enshittification festers.

simonw · 2026-03-30T16:24:02 1774887842

I've not seen any evidence that these were ads and not "tips".

Ads implies someone was paying for them. Promoting internal product features is not the same thing - if it was then every piece of software that shows a tip would be an ad product, and would be regulated as such.

matt_kantor · 2026-03-30T21:57:08 1774907828

> Ads implies someone was paying for them.

It doesn't to me.

By my understanding of the term, Netflix can most definitely advertise Netflix shows on its own platform, a flyer that a barber hangs on a public bulletin board is an advertisement, and the Oscar Mayer Wienermobile is advertising hotdogs when it drives through my town. Do you not consider these things to be advertisements?

I pretty much agree with what https://en.wiktionary.org/wiki/advertisement says.

simonw · 2026-03-30T23:37:57 1774913877

I think this particular story is a very different scandal if it turns out GitHub were charging other companies money in exchange for having Copilot include promotions for their products in PRs as opposed to Copilot adding uncompensated usage "tips" to those PRs.

matt_kantor · 2026-03-31T00:23:51 1774916631

I agree with that.

Two things:

1. People using the word "advertisement" when commenting on this situation aren't necessarily saying that's what's happening, and they may find these tips/ads distasteful anyway (I know I do).

2. Even if someone isn't literally paying Microsoft to insert these tips/ads, promoting third parties which are themselves Microsoft customers still benefits Microsoft.

wat10000 · 2026-03-30T16:59:53 1774889993

I could buy it if this was just being shown to the person who was using Copilot. Hey, here's a feature you might like. Seems OK. But it was put into the PR description. That gets seen by potentially many people, who are not necessarily using Copilot.

iso1631 · 2026-03-30T16:58:19 1774889899

When Apple puts an advert for an Apple show in front of For All Mankind, that's an advert.

Maybe I put up with it and it just adds to my subconscious seething, or maybe I get the episode elsewhere because if I watch on Jellyfin I don't have the advert. Of course that then harms the show as my viewing isn't counted, but they've cancelled it anyway so perhaps it doesn't really matter.

If it isn't an advert, then at the very least there's a button to disable it.

isjciwjdieh · 2026-03-31T01:38:30 1774921110

What? For All Mankind wasn’t cancelled.

Season 5 is coming out now with season 6 already confirmed coming—which, granted, will be its last, but that’s not a cancellation in any sense of the word.

iso1631 · 2026-03-31T09:49:00 1774950540

"not renewed" or "cancelled" is the same thing

johnnyanmac · 2026-03-30T19:55:25 1774900525

Ads usually imply a financial incentive, but that's not always the case. Technically, if I were to praise someone's blog and link to it, that would also be an ad.

Ads tend to also imply tangential information shown to you in an undesired area. If this was some tool tip and not embedded in the PR comment, many wouldn't call it an ad.

simonw · 2026-03-30T14:47:53 1774882073

If you have uv installed you can start a chat with the model (after a 2GB model download) with this one-liner:

  uvx --with llm-mrchatterbox llm chat -m mrchatterbox