
danielvinson · 2026-02-17T19:10:57 1771355457

As a former competitive MtG player this is really exciting to me.

That said, I reviewed a few of the Legacy games (the format I'm most familiar with and also the hardest by far), and the level of play was so low that I don't think any of the results are valid. It's very possible that for Legacy they would need some assistance playing Blue decks, but they don't seem to grasp even the most basic of concepts, like "Who's the beatdown?".

IMO the most important part of current competitive Magic is mulligans, and that's something an LLM should be extremely good at, but none of the games I'm seeing had either player starting with fewer than 7 cards... in my experience about 75% of games in Legacy have at least one player mulligan their opener.

GregorStocks · 2026-02-17T19:14:42 1771355682

Yeah, the intention here is not to answer "which deck is best" - the standard of play is nowhere near high enough for that. It's meant as more of a non-saturated benchmark for different LLM models, so you can say things like "Grok plays as well as a 7-year-old, whereas Opus is a true frontier model and plays as well as a 9-year-old". I'm optimistic that with continued improvements to the harness and new model releases we can get to at least "official Pro Tour stream commentator" skill levels within the next few years.

danielvinson · 2026-02-17T23:47:04 1771372024

Hmm well, from my perspective, none of them are even really playing the game, they are just taking random actions. Any human, even a small child, would be much better.

And re: ages, it's worth noting that the youngest player to make Day 2 of a Grand Prix is 8 years old, and the youngest Pro Tour winner was 15 years old. I don't think it's realistic to get an LLM anywhere close to either of those players in skill level, though it's absolutely possible with a specialized model.

mistrial9 · 2026-02-17T21:16:47 1771363007

> , so you can say things like "Grok plays as well as a 7-year-old, whereas Opus is a true frontier model and plays as well as a 9-year-old".

no, no, no.. please think. Human child psychology is not the same as an LLM engine rating. It is both inaccurate and destructive to actual understanding to say that common phrase. Asking politely - consider not saying that about LLM game ratings.

Eridrus · 2026-02-17T23:38:08 1771371488

The model does not seem to get told what the cards actually do, it's just trying to guess.

Probably less of a problem for Legacy, but it isn't even told what format it's playing. So I expect it's easy to get the bots to do far better than this by giving them more information.

Let me analyze this opening hand:

*Hand analysis:*
- 3 lands (Riverpyre Verge, Steam Vents, Willowrush Verge) - dual lands which is good
- Monument to Endurance ({3}) - a 3-drop
- Accumulate Wisdom ({1}{U}) - 2-drop, likely a blue advantage engine
- Boomerang Basics ({U}) - 1-drop blue spell
- Gran-Gran ({U}) - 1-drop blue creature (1/2)

This is actually a reasonable hand. I have:
- 3 lands giving me mana access (with color fixing)
- A 1-drop (Gran-Gran or Boomerang Basics to choose from)
- A 2-drop (Accumulate Wisdom)
- A 3-drop (Monument to Endurance)

The curve is solid. I have some interactive blue play with Boomerang Basics and early board presence with Gran-Gran. The mana base supports blue well with all three lands. This is keep-able — while not spectacular, it's a functional opening hand.

{
  "hand_size": 7,
  "action_type": "GAME_ASK",
  "players": "GPT52 Vet(20hp), Haiku Worrier(you,20hp)",
  "land_count": 3,
  "context": "T1 ()",
  "response_type": "boolean",
  "respond_with": "answer=true or answer=false",
  "message": "Mulligan down to 6 cards?",
  "your_hand": [
    { "name": "Riverpyre Verge", "is_land": true },
    { "name": "Steam Vents", "is_land": true },
    { "name": "Willowrush Verge", "is_land": true },
    { "mana_cost": "{3}", "name": "Monument to Endurance" },
    { "mana_cost": "{1}{U}", "name": "Accumulate Wisdom" },
    { "mana_cost": "{U}", "name": "Boomerang Basics" },
    { "mana_cost": "{U}", "name": "Gran-Gran", "power": "1", "toughness": "2" }
  ],
  "action_pending": true
}

GregorStocks · 2026-02-18T00:28:58 1771374538

Oh, that's a good bug report - historically it was just hallucinating card effects so I made the harness throw the Oracle text for all visible cards into the context, but I bet I forgot to do that for the mulligan decision specifically (it's a weird one). Thanks!
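A minimal sketch of what that fix could look like, assuming the mulligan prompt is a dict shaped like the JSON quoted above. The function name `attach_oracle_text`, the `oracle_text` field, and the lookup table are hypothetical, not the harness's actual code, and the lookup values are placeholders rather than real card text.

```python
import json

# Hypothetical lookup table; a real harness would read this from its card
# database. The values are placeholders, not actual Oracle text.
ORACLE_LOOKUP = {
    "Steam Vents": "<Oracle text for Steam Vents>",
    "Gran-Gran": "<Oracle text for Gran-Gran>",
}


def attach_oracle_text(prompt, oracle_lookup):
    """Return a copy of the prompt with an 'oracle_text' field on each card.

    Cards missing from the lookup get an empty string, so the model can tell
    that the text is unknown instead of silently guessing.
    """
    enriched = dict(prompt)
    enriched["your_hand"] = [
        {**card, "oracle_text": oracle_lookup.get(card["name"], "")}
        for card in prompt.get("your_hand", [])
    ]
    return enriched


if __name__ == "__main__":
    mulligan_prompt = {
        "message": "Mulligan down to 6 cards?",
        "your_hand": [
            {"name": "Steam Vents", "is_land": True},
            {"mana_cost": "{U}", "name": "Gran-Gran", "power": "1", "toughness": "2"},
        ],
    }
    print(json.dumps(attach_oracle_text(mulligan_prompt, ORACLE_LOOKUP), indent=2))
```

The copy-instead-of-mutate choice is only there so the same step can be applied to every decision type, including the mulligan prompt, without side effects.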

protocolture · 2026-02-17T22:41:02 1771368062

I picked a random Commander game and the first message 1/4 players left was "Well I should be getting a new hand soon because I have asked for a mulligan". It's definitely in there; whether they are using it correctly is another question.

raincole · 2026-02-18T04:03:14 1771387394

> mulligans and that's something an LLM should be extremely good

Why? I honestly can't think of any reason that LLMs should be specifically good at mulligans

danielvinson · 2026-02-18T04:07:09 1771387629

This is actually really interesting to me, but the way to determine if you should mulligan is whether the 7 cards you are looking at are better than the average 6 cards in your deck. Given that games in most higher power formats end in the first 2-3 turns, the number of cards isn’t as important as the quality generally. So it’s really just math to determine what an “average” hand looks like.
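The "7 cards vs. the average 6" comparison can be sketched in a few lines of Python. Everything here is an assumption for illustration: a 60-card deck with 20 lands, a toy scoring rule that only counts lands (`toy_score`), and a 6-card hand treated as a plain random draw. A real evaluation would score the actual cards, not just land counts.

```python
from math import comb

DECK_SIZE = 60   # assumption: 60-card deck
DECK_LANDS = 20  # assumption: 20 lands; adjust per deck


def p_lands(hand_size, k, deck_size=DECK_SIZE, deck_lands=DECK_LANDS):
    """Hypergeometric probability of exactly k lands in a random hand."""
    return (
        comb(deck_lands, k)
        * comb(deck_size - deck_lands, hand_size - k)
        / comb(deck_size, hand_size)
    )


def toy_score(lands):
    """Hypothetical hand quality: 1.0 for 2-4 lands, 0.0 otherwise."""
    return 1.0 if 2 <= lands <= 4 else 0.0


def expected_score(hand_size):
    """Average toy score over all random hands of the given size."""
    return sum(p_lands(hand_size, k) * toy_score(k) for k in range(hand_size + 1))


def should_mulligan(lands_in_seven):
    """Mulligan when the 7-card hand scores below the average 6-card hand."""
    return toy_score(lands_in_seven) < expected_score(6)


if __name__ == "__main__":
    print(f"average 6-card score: {expected_score(6):.3f}")
    for lands in range(8):
        print(lands, "lands ->", "mulligan" if should_mulligan(lands) else "keep")
```

Swapping `toy_score` for a function that looks at the actual spells is where the hard part lives; the probability side is the easy half.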

raincole · 2026-02-18T04:22:37 1771388557

> it's really just math to...

Uh, LLMs are notoriously bad at basic arithmetic. I think you might be thinking about another kind of AI.

Plus I don't really believe LLMs can reliably tell which hand is better. If you remove the drawing part and simply present two hands to an LLM and ask it which one is stronger, I expect it to do much, much, much worse than an experienced player. There isn't much reason to expect otherwise (but I'm willing to be proven wrong if such benchmarks exist).

danielvinson · 2025-11-13T08:02:18 1763020938

I can't see any movement, at any distance. How likely is it something weird with my vision vs. something weird with my monitor/computer? I'm on a 360hz monitor at 2k.

koolala · 2025-11-13T08:13:59 1763021639

The sizing and distance to your face are important, so you can play with that and change #define scale at the top.

danielvinson · 2025-09-15T18:11:17 1757959877

I think this article discounts the reasons behind frontend decisions... priorities are absolutely fast execution time and ease of hiring. There is very, very little reason to care about optimizing frontend performance for a vast majority of apps. Users just don't care. It doesn't make the company more money.

If a framework is easy to use and everyone knows it, it's simply the best choice for 90%+ of teams.

croes · 2025-09-15T18:15:42 1757960142

The UX for me went downhill over the last 5-7 years. I don’t know if it’s React, but something changed. Pages load slowly or not at all, strange display errors, slow reaction times, etc.

tracker1 · 2025-09-15T18:18:36 1757960316

Too few run output analysis on their bundles or even track bundle sizes. There's a lot of kitchen sink repos, not to mention any number of other bottlenecks between the front end and back end. Worse across split teams for larger apps.

cosmic_cheese · 2025-09-15T20:43:58 1757969038

> There is very, very little reason to care about optimizing frontend performance for a vast majority of apps. Users just don't care. It doesn't make the company more money.

There’s plenty of users who care, but when the competition is also all slow and heavy they don’t get any choice in the matter.

jonny_eh · 2025-09-15T20:53:09 1757969589

It's usually not the framework that causes apps/sites to be slow.

cosmic_cheese · 2025-09-15T20:56:22 1757969782

Not directly, but when you have devs who only know how to build with the framework and don’t have a grip on what’s going on under the hood or how it all interacts in the browser environment (increasingly common), performance is sure to take a hit.

jonny_eh · 2025-09-16T03:48:55 1757994535

It's not React's fault that people either don't know what they're doing, or don't care enough to make their software performant. This is not a new phenomenon, bad/rushed software has always existed.

wahnfrieden · 2025-09-16T17:05:11 1758042311

GitHub somehow became instantly slow, and still is to this day, after they started introducing React, but I guess it’s just their devs' fault

jonny_eh · 2025-09-17T18:32:11 1758133931

Well, ya

muspimerol · 2025-09-16T09:32:24 1758015144

This happens regardless of which framework is used or even if no framework is used. Plenty of web developers do not understand how the browser or JS work at a deep level.

LegionMammal978 · 2025-09-16T14:37:23 1758033443

Yeah, it's pretty close to the "Imagine how great the world would be if everyone used Lisp/Haskell/WhateverLang instead of Java/JS everywhere!" take you sometimes see. As if the common developer wouldn't just write in all those languages like they're Java/JS, and keep clear of the advanced macros/type systems/whatever.

Even languages or environments that try to "steer the developer into the correct direction" have only really managed it when the new direction is something they already might've chosen to write. Otherwise, you just end up with many square pegs filed down to fit in round holes.

sublinear · 2025-09-16T11:20:29 1758021629

This isn't true at all if you're working on maintaining a web app. When ease of hiring and getting tasks done quickly have become the priority it's because the business has let too much work pile up. It has very little to do with the money unless it's a small startup.

Frontend skills are misunderstood by most of HN because it's a hard role that directly involves business and product wants. There's a ton of hiring (and firing) because it's not easy to find the right people who can communicate about the work clearly with non-devs, navigate the office politics, know what to push back on or when to ask questions, and still write good code.

danielvinson · 2025-09-16T17:02:11 1758042131

I agree that maintaining web apps is an entirely different set of skills, though in my experience (mostly small and mid size companies) PMs come in with massive projects and huge changes constantly and management has to say yes to a few. I try my best to shield my devs as much as possible from the politics but usually my teams are still ending up with huge 4-5 sprint frontend projects. It's extremely hard to find devs who can create simple technical designs when there is absolutely any frontend complexity (especially things like wizards, why are wizards so hard for people...). My standard these days for a "good hire" is anyone who can handle these sorts of projects without a huge amount of help.

danielvinson · 2025-08-25T18:55:16 1756148116

I felt this way for a long time and used a heavy mouse for daily use, then switched to my lightweight mouse for gaming. I changed my mind very fast when I started developing carpal tunnel symptoms from the heavy mouse - using your wrist to move a heavy object in the same pattern for 6+ hours a day is just not healthy for you, and when your job involves doing that, it's 100% worth it to just use the lightweight mouse for everything. I actually found that my wrist felt better with the 55g mouse than with a trackball or a touchpad.

rootusrootus · 2025-08-26T00:38:29 1756168709

An alternative approach that has worked for me is to get out of the habit of using little muscles as much as I can. I don't use my wrist much with the mouse, I move my whole arm. I try to apply that to everything I do and I've managed to avoid repetitive motion injuries. Last time I experienced anything like carpal tunnel was in college (decades ago).

luqtas · 2025-08-25T21:03:04 1756155784

heavy? like what, 120 grams? most people don't develop RSI from their computer usage... much more if they exercise, stretch and do breaks

moron4hire · 2025-08-25T23:52:04 1756165924

In my mid 20s I started developing RSI. I went through a bunch of different devices designed to supposedly be "ergonomic" and while the problem would go away temporarily, it would eventually come back.

I eventually found that it's not any one device that cures RSI. It's much better to switch up between different devices on a regular basis.

So, now I have two keyboards I switch between about weekly. It's fairly easy because I just switch between working at the office vs home and have two different designs of devices in the two locations. Occasionally I use the laptop keyboard and track pad directly. I still program a lot and haven't had another RSI flare-up in about 7 years.

luqtas · 2025-08-26T13:03:41 1756213421

you can ask or hire any personal ergonomic service/company to evaluate what needs to get better at your workstation and they will say that devices are the last item on the list. posture, adequate chair/table height and breaks are much more important than 50 grams less on your mouse

monster_truck · 2025-08-28T05:35:18 1756359318

The OSHA Computer Workstation Posture Checklist is gospel and if anyone reading this is struggling with RSI and hasn't followed it to the letter, please do so immediately. It is frankly disgusting how quickly a few seemingly small bad habits can grow into a monster nerve issue

danielvinson · on Jan 8, 2025

Love the idea and technology - I’d much prefer if the output of this was an MPC order since that’s how almost everyone is making proxies these days. Getting my entire cube printed was only about $100 and they are indistinguishable from real cards.

_zfsy · on Jan 8, 2025

The joy of proxies is you can do whatever you want! My take was to not make these seem as real as possible, and one where you can print on demand.

Imagine a proxy only tournament of any card game, where you have to submit your decklist ahead of time and it is waiting for you when you arrive, ready to play and keep.

danielvinson · on April 2, 2024

You are basically charging money for things that frameworks like Django already have built-in, and have already had built-in for decades. There are so many good free options for this sort of thing. If a company already pays developers, there's absolutely no way they won't just build things themselves.

All of your selling points are things that business people think are difficult but any developer worth their salt will tell you is no problem.

Edit: to add a bit more info here... think of it this way: I am a frontend developer. I can build a specific dashboard in about 2 months, using things I am already familiar with and will look great. Instead, I can use your product and reduce the amount of time it will take to 1 month, but lock us into paying a monthly subscription for the rest of time. How could I possibly convince somebody to buy that?

anamexis · on April 3, 2024

Well, if it costs $10,000 all-in to pay a developer for 1 month, and Creo costs $30/month, then it would take 27 years to break even on having a developer spend the extra month building it from scratch.
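The break-even figure can be checked with a couple of lines of Python. This is a sketch using only the two numbers quoted in the comment ($10,000 one-time, $30/month); the function name is made up for illustration, and it ignores onboarding time, maintenance, and discounting.

```python
def break_even_years(one_time_cost, monthly_fee):
    """Years of subscription fees needed to equal a one-time cost."""
    months = one_time_cost / monthly_fee
    return months / 12


if __name__ == "__main__":
    years = break_even_years(10_000, 30)
    print(f"{years:.1f} years")  # prints "27.8 years"
```

So "27 years" is the rounded-down version of roughly 27.8.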

swaptr · on April 3, 2024

You didn't factor in the cost/time it will still take a dev to get familiar with Creo and quite possibly the NextJS/React ecosystem.

danielvinson · on Feb 1, 2024

Macaroons are the coconut cookie. Macarons are the fluffy egg white ones. Very different things.

mtlmtlmtlmtl · on Feb 1, 2024

But the word macaroon is derived from macaron. Obviously they are not the same thing, I never said that, did I?

danielvinson · on Dec 2, 2023

Impossible to find? You don’t have salary info on that listing and a quick google shows that your salaries are something like half what I’m making as a remote RN engineer. Very unlikely that it’s because of RN.

tschellenbach · on Dec 2, 2023

could be that's the problem, we went up though, not down in rates. think it's around 120k base + 120k in equity. how much do you make?

danielvinson · on Dec 2, 2023

When I googled your company the only results listed total comp under $100k for software engineers. 120k is slightly better than that. Equity is worth zero at private companies unless there is a system in place to sell it, so don't try to claim that is worth anything to most people.

I make around $200k + equity as a Staff Engineer, fully remote, though as my company is also private, equity === 0. I would say I am moderately underpaid in the current market but I'm happy with that tradeoff since my company is fully remote, great work life balance, and I work with great people.

tomduncalf · on Dec 2, 2023

That’s a US salary though, right? So you would expect it to be higher. Granted some companies might pay the same salary regardless of country but in my experience it’s usually adjusted

danielvinson · on Dec 2, 2023

My company operates globally but is based in the US. We don't adjust salary based on location.

tomduncalf · on Dec 2, 2023

That’s really good!

danielvinson · on Jan 25, 2022

Not related to your actual point, but I researched corporate boards a few years back and found many instances of people on boards of two direct competitors. In fact, I found a couple of companies with CEOs who also served on the board of a competitor. This was going through maybe a few hundred large corporations, so I’d have to assume it’s incredibly common.

eagleinparadise · on Jan 26, 2022

I haven't studied this in depth but I recall that there's a book out there that sort of blasts the management consulting industry. How a lot of corporates will hire the same consulting company such as McKinsey to learn what their competitors are planning/how they are performing. IIRC

takinola · on Jan 26, 2022

This is absolutely incorrect. Consulting companies, at least the major ones, will never let any consultant work for more than one competitor throughout their career. They also strictly forbid sharing any info learnt about a company with anyone else, let alone with competitors.

patrec · on Jan 25, 2022

Can you give some examples (because that sounds interesting, not because I don't believe you)?

danielmarkbruce · on Jan 25, 2022

Yeah I'd be interested to see the examples too.

Eric Schmidt sat on Apple's board while running Google, but left in 2009 when they became competitors in the smart phone business. Before that, Apple & Google weren't competitors.

halpert · on Jan 25, 2022

This is notoriously playing out in the ETF space. Large institutions own vast quantities of stock in competitors and competition decreases as a result. See "Rise of Institutional Investors Raises Questions of Collusion".

danielmarkbruce · on Jan 25, 2022

Can you give a concrete example? This seems vague. It's certainly not notorious.

halpert · on Jan 26, 2022

Notorious is the wrong word, you’re right. It’s an active area of research. It’s called the “common ownership hypothesis”.

danielvinson · on Sept 22, 2021

Yep, this is where I'm at too. 13" is just not enough space for me to get anything done productively. On my 13" MBP with my standard font size, my code editor can't show a full line of code without me having to scroll (VSCode with mostly default settings, font size 12).

samstave · on Sept 22, 2021

FYI - I have two of these for my machine: https://i.imgur.com/NiHzTzx.jpg

They are super light and all three fit in my backpack really easily....

I HIGHLY recommend.