Instead of anti-fragility, I'd point you to the law of requisite variety.
You'll notice that all AI improvements seem insanely good for a week or two after launch. Then you'll see people stating that 'models got worse'. What happened, in fact, is that people adapted to the tool, but the tool stopped adapting. We're using AI as if it were a variety-matching, adaptable tool, but we miss the fact that most deployments nowadays do not adapt back to us nearly as fast.
New models literally do get worse after launch, due to optimization. If you charted performance over time, it'd look like a sawtooth, with a regular performance drop during each optimization period.
That's the dirty secret with all of this stuff: "state of the art" models are unprofitable due to the high cost of inference before optimization. After optimization they still perform okay, but way below SOTA. It's like a knife that's been sharpened until razor sharp, then dulled shortly after.
> If you charted performance over time, it'd look like a sawtooth
People have, though, and it doesn't show that. I think it's more people getting hit by the placebo effect and the novelty effect, followed by the models' inherent non-determinism leading people to say things like "the model got worse".
Is this insider info? The 'charted performance' caught my eye instantly.
A couple of things I find odd though: why a sawtooth? It would more likely be a square wave, as I'd imagine they roll out the cost-saving version quite fast per cohort. Also, aren't they unprofitable either way? Why would they do it for 'profitability'?
It's rumors based on vibes. There are attempts to track and quantify this with repeated model evaluations multiple times per day, but no sawtooth pattern has emerged as far as I know.
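For illustration, here's roughly what such a tracker does; a minimal sketch, assuming an OpenAI-compatible API, with toy benchmark items and a placeholder model name (real trackers use large fixed suites):

```python
# Minimal sketch of a repeated-evaluation tracker. The benchmark
# items and model name below are placeholders, not any real tracker's.
import datetime
import json

from openai import OpenAI  # assumes an OpenAI-compatible endpoint

BENCHMARK = [
    ("What is 17 * 23?", "391"),
    ("Name the capital of Australia.", "Canberra"),
]

client = OpenAI()

def run_eval(model: str = "gpt-4o") -> float:
    """Run the fixed benchmark once and return the pass rate."""
    passed = 0
    for prompt, expected in BENCHMARK:
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # reduces, but can't eliminate, run-to-run noise
        )
        if expected in reply.choices[0].message.content:
            passed += 1
    return passed / len(BENCHMARK)

# Append one timestamped data point per run; chart the file later
# and look for a sawtooth (or its absence).
with open("scores.jsonl", "a") as f:
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "pass_rate": run_eval(),
    }
    f.write(json.dumps(record) + "\n")
```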
I don't want to go too far down the conspiracy rabbit hole, but the vendors know everyone's prompts so it would be trivial for them to track the trackers and spoof the results. We already know that they substitute different models as a cost-saving measure, so substituting models to fool the repeated evaluations would be trivial.
We also already know that they actively seek out viral examples of poor performance on certain prompts (e.g. counting Rs in strawberry) and then monkey-patch them out with targeted training. How can we be sure they're not trying to spoof researchers who are tracking model performance? Heck, they might as well just call it "regression testing."
If their whole gig is an "emperor's new clothes" bubble situation, then we can expect them to try to uphold the masquerade as long as possible.
It's not insider info, it's common knowledge in the industry (Google 'model optimization'). I think they are unprofitable either way, but unoptimized models burn runway a lot faster than optimized ones.
The reason it's not a square wave is that new optimization techniques are always in development, so you can't apply everything immediately after training the new model. I also think there's a marketing reason: if the performance of a brand-new model declines rapidly after release, people are going to notice much more readily than with a gradual decline. The gradual decline is thus engineered by applying different optimizations gradually.
It also has the side benefit that the future next-gen model may be compared favourably with the current-gen optimized (degraded) model, setting up a rigged benchmark. If no one has access to the original pre-optimized current-gen model, no one can perform the "proper" comparison to be able to gauge the actual performance improvement.
Lastly, I would point out that vendors like OpenAI are already known to substitute previous-gen models if they determine your prompt is "simple." You should also count this as a (rather crude) optimization technique because it's going to degrade performance any time your prompt is falsely flagged as simple (false positive).
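To make concrete what that kind of routing implies, here's a purely hypothetical sketch; the heuristic and model names are made up, and no vendor's actual logic is known to me:

```python
# Purely hypothetical sketch of prompt routing as a crude cost
# optimization; the heuristic and model names are invented.
def route_model(prompt: str) -> str:
    # "Simplicity" heuristic: short single-line prompts get routed
    # to the cheaper previous-gen model.
    looks_simple = len(prompt) < 200 and "\n" not in prompt
    return "cheap-prev-gen" if looks_simple else "expensive-sota"

# The false positive described above: short but genuinely hard.
print(route_model("Prove the four color theorem."))  # -> cheap-prev-gen
```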
Coding with AI is kind of like obesity in modernity: having tons of resources is the goal, but once you get there, you end up in a system you're not really adapted to.
Personally, I don't care that much about org incentives (even though they obviously matter for what OP posted) but more about what it does to my thinking. For me, actually writing code is what slows my brain down, helps me understand the problem, and helps me generate new ideas. As soon as I hand off implementation to an LLM (even if I first write a spec or model it in TLA+) my understanding drops off pretty quickly.
Just sitting around and thinking of solutions to your own problems beats giving yourself work. As refactor_nietzsche would put it, resource slack is what lets you be a refactor_master instead of a refactor_slave. If you feel pressured into self-imposed creep, it's probably because you've internalized the idea that having too much slack makes you look dangerous to your superiors, so you default to playing the worker bee.
Generating AI Content sucks, Consuming AI Content sucks, but combine them in the same loop and it's really addicting. AI Content Prosuming rocks.
Since LLMs came along, if I see a video I think is interesting, I take the transcript, feed it into an LLM, summarize it, and ask it a couple of questions.
I've turned 12-minute videos back into the five phrases of news they were based on.
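A minimal sketch of that loop, assuming the youtube-transcript-api package and an OpenAI-compatible endpoint (the model name is a placeholder):

```python
# Sketch of the transcript -> LLM loop described above; assumes
# youtube-transcript-api and an OpenAI-compatible endpoint.
from youtube_transcript_api import YouTubeTranscriptApi
from openai import OpenAI

client = OpenAI()

def summarize_video(video_id: str) -> str:
    transcript = YouTubeTranscriptApi.get_transcript(video_id)
    text = " ".join(chunk["text"] for chunk in transcript)
    reply = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{
            "role": "user",
            "content": "Summarize this transcript in five sentences, "
                       "then list its key claims:\n\n" + text,
        }],
    )
    return reply.choices[0].message.content

# Follow-up questions would go into the same conversation.
```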
I suppose that when you're the one generating the request, it feels more personal. It is also very interesting that most LLMs respond like a normal person when you talk to them directly, but suddenly adopt annoying blogger speech patterns when you tell them to 'create content'.
> I've turned 12-minute videos back into the five phrases of news they were based on.
Why not read the original news?
Okay, there are many reasons why you might not want to do that, such as ads, tracking, having to pay for a subscription if you only want one article, and just plain boredom. I wasn't trying to call you out, it was more of a question for society at large.
Why has it become more appealing to have a "content creator" turn 5 phrases of news into a 12-minute video and then have an LLM convert it back, rather than reading the 5 phrases?
It's not that it's appealing. For example, I wanted to learn how to bend notes on harmonica, but it wasn't working. That's not something you can really understand without video, yet most tutorials are 5-15 minutes long and only show the actual technique for ~30 seconds at some random point (just search 'how to bend on harmonica' and see). So I take the transcript to check whether it's a method I've already tried or something new worth watching, and I also get an extra explainer of the technique in text.
Also, with videos like "what X said about situation Y in discourse Z": sometimes you're just curious, and you can't realistically extract that efficiently from a full one-hour speech on a geolocked, untranscribed mass-media website, so it's easier to summarize the transcript of the 12-minute video directly.
As for why everything is 12 minutes long, it's most likely because content creation isn't optimized to teach you anything or be useful; it's optimized to maximize watch time so platforms can serve you more ads. The pattern is: I got you intrigued by something; you want the answer? Pay me with your time.
Exactly. Those tutorial videos are fine if you are completely new to a topic, but if you are searching for something more advanced, then getting through all those introductory videos or those pretending to be advanced is so frustrating.
That description is accurate, but that's more of an attention management problem than a navigation problem. You have to infer information about opponent positioning based on partial information, while also moving your character with repeated clicks, each click requiring precision (and more of your attention).
It's more like real-time poker with timers than like remembering a whole city's infrastructure or planning a full route.
If LoL trained you for ambulance work, the world would look something like this: there are 5 hospitals, 3 patients, and 15 roads; a hospital inspector goes from hospital to hospital, and panicked hospital managers open their hospital for 5 hours whenever the inspector is about to arrive; you have an ambulance friend who tells you, from time to time, the inspector's last location or how panicked the managers looked; you infer the open hospitals and the inspector's location such that your patient survives, while also having to maintain a high Candy Crush score on your phone non-stop.
Only for high-level strategy, I would say. Generally you have to process lots of visual information and decide/act fast according to it.
There is a small minimap where all heroes currently visible on the map are shown. A high-level player would, for example, reason: "I saw the enemy carry farming there 30 seconds ago, so right now he is likely in that area".
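That kind of reasoning is easy to sketch (a toy illustration of mine, not anything from the game): the region the enemy could occupy grows with time since the sighting.

```python
# Toy sketch of "last seen 30 seconds ago" reasoning: the set of
# map cells the enemy could have reached grows with elapsed time.
def possible_positions(last_seen, seconds_ago, speed, grid_size=100):
    """All grid cells within the enemy's reachable radius."""
    x0, y0 = last_seen
    radius = seconds_ago * speed
    return {
        (x, y)
        for x in range(grid_size)
        for y in range(grid_size)
        if (x - x0) ** 2 + (y - y0) ** 2 <= radius ** 2
    }

# After 30 s at 1 cell/s, the uncertainty disk covers ~2800 cells;
# high-level play is about shrinking that disk with other clues.
print(len(possible_positions((50, 50), seconds_ago=30, speed=1)))
```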
I don't think generation/discrimination is fundamental. A more general framing is evolutionary epistemology (Donald T. Campbell, 1974, essay found in "The Philosophy of Karl Popper"), which holds that knowledge emerges through variation and selective retention. As Karl Popper put it, "We choose the theory which best holds its own in competition with other theories; the one which, by natural selection, proves itself the fittest to survive."
On this view, learning in general operates via selection under uncertainty. This is less visible in individual cognition, where we tend to over-attribute agency, but it is explicit in science: hypotheses are proposed, subjected to tests, and selectively retained, precisely because the future cannot be deduced from the present.
In that sense, generation/discrimination is a particular implementation of this broader principle (a way of instantiating variation and selection), not the primitive itself.
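A bare-bones sketch of that variation-and-selective-retention loop (my illustration, not Campbell's formalism): blind variation plus a keep-if-better rule finds solutions it could never deduce.

```python
# Blind variation plus selective retention: the loop never "deduces"
# the optimum, it only varies and keeps what survives the test.
import random

def fitness(x: float) -> float:
    return -(x - 3.0) ** 2  # toy objective with its peak at x = 3

def evolve(steps: int = 10_000) -> float:
    current = 0.0
    for _ in range(steps):
        variant = current + random.gauss(0, 0.1)  # blind variation
        if fitness(variant) > fitness(current):   # selective retention
            current = variant
    return current

print(evolve())  # converges near 3.0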
I agree, I meant to be explicit that the one rule was "gravity":
Variation (chaos) comes from the tidal push/pull of all cumulative processes: all processes are nearly periodic (2nd law) and get slower, guaranteeing oscillator harmonics at intervals.
These intervals are astronomically convoluted, but still promise a Fourier distribution of frequency: tidal effects ensure eventual synchronization, as all periods eventually resonate.
As systems are increasingly exposed to pendulums of positive and negative coherence, they will generalize for variance, and eventually develop increasingly selective (Fourier) filters for increasingly resilient traits, which will themselves generalize.
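A quick numerical illustration of what "a Fourier distribution of frequency" means here (my sketch; the frequencies are arbitrary): summed near-periodic processes show up as distinct spectral peaks.

```python
# Sum a few periodic oscillators plus noise, then recover their
# frequencies with an FFT -- distinct processes, distinct peaks.
import numpy as np

t = np.arange(0, 100, 0.01)  # 100 s sampled at 100 Hz
signal = (np.sin(2 * np.pi * 0.5 * t)          # 0.5 Hz oscillator
          + 0.5 * np.sin(2 * np.pi * 1.3 * t)  # 1.3 Hz oscillator
          + 0.1 * np.random.randn(t.size))     # noise
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(t.size, d=0.01)
print(freqs[np.argsort(spectrum)[-2:]])  # -> ~[1.3, 0.5] Hz
```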
The system would become increasingly resilient, and eventually an awareness would develop.
Awareness of past periodic cycles would improve fitness (with or without consciousness), and eventually these mechanistic processes would be in the system's nature.
This is why we have pointless traditions, folklore, collective unconscious artifacts, cyclical cataclysmic religions, the Fermi Paradox, the great filters...
Variation and selection are interwoven, but understanding how it all stems from gravity by means of nearly periodic oscillators (spinning planets, tidal pools, celestial bodies), due to the conservation of angular momentum, due to the three-body problem... that is what took a genius to reconcile.
Awareness would be any form of agency, goal-seeking, or loss minimization.
As Briggs–Rauscher reactions can eventually lead to Belousov–Zhabotinsky reactions, the system can maintain homeostasis with its environment (and continue to oscillate) by varying reactants in a loss-minimizing fashion.
This loss minimization would happen during scarcity, letting the system limp towards an abundance phase.
This is the mechanism that hypothetical tidal-pool batteries would have exhibited to persist between periods of sunlight/darkness/acidity, and it eventually gets stratified as a resiliency trait.
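For what it's worth, sustained chemical oscillation of the BZ kind is easy to demonstrate in a toy model; here is the classic Brusselator (a standard textbook oscillator, not a model of the tidal-pool scenario):

```python
# The Brusselator, a textbook model of a sustained chemical
# oscillator (illustrative only, not the tidal-pool scenario).
from scipy.integrate import solve_ivp

A, B = 1.0, 3.0  # with B > 1 + A**2 the fixed point destabilizes

def brusselator(t, state):
    x, y = state
    dx = A + x**2 * y - (B + 1) * x
    dy = B * x - x**2 * y
    return [dx, dy]

sol = solve_ivp(brusselator, (0, 50), [1.0, 1.0], max_step=0.05)
# x(t) settles onto a stable limit cycle: oscillation persists as
# long as the reactant supply (A, B) is held constant.
print(round(sol.y[0].min(), 2), round(sol.y[0].max(), 2))
```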
I'm not sure if you're familiar with the work from the lab of Mike Levin at Tufts but I'm betting you'll find it interesting if not. Here's a taste https://pmc.ncbi.nlm.nih.gov/articles/PMC6923654/
While I disagree with your notion that this is explicitly due to gravity, the rest of your argument seems to align with some of this lab's work. Learning can be demonstrated at scales as low as a few molecules, way below what we would normally call "life".
I'm not sure what your argument is here, except stating an opinion that loss minimization is equivalent to agency. But even if that was accepted, which is a huge stretch, it doesn't stretch all the way to awareness.
It is, in the context of its place on the cosmic scale.
Loss minimization applied to a few problems will generalize into abstraction, and a few solutions will develop.
These systems with more generalizable resiliency traits will encounter increasingly varied selective sieves.
Systems that survive this sieve will exhibit increasingly sophisticated, generalizable solutions to prevent loss of needed dependent reactions/resources.
These solutions must exert influence to be effective, influencing the environment for the system's own benefit.
As systems influence their environment, the delineation of "self" and "environment" becomes a fundamental boundary.
The system would prefer itself, or be outcompeted by a similar system that does.
This layer of semi-lifelike material would form between the sunlight and the oscillating reaction, and eventually envelop it, minimizing surface tension by means of a spherical, cell-like structure.
Small stuff runs on loss minimization at the level of forces for its mechanistic effect; from covalent bonds to cellular ion transport, the path of least resistance is set by the fundamental forces.
As systems become more complex, the minimization is less directly attributable to the fundamental forces and becomes more of a Byzantine dependency/feedback network.
This byzantine labyrinth of interactions is called biology.
The delineation of self, the ego.
At the highest levels, geopolitics. At the human level, mate suppression.
Lowest level, energy conservation.
I understand the sketch you are making, and my claim isn't "you are wrong". My claim is "it isn't sufficient to explain all of the behavior". You are making massive leaps over important details. In order to get a grasp on the big picture, you are turning a cow into a sphere.
"Awareness" isn't a well defined term and is often just a proxy for consciousness. But in as much as we can define it, it is one or both of experience and knowledge. You may (or may not be) aware of the hum some electronics in your house. At certain points in the day that hum is present in you attention, at other points it is absent from your attention. Sometimes you choose to bring previously unattended objects into your awareness, sometimes they are thrust there despite your will.
What is actually interesting about awareness, and one of the reasons it is a tricky subject, is that it isn't clearly related to agency. There are objects of your awareness that you do not act on, and you act with respect to objects that are provably not in your awareness.
There is also the question of the field within which these oscillations take place. Is it the electromagnetic field? A quantum field? Which field are we talking about? If you are proposing some "principle of least action" in that field, can you describe it?
You seem to claim "loss minimization" and then hand-wave the rest. But without descriptions of knowledge and experience, it feels like you aren't actually saying anything except stating an opinion that reduces all knowledge and experience to loss minimization. That is an extraordinary claim, and it requires either extraordinary evidence or extraordinary reasoning.
A cool illusion, just another emergent property of our geometrical solution: higher-dimensional aperiodic tilings of a 10^80-faceted complex polyhedron "walking" on another large aperiodic Penrose plane that is getting smaller in a dimension we observe as "energy".
Basically, a die with a bajillion sides is being rolled along an increasingly slim poker table, with the house winning eventually.
Time only goes one way, protons don't decay, energy is radiated unto the cosmic background hiss, until homogeneity is reached as the CMB and entropy reaches 1.
I don't know where it comes from, but I know the shape it makes as it rolls by.