In addition to that, what they don’t mention is that:
1. Other app stores like Google Play and Steam haven’t seen this rapid rise.
2. There are thousands, maybe tens of thousands, of apps that are just wrappers calling OpenAI APIs, or similar low-effort AI apps, and they make up a large percentage of this increase.
3. There are billions of dollars pouring into AI startups and many of them launch an iOS app.
Has Steam not seen a rapid rise in AI-asset shovelware?
I'm not talking about the AAA or the AA or even the A space (where AI is being incorporated into dev processes with various degrees of both success and low effort slop), I'm talking about the actual bottom of the barrel.
You never needed AI to make shovelware; you've been able to crank out a shitty game over a weekend ever since RPG Maker came out, and there are still games being made with it.
AI just helps create some assets for games. It doesn't really make games easier or faster to make, but they might look a bit better.
I can’t speak to the quality of all the games released, but in January 2025 there were 1,413 games released on Steam and in January of this year there were 1,448.
> It's like that FT chart claiming that the rapid rise in iOS apps is evidence of an AI-fueled productivity boom.
I mean, there is evidence for some change. Personally, I'm sceptical of what this will amount to, but prior to EOY 2025, there really wasn't any evidence for an app/service boom, and now there's weak evidence, which is better than none.
Because so much technical functionality has been lost/paywalled/dark-patterned/enshittified, I've cut the number of apps I use. I've realized that building core personal functionality around the whims of corporations eventually just gets weaponized against me, so I might as well start undoing that on my own terms. Who in 2026 is really bringing in a new app/SaaS to do much of anything, like we naively did a decade ago? No one I know; we've been shown we'll be treated as suckers for doing that.
The bird not having wings, but all of us calling it a 'solid bird' is one of the most telling examples of the AI expectations gap yet. We even see its own reasoning say it needs 'webbed feet' which are nowhere to be found in the image.
This pattern of considering 90% accuracy (like the level we've seemingly stalled out at on MMLU and AIME) to be 'solved' is really concerning to me.
AGI has to be 100% right 100% of the time to be AGI and we aren't being tough enough on these systems in our evaluations. We're moving on to new and impressive tasks toward some imagined AGI goal without even trying to find out if we can make true Artificial Niche Intelligence.
This test is so far beyond AGI. Try to spit out the SVG for a pelican riding a bicycle. You are only allowed to use a simple text editor. No deleting or moving the text cursor. You have 1 minute.
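For anyone who hasn't tried it, here's a crude hand-drawn approximation of what the target output even looks like; purely illustrative, not any model's actual attempt:

    # Writes a very crude "pelican on a bicycle" SVG by hand.
    # Purely illustrative -- not any model's output.
    svg = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="130">
      <circle cx="60" cy="100" r="18" fill="none" stroke="black"/>   <!-- rear wheel -->
      <circle cx="140" cy="100" r="18" fill="none" stroke="black"/>  <!-- front wheel -->
      <line x1="60" y1="100" x2="100" y2="70" stroke="black"/>       <!-- frame -->
      <line x1="100" y1="70" x2="140" y2="100" stroke="black"/>
      <ellipse cx="100" cy="55" rx="20" ry="12" fill="white" stroke="black"/> <!-- body -->
      <circle cx="118" cy="45" r="6" fill="white" stroke="black"/>   <!-- head -->
      <polygon points="124,45 145,48 124,51" fill="orange"/>         <!-- beak -->
    </svg>"""

    with open("pelican.svg", "w") as f:
        f.write(svg)

Getting even this blob right in one forward pass, with no cursor movement, is the point.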
As for MMLU, is your assertion that these AI labs are not correcting for errors in these exams and then self-reporting scores less than 100%?
As implied by the video, wouldn't it then take 1 intern a week max to fix those errors and allow any AI lab to become the first to consistently 100% the MMLU? I can guarantee Moonshot, DeepSeek, or Alibaba would be all over the opportunity to do just that if it were a real problem.
Yeah, I've found AI 'miracle' use-cases like these are most obvious for wealthy people who stopped doing things for themselves at some point.
Typing 'Find me reservations at X restaurant' and getting unformatted text back is way worse than just going to OpenTable and seeing a UI that has been honed for decades.
If your old process was texting a human to do the same thing, I can see how Clawdbot seems like a revolution though.
Same goes for executives who vibecode in-house CRM/ERP/etc. tools.
We all learned the lesson that mass-market IT tools almost always outperform in-house, even with strong in-house development teams, but now that the executive is 'the creator,' there's significantly less scrutiny on things like compatibility and security.
There's plenty that's real about AI, particularly as it relates to coding and information retrieval, but I've yet to see an agent actually do something that even remotely feels like the result of deep and savvy reasoning (the precursor to AGI) - including all the examples in this post.
> Typing 'Find me reservations at X restaurant' and getting unformatted text back is way worse than just going to OpenTable and seeing a UI that has been honed for decades.
You're conflating the example with the opportunity:
"Cancel Service XXX" where the service is riddled with dark patterns. Giving every one an "assistant" that can do this is a game changer. This is why a lot of people who aren't that deep in tech think open claw is interesting.
> We all learned the lesson that mass-market IT tools almost always outperform in-house
Do they? Because I know a lot of people who have (as an example) terrible setups with Salesforce that they have to use.
> We all learned the lesson that mass-market IT tools almost always outperform in-house,
Funny, I learned the exact opposite lesson. Almost all software sucks, and a good way for it not to suck is to know where the developer is and go tell them, in person, that their shit is broken.
If you want a large-scale example: one of the two main law enforcement agencies in France spun LibreOffice off into their own legal-writing software, developed by LEOs who can take up to two weeks a year to work on it. Awesome software. It would cost literally millions if bought on the market.
One of the most important details of Sacks's life which dogged him nearly to the end (and which is important to this NY piece), was a minimization by Sacks of his own sexuality. He was not "openly gay" at all.
One of the biggest problems frontier models will face going forward is how many tasks require expertise that cannot be achieved through Internet-scale pre-training.
Any reasonably informed person realizes that most AI start-ups looking to solve this are not trying to create their own pre-trained models from scratch (they will almost always lose to the hyperscale models).
A pragmatic person realizes that they're not fine-tuning/RL'ing existing models (that path has many technical dead ends).
So, a reasonably informed and pragmatic VC looks at the landscape, realizes they can't just put all their money into the hyperscale models (LPs don't want that), and looks for start-ups that take existing hyperscale models and expose them to data that wasn't in their pre-training set, hopefully in a way that's useful to some users somewhere.
To a certain extent, this study is like saying that Internet start-ups in the 90's relied on HTML and weren't building their own custom browsers.
I'm not saying that this current generation of start-ups will be as successful as Amazon and Google, but I just don't know what the counterfactual scenario is.
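For context, "exposing a hyperscale model to data that wasn't in its pre-training set" is, in practice, mostly retrieval-augmented generation. A minimal sketch of the pattern; the documents are hypothetical stand-ins, and a real stack would swap TF-IDF for learned embeddings:

    # Minimal retrieval-augmented generation sketch: retrieve the relevant
    # private documents, then stuff them into the prompt of a hosted model.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    documents = [  # stand-in for a startup's proprietary corpus
        "Q3 churn rose 4% after the pricing change.",
        "Enterprise renewals are negotiated in October.",
        "Support tickets spike on Mondays.",
    ]

    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(documents)

    def build_prompt(question: str, k: int = 2) -> str:
        """Prepend the k most similar documents as context."""
        scores = cosine_similarity(vectorizer.transform([question]), doc_vectors)[0]
        context = "\n".join(documents[i] for i in scores.argsort()[::-1][:k])
        return f"Context:\n{context}\n\nQuestion: {question}"

    # The model call itself is the commodity part; only the data is proprietary.
    print(build_prompt("Why did churn go up?"))

The wrapper is thin by design; the bet is that the data and the retrieval pipeline are where the defensibility lives.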
The question that isn't completely answered in the article is how useful these startups' pipelines actually are. The article certainly implies that for at least some of these startups there's very little value added in the wrapper.
Got any links to explanations of why fine tuning open models isn’t a productive solution?
Besides renting the GPU time, what other downsides exist on today’s SOTA open models for doing this?
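For reference, the cheap end of "fine-tuning an open model" usually looks like a LoRA adapter over a pretrained checkpoint; sketch only, with the model name and hyperparameters as placeholders rather than recommendations:

    # LoRA fine-tuning setup sketch using Hugging Face transformers + peft.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    base = "meta-llama/Llama-3.1-8B"  # placeholder: any open checkpoint
    model = AutoModelForCausalLM.from_pretrained(base)
    tokenizer = AutoTokenizer.from_pretrained(base)

    config = LoraConfig(
        r=16,                                 # low-rank adapter dimension
        lora_alpha=32,
        target_modules=["q_proj", "v_proj"],  # adapt attention projections only
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()        # typically well under 1% of weights

The mechanics are cheap; the usual objections are about what you get for it (style and formatting transfer more reliably than durable new knowledge) and about redoing the work every time a better base checkpoint ships.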
I think it's interesting that everyone's immediate reaction nowadays is to assume incompetence or malice, rather than curiosity about the root cause (very telling that this attitude has even permeated a forum for supposed 'hackers').
The high-level picture is that 80% of the economy is very easy to track b/c it's not very volatile (teachers, for example).
What we have seen is a huge surge in unpredictability in the most volatile 20% of jobs (mining, manufacturing, retail, etc.). The BLS can't really change their methods to catch up with this change for classic backwards compatibility and tech debt reasons.
Part of the reason 'being a quant' is so hot right now is that we truly are in weird times where volatility is much higher than most people realize across sectors of the economy (i.e. AI is changing formerly rock-solid SWE employment trends, tariffs/electricity are quickly and randomly changing domestic manufacturing profitability, etc.). This means that if you can build systems that track data better than the old official systems, you can make some decent money investing against your knowledge.
I think this is a bad state of affairs, but I don't have a good solution. Any private company won't release their data b/c it's too valuable and I am reluctant to encourage the BLS to rip up their methods when backwards compatibility is a feature worth saving.
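For what it's worth, the volatility claim is easy to sanity-check against any sector-level employment series; the CSV and column names below are hypothetical stand-ins for something like BLS CES data:

    # Rolling volatility of month-over-month employment changes by sector.
    import pandas as pd

    jobs = pd.read_csv("sector_employment.csv", parse_dates=["month"],
                       index_col="month")   # columns: education, manufacturing, mining, ...

    mom = jobs.pct_change()                 # month-over-month change per sector
    vol = mom.rolling(window=24).std()      # 2-year rolling volatility

    # The claim, made testable: stable sectors stay flat, volatile ones diverge.
    print(vol[["education", "manufacturing", "mining"]].tail())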
Is there really more volatility? My gut feeling is that government interventions have flattened it over recent decades. I’d like to see some real figures on this.
Manufacturing and mining are becoming much less correlated to the overall jobs market (likely, as you point out, b/c the government smooths the other sectors).
Can you actually prove volatility is higher now than in the past? There have been plenty of volatile changes in the workforce over the past several decades; this is nothing new to the job market.
> interesting that everyone's immediate reaction now-a-days is to assume incompetence or maliciousness, rather than curiosity at the root cause
I came across this claim last week regarding recent US jobs figures:
> "All jobs gains were part time. Full-time jobs: -357K. Part-time jobs: +597K"
If this claim is true, and I have no means to tell if it is, then - regardless of one's view on whoever is in power right now - do we really expect any elected representatives to be brave enough to say that out loud at a press conference?
Explain to me please why job numbers aren’t simply a matter of querying the Federal social security database? A longstanding process of polling businesses for what they want to report, followed by corrections up to one year later, has got to be a pantomime to fudge the numbers.
Does that pass the basic common-sense smell test? Everyone can see the amount on their paycheck, and it's paid at most 30 days after any work day. These payments are sent to a single federal bank account and, data-wise, are combined with a Social Security ID, the sending bank's ID, and a date. It's a bank; there's a database. We're talking about at most 200M records; a Raspberry Pi can process that query in minutes. If we can't query this easily, it's by design. Or we could do some backflips and somersaults to try to come up with a reason why the bureaucracy has to be more complicated.
The payments are deposited monthly or semiweekly (for employers with large payroll) but that's a lump sum. If you are looking at that from the government side all you can tell is whether total payroll has gone up or down. That won't tell if any change is due to a change in number of employees or a change in pay rates or some combination of that.
It isn't until the employer files their quarterly Form 941 that you'd see employment numbers. Form 941 includes the number of employees and total wages and withholding.
It isn't until the annual W-2 filings that you would see a breakdown that includes number of employees and the individual pay.
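A toy example of why those lump-sum deposits are a dead end for headcounts (the withholding rate and figures are made up):

    # Two very different employers produce identical withholding deposits.
    WITHHOLDING_RATE = 0.20

    def monthly_deposit(headcount: int, avg_wage: float) -> float:
        return headcount * avg_wage * WITHHOLDING_RATE

    a = monthly_deposit(headcount=100, avg_wage=5_000)  # steady state
    b = monthly_deposit(headcount=80, avg_wage=6_250)   # fired 20, raised pay

    assert a == b  # 100_000.0 either way: the headcount signal is gone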
Not all 'normal income' is from a "job" as we think of it and assuming that does not even come close to passing any informed person's smell test.
Parsing tax or SS payments for what a "job" is would be a logistical nightmare, because that's not what the system is designed for (unlike the BLS's system, which is designed to count jobs).
When people want job numbers, they want a reliable proxy for the state of the economy. Basing it on changes to payroll-based Social Security payments would be far better than what we have now, if it were timely.
I only see a stat that reports the same number for full employment as for one person who fired everyone and took their incomes. Is there a way to disaggregate it to get some proxy for employment like we're talking about?
So the answer is that payments per Social Security ID are not reported to the Electronic Federal Tax Payment System (EFTPS); employers only report aggregate payments. Workers and employers only report payments by individual in W-2s, in January.
Probably the only reason is that the BLS and SSA are completely separate, and the SSA is probably antiquated and doesn't attempt to tag or organize its data along the same parameters the BLS defines. It likely has neither the staffing nor the resources to provide those hooks and real-time anonymized aggregate data for other departments to consume.
A lot of people don't understand that collecting data is actually expensive and difficult when it doesn't involve surreptitiously stealing it via some piece of tech.
Meta is also a great example of AI leading to higher user engagement today.
Reels isn't powered by Transformers per se (likely more of a complex mix of ML techniques), but it is powered by honest-to-goodness SOTA AI/ML running on leading-edge Nvidia GPUs.
I think, because they're so impressive, people assume Transformers = AI/ML, when there's plenty of other hyperscale AI/ML products on the market today.
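To make "a complex mix of ML techniques" concrete: recommender stacks typically pair cheap embedding retrieval with a learned ranking stage. A toy of that shape only; nothing here resembles Meta's actual system:

    # Two-stage recommender toy: retrieve candidates by embedding similarity,
    # then rank them with a (pretend-trained) model over engagement features.
    import numpy as np

    rng = np.random.default_rng(0)
    item_embs = rng.normal(size=(10_000, 64))  # catalog embeddings
    user_emb = rng.normal(size=64)             # user embedding

    # Stage 1: retrieval -- top 100 items by dot-product similarity.
    candidates = np.argsort(item_embs @ user_emb)[-100:]

    # Stage 2: ranking -- score candidates on features like watch time,
    # freshness, author affinity, with pretend-learned weights.
    features = rng.normal(size=(100, 3))
    weights = np.array([0.6, 0.25, 0.15])
    ranked = candidates[np.argsort(features @ weights)[::-1]]

    print(ranked[:10])  # what actually gets surfaced

Transformers may well be in the mix somewhere, but so are retrieval indexes and plain regression models; AI/ML was moving engagement metrics long before LLMs.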
The article is mostly about how there are now recognized to be certain schizophrenia-like conditions that are clearly autoimmune diseases. Mentioned in the article are anti-NMDA-receptor encephalitis, which responds to immunotherapy, and a previously published case of a woman misdiagnosed with catatonic schizophrenia fully recovering after being treated for lupus with immunosuppressive therapy.
Based on this, the article suggests that the rituximab Mary was given along with chemo was the key. However, they were unable to test conclusively for antibody evidence of this theory after the fact.
I have a family member with an incidence of autoimmune encephalitis secondary to other conditions (my entire family is an autoimmune cluster) who is actually hospitalized for it now. This almost matches my experience to a tee, though anti-NMDAR was tested for and not found. The neurologists wanted to discharge prior to attempting immunotherapy and thankfully we were able to ensure they tried (pulse steroids).
It's certainly an area which can be characterized as rare disease, whether paraneoplastic or otherwise.
Probably why we keep looking at electroconvulsive ‘therapy’ again and again. Triggering the body’s systems to do something often cleans up other situations at the same time.
There was a phenomenon where sometimes a high fever would cure STDs like syphilis. We generally use antibiotics now that we have them, because they are less dangerous.
I always ask people, in the past year, how many AI-coded apps have you 1) downloaded 2) paid for?