1. Training AI on copyrighted works is fair use, so it's allowed no matter what the license says.
2. Training AI on copyrighted works is not fair use. In that case it's already disallowed by pretty much every open source license, since nearly all of them require attribution (even ones as lax as MIT do; only the nearly-PD-equivalent ones like CC0, WTFPL, and the Unlicense don't) and AI doesn't give attribution.
So in either case, having a license mention AI explicitly wouldn't do any good, and would only make the license fail to comply with the OSD.
Point 2 misses the distinction between AI models and their outputs.
Let's assume for a moment that training AI (or, in other words, creating an AI model) is not fair use. That means that all of the license restrictions must be adhered to.
For the MIT license, the requirement is to include the copyright notice and permission notice "in all copies or substantial portions of the Software". If we're going to argue that the model is a substantial portion of the software, then only the model would need to carry the notices. And we've already settled that accessing software over a server doesn't trigger these clauses.
Something like the AGPL is more interesting. Again, if we accept that the model is a derivative work of the content it was trained on, then the AGPL's viral nature would require that the model be released under an appropriate license. However, it still says nothing about the output. In fact, the GPL family licenses don't require the output of software under one of those licenses to be open, so I suspect that would also be true for content.
So far, though, in the US, it seems courts are beginning to recognize AI model training as fair use. Honestly, I'm not surprised, given that it was seen as fair use to build a searchable database of copyright-protected text. The AI model is an even more transformative use, since (from my understanding) you can't reverse engineer the training data out of a model.
But there is still the ethical question of disclosing the training material. Plagiarism still exists, even for content in the public domain. So attributing the complete set of training material would probably fall under this kind of ethical question, rather than under the legal questions around intellectual property and licensing agreements. How you go about obtaining the training material is also a relevant discussion, since even fair use doesn't allow you to pirate material; you must still obtain it legally, because fair use only covers how you use material once you have it.
There are still questions for output, but those are, in my opinion, less interesting. If you have a searchable copy of your training material, you can do a fuzzy search of that material to return potential cases where the model returned something close to the original content. GitHub already does something similar with GitHub Copilot, finding public code that matches AI responses, but there are still open questions there too, mostly around matches that may not be in the training data and how much duplicated code needs attribution. But once you find the original content, working with licensing becomes easier. There are also questions about guardrails and how much is necessary to prevent exact reproduction of copyright protected material that, even if licensed for training, isn't licensed for redistribution.
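To make that fuzzy-search idea concrete, here's a minimal sketch in Python using the standard library's difflib. The corpus layout, the near_matches name, and the 0.8 threshold are all my own assumptions, and a real system would need something scalable like shingling or MinHash rather than pairwise comparison:

    import difflib

    # Hypothetical sketch: flag model outputs that are suspiciously close
    # to known training documents. `corpus` maps document names to text.
    def near_matches(output: str, corpus: dict[str, str], threshold: float = 0.8):
        hits = []
        for name, text in corpus.items():
            # ratio() returns a similarity score in [0, 1]; 1.0 is identical.
            score = difflib.SequenceMatcher(None, output, text).ratio()
            if score >= threshold:
                hits.append((name, score))
        return sorted(hits, key=lambda hit: hit[1], reverse=True)

Any hit above the threshold would then be checked against the original content's license before the output is redistributed.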
> The AI model is an even more transformative use, since (from my understanding) you can't reverse engineer the training data out of a model.
You absolutely can; the model is quite capable of reproducing works it was trained on, if not perfectly then at least close enough to infringe copyright. The only thing stopping it from doing so is filters put in place by services to attempt to dodge the question.
> In fact, the GPL family licenses don't require the output of software under one of those licenses to be open, so I suspect that would also be true for content.
It does if the software copies portions of itself into the output, which seems close enough to what LLMs do. The neuron weights are essentially derived from all the training data.
> There are also questions about guardrails and how much is necessary to prevent exact reproduction of copyright protected material that, even if licensed for training, isn't licensed for redistribution.
That's not something you can handle via guardrails. If you read a piece of code and then produce something substantially similar in expression (not just in algorithm and comparable functional details), you've still created a derivative work. There is no well-defined threshold for "how similar"; the fundamental question is whether you derived from the other code or not.
The only way to not violate the license on the training data is to treat all output as potentially derived from all training data.
> FWIW, people here illegally are already not eligible for Medicaid, [0] so it's hard to see why ICE having access to a roster of Medicaid enrollees would help them with their stated mission of enforcing removal orders.
Presumably, it's because a lot of them are getting Medicaid despite not being eligible to. Isn't the point of every audit, investigation, etc. to find things that aren't being done correctly?
> Presumably, it's because a lot of them are getting Medicaid despite not being eligible to
Why are you presuming this? There is no evidence this is happening in any widespread fashion.
> Isn't the point of every audit, investigation, etc. to find things that aren't being done correctly?
If it is being honest about its intention, yes. I think we have seen an absolute mountain of evidence that this administration does "audits" as massive data collection waves to suit any and every purpose they want, though.
If this was about fixing things being done incorrectly, DHHS should be doing the audit, not DHS. Perhaps the latter doesn't understand the difference between the two, though, not noticing they're missing an H in their abbreviation.
No evidence because there has been no investigation. The massive Somali fraud had no evidence until a random YouTuber started knocking on quality learning center doors; now lots of new evidence has been found.
If there are massive frauds, DOGE should've revealed them. The fact that people keep spewing "no investigation" when there have already been several shows how ignorant people are.
With one extra caveat. From `man 7 tcp`: "As currently implemented, there is a 200 millisecond ceiling on the time for which output is corked by TCP_CORK. If this ceiling is reached, then queued data is automatically transmitted."
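For the curious, a minimal sketch of the cork/uncork pattern in Python on Linux (the host and request bytes are just placeholders):

    import socket

    # Linux-only: hold small writes back with TCP_CORK so they leave as
    # full segments, then uncork to flush.
    sock = socket.create_connection(("example.com", 80))
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)  # cork: queue output
    sock.sendall(b"GET / HTTP/1.1\r\n")
    sock.sendall(b"Host: example.com\r\n\r\n")
    # If the cork stays on for more than ~200 ms, the kernel transmits
    # the queued data anyway (the ceiling from man 7 tcp).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)  # uncork: flush now
    sock.close()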
This is the only way I could come up with that allows an end user to do a full factory reset and end up back in a known-good, secure state afterwards.
Storing the key in the firmware would mean every user has the same key. Storing it in EEPROM means a factory reset will clear it. This lets me ship hardware with the default key on a sticker on the side, and lets a non-technical user reset the device back to that key if they need to.
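A rough Python model of the boot-time key selection that implies; the names, the key length, and the all-0xFF "erased" convention are my assumptions, not the actual firmware:

    KEY_LEN = 16
    DEFAULT_KEY = b"\x01" * KEY_LEN          # stands in for the sticker key

    def select_key(eeprom: bytes) -> bytes:
        """Use the user-set key from EEPROM unless it has been erased."""
        stored = eeprom[:KEY_LEN]
        if stored == b"\xff" * KEY_LEN:      # erased EEPROM reads back as 0xFF
            return DEFAULT_KEY               # factory reset -> sticker key again
        return stored

    # After a factory reset wipes the EEPROM, the device answers to the
    # default key printed on the sticker, so a non-technical user can
    # always recover it:
    assert select_key(b"\xff" * KEY_LEN) == DEFAULT_KEY
    assert select_key(b"\x42" * KEY_LEN) == b"\x42" * KEY_LEN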
If you break the government's rules, that should be between you and the government. I shouldn't have to front the cost of any fines or otherwise be in the middle of it.
You've inadvertently completed both parts of a proof by cases. We don't want speeding laws enforced at all right now, because most speed limits are way too low, because they're set for reasons other than actual traffic safety. Let's raise all speed limits to the 85th percentile speed first and only then talk about stepping up enforcement.
Let's not. The Xth percentile speed is not an appropriate measure for a few reasons:
1. Humans are not generally capable of sufficiently accurate long-term low-incidence risk assessment. Meaning, you irrationally value potentially getting to work 10 seconds faster over a 50% increased chance you run over a child crossing the street.
2. Humans are subject to too many irrational psychological factors; stuff like:
• False sense of security due to sitting in a box isolated from the outside world, that's advertised to keep them "safe" in case of a collision.
• Herd mentality, e.g. "everyone's going over the limit, so I will too". Bonus points for rationalizing this behavior "because it's safer to go at the speed of traffic!".
• Delusional rationalizations like "if the limit is 50 then going 10 over must be fine too, due to <reasons>!". Bonus points for applying the "5/10/15/20 over" rule for every possible speed limit — basic maths and physics say hello!
3. The speed humans will travel at on a given road depends primarily on what speed that road seems designed for. People will drive faster on straight, wide roads and slower on winding, narrow ones, regardless of the speed limit. Changing speed limits has little effect compared to changing the physical infrastructure. Show me a picture of a road and I'll tell you how fast people will drive on it.
As such, it makes no sense to first make some sort of a road and only then figure out the limits by observing real traffic. Figure out the appropriate limit first, then design the road with it in mind.
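To put a number on the "basic maths and physics" point from item 2 above: kinetic energy (and with it crash severity and braking distance) scales with the square of speed, so a flat "10 over" matters far more at low limits. A quick back-of-the-envelope calculation:

    # Extra kinetic energy from going 10 km/h over various limits.
    # KE ~ v^2, so the relative increase is (v2^2 - v1^2) / v1^2.
    for limit in (30, 50, 80, 120):  # km/h
        actual = limit + 10
        extra = (actual**2 - limit**2) / limit**2 * 100
        print(f"{limit} -> {actual} km/h: +{extra:.0f}% kinetic energy")
    # Prints: +78% at 30, +44% at 50, +27% at 80, +17% at 120.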
> Bonus points for rationalizing this behavior "because it's safer to go at the speed of traffic!".
But that's true (look up the Solomon curve), and it's exactly why the 85th percentile would be better.
> Delusional rationalizations like "if the limit is 50 then going 10 over must be fine too, due to <reasons>!". Bonus points for applying the "5/10/15/20 over" rule for every possible speed limit — basic maths and physics say hello!
You have cause and effect backwards. People think it's safe to go over the speed limit precisely because most speed limits are too low.
> Changing speed limits has little effect compared to changing the physical infrastructure. Show me a picture of a road and I'll tell you how fast people will drive on it.
Right. So even if going slower is safer, just making the speed limit lower won't accomplish that.
I'll agree with you regarding major arterials but disagree when it comes to suburban neighborhoods. What feels safe from the perspective of someone operating a vehicle can be quite different than what's actually safe when there are pedestrians and cars unexpectedly popping out of driveways.
> What feels safe from the perspective of someone operating a vehicle can be quite different than what's actually safe when there are pedestrians and cars unexpectedly popping out of driveways.
That's all the more reason to raise speed limits on the major roads. Speed limits being more reasonable there makes it more likely that drivers would abide by them even on those smaller residential streets.
And besides, as other commenters pointed out, even if things get lost in the mail or the government otherwise drops the ball, they'll still consider that your fault.
Really? I haven't had any problems, even with computers that don't meet the official hardware requirements.
Download the Win11 Pro ISO, extract it to a USB drive and then execute the command below from it for a totally automated install that bypasses all the BS.
.\setup.exe /product server /auto upgrade /EULA accept /migratedrivers all /ShowOOBE none /Compat IgnoreWarning /Telemetry Disable
You're welcome!
PS: I know it says "server" but when upgrading a desktop machine, desktop is what you will get --- minus a lot of BS.
I believe you that that way works today, but once knowledge of it starts to spread, I expect Microsoft to break it, just like they previously broke Shift+F10 "oobe\bypassnro" and "start ms-cxh:localonly".
That's exactly my point: they will keep closing loopholes, but they will never truly stop people doing it without removing local accounts completely, which they can't do.
It has worked all along and MS can't break it because I have the ISO that it works with.
It's unlikely it can be broken without totally abandoning the server market and disrupting a lot of existing installations --- which would be a marketing disaster.