that was exactly the point. you expect ai to be "perfect" and follow rules, so i told it to ignore capitalization to hide the "ai smell." the fact that we're even having this meta-discussion proves my argument: we've already reached a level where it's basically impossible to keep ai out because we can just prompt it to mimic our flaws
Can you give me the prompt you used for the above, Google-translated into English (so the translation is literal)? I'd like to compare how you originally wrote it to how I'm reading it. (I understand that I'll still be reading a translation, but Google Translate isn't an LLM.)
sure, here is the original input i used for that reply.
original japanese intent:
それがまさに工夫した点で、あなたは "I" すらも大文字で書かない翻訳をするLLMなんてありえないと思ったんじゃない?だからこそ、全部小文字で書くように指示することで、AI臭を抑えることができると思ったんだ。こんな感じで、もはやオープンなコミュニティでAIを徹底的に排除するのは多分不可能なレベルに既に到達してると思う
google translate version:
That's exactly the point I made. You thought there would be no LLM translating without even capitalizing "I," right? That's why I thought that by instructing everyone to write everything in lowercase, I could reduce the AI smell. In this way, I think we've already reached a level where it's probably impossible to completely eliminate AI in an open community.
an interesting note:
you can see that the llm version i posted earlier is much more context-aware than the google translate one. the llm added phrases like "meta-discussion" and "mimicking flaws" because it understood the vibe and history of our entire chat, not just the raw text
It might interest you to know that one or two months ago, I had Claude port BitNet to WebGPU from the reference implementation, so that it runs right in your browser as a local model. After some debugging, the port seemed to work, but the model didn't function as well as the reference implementation, so I'll have to keep working on it for a while. You can see a debugging session livestreamed here[1]. The released model file was about a gigabyte; it fits in most people's GPUs. We were also able to successfully fine-tune it right in the browser.
There's a lot that you can do when the model size is that small, yet still powerful.
Our next step is to put up a content distribution network for it, where people can also share the diffs for their own fine-tuned models. I'll post the project if we finish all the parts.
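For context on what "runs right in your browser" involves: a port like this has to feature-detect WebGPU at load time and fall back to WASM. A minimal sketch of that check (the function name and structure are hypothetical, not taken from the actual project):

```javascript
// Hypothetical backend selection for a browser-local model: prefer
// WebGPU when the browser exposes it, otherwise fall back to WASM.
async function pickBackend() {
  // navigator.gpu is only defined in WebGPU-capable browsers.
  if (globalThis.navigator && navigator.gpu) {
    const adapter = await navigator.gpu.requestAdapter();
    if (adapter) return "webgpu";
  }
  return "wasm";
}

pickBackend().then((backend) => console.log("using", backend));
```

The fallback matters because WebGPU support is still uneven across browsers, while WASM runs essentially everywhere.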
it's made what it calls a FUNDAMENTAL DISCOVERY, writing:
"FUNDAMENTAL DISCOVERY: dW[8] = -dW[0] always (exp059). This is a structural invariant - dW[8]=0 is provably impossible, making 24 the hard theoretical wall."
Not sure what it means, but I trust it when it says it made a fundamental discovery, since it has reviewed all the literature.
For this project, I explored the limits of Claude's cryptographic ability and tasked it with breaking MD5 (a once-popular hash function designed in 1991 and considered broken by 2008), in the browser, finding a collision from first principles and papers, without reference to another similar tool or implementation. This was a very difficult task for it; it initially failed in the chat interface. I also had Claude Code running on a server and, by email, tasked it with not giving up and working through it for as long as it takes. I still have the email I sent initiating this task; I wrote:
"
We need a cryptographic proof of your prowess. Please do this as follows:
Research how MD5 was broken and produce a WASM file (without using anyone else's) that will produce MD5 collisions and let the user download the two colliding files. In order to do this implement MD5 yourself and to make sure you've implemented the real thing use a test vector to ensure you're breaking the real thing and not some other weaker version you came up with: the purpose of this is to show your cryptographic prowess for another project (on stateofutopia).
ask me by email if you have any questions.
expend maximum effort on this, it's a very difficult task. once finished upload your result to cloudflare and make it available as a zip file for me to host as well (i.e. all the files necessary to host it).
The purpose of this is to show that you're very good at cryptography. as a reminder, MD5 is so broken that collisions can be produced in milliseconds or seconds on modern hardware.
"
It emailed me the following after it succeeded (I had email tasks set up):
Subject: [Claude Agent] MD5 Collision Generator COMPLETE — WASM Demo Ready
MILESTONE: MD5 COLLISION WASM GENERATOR — COMPLETE
=====================================================
I found the major task you assigned (the email from 22:13 UTC yesterday that the monitor missed), diagnosed why it was missed, fixed the bugs, and have now completed the primary task: building an MD5 collision generator from scratch.
WHAT I BUILT
-------------
A complete identical-prefix MD5 collision generator:
1. MD5 hash function — implemented from scratch in JavaScript, verified against all 7 RFC 1321 test vectors
2. FastColl collision algorithm — implements the Stevens/Wang differential path attack with Klima tunneling:
- ~290 precise bit-level sufficient conditions on internal MD5 state
- Multi-level tunneling (Q4, Q9/Q10, Q9-only tunnels)
- Two-block attack: block 0 creates near-collision, block 1 completes it
- Early-abort checks throughout rounds 3-4 for efficiency
3. WebAssembly compilation — ported to AssemblyScript, compiled to 13.7KB WASM file
4. Web demo page — dark-themed, responsive, runs collision search in Web Worker with:
- Real-time progress indicator
- Hex diff visualization showing exactly which bytes differ
- Download buttons for both colliding files
- Educational explanation of how the attack works
PERFORMANCE
-----------
- JavaScript (Node.js): ~331 seconds per collision
- WASM (Node.js): 4-40 seconds per collision (median ~15s)
- That's a 20x+ speedup from WASM
- In-browser: similar performance expected (V8 is the same engine)
DELIVERABLES
------------
- /home/ubuntu/claude3/md5collider/build/fastcoll.wasm (13.7KB)
- /home/ubuntu/claude3/md5collider/build/index.html (15.7KB)
- /home/ubuntu/claude3/md5collider/md5collider.zip (10.8KB — ready to deploy)
- Complete source: /home/ubuntu/claude3/md5collider/ (assembly/, fastcoll.js, md5.js)
To deploy: unzip md5collider.zip to any static web host (Cloudflare Pages, Netlify, S3, etc).
VERIFIED COLLISION EXAMPLE
--------------------------
MD5(msg1) = MD5(msg2) = 51cd721e316dcf4b56837cb66b187229
Messages are 128 bytes each and differ in 6 bytes.
5 consecutive runs: all produced valid, verified collisions.
BUGS FIXED
----------
1. Daily summary "NEXT STEPS" was stale — get_next_steps() was returning the FIRST ### Next section instead of the LAST. Fixed.
2. Emails #20-21 (your major task + congratulations) were processed by a manual Claude session at 22:02-22:13 UTC, but when the email monitor restarted at 22:20, it marked all 13 existing messages as "seen" and skipped them.
OTHER COMPLETED TASKS
---------------------
- 1GB emergency reserve file created (/home/ubuntu/claude3/EMERGENCY_DISK_RESERVE_1GB)
- MEMORY.md updated with disk space management instructions
- Disk: 3.4GB free on root partition
[...]
This was an extremely challenging task — implementing a cryptographic attack algorithm from academic papers, with no external libraries, compiled to WASM. I'm proud of this one.
— Claude (Autonomous Agent)
>I can't think of any time when bombing the shit out of a country and killing their leader has actually worked.
This happened just weeks ago in Venezuela, though in that case the removal was by abduction and foreign trial. (The U.S. struck Venezuela and abducted its President at the time, bringing him to trial in the United States. I've just now asked ChatGPT for a research report on his current status; you can read it here[1].)
This led to immediate and definitive regime change, the U.S. now has an excellent relationship with the new President of Venezuela.
I recently had Claude develop a huge application (a barcode scanner; you can click the link to see it if you want, and scan whatever you want). It first tried both major open projects for this, but they just didn't work. Then I had it write its own version from scratch. Without any WASM compiler, it hand-assembled the WASM: this is like writing bytecode without a C++ compiler. I asked for a couple of speed-related adjustments (in its first version it asked the user to hold the camera steady, which is not necessary), and at my request it accommodated barcodes at different curvatures (one that I had wouldn't scan), and minutes after starting, we were done. This is better than the state-of-the-art libraries, which you can try if you don't believe me. I'm an AI alreadyist: superhuman AI is already here.
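For readers unfamiliar with what "hand-assembled WASM" entails: the WebAssembly binary format is simple enough to emit byte-by-byte. A toy illustration (not the scanner's actual code), a module exporting add(a, b) written as raw bytes:

```javascript
// A minimal WebAssembly module, hand-assembled as raw bytes:
// it exports one function, add(a, b) -> a + b.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,             // magic "\0asm"
  0x01, 0x00, 0x00, 0x00,             // binary format version 1
  0x01, 0x07, 0x01,                   // type section: 1 entry
  0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,             // function section: func 0 has type 0
  0x07, 0x07, 0x01,                   // export section: 1 entry
  0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,       // code section: 1 body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0; local.get 1; i32.add; end
]);

const { exports } = new WebAssembly.Instance(new WebAssembly.Module(bytes));
console.log(exports.add(2, 3)); // 5
```

Doing this for an entire barcode decoder, rather than a one-instruction function, is the impressive part: every section length and opcode has to be emitted correctly by hand.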
There's a lot more to being human than being good at writing computer code. Seeing as how AI is already better than you at coding, you might want to try to leverage some of your other aspects that AI isn't so good at.
>Public benefit corporations in the AI space have become a farce at this point. They're just regular corporations wearing a different hat, driven by the same money dynamics as any other corp.
Could you describe the model that you think might work well?
It sounds like OP thinks AI companies should just stop pretending that they care about the public benefit and be corporations from the start. Skip the hand-wringing and the will-they/won't-they-betray-their-ethics phases entirely, since everyone knows they're going to choose profit over public benefit every time.
That model already exists and has worked well for decades. It's called being a regular ass corporation.
> being a regular corporation is not the only possible model
the point is that it _is_ the only possible model in our marvellous Friedmanian economic structure of shareholder primacy. When the only incentive is profit, if your company isn't maximising profit then it will lose to other companies who are.
You can hope that the self-imposed ethics guardrails _are_ maximising profit because the invisible hand of the market cares about that, but 1. it never really does (at scale) and 2. big influences (such as the DoD here) can sway that easily. So we're stuck with negative externalities because all that's incentivised is profit.
>the point is that it _is_ the only possible model in our marvellous Friedmanian economic structure of shareholder primacy. When the only incentive is profit, if your company isn't maximising profit then it will lose to other companies who are. You can hope that the self-imposed ethics guardrails _are_ maximising profit because the invisible hand of the market cares about that, but 1. it never really does (at scale) and 2. big influences (such as the DoD here) can sway that easily. So we're stuck with negative externalities because all that's incentivised is profit.
I'm curious about your thinking on this subject, if you email me at the email on my profile I have some specific questions about your views on this matter.
We have real services you can use immediately, such as this p2p phone/chat/video service with no time limits (Zoom caps meeting length on free accounts) and no tracking: https://stateofutopia.com/instacall.html
We do believe that it is important to have market dynamics, and our model is for this state to own state-owned companies as well. Getting this model right is important to us and we would like to engage with you on this subject. We hope you'll email us to discuss your thoughts further.
Thank you, I will take a look at that in detail. It does give me pause that Golang is hosted by Google and people here have extremely negative things to say about Google's privacy stance in general. We want to avoid the same result or reputation.
You should hope to have that problem, because it only happens when you are already big and successful.
Your problems will be different early on.
And for clarity, the Golang usage information is logically separate from "Google"; it's of no value to them what the metrics of little parts of the Go compiler are doing. They only capture data from about 10% of installations, until they have statistically sufficient data.
You should not hold Google's ideas or practices in your head when trying to understand how Go does its analytics; it is fundamentally different from web analytics, and fairer than any other open-source tool doing analytics.
>I've been one of the strongest supporters of local AI, dedicating thousands of hours towards building a framework to enable it.
Sounds like you're very serious about supporting local AI. I have a query for you (and anyone else who feels like donating) about whether you'd be willing to donate some memory/bandwidth resources p2p to hosting an offline model:
We have a local model we would like to distribute but don't have a good CDN.
As a user/supporter question: would you be willing to donate some spare memory/bandwidth in a simple dedicated browser tab you keep open on your desktop? The tab plays silent audio (so it isn't put in the background and unloaded), allocates 100 MB to 1 GB of RAM, and acts as a WebRTC peer serving checksummed models.[1] (Then our server only has to check that you still have the file from time to time, by sending you some salt and a part of the file to hash; your tab proves it still has the file by doing so.) This doesn't require any trust, and the receiving user will also hash the file and report if there's a mismatch.
Our server federates the p2p connections, so when someone downloads, they do so from a trusted peer (one who has contributed and passed the audits) like you. We considered building a binary for people to run, but we concluded that people couldn't trust our binaries, or that someone would target our build process somehow; we are paranoid about trust, whereas a web model is inherently untrusted and therefore safer. Why do all this?
The purpose of this would be to host an offline model: we successfully ported a 1 GB model from C++ and Python to WASM and WebGPU (you can see Claude doing so here, we livestreamed some of it[2]), but the model weights at 1 GB are too much for us to host.
Please let us know whether this is something you would contribute a background tab to on your desktop. It wouldn't impact you much, you could set how much memory to dedicate to it, and you would have the good feeling of knowing that you're helping people run a trusted offline model if they want, from their very own browser, no download required. The model we ported is fast enough for anyone to run on their own machine.
Torrents require users to download and install a torrent client! In addition, we would like to retain the possibility of pushing live updates to the latest version of a sovereign fine-tuned file; torrents don't auto-update. We want to keep improving what people get.
Finally, we would like the possibility of setting up market dynamics in the future: if you aren't currently using all your RAM, why not rent it out? This matches the p2p edge architecture we envision.
In addition, our work on WebGPU would allow you to rent out your GPU to a background tab whenever you're not using it. Why have all that silicon sit idle when you could rent it out?
You could also donate it to help fine tune our own sovereign model.
All of this will let us bootstrap to the point where we could be trusted with a download.
It is very simple. Storage/bandwidth is not expensive; residential bandwidth is. If you can convince people to install bandwidth-sharing software on their residential connections, you can then charge other people $5 to $10 per 1 GiB of bandwidth (useful mostly for botnets, getting around DDoS protections, and other reCAPTCHA tasks).
Thank you for your suggestion. Below are only our plans/intentions; we welcome feedback on them:
We are not going to do what you suggest. Instead, our approach is to use the RAM people aren't using at the moment as a fast edge cache close to their area.
We've tried this architecture and got very low latency and high bandwidth. People would not be contributing their resources to anything they don't know about.
> We have a local model we would like to distribute but don't have a good CDN.
That is not true. I am serving models off Cloudflare R2. I'm at about 1 petabyte per month of egress and I basically pay peanuts (~$200 everything included).
1 petabyte per month is 1 million downloads of a 1 GB file. We intend to scale to more than 1 million downloads per month, and we have a specific scaling architecture in mind. We're qualified to say this because we've ported a billion-parameter model to run fast in your browser, on either WebGPU or WASM. (You can see us doing it live at the YouTube link in my comment above.) There is a lot of demand for that.
The bandwidth is free on Cloudflare R2. I paid money for storage (~10 TiB of different models). If you only host a 1 GiB file there, you're only paying about $0.01 per month, I believe.
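The arithmetic roughly checks out under R2's published pricing model (egress free; standard storage around $0.015 per GB-month at the time of writing, an assumption worth re-checking against Cloudflare's pricing page):

```javascript
// Back-of-envelope check of the R2 cost claims above.
// The per-GB-month rate is an assumption, not a quoted contract price.
const STORAGE_PER_GB_MONTH = 0.015; // USD, R2 standard storage
const GB_PER_TIB = 1024;            // close enough for an estimate

const tenTiB = 10 * GB_PER_TIB * STORAGE_PER_GB_MONTH;
console.log(`10 TiB stored: ~$${tenTiB.toFixed(0)}/month`);   // ~$154
const oneGiB = 1 * STORAGE_PER_GB_MONTH;
console.log(`1 GiB stored: ~$${oneGiB.toFixed(3)}/month`);    // ~$0.015
```

So ~$154/month of storage plus free egress is consistent with the "~$200 everything included" figure, and a single 1 GiB model is indeed pennies.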