Hacker News | jackson1372's comments

It's an open secret that Mistral finetunes on GPT outputs


I’ve heard this before, but can’t speak to the veracity of it. Does anyone have sources?


Training large language models takes an enormous amount of data — ideally several multiples of Wikipedia plus public-domain content. You also want high-quality data, so if you're going to pull in Reddit or something, you need some way to separate factually accurate comments from garbage trolls.

Using output from ChatGPT is one way to generate a large volume of high-quality data. But this is expressly forbidden by OpenAI's terms of service, so you can't advertise that you're doing it. OpenAI would be on shaky ground if it sued, though, because so much of its own training was done on copyrighted material it hadn't gotten permission to use in the first place.


I don't have sources, it's just a rumor.


claude.ai


The article doesn't mention two great features on iOS:

- Move the cursor by tap-and-hold on the space bar

- Tap-tap for word selection and tap-tap-tap for sentence selection

I use these two constantly.


Claude Instant 1.1 is a better comparison for the price/performance of GPT-3.5


GRE writing: 5 for Claude vs 4 for GPT-4

Bar exam: 76.5 for Claude vs 75.7 for GPT-4


See the pricing PDF[1] and API docs[2], but TL;DR:

- Price per token doesn't change compared to regular models

- Existing API users have access now by setting the `model` param to "claude-v1-100k" or "claude-instant-v1-100k"

- New customers can join the waitlist at anthropic.com/product

[1]: https://cdn2.assets-servd.host/anthropic-website/production/...

[2]: https://console.anthropic.com/docs/api/reference#parameters
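For illustration, here's a sketch of what the request body might look like — field names follow the API docs linked above, but treat this as an assumption rather than a verified snippet; the prompt text is just a placeholder:

```python
import json

# Sketch of a request body for Anthropic's completions endpoint.
# "model" is the only change needed to opt into the 100k context
# window; the prompt itself is a placeholder.
payload = {
    "model": "claude-v1-100k",  # or "claude-instant-v1-100k"
    "prompt": "\n\nHuman: Summarize this document...\n\nAssistant:",
    "max_tokens_to_sample": 256,
}

body = json.dumps(payload)
print(body)
```

The point being: it's a one-line change for existing integrations, not a separate API.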


ANTHROPIC ・ Frontend/UX Engineer ・ SF-based ・ Remote-friendly ・ https://www.anthropic.com

Rapidly build prototype interfaces for large language models. Align AI with human values.

Tech stack: Next.js, Svelte, Python

Apply: https://jobs.lever.co/Anthropic/be5f1be0-e0c8-4a43-934f-cea1...


Isn't the explanation for this that the world actually does work in some way or other, rather than being infinite chaos? So if you keep throwing parameters at a problem, you will eventually stumble upon the "real" structure — but with no guarantee of when that occurs, or with which parameters.


Well, the thing is that when one says "the world has structure", one is saying that there is a variety of structures "out there", in the world.

But that doesn't mean there's a single structure determined by a single set of parameters. Quite possibly there are numerous structures with incompatible parameterizations.

Moreover, common AI datasets share parameters in a fashion that isn't always obvious: most images on the web are photos taken by human photographers who tend to center their subject, effectively giving them different parameters than, say, security camera footage. I.e., "normal data" may not mean what we imagine.


The reason you want to over-parameterize your model is that it protects you from "bad bounce" learning trajectories. You effectively spread out your overfitting risk until it's pretty close to 0.

Or at least that's the way I like to think of it.

The next step is to better compress the resulting model in a simpler, less computationally costly network.


Are you suggesting double descent is sort of about local minima? Like, if you extended the risk-vs-parameterization curve out, you'd start to see overfitting again?


If you read between the lines, it's clear that Spotify was able to make a normal watch app. But they wanted to make one that had special functionality not yet allowed by Apple, for any app, not just Spotify.


There is a fine line between reading between the lines and fabricating things. Can you point to what exactly in that website substantiates your implication?


Their timeline is a bit unclear. They claim that in 2015 and 2016, they were denied outright. In 2017 they say that "Apple continues to make it challenging for us to deliver a workable streaming solution for the Apple Watch". Then in 2018 they say that "Apple finally allows enhanced functionality for the Spotify app on the Apple Watch".


Until watchOS 5, 3rd-party apps couldn't do LTE streaming. That said, LTE was new in watchOS 4; the talk was that Apple wanted to initially restrict it for battery-life reasons.

Spotify could have made a watch app that provided controls and had offline songs. They didn't, and kept shutting down apps that did, or buying them and shutting them down. This was a major sore point against Spotify on the Apple Watch subreddit. Even now they're dragging their feet, IIRC.


Their point is that Apple's own app could do that, and it was unfair to block other apps from doing the same.


The watch is a new, battery-constrained device. It doesn't seem unreasonable to limit 3rd-party apps for a bit while sorting out how to work with the size constraints.

I'm guessing Apple will add LTE support for 3rd-party apps in watchOS 6. Meanwhile, 3rd-party apps have had the ability to offer offline support since June 2018. Most audio apps have added this.

Spotify... has not. What's the holdup? Apple Watch users are pretty frustrated with Spotify: if all the other audio apps can do it, why haven't they?

(LTE is an edge case: few watches have LTE, and even fewer have an active LTE plan. Offline audio is the main use case Spotify still isn't handling.)


Ex-Spotifier here. Most likely they wanted to use functionality available to Apple Music but not to any other app.

