Hacker News | jackson1372's comments

It's an open secret that Mistral finetunes on GPT outputs


I’ve heard this before, but can’t speak to the veracity of it. Does anyone have sources?


Training large language models takes an enormous amount of data — ideally several multiples of Wikipedia plus public-domain content. You also want high-quality data, so if you're going to pull in Reddit or something, you need some way to separate factually accurate comments from garbage trolls.

Using output from ChatGPT is one way to generate a large volume of high-quality data. But this is expressly forbidden by OpenAI's terms of service, so you can't advertise that you're doing it. OpenAI would be on shaky ground if it sued, though, because so much of its own training was done on copyrighted material it hadn't gotten permission to use in the first place.


I don't have sources, it's just a rumor.


claude.ai


The article doesn't mention two great features on iOS:

- Move the cursor by tap-and-hold on the space bar

- Tap-tap for word selection and tap-tap-tap for sentence selection

I use these two constantly.


Claude Instant 1.1 is a better comparison for the price/performance of GPT-3.5


GRE writing: 5 for Claude vs 4 for GPT-4

Bar exam: 76.5 for Claude vs 75.7 for GPT-4


See the pricing PDF[1] and API docs[2], but TL;DR:

- Price per token doesn't change compared to regular models

- Existing API users have access now by setting the `model` param to "claude-v1-100k" or "claude-instant-v1-100k"

- New customers can join the waitlist at anthropic.com/product

[1]: https://cdn2.assets-servd.host/anthropic-website/production/...

[2]: https://console.anthropic.com/docs/api/reference#parameters
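For illustration, here's a sketch of what the request body might look like — field names follow the API docs linked above, but treat this as an assumption rather than a verified snippet; the prompt text is just a placeholder:

```python
import json

# Sketch of a request body for Anthropic's completions endpoint.
# "model" is the only change needed to opt into the 100k context
# window; the prompt itself is a placeholder.
payload = {
    "model": "claude-v1-100k",  # or "claude-instant-v1-100k"
    "prompt": "\n\nHuman: Summarize this document...\n\nAssistant:",
    "max_tokens_to_sample": 256,
}

body = json.dumps(payload)
print(body)
```

The point being: it's a one-line change for existing integrations, not a separate API.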


ANTHROPIC ・ Frontend/UX Engineer ・ SF-based ・ Remote-friendly ・ https://www.anthropic.com

Rapidly build prototype interfaces for large language models. Align AI with human values.

Tech stack: Next.js, Svelte, Python

Apply: https://jobs.lever.co/Anthropic/be5f1be0-e0c8-4a43-934f-cea1...


Isn't the explanation for this that the world actually does work in some way or other, rather than being infinite chaos? So if you keep throwing parameters at a problem, you will eventually stumble upon the "real" structure — but with no guarantee of when that occurs, or with which parameters.


Well, the thing is that when one says "the world has structure", one is saying that there is a variety of structures "out there", in the world.

But that doesn't mean there's a single structure determined by a single set of parameters. Quite possibly there are numerous structures with incompatible parameterizations.

Moreover, common AI datasets share parameters in a fashion that isn't always obvious: most images on the web are photos taken by human photographers who tend to center their subject, effectively giving them different parameters than, say, security camera footage. I.e., "normal data" may not mean what we imagine.


The reason you want to over-parameterize your model is that it protects you from "bad bounce" learning trajectories. You effectively spread out your overfitting risk until it's pretty close to 0.

Or at least that's the way I like to think of it.

The next step is to better compress the resulting model in a simpler, less computationally costly network.


Are you suggesting double descent is sort of about local minima? Like, if you extended the risk-vs-parameterization curve out, you'd start to see overfitting again?


If you read between the lines, it's clear that Spotify was able to make a normal watch app. But they wanted to make one that had special functionality not yet allowed by Apple, for any app, not just Spotify.


There is a fine line between reading between the lines and fabricating things. Can you point to what exactly in that website substantiates your implication?


Their timeline is a bit unclear. They claim that in 2015 and 2016, they were denied outright. In 2017 they say that "Apple continues to make it challenging for us to deliver a workable streaming solution for the Apple Watch". Then in 2018 they say that "Apple finally allows enhanced functionality for the Spotify app on the Apple Watch".


Until watchOS 5, 3rd-party apps couldn't do LTE streaming. That said, LTE was new in watchOS 4; the talk was that Apple wanted to initially restrict it for battery-life reasons.

Spotify could have made a watch app that provided controls and had offline songs. They didn't, and kept shutting down apps that did, or buying them and shutting them down. This was a major sore point against Spotify on the Apple Watch subreddit. Even now they're dragging their feet, IIRC.


Their point is that Apple's own app could do that, and it was unfair to block other apps from doing the same.


The watch is a new, battery-constrained device. It doesn't seem unreasonable to limit 3rd-party apps for a bit while sorting out how to work with the size constraints.

I'm guessing Apple will add LTE support for 3rd-party apps in watchOS 6. Meanwhile, 3rd-party apps have had the ability to offer offline support since June 2018. Most audio apps have added this.

Spotify... has not. What's the holdup? Apple Watch users are pretty frustrated with Spotify: if all the other audio apps can do it, why haven't they?

(LTE is an edge case: few watches have LTE, and even fewer have an active LTE plan. Offline audio is the main use case Spotify still isn't handling.)


Ex-Spotifier here. Most likely they wanted to use functionality available to Apple Music but not to any other app.

