

RC_ITR on July 12, 2023 | on: GPT-4 details leaked?

I mean, sure you can work around it, but from your own link:

> since the time and memory complexity of self-attention are quadratic in sequence length

why_only_15 on July 12, 2023

Except in practice this is not true, and hasn't been for more than a year. It's not just a workaround either -- FlashAttention is both faster at runtime and uses less memory.
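To make the memory half of that concrete, here is a minimal sketch of the streaming ("online") softmax idea that FlashAttention builds on. It is not the actual FlashAttention kernel: it is pure Python, handles a single query, and the names `naive_attention`, `streaming_attention`, and `block_size` are made up for illustration. The point is only that the full row of scores never has to be stored. The number of multiply-adds is still quadratic once you do this for every query, so the savings come from not materializing the score matrix.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def naive_attention(q, keys, values):
    """Single-query softmax attention that stores every score first."""
    scores = [_dot(q, k) for k in keys]           # O(n) scratch per query
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim)]


def streaming_attention(q, keys, values, block_size=2):
    """Same result, visiting keys/values one block at a time.

    Only a running max, a running normalizer, and one output
    accumulator are kept, so scratch space is O(block_size).
    """
    dim = len(values[0])
    running_max = float("-inf")
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [_dot(q, k) for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old max.
        # On the first block this is exp(-inf) == 0.0.
        scale = math.exp(running_max - new_max)
        running_sum *= scale
        acc = [a * scale for a in acc]
        for s, v in zip(scores, v_blk):
            w = math.exp(s - new_max)
            running_sum += w
            for j in range(dim):
                acc[j] += w * v[j]
        running_max = new_max
    return [a / running_sum for a in acc]
```

Both functions should agree up to floating-point rounding for any `block_size`. The real thing applies the same rescaling trick to tiles of queries and keys held in on-chip memory, which is where the runtime win comes from.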
