Unfortunately although SMC could be shorter, it's even slower since the pipeline... | Hacker News


userbinator on Sept 28, 2014, on: Counting bytes fast

Unfortunately, although SMC (self-modifying code) could be shorter, it's even slower, since the pipeline gets flushed every time a write occurs to locations within the current fetch window (not sure exactly how long that is, but it's not small).

pmalynin on Sept 28, 2014

Yes, and the instruction cache becomes stale as well. I guess one way to avoid that is to have 16 code blocks back-to-back and then do something like a jmp into the section that contains the right register. JMPs are pretty cheap, and the target is likely to be in cache anyway.
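
A rough sketch of the 16-block idea in C, as I understand it, not code from the article. The assumptions are mine: 16 local counters stand in for the 16 registers, the low nibble of each byte picks the block, and `count_low_nibbles`, `BLOCK` and `STORE` are made-up names. A dense `switch` like this is usually lowered to an indirect jump through a table, though the compiler is free to do something else.

```c
#include <stddef.h>
#include <stdint.h>

/* One "code block" per counter: bump counter n, then leave the switch. */
#define BLOCK(n) case n: c##n++; break;
/* Copy local counter n out to the result array. */
#define STORE(n) counts[n] = c##n;

/* Hypothetical helper: counts how often each low-nibble value (0..15)
 * occurs in buf. Nothing is patched at run time; the selector only
 * chooses which of the 16 fixed blocks to jump into. */
void count_low_nibbles(const uint8_t *buf, size_t len, uint64_t counts[16])
{
    /* Locals the compiler may keep in registers (not guaranteed). */
    uint64_t c0 = 0, c1 = 0, c2 = 0, c3 = 0, c4 = 0, c5 = 0, c6 = 0, c7 = 0;
    uint64_t c8 = 0, c9 = 0, c10 = 0, c11 = 0, c12 = 0, c13 = 0, c14 = 0, c15 = 0;

    for (size_t i = 0; i < len; i++) {
        /* The mask keeps the selector in 0..15, so every value has a case. */
        switch (buf[i] & 0x0F) {
            BLOCK(0)  BLOCK(1)  BLOCK(2)  BLOCK(3)
            BLOCK(4)  BLOCK(5)  BLOCK(6)  BLOCK(7)
            BLOCK(8)  BLOCK(9)  BLOCK(10) BLOCK(11)
            BLOCK(12) BLOCK(13) BLOCK(14) BLOCK(15)
        }
    }

    STORE(0)  STORE(1)  STORE(2)  STORE(3)
    STORE(4)  STORE(5)  STORE(6)  STORE(7)
    STORE(8)  STORE(9)  STORE(10) STORE(11)
    STORE(12) STORE(13) STORE(14) STORE(15)
}
```

The point is only that the code stays read-only, so there are no writes into the fetch window. The cost moves to an indirect branch per byte, which may or may not predict well depending on the data.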
