
The PUSHF*/POPF* instructions save and restore the flags register, so you can keep a comparison result while reusing the registers that held the original comparison arguments. They're well-pipelined, integrated with the stack engine and very fast on modern CPUs, and I've definitely seen the compiler emit them (more on i686 than x86_64, to be fair).

I won't speak to the quality of the generated code otherwise, but this bit of evidence by itself doesn't seem persuasive.

(edit: pbsd is right: per Agner Fog, these guys are slow. It's likely that the L/SAHF trick is the one I was remembering.)



Where are you getting that from? `pushfd` is OK-ish, but `popfd` is microcoded, requires 9 uops, and can only be issued once every ~20 cycles. That's not what I would call well-pipelined.

You could probably achieve the same result (assuming you want to avoid `adc` instructions, for some reason) by using `setc r8b` to save the carry, plus `shr r8b, 1` to get it back.
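To make the save/restore idea concrete, here is a rough sketch in Rust inline assembly. The function name `add128_saved_carry`, the `cmp` used as a stand-in for flag-clobbering work, and the final `adc` that consumes the restored carry are all assumptions for illustration; this is not the code from the gist. Non-x86_64 targets get a plain fallback that computes the same result.

```rust
/// Hypothetical helper: adds two 128-bit values given as (low, high) u64 limbs,
/// wrapping on overflow. The carry is parked in a byte register with `setc`
/// and moved back into CF with `shr` instead of using pushf/popf.
#[cfg(target_arch = "x86_64")]
fn add128_saved_carry(a_lo: u64, a_hi: u64, b_lo: u64, b_hi: u64) -> (u64, u64) {
    use std::arch::asm;
    let lo: u64;
    let hi: u64;
    unsafe {
        asm!(
            "add {lo}, {b_lo}",   // add low limbs; CF = carry out
            "setc {saved}",       // saved = CF (0 or 1)
            "cmp {b_lo}, {b_lo}", // stand-in for other work: forces CF = 0
            "shr {saved}, 1",     // bit 0 of saved is shifted out into CF
            "adc {hi}, {b_hi}",   // add high limbs plus the restored carry
            lo = inout(reg) a_lo => lo,
            hi = inout(reg) a_hi => hi,
            b_lo = in(reg) b_lo,
            b_hi = in(reg) b_hi,
            saved = out(reg_byte) _,
            options(nomem, nostack),
        );
    }
    (lo, hi)
}

/// Portable fallback with the same result: the carry lives in a bool.
#[cfg(not(target_arch = "x86_64"))]
fn add128_saved_carry(a_lo: u64, a_hi: u64, b_lo: u64, b_hi: u64) -> (u64, u64) {
    let (lo, carry) = a_lo.overflowing_add(b_lo);
    let hi = a_hi.wrapping_add(b_hi).wrapping_add(carry as u64);
    (lo, hi)
}
```

Note that `shr` overwrites the other arithmetic flags too, so this only preserves the carry, which is all a following `adc` needs.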


If you use `-C target-cpu=native` (Rust's equivalent of `-march=native`) you get code that uses `mulxq` in order to avoid `pushf`/`popf`. https://gist.github.com/Vurich/5cb83c773e90fc7a463ccb58e1dad...


It's still doing it, but now it uses `lahf` + `sahf` instead to (re)store the flags. These are certainly better than `pushf` + `popf`, but they cannot be used in general code because some early x86_64 chips did not implement them in 64-bit mode.
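If you want to check for this at runtime, CPUID reports it: leaf 0x8000_0001 sets ECX bit 0 when LAHF/SAHF are available in 64-bit mode. A minimal sketch, where the function names are made up for illustration and other architectures simply return false:

```rust
/// Hypothetical helper: ECX bit 0 of CPUID leaf 0x8000_0001 is the
/// "LAHF/SAHF available in 64-bit mode" flag.
fn ecx_reports_lahf_sahf(ecx: u32) -> bool {
    ecx & 1 != 0
}

/// Queries CPUID on x86_64. Assumes nothing about the CPU beyond CPUID itself.
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)]
fn lahf_sahf_in_64bit_mode() -> bool {
    use std::arch::x86_64::__cpuid;
    // Leaf 0x8000_0000 returns the highest supported extended leaf in EAX.
    let max_ext = unsafe { __cpuid(0x8000_0000) }.eax;
    max_ext >= 0x8000_0001 && ecx_reports_lahf_sahf(unsafe { __cpuid(0x8000_0001) }.ecx)
}

/// The instructions do not exist on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn lahf_sahf_in_64bit_mode() -> bool {
    false
}
```

A compiler can't rely on a runtime check like this for ordinary codegen, which is why it only emits `lahf`/`sahf` when the target CPU is known to have them.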


If your general code is only intended to run on newer versions of Windows, you can: 64-bit Windows has required LAHF/SAHF support since 8.1.



