> "You lost me there- if we’re talking about a fixed number of multipliers reduced from double to single precision, and the double multipliers are 32% of the area, then because double is 4x the area, as you pointed out, the savings would be 24% not 16%, right?"
No.
The design with only DP multipliers uses them either as N DP multipliers or as 2N SP multipliers. If DP support is removed completely, an otherwise unchanged GPU is left with 2N SP multipliers, which have half of the original area, not a quarter.
Therefore, if the DP multipliers occupy P% of the area, removing DP support completely saves (P/2)% of the area, while reducing the DP throughput to 1/4 of the SP throughput saves (P/4)% of the area, because half of the DP multipliers are replaced by twice as many SP multipliers to keep the SP throughput unchanged.
Reducing the DP throughput to less than 1/4 of the SP throughput produces savings intermediate between (P/4)% and (P/2)%.
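The accounting above can be sketched as a small helper (a sketch under the stated assumptions: one DP multiplier occupies the area of 4 SP multipliers and can be replaced by 2 standalone SP multipliers with the same SP throughput; the function name and numbers are mine, for illustration):

```python
# Area saved by swapping dual-function DP/SP multipliers for pairs of
# plain SP multipliers, keeping SP throughput constant. Assumption:
# each replaced DP multiplier frees half of its own area.
def area_saved(p_dp, frac_replaced):
    """p_dp: fraction of the die occupied by DP multipliers (e.g. 0.32).
    frac_replaced: fraction of DP multipliers replaced by 2 SP each."""
    return p_dp * frac_replaced / 2

P = 0.32
print(area_saved(P, 1.0))   # remove DP entirely: saves P/2 = 0.16 of the die
print(area_saved(P, 0.5))   # DP at 1/4 of SP rate: saves P/4 = 0.08
```

Replacing all DP multipliers gives the (P/2)% figure; replacing half of them halves the DP throughput again (to 1/4 of SP) and gives (P/4)%.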
Also, a 64-bit multiplier (actually the DP multiplier is only a 53-bit multiplier, for the significand) is significantly less than 64 times larger than an adder, because the adders that compose the multiplier are much simpler than a complete adder: the chain of operations is organized so that there are far fewer modulo-2 sums and carry propagations than when naively adding the 64 partial products with complete adders.
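The simpler-than-a-complete-adder structure can be illustrated with a carry-save step (a 3:2 compressor), the building block that multipliers chain over their partial products; this is an illustrative sketch in integer arithmetic, not a description of any particular GPU's circuit:

```python
# A carry-save adder compresses three operands into two (sum, carry)
# using only per-bit logic, with no carry propagation. A multiplier
# chains such compressors over all partial products and performs only
# one full carry-propagating addition at the very end.
def carry_save_add(a, b, c):
    s = a ^ b ^ c                         # per-bit sum, carries ignored
    carry = (a & b) | (a & c) | (b & c)   # per-bit majority -> carry bits
    return s, carry << 1

a, b, c = 13, 7, 9
s, carry = carry_save_add(a, b, c)
assert s + carry == a + b + c   # one final real addition finishes the job
```

Because each compressor stage has constant depth regardless of operand width, the total area grows far more slowly than "one complete adder per partial product" would suggest.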
I have already said that there are ways to use the single-precision consumer GPUs: either by rewriting the algorithms to use a carefully chosen mix of single-precision and double-precision operations, or by representing numbers by multiple single-precision values (which already reduces the speed at least 10 times, making only the most expensive consumer GPUs faster than typical CPUs, but which is still faster than the native 1/32 speed).
However, using such methods may require 10 or even 100 times more effort for writing a program than simply writing it in double precision for CPUs, so this is seldom worthwhile.
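The "multiple single-precision values" representation is typically built from error-free transformations such as Knuth's TwoSum, which recovers the rounding error of an SP addition exactly; a minimal sketch with NumPy float32 (the setup is mine, for illustration):

```python
import numpy as np

# Knuth's TwoSum: returns the rounded float32 sum of a and b together
# with the exact float32 error term, so a pair (s, e) carries roughly
# double-single precision. Chains of such pairs give the multi-word
# SP arithmetic described above, at the cost of many extra operations.
def two_sum(a, b):
    s = np.float32(a + b)
    bb = np.float32(s - a)
    e = np.float32((a - (s - bb)) + (b - bb))
    return s, e

a = np.float32(1.0)
b = np.float32(1e-8)        # vanishes in a plain float32 addition
s, e = two_sum(a, b)
print(s, e)                 # s == 1.0, e recovers the lost ~1e-8
```

Every logical addition costs six real ones here, which is part of why the slowdown (and the programming effort) is so large.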
For almost any problem in engineering design or physical-systems modeling and simulation, double-precision is mandatory.
Single-precision numbers are perfectly adequate for representing all input and output values, because their precision and range match those available in digital-to-analog and analog-to-digital converters.
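A quick check of that boundary claim: float32 has a 24-bit significand, so every code of even a 24-bit converter is exactly representable, while the first integer beyond 2^24 is not (the round-trip helper below is mine, for illustration):

```python
import struct

# Round-trip a Python float through IEEE 754 binary32 to see what a
# float32 actually stores.
def to_f32(x):
    return struct.unpack('f', struct.pack('f', x))[0]

print(to_f32(2.0**24))        # 16777216.0: the largest 24-bit-exact range
print(to_f32(2.0**24 + 1))    # rounds back to 16777216.0: precision ends here
```

So for raw converter samples SP loses nothing; the trouble described next arises only in the intermediate arithmetic.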
On the other hand, most intermediate values in the computations must be in double precision. Not only is the loss of precision a problem; the range of representable values is also a problem. With single precision, there are many problems where overflows or underflows are guaranteed to happen, while no such thing happens in double precision.
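A minimal demonstration of a guaranteed SP overflow, using the naive 2-norm (the magnitudes are my illustration; any quantity whose square exceeds float32's ~3.4e38 maximum behaves the same):

```python
import numpy as np

# The naive 2-norm squares its inputs; (1e20)**2 = 1e40 overflows
# float32 even though both the inputs and the final result fit easily.
x64 = np.array([1e20, 1e20], dtype=np.float64)
x32 = x64.astype(np.float32)

norm64 = np.sqrt(np.sum(x64 * x64))       # fine in DP: ~1.414e20
with np.errstate(over='ignore'):
    norm32 = np.sqrt(np.sum(x32 * x32))   # inf in SP: the squares overflow
print(norm64, norm32)
```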
In theory, it is possible to avoid overflows and underflows by using various scale factors, adjusted to prevent the appearance of out-of-range results.
However, this is an idiotic method, because floating-point numbers were invented precisely to avoid the tedious operations with scale factors that are needed when using fixed-point numbers. If you have to manage scale factors in software, then you might as well use only integer operations, as floating-point numbers bring no simplification in that case.
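For contrast, here is what that scale-factor bookkeeping looks like for the overflowing norm case (a sketch; the function is hypothetical). It works, but every operation now drags its scale factor along, which is exactly the chore floating point was meant to eliminate:

```python
import numpy as np

# Manually scaled 2-norm: divide by the largest magnitude first so the
# squares cannot overflow, then multiply the scale back at the end.
def scaled_norm(x):
    m = np.max(np.abs(x))            # choose a scale factor
    if m == 0.0:
        return np.float32(0.0)
    y = x / m                        # now |y| <= 1 everywhere
    return m * np.sqrt(np.sum(y * y))

x = np.array([1e20, 1e20], dtype=np.float32)
print(scaled_norm(x))                # ~1.414e20, no overflow in SP
```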
There are many other such pieces of advice on how to use SP instead of DP, and they are frequently inapplicable.
For example, there is the theory that one should first solve a system of equations approximately in SP, and then refine the approximate solution iteratively in DP to get the correct solution.
There are some very simple, mostly linear problems where this method works. However, many interesting engineering problems, e.g. all simulations of electronic circuits, have systems of equations obtained by discretizing stiff non-linear differential equations. Trying to solve such systems approximately in SP usually either fails to converge or produces solutions which, when refined in DP, converge towards different solutions than those that would have been obtained if the system had been solved in DP from the beginning.
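The refinement scheme in question can be sketched for a deliberately well-conditioned linear system, the benign case where it does work (the matrix, sizes and tolerances here are my illustration, not from the text; stiff non-linear systems are where it breaks down):

```python
import numpy as np

# Mixed-precision iterative refinement: solve in float32, then repeatedly
# correct using the residual computed in float64. A real implementation
# would factor A32 once and reuse the factors; solve() is used here for
# brevity.
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + 50 * np.eye(n)   # well conditioned
x_true = rng.standard_normal(n)
b = A @ x_true

A32 = A.astype(np.float32)
x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
for _ in range(5):
    r = b - A @ x                                   # residual in DP
    x += np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)

print(np.max(np.abs(x - x_true)))                   # near DP accuracy
```

Each sweep multiplies the error by roughly the condition number times SP rounding error, which is why the method only converges when the system is mild to begin with.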
In conclusion, even when single precision can be used successfully, very rarely can that be done by just changing the variable types in a program. In most cases, a lot of work is necessary to ensure an acceptable precision of the results.
In most cases, I do not see any advantage in doing extra work and paying for GPUs, just because the GPU maker is not willing to sell me a better GPU at a price difference proportional to the difference in manufacturing cost.
Instead, I prefer to pay more for a faster CPU and skip the unnecessary work required for using GPUs.
I still have a few GPUs from the old days, when DP computation on GPUs was cheap (around $500 per double-precision teraflop/s), but they have become increasingly obsolete in comparison with modern CPUs and GPUs; no replacement for them has appeared in recent years, and no similar GPU models are expected in the future.
One DP multiplier has approximately the area of 4 SP multipliers, therefore twice the area of 2 SP multipliers.
One DP multiplier can, by reconfiguring its internal and external connections, function as either 1 DP multiplier or 2 SP multipliers. Therefore a GPU using only DP multipliers that does N DP multiplications per clock cycle will also do 2N SP multiplications per clock cycle, like all modern CPUs.
For example, a Ryzen 9 5900X CPU does either 192 SP multiplications per cycle or 96 DP multiplications per cycle, and an old AMD Hawaii GPU does either 2560 SP multiplications per clock cycle or 1280 DP multiplications per clock cycle.
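Those figures can be sanity-checked from the assumed per-part specs (my assumptions: the 5900X has 12 cores with 2 256-bit FMA pipes each; Hawaii has 2560 SP lanes running DP at a 1/2 rate):

```python
# Ryzen 9 5900X: each 256-bit FMA pipe holds 8 32-bit or 4 64-bit lanes.
cores, pipes, vec_bits = 12, 2, 256
sp = cores * pipes * vec_bits // 32   # SP multiplications per cycle
dp = cores * pipes * vec_bits // 64   # DP multiplications per cycle
print(sp, dp)                          # 192 96: the 2:1 ratio in the text

# AMD Hawaii: 2560 SP lanes, DP at half the SP rate.
hawaii_sp = 2560
print(hawaii_sp, hawaii_sp // 2)       # 2560 1280
```

In both cases the DP:SP ratio is 1:2, matching the dual-function-multiplier design described above.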
When you do not want DP multiplications, the dual-function DP/SP multiplier must be replaced by two SP multipliers, to keep the same SP throughput, so that the only difference between the two designs is the possibility or impossibility of doing DP operations. In that case the two SP multipliers together have half of the area needed by a DP multiplier with the same SP throughput.
If you compared two designs having different SP throughputs, there would be differences between them other than the support for DP operations, so the comparison would be meaningless.
When all DP multipliers are replaced by 2 SP multipliers each, you save half of the area previously occupied by multipliers, and the DP throughput becomes 0.
When only some of the DP multipliers are replaced by 2 SP multipliers each, the SP throughput remains the same, but the DP throughput is reduced. In that case the area saved is less than half of the original multiplier area and proportional to the number of DP multipliers replaced.
I understand your assumptions now. You’re saying SP mult is by convention twice the flops for half the area, and I was talking about same flops for one fourth the area. It’s a choice. There might be a current convention, but regardless, the sum total is a factor of 4 cost for each double precision mult op compared to single. Frame it how you like, divvy the cost up different ways, the DP mult cost is still 4x SP. Aaaanyway... that does answer my question & confirm what I thought, thank you for explaining and clarifying the convention.