Homework 2 for CS 222
Due May 1, 2007

1. Consider the following codes.

I)
Loop: L.D    f0, 0(r1)
      ADD.D  f4, f0, f2
      S.D    f4, 0(r1)
      ADDI   r1, r1, #-8
      BNE    r1, r2, Loop

II)
      ADDI   r1, r1, #-16
      L.D    f0, 16(r1)
      ADD.D  f4, f0, f2
      L.D    f0, 8(r1)
Loop: S.D    f4, 16(r1)
      ADD.D  f4, f0, f2
      L.D    f0, 0(r1)
      ADDI   r1, r1, #-8
      BNE    r1, r2, Loop
      S.D    f4, 8(r1)
      ADD.D  f4, f0, f2
      S.D    f4, 0(r1)

a) Assume that r1 = r2 + 800. Demonstrate that the two pieces of code I) and II) are functionally equivalent.

b) Assume the standard latencies that we used for the 5-stage pipeline. Calculate how many cycles it takes to execute each of the codes above. Assume that branches are predicted correctly and that there is no delay in fetching the instructions of the next iteration. This technique for improving performance is called software pipelining.

2. Consider the following loop.

Loop: L.D    f0, 0(r1)
      ADD.D  f6, f0, f2
      L.D    f4, 0(r2)
      ADD.D  f8, f6, f4
      S.D    f8, 0(r1)
      ADDI   r1, r2, #-8
      ADDI   r2, r2, #-8
      BNEZ   r1, Loop

a) Schedule the code, assuming there is a one-cycle branch-delay slot.

b) How many instructions are there if we unroll the loop n times?

c) Apply software pipelining to the above code by taking one instruction per iteration (as done in Q1 above). Is there any hazard? Suggest a method to fix the problem.

3a. The formula for misses per instruction on page C5 is written first in terms of three factors: miss rate, memory accesses, and instruction count. Each of these factors represents actual events. What is different about writing misses per instruction as miss rate times the factor memory accesses per instruction?

3b. Speculative processors will fetch instructions that do not commit. The formula for misses per instruction on page C5 refers to misses per instruction on the execution path, that is, only the instructions that must actually be executed to carry out the program.
Convert the formula for misses per instruction on page C5 into one that uses only miss rate, references per instruction fetched, and fraction of fetched instructions that commit. Why rely upon these factors rather than those in the formula on page C5?
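
As a reference point for Problem 3, the two ways of writing misses per instruction that 3a describes can be sketched in Python. This is only an illustration of the arithmetic: the function names and all sample numbers below are hypothetical and are not taken from the textbook or from any measurement. It does not cover the speculative case asked about in 3b.

```python
def misses_per_instruction_from_counts(miss_rate, memory_accesses, instruction_count):
    """Three-factor form: miss rate * memory accesses / instruction count."""
    return miss_rate * memory_accesses / instruction_count


def misses_per_instruction_from_ratio(miss_rate, accesses_per_instruction):
    """Two-factor form: miss rate * (memory accesses per instruction)."""
    return miss_rate * accesses_per_instruction


# Hypothetical values, chosen only to make the arithmetic easy to follow.
MISS_RATE = 0.02               # misses per memory access (assumed)
MEMORY_ACCESSES = 1_500_000    # total memory accesses (assumed)
INSTRUCTION_COUNT = 1_000_000  # instructions executed (assumed)

three_factor = misses_per_instruction_from_counts(
    MISS_RATE, MEMORY_ACCESSES, INSTRUCTION_COUNT
)
two_factor = misses_per_instruction_from_ratio(
    MISS_RATE, MEMORY_ACCESSES / INSTRUCTION_COUNT
)

# Both forms give the same value up to floating-point rounding.
print(three_factor, two_factor)
```

With these assumed inputs both forms come out to about 0.03 misses per instruction. The only difference is whether memory accesses and instruction count enter as two separate counts or as a single ratio.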