Question

Problem Set 1
1. Recall that speedup is defined as T1/T2 where T1 is the execution time for a program before some modification is made and T2 is the execution time after the modification. A certain program contains multiply instructions as well as other types of instructions. The CPI for each instruction in the program is 4. An improvement is made to the multiply instruction such that after the improvement the CPI for the multiply instruction is reduced to 3. The total number of instructions in the program is IC. The multiply instruction accounts for N% of the total instructions in the program. The fixed cycle time for the processor on which the program runs is C nano-seconds.
a) (10) A single change that only affects the multiply instruction is made to the system.
Prior to the change the running time for the program was 2.85 seconds. After the change the running time for the program is 2.28 seconds. What is the corresponding value for N? That is, to what percent of the total number of instructions does the multiply instruction correspond? Express your answer to two decimal places.
b) (10) As a different case, suppose that the multiply instruction had a CPI of 12 before the improvement and accounts for 22% of the total number of instructions in the program. All instructions other than the multiply have a CPI of 4. What speedup factor would be provided for the program by reducing the CPI for the multiply instruction from 12 to 3? Express your answer to two decimal places.
2. A program containing twenty million instructions is executed on a uni-processor system with a fixed clock cycle time of 250 pico-seconds. All instructions are executed one at a time and each instruction requires an integral number of clock cycles. The divide instruction on this machine requires 12 clock cycles and accounts for 10% of the total number of instructions executed in the program. The other 90% of the instructions in the program require an average of 5 clock cycles per instruction.
Complete the following statements:
a) (8) The clock rate for this machine is _____ GHz.
b) (10) The total number of clock cycles consumed by the entire program is _____.
c) (10) What speedup (expressed to two decimal places) would be obtained for this program by making the divide instructions twice as fast? Speedup=
d) (10) What speedup relative to the original unmodified system would be obtained for this program by making the maximum possible improvement in only the divide instructions? Express your answer to two decimal places. Speedup=
4. Compiler A compiles a program that results in 1.0E9 instructions. The program executes in 1.1 s. A clock cycle is 1 ns.
a) (6) What is the CPI?
b) (6) Compiler B compiles the same program, producing 6.0E8 instructions with an average CPI of 1.1. What is the speedup of Compiler B over Compiler A?

Solution Preview


1)
a.
Because the modification affects only the multiply instruction, IC and C can be assumed to remain the same in both situations. To find N, write the CPU time for each situation.
CPU_time1 = IC * CPI * C = 4 * IC * C
CPU_time2 = [N*IC/100 * CPI_multiply + (1 - N/100) * IC * CPI_other] * C...
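One way to carry this setup through to a number is sketched below in Python. It is not the tutor's full solution, which is cut off in the preview. It relies on the observation that IC and C cancel when CPU_time2 is divided by CPU_time1, so only the ratio of the two running times matters. The function name `multiply_share_percent` and its parameter names are illustrative assumptions; the CPI values (4 for all instructions before, 3 for multiply after) come from the problem statement.

```python
def multiply_share_percent(t_before, t_after, cpi_other=4.0, cpi_mul_after=3.0):
    """Return N, the percentage of instructions that are multiplies.

    Assumes every instruction (multiply included) has CPI = cpi_other
    before the change, so CPU_time1 = cpi_other * IC * C.
    After the change:
        CPU_time2 = [n*cpi_mul_after + (1 - n)*cpi_other] * IC * C
    where n = N/100. Dividing the two cancels IC and C:
        t_after / t_before = 1 - n*(cpi_other - cpi_mul_after)/cpi_other
    """
    ratio = t_after / t_before
    n = cpi_other * (1.0 - ratio) / (cpi_other - cpi_mul_after)
    return 100.0 * n


n_percent = multiply_share_percent(2.85, 2.28)
print(f"N = {n_percent:.2f}%")  # N = 80.00%
```

As a check, substituting N = 80 back into the expressions gives an average CPI of 0.8*3 + 0.2*4 = 3.2 after the change, and 2.85 * 3.2/4 = 2.28 seconds, which matches the stated running time.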
