CHAP 9

  • Pipelining
    A technique where multiple instructions are overlapped in execution
  • Today, pipelining is the key implementation technique used to make fast CPUs
  • Pipeline
    Like an assembly line where each station does part of the whole assembly
  • Time per instruction on a pipelined machine (ideal case) = Time per instruction on unpipelined machine / Number of pipeline stages
  • A CPU with 5 stages can improve performance by up to 5 times, assuming that the stages are perfectly balanced and the pipeline never stalls
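  • Increasing the number of stages of the pipeline
    Decreases the average time per instruction (higher throughput); the time for a single instruction to pass through all the stages does not shrink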
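
As a rough illustration of the ideal-case formula above, the following Python sketch computes the pipelined time per instruction and the resulting speedup. The function name and the 10 ns figure are assumptions chosen for the example; real pipelines fall short of this bound because stages are rarely perfectly balanced and hazards introduce stalls.

```python
def ideal_time_per_instruction(unpipelined_time_ns, stages):
    """Ideal time per instruction on a pipelined machine with balanced stages."""
    if stages < 1:
        raise ValueError("a pipeline needs at least one stage")
    return unpipelined_time_ns / stages


# Hypothetical numbers: 10 ns per instruction unpipelined, 5 balanced stages.
unpipelined_ns = 10.0
pipelined_ns = ideal_time_per_instruction(unpipelined_ns, stages=5)
speedup = unpipelined_ns / pipelined_ns
print(f"{pipelined_ns} ns per instruction, speedup {speedup}x")
# prints: 2.0 ns per instruction, speedup 5.0x
```
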
  • Basic CPU operations
    • Fetch next instruction
    • Decode instruction
    • Execute
    • Write result
  • Early CPUs did not have pipelines due to limitations of VLSI technology
  • In a non-pipelined CPU, the processor cannot start the next instruction until the present instruction is completed
  • When the instruction is being fetched, the ALU is idle
  • When the ALU is executing code, the buses are idle
  • These idle periods reduce CPU performance
  • One instruction is executed in 4 clock cycles in a non-pipelined CPU (one cycle for each basic operation)
  • Pipelining
    Used to increase the processing power of the CPU
  • A pipelined CPU can have separate stages working in parallel (concurrency)
  • Pipeline depths of some pipelined CPUs
    • Intel 80486 (5 stages)
    • Pentium 4 (20 stages)
    • Core i7 (14 stages)
    • ARM A8 (14 stages)
  • For simplicity, we will assume a four-stage CPU with a single execution unit and no cache
  • A pipelined CPU can process several instructions at the same time, each in a different stage
  • The Intel 486 has a 5-stage pipeline
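
The overlap can be made concrete with a small sketch of the four-stage model assumed above (fetch, decode, execute, write result, one clock cycle each, no stalls). The stage letters and function names are assumptions made for this illustration. Without pipelining, n instructions need 4n cycles; with pipelining, the first instruction needs 4 cycles and each later one completes one cycle after its predecessor.

```python
STAGES = ["F", "D", "E", "W"]  # fetch, decode, execute, write result


def non_pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Each instruction runs through every stage before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """The first instruction fills the pipeline, then one finishes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def timing_diagram(n_instructions):
    """One row per instruction, one column per clock cycle."""
    total = pipelined_cycles(n_instructions)
    rows = []
    for i in range(n_instructions):
        # Instruction i enters the fetch stage in cycle i, so pad i columns.
        cells = ["."] * i + STAGES
        cells += ["."] * (total - len(cells))
        rows.append(f"I{i + 1}: " + " ".join(cells))
    return "\n".join(rows)


print(timing_diagram(3))
# I1: F D E W . .
# I2: . F D E W .
# I3: . . F D E W
print(non_pipelined_cycles(3), "cycles non-pipelined,", pipelined_cycles(3), "pipelined")
# prints: 12 cycles non-pipelined, 6 pipelined
```
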
  • Each stage of the Intel 486 pipeline takes 1 clock cycle
  • The Intel Pentium processor has 2 integer units
  • It is possible to start 2 instructions together in the Pentium processor
  • The Pentium microprocessor uses branch prediction logic to reduce the time required for a branch
  • When a branch instruction is encountered, the CPU prefetches the instructions at the branch target address
  • These instructions are loaded into the cache during branch prediction
  • Whenever a program comes to a jump instruction, there is uncertainty as to whether a jump will occur
  • The jump-if-not-zero (JNZ) instruction has two possible cycle counts, 7 or 3, depending on whether the jump is taken
  • Total cycle count for the looping program is 186 cycles
  • The same program takes less time on the Pentium
  • The Pentium can use branch prediction to reduce the time required for the jump instruction to just one cycle
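
The effect of the jump cost on a loop can be estimated with a short calculation. The sketch below is hypothetical: the iteration count and the cycles of work per iteration are made-up values, not the program behind the 186-cycle figure, and it assumes that the 7-cycle cost applies when the jump is taken and the 3-cycle cost when it falls through. The second call assumes that every jump is predicted correctly and costs one cycle.

```python
def loop_cycles(iterations, body_cycles, taken_cost, not_taken_cost):
    """Cycles for a loop that ends with a conditional jump back to the top.

    The jump is taken on every iteration except the last one.
    """
    if iterations < 1:
        return 0
    taken_jumps = iterations - 1
    return (iterations * body_cycles
            + taken_jumps * taken_cost
            + not_taken_cost)


# Hypothetical loop: 10 iterations, 4 cycles of work per iteration.
without_prediction = loop_cycles(10, 4, taken_cost=7, not_taken_cost=3)
# Assumption: with branch prediction every jump costs a single cycle.
with_prediction = loop_cycles(10, 4, taken_cost=1, not_taken_cost=1)
print(without_prediction, with_prediction)  # prints: 106 50
```
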
  • Pipelining has some problems, including three types of hazards
  • Types of hazards in pipelines
    • Structural Hazards
    • Data Hazards
    • Control Hazards
  • Structural Hazards
    Arise from resource conflicts when the hardware cannot support the instructions in overlapped execution
  • A structural hazard can be solved by duplicating the conflicting resource
  • Structural hazards can occur if both write and fetch stages need to use the buses
  • Another structural hazard occurs when the execution stage of an instruction takes longer than one clock cycle
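
The bus conflict between the write and fetch stages can be sketched for the four-stage model (fetch, decode, execute, write result). This is an illustration under stated assumptions, not a model of any particular CPU: cycles are counted from 0, instruction i is fetched in cycle i, no stalls occur, and `bus_writers` is a hypothetical set naming the instructions whose write stage needs the buses.

```python
WRITE_OFFSET = 3  # F=0, D=1, E=2, W=3 in the assumed four-stage pipeline


def bus_conflicts(n_instructions, bus_writers):
    """Return (writer, fetcher, cycle) triples where write and fetch share a cycle.

    Instruction i is fetched in cycle i (counting from 0) and writes its
    result in cycle i + 3.
    """
    conflicts = []
    for writer in sorted(bus_writers):
        if not 0 <= writer < n_instructions:
            continue  # ignore indices that are not part of the program
        cycle = writer + WRITE_OFFSET
        fetcher = cycle  # the instruction that is fetched in that same cycle
        if fetcher < n_instructions:
            conflicts.append((writer, fetcher, cycle))
    return conflicts


# Six instructions; instructions 0 and 4 write their results over the buses.
print(bus_conflicts(6, {0, 4}))  # prints: [(0, 3, 3)]
```

Instruction 4 causes no conflict here because no seventh instruction is being fetched when it reaches its write stage.
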
  • Data Hazards
    Arise when an instruction depends on the results of a previous instruction
  • In a data hazard, the second (dependent) instruction cannot execute until the result of the first instruction is written
  • The pipeline is stalled until the register is updated with the result; only then can the next instruction be executed
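
A data hazard of this kind can be detected by checking whether an instruction reads a register that the previous instruction writes. The sketch below is a minimal illustration: the `Instr` record, the generic register names, and the three-instruction program are all hypothetical, and only dependences between adjacent instructions are checked.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str       # assembly-like text, for display only
    dest: str       # register written by the instruction
    sources: tuple  # registers read by the instruction


def raw_hazards(program):
    """Return (producer, consumer, register) for adjacent read-after-write pairs."""
    hazards = []
    for i in range(1, len(program)):
        prev, cur = program[i - 1], program[i]
        if prev.dest in cur.sources:
            hazards.append((i - 1, i, prev.dest))
    return hazards


program = [
    Instr("ADD R1, R2, R3", dest="R1", sources=("R2", "R3")),
    Instr("SUB R4, R1, R5", dest="R4", sources=("R1", "R5")),  # reads R1
    Instr("AND R6, R7, R8", dest="R6", sources=("R7", "R8")),
]
print(raw_hazards(program))  # prints: [(0, 1, 'R1')]
```

In this example the SUB must wait until the ADD has written R1, so the pipeline would stall between the two instructions.
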
  • Control Hazards
    Arise from branches and other instructions that change the program counter or instruction pointer