A few important characteristics and features of the pipeline concept:

– A pipelined processor works on more than one instruction at a time; it does not wait for one instruction to complete before starting the next. The fetch, decode, execute, and write stages operate in parallel

– As soon as one stage completes, it passes its result to the next stage and begins working on another instruction
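The overlap described above can be visualised with a small sketch (assuming an idealised 4-stage pipeline where every stage takes exactly one cycle; the stage names and layout are illustrative, not from any specific processor):

```python
STAGES = ["F", "D", "E", "W"]  # fetch, decode, execute, write

def pipeline_diagram(n_instructions):
    """Return one row per instruction showing which stage it occupies
    in each clock cycle; instruction i enters the pipeline at cycle i."""
    total_cycles = len(STAGES) + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = ["."] * total_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name  # stage s of instruction i runs in cycle i+s
        rows.append("".join(row))
    return rows

for row in pipeline_diagram(3):
    print(row)
# FDEW..
# .FDEW.
# ..FDEW
```

Reading the rows top to bottom, each column is one clock cycle: while instruction 0 is in decode, instruction 1 is already being fetched.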

– The performance of a pipelined system is limited by the time it takes the slowest single stage to complete, not by the total time for all stages as in non-pipelined designs

– Each stage takes one clock cycle, so the processor can accept one new instruction per clock. Pipelining does not improve the latency of individual instructions (each instruction still requires the same amount of time to complete), but it does improve overall throughput
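The latency/throughput distinction reduces to simple arithmetic. A minimal sketch, assuming an idealised pipeline with one-cycle stages and no stalls:

```python
def total_cycles(n_instructions, n_stages, pipelined):
    """Clock cycles to finish n instructions, assuming every stage
    takes exactly one cycle (an idealised model with no stalls)."""
    if pipelined:
        # Fill the pipeline once (n_stages cycles for the first
        # instruction), then retire one instruction per cycle after that.
        return n_stages + (n_instructions - 1)
    # Non-pipelined: each instruction runs all stages before the next starts.
    return n_stages * n_instructions

# Latency per instruction is unchanged (4 cycles either way), but the
# pipelined machine approaches one instruction per cycle:
print(total_cycles(100, 4, pipelined=False))  # 400
print(total_cycles(100, 4, pipelined=True))   # 103
```

For 100 instructions on a 4-stage pipeline, throughput improves by nearly 4x even though no single instruction finishes any sooner.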

– Sometimes a pipelined instruction takes more than one clock to complete a stage. When that happens, the processor must stall and accept no new instructions until the slow instruction has moved on to the next stage
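The cost of such stalls can be sketched as follows (an assumed model where a multi-cycle execute stage blocks the whole pipeline, so each extra execute cycle delays every later instruction by one cycle):

```python
def cycles_with_stalls(execute_cycles, n_stages=4):
    """Cycles to run a sequence of instructions through an idealised
    n-stage pipeline. execute_cycles[i] is how many cycles instruction i
    spends in the execute stage (1 = no stall)."""
    ideal = n_stages + len(execute_cycles) - 1    # no-stall baseline
    stalls = sum(c - 1 for c in execute_cycles)   # extra execute cycles
    return ideal + stalls

print(cycles_with_stalls([1, 1, 1, 1]))  # 7: ideal 4-instruction run
print(cycles_with_stalls([1, 3, 1, 1]))  # 9: one 3-cycle execute adds 2 stall cycles
```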

– A pipelined processor can stall for a variety of reasons, including delays in reading information from memory, a poor instruction set design, or dependencies between instructions

– Memory speed issues are commonly solved using caches. A cache is a section of fast memory placed between the processor and slower memory. When the processor wants to read a location in main memory, that location is also copied into the cache. Subsequent references to that location can come from the cache, which will return a result much more quickly than the main memory
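The read path described above can be sketched in a few lines. This is a deliberately simplified model, assuming main memory is a dict and the cache has unlimited capacity (real caches are fixed-size and must evict entries):

```python
class CachedMemory:
    def __init__(self, main_memory):
        self.main = main_memory  # slow backing store
        self.cache = {}          # fast memory placed in front of it
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.cache:      # hit: served from fast memory
            self.hits += 1
            return self.cache[address]
        self.misses += 1               # miss: go to main memory...
        value = self.main[address]
        self.cache[address] = value    # ...and copy the location into the cache
        return value

mem = CachedMemory({0x10: 42, 0x20: 7})
mem.read(0x10); mem.read(0x10); mem.read(0x20)
print(mem.hits, mem.misses)  # 1 2
```

The first reference to each location misses and fills the cache; the repeated reference to 0x10 is then served from the cache without touching main memory.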

– Dependencies. Since each instruction takes some amount of time to store its result, and several instructions are handled at the same time, later instructions may have to wait for the results of earlier instructions to be stored. However, a simple rearrangement of the instructions in a program (called instruction scheduling) can remove these performance limitations from RISC programs
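Instruction scheduling can be illustrated with a toy model. Assume a hypothetical one-cycle load-use delay: an instruction stalls if it reads a register that the immediately preceding load wrote (the instruction encoding and stall rule here are illustrative, not tied to any real ISA):

```python
def count_stalls(program):
    """program: list of (dest_register, source_registers, is_load).
    Count load-use stalls: a stall occurs when an instruction reads the
    register loaded by the instruction directly before it."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_dest, _, prev_is_load = prev
        _, curr_srcs, _ = curr
        if prev_is_load and prev_dest in curr_srcs:
            stalls += 1
    return stalls

# Naive order: r1 = load A; r3 = r1 + r2; r4 = load B; r6 = r4 + r5
naive = [("r1", (), True), ("r3", ("r1", "r2"), False),
         ("r4", (), True), ("r6", ("r4", "r5"), False)]

# Scheduled: interleave the two independent chains so that no load
# feeds the instruction right after it.
scheduled = [("r1", (), True), ("r4", (), True),
             ("r3", ("r1", "r2"), False), ("r6", ("r4", "r5"), False)]

print(count_stalls(naive))      # 2
print(count_stalls(scheduled))  # 0
```

The rearranged program computes exactly the same results, but by moving an independent load between each load and its use, both stalls disappear.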