#### **LECTURE NOTES**

#### **ON**

#### **COMPUTER ORGANIZATION & ARCHITECTURE**

### 4<sup>th</sup> SEMESTER

Sunanda Kumar Sahoo

ASST. PROFESSOR

#### DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING



### GANDHI INSTITUTE OF TECHNOLOGY AND MANAGEMENT (GITAM)

Affiliated to BPUT & SCTE&VT, Govt. of Odisha

Approved by AICTE, New Delhi



| (PCC54301) COMPUTER ORGANIZATION (3-0-0) |
|---|
| Module-1 (12 Hrs). Basic Structures of Computers: Functional Units, Operational Concepts, Bus Structures, Software, Performance, Computer Architecture vs Computer Organization. |
| Machine Instructions and Programs: Memory Locations and Addresses, Big-endian and Little-endian Representations, Memory Operations, Instructions and Instruction Sequencing, Addressing Modes, Assembly Language, Basic Input/Output Operations, Subroutines, Additional Instructions. |
| Module-2 (12 Hrs). Arithmetic: Addition and Subtraction of Signed Numbers, Design of Fast Adders, Multiplication of Positive Numbers, Signed-operand Multiplication, Fast Multiplication, Integer Division, Floating-point Numbers and Operations. |
| Basic Processing Units: Fundamental Concepts, Execution of a Complete Instruction, Multibus Organization, Hardwired Control, Microprogrammed Control. |
| Memory System: Basic Concepts, Cache Memory Mapping Policies, Cache Updating Schemes, Performance Considerations, Virtual Memories, Paging and Page Replacement Policies, Memory Management Requirements, Secondary Storage. |
| Text books. 1. Computer Organization: Hamacher, Vranesic, Zaky. 2. Computer Organization and Design: The Hardware/Software Interface: David A. Patterson, John L. Hennessy. |
| References. 1. Computer Architecture and Organization: Design Principles and Applications: B. Govindarajalu. 2. Computer System Architecture: Morris M. Mano. |

Computer Architecture:

Computer architecture deals with the operational attributes of the computer, or of the processor to be specific. It covers details such as physical memory, the ISA (Instruction Set Architecture) of the processor, the number of bits used to represent the data types, the input/output mechanism and the techniques for addressing memory.

Computer Organization:

Computer organization is the realization of what is specified by the computer architecture. It deals with how the operational attributes are linked together to meet the requirements specified by the architecture. Some organizational attributes are hardware details, control signals and peripherals.

Architecture and organization are independent: you can change the organization of a computer without changing its architecture. For example, a 64-bit architecture can be internally organized as a true 64-bit machine or as a 16-bit machine that uses four cycles to handle 64-bit values.
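The 16-bit organization of a 64-bit architecture can be sketched in a few lines. This is only an illustration, assuming a hypothetical 16-bit adder that handles one slice per cycle and passes the carry to the next cycle; the function name `add64_on_16bit_datapath` is made up for this example.

```python
MASK16 = 0xFFFF  # one 16-bit slice of the datapath


def add64_on_16bit_datapath(a, b):
    """Add two 64-bit values with a 16-bit adder, one slice per cycle.

    Returns (64-bit sum, number of cycles used). A carry out of the
    top slice is discarded, as a 64-bit register would do.
    """
    result = 0
    carry = 0
    cycles = 0
    for i in range(4):                      # 4 slices x 16 bits = 64 bits
        shift = 16 * i
        a_slice = (a >> shift) & MASK16
        b_slice = (b >> shift) & MASK16
        partial = a_slice + b_slice + carry
        result |= (partial & MASK16) << shift
        carry = partial >> 16               # carry into the next slice
        cycles += 1
    return result, cycles


total, cycles = add64_on_16bit_datapath(0x0000_0000_FFFF_FFFF, 1)
print(hex(total), cycles)                   # 0x100000000 4
```

A program sees the same 64-bit result either way; only the number of internal cycles differs, which is an organizational detail and not an architectural one.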

Architecture is the abstract view. For example, a car's architecture specifies components such as a gear box, whereas the internal implementation is done at the organization level.

64-bit computing refers to data path widths, integer size and memory address widths of 64 bits. The register length is also 64 bits.

## MODULE-1

BASIC STRUCTURE OF COMPUTERS

Types of Computer

A computer is a fast electronic calculating machine that accepts digitized input information, processes it according to a list of internally stored instructions and produces the resulting output information. The list of instructions is called a computer program, and the internal storage is called computer memory.

Many types of computers exist that differ widely in size, cost, computational power and intended use:

- Personal computers (desktop computers) are commonly used in homes, schools and business offices.
- Notebook computers or laptops are a compact version of the personal computer.
- Workstations have high-resolution graphics input/output capability and are often used in engineering design work. They are of the same size as desktop computers.
- Enterprise systems, or mainframes, are used for business data processing in medium to large corporations that require much more computing power and storage capacity than workstations can provide.
- Servers contain sizable database storage units and are capable of handling large volumes of requests to access the data.
- Supercomputers are used for the large-scale numerical calculations required in applications such as weather forecasting and aircraft design and simulation.

# FUNCTIONAL UNITS

A computer consists of five functionally independent main parts: input, memory, arithmetic and logic, output and control units. The input unit accepts coded information from electromechanical devices such as keyboards or from other computers over digital communication lines. This information is either stored in the memory or sent to the arithmetic and logic unit for the desired operations. The processing steps are determined by a program stored in the memory. Finally, the results are sent back to the outside world through the output unit. All these actions are coordinated by the control unit.



Information handled by a computer can be data or instructions. Instructions are the commands to do certain operations, such as ALU operations or transfers of data within a computer or between the computer and its I/O devices. The computer is completely controlled by the stored program, except possibly for external interruption. Data are numbers and encoded characters used for processing by a program. Even a set of instructions, or a whole program, can be treated as data if it is processed by another program. For example, a source program is used as data by the compiler, which generates the object program from it.

ASCII (American Standard Code for Information Interchange) and EBCDIC (Extended Binary-Coded Decimal Interchange Code) are two coding schemes used to represent information in digital form. ASCII uses a 7-bit code and EBCDIC uses an 8-bit code.
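As a small illustration of the two coding schemes, the sketch below prints the code of a few characters in both. It assumes Python's built-in `cp037` codec as a representative EBCDIC code page (EBCDIC has several variants); the helper name `codes` is made up for this example.

```python
def codes(ch):
    """Return (7-bit ASCII code, 8-bit EBCDIC code) of one character as bit strings."""
    ascii_code = ord(ch)                    # ASCII code point, fits in 7 bits
    ebcdic_code = ch.encode("cp037")[0]     # cp037: one EBCDIC code page (assumption)
    return format(ascii_code, "07b"), format(ebcdic_code, "08b")


for ch in ("A", "a", "0"):
    print(ch, *codes(ch))
```

The same character gets different bit patterns in the two schemes, so data exchanged between systems that use different codes has to be translated.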

Input Unit.

The most well-known input device is the keyboard. Whenever a key is pressed, the corresponding letter or digit is automatically translated into its corresponding binary code and transmitted over a cable to either the memory or the processor. Other input devices are joysticks, trackballs and mice. These are often used as graphic input devices in conjunction with displays.

Memory Unit.

There are two classes of storage, called primary and secondary. Primary memory is a fast memory that contains a large number of semiconductor storage cells, each capable of storing one bit of information. Information is read or written in groups of bits, usually called words. Typical word lengths range from 16 to 64 bits. To provide easy access to any word, a distinct address is associated with each word location. Addresses are numbers that identify successive locations. Normally each byte (8 bits) has a unique address.

Primary memory is popularly known as RAM (random access memory). Memory in which any location can be reached in a short and fixed amount of time is known as random access memory. The memory of a computer is normally implemented as a memory hierarchy of RAM units of different speeds and sizes. The small, fast RAM units are called caches, and the largest and slowest unit is referred to as the main memory. A cache is often present on the same IC chip as the processor to achieve high performance.

Primary memory is essential, since programs must be stored in it while they are being executed, but it tends to be expensive. Thus additional, cheaper secondary storage is used for large amounts of data. Magnetic disks, tapes, optical disks (CD-ROMs) and hard disks are examples of secondary memory.

Arithmetic and Logic Unit.

Most computer operations, such as addition, multiplication, division and comparison, are executed in the arithmetic and logic unit (ALU). The required operands are brought into the processor and stored in high-speed storage elements called registers. The control unit and the ALU are much faster than the other devices connected to a computer system. This enables a single processor to control a number of devices such as keyboards and displays.

Output Unit.

The output unit is the counterpart of the input unit. Its function is to send processed results to the outside world. Printers and display units are two such examples. A graphic display unit also helps in inputting data, which is why some use the term I/O unit for a graphic display.

Control Unit.

The memory, ALU, and input and output units store and process information and perform input and output operations. The operation of these units is coordinated by the control unit. The control unit is effectively the nerve center that sends control signals to other units and senses their states. The control unit generates timing signals, and based on these timing signals it controls the other units.

The operation of a computer can be summarized as follows:

- It accepts information in the form of data and programs through an input unit and stores it in memory.
- Information stored in memory is fetched under program control into the ALU, where it is processed.
- Processed information is given to the outside world through an output unit.
- All activities inside the machine are controlled by the control unit.

## BASIC OPERATIONAL CONCEPTS

To perform a given task, an appropriate program consisting of a list of instructions needs to be executed. Instructions, and the data to be used as operands, are stored in memory. A typical instruction may be

Add LOCA, R0

This instruction adds the contents of memory location LOCA and register R0 and stores the result in R0. First the instruction is fetched from the memory into the processor. Next the operand at LOCA is fetched and added to the other operand in R0. Finally the sum is stored in register R0.

The same instruction can be written as the two instructions below.

Load LOCA, R1

Add R1, R0

[Figure: connections between the memory and the processor, showing MDR, control, and n general-purpose registers R0, R1, ..., Rn-1]

Transfers between the memory and the processor are started by sending the address of the memory location to be accessed to the memory unit and issuing the appropriate control signals.

Besides the ALU and the control circuitry, the processor contains a number of registers used for several different purposes:

- Instruction Register (IR): holds the instruction that is currently being executed.
- Program Counter (PC): contains the memory address of the next instruction to be fetched and executed. It is also known as the location counter.
- General-purpose registers: used to hold temporary data to be used in ALU operations.
- Memory Address Register (MAR): holds the address of the location to be accessed.
- Memory Data Register (MDR): contains the data to be written into or read out of the addressed location.

The processor is usually implemented on a single VLSI chip, with at least one cache memory on the same chip.
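The register transfers described above can be sketched as a toy model. This is an illustration only, not a real instruction set simulator: the class `TinyProcessor`, the address 100 chosen for LOCA and the operand values are all assumptions made for the example.

```python
class TinyProcessor:
    """Toy model of the processor-memory interface (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory              # address -> value
        self.regs = {"R0": 0, "R1": 0}    # two general-purpose registers
        self.mar = None                   # Memory Address Register
        self.mdr = None                   # Memory Data Register

    def read(self, address):
        """Memory read: the address goes to MAR, the data arrives in MDR."""
        self.mar = address
        self.mdr = self.memory[self.mar]
        return self.mdr

    def load(self, address, reg):
        """Load LOC, Ri: copy the memory operand into register Ri."""
        self.regs[reg] = self.read(address)

    def add(self, src, dst):
        """Add Ri, Rj: Rj <- Ri + Rj, done entirely inside the processor."""
        self.regs[dst] = self.regs[src] + self.regs[dst]

    def add_from_memory(self, address, dst):
        """Add LOC, Rj: fetch the memory operand, then add it to Rj."""
        self.regs[dst] = self.read(address) + self.regs[dst]


LOCA = 100                                # hypothetical address of LOCA

one_step = TinyProcessor({LOCA: 25})
one_step.regs["R0"] = 10
one_step.add_from_memory(LOCA, "R0")      # Add LOCA, R0

two_step = TinyProcessor({LOCA: 25})
two_step.regs["R0"] = 10
two_step.load(LOCA, "R1")                 # Load LOCA, R1
two_step.add("R1", "R0")                  # Add R1, R0

print(one_step.regs["R0"], two_step.regs["R0"])   # 35 35
```

Both sequences leave the same sum in R0; the two-instruction form also leaves a copy of the memory operand in R1.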

## BUS STRUCTURES

To make the system operational, the individual parts of a computer must be connected in some organized way. A group of lines that serves as a connecting path for several devices is called a bus. There are separate lines for data, address and control signals. The number of lines in a data bus must equal the size of a word in order to send a whole word at a time.

The simplest way to interconnect the functional units is to use a single bus.

[Figure: single-bus structure connecting input, output, memory and processor]

The main virtue of the single-bus structure is its low cost and its flexibility for attaching peripheral devices. However, only two units can actively use the bus at any given time, which makes the system slower. To improve performance, a system can contain multiple buses, which allow concurrency in operations. This leads to better performance but at an increased cost.

Some devices connected to the bus are slow compared to others such as magnetic or optical disks, the memory and the processor. To smooth out the timing differences, buffer registers are used with the devices to hold the information during transfers. Say the processor sends some data to the printer. The processor waits only until the printer's buffer register has received the data; the printer then prints without intervention of the processor. The buffer register allows the processor to switch rapidly from one device to another, interweaving its processing activity with data transfers involving several I/O devices.

## SOFTWARE

System software is a collection of programs that are executed to perform functions such as:

- Receiving and interpreting user commands.
- Managing the storage and retrieval of files in secondary storage devices.
- Controlling I/O units to receive input information and produce output results.
- Running standard application programs such as word processors, spreadsheets or games, with data supplied by the user.
- Linking and running user-written application programs with existing standard library routines, such as numerical computation packages.

System software is thus responsible for the coordination of all activities in a computing system. Some examples of system software are compilers, text editors and operating systems. Good system software is required for better performance of the system.

[Figure: timing diagram showing the program, the OS routines, the disk and the printer]

The above figure shows how execution control passes back and forth between the application program and the OS routines. Notice that some units are idle for part of the time; for example, the printer is idle between t0 and t1 and the disk is idle between t4 and t5. Computer resources can be used more efficiently if several application programs are to be processed. Concurrent execution of programs is called multiprogramming or multitasking. It is the job of the operating system to look after the concurrent execution of application programs.

## PERFORMANCE

The performance of a computer is how quickly it can execute programs. For best performance, it is necessary to design the compiler, the machine instruction set and the hardware in a coordinated way.

The total time required to execute a program is called the elapsed time. It is a measure of the performance of the entire computer system, and it is affected by the speed of the processor, the disk and the printer. To assess the processor speed, we should consider the processor time only. Some of the factors affecting the performance of the processor are discussed below.

Processor Clock

The processor clock generates timing signals at equal time intervals called clock periods. Each machine instruction is divided into a number of steps, where in each basic step some basic action is completed. Each basic step is completed in one clock cycle. Let $P$ be the length of one clock cycle; the clock rate is then $R = 1/P$, measured in cycles per second.

In electrical engineering, cycles per second are known as hertz (Hz). Million is denoted by Mega (M) and billion by Giga (G).

- 500 million cycles per second is abbreviated to 500 MHz. For $R = 500$ MHz, $P = 1/R = 1/(500 \times 10^6)\ \text{s} = 2$ ns.
- 1250 million cycles per second = 1.25 GHz. For $R = 1.25$ GHz, $P = 1/(1.25 \times 10^9)\ \text{s} = 0.8$ ns.
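The two clock-period calculations above can be checked with a one-line helper. This is a minimal sketch; the function name `clock_period_ns` is chosen only for this example.

```python
def clock_period_ns(rate_hz):
    """Clock period P = 1 / R, returned in nanoseconds."""
    return 1e9 / rate_hz


print(clock_period_ns(500e6))    # R = 500 MHz
print(clock_period_ns(1.25e9))   # R = 1.25 GHz
```

The results agree with the 2 ns and 0.8 ns worked out by hand.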

Basic Performance Equation

$$T = \frac{N \times S}{R}$$

where

- $T$: processor time required to execute a program (the performance parameter).
- $N$: number of machine instructions to be executed for the program.
- $S$: average number of basic steps per instruction.
- $R$: clock rate (number of cycles per second).

To improve the performance parameter $T$, we need to decrease $N$ and $S$ and increase $R$. However, $N$, $S$ and $R$ are not independent parameters; changing one may affect another.

Pipelining and Superscalar Operation

Overlapping the execution of successive instructions is known as pipelining. A substantial improvement in performance can be achieved by this technique.

Consider the instruction

Add R1, R2, R3 (R1 + R2 → R3)

The contents of registers R1 and R2 are first transferred to the inputs of the ALU. After the add operation is performed, the sum is transferred to R3. During the addition, the processor can fetch the next instruction from memory. If that instruction also requires the ALU, then while the sum is being transferred to R3, its operands can be transferred to the ALU inputs. In the ideal case, if all instructions are overlapped to the maximum degree possible, the effective value of $S$ approaches 1.

If there are 4 basic steps per instruction and execution is fully pipelined, then for N instructions we need N + 3 cycles.

A higher degree of concurrency can be achieved if multiple instruction pipelines are implemented in the processor. This means that multiple functional units are used to execute instructions in parallel. This mode of operation is called superscalar.
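The effect of pipelining on the performance equation can be sketched numerically. The values N = 1000, S = 4 and R = 500 MHz are hypothetical, chosen only for illustration, and the model assumes an ideal pipeline with no stalls.

```python
def execution_time(n_instructions, steps_per_instruction, clock_rate_hz):
    """Basic performance equation T = N * S / R, in seconds."""
    return n_instructions * steps_per_instruction / clock_rate_hz


def pipelined_cycles(n_instructions, steps_per_instruction):
    """Ideal pipeline: the first instruction takes S cycles,
    then one instruction completes in every following cycle."""
    return n_instructions + steps_per_instruction - 1


N, S, R = 1000, 4, 500e6                 # hypothetical workload and clock rate

unpipelined_time = execution_time(N, S, R)
cycles = pipelined_cycles(N, S)          # N + 3 when S = 4
effective_s = cycles / N                 # approaches 1 as N grows
pipelined_time = cycles / R

print(cycles, effective_s)
print(unpipelined_time, pipelined_time)
```

With these numbers the pipelined run needs 1003 cycles instead of 4000, so the effective value of S is about 1.003.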

Clock Rate

There are two possibilities for increasing the clock rate, $R$:

- Improving the IC technology, which reduces the time needed to complete a basic step.
- Reducing the amount of processing done in one basic step, which reduces the clock period. However, if the overall processing remains the same, we may need more basic steps per instruction.

Instruction Set: CISC and RISC

- CISC: Complex Instruction Set Computers.
- RISC: Reduced Instruction Set Computers.

Simple instructions require a small number of basic steps to execute. For a processor that has only simple instructions, a large number of instructions may be needed to perform a given programming task. This could lead to a large value of $N$ and a small value of $S$.

Complex instructions involve a large number of steps, but fewer instructions are required to perform a given programming task. This leads to a small value of $N$ and a large value of $S$.

As far as pipelining is concerned, it is much easier to implement with simple instructions.

Compiler

A compiler translates a high-level language program into a sequence of machine instructions. To reduce $N$, we need a suitable machine instruction set and a compiler that makes good use of it. The compiler may rearrange program instructions to implement pipelining in an efficient manner. Of course, such changes must not affect the result of the computation. The ultimate aim is to reduce the total number of clock cycles needed to perform a required programming task.

CH-2. MACHINE INSTRUCTIONS AND PROGRAMS

BINARY NUMBER SYSTEM

$B = b_{n-1} \ldots b_1 b_0$, where each digit $b_i$ is 0 or 1.

$V(B) = b_{n-1} \times 2^{n-1} + \ldots + b_1 \times 2^1 + b_0 \times 2^0$

Example: $B = 1010$, so $V(B) = 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^1 + 0 \times 2^0 = 10$.

Using n bits we can represent unsigned binary numbers from 0 to $2^n - 1$. To represent negative numbers we have 3 representations:

1) Sign & magnitude
2) 1's complement
3) 2's complement

| Binary (4 bit) | Sign & magnitude | 1's complement | 2's complement |
|----------------|------------------|----------------|----------------|
| 0111 | +7 | +7 | +7 |
| 0110 | +6 | +6 | +6 |
| 0101 | +5 | +5 | +5 |
| 0100 | +4 | +4 | +4 |
| 0011 | +3 | +3 | +3 |
| 0010 | +2 | +2 | +2 |
| 0001 | +1 | +1 | +1 |
| 0000 | +0 | +0 | +0 |
| 1000 | -0 | -7 | -8 |
| 1001 | -1 | -6 | -7 |
| 1010 | -2 | -5 | -6 |
| 1011 | -3 | -4 | -5 |
| 1100 | -4 | -3 | -4 |
| 1101 | -5 | -2 | -3 |
| 1110 | -6 | -1 | -2 |
| 1111 | -7 | -0 | -1 |

In these 3 representations the negative numbers are represented with MSB as 1 and positive numbers with MSB as 0.

In case of sign & magnitude it is straightforward: 0010 is +2 and 1010 is -2.

In case of 1's complement, negative numbers are represented as the complement of their positive representation. Here -5 is 1010, which is the complement of 0101. Likewise 1101 is -(0010), i.e. -2.

In case of 2's complement, negative numbers are represented as the 1's complement of their positive representation, plus 1. For -5: the positive representation is 0101, its complement is 1010, and 1010 + 1 = 1011. So -5 is represented using 1011.
To find the value of a number from its binary representation: if the MSB is 0, it is a positive number and finding the decimal equivalent is straightforward. If the MSB is 1, then it is a negative number.

- Sign & magnitude → again it is straightforward: the remaining bits give the magnitude.
- 1's complement → just complement the binary number. You will find the positive equivalent of the number; put a minus sign in front to get the actual decimal value.
- 2's complement → converting a binary number in 2's complement to its decimal value is the same as converting decimal to 2's complement, i.e. take the 1's complement and add 1 to it, then put a minus sign in front.

(Show some examples to students.)
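The conversions described above can be checked with a short sketch. The function names are illustrative, and bit patterns are passed as strings such as "1011".

```python
def to_twos_complement(value, bits):
    """Return the `bits`-wide 2's complement pattern of a signed integer."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError("value does not fit in the given number of bits")
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern):
    """Decode a 2's complement pattern: complement, add 1, negate."""
    raw = int(pattern, 2)
    if pattern[0] == "0":
        return raw
    mask = (1 << len(pattern)) - 1
    return -((~raw & mask) + 1)


def from_ones_complement(pattern):
    """Decode a 1's complement pattern: complement, then negate."""
    if pattern[0] == "0":
        return int(pattern, 2)
    flipped = "".join("1" if b == "0" else "0" for b in pattern)
    return -int(flipped, 2)
```

For example, `to_twos_complement(-5, 4)` yields the pattern 1011 used in the notes, and `from_ones_complement("1010")` recovers -5.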

Why is the 2's complement system preferred? Sign & magnitude and 1's complement representations both contain +0 and -0, so 2's complement is used in computers. 2's complement also contains one extra number: in a 4-bit representation it contains -8, which is not there in the other representations. Also, the addition and subtraction process for the 2's complement system is simple.

Range: in 2's complement, if we use n bits to represent an integer, its range is $-2^{n-1}$ to $+(2^{n-1} - 1)$.

ADDITION AND SUBTRACTION

Rules

1. To add two numbers, add their n-bit representations, ignoring the carry-out signal from the most significant bit (MSB) position. The sum will be the algebraically correct value in the 2's complement representation as long as the answer is in the range $-2^{n-1}$ to $+(2^{n-1} - 1)$.
2. To subtract two numbers X and Y, that is, to perform X - Y, form the 2's complement of Y and then add it to X, as in rule 1. Again, the result will be the algebraically correct value in the 2's complement representation if the answer is in the range.

Examples (4-bit):

a) 0010 (+2) + 0011 (+3) = 0101 (+5)
b) 0100 (+4) + 1010 (-6) = 1110 (-2)
c) 1011 (-5) + 1110 (-2) = 1001 (-7)
d) 0111 (+7) + 1101 (-3) = 0100 (+4)
e) 1101 (-3) - 1001 (-7) ⇒ 1101 + 0111 = 0100 (+4)
f) 0010 (+2) - 0100 (+4) ⇒ 0010 + 1100 = 1110 (-2)
g) 0110 (+6) - 0011 (+3) ⇒ 0110 + 1101 = 0011 (+3)
h) 1001 (-7) - 1011 (-5) ⇒ 1001 + 0101 = 1110 (-2)
i) 1001 (-7) - 0001 (+1) ⇒ 1001 + 1111 = 1000 (-8)
j) 0010 (+2) - 1101 (-3) ⇒ 0010 + 0011 = 0101 (+5)

If the MSB of a result is 1, it is a negative number in 2's complement representation. For example, 1110 is -(0010), i.e. -2.

Sign Extension

Normally there is a fixed size to represent an integer. If a smaller positive integer is stored, 0s need to be added on the left: representing 5 in 16 bits gives 0000 0000 0000 0101. Similarly, 1s are added on the left of smaller negative numbers: -5 in 16-bit representation is 1111 1111 1111 1011. This adding of 0s and 1s to positive and negative numbers is called sign extension.
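The two rules and sign extension can be sketched as follows. This is a minimal illustration: operands are unsigned integers holding n-bit patterns, and the function names are made up for this example.

```python
def add_nbit(x, y, bits):
    """Rule 1: add the n-bit patterns and ignore the carry-out of the MSB."""
    return (x + y) & ((1 << bits) - 1)


def sub_nbit(x, y, bits):
    """Rule 2: form the 2's complement of y, then add it to x."""
    mask = (1 << bits) - 1
    neg_y = (~y + 1) & mask
    return add_nbit(x, neg_y, bits)


def sign_extend(pattern, new_bits):
    """Widen a bit-string by repeating its sign bit on the left."""
    return pattern[0] * (new_bits - len(pattern)) + pattern
```

Example (b) above corresponds to `add_nbit(0b0100, 0b1010, 4)`, and example (e) to `sub_nbit(0b1101, 0b1001, 4)`.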

Integer Overflow (Arithmetic Overflow)

When the result of an arithmetic operation is outside the representable range, an arithmetic overflow has occurred.

→ Overflow can occur only when adding two numbers that have the same sign.
→ The carry-out signal from the sign-bit position is not a sufficient indicator of overflow when adding signed numbers.
→ When both operands X and Y have the same sign, an overflow occurs when the sign of the sum S is not the same as the signs of X and Y.

Example: 1000 (-8) + 1000 (-8). Sign of X and Y = 1, sign of S = 0, so overflow occurs and the result is not correct.

MEMORY LOCATIONS AND ADDRESSES

BIG-ENDIAN & LITTLE-ENDIAN ASSIGNMENTS

There are two ways that byte addresses can be assigned across words.

[Figure: byte addresses within 4-byte words. Big-endian assignment: word 0 holds bytes 0, 1, 2, 3 and word 4 holds bytes 4, 5, 6, 7, from left to right. Little-endian assignment: word 0 holds bytes 3, 2, 1, 0 and word 4 holds bytes 7, 6, 5, 4.]

In big-endian assignment, lower byte addresses are used for the more significant bytes of the word. In little-endian assignment, lower byte addresses are used for the less significant bytes of the word.

TYPES OF INSTRUCTION BASED ON MEMORY ADDRESSES

Three-address instruction.

An operation such as C ← [A] + [B] requires a three-address instruction to perform the operation in a single instruction, e.g. Add A, B, C. Operands A and B are called source operands and C is called the destination operand.

Two-address instruction.

An alternative approach to calculate C ← [A] + [B]:

Add A, B → B ← [A] + [B]
Move B, C → C ← [B] (copy content of B to C)

Both these instructions are two-address instructions. In Add A, B, A is a source and B is both source and destination. In Move B, C, B is the source and C is the destination.

Three-address instructions may be too long to be accommodated in a single word, and even a two-address instruction may not fit into a single word. Hence the use of a processor register called the accumulator (Acc) leads to one-address instructions.

One-address instruction.

Load A → loads A into Acc
Add B → adds B to Acc and stores the result in Acc
Store C → stores the result from Acc into C

Introduction of general-purpose registers changes the one-address instruction format: a register holds one operand and is explicitly named in the instruction, whereas in a one-address instruction the accumulator is not explicitly written.

Move A, Ri
Add B, Ri
Move Ri, C (store)

Number notation: # marks an immediate value; #% prefixes a binary value and #$ prefixes a hexadecimal value.

STACKS AND QUEUES

A stack is a list of data elements, usually words or bytes, with the accessing restriction that elements can be added or removed at one end of the list only. This end is called the top of the stack and the other end is called the bottom. The structure is sometimes referred to as a pushdown stack. It is common practice for a stack to grow in the direction of decreasing memory addresses, so each new element is inserted at a lower address.

A processor register is used to keep track of the address of the element of the stack that is at the top at any given time. This register is called the stack pointer (SP).

Routine for a safe pop operation:

SAFEPOP Compare #2000, SP
Branch>0 EMPTYERROR
Move (SP)+, ITEM

Routine for a safe push operation:

SAFEPUSH Compare #1500, SP
Branch≤0 FULLERROR
Move NEWITEM, -(SP)

Queue

Another useful data structure that is similar to the stack is called a queue. Data are stored and retrieved on a FIFO basis. It is common practice for a queue to grow in the direction of increasing addresses in memory. We need two pointers to hold the addresses of the first and last data items in the queue. Both ends of a queue move to higher addresses as data are added at the back and removed from the front. Without any boundary, a queue would continuously move through the memory of a computer in the direction of higher addresses. One way to limit the queue to a fixed region in memory is to use a circular buffer.
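A circular buffer can be sketched as below. This is only an illustration under an assumption: instead of two address pointers it keeps the index of the front item plus an item count, which makes the full and empty cases easy to tell apart. The class and method names are hypothetical.

```python
class CircularQueue:
    """FIFO queue confined to a fixed region (a circular buffer)."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.front = 0   # index of the oldest item
        self.count = 0   # number of items currently stored

    def enqueue(self, item):
        if self.count == len(self.buf):
            raise OverflowError("queue full")
        back = (self.front + self.count) % len(self.buf)  # wraps around
        self.buf[back] = item
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue empty")
        item = self.buf[self.front]
        self.front = (self.front + 1) % len(self.buf)     # wraps around
        self.count -= 1
        return item


q = CircularQueue(3)
for value in (1, 2, 3):
    q.enqueue(value)
first = q.dequeue()   # frees one slot at the front
q.enqueue(4)          # reuses the freed slot after wrapping around
rest = [q.dequeue(), q.dequeue(), q.dequeue()]
```

The modulo operation is what keeps both ends inside the fixed region instead of drifting toward higher addresses.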

SUBROUTINES

Subroutines are subtasks of a program which are often performed many times on different data values. When a program branches to a subroutine, we say that it is calling the subroutine. After the subroutine has executed, the Return instruction is executed and control returns to the calling program. The way in which a computer makes it possible to call and return from subroutines is referred to as its subroutine linkage method. A register called the link register is used to store the return address at the time of the call, and this value is used at the time of return.

The Call instruction is just a special branch instruction that performs the following operations:
→ Store the contents of the PC in the link register.
→ Branch to the target address specified by the instruction.

The Return instruction is a special branch instruction that performs the operation:
→ Branch to the address contained in the link register.

SUBROUTINE NESTING

A subroutine can call another subroutine, which is called subroutine nesting. If a single link register is used to store the return address, then in case of subroutine nesting the previous content of the link register will be lost while calling a subroutine from another subroutine. Hence it is essential to save the contents of the link register in some other location before calling another subroutine. In this scenario a stack is used to store the return addresses while calling subroutines; its LIFO property helps us to return from the most recent subroutine call.

THE STACK FRAME

The locations in a stack used by a subroutine, from subroutine call to subroutine return, are called the stack frame for that subroutine.

[Figure: stack frame for the called subroutine. From the top of the stack downward: saved [R1] (SP points here), saved [R0], localvar3, localvar2, localvar1, saved [FP] (FP points here), return address, param1, param2, param3, param4, old TOS.]

A stack frame holds the information below for a subroutine:

→ Before calling the subroutine, the calling program pushes the parameters onto the stack.
→ Then the Call instruction is executed and the return address is pushed onto the stack.
→ Then the frame pointer value is pushed onto the stack.
→ Then all the local variables are pushed.
→ Then the contents of the registers which are going to be used by the subroutine are saved.

The frame pointer (FP) is a general-purpose register. It points to the location just above the stored return address and helps to access the parameters and the local variables by using the index addressing mode (8(FP), 12(FP), -8(FP)). The content of FP is fixed throughout the execution of the subroutine, unlike SP, which always points to the top of the stack.

When the task is completed, the subroutine pops the saved values of R1 and R0 back into those registers and removes the local variables by setting SP to point to the saved FP value. Then the saved FP value is popped and SP points to the return address. After returning to the calling program, the calling program removes the parameters and SP points to the old TOS.

Stack frames for nested subroutines

[Figure: stack frames for nested subroutines. Stack frame for the second subroutine: [R0] from SUB1, [FP] from SUB1, return address, param3. Stack frame for the first subroutine: [R3] from Main, [R2] from Main, [R1] from Main, [R0] from Main, [FP] from Main, return address, param1, param2.]

ADDITIONAL INSTRUCTIONS

LOGIC INSTRUCTIONS

Logic operations such as AND, OR and NOT are applied to individual bits.

Not dst → complements all the bits.

Not R0
Add #1, R0 → together these find the 2's complement of the content of R0. Some computers have a single instruction, Negate R0, to do the above operation.

And #$FF000000, R0 → makes all bits zero except the first 8 bits (the most significant byte).
Compare #$5A000000, R0 → checks whether the first character is equal to 'Z' or not. The hexadecimal code of Z is 5A (01011010).
Branch=0 YES → if there is a match, the branch instruction branches to YES.

SHIFT AND ROTATE INSTRUCTIONS

Logical shift → shifts the operand left or right by the specified number of bits, and zero is filled into the emptied bits. The carry flag (C) holds the last bit shifted out of the number.

Logical shift left (C ← R0 ← 0), LShiftL #2, R0: before C = 0, R0 = 01110...011; after C = 1, R0 = 110...01100.

Logical shift right (0 → R0 → C), LShiftR #2, R0: before R0 = 01110...011, C = 0; after R0 = 0001110...0, C = 1.

Arithmetic shift left is the same as logical shift left. In case of arithmetic shift right (R0 → C), the fill-in bit is the sign bit, which may be 1 or 0; in our example it is 1. AShiftR #2, R0: before R0 = 10011...010, C = 0; after R0 = 1110011...0, C = 1.

ROTATE OPERATION

Types of rotate operation:

- Rotate left without carry, RotateL #2, R0: before C = 0, R0 = 01110...011; after C = 1, R0 = 110...01101.
- Rotate left with carry, RotateLC #2, R0: before C = 0, R0 = 01110...011; after C = 1, R0 = 110...01100.
- Rotate right without carry, RotateR #2, R0: before R0 = 01110...011, C = 0; after R0 = 1101110...0, C = 1.
- Rotate right with carry, RotateRC #2, R0: before R0 = 01110...011, C = 0; after R0 = 1001110...0, C = 1.

Rotate without carry does not include the carry flag in the rotation, but the carry flag still holds the last bit shifted out.

MULTIPLICATION AND DIVISION

Multiply Ri, Rj → Rj ← [Ri] × [Rj]. The result of multiplying two n-bit numbers is 2n bits, so many computers use two adjacent registers (Rj, Rj+1) to store the result.

Divide Ri, Rj → Rj ← [Rj] / [Ri]. Rj stores the quotient. Some computers use Rj+1 to store the remainder; otherwise the remainder is lost.
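The shift and rotate behaviour, including the carry flag, can be modelled one bit at a time. The sketch below is illustrative only: the function names are hypothetical, registers are modelled as unsigned integers of a given width, and each function returns the pair (new value, carry flag).

```python
def logical_shift_left(value, count, bits):
    mask, carry = (1 << bits) - 1, 0
    for _ in range(count):
        carry = (value >> (bits - 1)) & 1   # MSB goes to the carry flag
        value = (value << 1) & mask         # zero fills the LSB
    return value, carry


def logical_shift_right(value, count, bits):
    carry = 0
    for _ in range(count):
        carry = value & 1                   # LSB goes to the carry flag
        value >>= 1                         # zero fills the MSB
    return value, carry


def arithmetic_shift_right(value, count, bits):
    sign, carry = value & (1 << (bits - 1)), 0
    for _ in range(count):
        carry = value & 1
        value = (value >> 1) | sign         # sign bit is the fill-in bit
    return value, carry


def rotate_left(value, count, bits):
    mask, carry = (1 << bits) - 1, 0
    for _ in range(count):
        carry = (value >> (bits - 1)) & 1
        value = ((value << 1) & mask) | carry   # MSB re-enters at the LSB
    return value, carry


def rotate_left_carry(value, carry, count, bits):
    mask = (1 << bits) - 1
    for _ in range(count):
        out = (value >> (bits - 1)) & 1
        value = ((value << 1) & mask) | carry   # old carry enters at the LSB
        carry = out
    return value, carry
```

Using an 8-bit stand-in for the register, `logical_shift_left(0b01110011, 2, 8)` reproduces the pattern of the LShiftL #2 example: the result ends in 00 and the carry flag is 1.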

SHORT QUESTIONS

1) What do you mean by a stored-program computer?
2) Write the full forms of ASCII, EBCDIC, VLSI.
3) What are the four layers of the memory hierarchy?
4) What do you mean by an interrupt-service routine, and when is it called for execution?
5) What do you mean by a bus? List the advantages and disadvantages of a single-bus structure over a multiple-bus structure.
6) Write the basic performance equation.
7) What is the clock period of [value illegible in the source]?
8) What do you mean by pipelining?
9) Differentiate CISC and RISC.
10) Explain SPEC rating and its importance in measuring the performance of a computer.
11) Find the 2's complement representation of the numbers below, considering 8 bits to represent the integer: 87, 37, -88, -23, -53.
12) Perform the operations below using 8-bit representation of integers: i) 18 + 26 ii) 75 + 53 iii) 77 - 47 iv) -65 - 64
13) What do you mean by arithmetic overflow? How can you know there is an arithmetic overflow?
14) What is byte addressability?
15) Differentiate big-endian and little-endian.
16) What do you mean by WORD as far as memory locations are concerned?
17) What do you mean by straight-line sequencing of instructions?
18) Explain the different phases of instruction execution.
19) What do you mean by a status register?
20) With an example, explain assembler directives.
21) What is the job of the loader and the debugger?
22) Why is it mandatory to use a stack for subroutine nesting?
23) What do you mean by frame pointer?

LONG QUESTIONS

1) Describe the basic functional units of a computer, with a neat diagram.
2) List out the jobs of the control unit.
3) What are the jobs of the functional units below? i) PC ii) IR iii) MAR iv) MDR
4) What are the jobs of system software? Give some examples of system software.
5) Discuss the different aspects of a computer which affect its performance in terms of speed of execution.
6) List the steps needed to execute the machine instruction Add LOCA, R0 and also the instruction Add R1, R2, R3.
7) With examples, explain three-address, two-address, one-address and zero-address instructions.
8) Discuss addressing modes with examples.
9) Explain basic input/output operations.
10) How is a stack frame created within the stack for a subroutine call? Explain with a neat diagram, and show how the data are removed when the subroutine finishes.
11) Draw a stack frame for a subroutine.
12) Explain all shift and rotate instructions with examples.

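ARITHMETIC-LOGIC UNIT

ADDITION/SUBTRACTION

For bit position i, a full adder (FA) takes inputs $x_i$, $y_i$ and carry-in $c_i$, and produces the sum $s_i$ and carry-out $c_{i+1}$:

$s_i = \overline{x_i}\,\overline{y_i}\,c_i + \overline{x_i}\,y_i\,\overline{c_i} + x_i\,\overline{y_i}\,\overline{c_i} + x_i\,y_i\,c_i = x_i \oplus y_i \oplus c_i$

$c_{i+1} = \overline{x_i}\,y_i\,c_i + x_i\,\overline{y_i}\,c_i + x_i\,y_i\,\overline{c_i} + x_i\,y_i\,c_i = y_i c_i + x_i c_i + x_i y_i$

[Figure: logic for a single full-adder stage; an n-bit ripple-carry adder built from n full adders, with $c_0$ entering at the LSB position and $c_n$ leaving at the MSB position; a cascade of k n-bit adders producing $s_{kn-1} \ldots s_0$.]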

Arithmetic overflow

Overflow can occur when the signs of the two operands are the same and the sign of the result is different from that of the operands.

Overflow $= x_{n-1}\,y_{n-1}\,\overline{s_{n-1}} + \overline{x_{n-1}}\,\overline{y_{n-1}}\,s_{n-1}$

Alternate method: when the carry bits $c_n$ and $c_{n-1}$ are different, overflow occurs. Overflow $= c_n \oplus c_{n-1}$.

Doing addition and subtraction in a single logic unit

Using 2's complement we can do both with the help of one n-bit adder (binary addition-subtraction logic) and an Add/Sub control line:

- Addition: the control is 0, and $y_{n-1} \ldots y_0$ are supplied to the adder as they are.
- Subtraction: the control is 1, so $y_{n-1} \ldots y_0$ are complemented by the XOR gates. $c_0$ is also 1, so the complemented value is added with 1, making it the 2's complement.

We know that X - Y is equivalent to X + (2's complement of Y) = X + ((1's complement of Y) + 1).
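The adder/subtractor logic can be simulated bit by bit. In this sketch the operands are lists of bits with the LSB first, `subtract` plays the role of the Add/Sub control line, and overflow is computed as $c_n \oplus c_{n-1}$. The function names are illustrative.

```python
def full_adder(x, y, c):
    """One full-adder stage: returns (sum bit, carry-out)."""
    s = x ^ y ^ c
    c_out = (x & y) | (x & c) | (y & c)
    return s, c_out


def to_bits(value, n):
    """n-bit pattern of an unsigned value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bit_list):
    return sum(bit << i for i, bit in enumerate(bit_list))


def add_sub(x_bits, y_bits, subtract):
    """Ripple-carry add (subtract=0) or subtract (subtract=1)."""
    carry = subtract                  # c0 = 1 supplies the "+1" for 2's complement
    carry_into_msb = 0
    sums = []
    for i, (x, y) in enumerate(zip(x_bits, y_bits)):
        if i == len(x_bits) - 1:
            carry_into_msb = carry    # this is c_{n-1}
        s, carry = full_adder(x, y ^ subtract, carry)  # XOR gate on each y input
        sums.append(s)
    overflow = carry ^ carry_into_msb  # c_n XOR c_{n-1}
    return sums, carry, overflow
```

For instance, `add_sub(to_bits(7, 4), to_bits(3, 4), 1)` computes 7 - 3 without overflow, while adding 0111 and 0001 in 4 bits sets the overflow flag.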

DESIGN OF FAST ADDERS

If an n-bit ripple-carry adder is used in the addition/subtraction unit, it may have too much delay. An FA gives the sum and carry in 2 gate delays, so for getting $c_n$ we need 2n gate delays. If we also consider the XOR gates on the y inputs and the $c_n \oplus c_{n-1}$ gate for overflow, the total delay is 2n + 2 gate delays for the whole n-bit ripple-carry adder.

CARRY-LOOKAHEAD ADDITION

$s_i = x_i \oplus y_i \oplus c_i$ and

$c_{i+1} = x_i y_i + x_i c_i + y_i c_i = x_i y_i + (x_i + y_i) c_i = G_i + P_i c_i$

where $G_i = x_i y_i$ and $P_i = x_i + y_i$. $G_i$ and $P_i$ are called the generate and propagate functions. $c_{i+1} = 1$ if the generate function $G_i = 1$, i.e. $x_i$ and $y_i$ are both 1, independent of $c_i$.

In the implementation, $P_i$ is realized as $x_i \oplus y_i$, which differs from the actual $P_i$ when $x_i$ and $y_i$ are both 1. But in this case $G_i = 1$, so it does not matter whether $P_i$ is 1 or 0.

$c_{i+1} = G_i + P_i c_i = G_i + P_i (G_{i-1} + P_{i-1} c_{i-1}) = G_i + P_i G_{i-1} + P_i P_{i-1} c_{i-1}$

Continuing this type of expansion, the expression for any carry variable is

$c_{i+1} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \cdots + P_i P_{i-1} \cdots P_1 G_0 + P_i P_{i-1} \cdots P_0 c_0$

For a 4-bit adder:

$c_1 = G_0 + P_0 c_0$
$c_2 = G_1 + P_1 G_0 + P_1 P_0 c_0$
$c_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 c_0$
$c_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 c_0$
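The four carry equations can be evaluated directly, without rippling, as in the sketch below. It uses $P_i = x_i + y_i$ as defined above; the function names and the LSB-first list layout are choices made for this illustration.

```python
def to_bits(value, n=4):
    """n-bit pattern of an unsigned value, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def carry_lookahead_4bit(x, y, c0):
    """4-bit carry-lookahead addition; x and y are bit lists, LSB first."""
    g = [a & b for a, b in zip(x, y)]   # generate:  G_i = x_i y_i
    p = [a | b for a, b in zip(x, y)]   # propagate: P_i = x_i + y_i
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    s = [x[i] ^ y[i] ^ carries[i] for i in range(4)]
    return s, c4


def cla_add(a, b, c0=0):
    """Convenience wrapper returning the 5-bit result as an integer."""
    s, c4 = carry_lookahead_4bit(to_bits(a), to_bits(b), c0)
    return sum(bit << i for i, bit in enumerate(s)) + (c4 << 4)
```

Each carry depends only on the G, P values and $c_0$, which is why all carries become available after a fixed number of gate delays.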

[Figure: 4-bit carry-lookahead adder. Four B cells take $x_i$, $y_i$ and $c_i$ and produce $s_i$, $G_i$ and $P_i$; the carry-lookahead logic block takes $G_3 P_3 \ldots G_0 P_0$ and $c_0$ and produces the carries.]

The carries are implemented in the block labeled carry-lookahead logic. Delay through the adder is 3 gate delays for all carry bits and 4 gate delays for all sum bits. In comparison, a 4-bit ripple-carry adder requires 7 gate delays for $s_3$ and 8 gate delays for $c_4$.

If we extend this to longer operands, we run into a problem of gate fan-in constraints. For $c_4$ in the 4-bit adder, a fan-in of 5 is required. This is about the limit for practical gates.

Eight 4-bit carry-lookahead adders can be connected to form a 32-bit adder. $c_4$ from the low-order adder is available in 3 gate delays. Then $c_8$ is available after a further 2 gate delays, $c_{12}$ after a further 2 gate delays, and so on. $c_{32}$ is available after 17 gate delays and the sum $s_{31}$ is available after 18 gate delays.

Higher-Level Generate and Propagate Functions

[Figure: 16-bit carry-lookahead adder built from four 4-bit adders with inputs $x_{15-12}, y_{15-12}$; $x_{11-8}, y_{11-8}$; $x_{7-4}, y_{7-4}$; $x_{3-0}, y_{3-0}$ and outputs $s_{15-12}$, $s_{11-8}$, $s_{7-4}$, $s_{3-0}$. Each block supplies its higher-level generate and propagate outputs to the higher-order carry-lookahead logic.]

Each 4-bit adder provides new output functions defined as $G_k^{I}$ and $P_k^{I}$, where k = 0 for the first 4-bit block, k = 1 for the second 4-bit block, and so on.

$P_0^{I} = P_3 P_2 P_1 P_0$
$G_0^{I} = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0$

With these new functions available, it is not necessary to wait for carries to ripple through the 4-bit blocks:

$c_{16} = G_3^{I} + P_3^{I} G_2^{I} + P_3^{I} P_2^{I} G_1^{I} + P_3^{I} P_2^{I} P_1^{I} G_0^{I} + P_3^{I} P_2^{I} P_1^{I} P_0^{I} c_0$

We should note that in this design we may ignore $c_4$, $c_8$, $c_{12}$ and $c_{16}$ generated internally by the 4-bit adder blocks, as they are generated by the higher-level carry-lookahead circuits.

Delay of the 16-bit adder

Within a 4-bit adder, $G_i$ and $P_i$ take 1 gate delay. $G_k^{I}$ and $P_k^{I}$ require two gate delays and one gate delay respectively after the generation of $G_i$ and $P_i$, so overall we get $G_k^{I}$ and $P_k^{I}$ after 3 gate delays. The carries $c_4$, $c_8$, $c_{12}$, $c_{16}$ are produced from them by the carry-lookahead circuit after a further 2 gate delays. Therefore, all carries produced by the carry-lookahead circuits are available 5 gate delays after X, Y and $c_0$ are applied.

$s_{3-0}$ are generated after 4 gate delays, with $c_{1-3}$ generated after 3 gate delays. The carries generated within the 4-bit adders, such as $c_{5-7}$, $c_{9-11}$ and $c_{13-15}$, require $c_4$, $c_8$, $c_{12}$ as carry-in to the 4-bit adders; the same holds for $s_{15-4}$. All these carries need another 2 gate delays, and after that another 1 gate delay is needed for the sum. So $s_{15-4}$ need 5 + 2 + 1 = 8 gate delays, and $c_{16}$ needs 5 gate delays. In contrast, a 16-bit adder implemented by cascading four 4-bit carry-lookahead adder blocks requires 9 and 10 gate delays for developing $c_{16}$ and $s_{15}$.

32-bit adder

A 32-bit adder can be implemented with two such 16-bit adder blocks. The second 16-bit adder gets the carry $c_{16}$ after 5 gate delays; within that period the $G_k^{I}$ and $P_k^{I}$ of the second 16-bit adder are already available. So after 2 more gate delays $c_{20}$, $c_{24}$, $c_{28}$ and $c_{32}$ are produced in the high-order block: 7 gate delays in total. After getting $c_{20}$, $c_{24}$, $c_{28}$, all the other carries $c_{21-23}$, $c_{25-27}$, $c_{29-31}$ require another 2 gate delays. So $c_{31}$ is out after 9 gate delays and $s_{31}$ is available after 10 gate delays. Overall, $c_{32}$ and $s_{31}$ are available after 7 and 10 gate delays, whereas $c_{32}$ and $s_{31}$ in the case of a cascade of eight 4-bit adders need 17 and 18 gate delays.

MULTIPLICATION OF POSITIVE NUMBERS

Example: multiplicand M = 1101 (13), multiplier Q = 1011 (11). The partial products 1101, 1101 (shifted left one position), 0000 (shifted two) and 1101 (shifted three) add up to the product P = 10001111 (143).

[Figure: array multiplier. Each typical cell receives a bit of the incoming partial product PPi, a multiplicand bit $m_j$, the multiplier bit $q_i$ and a carry-in, and produces a bit of the outgoing partial product PP(i+1) and a carry-out. PP4 = $p_7 p_6 \ldots p_0$ is the product.]

The simplest way to perform multiplication is in sequential steps. An adder adds two binary numbers and the result is stored in a register. Register A is used to store the partial products. Register Q is used to store the multiplier. Instead of shifting the multiplicand to the left as done by hand, here the combination of C, A and Q is shifted right.



At the start, the multiplier is loaded into register Q, the multiplicand into register M, and C and A are cleared to 0. Registers A and Q combined hold PPi, while multiplier bit $q_i$ generates the signal Add/Noadd. This signal controls the addition of the multiplicand, M, to PPi to generate PP(i+1).

If $q_i = 0$, the MUX gives all 0s; if $q_i = 1$, the MUX gives the multiplicand M to the adder. The product is computed in n cycles. The partial product grows in length by one bit per cycle from the initial vector, PP0, of n 0s in register A. The carry-out from the adder is stored in flip-flop C.

At the end of each cycle, C, A and Q are shifted right one bit position to allow for growth of the partial product as the multiplier is shifted out of register Q. Because of the shifting, the LSB of register Q, $q_0$, always contains the multiplier bit that generates the Add/Noadd signal. After n cycles the high-order half of the product is held in register A and the low-order half is in register Q.

Example: M = 1101, Q = 1011.

| Step | C | A | Q | Cycle |
|------|---|------|------|-------|
| Initial | 0 | 0000 | 1011 | |
| Add | 0 | 1101 | 1011 | First cycle |
| Shift | 0 | 0110 | 1101 | |
| Add | 1 | 0011 | 1101 | Second cycle |
| Shift | 0 | 1001 | 1110 | |
| No add | 0 | 1001 | 1110 | Third cycle |
| Shift | 0 | 0100 | 1111 | |
| Add | 1 | 0001 | 1111 | Fourth cycle |
| Shift | 0 | 1000 | 1111 | |

Result: 1000 1111 (143).

SIGNED-OPERAND MULTIPLICATION (2's complement)

Case I: negative multiplicand, positive multiplier.

Example: multiplicand 10011 (-13), multiplier 01011 (+11). The sign-extended summands are 1111110011, 111110011 (shifted left one position), 00000000 (shifted two), 1110011 (shifted three) and 000000 (shifted four). Their sum is 1101110001 (-143).

When we add a negative multiplicand to a partial product, we must extend the sign-bit value of the multiplicand to the left as far as the product will extend. The hardware discussed earlier can be used for negative multiplicands if it provides for sign extension of the partial products. This method does not work for a positive multiplicand and negative multiplier.

Case II: for a negative multiplier, a straightforward solution is to form the 2's complement of both the multiplier and the multiplicand and proceed as in the case of a positive multiplier. [ Remember Originally

Sign-and-magnitude multiply operation (flow chart)

1. Multiplicand in B, multiplier in Q.
2. $A_s \leftarrow Q_s \oplus B_s$, $Q_s \leftarrow Q_s \oplus B_s$; $A \leftarrow 0$, $E \leftarrow 0$, $SC \leftarrow n-1$.
3. If the least significant bit of Q is 1, $EA \leftarrow A + B$; if it is 0, nothing is added.
4. Shift EAQ right one bit; $SC \leftarrow SC - 1$.
5. If $SC \neq 0$, repeat from step 3. If $SC = 0$, end: the product is in AQ.

The process runs (n-1) times, as one bit is used for the sign bit. Registers A, B and Q are of size (n-1), and B and Q hold the multiplicand and multiplier without the sign bit.

[Worked example from the original page: an Add / Shift / No add trace of registers E, A and Q; the operand values are not fully legible in the source.]

Booth Multiplication Algorithm

The Booth algorithm gives a procedure for multiplying binary integers in signed 2's complement representation. In this method the multiplier is recoded, and according to that a version of the multiplicand is selected for addition.

Booth multiplier recoding table:

| Multiplier bit i | Multiplier bit i-1 | Version of multiplicand selected by bit i |
|---|---|---|
| 0 | 0 | 0 × M |
| 0 | 1 | +1 × M |
| 1 | 0 | -1 × M (the 2's complement of the multiplicand is added) |
| 1 | 1 | 0 × M |

For the rightmost bit, i.e. the LSB, assume its previous bit is 0.

Example:

Multiplier: 0 0 1 0 1 1 0 0 1 1 1 0 1 0 1 1 0 0
Recoded: 0 +1 -1 +1 0 -1 0 +1 0 0 -1 +1 -1 +1 0 -1 0 0

Worst-case multiplier: 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
Recoded: +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1 +1 -1

Ordinary multiplier: 1 1 0 0 0 1 0 1 1 0 1 1 1 1 0 0
Recoded: 0 -1 0 0 +1 -1 +1 0 -1 +1 0 0 0 -1 0 0

Good multiplier: 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 1
Recoded: 0 0 0 +1 0 0 0 0 -1 0 0 0 +1 0 0 -1

The Booth algorithm has two attractive features:

1. It handles both positive and negative multipliers uniformly.
2. It achieves some efficiency in the number of additions required when the multiplier has a few large blocks of 1s.

On average, the speed of doing multiplication with the Booth algorithm is the same as with the normal algorithm.
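The recoding table amounts to subtracting each multiplier bit from the bit on its right. A minimal sketch, with illustrative function names and the multiplier given as a bit-string (MSB first):

```python
def booth_recode(multiplier_bits):
    """Booth-recode a bit-string; returns digits in {-1, 0, +1}, MSB first."""
    bits = [int(b) for b in multiplier_bits] + [0]   # implied 0 right of the LSB
    # digit at position i = (bit to its right) - (bit i), as in the table
    return [bits[i + 1] - bits[i] for i in range(len(multiplier_bits))]


def recoded_value(digits):
    """Value represented by the recoded digits (MSB first)."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value
```

Recoding does not change the value: `recoded_value(booth_recode("11010"))` gives -6, the 2's complement value of 11010.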

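Flow chart for Booth algorithm multiplication

1. Start. $Z \leftarrow 0$ (1 bit), $A \leftarrow 0$ (n bit), $M \leftarrow$ multiplicand (n bit), $Q \leftarrow$ multiplier (n bit), Count $\leftarrow n$.
2. Examine $q_0 Z$: if 10, $A \leftarrow A - M$; if 01, $A \leftarrow A + M$; if 00 or 11, do nothing.
3. Arithmetic right shift of A, Q, Z by 1 bit.
4. Count $\leftarrow$ Count - 1.
5. If Count == 0, stop; otherwise go back to step 2.

[Figure: hardware configuration and register configuration, showing the multiplier register initially loaded, the control sequencer, the Add/Sub control and the M enable signal.]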
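The flow chart can be followed step by step in software. This is a sketch under stated assumptions: registers A and Q are modelled as n-bit unsigned integers, Z is the extra bit to the right of Q, and the multiplicand must not be the most negative n-bit value (for which A - M would overflow). The function name is illustrative.

```python
def booth_multiply(multiplicand, multiplier, n):
    """Multiply two signed n-bit integers by following the Booth flow chart."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    m = multiplicand & mask
    a, q, z = 0, multiplier & mask, 0
    for _ in range(n):
        pair = (q & 1, z)
        if pair == (1, 0):
            a = (a - m) & mask        # A <- A - M
        elif pair == (0, 1):
            a = (a + m) & mask        # A <- A + M
        # arithmetic right shift of the combined A, Q, Z
        z = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & sign)     # sign bit of A is replicated
    product = (a << n) | q            # 2n-bit result held in A,Q
    if product & (1 << (2 * n - 1)):
        product -= 1 << (2 * n)       # interpret as signed 2's complement
    return product
```

For example, `booth_multiply(-13, 11, 5)` follows the 5-bit case of a negative multiplicand with a positive multiplier.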

## FAST MULTIPLICATION

BIT-PAIR RECODING OF MULTIPLIERS

Bit-pair recoding halves the maximum number of summands. It is derived from the Booth algorithm. The pair (+1, -1) is equivalent to (0, +1): adding -1 × M at position i to +1 × M at position i+1 is equivalent to adding +1 × M at position i. Similarly, (+1, 0) is equivalent to (0, +2), (-1, +1) is equivalent to (0, -1), and so on.

Example: multiplier 1 1 0 1 0 with sign extension 1 on the left and an implied 0 to the right of the LSB, i.e. 1 1 1 0 1 0 (0).

Booth recoding: 0 0 -1 +1 -1 0
Bit-pair recoding: 0 -1 -2

Table of multiplicand selection decisions for bit-pair recoding:

| Multiplier bit i+1 | Multiplier bit i | Multiplier bit on the right (i-1) | Multiplicand selected at position i |
|---|---|---|---|
| 0 | 0 | 0 | 0 × M |
| 0 | 0 | 1 | +1 × M |
| 0 | 1 | 0 | +1 × M |
| 0 | 1 | 1 | +2 × M |
| 1 | 0 | 0 | -2 × M |
| 1 | 0 | 1 | -1 × M |
| 1 | 1 | 0 | -1 × M |
| 1 | 1 | 1 | 0 × M |

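Example: 01101 (+13) × 11010 (-6), with the multiplier recoded as 0 -1 -2.

Summands: -2 × M = 1111100110; -1 × M shifted left two positions = 1111001100; 0 × M = 0.
Product = 1110110010 (-78).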

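INTEGER DIVISION

→ Restoring division
→ Non-restoring division

Longhand example: divisor 1101 (13), dividend 100010010 (274). 1101 is subtracted from 10001 leaving 100, then from 10000 leaving 11, then from 1110 leaving 1. Quotient = 10101 (21), remainder = 1.

[Figure: circuit arrangement for binary division. Register A and the dividend register Q are shifted left together; an (n+1)-bit adder with Add/Subtract control combines A with the divisor register M; a control sequencer sets quotient bit $q_0$.]

Flow chart for restoring division

1. START. $A \leftarrow 0$ (n+1 bit), $Q \leftarrow$ dividend (n bit), $M \leftarrow$ divisor (n+1 bit), Count $\leftarrow n$.
2. Left shift A, Q by 1 bit.
3. $A \leftarrow A - M$.
4. If the sign bit of A is 1: $q_0 = 0$ and $A \leftarrow A + M$ (restore A). Otherwise $q_0 = 1$.
5. Count $\leftarrow$ Count - 1.
6. If Count = 0, stop; otherwise go back to step 2.

At the end, the quotient (n bit) is in register Q and the remainder is in register A.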

Restoring division algorithm. Do the following n times:

1. Shift A and Q left one binary position.
2. Subtract M from A, and place the answer back in A.
3. If the sign of A is 1, set $q_0$ to 0 and add M back to A (that is, restore A); otherwise, set $q_0$ to 1.

Example: 1000 ÷ 11 (8 ÷ 3). Subtraction is done by adding the 2's complement of M, 11101.

| Step | A | Q | Cycle |
|------|-------|------|-------|
| Initially | 00000 | 1000 | M = 00011 |
| Shift | 00001 | 000_ | First cycle |
| Subtract | 11110 | | |
| Set $q_0$ = 0, restore | 00001 | 0000 | |
| Shift | 00010 | 000_ | Second cycle |
| Subtract | 11111 | | |
| Set $q_0$ = 0, restore | 00010 | 0000 | |
| Shift | 00100 | 000_ | Third cycle |
| Subtract | 00001 | | |
| Set $q_0$ = 1 | 00001 | 0001 | |
| Shift | 00010 | 001_ | Fourth cycle |
| Subtract | 11111 | | |
| Set $q_0$ = 0, restore | 00010 | 0010 | |

Quotient = 0010, remainder = 00010.

An n-bit positive divisor is loaded into register M and an n-bit positive dividend is loaded into register Q. Register A is set to 0. After the division is complete, the n-bit quotient is in register Q and the remainder is in register A. The required subtractions are done by 2's complement addition.
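The restoring steps above can be sketched in software. Assumptions for this illustration: the dividend and divisor are positive and fit in n bits, the divisor is non-zero, A is modelled as an (n+1)-bit register, and the function name is made up.

```python
def restoring_divide(dividend, divisor, n):
    """Restoring division of n-bit positive integers; returns (quotient, remainder)."""
    a_mask = (1 << (n + 1)) - 1      # A is n+1 bits wide
    q_mask = (1 << n) - 1
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        # shift A and Q left one position; MSB of Q moves into A
        a = ((a << 1) | (q >> (n - 1))) & a_mask
        q = (q << 1) & q_mask
        a = (a - m) & a_mask         # subtract M (2's complement addition)
        if a >> n:                   # sign of A is 1: restore, q0 stays 0
            a = (a + m) & a_mask
        else:                        # sign of A is 0: set q0 to 1
            q |= 1
    return q, a
```

`restoring_divide(8, 3, 4)` goes through the same four cycles as the table and ends with quotient 0010 and remainder 00010.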

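Non-restoring division

Restore operations are not needed: there is no need to add the divisor back to restore A. Instead, the divisor is added or subtracted depending on the sign of A.

Do the following n times:

1. If the sign bit of A is 0, shift A and Q left one bit position and subtract M from A, and set $q_0$ to the opposite of the sign bit of A after the subtraction.
2. If the sign bit of A is 1, shift A and Q left one bit position and add M to A, and set $q_0$ to the opposite of the sign bit of A after the addition.

3. At the end of n cycles, if the sign of A is 1, add M to A to leave the proper positive remainder.

Example: 1000 ÷ 11 (8 ÷ 3), M = 00011.

| Step | A | Q | Cycle |
|------|-------|------|-------|
| Initially | 00000 | 1000 | |
| Shift | 00001 | 000_ | First cycle |
| Subtract | 11110 | | |
| Set $q_0$ | 11110 | 0000 | |
| Shift | 11100 | 000_ | Second cycle |
| Add | 11111 | | |
| Set $q_0$ | 11111 | 0000 | |
| Shift | 11110 | 000_ | Third cycle |
| Add | 00001 | | |
| Set $q_0$ | 00001 | 0001 | |
| Shift | 00010 | 001_ | Fourth cycle |
| Subtract | 11111 | | |
| Set $q_0$ | 11111 | 0010 | |

Quotient = 0010. The sign of A is 1, so add M to A: 11111 + 00011 = 00010, the remainder.

Flow chart for non-restoring division

1. START. $A \leftarrow 0$, $Q \leftarrow$ dividend (n bit), $M \leftarrow$ divisor (n+1 bit), Count $\leftarrow n$.
2. If the sign of A is 0: left shift A, Q by 1 bit, then $A \leftarrow A - M$. If the sign of A is 1: left shift A, Q by 1 bit, then $A \leftarrow A + M$.
3. Set $q_0$ to the opposite of the sign of A.
4. Count $\leftarrow$ Count - 1. If Count is not 0, go back to step 2.
5. If the sign of A is 1, $A \leftarrow A + M$. Stop.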
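A software sketch of the non-restoring procedure is given below, under the same assumptions as for restoring division in these notes: positive n-bit operands, a non-zero divisor, and A modelled as an (n+1)-bit register. The function name is illustrative.

```python
def nonrestoring_divide(dividend, divisor, n):
    """Non-restoring division of n-bit positive integers; returns (quotient, remainder)."""
    a_mask = (1 << (n + 1)) - 1
    q_mask = (1 << n) - 1
    sign = 1 << n                    # sign bit of the (n+1)-bit register A
    a, q, m = 0, dividend, divisor
    for _ in range(n):
        negative = bool(a & sign)    # sign of A before the shift
        a = ((a << 1) | (q >> (n - 1))) & a_mask
        q = (q << 1) & q_mask
        if negative:
            a = (a + m) & a_mask     # A was negative: add M
        else:
            a = (a - m) & a_mask     # A was non-negative: subtract M
        if not (a & sign):
            q |= 1                   # q0 is the opposite of the sign of A
    if a & sign:
        a = (a + m) & a_mask         # final correction of the remainder
    return q, a
```

Compared with the restoring version, each cycle performs exactly one addition or subtraction; only the last step may need one extra addition.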

FLOATING-POINT NUMBERS AND OPERATIONS

Example of a binary fraction:

$1 \times 2^{0} + 1 \times 2^{-2} + 1 \times 2^{-3} + 1 \times 2^{-5} = 1 + \frac{1}{4} + \frac{1}{8} + \frac{1}{32} = 1 + 0.25 + 0.125 + 0.03125 = 1.40625$

Range of values for a 32-bit, signed, fixed-point format = 0 to $\pm 2.15 \times 10^{9}$ (integers only).

For a 32-bit fraction, the range is about $\pm 4.66 \times 10^{-10}$ (that is, $2^{-31}$) to $\pm 1$.

Neither of these ranges is enough to represent values such as Avogadro's number ($6.0247 \times 10^{23}$ mole$^{-1}$) or Planck's constant ($6.6254 \times 10^{-27}$ erg·s). To represent very large integers and very small fractions we must have a floating binary point. Such numbers are called floating-point numbers, whereas a fixed-point number has its binary point fixed.

In the case of the numbers $6.0247 \times 10^{23}$ and $6.6254 \times 10^{-27}$, the number of significant digits = 5 and the scale factors = $10^{23}$, $10^{-27}$. Scale factors indicate the position of the decimal point. If the decimal point is placed just after the first significant digit, those numbers are called normalized numbers. The string of significant digits is commonly called the mantissa, and the power in the scale factor is known as the exponent.

IEEE Standard for Floating-Point Numbers

- Single precision (32 bit)
- Double precision (64 bit)
- Extended precision (80 bit)

Single precision: 1 sign bit, 8-bit signed exponent in excess-127 representation, 23-bit mantissa fraction.

Instead of the signed exponent E, the value actually stored in the exponent field is an unsigned integer E' = E + 127. This is called the excess-127 format. Thus E' is in the range 0 ≤ E' ≤ 255. The end values 0 and 255 are used to represent special values as described below. Therefore E' for normal values is 1 ≤ E' ≤ 254. This means E is in the range -126 to 127.

As a normalized mantissa is stored, the most significant bit of the mantissa, to the left of the binary point, is always 1. The 23-bit mantissa field does not explicitly store this bit; only the fractional part is represented.

Example: 0 00101000 001010...0 has E' = 40, so the exponent is 40 - 127 = -87 and the value is $+1.001010\ldots0 \times 2^{-87}$.

Special values

- When E' = 0 and the mantissa fraction M is 0, the value is exact 0.
- When E' = 255 and M = 0, the value is ∞. As the sign bit is still part of these representations, ±0 and ±∞ can be represented.
- When E' = 0 and M ≠ 0, denormal numbers, i.e. numbers smaller than the smallest normal number, are represented ($\pm 0.M \times 2^{-126}$). This is to allow gradual underflow.
- When E' = 255 and M ≠ 0, the value is Not a Number (NaN). A NaN is the result of performing an invalid operation such as 0/0 or $\sqrt{-1}$.
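The field layout and the special values can be explored with a short sketch. The function names are illustrative; Python's standard `struct` module is used only to reinterpret the 32-bit pattern as a float for cross-checking.

```python
import struct


def decode_single(word):
    """Split a 32-bit pattern into (sign, stored exponent E', fraction, kind)."""
    sign = word >> 31
    e_stored = (word >> 23) & 0xFF        # E' = E + 127
    fraction = word & 0x7FFFFF            # 23-bit mantissa fraction
    if e_stored == 255:
        kind = "infinity" if fraction == 0 else "NaN"
    elif e_stored == 0:
        kind = "zero" if fraction == 0 else "denormal"
    else:
        kind = "normal"
    return sign, e_stored, fraction, kind


def normal_value(word):
    """Value of a normal number: (-1)^sign * 1.fraction * 2^(E' - 127)."""
    sign, e_stored, fraction, kind = decode_single(word)
    if kind != "normal":
        raise ValueError("not a normal number")
    mantissa = 1 + fraction / (1 << 23)   # restore the implicit leading 1
    return (-1) ** sign * mantissa * 2.0 ** (e_stored - 127)


def single_value(word):
    """Cross-check: let struct interpret the same 32 bits as a float."""
    return struct.unpack(">f", struct.pack(">I", word))[0]
```

As a usage example, the pattern 0x42A00000 decodes to sign 0, E' = 133 and a fraction of 0.25, i.e. 1.25 × 2^6.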

Example: the unnormalized value $0.0010110\ldots \times 2^{9}$ has the normalized version $1.0110\ldots \times 2^{6}$. Then $E' = 6 + 127 = 133 = 128 + 4 + 1$, and the number is represented as 0 10000101 0110....

Double precision: The scale factor of single precision has a range of $2^{-126}$ to $2^{127}$, which is approximately equal to $10^{\pm 38}$. The 24-bit mantissa provides approximately the same precision as a 7-digit decimal value. The 64-bit double-precision format has a sign bit, an 11-bit excess-1023 exponent $E'$ and a 52-bit mantissa fraction. The actual exponent has the range $-1022 \le E \le 1023$, providing scale factors of $2^{-1022}$ to $2^{1023}$ (approximately $10^{\pm 308}$). The 52-bit mantissa provides a precision equivalent to about 16 decimal digits.

If a number cannot be represented within the representable range, we might not get the actual value. If the number requires an exponent less than $-126$ in the case of single precision, we call it underflow. On the other hand, if it requires an exponent of more than 127, we call it overflow.

A processor must set exception flags if any of the following occurs: underflow, overflow, divide by zero, inexact, invalid. Inexact is the name for a result that requires rounding in order to be represented in one of the normal formats. An invalid exception occurs if operations such as $0/0$ or $\sqrt{-1}$ are attempted.

Exercise: convert $32.5 \times 10^{2}$, $162.75 \times 10^{[?]}$ and 157.5 to single-precision floating-point format.
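The normalization example ($0.0010110 \times 2^{9}$) can be cross-checked with Python's standard `struct` module, which packs a float into the IEEE single-precision layout. The helper name `single_fields` is an assumption made for this sketch.

```python
import struct


def single_fields(x):
    """Return (sign, E', 23-bit fraction as a bit string) for the float x."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, format(bits & 0x7FFFFF, "023b")


# 0.0010110 (binary) x 2^9 = 0.171875 x 512 = 88.0 = 1.011 (binary) x 2^6
x = 0b0010110 / 2**7 * 2**9
sign, e_stored, frac = single_fields(x)
print(sign, format(e_stored, "08b"), frac)
# prints: 0 10000101 01100000000000000000000
```

The stored exponent 10000101 is 133 = 6 + 127, matching the hand calculation.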

## ARITHMETIC OPERATIONS ON FLOATING-POINT NUMBERS

Add/Subtract Rule

1. Choose the number with the smaller exponent and shift its mantissa right a number of steps equal to the difference in exponents.
2. Set the exponent of the result equal to the larger exponent.
3. Perform addition/subtraction on the mantissas and determine the sign of the result.
4. Normalize the resulting value, if necessary.

Example (decimal): $A = 1.02356 \times 10^{15}$, $B = 1.37853 \times 10^{18}$. As the exponent of A is smaller, shift the mantissa of A right by three places: $A = 0.00102356 \times 10^{18}$. Add the mantissas of A and B: $0.00102356 + 1.37853 = 1.37955356$. Result: $A + B = 1.37955356 \times 10^{18}$.

Example (binary): add two single-precision floating-point numbers, A = 44900000H and B = 42A00000H.

| Number | Sign | Exponent | Mantissa fraction |
|---|---|---|---|
| A | 0 | 10001001 | 0010000 0000 0000 0000 0000 |
| B | 0 | 10000101 | 0100000 0000 0000 0000 0000 |

As the exponent of A (137) is 4 more than the exponent of B (133), shift the mantissa of B right by 4 bits: 1.0100000... becomes 0.0001010.... Add the mantissas of A and B: 1.0010000... + 0.0001010... = 1.0011010....

Result:

| Sign | Exponent | Mantissa fraction |
|---|---|---|
| 0 | 10001001 | 0011010 0000 0000 0000 0000 |

This is 449A0000H.
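The four-step add rule can be sketched on raw bit patterns as follows. This is a simplified illustration, not a full IEEE adder: it assumes both operands are positive normal numbers and simply drops the bits shifted out in step 1. The function name `add_positive` is hypothetical, and the operand B is taken as 42A00000H, the value consistent with the 4-bit shift used in the worked example.

```python
def add_positive(a_bits, b_bits):
    """Add two positive, normal single-precision numbers given as bit patterns."""
    ea, eb = (a_bits >> 23) & 0xFF, (b_bits >> 23) & 0xFF
    # restore the implied leading 1 to get 24-bit mantissas
    ma = (a_bits & 0x7FFFFF) | (1 << 23)
    mb = (b_bits & 0x7FFFFF) | (1 << 23)
    if ea < eb:                       # make A the operand with the larger exponent
        ea, eb, ma, mb = eb, ea, mb, ma
    mb >>= ea - eb                    # step 1: align the smaller operand
    e = ea                            # step 2: result takes the larger exponent
    m = ma + mb                       # step 3: add the mantissas
    if m >> 24:                       # step 4: normalize if the sum reached 2.0
        m >>= 1
        e += 1
    return (e << 23) | (m & 0x7FFFFF)


A = 0x44900000   # 1.001 (binary) x 2^10 = 1152
B = 0x42A00000   # 1.01  (binary) x 2^6  = 80 (assumed value of B, see above)
print(format(add_positive(A, B), "08X"))   # prints: 449A0000
```

A real adder would also keep guard bits during the alignment shift and round the final mantissa.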

Multiply Rule

1. Add the exponents and subtract 127.
2. Multiply the mantissas and determine the sign of the result.
3. Normalize the resulting value, if necessary.

Divide Rule

1. Subtract the exponents and add 127.
2. Divide the mantissas and determine the sign of the result.
3. Normalize the resulting value, if necessary.

## GUARD BITS AND TRUNCATION

The mantissa is limited to 24 bits, including the implied leading 1. During intermediate steps or in the final result, we may get a mantissa of more than 24 bits. These extra bits are called guard bits. Keeping guard bits during intermediate steps increases the accuracy. However, the guard bits of the final result must be truncated.

When we simply remove the guard bits, it is called chopping. For example, when a 6-bit fraction is chopped to 3 bits, the error ranges from 0 to 0.000111, i.e. from 0 to almost 1 in the least significant position of the retained bits.
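As a small sketch, chopping can be compared with the two other truncation methods covered in this section (Von Neumann rounding and rounding) on an integer mantissa whose low `g` bits are guard bits. The function names are illustrative only, and `round_simple` follows the simplified rule of these notes; it does not model the round-to-even handling of exact ties.

```python
def chop(m, g):
    """Chopping: drop the g guard bits."""
    return m >> g


def von_neumann(m, g):
    """Von Neumann rounding: set the retained LSB to 1 if any guard bit is 1."""
    kept = m >> g
    if m & ((1 << g) - 1):
        kept |= 1
    return kept


def round_simple(m, g):
    """Rounding: add 1 to the retained LSB if the MSB of the guard bits is 1.
    The result may carry out of the top bit and then needs renormalizing."""
    kept = m >> g
    if (m >> (g - 1)) & 1:
        kept += 1
    return kept


# Truncate 6-bit values to 3 bits (g = 3 guard bits)
for m in (0b100000, 0b100011, 0b100100, 0b101011):
    print(format(m, "06b"), chop(m, 3), von_neumann(m, 3), round_simple(m, 3))
```

Running the loop shows that chopping never rounds up, Von Neumann rounding forces the retained LSB to 1 whenever information is lost, and rounding moves to the nearer retained value.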

The result of chopping is a biased approximation because the error range is not symmetrical about 0.

In Von Neumann rounding, the least significant bit of the retained bits is set to 1 if any of the guard bits is 1; otherwise the guard bits are simply ignored. The error in this truncation method ranges from $-1$ to $+1$ in the LSB position of the retained bits.

In the rounding procedure, 1 is added to the LSB of the retained bits if the MSB of the guard bits is 1; otherwise the guard bits are ignored. The error range is approximately $-\frac{1}{2}$ to $+\frac{1}{2}$ in the LSB position of the retained bits. This is the IEEE default mode of truncation.

## THE MEMORY SYSTEM

Some Basic Concepts

The maximum size of the memory that can be used in any computer is determined by the addressing scheme. A 16-bit address is capable of addressing up to $2^{16}$ = 64K memory locations. For reference, $2^{10}$ bytes = 1 KB, $2^{20}$ bytes = 1 MB, $2^{30}$ bytes = 1 GB and $2^{40}$ bytes = 1 TB. A machine whose instructions generate 32-bit addresses can utilize a memory that contains up to $2^{32}$ = 4G memory locations. The number of locations represents the size of the address space of the computer.

Most modern computers are byte addressable. As far as the memory structure is concerned, there is no substantial difference between the big-endian and little-endian schemes. The memory is usually designed to store and retrieve data in word-length quantities. In a Read operation on a single byte, other bytes may be fetched from the memory, but they are ignored by the processor. Similarly, if a byte of data needs to be written, the control circuitry of the memory must ensure that the contents of the other bytes of the same word are not changed.

If the MAR is $k$ bits long, then the memory unit may contain up to $2^{k}$ addressable locations. If the MDR is $n$ bits long, then during a memory cycle $n$ bits of data are transferred between the memory and the processor.
($n$ = length of a word.) The processor bus has $k$ address lines and $n$ data lines. The bus also includes the control lines Read/Write (R/W) and Memory Function Completed (MFC) for coordinating data transfers. Other control lines may be added to indicate the number of bytes to be transferred. During a read, the processor sets R/W to 1 and loads the address into the MAR. The memory controller sets MFC after it places the data on the data lines. During a write, the processor sets R/W to 0, loads the address into the MAR and loads the data into the MDR.

Memory access time: This is the time that elapses between the initiation of an operation and the completion of that operation, e.g. the time between the Read and the MFC signals.

Memory cycle time: This is the minimum time delay required between the initiation of two successive memory operations, e.g. the time between two successive Read operations. These two are important measures of the speed of a memory unit. The cycle time is usually slightly longer than the access time.

A memory unit is called random-access memory (RAM) if any location can be accessed in a fixed amount of time that is independent of the location's address.

Cache memory: This is a small, fast memory that is inserted between the larger, slower main memory and the processor. It holds the currently active segments of a program and their data. The processor can usually process instructions and data faster than they can be fetched from the memory unit. To reduce the memory access time, cache memory is used.

Virtual memory: The size of main memory is small, and most programs and data do not fit into main memory. The virtual memory concept makes it appear as if we have much more memory than the main memory. The memory control circuitry translates the address specified by the program into an address that can be used to access the physical memory. The address generated by the processor is called a virtual address. Data are addressed in a virtual address space that can be as large as the addressing capability of the processor. But at any given time, only the active portion of this space is mapped onto locations in the physical memory. The remaining virtual addresses are mapped onto the bulk storage devices used, which are usually magnetic disks.

## CACHE MEMORY

The speed of the main memory is very low in comparison with the speed of modern processors. The processor cannot afford to spend much of its time waiting to access instructions and data in main memory. An efficient solution is to use a fast cache memory, which essentially makes the main memory appear to the processor to be faster than it really is.

The effectiveness of the cache mechanism is based on a property of computer programs called locality of reference. Analysis of programs shows that many instructions in localized areas of the program are executed repeatedly, and the remainder of the program is accessed relatively infrequently. This is referred to as locality of reference. There are two aspects of locality of reference:

1. Temporal: a recently executed instruction is likely to be executed again very soon.
2. Spatial: instructions in close proximity to a recently executed instruction are also likely to be executed soon.

To take advantage of this locality of reference, it is useful to fetch several items that reside at adjacent addresses as well. Such a block of contiguous address locations of some size is also referred to as a cache line in the case of a cache.

If the active segments of a program can be placed in a fast cache memory, then the total execution time can be reduced significantly. But the size of the cache is limited. Usually, the cache memory can store a reasonable number of blocks at any given time, but this number is small compared to the total number of blocks in main memory.

Replacement algorithm: When the cache is full and a memory word that is not in the cache is referenced, the cache control hardware must decide which block should be removed to make room for the new block. The collection of rules for making this decision constitutes the replacement algorithm.

Hit and miss: If the referenced data is available in the cache, we call it a read or write hit. Otherwise we call it a cache miss (read miss or write miss).

Write-through and write-back (copy-back): When we read from the cache there is no problem. If we need to write something to memory, we can proceed in two ways. In the write-through protocol, the cache location and the main memory location are updated simultaneously. In the write-back protocol, we update only the cache location and mark it as updated with an associated flag bit (the dirty or modified bit). The main memory location is updated later, when the block is removed from the cache.

Read miss: When a read miss occurs, the block containing the word is loaded into the cache. The requested word can be sent to the processor after the whole block has been loaded into the cache, or it can be sent as soon as it is read from main memory. This latter approach, called load-through or early restart, reduces the processor's waiting period at the expense of more complex circuitry.

Write miss: In the case of the write-through protocol, the information is written directly into the main memory. In the case of write-back, the block containing the addressed word is first brought into the cache, and then the desired word in the cache is overwritten with the new information.
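To contrast the two write policies, here is a deliberately tiny model: a cache that holds a single word in front of a Python list standing in for main memory. The class name `OneWordCache` and its methods are hypothetical, chosen only for this sketch. A real cache holds many multi-word blocks.

```python
class OneWordCache:
    """Toy one-entry cache in front of `memory` (a list of words)."""

    def __init__(self, memory, write_back):
        self.memory = memory
        self.write_back = write_back
        self.addr = None      # address currently cached, if any
        self.value = None
        self.dirty = False    # set when the cached copy is newer than memory

    def _load(self, addr):
        if self.addr != addr:                      # miss: replace the entry
            if self.write_back and self.dirty:
                self.memory[self.addr] = self.value   # copy back on removal
            self.addr, self.value, self.dirty = addr, self.memory[addr], False

    def read(self, addr):
        self._load(addr)
        return self.value

    def write(self, addr, value):
        if self.write_back:
            self._load(addr)          # write miss: bring the word in first
            self.value = value        # update the cache only
            self.dirty = True
        else:
            if self.addr == addr:     # write hit: keep the cached copy in step
                self.value = value
            self.memory[addr] = value  # write-through: memory updated at once


mem = [10, 20]
cache = OneWordCache(mem, write_back=True)
cache.write(0, 99)
print(mem[0])    # still the old value: main memory is stale for now
cache.read(1)    # replaces the entry, so the dirty word is written back
print(mem[0])
```

The window in which `mem[0]` is stale is exactly what makes write-back faster on repeated writes, and also what forces extra care when another device reads main memory directly.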

## MAPPING FUNCTIONS

The correspondence between the main memory blocks and those in the cache is specified by a mapping function.

Assumptions: For simplicity, let us assume that the cache consists of 128 blocks of 16 words each, for a total of 2048 words, and that main memory is addressable by a 16-bit address. The main memory has 64K words, which we will view as 4K blocks of 16 words each. We also assume that consecutive addresses refer to consecutive words.

Direct mapping: This is the simplest way to determine the cache locations in which to store memory blocks. In this technique, block $j$ of main memory maps onto block $j$ modulo 128 of the cache. For example, main memory blocks 0, 128, 256, ... are loaded into block 0 of the cache, blocks 1, 129, 257, ... are loaded into block 1 of the cache, and so on.

Main memory address (16 bits):

| Tag | Block | Word |
|---|---|---|
| 5 | 7 | 4 |

[Figure: direct-mapped cache. Main memory blocks 0 to 4095 map onto cache blocks 0 to 127, each cache block stored with its tag.]

In the 16-bit memory address, the 4 low-order bits select the word. The next 7 bits decide the block number where the block is stored in the cache. The remaining 5 bits are the tag, which is used to find whether that particular block is present in the cache by comparing the tag bits of the corresponding cache block with these 5 tag bits.

Disadvantage: Since one block of the cache is reserved for more than one block of main memory, contention may arise for that position even when the cache is not full. For example, the instructions of a program may start in block 1 and continue in block 129, possibly after a branch.

The replacement strategy is trivial: no replacement algorithm is needed.
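The 5/7/4 address split for the example cache (128 blocks of 16 words, 16-bit addresses) can be sketched as below. The names `direct_mapped_fields` and `access`, and the list `tags`, are illustrative assumptions; only the tags are modelled, not the data.

```python
def direct_mapped_fields(addr):
    """Split a 16-bit address into (tag, cache block, word)."""
    word = addr & 0xF             # low-order 4 bits: word within the block
    block = (addr >> 4) & 0x7F    # next 7 bits: cache block (memory block mod 128)
    tag = addr >> 11              # high-order 5 bits: tag
    return tag, block, word


tags = [None] * 128               # tag stored with each cache block


def access(addr):
    """Return True on a hit; on a miss, load the block (record its tag)."""
    tag, block, _ = direct_mapped_fields(addr)
    hit = tags[block] == tag
    tags[block] = tag
    return hit


# Memory blocks 1 and 129 both map to cache block 1 and keep evicting each other
block_1, block_129 = 1 * 16, 129 * 16
print([access(a) for a in (block_1, block_129, block_1, block_129)])
```

The alternating accesses all miss even though 127 cache blocks are still empty, which is the contention problem described above.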

Associative mapping: This is a much more flexible mapping method than direct mapping. A main memory block can be placed in any block position of the cache. The 4 low-order bits of the main memory address identify the word. The remaining 12 bits are used as the tag. During a search, the 12-bit tag is compared with the tag bits of all the blocks of the cache. Searching on the 12-bit tag is known as an associative search. For performance reasons, the tags must be searched in parallel. In this case we need a replacement algorithm. The hardware cost is high due to the associative search.

| Tag | Word |
|---|---|
| 12 | 4 |

[Figure: associative-mapped cache. Any of main memory blocks 0 to 4095 can occupy any of cache blocks 0 to 127, each stored with its tag.]

Set-associative mapping: A combination of the direct and associative mapping techniques can be used, and is called set-associative mapping. In this technique the blocks of the cache are grouped into sets, and the mapping allows a block of the main memory to reside in any block of a specific set. Hence the contention problem is eased by having a few choices for block placement. The hardware cost is reduced because the size of the associative search is smaller. In our example with 2 blocks per set, memory blocks 0, 64, 128, ..., 4032 map into cache set 0, and they can occupy either of the two block positions within this set. Having 64 sets means that the 6-bit set field of the address determines the set, and the 6-bit tag is compared with the tags of the two blocks of that set.

| Tag | Set | Word |
|---|---|---|
| 6 | 6 | 4 |

[Figure: 2-way set-associative cache. Cache blocks 0 to 127 grouped into sets 0 to 63, two blocks per set, each stored with its tag.]
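The field widths for different set sizes follow mechanically from the example parameters. The sketch below assumes the same cache as before (128 blocks, 16 words per block, 16-bit address); the function name `set_assoc_fields` and its parameters are made up for illustration. With `ways=1` it reduces to direct mapping and with `ways=128` to fully associative mapping.

```python
def set_assoc_fields(addr, ways, cache_blocks=128, words_per_block=16, addr_bits=16):
    """Return ((tag, set, word), (tag_bits, set_bits, word_bits)) for addr."""
    word_bits = words_per_block.bit_length() - 1   # 16 words -> 4 bits
    sets = cache_blocks // ways                    # number of sets in the cache
    set_bits = sets.bit_length() - 1               # 64 sets -> 6 bits
    tag_bits = addr_bits - set_bits - word_bits
    word = addr & (words_per_block - 1)
    set_index = (addr >> word_bits) & (sets - 1)
    tag = addr >> (word_bits + set_bits)
    return (tag, set_index, word), (tag_bits, set_bits, word_bits)


for ways in (1, 2, 4, 8, 128):
    print(ways, set_assoc_fields(0, ways)[1])

# Memory blocks 0, 64, 128, ..., 4032 all fall into set 0 of the 2-way cache
print({set_assoc_fields(b * 16, 2)[0][1] for b in range(0, 4033, 64)})
```

The power-of-two shortcut via `bit_length` only works when the block count, set count and block size are powers of two, as they are here.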

A cache that has $k$ blocks per set is referred to as a $k$-way set-associative cache. For the example cache, the set and tag field widths are:

| Organization | Set bits | Tag bits |
|---|---|---|
| 2-way | 6 | 6 |
| 4-way | 5 | 7 |
| 8-way | 4 | 8 |

Valid bit: There is a control bit, different from the dirty (modified) bit, that indicates whether a block holds valid data or invalid/stale data. A value of 1 means valid. It ensures that stale data is not used from the cache. Transfers from disk to main memory are carried out by a DMA (Direct Memory Access) mechanism. Normally DMA bypasses the cache for cost and performance reasons.

Cache coherence problem: If the write-back protocol is used, then until we transfer a block from the cache to main memory, main memory does not have the updated copy. If during this period a DMA transfer is made from main memory to disk, we have two different versions of the data, i.e. one on disk and the other with the processor. This problem is known as the cache coherence problem. One solution is to transfer the blocks with dirty bit 1 to main memory and then perform the DMA transfer.

Replacement algorithm: In a direct-mapped cache, the position of each block is predetermined; hence no replacement strategy exists. In associative and set-associative caches there exists some flexibility. The property of locality of reference in programs gives a clue to a reasonable strategy. Because programs usually stay in localized areas for reasonable periods of time, there is a high probability that the blocks that have been referenced recently will be referenced again soon. Therefore, when a block is to be overwritten, it is sensible to overwrite the one that has gone the longest time without being referenced. This block is called the least recently used (LRU) block, and the technique is called the LRU replacement algorithm.

LRU Replacement Algorithm

The cache controller uses a counter for each block to track references to that block. For a 4-block set in a set-associative cache, a 2-bit counter is required for each block. The counter of each block is set by the following rules.

1. When a hit occurs, the counter of the block that is referenced is set to 0. Counters with values originally lower than the referenced one are incremented by one; all others remain unchanged.
2. When a miss occurs and the set is not full, the counter associated with the new block loaded from main memory is set to 0, and all other counters are incremented by one.
3. When a miss occurs and the set is full, the block with the counter value 3 is removed, the new block is put in its place, and its counter is set to 0. The other three block counters are incremented by 1.
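The three counter rules can be sketched for a single set as follows. The class name `LRUSet` is hypothetical, and blocks are identified simply by a tag value. In a full 4-block set the counters are always a permutation of 0 to 3, so the block with the largest counter is the one holding the value 3.

```python
class LRUSet:
    """One cache set with a per-block LRU counter (2 bits suffice for 4 blocks)."""

    def __init__(self, ways=4):
        self.ways = ways
        self.counter = {}              # block tag -> counter value

    def access(self, tag):
        """Reference a block; return True on a hit, False on a miss."""
        if tag in self.counter:        # rule 1: hit
            ref = self.counter[tag]
            for t in list(self.counter):
                if self.counter[t] < ref:
                    self.counter[t] += 1
            self.counter[tag] = 0
            return True
        if len(self.counter) == self.ways:   # rule 3: miss, set full
            victim = max(self.counter, key=self.counter.get)  # counter == ways - 1
            del self.counter[victim]
        for t in self.counter:         # rules 2 and 3: age the remaining blocks
            self.counter[t] += 1
        self.counter[tag] = 0
        return False


s = LRUSet()
for block in "ABCD":
    s.access(block)       # four misses fill the set: A=3, B=2, C=1, D=0
s.access("A")             # hit: A becomes 0, the other three are incremented
s.access("E")             # miss on a full set: B (counter 3) is replaced
print(sorted(s.counter.items()))
```

After the last access block B has been evicted, since it is the block that has gone longest without being referenced.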

LRU performs well except when accesses are made to sequential elements of an array that is too large to fit into the cache. The performance of the LRU algorithm can be improved by introducing a small amount of randomness in deciding which block to replace.

The simplest algorithm is to randomly choose the block to be overwritten. In fact, this simple algorithm has been found to be quite effective.

## PERFORMANCE CONSIDERATIONS

Two key factors in the commercial success of a computer are performance and cost. A common measure of success is the price/performance ratio.

Interleaving: If the main memory of a computer is structured as a collection of physically separate modules, each with its own address buffer register (ABR) and data buffer register (DBR), memory access operations may proceed in more than one module at the same time.

Consecutive words in a module

Main memory address:

| Module | Address in module |
|---|---|
| $k$ bits | $m$ bits |

[Figure: modules 0 to $2^{k}-1$, each with its own ABR and DBR.]

If the whole main memory is divided into $2^{k}$ modules, keeping consecutive words in a module, then the first (high-order) $k$ bits decide the module and the remaining $m$ bits decide the address in the module.

In this method, when consecutive locations are accessed, only one module is involved. At the same time, however, devices with DMA ability may be accessing information in other modules.

Consecutive words in consecutive modules

Main memory address:

| Address in module | Module |
|---|---|
| $m$ bits | $k$ bits |

[Figure: modules 0 to $2^{k}-1$, each with its own ABR and DBR.]

In this method consecutive words are stored in consecutive modules. If a request is generated to access consecutive memory locations, then several modules will be busy at any one time. This results in both faster access to a block of data and higher average utilization of the memory system as a whole. This method of dividing main memory into modules is known as memory interleaving. In this method the low-order $k$ bits of the memory address select a module, and the high-order $m$ bits name a location within that module.
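The two address layouts differ only in which end of the address selects the module. A minimal sketch, with hypothetical function names and a small memory of $2^{2}$ modules of $2^{4}$ words chosen just for the demonstration:

```python
def consecutive_in_module(addr, k, m):
    """High-order k bits select the module; low-order m bits the word in it."""
    return addr >> m, addr & ((1 << m) - 1)


def interleaved(addr, k, m):
    """Low-order k bits select the module; high-order m bits the word in it."""
    return addr & ((1 << k) - 1), addr >> k


k, m = 2, 4   # assumed sizes: 4 modules of 16 words each
print([consecutive_in_module(a, k, m)[0] for a in range(4)])   # one module busy
print([interleaved(a, k, m)[0] for a in range(4)])             # four modules busy
```

Four consecutive addresses stay inside module 0 in the first layout but are spread over modules 0 to 3 in the interleaved layout, so the four accesses can overlap in time.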