ECE 254A

Advanced Computer Architecture: Supercomputers

Fall 2006

University of California, Santa Barbara

Department of Electrical and Computer Engineering

Project 1

“Designing a Simple Cache”

Ali Umut IRTURK 789139-3

ECE Department & ECON Department

Graduate Student

10/15/2006


1) The Basics of Memory Hierarchy and Overview of the Project

The memory system is organized as a hierarchy: a level closer to the processor generally holds a subset of the data in any level further away, and all the data is stored at the lowest level. Implementing memory as a hierarchy gives the user the illusion of a memory that is as large as the largest level, yet can be accessed as if it were all built from the fastest memory.

Figure 1. The basic structure of a memory hierarchy.

The aim of the first project is to “design a simple cache.” Following this general hierarchy, my design takes the form shown in Figure 2. The goal is to present the user with as much memory as is available in the cheapest technology, while providing access at the speed offered by the fastest memory.

Figure 2. The basic design of the project. The input and output ports are not designated at this point.

As can be seen from Figure 2, I need to decide which input and output ports are required.

2) Discovering the Input and Output ports

I will consider every component one by one and find its input and output ports. At this step, however, I do not yet specify the widths of the inputs and outputs.

a) Cpu:

The Cpu accesses the cache whenever it needs to read information or to write information.

When information is needed (a read):
i) The Cpu must signal the request with “a read signal.”
ii) The Cpu must say where the data is with “address bits”.

When writing:
iii) The Cpu must signal the request with “a write signal.”
iv) The Cpu must supply the data to be written with “data bits”.
v) The Cpu must say where the data will be written with “address bits” (the same address bits used for a read).

This shows that there must be 4 outputs from the Cpu to the Cache (Cache inputs from the Cpu). I designated them as cpu_cac_NAME. The read signal and the write signal each need 1 bit; the widths of the address and data bits are decided later.

b) Memory:

If a miss occurs in the Cache after a Cpu request, the Cache must access the Memory to retrieve the data. Thus, the Memory needs an output to the Cache:
i) The requested data is sent from the Memory to the Cache as “data bits”. I designated this mem_cac_data. The width of the data bits is considered later.

c) Cache:

The cache is the most important part of this design. There must be several outputs from the Cache to the Cpu and to the Memory.

The relationship between Cache and Cpu

If the Cpu asserts the read signal and sends the address of the data:
i) If the data requested by the processor appears in the Cache, this is called a hit. This must first be reported to the Cpu by sending “a hit bit.” The data that was found must also be sent back to the Cpu, so the Cache needs an output to the Cpu for “data bits”.
ii) If the data is not found in the Cache, the request is called a miss. The memory is then accessed to retrieve the block containing the requested data. This must be reported to the Cpu by sending “a miss bit.”

This shows that there must be 3 outputs from the Cache to the Cpu (Cpu inputs from the Cache). These are designated as cac_cpu_NAME. The hit signal and the miss signal each need 1 bit; the width of the data bits is considered later.

The relationship between Cache and Memory

As mentioned before, if a miss occurs, the Memory must be accessed to retrieve the desired block. Likewise, if the Cpu wants to write information to the Memory through the Cache, the Cache must pass it on. There must therefore be several outputs from the Cache to the Memory.

If a miss occurs in the Cache:
i) The Cache must report this to the Memory by sending “a read bit.”
ii) The Cache must say where the data is with “address bits”.

If a write is requested:
iii) The Cache must report this to the Memory by sending “a write bit.”
iv) The Cache must supply the data to be written with “data bits”.

This shows that there must be 4 outputs from the Cache to the Memory (Memory inputs from the Cache). They are designated as cac_mem_NAME. The read bit and the write bit each need 1 bit; the widths of the address and data bits still have to be decided. At this point, I know the general inputs and outputs, which can be seen in Figure 3.

Figure 3. The inputs and outputs of the design.

What I have not yet considered at this point is the width of the address and data bits, the signals that come from outside the design (such as the clock and reset), and the cache structure.

3) Cache Architecture

In this simple cache design, I designed an 8-by-8 cache: there are 8 entries, and each entry consists of 8 bits.

There are four important questions to answer at this point:

a. How do we know if a data item is in the cache?

b. If it is, how do we find it?

The answers to these two questions are related. If each item can go in exactly one place in the cache, then it is straightforward to find it if it is in the cache. The simplest way to assign a cache location to each item in memory is to derive the cache location from the item's memory address. This cache structure is called direct mapped, since each memory location is mapped directly to exactly one location in the cache. I used the direct-mapped cache structure in my design. The typical mapping between addresses and cache locations for a direct-mapped cache is (Block address) modulo (Number of cache blocks in the cache).

c. Because each cache location can contain the contents of a number of different memory locations, how do we know whether the data in the cache corresponds to the requested information? That is, how do we know whether the requested information is in the cache or not?

The answer to this question is to add a set of tags to the cache. The tags contain the address information required to identify whether the information in the cache corresponds to the requested information.

d. We also need a way to recognize that a cache block does not hold valid information.

The most common method is to add a valid bit to indicate whether an entry contains a valid address. If the bit is not set, there cannot be a match for this block.
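The modulo mapping quoted above can be sketched in Verilog. This is only an illustration of the textbook rule, not part of the project code: the module name dm_index_sketch and its port names are hypothetical. With 8 cache blocks, (block address) modulo 8 is simply the three low-order bits of the block address, so no divider is needed.

```verilog
// Illustrative sketch only: dm_index_sketch, block_addr, index_mod and
// index_slice are hypothetical names, not taken from the project listings.
module dm_index_sketch(
    input  wire [4:0] block_addr,   // 5-bit block address (0..31)
    output wire [2:0] index_mod,    // index computed with the modulo rule
    output wire [2:0] index_slice   // the same index taken as a bit slice
);
    // (Block address) modulo (number of cache blocks), with 8 blocks.
    assign index_mod   = block_addr % 8;

    // For a power-of-two block count the modulo is just the low-order bits.
    assign index_slice = block_addr[2:0];
endmodule
```

For example, block addresses 2, 10, 18 and 26 all map to cache block 2, which is why a tag is needed to tell them apart. The listings in section 5 take the index from a different bit range of cpu_cac_add; any fixed 3-bit field of the address gives a valid direct mapping, as long as the remaining bits are stored as the tag.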

Now consider the address output from the Cpu. Since there are 8 blocks (entries) in the cache, 3 bits of the address are needed to give the block number (the index). The remaining address bits form a tag field, which is compared with the tag field stored in the cache. This is illustrated in Figure 4.

Figure 4. The address sent by the Cpu is matched against the cache. The Cpu tag and the Cpu index together make up the address sent by the Cpu.

The index of a cache block, together with the tag contents of that block, uniquely specifies the memory address of the information contained in the cache block. Each 8-bit cache entry holds 1 valid bit, a 2-bit tag and 5 data bits, so the address consists of 3 index bits plus 2 tag bits. This shows that we need 5 bits for the address and 5 bits for the data. Besides these, I need to add the clock and reset signals to the design.
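The tag comparison and valid-bit check described above can be sketched as a small combinational module. This is a minimal sketch, not the project's cache: the module name lookup_sketch and its ports are hypothetical, and it assumes the entry layout used by the listings in section 5 (bit 7 valid, bits 6:5 tag, bits 4:0 data), with the index in the upper three address bits and the tag in the lower two.

```verilog
// Illustrative sketch only: lookup_sketch and its port names are hypothetical.
// Assumed entry layout: entry[7] = valid, entry[6:5] = tag, entry[4:0] = data.
module lookup_sketch(
    input  wire [4:0] addr,    // 5-bit address from the Cpu
    input  wire [7:0] entry,   // cache entry read at the selected index
    output wire [2:0] index,   // which of the 8 entries to read
    output wire [1:0] tag,     // tag field of the address
    output wire       hit,     // 1 when the entry is valid and the tags match
    output wire [4:0] data     // data field of the entry
);
    assign index = addr[4:2];                          // upper three bits
    assign tag   = addr[1:0];                          // lower two bits
    assign hit   = entry[7] && (entry[6:5] == tag);    // valid AND tag match
    assign data  = entry[4:0];
endmodule
```

With addr = 5'b10011 and entry = 8'b11100100, the index is 3'b100, the tag is 2'b11, hit is 1 and data is 5'b00100. With addr = 5'b11010 and entry = 8'b10100010, the stored tag 01 differs from the address tag 10, so hit is 0.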

Figure 5 shows the resulting design.

Figure 5. The resulting design of the project. The signals are specified in the Interfaces part below.


Interfaces

Connection   Variable Name in Verilog Code   Bits
1            mem_cac_data                    5
2            cac_mem_read                    1
3            cac_mem_wrt                     1
4            cac_mem_data                    5
5            cac_mem_add                     5
6            cpu_cac_read                    1
7            cpu_cac_wrt                     1
8            cpu_cac_data                    5
9            cpu_cac_add                     5
10           cac_cpu_hit                     1
11           cac_cpu_miss                    1
12           cac_cpu_data                    5

Table 1. The names are given according to the usage of the signals.


Data Flow Diagrams

While discovering the input and output ports and working on the cache architecture, I described what happens when a read or a write occurs. Data flow diagrams, however, make these steps easier to follow.


Figure 6. Data Flow Diagram for Reading

Figure 7. Data Flow Diagram for Writing


Figure 8. State Flow diagram

4) Testbenches and Results

Hit Test

1) The Cpu requests data (cac_cpu_data) from the cache by giving an address (cpu_cac_add) and setting the read bit (cpu_cac_read).
2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).
3) In this test, the Cpu sends the address b10011 (cpu_cac_add) and sets the read bit (cpu_cac_read <= 1'b1).
4) As can be seen from the snapshot, cpu tag (cpu_tag) = 11 and cpu index (cpu_index) = 100, which agrees with the address: the index is the upper three bits and the tag is the lower two bits.
5) In the cache, the entry at index = 100 is 11100100.
6) The cache compares the current tag (cur_tag) at that index, which is 11, with the cpu tag (cpu_tag). They match. It also checks the valid bit, which is 1.
7) Thus the cache finds the requested data at the given cpu index (cpu_index) and sends a hit signal to the Cpu (cac_cpu_hit <= 1'b1).
8) Finally, the cache sends the requested data b00100, which is correct: it is the low five bits of the entry.

Figure 9. Snapshot of the Hit process


Figure 10. Snapshot of the Hit process


Miss Test

1) The Cpu requests data (cac_cpu_data) from the cache by giving an address (cpu_cac_add) and setting the read bit (cpu_cac_read).
2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).
3) In this test, the Cpu sends the address b11010 (cpu_cac_add) and sets the read bit (cpu_cac_read <= 1'b1).
4) As can be seen from the snapshot, cpu tag (cpu_tag) = 10 and cpu index (cpu_index) = 110, which agrees with the address.
5) In the cache, the entry at index = 110 is 10100010.
6) The cache compares the current tag (cur_tag) at that index, which is 01, with the cpu tag (cpu_tag). They don't match.
7) Thus the cache cannot find the requested data at the given cpu index (cpu_index) and sends a miss signal to the Cpu (cac_cpu_miss <= 1'b1).
8) The cache needs to fetch the requested data from memory.


Figure 11. Snapshot of the Miss process


Write Test

1) The Cpu requests a write through the cache by giving an address (cpu_cac_add) and data (cpu_cac_data), and sets the write bit (cpu_cac_wrt).
2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).
3) In this test, the Cpu sends the address b11010 (cpu_cac_add) and the data b11111 (cpu_cac_data), and sets the write bit (cpu_cac_wrt <= 1'b1).
4) The cache receives the data and the address and sends them to the memory using cac_mem_data and cac_mem_add (cac_mem_data <= 11111, cac_mem_add <= 11010). The cache also sets the write bit for the Memory (cac_mem_wrt <= 1'b1).
5) The new data (cac_mem_data) is stored in the Memory at the given address (cac_mem_add).


Figure 12. Snapshot of the Write process


Miss Occurs, and the data is retrieved from memory

This part combines the first three tests and adds one more step: retrieving the data from memory. In the first three tests, hit, miss and write can each be handled in one state; retrieving the data from memory, however, needs 3 states. These states are illustrated in Figure ?.

1) The Cpu requests data (cac_cpu_data) from the cache by giving an address (cpu_cac_add) and setting the read bit (cpu_cac_read).
2) As we know, the address (cpu_cac_add) given by the Cpu is 5 bits and consists of the cpu tag (cpu_tag) and the cpu index (cpu_index).
3) In this test, the Cpu sends the address b11010 (cpu_cac_add) and sets the read bit (cpu_cac_read <= 1'b1).
4) As can be seen from the snapshot, cpu tag (cpu_tag) = 10 and cpu index (cpu_index) = 110, which agrees with the address.
5) In the cache, the entry at index = 110 is 10100010.
6) The cache compares the current tag (cur_tag) at that index, which is 01, with the cpu tag (cpu_tag). They don't match.
7) Thus the cache cannot find the requested data at the given cpu index (cpu_index) and sends a miss signal to the Cpu (cac_cpu_miss <= 1'b1).
8) The cache needs to fetch the requested data from memory. It sets the read bit for the Memory (cac_mem_read <= 1'b1) and sends the address to the Memory on cac_mem_add.
9) The Memory finds the data (mem_cac_data) using the address (cac_mem_add). Here cac_mem_add = 11010, which is 26 in decimal, and mem_data[26] = b11010. So the Memory sends mem_data[26] on mem_cac_data.
10) After the cache receives the desired data, it sends the data to the Cpu on cac_cpu_data.
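The three states mentioned above are not spelled out in the text, so the following is only one possible arrangement, sketched under assumptions. The state names LOOKUP, FETCH and RESPOND, the module name miss_fsm_sketch, the hit input (standing in for the tag and valid-bit comparison) and the one-cycle memory latency are all hypothetical; the remaining port names follow Table 1.

```verilog
// Illustrative sketch only: state names, the module name and the "hit"
// input are assumptions, not taken from the project listings.
module miss_fsm_sketch(
    input  wire       clock,
    input  wire       rst_l,          // asynchronous active-low reset
    input  wire       cpu_cac_read,   // read request from the Cpu
    input  wire [4:0] cpu_cac_add,    // address from the Cpu
    input  wire       hit,            // assumed: result of tag/valid compare
    input  wire [4:0] mem_cac_data,   // data returned by the Memory
    output reg        cac_cpu_miss,
    output reg        cac_mem_read,
    output reg  [4:0] cac_mem_add,
    output reg  [4:0] cac_cpu_data
);
    localparam LOOKUP = 2'd0, FETCH = 2'd1, RESPOND = 2'd2;
    reg [1:0] state;

    always @(posedge clock or negedge rst_l) begin
        if (!rst_l) begin
            state        <= LOOKUP;
            cac_cpu_miss <= 1'b0;
            cac_mem_read <= 1'b0;
            cac_mem_add  <= 5'b0;
            cac_cpu_data <= 5'b0;
        end else begin
            case (state)
                // Steps 7-8: on a read miss, flag it and ask the Memory.
                LOOKUP: if (cpu_cac_read && !hit) begin
                    cac_cpu_miss <= 1'b1;
                    cac_mem_read <= 1'b1;
                    cac_mem_add  <= cpu_cac_add;
                    state        <= FETCH;
                end
                // Steps 9-10: Memory data is assumed valid one cycle later.
                FETCH: begin
                    cac_mem_read <= 1'b0;
                    cac_cpu_data <= mem_cac_data;
                    state        <= RESPOND;
                end
                // Clear the miss flag and wait for the next request.
                RESPOND: begin
                    cac_cpu_miss <= 1'b0;
                    state        <= LOOKUP;
                end
                default: state <= LOOKUP;
            endcase
        end
    end
endmodule
```

The sketch leaves out the hit path and does not write the fetched data back into the cache array; it only shows how a read miss could be spread over three clocked states.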

Figure 13. Snapshot of retrieving the data from memory after a Miss and sending it to the Cpu.

5) Codes

A) For Hit test

//Cache//

//***********************************//
//Timescale//
//***********************************//
`timescale 1ns/100ps

//***********************************//
//Module//
//***********************************//
module cache(
    //Inputs from outside
    clock,         // clock of the system
    rst_1,         // Asynchronous active low reset: rst_1 = 0 is a hard reset
    //Inputs from Cpu//
    cpu_cac_read,  // 1-bit indicates the read signal
    cpu_cac_wrt,   // 1-bit indicates the write signal
    cpu_cac_data,  // 5-bits data
    cpu_cac_add,   // 5-bits address
    //Outputs from Cache
    cac_cpu_hit,   // 1-bit indicates the hit
    cac_cpu_miss,  // 1-bit indicates the miss
    cac_cpu_data   // 5 bits data
);

//************************************//
//Ports//
//************************************//
// Inputs to Cache //
input clock;
input rst_1;
input cpu_cac_read;
input cpu_cac_wrt;
input [0:4] cpu_cac_data;
input [0:4] cpu_cac_add;

// Outputs from Cache//
output cac_cpu_hit;
output cac_cpu_miss;
output [0:4] cac_cpu_data;

//************************************//
//Registers//
//************************************//
// Cpu //
reg cac_cpu_hit;
reg cac_cpu_miss;
reg[0:4] cac_cpu_data;

//Cache//
reg[7:0] cache[0:7];

//************************************//
//Wires//
//************************************//
wire[7:0] cpu_buf;
wire[2:0] cpu_index;
wire[1:0] cpu_tag;
wire[1:0] cur_tag;

//************************************//
//Interface//
//************************************//
assign cpu_index = cpu_cac_add[0:2];
assign cpu_buf = cache[cpu_index];
assign cur_tag = cpu_buf[6:5];
assign cpu_tag = cpu_cac_add[3:4];

//************************************//
// Work//
//************************************//
always @(posedge clock or negedge rst_1)
begin
    if (rst_1 == 0)
    begin
        //Store the initial values//
        //Outputs from Cache
        cac_cpu_hit <= 1'b0;   // 1-bit indicates the hit
        cac_cpu_miss <= 1'b0;  // 1-bit indicates the miss
        cac_cpu_data <= 5'b0;  // 5 bits data
        // Store the cache with the initial values//
        cache[7] <= 8'b10000001;
        cache[6] <= 8'b10100010;
        cache[5] <= 8'b11000011;
        cache[4] <= 8'b11100100;
        cache[3] <= 8'b10000101;
        cache[2] <= 8'b10100110;
        cache[1] <= 8'b11000111;
        cache[0] <= 8'b11101000;
    end
    else
    begin
        if (cpu_cac_read == 1)
        begin
            if ((cpu_tag == cur_tag)&(cpu_buf[7] == 1'b1)) //Read Hit occurs
            begin
                $display("READ HIT");
                cac_cpu_hit <= 1'b1;
                cac_cpu_miss <= 1'b0;
                cac_cpu_data <= cpu_buf[4:0];
            end
        end
    end
end
endmodule

//Test bench for Hit//

`timescale 1ns/100ps
module test_write();
    //Registers//
    reg clock, rst_1, cpu_cac_read, cpu_cac_wrt;
    reg[4:0] cpu_cac_data, cpu_cac_add;
    //Wires//
    wire cac_cpu_hit, cac_cpu_miss;
    wire[4:0] cac_cpu_data;

    //Instantiate Cache
    cache Cac(.clock(clock),
              .rst_1(rst_1),
              .cpu_cac_add(cpu_cac_add),
              .cpu_cac_read(cpu_cac_read),
              .cpu_cac_wrt(cpu_cac_wrt),
              .cpu_cac_data(cpu_cac_data),
              .cac_cpu_hit(cac_cpu_hit),
              .cac_cpu_miss(cac_cpu_miss),
              .cac_cpu_data(cac_cpu_data)
    );

    //clock//
    always #2 clock <= ~clock;

    // Start
    initial
    begin
        clock <= 1'b0;
        rst_1 <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt <= 1'b0;
        cpu_cac_add <= 5'b0;
        cpu_cac_data <= 5'b0;
        #5 rst_1 <= 1'b0;
        #10 rst_1 <= 1'b1;
        #20 cpu_cac_read <= 1'b1;
        cpu_cac_add <= 5'b10011;
        #10 cpu_cac_read <= 1'b0;
        $stop;
    end
endmodule

B) For Miss test

//Cache//

//***********************************//
//Timescale//
//***********************************//
`timescale 1ns/100ps

//***********************************//
//Module//
//***********************************//
module cache(
    //Inputs from outside
    clock,         // clock of the system
    rst_1,         // Asynchronous active low reset: rst_1 = 0 is a hard reset
    //Inputs from Cpu//
    cpu_cac_read,  // 1-bit indicates the read signal
    cpu_cac_wrt,   // 1-bit indicates the write signal
    cpu_cac_data,  // 5-bits data
    cpu_cac_add,   // 5-bits address
    //Outputs from Cache
    cac_cpu_hit,   // 1-bit indicates the hit
    cac_cpu_miss,  // 1-bit indicates the miss
    cac_cpu_data   // 5 bits data
);

//************************************//
//Ports//
//************************************//
// Inputs to Cache //
input clock;
input rst_1;
input cpu_cac_read;
input cpu_cac_wrt;
input [0:4] cpu_cac_data;
input [0:4] cpu_cac_add;

// Outputs from Cache//
output cac_cpu_hit;
output cac_cpu_miss;
output [0:4] cac_cpu_data;

//************************************//
//Registers//
//************************************//
// Cpu //
reg cac_cpu_hit;
reg cac_cpu_miss;
reg[0:4] cac_cpu_data;

//Cache//
reg[7:0] cache[0:7];

//************************************//
//Wires//
//************************************//
wire[7:0] cpu_buf;
wire[2:0] cpu_index;
wire[1:0] cpu_tag;
wire[1:0] cur_tag;

//************************************//
//Interface//
//************************************//
assign cpu_index = cpu_cac_add[0:2];
assign cpu_buf = cache[cpu_index];
assign cur_tag = cpu_buf[6:5];
assign cpu_tag = cpu_cac_add[3:4];

//************************************//
// Work//
//************************************//
always @(posedge clock or negedge rst_1)
begin
    if (rst_1 == 0)
    begin
        //Store the initial values//
        //Outputs from Cache
        cac_cpu_hit <= 1'b0;   // 1-bit indicates the hit
        cac_cpu_miss <= 1'b0;  // 1-bit indicates the miss
        cac_cpu_data <= 5'b0;  // 5 bits data
        // Store the cache with the initial values//
        cache[7] <= 8'b10000001;
        cache[6] <= 8'b10100010;
        cache[5] <= 8'b11000011;
        cache[4] <= 8'b11100100;
        cache[3] <= 8'b10000101;
        cache[2] <= 8'b10100110;
        cache[1] <= 8'b11000111;
        cache[0] <= 8'b11101000;
    end
    else
    begin
        if (cpu_cac_read == 1)
        begin
            if ((cpu_tag == cur_tag)&(cpu_buf[7] == 1'b1)) //Read Hit occurs
            begin
                $display("READ HIT");
                cac_cpu_hit <= 1'b1;
                cac_cpu_miss <= 1'b0;
                cac_cpu_data <= cpu_buf[4:0];
            end
            else
            begin
                $display("READ MISS");
                cac_cpu_hit <= 1'b0;
                cac_cpu_miss <= 1'b1;
            end
        end
    end
end
endmodule

//Test bench for Miss//

`timescale 1ns/100ps
module test_write();
    reg clock, rst_1, cpu_cac_read, cpu_cac_wrt;
    reg[4:0] cpu_cac_data, cpu_cac_add;
    wire cac_cpu_hit, cac_cpu_miss;
    wire[4:0] cac_cpu_data;

    //Instantiate Cache
    cache Cac(.clock(clock),
              .rst_1(rst_1),
              .cpu_cac_add(cpu_cac_add),
              .cpu_cac_read(cpu_cac_read),
              .cpu_cac_wrt(cpu_cac_wrt),
              .cpu_cac_data(cpu_cac_data),
              .cac_cpu_hit(cac_cpu_hit),
              .cac_cpu_miss(cac_cpu_miss),
              .cac_cpu_data(cac_cpu_data)
    );

    // clock
    always #2 clock <= ~clock;

    // Start
    initial
    begin
        clock <= 1'b0;
        rst_1 <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt <= 1'b0;
        cpu_cac_add <= 5'b0;
        cpu_cac_data <= 5'b0;
        #5 rst_1 <= 1'b0;
        #10 rst_1 <= 1'b1;
        #20 cpu_cac_read <= 1'b1;
        cpu_cac_add <= 5'b11010;
        // The original had a stray "$stop" without a semicolon here (a syntax
        // error); it is dropped and the read is deasserted as in the Hit test.
        #10 cpu_cac_read <= 1'b0;
        $stop;
    end
endmodule

C) For Write test

//**********************************************//
// Includes
//**********************************************//
`timescale 1ns/100ps

//**********************************************//
// Module Begin
//**********************************************//
module Cache(
    //Global Inputs
    clock,         //System clock
    rst_l,         //Asynchronous active low reset
    //Inputs from CPU
    cpu_cac_add,   //5-bit address from CPU
    cpu_cac_read,  //CPU Read to Cache
    cpu_cac_wrt,   //CPU Write to Cache
    cpu_cac_data,  //5-bit data from CPU
    //Outputs to CPU
    cac_cpu_hit,   //Cache hit to CPU
    cac_cpu_miss,  //Cache stall to CPU
    cac_cpu_data,  //5-bit data to CPU
    //Outputs to Main Memory
    cac_mem_add,   //5-bit address to Main Memory
    cac_mem_data,  //5-bit data to Main Memory
    cac_mem_read,  //Read signal to Main Memory
    cac_mem_wrt    //Write signal to Main Memory
);

//**********************************************//
// Input Ports
//**********************************************//
//Global Inputs
input clock;
input rst_l;

//CPU Inputs
input [0:4] cpu_cac_data;
input [0:4] cpu_cac_add;
input cpu_cac_read;
input cpu_cac_wrt;

//**********************************************//
// Output Ports
//**********************************************//
//CPU Outputs
output [4:0] cac_cpu_data;
output cac_cpu_hit;
output cac_cpu_miss;

//Main Memory Outputs
output [4:0] cac_mem_data;
output [4:0] cac_mem_add;
output cac_mem_read;
output cac_mem_wrt;

//**********************************************//
// Register Variables
//**********************************************//
//CPU registered outputs
reg [4:0] cac_cpu_data;
reg cac_cpu_hit;
reg cac_cpu_miss;

//Main Memory registered outputs
reg [4:0] cac_mem_data;
reg [4:0] cac_mem_add;
reg cac_mem_read;
reg cac_mem_wrt;

//Cache Buffer
reg [7:0] cache [0:7];

//************************************//
//Wires//
//************************************//
wire[7:0] cpu_buf;
wire[2:0] cpu_index;
wire[1:0] cpu_tag;
wire[1:0] cur_tag;
wire[4:0] mem_addr;

//************************************//
//Interface//
//************************************//
assign cpu_index = cpu_cac_add[0:2];
assign cpu_buf = cache[cpu_index];
assign cur_tag = cpu_buf[6:5];
assign cpu_tag = cpu_cac_add[3:4];
assign mem_addr = cpu_cac_add;

//************************************//
// Work//
//************************************//
always @(posedge clock or negedge rst_l)
begin
    if (rst_l == 0)
    begin
        //Store the initial values//
        //Outputs from Cache
        cac_cpu_hit <= 1'b0;   // 1-bit indicates the hit
        cac_cpu_miss <= 1'b0;  // 1-bit indicates the miss
        cac_cpu_data <= 5'b0;  // 5 bits data
        cac_mem_data <= 5'b0;
        cac_mem_add <= 5'b0;
        cac_mem_read <= 1'b0;
        cac_mem_wrt <= 1'b0;
        // Store the cache with the initial values//
        cache[7] <= 8'b10000001;
        cache[6] <= 8'b10100010;
        cache[5] <= 8'b11000011;
        cache[4] <= 8'b11100100;
        cache[3] <= 8'b10000101;
        cache[2] <= 8'b10100110;
        cache[1] <= 8'b11000111;
        cache[0] <= 8'b11101000;
    end
    else
    begin
        if (cpu_cac_read == 1)
        begin
            if ((cpu_tag == cur_tag)&(cpu_buf[7] == 1'b1)) //Read Hit occurs
            begin
                $display("READ HIT");
                cac_cpu_hit <= 1'b1;
                cac_cpu_miss <= 1'b0;
                cac_cpu_data <= cpu_buf[4:0];
            end
            else
            begin
                $display("READ MISS");
                cac_cpu_hit <= 1'b0;
                cac_cpu_miss <= 1'b1;
            end
        end
        if (cpu_cac_wrt == 1) //Write Occurs
        begin
            cac_mem_wrt <= 1'b1;
            cac_mem_add <= mem_addr;
            cac_mem_data <= cpu_cac_data;
        end
    end
end
endmodule

//Memory//

//****************************//
//Timescale//
//****************************//
`timescale 1ns/100ps

//Memory Module//
module memory(rst_l,
              cac_mem_add,
              cac_mem_data,
              cac_mem_wrt,
              cac_mem_read
);

//Inputs
input rst_l;
input cac_mem_wrt;
input cac_mem_read;
input[4:0] cac_mem_data;
input[4:0] cac_mem_add;

//Registers
reg[4:0] mem_data[31:0];   // 32 words of 5 bits (the original declared 11-bit words)
integer i;

//Start
// The original block was sensitive to negedge rst_l only, so the write branch
// could never run. The rising edge of cac_mem_wrt is added here so that a
// write from the cache is actually stored. This assumes cac_mem_add and
// cac_mem_data are updated in the same time step as cac_mem_wrt.
always @(negedge rst_l or posedge cac_mem_wrt)
begin
    if (rst_l == 0)
    begin
        // mem_data[i] = i, that is 5'b00000 through 5'b11111, the same
        // values as the original explicit list of 32 assignments.
        for (i = 0; i < 32; i = i + 1)
            mem_data[i] = i;
    end
    else if (cac_mem_wrt == 1'b1) //Memory write
    begin
        mem_data[cac_mem_add] <= cac_mem_data;
    end
end
endmodule
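The memory listing above only accepts writes, while Table 1 also lists a mem_cac_data connection from the Memory to the Cache. A read path could be sketched as follows. This is a hedged illustration, not the author's module: the name memory_read_sketch is hypothetical, and driving zeros while cac_mem_read is low is an assumption made for the sketch.

```verilog
// Illustrative sketch only: memory_read_sketch is a hypothetical name, and
// returning 5'b0 when cac_mem_read is low is an assumption.
module memory_read_sketch(
    input  wire       rst_l,          // active-low reset, loads the contents
    input  wire       cac_mem_read,   // read request from the Cache
    input  wire [4:0] cac_mem_add,    // address from the Cache
    output wire [4:0] mem_cac_data    // data returned to the Cache
);
    reg [4:0] mem_data [0:31];
    integer i;

    // Same initial contents as the memory listing: mem_data[i] = i.
    always @(negedge rst_l)
        if (!rst_l)
            for (i = 0; i < 32; i = i + 1)
                mem_data[i] = i;

    // Combinational read: the addressed word appears while read is asserted.
    assign mem_cac_data = cac_mem_read ? mem_data[cac_mem_add] : 5'b0;
endmodule
```

With cac_mem_add = 5'b11010 (26 in decimal) and cac_mem_read asserted after reset, mem_cac_data is 5'b11010, which matches step 9 of the miss walkthrough.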

//Test bench for Write//

`timescale 1ns/100ps
module test_write();
    reg clock, rst_l, cpu_cac_read, cpu_cac_wrt;
    reg[4:0] cpu_cac_data, cpu_cac_add;
    wire cac_cpu_hit, cac_cpu_miss;
    wire[4:0] cac_cpu_data;
    // Cache-to-Memory nets. These were not declared in the original, which
    // would make the 5-bit address and data connections implicit 1-bit wires.
    wire cac_mem_read, cac_mem_wrt;
    wire[4:0] cac_mem_add, cac_mem_data;

    //Instantiate Cache
    Cache Cac(.clock(clock),
              .rst_l(rst_l),
              .cpu_cac_add(cpu_cac_add),
              .cpu_cac_read(cpu_cac_read),
              .cpu_cac_wrt(cpu_cac_wrt),
              .cpu_cac_data(cpu_cac_data),
              .cac_cpu_hit(cac_cpu_hit),
              .cac_cpu_miss(cac_cpu_miss),
              .cac_cpu_data(cac_cpu_data),
              .cac_mem_read(cac_mem_read),
              .cac_mem_wrt(cac_mem_wrt),
              .cac_mem_add(cac_mem_add),
              .cac_mem_data(cac_mem_data)
    );

    memory MemoryMain(.rst_l(rst_l),
                      .cac_mem_add(cac_mem_add),
                      .cac_mem_data(cac_mem_data),
                      .cac_mem_wrt(cac_mem_wrt),
                      .cac_mem_read(cac_mem_read)
    );

    // clock
    always #2 clock <= ~clock;

    // Start
    initial
    begin
        clock <= 1'b0;
        rst_l <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt <= 1'b0;
        cpu_cac_add <= 5'b0;
        cpu_cac_data <= 5'b0;
        #5 rst_l <= 1'b0;
        #10 rst_l <= 1'b1;
        #20 cpu_cac_wrt <= 1'b1;
        cpu_cac_add <= 5'b11010;
        cpu_cac_data <= 5'b11111;
        #5 cpu_cac_wrt <= 1'b0;
        $stop;
    end
endmodule

D) Retrieving the data from memory after a miss and sending it to the CPU

//Cache//


//***********************************//
//Timescale//
//***********************************//
`timescale 1ns/100ps

//***********************************//
//Module//
//***********************************//
module cache(
    //Inputs from outside
    clock,          // clock of the system
    rst_1,          // asynchronous active-low reset: rst_1 = 0 is a hard reset
    //Inputs from Cpu//
    cpu_cac_read,   // 1-bit read signal
    cpu_cac_wrt,    // 1-bit write signal
    cpu_cac_data,   // 5-bit data
    cpu_cac_add,    // 5-bit address
    //Inputs from Memory//
    mem_cac_data,   // 5-bit data
    //Outputs from Cache to Cpu//
    cac_cpu_hit,    // 1-bit hit indicator
    cac_cpu_miss,   // 1-bit miss indicator
    cac_cpu_data,   // 5-bit data
    //Outputs from Cache to Memory//
    cac_mem_read,   // 1-bit read signal
    cac_mem_wrt,    // 1-bit write signal
    cac_mem_add,    // 5-bit address
    cac_mem_data    // 5-bit data
);

    //************************************//
    //Ports//
    //************************************//
    // Inputs to Cache //
    input clock;
    input rst_1;
    input cpu_cac_read;
    input cpu_cac_wrt;
    input [0:4] cpu_cac_data;
    input [0:4] cpu_cac_add;
    input [0:4] mem_cac_data;

    // Outputs from Cache//
    output cac_cpu_hit;
    output cac_cpu_miss;
    output [0:4] cac_cpu_data;
    output [0:4] cac_mem_data;
    output [0:4] cac_mem_add;
    output cac_mem_read;
    output cac_mem_wrt;

    //************************************//
    //Registers//
    //************************************//
    // Cpu //
    reg cac_cpu_hit;
    reg cac_cpu_miss;
    reg [0:4] cac_cpu_data;
    //Memory//
    reg [0:4] cac_mem_add;
    reg [0:4] cac_mem_data;
    reg cac_mem_read;
    reg cac_mem_wrt;
    //State Diagram
    reg [2:0] state;
    //Cache: 8 entries, each {valid, tag[1:0], data[4:0]}//
    reg [7:0] cache[0:7];

    parameter S0 = 0;
    parameter S1 = 1;
    parameter S2 = 2;

    //************************************//
    //Wires//
    //************************************//
    wire [7:0] cpu_buf;
    wire [2:0] cpu_index;
    wire [1:0] cpu_tag;
    wire [1:0] cur_tag;
    wire [4:0] mem_add;
    wire [4:0] mem_data;
    wire [4:0] cache_data;

    //************************************//
    //Interface//
    //************************************//
    // cpu_cac_add is declared [0:4], so bit 0 is the most significant bit:
    // the index is the upper three address bits, the tag the lower two.
    assign cpu_index = cpu_cac_add[0:2];
    assign cpu_buf = cache[cpu_index];
    assign cur_tag = cpu_buf[6:5];
    assign cpu_tag = cpu_cac_add[3:4];
    assign mem_add = cpu_cac_add;
    assign mem_data = mem_cac_data[0:4];

    //************************************//
    // Work//
    //************************************//
    always @(posedge clock or negedge rst_1)
    begin
        if (rst_1 == 0)
        begin
            //Store the initial values//
            //Outputs from Cache
            cac_cpu_hit <= 1'b0;    // 1-bit hit indicator
            cac_cpu_miss <= 1'b0;   // 1-bit miss indicator
            cac_cpu_data <= 5'b0;   // 5-bit data
            cac_mem_data <= 5'b0;
            cac_mem_read <= 1'b0;
            cac_mem_wrt <= 1'b0;
            cac_mem_add <= 5'b0;
            state <= S0;

            // Store the cache with the initial values//
            cache[7] <= 8'b10000001;
            cache[6] <= 8'b10100010;
            cache[5] <= 8'b11000011;
            cache[4] <= 8'b11100100;
            cache[3] <= 8'b10000101;
            cache[2] <= 8'b10100110;
            cache[1] <= 8'b11000111;
            cache[0] <= 8'b11101000;
        end
        else
        begin
            case (state)
                // State 1: look up the requested address //
                S0:
                    if (cpu_cac_read == 1)
                    begin
                        if ((cpu_tag == cur_tag) && (cpu_buf[7] == 1'b1)) //Read Hit occurs
                        begin
                            $display("READ HIT");
                            cac_cpu_hit <= 1'b1;
                            cac_cpu_miss <= 1'b0;
                            cac_cpu_data <= cpu_buf[4:0];
                        end
                        else
                        begin
                            $display("READ MISS");
                            cac_cpu_hit <= 1'b0;
                            cac_cpu_miss <= 1'b1;
                            cac_mem_read <= 1'b1;   // request the word from memory
                            cac_mem_add <= mem_add;
                            state <= S1;
                        end
                    end

                // State 2: drop the memory read request, keep the miss flag //
                S1:
                    begin
                        cac_mem_read <= 1'b0;
                        cac_cpu_miss <= 1'b1;
                        state <= S2;
                    end

                // State 3: forward the memory data to the Cpu and refill the line //
                S2:
                    begin
                        cac_cpu_data <= mem_data;
                        cac_mem_read <= 1'b0;
                        cac_cpu_miss <= 1'b0;
                        cache[cpu_index] <= {1'b1, cpu_tag, mem_cac_data};
                        state <= S0;
                    end

                default:
                    state <= S0;
            endcase
        end
    end
endmodule
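
The address split in the Interface block of the cache module above depends on the ascending range [0:4], where bit 0 is the most significant bit. The following is a minimal stand-alone sketch for checking that split in a simulator. The module name addr_split_demo is hypothetical and not part of the project; the slices and the test address 5'b11010 are taken from the listings in this section.

```verilog
`timescale 1ns/100ps

// Hypothetical helper module (not part of the original project):
// reproduces only the index/tag slicing used in the cache.
module addr_split_demo();

    reg  [0:4] cpu_cac_add;                     // same ascending range as the cache port
    wire [2:0] cpu_index = cpu_cac_add[0:2];    // upper three address bits
    wire [1:0] cpu_tag   = cpu_cac_add[3:4];    // lower two address bits

    initial
    begin
        cpu_cac_add = 5'b11010;                 // address used by the read test bench
        #1 $display("index=%b tag=%b", cpu_index, cpu_tag);
        $finish;
    end
endmodule
```

For the address 5'b11010 this selects index 110 (cache line 6) and tag 10. Line 6 is initialized to 8'b10100010, whose tag field is 01, so a read of this address after reset is a miss.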

//Memory//

//****************************//
//Timescale//
//****************************//
`timescale 1ns/100ps

//Memory Module//
module memory(rst_1,
              cac_mem_add,
              cac_mem_data,
              cac_mem_wrt,
              cac_mem_read,
              mem_cac_data
);

    //Inputs
    input rst_1;
    input cac_mem_wrt;
    input cac_mem_read;
    input [4:0] cac_mem_data;
    input [4:0] cac_mem_add;

    //Output
    output [4:0] mem_cac_data;

    //Registers
    reg [4:0] mem_data[31:0];   // 32 locations, 5 bits each
    reg [4:0] mem_cac_data;
    integer i;

    //Start
    // The module has no clock port, so a write is taken on the rising
    // edge of cac_mem_wrt. Reset is asynchronous and active low.
    always @(negedge rst_1 or posedge cac_mem_wrt)
    begin
        if (rst_1 == 0)
        begin
            // Initialize every location with its own address:
            // mem_data[0] = 5'b00000 ... mem_data[31] = 5'b11111
            for (i = 0; i < 32; i = i + 1)
                mem_data[i] <= i[4:0];
        end
        else if (cac_mem_wrt == 1'b1) //Memory write
        begin
            mem_data[cac_mem_add] <= cac_mem_data;
        end
    end

    //Memory read: level-sensitive, the output holds its value
    //after cac_mem_read goes low
    always @(cac_mem_read or cac_mem_add)
    begin
        if (cac_mem_read == 1'b1)
            mem_cac_data = mem_data[cac_mem_add];
    end
endmodule

//Testbench//

`timescale 1ns/100ps

module test_write();

    reg clock, rst_1, cpu_cac_read, cpu_cac_wrt;
    reg [4:0] cpu_cac_data, cpu_cac_add;
    wire cac_cpu_hit, cac_cpu_miss, cac_mem_read, cac_mem_wrt;
    wire [4:0] cac_cpu_data, cac_mem_add, cac_mem_data, mem_cac_data;

    //Instantiate Cache
    cache Cac(
        .clock(clock),
        .rst_1(rst_1),
        .cpu_cac_read(cpu_cac_read),
        .cpu_cac_wrt(cpu_cac_wrt),
        .cpu_cac_data(cpu_cac_data),
        .cpu_cac_add(cpu_cac_add),
        .cac_cpu_hit(cac_cpu_hit),
        .cac_cpu_miss(cac_cpu_miss),
        .cac_cpu_data(cac_cpu_data),
        .cac_mem_read(cac_mem_read),
        .cac_mem_wrt(cac_mem_wrt),
        .cac_mem_add(cac_mem_add),
        .cac_mem_data(cac_mem_data),
        .mem_cac_data(mem_cac_data)
    );

    //Instantiate Memory
    memory Mem(
        .rst_1(rst_1),
        .cac_mem_add(cac_mem_add),
        .cac_mem_data(cac_mem_data),
        .cac_mem_wrt(cac_mem_wrt),
        .cac_mem_read(cac_mem_read),
        .mem_cac_data(mem_cac_data)
    );

    // clock (4 ns period)
    always #2 clock <= ~clock;

    // Start
    initial
    begin
        clock <= 1'b0;
        rst_1 <= 1'b1;
        cpu_cac_read <= 1'b0;
        cpu_cac_wrt <= 1'b0;
        cpu_cac_add <= 5'b0;
        cpu_cac_data <= 5'b0;

        // Pulse the active-low reset
        #7 rst_1 <= 1'b0;
        #10 rst_1 <= 1'b1;

        // Read address 5'b11010, which misses in the initialized cache
        #20 cpu_cac_read <= 1'b1;
        cpu_cac_add <= 5'b11010;
        #10 cpu_cac_read <= 1'b0;
        $stop;
    end
endmodule



