-
Notifications
You must be signed in to change notification settings - Fork 43
/
Address_Translator_Static.v
125 lines (100 loc) · 5.31 KB
/
Address_Translator_Static.v
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
//# Address Translator (Static)
// *This is the fixed-range version of the [Arithmetic Address
// Translator](./Address_Translator_Arithmetic.html).*
// Translates a *fixed*, arbitrary, unaligned range of N locations into an
// aligned range (starting at zero, up to N-1) so the translated address can
// be used to sequentially index into other addressed components
// (multiplexers, address decoders, RAMs, etc...).
// When memory-mapping a small memory or a number of control registers to
// a base address that isn't a power-of-2, the Least Significant Bits (LSBs)
// which vary over the address range will not address the mapped entries in
// order. This addressing offset scrambles the order of the control registers
// and of the memory locations so that the mapped order no longer matches the
// physical order of the actual hardware, which makes design and debugging
// harder.
// If the address range does not start at a power-of-2 boundary, the LSBs will
// not count in strictly increasing order and will not start at zero: the
// order of the numbers the LSBs represent is rotated by the offset to the
// nearest power-of-2 boundary. Also, if the address range does not span
// a power-of-2 sized block, then not all possible LSB values will be present.
// However, we can describe a translation table as a small asynchronous
// read-only memory which translates only the raw LSBs into consecutive LSBs
// to then directly address the memory or control registers. A separate
// [Address Decoder](./Address_Decoder_Behavioural.html) signals when the
// translation is valid.
// For example, take 4 locations at addresses 7 to 10, which we would like to
// map to a range of 0 to 3. We want address 7 to access the zeroth location,
// and so on. To map 4 locations, the address' two LSBs must be translated as
// follows:
//<pre>
//01<b>11</b> (7) --> 01<b>00</b> (4)
//10<b>00</b> (8) --> 10<b>01</b> (5)
//10<b>01</b> (9) --> 10<b>10</b> (6)
//10<b>10</b> (10) --> 10<b>11</b> (7)
//</pre>
// You can see the raw two LSBs are in the sequence 3,0,1,2, which we must map
// to 0,1,2,3. We can pre-fill a table with the right values to do that, where
// address 3 will contain value 0, address 0 will contain value 1, etc...
// Translating the LSBs like so re-aligns the addresses to the nearest
// power-of-2 boundary, leaving the LSBs in the order we need.
// Typically, you'll need this Address Translator alongside an Address Decoder
// module which decodes a *fixed* address range: either
// [Static](./Address_Decoder_Static.html), or
// [Behavioural](./Address_Decoder_Behavioural.html) or
// [Arithmetic](./Address_Decoder_Arithmetic.html) with constant base and
// bound addresses.
// Since we only translate the LSBs, the higher address bits are unused here,
// and would trigger CAD tool warnings about module inputs not driving
// anything. Thus, we don't select the LSBs from the whole address here, and
// it is up to the enclosing module to feed the Address Translator only the
// necessary LSBs, which is fine since the enclosing module must already know
// the number of LSBs to translate. The output is the translated LSBs in
// consecutive order.
`default_nettype none
module Address_Translator_Static
#(
parameter INPUT_ADDR_BASE = 0,
parameter OUTPUT_ADDR_WIDTH = 0
)
(
input wire [OUTPUT_ADDR_WIDTH-1:0] input_addr,
output reg [OUTPUT_ADDR_WIDTH-1:0] output_addr
);
// Let's create the translation table and specify its implementation as LUT
// logic, otherwise it might end up as a Block RAM or asynchronous LUT RAM at
// random, and then the table cannot get optimized into other logic, and will
// likely be too slow. We make the table deep enough to hold all possible LSB
// values, which enables better logic optimization (no special cases for
// unused addresses).
localparam OUTPUT_ADDR_DEPTH = 2**OUTPUT_ADDR_WIDTH;
(* ramstyle = "logic" *) // Quartus
(* ram_style = "distributed" *) // Vivado
reg [OUTPUT_ADDR_WIDTH-1:0] translation_table [OUTPUT_ADDR_DEPTH-1:0];
// Now lets construct the translation table entry index **j**. When
// initializing **j**, we must zero-pad the narrow input address up to the
// width of a Verilog integer, else the width mismatch raises warnings at its
// assigment.
// We also need a simple loop counter **i** to iterate over the number of
// addresses to translate, which is the entire range described by the LSBs.
integer i;
integer j;
localparam INTEGER_WIDTH = 32;
localparam PADSIZE = INTEGER_WIDTH - OUTPUT_ADDR_WIDTH;
localparam PADDING = {PADSIZE{1'b0}};
// We then initialize **j** by slicing out the LSBs from the integer base
// address, and then padding it back to an integer. We then store the
// (strictly incrementing) corresponding LSBs from the **i** counter into each
// translation table entry indexed by **j**, wrapping around to the start of
// the table if necessary.
initial begin
j = {PADDING, INPUT_ADDR_BASE[OUTPUT_ADDR_WIDTH-1:0]};
for(i = 0; i < OUTPUT_ADDR_DEPTH; i = i + 1) begin
translation_table[j] = i[OUTPUT_ADDR_WIDTH-1:0];
j = (j + 1) % OUTPUT_ADDR_DEPTH;
end
end
// Finally, after all these details, the translation itself is trivial.
always @(*) begin
output_addr = translation_table[input_addr];
end
endmodule