1 The Acorn Electron ULA
2 ======================
3
4 Principal Design and Feature Constraints
5 ----------------------------------------
6
7 The features of the ULA are limited in sophistication by the amount of time
8 and resources that can be allocated to each activity supporting the
9 fundamental features and obligations of the unit. Maintaining a screen display
10 based on the contents of RAM itself requires the ULA to have exclusive access
11 to various hardware resources for a significant period of time.
12
13 Whilst other elements of the ULA can in principle run in parallel with the
14 display refresh activity, they cannot also access the RAM at the same time.
15 Consequently, other features that might use the RAM must accept a reduced
16 allocation of that resource in comparison to a hypothetical architecture where
17 concurrent RAM access is possible at all times.
18
19 Thus, the principal constraint for many features is bandwidth. The duration of
20 access to hardware resources is one aspect of this; the rate at which such
21 resources can be accessed is another. For example, the RAM is not fast enough
22 to support access more frequently than one byte per 2MHz cycle, and for screen
23 modes involving 80 bytes of screen data per scanline, there are no free cycles
24 for anything other than the production of pixel output during the active
25 scanline periods.
26
27 Another constraint is imposed by the method of RAM access provided by the ULA.
28 The ULA is able to access RAM by fetching 4 bits at a time and thus managing
29 to transfer 8 bits within a single 2MHz cycle, this being sufficient to
30 provide display data for the most demanding screen modes. However, this
31 mechanism's timing requirements are beyond the capabilities of the CPU when
32 running at 2MHz.
33
34 Consequently, the CPU will only ever be able to access RAM via the ULA at
35 1MHz, even when the ULA is not accessing the RAM. Fortunately, when needing to
36 refresh the display, the ULA is still able to make use of the idle part of
37 each 1MHz cycle (or, rather, the idle 2MHz cycle unused by the CPU) to itself
38 access the RAM at a rate of 1 byte per 1MHz cycle (or 1 byte every other 2MHz
39 cycle), thus supporting the less demanding screen modes.
40
41 Timing
42 ------
43
44 According to 15.3.2 in the Advanced User Guide, there are 312 scanlines, 256
45 of which are used to generate pixel data. At 50Hz, this means that 128 cycles
46 are spent on each scanline (2000000 cycles / 50 = 40000 cycles; 40000 cycles /
47 312 ~= 128 cycles). This is consistent with the observation that each scanline
48 requires at most 80 bytes of data, and that the ULA is apparently busy for 40
49 out of 64 microseconds in each scanline.
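
The arithmetic above can be checked with a minimal sketch. The constant names
are illustrative and do not come from any Acorn documentation.

```python
# Scanline arithmetic for the Electron display (names are assumptions).
CLOCK_HZ = 2_000_000      # 2MHz RAM access clock
FIELD_RATE_HZ = 50        # fields per second
LINES_PER_FIELD = 312     # scanlines per field (Advanced User Guide, 15.3.2)

cycles_per_field = CLOCK_HZ // FIELD_RATE_HZ           # 2MHz cycles per field
cycles_per_line = cycles_per_field / LINES_PER_FIELD   # slightly over 128

# At 2MHz, each cycle lasts 0.5 microseconds.
line_period_us = 128 * 0.5    # duration of a 128-cycle scanline
active_period_us = 80 * 0.5   # time needed to fetch 80 bytes at 1 byte/cycle

print(cycles_per_field, round(cycles_per_line), line_period_us, active_period_us)
```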
50
(Since the ULA is seeking to provide an image for an interlaced 625-line
display, there are in fact two "fields" involved, one providing 312 scanlines
and one providing 313 scanlines. See below for a description of the video
system.)
55
56 Access to RAM involves accessing four 64Kb dynamic RAM devices (IC4 to IC7,
57 each providing two bits of each byte) using two cycles within the 500ns period
58 of the 2MHz clock to complete each access operation. Since the CPU and ULA
59 have to take turns in accessing the RAM in MODE 4, 5 and 6, the CPU must
60 effectively run at 1MHz (since every other 500ns period involves the ULA
61 accessing RAM) during transfers of screen data.
62
63 The CPU is driven by an external clock (IC8) whose 16MHz frequency is divided
by the ULA (IC1) depending on the screen mode in use. Each 16MHz cycle lasts
62.5ns. To access the memory, the following patterns corresponding to 16MHz
cycles are required:
67
Time (ns):       0-------------- 500------------- ...
2 MHz cycle:     0               1                ...
16 MHz cycle:    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7  ...
                 /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\ ...
~RAS:            /---\___________/---\___________ ...
~CAS:            /-----\___/-\___/-----\___/-\___ ...
Address events:      A B     C       A B     C    ...
Data events:              F     S         F     S ...

~RAS ops:        1   0           1   0            ...
~CAS ops:        1     0   1 0   1     0   1 0    ...

Address ops:       a b     c       a b     c      ...
Data ops:        s         f     s         f      ...

~WE:             ......W                          ...
PHI OUT:         \_______________/--------------- ...
CPU (RAM):       L               D                ...
RnW:             R                                ...

PHI OUT:         \_______/-------\_______/------- ...
CPU (ROM):       L       D       L       D        ...
RnW:             R               R                ...
91
92 ~RAS must be high for 100ns, ~CAS must be high for 50ns.
93 ~RAS must be low for 150ns, ~CAS must be low for 90ns.
94 Data is available 150ns after ~RAS goes low, 90ns after ~CAS goes low.
95
96 Here, "A" and "B" respectively indicate the row and first column addresses
97 being latched into the RAM (on a negative edge for ~RAS and ~CAS
98 respectively), and "C" indicates the second column address being latched into
99 the RAM. Presumably, the first and second half-bytes can be read at "F" and
100 "S" respectively, and the row and column addresses must be made available at
101 "a" and "b" (and "c") respectively at the latest. Data can be read at "f" and
102 "s" for the first and second half-bytes respectively.
103
104 For the CPU, "L" indicates the point at which an address is taken from the CPU
105 address bus, on a negative edge of PHI OUT, with "D" being the point at which
106 data may either be read or be asserted for writing, on a positive edge of PHI
OUT. Here, PHI OUT is driven at 1MHz. Since ~WE needs to be driven low for
writing or high for reading, and thus propagates RnW from the CPU, it must be
set up before data is retrieved. According to the TM4164EC4 datasheet, this
can be done as late as the point at which the column address is presented and
~CAS is brought low.
112
The TM4164EC4-15 has a row address access time of 150ns (maximum) and a column
address access time of 90ns (maximum), which appears to mean that ~RAS must be
held low for at least 150ns and that ~CAS must be held low for at least 90ns
before data becomes available. 150ns is 2.4 cycles (at 16MHz) and 90ns is 1.44
cycles. Rounding up to the next half cycle, "A" to "F" is thus 2.5 cycles,
"B" to "F" is 1.5 cycles, and "C" to "S" is 1.5 cycles.
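
As a rough check of these figures, the following sketch converts the datasheet
access times into 16MHz cycles. Rounding up to the next half cycle is an
assumption made here, based on the "F" and "S" events falling between cycle
boundaries in the diagram; the function name is hypothetical.

```python
import math

CYCLE_NS = 1e9 / 16e6   # duration of one 16MHz cycle in nanoseconds (62.5)

def half_cycles_needed(access_ns):
    """Return the access time in 16MHz cycles, rounded up to a half cycle."""
    exact = access_ns / CYCLE_NS          # e.g. 150ns is 2.4 cycles
    return math.ceil(exact * 2) / 2       # assumed granularity: half a cycle

row_access = half_cycles_needed(150)      # ~RAS low to data ("A" to "F")
column_access = half_cycles_needed(90)    # ~CAS low to data ("B" to "F")
print(row_access, column_access)
```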
119
120 Note that the Service Manual refers to the negative edge of RAS and CAS, but
121 the datasheet for the similar TM4164EC4 product shows latching on the negative
122 edge of ~RAS and ~CAS. It is possible that the Service Manual also intended to
123 communicate the latter behaviour. In the TM4164EC4 datasheet, it appears that
124 "page mode" provides the appropriate behaviour for that particular product.
125
126 The CPU, when accessing the RAM alone, apparently does not make use of the
127 vacated "slot" that the ULA would otherwise use (when interleaving accesses in
128 MODE 4, 5 and 6). It only employs a full 2MHz access frequency to memory when
129 accessing ROM (and potentially sideways RAM). The principal limitation is the
130 amount of time needed between issuing an address and receiving an entire byte
131 from the RAM, which is approximately 7 cycles (at 16MHz): much longer than the
132 4 cycles that would be required for 2MHz operation.
133
134 See: Acorn Electron Advanced User Guide
135 See: Acorn Electron Service Manual
136 http://chrisacorns.computinghistory.org.uk/docs/Acorn/Manuals/Acorn_ElectronSM.pdf
137 See: http://mdfs.net/Docs/Comp/Electron/Techinfo.htm
138 See: http://stardot.org.uk/forums/viewtopic.php?p=120438#p120438
139 See: One of the Most Popular 65,536-Bit (64K) Dynamic RAMs The TMS 4164
140 http://smithsonianchips.si.edu/augarten/p64.htm
141
142 A Note on 8-Bit Wide RAM Access
143 -------------------------------
144
145 It is worth considering the timing when 8 bits of data can be obtained at once
146 from the RAM chips:
147
148 Time (ns): 0-------------- 500------------- ...
149 2 MHz cycle: 0 1 ...
150 8 MHz cycle: 0 1 2 3 0 1 2 3 ...
151 /-\_/-\_/-\_/-\_/-\_/-\_/-\_/-\_ ...
152 ~RAS: /---\___________/---\___________ ...
153 ~CAS: /-------\_______/-------\_______ ...
154 Address events: A B A B ...
155 Data events: E E ...
156
157 ~RAS ops: 1 0 1 0 ...
158 ~CAS ops: 1 0 1 0 ...
159
160 Address ops: a b a b ...
161 Data ops: f s f ...
162
163 ~WE: ........W ...
164 PHI OUT: \_______/-------\_______/------- ...
165 CPU: L D L D ...
166 RnW: R R ...
167
168 Here, "E" indicates the availability of an entire byte.
169
170 Since only one fetch is required per 2MHz cycle, instead of two fetches for
171 the 4-bit wide RAM arrangement, it seems likely that longer 8MHz cycles could
172 be used to coordinate the necessary signalling.
173
174 Another conceivable simplification from using an 8-bit wide RAM access channel
175 with a single access within each 2MHz cycle is the possibility of allowing the
176 CPU to signal directly to the RAM instead of having the ULA perform the access
177 signalling on the CPU's behalf. Note that it is this more leisurely signalling
178 that would allow the CPU to conduct accesses at 2MHz: the "compressed"
179 signalling being beyond the capabilities of the CPU.
180
181 Note that 16MHz cycles would still be needed for the pixel clock in MODE 0,
182 which needs to output eight pixels per 2MHz cycle, producing 640 monochrome
183 pixels per 80-byte line.
184
185 An obvious consideration with regard to 8-bit wide access is whether the ULA
186 could still conduct the "compressed" signalling for its own RAM accesses:
187
Time (ns):       0-------------- 500------------- ...
2 MHz cycle:     0               1                ...
16 MHz cycle:    0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7  ...
                 /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\ ...
~RAS:            /---\___________/---\___________ ...
~CAS:            /-----\___/-\___/-----\___/-\___ ...
Address events:      A B     C       A B     C    ...
Data events:              1     2         1     2 ...

~RAS ops:        1   0           1   0            ...
~CAS ops:        1     0   1 0   1     0   1 0    ...

Address ops:       a b     c       a b     c      ...
Data ops:        s         f     s         f      ...

~WE:             ......W                          ...
PHI OUT:         \_______/-------\_______/------- ...
CPU:             L       D       L       D        ...
RnW:             R               R                ...
207
208 Here, "1" and "2" in the data events correspond to whole byte accesses,
209 effectively upgrading the half-byte "F" and "S" events in the existing ULA
210 arrangement.
211
212 Although the provision of access for the CPU would adhere to the relevant
213 timing constraints, providing only one byte per 2MHz cycle, the ULA could
214 obtain two bytes per cycle. This would then free up bandwidth for the CPU in
215 screen modes where the ULA would normally be dominant (MODE 0 to 3), albeit at
216 the cost of extra buffering. Such buffering could also be done for modes where
217 the bandwidth is shared (MODE 4 to 6), consolidating pairs of ULA accesses into
218 single cycles and freeing up an extra cycle for CPU accesses.
219
220 A further consideration is whether the CPU and ULA could access the memory on
221 interleaved 4MHz cycles, thus replicating the arrangement used by the CPU and
222 Video ULA on the BBC Micro. One potential obstacle is that the apparent 4MHz
223 access rate employed by the ULA does not involve the complete process for
224 accessing the RAM: upon setting up the address and issuing the ~RAS signal,
225 the ULA is able to make a pair of column accesses on the same "row" of memory,
226 effectively achieving an average access rate of 4MHz in an 8-bit
227 configuration.
228
229 However, if arbitrary pairs of column accesses were to be attempted, as would
230 be required by CPU and ULA interleaving, the ~RAS signal would need to be
231 re-issued with different addresses being set up. This would expand the time to
232 access a memory location to beyond the period of a 4MHz cycle, making it
233 impossible to employ interleaved accesses at such a rate.
234
235 In conclusion, a strict interleaving strategy is not possible, but by using
236 pixel data buffering and employing two ULA accesses per 2MHz cycle to obtain
237 two bytes in that cycle, each adjacent 2MHz cycle can be given to the CPU,
238 thus achieving an effective throughput during display update periods of 3
239 bytes for every pair of cycles (2 bytes for the ULA, 1 byte for the CPU), and
240 thus 1.5 bytes per cycle, giving an illusion of 3MHz access to RAM.
241
242 CPU Clock Notes
243 ---------------
244
245 "The 6502 receives an external square-wave clock input signal on pin 37, which
246 is usually labeled PHI0. [...] This clock input is processed within the 6502
247 to form two clock outputs: PHI1 and PHI2 (pins 3 and 39, respectively). PHI2
248 is essentially a copy of PHI0; more specifically, PHI2 is PHI0 after it's been
249 through two inverters and a push-pull amplifier. The same network of
250 transistors within the 6502 which generates PHI2 is also tied to PHI1, and
251 generates PHI1 as the inverse of PHI0. The reason why PHI1 and PHI2 are made
252 available to external devices is so that they know when they can access the
253 CPU. When PHI1 is high, this means that external devices can read from the
254 address bus or data bus; when PHI2 is high, this means that external devices
255 can write to the data bus."
256
257 See: http://lateblt.livejournal.com/88105.html
258
259 "The 6502 has a synchronous memory bus where the master clock is divided into
260 two phases (Phase 1 and Phase 2). The address is always generated during Phase
261 1 and all memory accesses take place during Phase 2."
262
263 See: http://www.jmargolin.com/vgens/vgens.htm
264
Thus, the inverse of PHI OUT provides the "other phase" of the clock. "During
Phase 1" means when PHI1 is high (and PHI0 is low), and "during Phase 2" means
when PHI2 - essentially a copy of PHI0 - is high.
268
269 Bandwidth Figures
270 -----------------
271
272 Using an observation of 128 2MHz cycles per scanline, 256 active lines and 312
273 total lines, with 80 cycles occurring in the active periods of display
274 scanlines, the following bandwidth calculations can be performed:
275
276 Total theoretical maximum:
277 128 cycles * 312 lines
278 = 39936 bytes
279
280 MODE 0, 1, 2:
281 ULA: 80 cycles * 256 lines
282 = 20480 bytes
283 CPU: 48 cycles / 2 * 256 lines
284 + 128 cycles / 2 * (312 - 256) lines
285 = 9728 bytes
286
287 MODE 3:
288 ULA: 80 cycles * 24 rows * 8 lines
289 = 15360 bytes
290 CPU: 48 cycles / 2 * 24 rows * 8 lines
291 + 128 cycles / 2 * (312 - (24 rows * 8 lines))
292 = 12288 bytes
293
294 MODE 4, 5:
295 ULA: 40 cycles * 256 lines
296 = 10240 bytes
297 CPU: (40 cycles + 48 cycles / 2) * 256 lines
298 + 128 cycles / 2 * (312 - 256) lines
299 = 19968 bytes
300
301 MODE 6:
302 ULA: 40 cycles * 24 rows * 8 lines
303 = 7680 bytes
304 CPU: (40 cycles + 48 cycles / 2) * 24 rows * 8 lines
305 + 128 cycles / 2 * (312 - (24 rows * 8 lines))
306 = 19968 bytes
307
Here, the division by 2 for CPU accesses is performed to indicate that the CPU
only uses every other access opportunity even in uncontended periods. See the
2MHz RAM Access enhancement below for bandwidth calculations that consider
this limitation removed.
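
The calculations above can be reproduced with a short sketch. The function and
dictionary names are hypothetical; the figures of 128 cycles per line, 80
active cycles and 312 lines are those stated in this section.

```python
CYCLES_PER_LINE = 128   # 2MHz cycles per scanline
ACTIVE_CYCLES = 80      # 2MHz cycles in the active part of a display line
TOTAL_LINES = 312
TOTAL = CYCLES_PER_LINE * TOTAL_LINES   # theoretical maximum in bytes

def bandwidth(ula_cycles, display_lines):
    """Return (ULA bytes, CPU bytes) per field for the standard ULA."""
    ula = ula_cycles * display_lines
    # Active cycles not taken by the ULA: none in MODE 0 to 3, and the 40
    # interleaved CPU slots in MODE 4 to 6.
    shared = ACTIVE_CYCLES - ula_cycles
    # Outside the active period the CPU only uses every other cycle.
    idle = (CYCLES_PER_LINE - ACTIVE_CYCLES) // 2
    cpu = (shared + idle) * display_lines
    cpu += CYCLES_PER_LINE // 2 * (TOTAL_LINES - display_lines)
    return ula, cpu

MODES = {
    "MODE 0, 1, 2": (80, 256),
    "MODE 3": (80, 24 * 8),
    "MODE 4, 5": (40, 256),
    "MODE 6": (40, 24 * 8),
}

for name, (ula_cycles, lines) in MODES.items():
    ula, cpu = bandwidth(ula_cycles, lines)
    print(f"{name}: ULA {ula} bytes, CPU {cpu} bytes, "
          f"{100 * cpu / TOTAL:.0f}% of total, slowdown {TOTAL / cpu:.2f}")
```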
312
313 A summary of the bandwidth figures is as follows (with extra timing details
314 described below):
315
                Standard ULA   % Total   Slowdown   BBC-10s   BBC-34s
MODE 0, 1, 2     9728 bytes    24%       4.11       43s       105s
MODE 3          12288 bytes    31%       3.25       34s
MODE 4, 5       19968 bytes    50%       2          20s
MODE 6          19968 bytes    50%       2          20s       50s
321
322 The review of the Electron in Practical Computing (October 1983) provides a
323 concise overview of the RAM access limitations and gives timing comparisons
324 between modes and BBC Micro performance. In the above, "BBC-10s" is the
325 measured or stated time given for a program taking 10 seconds on the BBC
326 Micro, whereas "BBC-34s" is the apparently measured time given for the
327 "Persian" program taking 34 seconds to complete on the BBC Micro, with a
328 "quick" mode presumably switching to MODE 6 using the ULA directly in order to
329 reduce display bandwidth usage while the program draws to the screen.
330 Evidently, the measured slowdown is slightly lower than the theoretical
331 slowdown, most likely due to the running time not being entirely dominated by
332 RAM access performance characteristics.
333
334 Video Timing
335 ------------
336
337 According to 8.7 in the Service Manual, and the PAL Wikipedia page,
338 approximately 4.7µs is used for the sync pulse, 5.7µs for the "back porch"
339 (including the "colour burst"), and 1.65µs for the "front porch", totalling
340 12.05µs and thus leaving 51.95µs for the active video signal for each
341 scanline. As the Service Manual suggests in the oscilloscope traces, the
342 display information is transmitted more or less centred within the active
343 video period since the ULA will only be providing pixel data for 40µs in each
344 scanline.
345
346 Each 62.5ns cycle happens to correspond to 64µs divided by 1024, meaning that
347 each scanline can be divided into 1024 cycles, although only 640 at most are
348 actively used to provide pixel data. Pixel data production should only occur
349 within a certain period on each scanline, approximately 262 cycles after the
350 start of hsync:
351
352 active video period = 51.95µs
353 pixel data period = 40µs
354 total silent period = 51.95µs - 40µs = 11.95µs
355 silent periods (before and after) = 11.95µs / 2 = 5.975µs
356 hsync and back porch period = 4.7µs + 5.7µs = 10.4µs
357 time before pixel data period = 10.4µs + 5.975µs = 16.375µs
358 pixel data period start cycle = 16.375µs / 62.5ns = 262
359
360 By choosing a number divisible by 8, the RAM access mechanism can be
361 synchronised with the pixel production. Thus, 256 is a more appropriate start
362 cycle, where the HS (horizontal sync) signal corresponding to the 4µs sync
363 pulse (or "normal sync" pulse as described by the "PAL TV timing and voltages"
364 document) occurs at cycle 0.
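
The derivation of the start cycle can be sketched as follows, working in
nanoseconds. The constant names are illustrative. Rounding down to a multiple
of 8 is an assumption that happens to reproduce the figure of 256 chosen
above; the text does not state a rounding rule.

```python
CYCLE_NS = 62.5          # one 16MHz cycle
LINE_NS = 64_000         # one scanline
SYNC_NS = 4_700          # sync pulse
BACK_PORCH_NS = 5_700    # back porch including colour burst
FRONT_PORCH_NS = 1_650   # front porch
PIXEL_DATA_NS = 40_000   # period in which the ULA provides pixel data

active_ns = LINE_NS - (SYNC_NS + BACK_PORCH_NS + FRONT_PORCH_NS)
margin_ns = (active_ns - PIXEL_DATA_NS) / 2       # silent period on each side
start_ns = SYNC_NS + BACK_PORCH_NS + margin_ns    # time before pixel data
start_cycle = start_ns / CYCLE_NS                 # centred start cycle

# Align with the 8-cycle RAM access pattern (assumed: round down).
aligned_cycle = int(start_cycle) // 8 * 8
print(active_ns, margin_ns, start_cycle, aligned_cycle)
```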
365
366 To summarise:
367
368 HS signal starts at cycle 0 on each horizontal scanline
369 HS signal ends approximately 4µs later at cycle 64
370 Pixel data starts approximately 12µs later at cycle 256
371
372 "Re: Electron Memory Contention" provides measurements that appear consistent
373 with these calculations.
374
The "vertical blanking period", meaning the period before picture information
in each field, is 25 lines out of 312 (or 313) and thus lasts for 1.6ms. Of
377 this, 2.5 lines occur before the vsync (field sync) which also lasts for 2.5
378 lines. Thus, the first visible scanline on the first field of a frame occurs
379 half way through the 23rd scanline period measured from the start of vsync
380 (indicated by "V" in the diagrams below):
381
382 10 20 23
383 Line in frame: 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
384 Line from 1: 0 22 3
385 Line on screen: .:::::VVVVV::::: 12233445566
386 |_________________________________________________|
387 25 line vertical blanking period
388
389 In the second field of a frame, the first visible scanline coincides with the
390 24th scanline period measured from the start of line 313 in the frame:
391
392 310 336
393 Line in frame: 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
394 Line from 313: 0 23 4
395 Line on screen: 88:::::VVVVV:::: 11223344
396 288 | |
397 |_________________________________________________|
398 25 line vertical blanking period
399
400 In order to consider only full lines, we might consider the start of each
401 frame to occur 23 lines after the start of vsync.
402
403 Again, it is likely that pixel data production should only occur on scanlines
404 within a certain period on each frame. The "625/50" document indicates that
405 only a certain region is "safe" to use, suggesting a vertically centred region
406 with approximately 15 blank lines above and below the picture. However, the
407 "PAL TV timing and voltages" document suggests 28 blank lines above and below
408 the picture. This would centre the 256 lines within the 312 lines of each
409 field and thus provide a start of picture approximately 5.5 or 5 lines after
410 the end of the blanking period or 28 or 27.5 lines after the start of vsync.
411
412 To summarise:
413
414 CSYNC signal starts at cycle 0
415 CSYNC signal ends approximately 160µs (2.5 lines) later at cycle 2560
Start of line occurs approximately 1632µs (25.5 lines) later at cycle 28672
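
The cycle counts in this summary follow from treating each 64µs scanline as
1024 cycles of 62.5ns. A minimal sketch, assuming that the picture starts 28
lines after the start of vsync as suggested above (the names are
illustrative):

```python
CYCLES_PER_LINE = 1024   # 16MHz cycles in one 64 microsecond scanline
LINE_US = 64

VSYNC_LINES = 2.5            # duration of the field sync
PICTURE_START_LINES = 28     # assumed: blank lines before the picture

vsync_end_cycle = int(VSYNC_LINES * CYCLES_PER_LINE)
picture_start_cycle = PICTURE_START_LINES * CYCLES_PER_LINE

# Interval between the end of the sync signal and the start of the picture.
gap_lines = PICTURE_START_LINES - VSYNC_LINES
gap_us = gap_lines * LINE_US
print(vsync_end_cycle, picture_start_cycle, gap_lines, gap_us)
```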
417
418 See: http://en.wikipedia.org/wiki/PAL
419 See: http://en.wikipedia.org/wiki/Analog_television#Structure_of_a_video_signal
420 See: The 625/50 PAL Video Signal and TV Compatible Graphics Modes
421 http://lipas.uwasa.fi/~f76998/video/modes/
422 See: PAL TV timing and voltages
423 http://www.retroleum.co.uk/electronics-articles/pal-tv-timing-and-voltages/
424 See: Line Standards
425 http://www.pembers.freeserve.co.uk/World-TV-Standards/Line-Standards.html
426 See: Horizontal Blanking Interval of 405-, 525-, 625- and 819-Line Standards
427 http://www.pembers.freeserve.co.uk/World-TV-Standards/HBI.pdf
428 See: Re: Electron Memory Contention
429 http://www.stardot.org.uk/forums/viewtopic.php?p=134109#p134109
430
431 RAM Integrated Circuits
432 -----------------------
433
434 Unicorn Electronics appears to offer 4164 RAM chips (as well as 6502 series
435 CPUs such as the 6502, 6502A, 6502B and 65C02). These 4164 devices are
436 available in 100ns (4164-100), 120ns (4164-120) and 150ns (4164-150) variants,
437 have 16 pins and address 65536 bits through a 1-bit wide channel. Similarly,
438 ByteDelight.com sell 4164 devices primarily for the ZX Spectrum.
439
The documentation for the Electron mentions 4164-15 RAM chips for IC4-7, and
the Samsung-produced KM4164 series is apparently equivalent to the Texas
Instruments 4164 chips presumably used in the Electron.
443
444 The TM4164EC4 series combines 4 64K x 1b units into a single package and
445 appears similar to the TM4164EA4 featured on the Electron's circuit diagram
446 (in the Advanced User Guide but not the Service Manual), and it also has 22
447 pins providing 3 additional inputs and 3 additional outputs over the 16 pins
448 of the individual 4164-15 modules, presumably allowing concurrent access to
449 the packaged memory units.
450
451 As far as currently available replacements are concerned, the NTE4164 is a
452 potential candidate: according to the Vetco Electronics entry, it is
453 supposedly a replacement for the TMS4164-15 amongst many other parts. Similar
454 parts include the NTE2164 and the NTE6664, both of which appear to have
455 largely the same performance and connection characteristics. Meanwhile, the
456 NTE21256 appears to be a 16-pin replacement with four times the capacity that
457 maintains the single data input and output pins. Using the NTE21256 as a
458 replacement for all ICs combined would be difficult because of the single bit
459 output.
460
461 Another device equivalent to the 4164-15 appears to be available under the
462 code 41662 from Jameco Electronics as the Siemens HYB 4164-2. The Jameco Web
463 site lists data sheets for other devices on the same page, but these are
464 different and actually appear to be provided under the 41574 product code (but
465 are listed under 41464-10) and appear to be replacements for the TM4164EC4:
466 the Samsung KM41464A-15 and NEC µPD41464 employ 18 pins, eliminating 4 pins by
467 employing 4 pins for both input and output.
468
             Pins   I/O pins   Row access   Column access
             ----   --------   ----------   -------------
TM4164EC4    22     4 + 4      150ns (15)   90ns (15)
KM41464AP    18     4          150ns (15)   75ns (15)
NTE21256     16     1 + 1      150ns        75ns
HYB 4164-2   16     1 + 1      150ns        100ns
µPD41464     18     4          120ns (12)   60ns (12)
476
477 See: TM4164EC4 65,536 by 4-Bit Dynamic RAM Module
478 http://www.datasheetarchive.com/dl/Datasheets-112/DSAP0051030.pdf
479 See: Dynamic RAMS
480 http://www.unicornelectronics.com/IC/DYNAMIC.html
481 See: New old stock 8x 4164 chips
482 http://www.bytedelight.com/?product=8x-4164-chips-new-old-stock
483 See: KM4164B 64K x 1 Bit Dynamic RAM with Page Mode
484 http://images.ihscontent.net/vipimages/VipMasterIC/IC/SAMS/SAMSD020/SAMSD020-45.pdf
485 See: NTE2164 Integrated Circuit 65,536 X 1 Bit Dynamic Random Access Memory
486 http://www.vetco.net/catalog/product_info.php?products_id=2806
487 See: NTE4164 - IC-NMOS 64K DRAM 150NS
488 http://www.vetco.net/catalog/product_info.php?products_id=3680
489 See: NTE21256 - IC-256K DRAM 150NS
490 http://www.vetco.net/catalog/product_info.php?products_id=2799
491 See: NTE21256 262,144-Bit Dynamic Random Access Memory (DRAM)
492 http://www.nteinc.com/specs/21000to21999/pdf/nte21256.pdf
493 See: NTE6664 - IC-MOS 64K DRAM 150NS
494 http://www.vetco.net/catalog/product_info.php?products_id=5213
495 See: NTE6664 Integrated Circuit 64K-Bit Dynamic RAM
496 http://www.nteinc.com/specs/6600to6699/pdf/nte6664.pdf
497 See: 4164-150: MAJOR BRANDS
498 http://www.jameco.com/webapp/wcs/stores/servlet/Product_10001_10001_41662_-1
499 See: HYB 4164-1, HYB 4164-2, HYB 4164-3 65,536-Bit Dynamic Random Access Memory (RAM)
500 http://www.jameco.com/Jameco/Products/ProdDS/41662SIEMENS.pdf
501 See: KM41464A NMOS DRAM 64K x 4 Bit Dynamic RAM with Page Mode
502 http://www.jameco.com/Jameco/Products/ProdDS/41662SAM.pdf
503 See: NEC µ41464 65,536 x 4-Bit Dynamic NMOS RAM
504 http://www.jameco.com/Jameco/Products/ProdDS/41662NEC.pdf
505 See: 41464-10: MAJOR BRANDS
506 http://www.jameco.com/webapp/wcs/stores/servlet/Product_10001_10001_41574_-1
507
508 Interrupts
509 ----------
510
511 The ULA generates IRQs (maskable interrupts) according to certain conditions
512 and these conditions are controlled by location &FE00:
513
514 * Vertical sync (bottom of displayed screen)
* 50Hz real time clock
516 * Transmit data empty
517 * Receive data full
518 * High tone detect
519
520 The ULA is also used to clear interrupt conditions through location &FE05. Of
521 particular significance is bit 7, which must be set if an NMI (non-maskable
522 interrupt) has occurred and has thus suspended ULA access to memory, restoring
523 the normal function of the ULA.
524
525 ROM Paging
526 ----------
527
528 Accessing different ROMs involves bits 0 to 3 of &FE05. Some special ROM
529 mappings exist:
530
531 8 keyboard
532 9 keyboard (duplicate)
533 10 BASIC ROM
534 11 BASIC ROM (duplicate)
535
536 Paging in a ROM involves the following procedure:
537
538 1. Assert ROM page enable (bit 3) together with a ROM number n in bits 0 to
539 2, corresponding to ROM number 8+n, such that one of ROMs 12 to 15 is
540 selected.
541 2. Where a ROM numbered from 0 to 7 is to be selected, set bit 3 to zero
542 whilst writing the desired ROM number n in bits 0 to 2.
543
544 See: http://stardot.org.uk/forums/viewtopic.php?p=136686#p136686
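
The paging procedure can be sketched as a function producing the values to
write to &FE05. This is an illustration only: the function name is
hypothetical, the other bits of &FE05 (which clear interrupts) are assumed to
be written as zero, the choice of ROM 12 for the intermediate step is
arbitrary, and the handling of ROMs 8 to 11 with a single write is an
assumption not covered by the procedure above.

```python
ROM_PAGE_ENABLE = 0x08   # bit 3 of &FE05

def rom_paging_writes(rom):
    """Return the sequence of values to write to &FE05 to page in a ROM."""
    if not 0 <= rom <= 15:
        raise ValueError("ROM number must be in the range 0 to 15")
    if rom >= 8:
        # Bit 3 set with n = rom - 8 in bits 0 to 2 (assumed to be
        # sufficient for ROMs 8 to 11 as well as 12 to 15).
        return [ROM_PAGE_ENABLE | (rom - 8)]
    # Step 1: select one of ROMs 12 to 15 (12 chosen arbitrarily).
    # Step 2: clear bit 3 and write the ROM number in bits 0 to 2.
    return [ROM_PAGE_ENABLE | (12 - 8), rom]

print([hex(value) for value in rom_paging_writes(3)])
```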
545
546 Keyboard Access
547 ---------------
548
549 The keyboard pages appear to be accessed at 1MHz just like the RAM.
550
551 See: https://stardot.org.uk/forums/viewtopic.php?p=254155#p254155
552
553 Shadow/Expanded Memory
554 ----------------------
555
556 The Electron exposes all sixteen address lines and all eight data lines
557 through the expansion bus. Using such lines, it is possible to provide
558 additional memory - typically sideways ROM and RAM - on expansion cards and
559 through cartridges, although the official cartridge specification provides
560 fewer address lines and only seeks to provide access to memory in 16K units.
561
562 Various modifications and upgrades were developed to offer "turbo"
563 capabilities to the Electron, permitting the CPU to access a separate 8K of
564 RAM at 2MHz, presumably preventing access to the low 8K of RAM accessible via
565 the ULA through additional logic. However, an enhanced ULA might support
566 independent CPU access to memory over the expansion bus by allowing itself to
567 be discharged from providing access to memory, potentially for a range of
568 addresses, and for the CPU to communicate with external memory uninterrupted.
569
570 Sideways RAM/ROM and Upper Memory Access
571 ----------------------------------------
572
573 Although the ULA controls the CPU clock, effectively slowing or stopping the
574 CPU when the ULA needs to access screen memory, it is apparently able to allow
575 the CPU to access addresses of &8000 and above - the upper region of memory -
576 at 2MHz independently of any access to RAM that the ULA might be performing,
577 only blocking the CPU if it attempts to access addresses of &7FFF and below
578 during any ULA memory access - the lower region of memory - by stopping or
579 stalling its clock.
580
581 Thus, the ULA remains aware of the level of the A15 line, only inhibiting the
582 CPU clock if the line goes low, when the CPU is attempting to access the lower
583 region of memory.
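
The decision described here can be sketched as a small function. It is a
simplification under stated assumptions: the function name and the
ula_using_ram parameter are hypothetical, and exceptions such as the keyboard
pages (which appear to be accessed at 1MHz) are ignored.

```python
def cpu_access(address, ula_using_ram):
    """Classify a CPU access according to the level of the A15 line."""
    a15_high = bool(address & 0x8000)
    if a15_high:
        # Upper region: independent of any RAM access by the ULA.
        return "2MHz"
    if ula_using_ram:
        # Lower region during a ULA access: the CPU clock is inhibited.
        return "stalled"
    # Lower region otherwise: RAM access performed via the ULA.
    return "1MHz"

print(cpu_access(0xC000, True), cpu_access(0x3000, True), cpu_access(0x3000, False))
```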
584
585 Hardware Scrolling (and Enhancement)
586 ------------------------------------
587
On the standard ULA, &FE02 and &FE03 map to an address of 9 significant bits
with the least significant 5 bits being zero, thus limiting the scrolling
resolution to 64 bytes. An enhanced ULA could support a resolution of 2 bytes
using the same layout of these addresses.
592
593 |--&FE02--------------| |--&FE03--------------|
594 XX XX 14 13 12 11 10 09 08 07 06 XX XX XX XX XX
595
596 XX 14 13 12 11 10 09 08 07 06 05 04 03 02 01 XX
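
For the standard ULA, the first layout above can be sketched as follows. The
function names are hypothetical, and the sketch deliberately refers to the
left-hand and right-hand bytes as drawn: which register holds which byte
should be confirmed against the Advanced User Guide.

```python
def screen_start_field(address):
    """Pack a screen start address into the 16-bit field drawn above.

    Address bits 14 to 6 occupy field bits 13 to 5; the two top bits and
    the five bottom bits of the field are unused (shown as XX).
    """
    if not 0 <= address < 0x8000:
        raise ValueError("address must fit in 15 bits")
    if address % 64:
        raise ValueError("the standard ULA only supports multiples of 64")
    return (address >> 1) & 0x3FE0

def field_bytes(field):
    """Split the field into the left-hand and right-hand bytes as drawn."""
    return field >> 8, field & 0xFF

left, right = field_bytes(screen_start_field(0x3040))
print(hex(left), hex(right))
```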
597
598 Arguably, a resolution of 8 bytes is more useful, since the mapping of screen
599 memory to pixel locations is character oriented. A change in 8 bytes would
permit a horizontal scrolling resolution of 2 pixels in MODE 2, 4 pixels in
MODE 1 and 5, and 8 pixels in MODE 0, 3, 4 and 6. This resolution is actually
observed on the BBC Micro (see 18.11.2 in the BBC Microcomputer Advanced User
Guide).
604
605 One argument for a 2 byte resolution is smooth vertical scrolling. A pitfall
606 of changing the screen address by 2 bytes is the change in the number of lines
607 from the initial and final character rows that need reading by the ULA, which
608 would need to maintain this state information (although this is a relatively
609 trivial change). Another pitfall is the complication that might be introduced
610 to software writing bitmaps of character height to the screen.
611
612 See: http://pastraiser.com/computers/acornelectron/acornelectron.html
613
614 Enhancement: Mode Layouts
615 -------------------------
616
617 Merely changing the screen memory mappings in order to have Archimedes-style
618 row-oriented screen addresses (instead of character-oriented addresses) could
619 be done for the existing modes, but this might not be sufficiently beneficial,
620 especially since accessing regions of the screen would involve incrementing
621 pointers by amounts that are inconvenient on an 8-bit CPU.
622
However, instead of using an Archimedes-style mapping, column-oriented screen
624 addresses could be more feasibly employed: incrementing the address would
625 reference the vertical screen location below the currently-referenced location
626 (just as occurs within characters using the existing ULA); instead of
627 returning to the top of the character row and referencing the next horizontal
628 location after eight bytes, the address would reference the next character row
629 and continue to reference locations downwards over the height of the screen
630 until reaching the bottom; at the bottom, the next location would be the next
631 horizontal location at the top of the screen.
632
633 In other words, the memory layout for the screen would resemble the following
634 (for MODE 2):
635
636 &3000 &3100 ... &7F00
637 &3001 &3101
638 ... ...
639 &3007
640 &3008
641 ...
642 ... ...
643 &30FF ... &7FFF
644
645 Since there are 256 pixel rows, each column of locations would be addressable
646 using the low byte of the address. Meanwhile, the high byte would be
647 incremented to address different columns. Thus, addressing screen locations
648 would become a lot more convenient and potentially much more efficient for
649 certain kinds of graphical output.
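
A minimal sketch of the address calculation for this hypothetical
column-oriented MODE 2 layout, alongside the existing character-oriented
layout for comparison. The function names are illustrative, and the figure of
640 bytes per character row (80 byte columns of 8 bytes each) is an assumption
about the existing MODE 2 layout rather than something stated above.

```python
SCREEN_BASE = 0x3000   # start of MODE 2 screen memory

def column_address(column, row):
    """Column-oriented layout: high byte selects column, low byte the row."""
    return SCREEN_BASE + (column << 8) + row

def character_address(column, row):
    """Existing character-oriented layout (assumed 640 bytes per text row)."""
    return SCREEN_BASE + (row // 8) * 640 + column * 8 + (row % 8)

# Moving down one pixel row is always an increment in the proposed layout,
# but crosses a character row boundary in the existing one.
print(hex(column_address(0, 8)), hex(character_address(0, 8)))
```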
650
651 One potential complication with this simplified addressing scheme arises with
652 hardware scrolling. Vertical hardware scrolling by one pixel row (not supported
653 with the existing ULA) would be achieved by incrementing or decrementing the
654 screen start address; by one character row, it would involve adding or
655 subtracting 8. However, the ULA only supports multiples of 64 when changing the
656 screen start address. Thus, if such a scheme were to be adopted, three
657 additional bits would need to be supported in the screen start register (see
658 "Hardware Scrolling (and Enhancement)" for more details). However, horizontal
659 scrolling would be much improved even under the severe constraints of the
660 existing ULA: only adjustments of 256 to the screen start address would be
661 required to produce single-location scrolling of as few as two pixels in MODE 2
662 (four pixels in MODEs 1 and 5, eight pixels otherwise).
663
664 More disruptive is the effect of this alternative layout on software.
665 Presumably, compatibility with the BBC Micro was the primary goal of the
666 Electron's hardware design. With the character-oriented screen layout in
667 place, system software (and application software accessing the screen
668 directly) would be relying on this layout to run on the Electron with little
669 or no modification. Although it might have been possible to change the system
670 software to use this column-oriented layout instead, this would have incurred
671 a development cost and caused additional work porting things like games to the
672 Electron. Moreover, a separate branch of the software from that supporting the
673 BBC Micro and closer derivatives would then have needed maintaining.
674
The decision to use the character-oriented layout in the BBC Micro may have
been related to the choice of circuitry and made to facilitate a convenient
hardware implementation. By the time the Electron was planned, it was too
late to do anything about this somewhat unfortunate choice.
679
680 Pixel Layouts
681 -------------
682
683 The pixel layouts are as follows:
684
Modes       Depth (bpp)  Pixels (from bits)
-----       -----------  ------------------
0, 3, 4, 6  1            7 6 5 4 3 2 1 0
1, 5        2            73 62 51 40
2           4            7531 6420
691 Since the ULA reads a half-byte at a time, one might expect it to attempt to
692 produce pixels for every half-byte, as opposed to handling entire bytes.
693 However, the pixel layout is not conducive to producing pixels as soon as a
694 half-byte has been read for a given full-byte location: in 1bpp modes the
695 first four pixels can indeed be produced, but in 2bpp and 4bpp modes the pixel
696 data is spread across the entire byte in different ways.
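
The interleaving can be made concrete with a small Python sketch that extracts
logical colours from a screen byte according to the table above. The function
name is hypothetical; the bit positions follow directly from the table.

```python
def decode_pixels(byte, bpp):
    """Return the logical colours encoded in one screen byte.

    Pixel i takes one bit from each group of (8 // bpp) bits, starting with
    the most significant group, which is why a complete byte is needed
    before any 2bpp or 4bpp pixel can be produced.
    """
    pixels_per_byte = 8 // bpp
    pixels = []
    for i in range(pixels_per_byte):
        colour = 0
        for plane in range(bpp):
            bit = 7 - i - plane * pixels_per_byte
            colour = (colour << 1) | ((byte >> bit) & 1)
        pixels.append(colour)
    return pixels
```

For example, in a 2bpp mode the leftmost pixel is built from bits 7 and 3, so
its value is not known until the second half-byte has been read.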
697
698 An alternative arrangement might be as follows:
699
Modes       Depth (bpp)  Pixels (from bits)
-----       -----------  ------------------
0, 3, 4, 6  1            7 6 5 4 3 2 1 0
1, 5        2            76 54 32 10
2           4            7654 3210
705
Just as the mode layouts were presumably dictated by compatibility with the
BBC Micro, the pixel layouts will have been maintained for similar reasons.
Unfortunately, the existing layout prevents any optimisation of the ULA for
handling half-byte pixel data generally.
710
711 Enhancement: The Missing MODE 4
712 -------------------------------
713
714 The Electron inherits its screen mode selection from the BBC Micro, where MODE
715 3 is a text version of MODE 0, and where MODE 6 is a text version of MODE 4.
716 Neither MODE 3 nor MODE 6 is a genuine character-based text mode like MODE 7,
717 however, and they are merely implemented by skipping two scanlines in every
718 ten after the eight required to produce a character line. Thus, such modes
719 provide a 24-row display.
720
721 In principle, nothing prevents this "text mode" effect being applied to other
722 modes. The 20-column modes are not well-suited to displaying text, which
723 leaves MODE 1 which, unlike MODEs 3 and 6, can display 4 colours rather than
724 2. Although the need for a non-monochrome 40-column text mode is addressed by
725 MODE 7 on the BBC Micro, the Electron lacks such a mode.
726
727 If the 4-colour, 24-row variant of MODE 1 were to be provided, logically it
728 would occupy MODE 4 instead of the current MODE 4:
729
Screen mode  Size (kilobytes)  Colours  Rows  Resolution
-----------  ----------------  -------  ----  ----------
0            20                2        32    640x256
1            20                4        32    320x256
2            20                16       32    160x256
3            16                2        24    640x256
4 (new)      16                4        24    320x256
4 (old)      10                2        32    320x256
5            10                4        32    160x256
6            8                 2        24    320x256
740
741 Thus, for increasing mode numbers, the size of each mode would be the same or
742 less than the preceding mode.
743
744 Enhancement: Display Mode Property Control
745 ------------------------------------------
746
747 It is rather curious that the ULA supports the mode numbers directly in bits 3
748 to 5 of &FE07 since these would presumably need to be decoded in order to set
749 the fundamental properties of the display mode. These properties are as
750 follows:
751
752 * Screen data retrieval rate: number of fetches per pair of 2MHz cycles
753 * Pixel colour depth
754 * Text mode vertical spacing
755
756 From these, the following properties emerge:
757
Property                        Influences
--------                        ----------
Character row size (bytes)      Retrieval rate

Number of character rows        Text mode setting

Display size (bytes)            Retrieval rate (character row size)
                                Text mode setting (number of rows)

Pixel frequency                 Retrieval rate
Horizontal resolution (pixels)  Colour depth
769
770 One can imagine a register bitfield arrangement as follows:
771
Field             Values                Formula
-----             ------                -------
Pixel depth       00: 1 bit per pixel   log2(depth)
                  01: 2 bits per pixel
                  10: 4 bits per pixel

Retrieval rate    0: twice              2 - fetches per cycle pair
                  1: once

Text mode enable  0: disable/off        text mode enabled
                  1: enable/on
783
This arrangement would require four bits, one more than the three bits (3 to
5) of &FE07 currently used to select the mode. However, one bit in &FE07 is
seemingly inactive and might possibly be reallocated.
786
787 The resulting combination of properties would permit all of the existing modes
788 plus some additional ones, including the missing MODE 4 mentioned above. With
789 the bitfields above ordered from the most significant bits to the least
790 significant bits providing the low-level "mode" values, the following table
791 can be produced:
792
Screen mode  Depth  Rate   Text  Size (K)  Colours  Rows  Resolution
-----------  -----  ----   ----  --------  -------  ----  ----------
0 (0000)     1      twice  off   20        2        32    640x256     (MODE 0)
1 (0001)     1      twice  on    16        2        24    640x256     (MODE 3)
2 (0010)     1      once   off   10        2        32    320x256     (MODE 4)
3 (0011)     1      once   on    8         2        24    320x256     (MODE 6)
4 (0100)     2      twice  off   20        4        32    320x256     (MODE 1)
5 (0101)     2      twice  on    16        4        24    320x256
6 (0110)     2      once   off   10        4        32    160x256     (MODE 5)
7 (0111)     2      once   on    8         4        24    160x256
8 (1000)     4      twice  off   20        16       32    160x256     (MODE 2)
9 (1001)     4      twice  on    16        16       24    160x256
10 (1010)    4      once   off   10        16       32    80x256
11 (1011)    4      once   on    8         16       24    80x256
807
808 The existing modes would be covered in a way that is incompatible with the
809 existing numbering, thus requiring a table in software, but additional text
810 modes would be provided for MODE 1, MODE 5 and MODE 2. An additional two lower
811 resolution modes would also be conceivable within this scheme, requiring the
812 stretching of 16MHz pixels by a factor of eight to yield 80 pixels per
813 scanline. The utility of such modes is questionable and such modes might not
814 be supported.
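
The derivation of the table from the bitfields might be sketched in Python as
follows. The names are hypothetical, and the sketch assumes that one fetch per
cycle pair yields 40 bytes per scanline and two fetches yield 80, with 24 rows
in text modes and 32 otherwise, as in the table.

```python
from typing import NamedTuple


class ModeProperties(NamedTuple):
    bits_per_pixel: int
    fetches_per_cycle_pair: int
    text_mode: bool
    bytes_per_line: int
    rows: int
    colours: int
    width: int


def decode_mode(value):
    """Split a 4-bit mode value into depth (2 bits), rate and text fields."""
    depth_field = (value >> 2) & 0b11
    if depth_field == 0b11:
        raise ValueError("depth field 11 is not defined in the table")
    rate_bit = (value >> 1) & 1
    text_bit = value & 1

    bpp = 1 << depth_field                # field holds log2(depth)
    fetches = 2 - rate_bit                # 0: twice, 1: once
    bytes_per_line = 40 * fetches         # assumed: 40 bytes per fetch slot
    rows = 24 if text_bit else 32
    return ModeProperties(bpp, fetches, bool(text_bit), bytes_per_line,
                          rows, 1 << bpp, bytes_per_line * 8 // bpp)
```

A software table mapping the existing MODE numbers to these values (for
instance, MODE 3 to 0001 and MODE 1 to 0100) would then be all that the
system software needs in order to retain the familiar numbering.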
815
816 Enhancement: 2MHz RAM Access
817 ----------------------------
818
819 Given that the CPU and ULA both access RAM at 2MHz, but given that the CPU
820 when not competing with the ULA only accesses RAM every other 2MHz cycle (as
821 if the ULA still needed to access the RAM), one useful enhancement would be a
822 mechanism to let the CPU take over the ULA cycles outside the ULA's period of
823 activity comparable to the way the ULA takes over the CPU cycles in MODE 0 to
824 3.
825
826 Thus, the RAM access cycles would resemble the following in MODE 0 to 3:
827
828 Upon a transition from display cycles: UUUUCCCC (instead of UUUUC_C_)
829 On a non-display line: CCCCCCCC (instead of C_C_C_C_)
830
831 In MODE 4 to 6:
832
833 Upon a transition from display cycles: CUCUCCCC (instead of CUCUC_C_)
834 On a non-display line: CCCCCCCC (instead of C_C_C_C_)
835
836 This would improve CPU bandwidth as follows:
837
              Standard ULA  Enhanced ULA  % Total Bandwidth  Speedup
MODE 0, 1, 2   9728 bytes   19456 bytes   24% -> 49%         2
MODE 3        12288 bytes   24576 bytes   31% -> 62%         2
MODE 4, 5     19968 bytes   29696 bytes   50% -> 74%         1.5
MODE 6        19968 bytes   32256 bytes   50% -> 81%         1.6
843
844 (Here, the uncontended total 2MHz bandwidth for a display period would be
845 39936 bytes, being 128 cycles per line over 312 lines.)
846
847 With such an enhancement, MODE 0 to 3 experience a doubling of CPU bandwidth
848 because all access opportunities to RAM are doubled. Meanwhile, in the other
849 modes, some CPU accesses occur alongside ULA accesses and thus cannot be
850 doubled, but the CPU bandwidth increase is still significant.
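
The figures in the table can be reproduced with the following Python sketch.
It rests on assumptions that are inferred from the figures rather than stated
above: 80 of the 128 cycles on a display line fall within the fetch window,
and MODE 3 and 6 have 192 display lines (24 rows of 8 lines) instead of 256.
The function name is hypothetical.

```python
CYCLES_PER_LINE = 128    # 2MHz cycles per scanline
LINES_PER_FRAME = 312
DISPLAY_CYCLES = 80      # assumed fetch window per display line


def cpu_bytes(display_lines, ula_uses_all_slots, enhanced):
    """CPU RAM accesses available per frame."""
    outside = CYCLES_PER_LINE - DISPLAY_CYCLES
    # Inside the fetch window the CPU gets nothing (UUUU) or alternate
    # cycles (CUCU); the enhancement leaves this part unchanged.
    inside_cpu = 0 if ula_uses_all_slots else DISPLAY_CYCLES // 2
    # Elsewhere the standard ULA offers C_C_ and the enhanced ULA CCCC.
    share = 1 if enhanced else 2
    display_line = inside_cpu + outside // share
    blank_line = CYCLES_PER_LINE // share
    return (display_lines * display_line
            + (LINES_PER_FRAME - display_lines) * blank_line)
```

For instance, cpu_bytes(256, True, False) gives the standard figure for MODE
0, 1 and 2, and cpu_bytes(192, False, True) gives the enhanced figure for
MODE 6.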
851
852 Unfortunately, the mechanism for accessing the RAM is too slow to provide data
853 within the time constraints of 2MHz operation. There is no time remaining in a
854 2MHz cycle for the CPU to receive and process any retrieved data once the
855 necessary signalling has been performed.
856
857 The only way for the CPU to be able to access the RAM quickly enough would be
858 to do away with the double 4-bit access mechanism and to have a single 8-bit
859 channel to the memory. This would require twice as many 1-bit RAM chips or a
860 different kind of RAM chip, but it would also potentially simplify the ULA.
861
862 The section on 8-bit wide RAM access discusses the possibilities around
863 changing the memory architecture, also describing the possibility of ULA
864 accesses achieving two bytes per 2MHz cycle due to the doubling of the memory
865 channel, leaving every other access free for the CPU during the display period
866 in MODE 0 to 3...
867
868 Standard display period: UUUUUUUU
869 Modified display period: UCUCUCUC
870
871 ...and consolidating accesses in MODE 4 to 6:
872
873 Standard display period: UCUCUCUC
874 Modified display period: UCCCUCCC
875
876 Together with the enhancements for non-display periods, such an "Enhanced+ ULA"
877 would perform as follows:
878
              Standard ULA  Enhanced+ ULA  % Total Bandwidth  Speedup
MODE 0, 1, 2   9728 bytes   29696 bytes    24% -> 74%         3.1
MODE 3        12288 bytes   32256 bytes    31% -> 81%         2.6
MODE 4, 5     19968 bytes   34816 bytes    50% -> 87%         1.7
MODE 6        19968 bytes   36096 bytes    50% -> 90%         1.8
884
885 Of course, the principal enhancement would be the wider memory channel, with
886 more buffering in the ULA being its contribution to this arrangement.
887
888 Enhancement: Region Blanking
889 ----------------------------
890
891 The problem of permitting character-oriented blitting in programs whilst
892 scrolling the screen by sub-character amounts could be mitigated by permitting
893 a region of the display to be blank, such as the final lines of the display.
894 Consider the following vertical scrolling by 2 bytes that would cause an
895 initial character row of 6 lines and a final character row of 2 lines:
896
897 6 lines - initial, partial character row
898 248 lines - 31 complete rows
899 2 lines - final, partial character row
900
901 If a routine were in use that wrote 8 line bitmaps to the partial character
902 row now split in two, it would be advisable to hide one of the regions in
903 order to prevent content appearing in the wrong place on screen (such as
904 content meant to appear at the top "leaking" onto the bottom). Blanking 6
905 lines would be sufficient, as can be seen from the following cases.
906
907 Scrolling up by 2 lines:
908
909 6 lines - initial, partial character row
910 240 lines - 30 complete rows
911 4 lines - part of 1 complete row
912 -----------------------------------------------------------------
913 4 lines - part of 1 complete row (hidden to maintain 250 lines)
914 2 lines - final, partial character row (hidden)
915
916 Scrolling down by 2 lines:
917
918 2 lines - initial, partial character row
919 248 lines - 31 complete rows
920 ----------------------------------------------------------
921 6 lines - final, partial character row (hidden)
922
923 Thus, in this case, region blanking would impose a 250 line display with the
924 bottom 6 lines blank.
925
926 See the description of the display suspend enhancement for a more efficient
927 way of blanking lines than merely blanking the palette whilst allowing the CPU
928 to perform useful work during the blanking period.
929
To control the blanking or suspending of lines at the top and bottom of the
display, a memory location could be dedicated to the task: the upper 4 bits
could define a blanking region of up to 15 lines at the top of the screen,
whereas the lower 4 bits could define such a region at the bottom of the
screen. If more lines were required, two locations could be employed, allowing
the top and bottom regions to occupy the entire screen.
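
A minimal Python sketch of such a single-location encoding follows. The
function names are hypothetical, and the sketch assumes that each 4-bit field
holds the line count directly, giving a range of 0 to 15 lines per region.

```python
def pack_blanking(top_lines, bottom_lines):
    """Combine top and bottom blanking counts into one register value."""
    if not (0 <= top_lines <= 15 and 0 <= bottom_lines <= 15):
        raise ValueError("each region is limited to 15 lines by its 4 bits")
    return (top_lines << 4) | bottom_lines


def unpack_blanking(value):
    """Return (top_lines, bottom_lines) from a register value."""
    return (value >> 4) & 0x0F, value & 0x0F
```

The scrolling example above, with 6 lines hidden at the bottom and none at the
top, would correspond to pack_blanking(0, 6).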
936
937 Enhancement: Screen Height Adjustment
938 -------------------------------------
939
940 The height of the screen could be configurable in order to reduce screen
941 memory consumption. This is not quite done in MODE 3 and 6 since the start of
942 the screen appears to be rounded down to the nearest page, but by reducing the
943 height by amounts more than a page, savings would be possible. For example:
944
Screen width  Depth  Height  Bytes per line  Saving in bytes  Start address
------------  -----  ------  --------------  ---------------  -------------
640           1      252     80              320              &3140 -> &3100
640           1      248     80              640              &3280 -> &3200
320           1      240     40              640              &5A80 -> &5A00
320           2      240     80              1280             &3500
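
The entries above can be checked with a short Python sketch. It assumes that
screen memory ends at &8000, that the full height is 256 lines, and that the
start address is rounded down to a 256-byte page; the function name is
hypothetical.

```python
SCREEN_END = 0x8000
PAGE = 256


def reduced_screen(width, depth, height, full_height=256):
    """Return (bytes per line, saving, exact start, page-aligned start)."""
    bytes_per_line = width * depth // 8
    saving = (full_height - height) * bytes_per_line
    start = SCREEN_END - height * bytes_per_line
    page_start = start - (start % PAGE)   # round down to a page boundary
    return bytes_per_line, saving, start, page_start
```

Only reductions that move the rounded start address by at least a page
release memory in practice, which is why the first row saves 320 bytes in
principle but only 256 bytes (&3000 to &3100) after rounding.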
951
952 Screen Mode Selection
953 ---------------------
954
Bits 3, 4 and 5 of address &FE*7 control the selected screen mode. For a wider
range of modes, the other bits of &FE*7 (related to sound, cassette
input/output and the Caps Lock LED) would need to be reassigned, with bit 0
potentially being made available for use.
959
960 Enhancement: Palette Definition
961 -------------------------------
962
963 Since all memory accesses go via the ULA, an enhanced ULA could employ more
964 specific addresses than &FE*X to perform enhanced functions. For example, the
965 palette control is done using &FE*8-F and merely involves selecting predefined
966 colours, whereas an enhanced ULA could support the redefinition of all 16
967 colours using specific ranges such as &FE18-F (colours 0 to 7) and &FE28-F
968 (colours 8 to 15), where a single byte might provide 8 bits per pixel colour
969 specifications similar to those used on the Archimedes.
970
971 The principal limitation here is actually the hardware: the Electron has only
972 a single output line for each of the red, green and blue channels, and if
973 those outputs are strictly digital and can only be set to a "high" and "low"
974 value, then only the existing eight colours are possible. If a modern ULA were
975 able to output analogue values (or values at well-defined points between the
976 high and low values, such as the half-on value supported by the Amstrad CPC
977 series), it would still need to be assessed whether the circuitry could
978 successfully handle and propagate such values. Various sources indicate that
979 only "TTL levels" are supported by the RGB output circuit, and since there are
980 74LS08 AND logic gates involved in the RGB component outputs from the ULA, it
981 is likely that the ULA is expected to provide only "high" or "low" values.
982
983 Short of adding extra outputs from the ULA (either additional red, green and
984 blue outputs or a combined intensity output), another approach might involve
985 some kind of modulation where an output value might be encoded in multiple
986 pulses at a higher frequency than the pixel frequency. However, this would
987 demand additional circuitry outside the ULA, and component RGB monitors would
988 probably not be able to take advantage of this feature; only UHF and composite
989 video devices (the latter with the composite video colour support enabled on
990 the Electron's circuit board) would potentially benefit.
991
992 Flashing Colours
993 ----------------
994
995 According to the Advanced User Guide, "The cursor and flashing colours are
996 entirely generated in software: This means that all of the logical to physical
997 colour map must be changed to cause colours to flash." This appears to suggest
998 that the palette registers must be updated upon the flash counter - read and
999 written by OSBYTE &C1 (193) - reaching zero and that some way of changing the
1000 colour pairs to be any combination of colours might be possible, instead of
1001 having colour complements as pairs.
1002
1003 It is conceivable that the interrupt code responsible does the simple thing
1004 and merely inverts the current values for any logical colours (LC) for which
1005 the associated physical colour (as supplied as the second parameter to the VDU
1006 19 call) has the top bit of its four bit value set. These top bits are not
1007 recorded in the palette registers but are presumably recorded separately and
1008 used to build bitmaps as follows:
1009
LC  2 colour  4 colour  16 colour  4-bit value for inversion
--  --------  --------  ---------  -------------------------
 0  00010001  00010001  00010001   1, 1, 1
 1  01000100  00100010  00010001   4, 2, 1
 2            01000100  00100010   4, 2
 3            10001000  00100010   8, 2
 4                      00010001   1
 5                      00010001   1
 6                      00100010   2
 7                      00100010   2
 8                      01000100   4
 9                      01000100   4
10                      10001000   8
11                      10001000   8
12                      01000100   4
13                      01000100   4
14                      10001000   8
15                      10001000   8
1028
1029 Inversion value calculation:
1030
2 colour formula:  1 << (colour * 2)
4 colour formula:  1 << colour
16 colour formula: 1 << (((colour & 2) >> 1) + ((colour & 8) >> 2))
1034
1035 For example, where logical colour 0 has been mapped to a physical colour in
1036 the range 8 to 15, a bitmap of 00010001 would be chosen as its contribution to
1037 the inversion operation. (The lower three bits of the physical colour would be
1038 used to set the underlying colour information affected by the inversion
1039 operation.)
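
The table values can be reproduced with the following Python sketch. The
function names are hypothetical, and the 16 colour expression is the one that
yields the 4-bit values listed in the table (1 for logical colour 0, 2 for
logical colour 2, 4 for logical colour 8 and 8 for logical colour 10).

```python
def inversion_value(colour, colours):
    """4-bit inversion value for a logical colour in a 2, 4 or 16 colour mode."""
    if colours == 2:
        return 1 << (colour * 2)
    if colours == 4:
        return 1 << colour
    if colours == 16:
        return 1 << (((colour & 2) >> 1) + ((colour & 8) >> 2))
    raise ValueError("colours must be 2, 4 or 16")


def inversion_bitmap(colour, colours):
    """8-bit bitmap: the 4-bit value repeated in both halves of the byte."""
    value = inversion_value(colour, colours)
    return value | (value << 4)
```

For example, inversion_bitmap(3, 4) produces 10001000 and
inversion_bitmap(10, 16) produces the same bitmap, in agreement with the
table.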
1040
1041 An operation in the interrupt code would then combine the bitmaps for all
1042 logical colours in 2 and 4 colour modes, with the 16 colour bitmaps being
1043 combined for groups of logical colours as follows:
1044
1045 Logical colours
1046 ---------------
1047 0, 2, 8, 10
1048 4, 6, 12, 14
1049 5, 7, 13, 15
1050 1, 3, 9, 11
1051
1052 These combined bitmaps would be EORed with the existing palette register
1053 values in order to perform the value inversion necessary to produce the
1054 flashing effect.
1055
1056 Thus, in the VDU 19 operation, the appropriate inversion value would be
1057 calculated for the logical colour, and this value would then be combined with
1058 other inversion values in a dedicated memory location corresponding to the
1059 colour's group as indicated above. Meanwhile, the palette channel values would
1060 be derived from the lower three bits of the specified physical colour and
1061 combined with other palette data in dedicated memory locations corresponding
1062 to the palette registers.
1063
1064 Interestingly, although flashing colours on the BBC Micro are controlled by
1065 toggling bit 0 of the &FE20 control register location for the Video ULA, the
1066 actual colour inversion is done in hardware.
1067
1068 Enhancement: Palette Definition Lists
1069 -------------------------------------
1070
1071 It can be useful to redefine the palette in order to change the colours
1072 available for a particular region of the screen, particularly in modes where
1073 the choice of colours is constrained, and if an increased colour depth were
1074 available, palette redefinition would be useful to give the illusion of more
1075 than 16 colours in MODE 2. Traditionally, palette redefinition has been done
1076 by using interrupt-driven timers, but a more efficient approach would involve
1077 presenting lists of palette definitions to the ULA so that it can change the
1078 palette at a particular display line.
1079
1080 One might define a palette redefinition list in a region of memory and then
1081 communicate its contents to the ULA by writing the address and length of the
1082 list, along with the display line at which the palette is to be changed, to
1083 ULA registers such that the ULA buffers the list and performs the redefinition
1084 at the appropriate time. Throughput/bandwidth considerations might impose
1085 restrictions on the practical length of such a list, however.
1086
1087 A simple form of palette definition might be useful in text modes. Within the
1088 blank region between lines, the foreground palette could be changed to apply
1089 to the next line. Palette values could be read from a table in RAM, perhaps
1090 preceding the screen data, with 24 2-byte entries providing palette
1091 redefinition support in 2- and 4-colour modes.
1092
1093 Enhancement: Display Synchronisation Interrupts
1094 -----------------------------------------------
1095
1096 When completing each scanline of the display, the ULA could trigger an
1097 interrupt. Since this might impact system performance substantially, the
1098 feature would probably need to be configurable, and it might be sufficient to
1099 have an interrupt only after a certain number of display lines instead.
1100 Permitting the CPU to take action after eight lines would allow palette
1101 switching and other effects to occur on a character row basis.
1102
1103 The ULA provides an interrupt at the end of the display period, presumably so
1104 that software can schedule updates to the screen, avoid flickering or tearing,
1105 and so on. However, some applications might benefit from an interrupt at, or
1106 just before, the start of the display period so that palette modifications or
1107 similar effects could be scheduled.
1108
1109 Enhancement: Palette-Free Modes
1110 -------------------------------
1111
1112 Palette-free modes might be defined where bit values directly correspond to
1113 the red, green and blue channels, although this would mostly make sense only
1114 for modes with depths greater than the standard 4 bits per pixel, and such
1115 modes would require more memory than MODE 2 if they were to have an acceptable
1116 resolution.
1117
1118 Enhancement: Display Suspend
1119 ----------------------------
1120
1121 Especially when writing to the screen memory, it could be beneficial to be
1122 able to suspend the ULA's access to the memory, instead producing blank values
1123 for all screen pixels until a program is ready to reveal the screen. This is
1124 different from palette blanking since with a blank palette, the ULA is still
1125 reading screen memory and translating its contents into pixel values that end
1126 up being blank.
1127
1128 This function is reminiscent of a capability of the ZX81, albeit necessary on
1129 that hardware to reduce the load on the system CPU which was responsible for
1130 producing the video output. By allowing display suspend on the Electron, the
1131 performance benefit would be derived from giving the CPU full access to the
1132 memory bandwidth.
1133
1134 Note that since the CPU is only able to access RAM at 1MHz, there is no
1135 possibility to improve performance beyond that achieved in MODE 4, 5 or 6
1136 normally. However, if faster RAM access were to be made possible (see the
1137 discussion of 8-bit wide RAM access), the CPU could benefit from freeing up
1138 the ULA's access slots entirely.
1139
1140 The region blanking feature mentioned above could be implemented using this
1141 enhancement instead of employing palette blanking for the affected lines of
1142 the display.
1143
1144 Enhancement: Memory Filling
1145 ---------------------------
1146
A capability that could be given to an enhanced ULA is that of permitting the
ULA to write to screen memory as well as being able to read from it. Although
such a capability would probably not be useful in conjunction with the
existing read operations when producing a screen display, and insufficient
bandwidth would exist to do so in high-bandwidth screen modes anyway, the
capability could be offered during a display suspend period (as described
above), permitting a more efficient mechanism to rapidly fill memory with a
predetermined value.
1155
1156 This capability could also support block filling, where the limits of the
1157 filled memory would be defined by the position and size of a screen area,
1158 although this would demand the provision of additional registers in the ULA to
1159 retain the details of such areas and additional logic to control the fill
1160 operation.
1161
1162 Enhancement: Region Filling
1163 ---------------------------
1164
1165 An alternative to memory writing might involve indicating regions using
1166 additional registers or memory where the ULA fills regions of the screen with
1167 content instead of reading from memory. Unlike hardware sprites which should
1168 realistically provide varied content, region filling could employ single
1169 colours or patterns, and one advantage of doing so would be that the ULA need
1170 not access memory at all within a particular region.
1171
1172 Regions would be defined on a row-by-row basis. Instead of reading memory and
1173 blitting a direct representation to the screen, the ULA would read region
1174 definitions containing a start column, region width and colour details. There
1175 might be a certain number of definitions allowed per row, or the ULA might
1176 just traverse an ordered list of such definitions with each one indicating the
1177 row, start column, region width and colour details.
1178
1179 One could even compress this information further by requiring only the row,
1180 start column and colour details with each subsequent definition terminating
1181 the effect of the previous one. However, one would also need to consider the
1182 convenience of preparing such definitions and whether efficient access to
1183 definitions for a particular row might be desirable. It might also be
1184 desirable to avoid having to prepare definitions for "empty" areas of the
1185 screen, effectively making the definition of the screen contents employ
1186 run-length encoding and employ only colour plus length information.
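
As an illustration of the run-length form only, the following Python sketch
expands a list of (colour, length) pairs into the pixel values for one
scanline. The function name, the pair ordering and the use of a background
colour for any unfilled remainder are all assumptions made for the example.

```python
def expand_row(runs, width, background=0):
    """Expand (colour, length) runs into one scanline of pixel colours."""
    pixels = []
    for colour, length in runs:
        pixels.extend([colour] * length)
    if len(pixels) > width:
        raise ValueError("runs exceed the scanline width")
    # Any remainder of the line is treated as an "empty" area.
    pixels.extend([background] * (width - len(pixels)))
    return pixels
```

A row of alternating single pixels would need one run per pixel in this form,
which shows why such content is costly to represent in this way.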
1187
1188 One application of region filling is that of simple 2D and 3D shape rendering.
1189 Although it is entirely possible to plot such shapes to the screen and have
1190 the ULA blit the memory contents to the screen, such operations consume
1191 bandwidth both in the initial plotting and in the final transfer to the
1192 screen. Region filling would reduce such bandwidth usage substantially.
1193
1194 This way of representing screen images would make certain kinds of images
1195 unfeasible to represent - consider alternating single pixel values which could
1196 easily occur in some character bitmaps - even if an internal queue of regions
1197 were to be supported such that the ULA could read ahead and buffer such
1198 "bandwidth intensive" areas. Thus, the ULA might be better served providing
1199 this feature for certain areas of the display only as some kind of special
1200 graphics window.
1201
1202 Enhancement: Hardware Sprites
1203 -----------------------------
1204
An enhanced ULA might provide hardware sprites, but this would be done in a
way that is incompatible with the standard ULA, since no &FE*X locations are
available for allocation. To keep the facility simple, hardware sprites would
have a standard byte width and height.
1209
1210 The specification of sprites could involve the reservation of 16 locations
1211 (for example, &FE20-F) specifying a fixed number of eight sprites, with each
1212 location pair referring to the sprite data. By limiting the ULA to dealing
1213 with a fixed number of sprites, the work required inside the ULA would be
1214 reduced since it would avoid having to deal with arbitrary numbers of sprites.
1215
1216 The principal limitation on providing hardware sprites is that of having to
1217 obtain sprite data, given that the ULA is usually required to retrieve screen
1218 data, and given the lack of memory bandwidth available to retrieve sprite data
1219 (particularly from multiple sprites supposedly at the same position) and
1220 screen data simultaneously. Although the ULA could potentially read sprite
1221 data and screen data in alternate memory accesses in screen modes where the
1222 bandwidth is not already fully utilised, this would result in a degradation of
1223 performance.
1224
1225 Enhancement: Additional Screen Mode Configurations
1226 --------------------------------------------------
1227
1228 Alternative screen mode configurations could be supported. The ULA has to
1229 produce 640 pixel values across the screen, with pixel doubling or quadrupling
1230 employed to fill the screen width:
1231
Screen width  Columns  Scaling  Depth  Bytes
------------  -------  -------  -----  -----
640           80       x1       1      80
320           40       x2       1, 2   40, 80
160           20       x4       2, 4   40, 80
1237
1238 It must also use at most 80 byte-sized memory accesses to provide the
1239 information for the display. Given that characters must occupy an 8x8 pixel
1240 array, if a configuration featuring anything other than 20, 40 or 80 character
1241 columns is to be supported, compromises must be made such as the introduction
1242 of blank pixels either between characters (such as occurs between rows in MODE
1243 3 and 6) or at the end of a scanline (such as occurs at the end of the frame
1244 in MODE 3 and 6). Consider the following configuration:
1245
Screen width  Columns  Scaling  Depth  Bytes   Blank
------------  -------  -------  -----  ------  -----
208           26       x3       1, 2   26, 52  16
1249
1250 Here, if the ULA can triple pixels, a 26 column mode with either 2 or 4
1251 colours could be provided, with 16 blank pixel values (out of a total of 640)
1252 generated either at the start or end (or split between the start and end) of
1253 each scanline.
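
The arithmetic behind such configurations might be sketched in Python as
follows, assuming 8-pixel-wide characters, 640 output pixel values per
scanline and a limit of 80 bytes per scanline as described above. The function
name is hypothetical.

```python
OUTPUT_PIXELS = 640
MAX_BYTES = 80


def configuration(columns, scaling, depth):
    """Return (screen width, bytes per scanline, blank output pixels)."""
    width = columns * 8               # 8 pixels per character column
    used = width * scaling            # output pixel values actually driven
    bytes_per_line = columns * depth  # width * depth / 8
    if used > OUTPUT_PIXELS or bytes_per_line > MAX_BYTES:
        raise ValueError("configuration exceeds the scanline limits")
    return width, bytes_per_line, OUTPUT_PIXELS - used
```

The 26 column case gives 208 pixels scaled to 624 output values, leaving the
16 blank values mentioned above.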
1254
1255 Enhancement: Character Attributes
1256 ---------------------------------
1257
1258 The BBC Micro MODE 7 employs something resembling character attributes to
1259 support teletext displays, but depends on circuitry providing a character
1260 generator. The ZX Spectrum, on the other hand, provides character attributes
1261 as a means of colouring bitmapped graphics. Although such a feature is very
1262 limiting as the sole means of providing multicolour graphics, in situations
1263 where the choice is between low resolution multicolour graphics or high
1264 resolution monochrome graphics, character attributes provide a potentially
1265 useful compromise.
1266
1267 For each byte read, the ULA must deliver 8 pixel values (out of a total of
1268 640) to the video output, doing so by either emptying its pixel buffer on a
1269 pixel per cycle basis, or by multiplying pixels and thus holding them for more
1270 than one cycle. For example for a screen mode having 640 pixels in width:
1271
Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Reads:   B                       B
Pixels:  0  1  2  3  4  5  6  7  0  1  2  3  4  5  6  7
1275
1276 And for a screen mode having 320 pixels in width:
1277
Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Reads:   B
Pixels:  0  0  1  1  2  2  3  3  4  4  5  5  6  6  7  7
1281
However, in modes where fewer than 80 bytes are required to generate the pixel
values, an enhanced ULA might be able to read additional bytes between those
providing the bitmapped graphics data:
1285
Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Reads:   B                       A
Pixels:  0  0  1  1  2  2  3  3  4  4  5  5  6  6  7  7
1289
1290 These additional bytes could provide colour information for the bitmapped data
1291 in the following character column (of 8 pixels). Since it would be desirable
1292 to apply attribute data to the first column, the initial 8 cycles might be
1293 configured to not produce pixel values.
1294
1295 For an entire character, attribute data need only be read for the first row of
1296 pixels for a character. The subsequent rows would have attribute information
1297 applied to them, although this would require the attribute data to be stored
1298 in some kind of buffer. Thus, the following access pattern would be observed:
1299
1300 Reads: A B _ B _ B _ B _ B _ B _ B _ B ...
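
The access pattern above can be sketched in Python. This is only an
illustration under stated assumptions: the function names are hypothetical,
characters are assumed to be eight pixel rows high, and each bitmap read is
assumed to be preceded by one spare read slot.

```python
def character_reads(rows=8):
    """Return the reads made for one character column over its pixel rows.

    Assumption: each bitmap byte (B) is preceded by one spare slot, used for
    an attribute byte (A) on the first pixel row and left idle (_) afterwards.
    """
    reads = []
    for row in range(rows):
        reads.append("A" if row == 0 else "_")
        reads.append("B")
    return reads


def scanline_reads(columns, row_in_character):
    """Return the reads made across one scanline of `columns` characters."""
    spare = "A" if row_in_character == 0 else "_"
    return [spare, "B"] * columns


print(" ".join(character_reads()))
```

Since attributes would be fetched only on the first pixel row, a buffer of one
byte per column (40 bytes for a 40 column mode) would be needed to hold them
for the remaining seven rows.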

In modes 3 and 6, the blank display lines could be used to retrieve attribute
data:

Reads (blank):  A _ A _ A _ A _ A _ A _ A _ A _ ...
Reads (active): B _ B _ B _ B _ B _ B _ B _ B _ ...
Reads (active): B _ B _ B _ B _ B _ B _ B _ B _ ...
...

See below for a discussion of using this for character data as well.

A whole byte used for colour information for a whole character would result in
a choice of 256 colours, and this might be somewhat excessive. By only reading
attribute bytes at every other opportunity, each attribute byte could instead
supply a choice of 16 colours (4 bits) to each of two characters:

Cycle:  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
Reads:  B               A               B               -
Pixels: 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7

Further reductions in attribute data access, offering 4 colours for every
character in a four character block, for example, might also be worth
considering.
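
The sharing of one attribute byte between several characters might be
sketched as follows. The function name is hypothetical, and the bit ordering
(leftmost character in the most significant bits) is an assumption; the
document does not specify how the resulting values map onto foreground and
background colours.

```python
def unpack_attributes(attribute_byte, characters):
    """Split one attribute byte evenly between `characters` columns.

    Returns one colour value per character, each 8 // characters bits wide.
    Assumption: the leftmost character takes the most significant bits.
    """
    if characters not in (1, 2, 4, 8):
        raise ValueError("characters must divide the byte evenly")
    bits = 8 // characters
    mask = (1 << bits) - 1
    return [
        (attribute_byte >> (8 - bits * (index + 1))) & mask
        for index in range(characters)
    ]


# One byte per character: 256 values; per two: 16 each; per four: 4 each.
for characters in (1, 2, 4):
    print(characters, 2 ** (8 // characters), unpack_attributes(0xA5, characters))
```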

Consider the following configurations for screen modes with a colour depth of
1 bit per pixel for bitmap information:

Screen width  Columns  Scaling  Bytes (B)  Bytes (A)  Colours  Screen start
------------  -------  -------  ---------  ---------  -------  --------------
320           40       x2       40         40         256      &5300
320           40       x2       40         20         16       &5580 -> &5500
320           40       x2       40         10         4        &56C0 -> &5600
208           26       x3       26         26         256      &62C0 -> &6200
208           26       x3       26         13         16       &6460 -> &6400
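
The screen start addresses in the table can be reproduced with a short
calculation. The sketch below assumes 256 scanlines of bitmap data, 32
character rows of attribute data (read once per character row), screen memory
ending at &8000, and rounding down to a page boundary; these assumptions are
inferred from the figures rather than stated in the text, and the names are
hypothetical.

```python
SCREEN_END = 0x8000   # assumption: screen memory extends up to &8000
SCANLINES = 256       # assumption: bitmap rows per screen
CHARACTER_ROWS = 32   # assumption: SCANLINES // 8 rows of attributes


def attribute_mode_start(bitmap_bytes, attribute_bytes):
    """Return (exact start, page-aligned start) for one configuration.

    bitmap_bytes: bitmap bytes (B) per scanline
    attribute_bytes: attribute bytes (A) per character row
    """
    size = bitmap_bytes * SCANLINES + attribute_bytes * CHARACTER_ROWS
    start = SCREEN_END - size
    return start, start & 0xFF00


for b, a in [(40, 40), (40, 20), (40, 10), (26, 26), (26, 13)]:
    exact, aligned = attribute_mode_start(b, a)
    print(f"B={b:2} A={a:2} &{exact:04X} -> &{aligned:04X}")
```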

Enhancement: Text-Only Modes using Character and Attribute Data
---------------------------------------------------------------

In modes 3 and 6, the blank display lines could be used to retrieve character
and attribute data instead of trying to insert it between bitmap data accesses,
but this data would then need to be retained:

Reads: A C A C A C A C A C A C A C A C ...
Reads: B _ B _ B _ B _ B _ B _ B _ B _ ...

Only attribute (A) and character (C) reads would require screen memory
storage. Bitmap data reads (B) would either involve accesses to memory to
obtain character definition details or could, at the cost of special storage
in the ULA, involve accesses within the ULA, thereby leaving the RAM free.
However, the CPU would not benefit from having any extra access slots due to
the limitations of the RAM access mechanism.

A scheme without caching might be possible. The same line of memory addresses
might be visited over and over again for eight display lines, with an index
into the bitmap data being incremented from zero to seven. The access patterns
would look like this:

Reads: C B C B C B C B C B C B C B C B ... (generate data from index 0)
Reads: C B C B C B C B C B C B C B C B ... (generate data from index 1)
Reads: C B C B C B C B C B C B C B C B ... (generate data from index 2)
Reads: C B C B C B C B C B C B C B C B ... (generate data from index 3)
Reads: C B C B C B C B C B C B C B C B ... (generate data from index 4)
Reads: C B C B C B C B C B C B C B C B ... (generate data from index 5)
Reads: C B C B C B C B C B C B C B C B ... (generate data from index 6)
Reads: C B C B C B C B C B C B C B C B ... (generate data from index 7)

The bandwidth requirements would be the sum of the accesses to read the
character values (repeatedly) and those to read the bitmap data to reproduce
the characters on screen.
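
A minimal sketch of the uncached scheme follows. The layout of the character
definitions (eight consecutive bytes per character code, starting at a
hypothetical font_base address) is an assumption made for illustration, as
are the function and parameter names.

```python
def text_row_reads(row_base, font_base, columns, memory):
    """Yield (kind, address) pairs for the eight display lines of a text row.

    row_base: address of the first character code in the row
    font_base: hypothetical address of the character definitions
    memory: mapping from address to byte value (the character codes)
    """
    for index in range(8):                   # bitmap line within the character
        for column in range(columns):
            c_address = row_base + column    # same addresses on every line
            yield ("C", c_address)
            code = memory[c_address]
            # Assumption: eight bytes per character, one per display line.
            yield ("B", font_base + code * 8 + index)


memory = {0x7C00: 65, 0x7C01: 66}
reads = list(text_row_reads(0x7C00, 0x0C00, 2, memory))
print(" ".join(kind for kind, _ in reads[:4]), "...", len(reads), "reads")
```

Each column costs two reads on every display line, so a 40 column mode would
need 80 reads per display line, the same as the 80 bytes per scanline of the
most demanding existing modes.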

Enhancement: MODE 7 Emulation using Character Attributes
--------------------------------------------------------

If the scheme of applying attributes to character regions were employed to
emulate MODE 7, in conjunction with the MODE 6 display technique, the
following configuration would be required:

Screen width  Columns  Rows  Bytes (B)  Bytes (A)  Colours  Screen start
------------  -------  ----  ---------  ---------  -------  --------------
320           40       25    40         20         16       &5ECC -> &5E00
320           40       25    40         10         4        &5FC6 -> &5F00

Although this requires much more memory than MODE 7 (8500 bytes with 16
colours versus MODE 7's 1000 bytes), it does not need much more memory than
MODE 6, and it would at least make a limited 40-column multicolour mode
available as a substitute for MODE 7.

Using the text-only enhancement with caching of data or with repeated reads of
the same character data line for eight display lines, the storage requirements
would be diminished substantially:

Screen width  Columns  Rows  Bytes (C)  Bytes (A)  Colours  Screen start
------------  -------  ----  ---------  ---------  -------  --------------
320           40       25    40         20         16       &7A24 -> &7A00
320           40       25    40         10         4        &7B1E -> &7B00
320           40       25    40         5          2        &7B9B -> &7B00
320           40       25    40         0          (2)      &7C18 -> &7C00
640           80       25    80         40         16       &7448 -> &7400
640           80       25    80         20         4        &763C -> &7600
640           80       25    80         10         2        &7736 -> &7700
640           80       25    80         0          (2)      &7830 -> &7800

Note that the colours describe the locally defined attributes for each
character. When no attribute information is provided, the colours are defined
globally.
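
The storage figures for these 25 row configurations follow from simple
arithmetic, sketched below. The assumption that screen memory ends at &8000,
and the function names, are not taken from the text; the bitmap case assumes
eight stored bitmap lines per character row, as with the MODE 6 technique.

```python
SCREEN_END = 0x8000  # assumption: screen memory extends up to &8000
ROWS = 25


def bitmap_mode_size(columns, attribute_bytes):
    """Bytes needed when every character row stores eight bitmap lines."""
    return (columns * 8 + attribute_bytes) * ROWS


def text_mode_size(columns, attribute_bytes):
    """Bytes needed when only character codes and attributes are stored."""
    return (columns + attribute_bytes) * ROWS


def screen_start(size):
    """Return the exact start address and the page-aligned start address."""
    start = SCREEN_END - size
    return start, start & 0xFF00


for label, size in [
    ("bitmap, 40 columns, 20 attribute bytes", bitmap_mode_size(40, 20)),
    ("text, 40 columns, 10 attribute bytes", text_mode_size(40, 10)),
    ("text, 80 columns, no attributes", text_mode_size(80, 0)),
]:
    exact, aligned = screen_start(size)
    print(f"{label}: {size} bytes, &{exact:04X} -> &{aligned:04X}")
```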

Enhancement: Character Generator Support and Vertical Scaling
-------------------------------------------------------------

When generating a picture, the ULA traverses screen memory, obtaining 40 or 80
bytes of pixel data for each scanline. It then proceeds to the next row of
pixel data for each successive scanline, with the exception of the text modes
where scanlines may be blank (for which the row address does not advance).
This arrangement provides a conventional bitmapped graphics display.

However, the ULA could instead facilitate the use of character generators. The
principles involved can be demonstrated by the Jafa Mode 7 Mark 2 Display Unit
expansion for the Electron which feeds the pixel data from a MODE 4 screen to
an SAA5050 character generator to create a MODE 7 display. The solution adopted
involves the replication of 40 bytes of character data across as many pixel
rows as is necessary for the character generator to receive the appropriate
character data for all scanlines in any given character row. If only a single
40-byte row of character data were to be present for the first scanline of a
character row, the character generator would only produce the first scanline
(or the uppermost pixels of the characters) correctly, with the rest of the
character shapes being ill-defined.

Here, the ULA could facilitate the use of memory-efficient character mode
representations (such as MODE 7) by holding the row address for a number of
scanlines, thus providing the same row of screen data for those scanlines,
then advancing to the next row. Visualised in terms of pixel data, it would be
like providing a display with a very low vertical resolution. Indeed, being
able to reduce the vertical resolution of a display mode by a factor of eight
or ten would be equivalent to the above character generation technique in
terms of the ULA's screen reading activities.
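
Holding the row address can be sketched as follows. The function and
parameter names are hypothetical, and the start address used in the example
is merely illustrative.

```python
def row_addresses(screen_start, bytes_per_row, rows, scale):
    """Yield the row start address presented on each successive scanline.

    Each row of screen data is held for `scale` scanlines before the
    address advances, giving a vertical scaling by that factor.
    """
    for scanline in range(rows * scale):
        yield screen_start + (scanline // scale) * bytes_per_row


# 25 rows of 40 bytes, each held for ten scanlines (250 scanlines in total).
addresses = list(row_addresses(0x7C00, 40, 25, 10))
print(len(addresses), hex(addresses[0]), hex(addresses[9]), hex(addresses[10]))
```

With a scale of one, the same function describes the conventional bitmapped
arrangement in which the row address advances on every scanline.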

By combining this vertical scaling or scanline replication with a circuit
switchable between bitmapped graphics output and character graphics output,
MODE 7 support could be made available, potentially as a hardware option
separate from the ULA.

Enhancement: Compressed Character Data
--------------------------------------

Another observation about text-only modes is that they only need to store a
restricted set of bitmapped data values. Encoding this set of values in a
smaller unit of storage than a byte could possibly help to reduce the amount
of storage and bandwidth required to reproduce the characters on the display.

Enhancement: High Resolution Graphics
-------------------------------------

Screen modes with higher resolutions and larger colour depths might be
possible, but this would in most cases involve the allocation of more screen
memory, and the ULA would probably then be obliged to page in such memory for
the CPU to be able to sensibly access it all.

Enhancement: Genlock Support
----------------------------

The ULA generates a video signal in conjunction with circuitry producing the
output features necessary for the correct display of the screen image.
However, it appears that the ULA drives the video synchronisation mechanism
instead of reacting to an existing signal. Genlock support might be possible
if the ULA were made to be responsive to such external signals, resetting its
address generators upon receiving synchronisation events.

Enhancement: Improved Sound
---------------------------

The standard ULA reserves &FE*6 for sound generation and cassette input/output
(with bits 1 and 2 of &FE*7 being used to select either sound generation or
cassette I/O), thus making it impossible to support multiple channels within
the given framework. The BBC Micro employs &FE40-&FE4F for sound control, and
an enhanced ULA could adopt this interface.

The BBC Micro uses the SN76489 chip to produce sound, and the entire
functionality of this chip could be emulated for enhanced sound, with a subset
of the functionality exposed via the &FE*6 interface.

See: http://en.wikipedia.org/wiki/Texas_Instruments_SN76489
See: http://www.smspower.org/Development/SN76489

Enhancement: Waveform Upload
----------------------------

As with a hardware sprite function, waveforms could be uploaded or referenced
using memory-mapped locations as registers that reference memory regions.

Enhancement: Sound Input/Output
-------------------------------

Since the ULA already controls audio input/output for cassette-based data, it
would have been interesting to entertain the idea of sampling and output of
sounds through the cassette interface. However, a significant amount of
circuitry is employed to process the input signal for use by the ULA and to
process the output signal for recording.

See: http://bbc.nvg.org/doc/A%20Hardware%20Guide%20for%20the%20BBC%20Microcomputer/bbc_hw_03.htm#3.11

Enhancement: BBC ULA Compatibility
----------------------------------

Although some new ULA functions could be defined in a way that is also
compatible with the BBC Micro, the BBC ULA is itself incompatible with the
Electron ULA: &FE00-7 is reserved for the video controller in the BBC memory
map, but controls various functions specific to the 6845 video controller;
&FE08-F is reserved for the serial controller. It therefore becomes possible
to disregard compatibility where compatibility is already disregarded for a
particular area of functionality.

&FE20-F maps to video ULA functionality on the BBC Micro which provides
control over the palette (using address &FE21, compared to &FE07-F on the
Electron) and other system-specific functions. Since the location usage is
generally incompatible, this region could be reused for other purposes.

Enhancement: Increased RAM, ULA and CPU Performance
---------------------------------------------------

More modern implementations of the hardware might feature faster RAM coupled
with an increased ULA clock frequency in order to increase the bandwidth
available to the ULA and to the CPU in situations where the ULA is not needed
to perform work. A ULA employing a 32MHz clock would be able to complete the
retrieval of a byte from RAM in only 250ns and thus be able to enable the CPU
to access the RAM for the following 250ns even in display modes requiring the
retrieval of a byte for the display every 500ns. The CPU could, subject to
timing issues, run at 2MHz even in MODE 0, 1 and 2.
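
The timing figures can be checked with a small calculation. The sketch below
assumes that fetching one byte occupies eight ULA clock periods, which is
consistent with the standard 16MHz ULA needing a whole 2MHz cycle (500ns) per
byte; the names are hypothetical.

```python
CYCLE_NS = 500.0  # one 2MHz cycle: the interval between display byte fetches


def byte_fetch_ns(ula_clock_hz, clocks_per_byte=8):
    """Time taken to fetch one byte (assumption: eight ULA clock periods)."""
    return clocks_per_byte * 1_000_000_000 / ula_clock_hz


def cpu_window_ns(ula_clock_hz):
    """Time left for the CPU in each 2MHz cycle after the display fetch."""
    return CYCLE_NS - byte_fetch_ns(ula_clock_hz)


for clock_hz in (16_000_000, 32_000_000):
    print(clock_hz // 1_000_000, "MHz:",
          byte_fetch_ns(clock_hz), "ns fetch,",
          cpu_window_ns(clock_hz), "ns left for the CPU")
```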

A scheme such as that described above would have a similar effect to the
scheme employed in the BBC Micro, although the latter made use of RAM with a
wider bandwidth in order to complete memory transfers within 250ns and thus
permit the CPU to run continuously at 2MHz.

Higher bandwidth could potentially be used to implement exotic features such
as RAM-resident hardware sprites or indeed any feature demanding RAM access
concurrent with the production of the display image.

Enhancement: Multiple CPU Stacks and Zero Pages
-----------------------------------------------

The 6502 maintains a stack for subroutine calls and register storage in page
&01. Although the stack pointer can be manipulated using the TSX and TXS
instructions, thereby permitting the maintenance of multiple stack regions and
thus the potential coexistence of multiple programs each using a separate
region, only programs that make little use of the stack (perhaps avoiding
deeply-nested subroutine invocations and significant register storage) would
be able to coexist without overwriting each other's stacks.

One way that this issue could be alleviated would involve the provision of a
facility to redirect accesses to page &01 to other areas of memory. The ULA
would provide a register that defines a physical page for the use of the CPU's
"logical" page &01, and upon any access to page &01 by the CPU, the ULA would
change the asserted address lines to redirect the access to the appropriate
physical region.

By providing an 8-bit register, mapping to the most significant byte (MSB) of
a 16-bit address, the ULA could then replace any MSB equal to &01 with the
register value before the access is made. Where multiple programs coexist,
upon switching programs, the register would be updated to point the ULA to the
appropriate stack location, thus providing a simple memory management unit
(MMU) capability.

In a similar fashion, zero page accesses could also be redirected so that code
could run from sideways RAM and have zero page operations redirected to "upper
memory" - for example, to page &BE (with stack accesses redirected to page
&BF, perhaps) - thereby permitting most CPU operations to occur without
inadvertent accesses to "lower memory" (the RAM) which would risk stalling the
CPU as it contends with the ULA for memory access.

Such facilities could also be provided by a separate circuit between the CPU
and ULA in a fashion similar to that employed by a "turbo" board, but unlike
such boards, no additional RAM would be provided: all memory accesses would
occur as normal through the ULA, albeit redirected when configured
appropriately.
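
The redirection of stack and zero page accesses might be modelled as shown
below. The class and attribute names are hypothetical, and the sketch covers
only the address translation, not the register interface through which the
CPU would configure it.

```python
class PageRedirector:
    """Model of redirecting logical pages &00 and &01 to physical pages."""

    def __init__(self, zero_page=0x00, stack_page=0x01):
        # The defaults leave both pages where the 6502 expects them.
        self.zero_page = zero_page
        self.stack_page = stack_page

    def translate(self, address):
        """Return the physical address for a 16-bit CPU address."""
        msb, lsb = address >> 8, address & 0xFF
        if msb == 0x00:
            msb = self.zero_page      # zero page access
        elif msb == 0x01:
            msb = self.stack_page     # stack access
        return (msb << 8) | lsb


mmu = PageRedirector(zero_page=0xBE, stack_page=0xBF)
print(hex(mmu.translate(0x0070)), hex(mmu.translate(0x01FF)),
      hex(mmu.translate(0x2000)))
```

Switching between programs would then amount to updating the two page values,
with all other addresses passing through unchanged.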

ULA Pin Functions
-----------------

The functions of the ULA pins are described in the Electron Service Manual. Of
interest to video processing are the following:

CSYNC (low during horizontal or vertical synchronisation periods, high
otherwise)

HS (low during horizontal synchronisation periods, high otherwise)

RED, GREEN, BLUE (pixel colour outputs)

CLOCK IN (a 16MHz clock input, 4V peak to peak)

PHI OUT (a 1MHz, 2MHz or stopped clock signal for the CPU)

More general memory access pins:

RAM0...RAM3 (data lines to/from the RAM)

RA0...RA7 (address lines for sending both row and column addresses to the RAM)

RAS (row address strobe setting the row address on a negative edge - see the
timing notes)

CAS (column address strobe setting the column address on a negative edge -
see the timing notes)

WE (sets write enable with logic 0, read with logic 1)

ROM (select data access from ROM)

CPU-oriented memory access pins:

A0...A15 (CPU address lines)

PD0...PD7 (CPU data lines)

R/W (indicates CPU write with logic 0, CPU read with logic 1)

Interrupt-related pins:

NMI (CPU request for uninterrupted 1MHz access to memory)

IRQ (signal event to CPU)

POR (power-on reset, resetting the ULA on a positive edge and asserting the
CPU's RST pin)

RST (master reset for the CPU signalled on power-up and by the Break key)

Keyboard-related pins:

KBD0...KBD3 (keyboard inputs)

CAPS LOCK (control status LED)

Sound-related pins:

SOUND O/P (sound output using internal oscillator)

Cassette-related pins:

CAS IN (cassette circuit input, between 0.5V and 2V peak to peak)

CAS OUT (pseudo-sinusoidal output, 1.8V peak to peak)

CAS RC (detect high tone)

CAS MO (motor relay output)

÷13 IN (~1200 baud clock input)

ULA Socket
----------

The socket used for the ULA is a 3M/TexTool 268-5400 68-pin socket.

References
----------

See: http://bbc.nvg.org/doc/A%20Hardware%20Guide%20for%20the%20BBC%20Microcomputer/bbc_hw.htm

About this Document
-------------------

The most recent version of this document and accompanying distribution should
be available from the following location:

http://hgweb.boddie.org.uk/ULA

Copyright and licence information can be found in the docs directory of this
distribution - see docs/COPYING.txt for more information.