1 The Acorn Electron ULA
2 ======================
3
4 Principal Design and Feature Constraints
5 ----------------------------------------
6
7 The features of the ULA are limited by the amount of time and resources that
8 can be allocated to each activity necessary to support such features given the
9 fundamental obligations of the unit. Maintaining a screen display based on the
10 contents of RAM itself requires the ULA to have exclusive access to such
11 hardware resources for a significant period of time. Whilst other elements of
12 the ULA can in principle run in parallel with this activity, they cannot also
13 access the RAM. Consequently, other features that might use the RAM must
14 accept a reduced allocation of that resource in comparison to a hypothetical
15 architecture where concurrent RAM access is possible.
16
17 Thus, the principal constraint for many features is bandwidth. The duration of
18 access to hardware resources is one aspect of this; the rate at which such
19 resources can be accessed is another. For example, the RAM is not fast enough
20 to support access more frequently than one byte per 2MHz cycle, and for screen
21 modes involving 80 bytes of screen data per scanline, there are no free cycles
22 for anything other than the production of pixel output during the active
23 scanline periods.
24
25 Timing
26 ------
27
28 According to 15.3.2 in the Advanced User Guide, there are 312 scanlines, 256
29 of which are used to generate pixel data. At 50Hz, this means that 128 cycles
30 are spent on each scanline (2000000 cycles / 50 = 40000 cycles; 40000 cycles /
31 312 ~= 128 cycles). This is consistent with the observation that each scanline
32 requires at most 80 bytes of data, and that the ULA is apparently busy for 40
33 out of 64 microseconds in each scanline.
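
The arithmetic above can be checked with a short Python sketch. The constant
and function names are illustrative only and do not come from the Advanced
User Guide; the figures are those quoted in the text.

```python
# Illustrative check of the scanline timing figures quoted above.
RAM_CLOCK_HZ = 2_000_000   # 2MHz RAM access clock
FIELD_RATE_HZ = 50         # fields per second
LINES_PER_FIELD = 312      # scanlines per field


def cycles_per_scanline():
    """Return the number of whole 2MHz cycles available on each scanline."""
    cycles_per_field = RAM_CLOCK_HZ // FIELD_RATE_HZ    # 40000
    return cycles_per_field // LINES_PER_FIELD           # 128.2 rounded down


def scanline_duration_us():
    """Return the scanline duration in microseconds at 128 cycles per line."""
    return cycles_per_scanline() * 1_000_000 / RAM_CLOCK_HZ


def ula_busy_us(bytes_per_line=80):
    """Return the time needed to fetch a line at one byte per 2MHz cycle."""
    return bytes_per_line * 1_000_000 / RAM_CLOCK_HZ
```

With these assumptions, cycles_per_scanline() gives 128, and the scanline
duration and ULA busy time come out at 64 and 40 microseconds respectively.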
34
(Since the ULA is seeking to provide an image for an interlaced 625-line
display, there are in fact two "fields" involved, one providing 312 scanlines
and one providing 313 scanlines. See below for a description of the video
system.)
39
40 Access to RAM involves accessing four 64Kb dynamic RAM devices (IC4 to IC7,
41 each providing two bits of each byte) using two cycles within the 500ns period
42 of the 2MHz clock to complete each access operation. Since the CPU and ULA
43 have to take turns in accessing the RAM in MODE 4, 5 and 6, the CPU must
44 effectively run at 1MHz (since every other 500ns period involves the ULA
45 accessing RAM). The CPU is driven by an external clock (IC8) whose 16MHz
46 frequency is divided by the ULA (IC1) depending on the screen mode in use.
47
Each 16MHz cycle lasts 62.5ns. To access the memory, the following patterns
corresponding to 16MHz cycles are required:
50
51 Time (ns): 0-------------- 500------------ ...
52 2 MHz cycle: 0 1 ...
53 16 MHz cycle: 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 ...
54 ~RAS: 0 1 0 1 ...
55 ~CAS: 0 1 0 1 0 1 0 1 ...
56 A B C A B C ...
57 F S F S ...
58 a b c a b c ...
59
60 ~WE: ......W ...
61 PHI OUT: ______________/---------------\ ...
62 CPU: D L ...
63 RnW: R ...
64
65 Here, "A" and "B" respectively indicate the row and first column addresses
66 being latched into the RAM (on a negative edge for ~RAS and ~CAS
67 respectively), and "C" indicates the second column address being latched into
68 the RAM. Presumably, the first and second half-bytes can be read at "F" and
69 "S" respectively, and the row and column addresses must be made available at
70 "a" and "b" (and "c") respectively at the latest.
71
72 For the CPU, "L" indicates the point at which an address is taken from the CPU
73 address bus, on a negative edge of PHI OUT, with "D" being the point at which
74 data may either be read or be asserted for writing, on a positive edge of PHI
75 OUT. Here, PHI OUT is driven at 1MHz. Given that ~WE needs to be driven low
76 for writing or high for reading, and thus propagates RnW from the CPU, this
77 would need to be done before data would be retrieved and, according to the
78 TM4164EC4 datasheet, even as late as the column address is presented and ~CAS
79 brought low.
80
The TM4164EC4-15 has a row address access time of 150ns (maximum) and a column
address access time of 90ns (maximum), which appears to mean that the data
becomes available approximately two and a half 16MHz cycles after the row
address is latched (150ns / 62.5ns = 2.4), and approximately one and a half
cycles after the column address is latched (90ns / 62.5ns = 1.44).
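
As a rough check, the datasheet access times can be converted into 16MHz
cycles. This is only a sketch: the function name is hypothetical, and the
150ns and 90ns figures are the maximum values quoted above.

```python
# Convert DRAM access times into 16MHz clock cycles (illustrative only).
CLOCK_HZ = 16_000_000
CYCLE_NS = 1e9 / CLOCK_HZ   # 62.5ns per 16MHz cycle


def access_time_in_cycles(access_ns):
    """Return an access time expressed as a number of 16MHz cycles."""
    return access_ns / CYCLE_NS


row_cycles = access_time_in_cycles(150)     # row address access time
column_cycles = access_time_in_cycles(90)   # column address access time
```

Here, row_cycles is about 2.4 and column_cycles is about 1.44.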
85
86 Note that the Service Manual refers to the negative edge of RAS and CAS, but
87 the datasheet for the similar TM4164EC4 product shows latching on the negative
88 edge of ~RAS and ~CAS. It is possible that the Service Manual also intended to
89 communicate the latter behaviour. In the TM4164EC4 datasheet, it appears that
90 "page mode" provides the appropriate behaviour for that particular product.
91
92 The CPU, when accessing the RAM alone, apparently does not make use of the
93 vacated "slot" that the ULA would otherwise use (when interleaving accesses in
94 MODE 4, 5 and 6). It only employs a full 2MHz access frequency to memory when
95 accessing ROM (and potentially sideways RAM).
96
97 See: Acorn Electron Advanced User Guide
98 See: Acorn Electron Service Manual
99 http://acorn.chriswhy.co.uk/docs/Acorn/Manuals/Acorn_ElectronSM.pdf
100 See: http://mdfs.net/Docs/Comp/Electron/Techinfo.htm
101 See: http://stardot.org.uk/forums/viewtopic.php?p=120438#p120438
102
103 Bandwidth Figures
104 -----------------
105
106 Using an observation of 128 2MHz cycles per scanline, 256 active lines and 312
107 total lines, with 80 cycles occurring in the active periods of display
108 scanlines, the following bandwidth calculations can be performed:
109
  Total theoretical maximum:
         128 cycles * 312 lines
       = 39936 bytes

  MODE 0, 1, 2:
  ULA:   80 cycles * 256 lines
       = 20480 bytes
  CPU:   48 cycles / 2 * 256 lines
       + 128 cycles / 2 * (312 - 256) lines
       = 9728 bytes

  MODE 3 (25 rows of 8 lines):
  ULA:   80 cycles * 25 rows * 8 lines
       = 16000 bytes
  CPU:   48 cycles / 2 * 25 rows * 8 lines
       + 128 cycles / 2 * (312 - (25 rows * 8 lines))
       = 11968 bytes

  MODE 4, 5:
  ULA:   40 cycles * 256 lines
       = 10240 bytes
  CPU:   (40 cycles + 48 cycles / 2) * 256 lines
       + 128 cycles / 2 * (312 - 256) lines
       = 19968 bytes

  MODE 6 (25 rows of 8 lines):
  ULA:   40 cycles * 25 rows * 8 lines
       = 8000 bytes
  CPU:   (40 cycles + 48 cycles / 2) * 25 rows * 8 lines
       + 128 cycles / 2 * (312 - (25 rows * 8 lines))
       = 19968 bytes
141
Here, the division by 2 for CPU accesses is performed to indicate that the CPU
only uses every other access opportunity even in uncontended periods. See the
2MHz RAM Access enhancement below for bandwidth calculations that consider
this limitation removed.
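
The per-field figures for the standard ULA can be reproduced with a small
Python sketch. The function and constant names are hypothetical; the number of
ULA cycles per line and the number of display lines are passed in as
parameters so that each mode can be tried.

```python
# Sketch of the per-field RAM bandwidth arithmetic for the standard ULA.
CYCLES_PER_LINE = 128   # 2MHz cycles per scanline
ACTIVE_CYCLES = 80      # cycles in the active part of a display scanline
TOTAL_LINES = 312       # scanlines per field


def ula_bytes(ula_cycles, display_lines):
    """Bytes fetched by the ULA in one field."""
    return ula_cycles * display_lines


def cpu_bytes(ula_cycles, display_lines):
    """RAM accesses available to the CPU in one field.

    ula_cycles is 80 where the ULA takes every active cycle and 40 where it
    interleaves its accesses with those of the CPU.
    """
    interleaved = ACTIVE_CYCLES - ula_cycles          # CPU slots among ULA slots
    idle = (CYCLES_PER_LINE - ACTIVE_CYCLES) // 2     # every other free slot
    blank_lines = TOTAL_LINES - display_lines
    return ((interleaved + idle) * display_lines
            + (CYCLES_PER_LINE // 2) * blank_lines)
```

For example, cpu_bytes(80, 256) gives 9728 and cpu_bytes(40, 256) gives 19968,
matching the MODE 0, 1, 2 and MODE 4, 5 figures above.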
146
147 Video Timing
148 ------------
149
150 According to 8.7 in the Service Manual, and the PAL Wikipedia page,
151 approximately 4.7µs is used for the sync pulse, 5.7µs for the "back porch"
152 (including the "colour burst"), and 1.65µs for the "front porch", totalling
153 12.05µs and thus leaving 51.95µs for the active video signal for each
154 scanline. As the Service Manual suggests in the oscilloscope traces, the
155 display information is transmitted more or less centred within the active
156 video period since the ULA will only be providing pixel data for 40µs in each
157 scanline.
158
159 Each 62.5ns cycle happens to correspond to 64µs divided by 1024, meaning that
160 each scanline can be divided into 1024 cycles, although only 640 at most are
161 actively used to provide pixel data. Pixel data production should only occur
162 within a certain period on each scanline, approximately 262 cycles after the
163 start of hsync:
164
165 active video period = 51.95µs
166 pixel data period = 40µs
167 total silent period = 51.95µs - 40µs = 11.95µs
168 silent periods (before and after) = 11.95µs / 2 = 5.975µs
169 hsync and back porch period = 4.7µs + 5.7µs = 10.4µs
170 time before pixel data period = 10.4µs + 5.975µs = 16.375µs
171 pixel data period start cycle = 16.375µs / 62.5ns = 262
172
173 By choosing a number divisible by 8, the RAM access mechanism can be
174 synchronised with the pixel production. Thus, 256 is a more appropriate start
175 cycle, where the HS (horizontal sync) signal corresponding to the 4µs sync
176 pulse (or "normal sync" pulse as described by the "PAL TV timing and voltages"
177 document) occurs at cycle 0.
178
179 To summarise:
180
181 HS signal starts at cycle 0 on each horizontal scanline
182 HS signal ends approximately 4µs later at cycle 64
183 Pixel data starts approximately 12µs later at cycle 256
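
The derivation of the start cycle can be sketched in Python as follows. The
names are illustrative, and rounding down to a multiple of 8 is an assumption
made here because it yields the value of 256 chosen above; the text itself
only requires divisibility by 8.

```python
# Sketch of the horizontal timing calculation (all durations in microseconds).
CYCLE_NS = 62.5          # one 16MHz cycle
LINE_US = 64.0           # one scanline
SYNC_US = 4.7            # sync pulse
BACK_PORCH_US = 5.7      # back porch, including the colour burst
FRONT_PORCH_US = 1.65    # front porch
PIXEL_US = 40.0          # period in which the ULA provides pixel data


def pixel_start_cycle():
    """Return the 16MHz cycle at which centred pixel data would start."""
    active_us = LINE_US - (SYNC_US + BACK_PORCH_US + FRONT_PORCH_US)
    margin_us = (active_us - PIXEL_US) / 2     # silent period on each side
    start_us = SYNC_US + BACK_PORCH_US + margin_us
    return round(start_us * 1000 / CYCLE_NS)


def align_down(cycle, multiple=8):
    """Round a cycle number down to a multiple (assumed rounding direction)."""
    return cycle - (cycle % multiple)
```

Under these assumptions, pixel_start_cycle() gives 262 and align_down(262)
gives 256.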
184
185 "Re: Electron Memory Contention" provides measurements that appear consistent
186 with these calculations.
187
The "vertical blanking period", meaning the period before picture information
in each field, is 25 lines out of 312 (or 313) and thus lasts for 1.6ms. Of
this, 2.5 lines occur before the vsync (field sync), which also lasts for 2.5
lines. Thus, the first visible scanline on the first field of a frame occurs
half way through the 23rd scanline period measured from the start of vsync
(indicated by "V" in the diagrams below):
194
195 10 20 23
196 Line in frame: 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8
197 Line from 1: 0 22 3
198 Line on screen: .:::::VVVVV::::: 12233445566
199 |_________________________________________________|
200 25 line vertical blanking period
201
202 In the second field of a frame, the first visible scanline coincides with the
203 24th scanline period measured from the start of line 313 in the frame:
204
205 310 336
206 Line in frame: 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9
207 Line from 313: 0 23 4
208 Line on screen: 88:::::VVVVV:::: 11223344
209 288 | |
210 |_________________________________________________|
211 25 line vertical blanking period
212
213 In order to consider only full lines, we might consider the start of each
214 frame to occur 23 lines after the start of vsync.
215
216 Again, it is likely that pixel data production should only occur on scanlines
217 within a certain period on each frame. The "625/50" document indicates that
218 only a certain region is "safe" to use, suggesting a vertically centred region
219 with approximately 15 blank lines above and below the picture. However, the
220 "PAL TV timing and voltages" document suggests 28 blank lines above and below
221 the picture. This would centre the 256 lines within the 312 lines of each
222 field and thus provide a start of picture approximately 5.5 or 5 lines after
223 the end of the blanking period or 28 or 27.5 lines after the start of vsync.
224
225 To summarise:
226
  CSYNC signal starts at cycle 0
  CSYNC signal ends approximately 160µs (2.5 lines) later at cycle 2560
  Start of line occurs approximately 1632µs (25.5 lines) later at cycle 28672
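
The cycle numbers in this summary follow from there being 1024 cycles of
62.5ns in each 64µs scanline. A minimal sketch, using hypothetical helper
names:

```python
# Convert between scanlines, 16MHz cycles and microseconds (illustrative).
CYCLES_PER_LINE = 1024   # 16MHz cycles in one 64µs scanline
CYCLE_NS = 62.5          # duration of one 16MHz cycle


def lines_to_cycles(lines):
    """Return the number of 16MHz cycles in a (possibly fractional) line count."""
    return int(lines * CYCLES_PER_LINE)


def cycles_to_us(cycles):
    """Return the duration of a number of 16MHz cycles in microseconds."""
    return cycles * CYCLE_NS / 1000


vsync_end = lines_to_cycles(2.5)      # end of the 2.5 line sync period
picture_start = lines_to_cycles(28)   # 28 lines after the start of vsync
gap_us = cycles_to_us(picture_start - vsync_end)
```

This gives 2560 and 28672 for the two cycle numbers, with 1632µs between them.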
230
231 See: http://en.wikipedia.org/wiki/PAL
232 See: http://en.wikipedia.org/wiki/Analog_television#Structure_of_a_video_signal
233 See: The 625/50 PAL Video Signal and TV Compatible Graphics Modes
234 http://lipas.uwasa.fi/~f76998/video/modes/
235 See: PAL TV timing and voltages
236 http://www.retroleum.co.uk/electronics-articles/pal-tv-timing-and-voltages/
237 See: Line Standards
238 http://www.pembers.freeserve.co.uk/World-TV-Standards/Line-Standards.html
239 See: Horizontal Blanking Interval of 405-, 525-, 625- and 819-Line Standards
240 http://www.pembers.freeserve.co.uk/World-TV-Standards/HBI.pdf
241 See: Re: Electron Memory Contention
242 http://www.stardot.org.uk/forums/viewtopic.php?p=134109#p134109
243
244 RAM Integrated Circuits
245 -----------------------
246
247 Unicorn Electronics appears to offer 4164 RAM chips (as well as 6502 series
248 CPUs such as the 6502, 6502A, 6502B and 65C02). These 4164 devices are
249 available in 100ns (4164-100), 120ns (4164-120) and 150ns (4164-150) variants,
250 have 16 pins and address 65536 bits through a 1-bit wide channel. Similarly,
251 ByteDelight.com sell 4164 devices primarily for the ZX Spectrum.
252
The documentation for the Electron mentions 4164-15 RAM chips for IC4-7, and
the Samsung-produced KM4164 series is apparently equivalent to the Texas
Instruments 4164 chips presumably used in the Electron.
256
257 The TM4164EC4 series combines 4 64K x 1b units into a single package and
258 appears similar to the TM4164EA4 featured on the Electron's circuit diagram
259 (in the Advanced User Guide but not the Service Manual), and it also has 22
260 pins providing 3 additional inputs and 3 additional outputs over the 16 pins
261 of the individual 4164-15 modules, presumably allowing concurrent access to
262 the packaged memory units.
263
264 As far as currently available replacements are concerned, the NTE4164 is a
265 potential candidate: according to the Vetco Electronics entry, it is
266 supposedly a replacement for the TMS4164-15 amongst many other parts. Similar
267 parts include the NTE2164 and the NTE6664, both of which appear to have
268 largely the same performance and connection characteristics. Meanwhile, the
269 NTE21256 appears to be a 16-pin replacement with four times the capacity that
270 maintains the single data input and output pins. Using the NTE21256 as a
271 replacement for all ICs combined would be difficult because of the single bit
272 output.
273
274 Another device equivalent to the 4164-15 appears to be available under the
275 code 41662 from Jameco Electronics as the Siemens HYB 4164-2. The Jameco Web
276 site lists data sheets for other devices on the same page, but these are
277 different and actually appear to be provided under the 41574 product code (but
278 are listed under 41464-10) and appear to be replacements for the TM4164EC4:
279 the Samsung KM41464A-15 and NEC µPD41464 employ 18 pins, eliminating 4 pins by
280 employing 4 pins for both input and output.
281
              Pins    I/O pins    Row access    Column access
              ----    --------    ----------    -------------
  TM4164EC4   22      4 + 4       150ns (15)    90ns (15)
  KM41464AP   18      4           150ns (15)    75ns (15)
  NTE21256    16      1 + 1       150ns         75ns
  HYB 4164-2  16      1 + 1       150ns         100ns
  µPD41464    18      4           120ns (12)    60ns (12)
289
290 See: TM4164EC4 65,536 by 4-Bit Dynamic RAM Module
291 http://www.datasheetarchive.com/dl/Datasheets-112/DSAP0051030.pdf
292 See: Dynamic RAMS
293 http://www.unicornelectronics.com/IC/DYNAMIC.html
294 See: New old stock 8x 4164 chips
295 http://www.bytedelight.com/?product=8x-4164-chips-new-old-stock
296 See: KM4164B 64K x 1 Bit Dynamic RAM with Page Mode
297 http://images.ihscontent.net/vipimages/VipMasterIC/IC/SAMS/SAMSD020/SAMSD020-45.pdf
298 See: NTE2164 Integrated Circuit 65,536 X 1 Bit Dynamic Random Access Memory
299 http://www.vetco.net/catalog/product_info.php?products_id=2806
300 See: NTE4164 - IC-NMOS 64K DRAM 150NS
301 http://www.vetco.net/catalog/product_info.php?products_id=3680
302 See: NTE21256 - IC-256K DRAM 150NS
303 http://www.vetco.net/catalog/product_info.php?products_id=2799
304 See: NTE21256 262,144-Bit Dynamic Random Access Memory (DRAM)
305 http://www.nteinc.com/specs/21000to21999/pdf/nte21256.pdf
306 See: NTE6664 - IC-MOS 64K DRAM 150NS
307 http://www.vetco.net/catalog/product_info.php?products_id=5213
308 See: NTE6664 Integrated Circuit 64K-Bit Dynamic RAM
309 http://www.nteinc.com/specs/6600to6699/pdf/nte6664.pdf
310 See: 4164-150: MAJOR BRANDS
311 http://www.jameco.com/webapp/wcs/stores/servlet/Product_10001_10001_41662_-1
312 See: HYB 4164-1, HYB 4164-2, HYB 4164-3 65,536-Bit Dynamic Random Access Memory (RAM)
313 http://www.jameco.com/Jameco/Products/ProdDS/41662SIEMENS.pdf
314 See: KM41464A NMOS DRAM 64K x 4 Bit Dynamic RAM with Page Mode
315 http://www.jameco.com/Jameco/Products/ProdDS/41662SAM.pdf
316 See: NEC µ41464 65,536 x 4-Bit Dynamic NMOS RAM
317 http://www.jameco.com/Jameco/Products/ProdDS/41662NEC.pdf
318 See: 41464-10: MAJOR BRANDS
319 http://www.jameco.com/webapp/wcs/stores/servlet/Product_10001_10001_41574_-1
320
321 Interrupts
322 ----------
323
324 The ULA generates IRQs (maskable interrupts) according to certain conditions
325 and these conditions are controlled by location &FE00:
326
 * Vertical sync (bottom of displayed screen)
 * 50Hz real time clock
 * Transmit data empty
 * Receive data full
 * High tone detect
332
333 The ULA is also used to clear interrupt conditions through location &FE05. Of
334 particular significance is bit 7, which must be set if an NMI (non-maskable
335 interrupt) has occurred and has thus suspended ULA access to memory, restoring
336 the normal function of the ULA.
337
338 ROM Paging
339 ----------
340
341 Accessing different ROMs involves bits 0 to 3 of &FE05. Some special ROM
342 mappings exist:
343
344 8 keyboard
345 9 keyboard (duplicate)
346 10 BASIC ROM
347 11 BASIC ROM (duplicate)
348
349 Paging in a ROM involves the following procedure:
350
351 1. Assert ROM page enable (bit 3) together with a ROM number n in bits 0 to
352 2, corresponding to ROM number 8+n, such that one of ROMs 12 to 15 is
353 selected.
354 2. Where a ROM numbered from 0 to 7 is to be selected, set bit 3 to zero
355 whilst writing the desired ROM number n in bits 0 to 2.
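
The procedure can be sketched as a function that returns the sequence of
values to write to &FE05. This is a hedged illustration only: the function
name is hypothetical, the choice of ROM 12 as the intermediate selection is
arbitrary (any of ROMs 12 to 15 would satisfy step 1), the handling of ROMs 8
to 11 with a single write is an assumption not covered by the steps above, and
the interrupt clearing bits of &FE05 are ignored.

```python
# Sketch of the ROM paging write sequence for &FE05 (bits 0 to 3 only).
ROM_PAGE_ENABLE = 0x08   # bit 3


def rom_select_writes(rom):
    """Return the values to write to &FE05, in order, to page in a ROM."""
    if not 0 <= rom <= 15:
        raise ValueError("ROM number must be in the range 0 to 15")
    if rom >= 8:
        # Bit 3 is already set in the ROM number: a single write is assumed
        # to be sufficient (this covers ROMs 12 to 15 as described in step 1).
        return [rom]
    # Step 1: select one of ROMs 12 to 15 (here ROM 12, i.e. n = 4).
    # Step 2: write the desired ROM number with bit 3 clear.
    return [ROM_PAGE_ENABLE | 0x04, rom]
```

For instance, rom_select_writes(3) yields the two writes 12 and 3, whereas
rom_select_writes(13) yields the single write 13.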
356
357 See: http://stardot.org.uk/forums/viewtopic.php?p=136686#p136686
358
359 Shadow/Expanded Memory
360 ----------------------
361
362 The Electron exposes all sixteen address lines and all eight data lines
363 through the expansion bus. Using such lines, it is possible to provide
364 additional memory - typically sideways ROM and RAM - on expansion cards and
365 through cartridges, although the official cartridge specification provides
366 fewer address lines and only seeks to provide access to memory in 16K units.
367
368 Various modifications and upgrades were developed to offer "turbo"
369 capabilities to the Electron, permitting the CPU to access a separate 8K of
370 RAM at 2MHz, presumably preventing access to the low 8K of RAM accessible via
371 the ULA through additional logic. However, an enhanced ULA might support
372 independent CPU access to memory over the expansion bus by allowing itself to
373 be discharged from providing access to memory, potentially for a range of
374 addresses, and for the CPU to communicate with external memory uninterrupted.
375
376 Sideways RAM/ROM and Upper Memory Access
377 ----------------------------------------
378
379 Although the ULA controls the CPU clock, effectively slowing or stopping the
380 CPU when the ULA needs to access screen memory, it is apparently able to allow
381 the CPU to access addresses of &8000 and above - the upper region of memory -
382 at 2MHz independently of any access to RAM that the ULA might be performing,
383 only blocking the CPU if it attempts to access addresses of &7FFF and below
384 during any ULA memory access - the lower region of memory - by stopping or
385 stalling its clock.
386
387 Thus, the ULA remains aware of the level of the A15 line, only inhibiting the
388 CPU clock if the line goes low, when the CPU is attempting to access the lower
389 region of memory.
390
391 Hardware Scrolling (and Enhancement)
392 ------------------------------------
393
On the standard ULA, &FE02 and &FE03 together hold 9 significant bits of the
screen start address (address bits 14 to 6), with the least significant 5 bits
of the register pair unused and address bits 5 to 0 taken to be zero, thus
limiting the scrolling resolution to 64 bytes. An enhanced ULA could support a
resolution of 2 bytes using the same pair of registers.
398
  |--&FE03--------------| |--&FE02--------------|
  XX XX 14 13 12 11 10 09 08 07 06 XX XX XX XX XX

  XX 14 13 12 11 10 09 08 07 06 05 04 03 02 01 XX
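
For the standard ULA layout shown in the first row, the register contents for
a given screen start address could be computed as in the following sketch. The
function name is hypothetical, and the bytes are returned simply as the high
and low halves of the 16-bit register pair.

```python
# Sketch: pack a screen start address into the standard ULA register pair.
def screen_start_registers(address):
    """Return (high, low) register bytes for a 64-byte aligned address."""
    if address & 0x3F:
        raise ValueError("address must be a multiple of 64 bytes")
    if not 0 <= address < 0x8000:
        raise ValueError("address must lie within the lower 32K")
    # Address bits 14 to 6 occupy bits 13 to 5 of the 16-bit register pair.
    value = (address >> 1) & 0x3FE0
    return value >> 8, value & 0xFF
```

For example, an address of &3000 produces the bytes &18 and &00, and an
address of &3040 produces &18 and &20.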
403
Arguably, a resolution of 8 bytes is more useful, since the mapping of screen
memory to pixel locations is character oriented. A change in 8 bytes would
permit a horizontal scrolling resolution of 2 pixels in MODE 2, 4 pixels in
MODE 1 and 5, and 8 pixels in MODE 0, 3, 4 and 6. This resolution is actually
observed on the BBC Micro (see 18.11.2 in the BBC Microcomputer Advanced User
Guide).
410
411 One argument for a 2 byte resolution is smooth vertical scrolling. A pitfall
412 of changing the screen address by 2 bytes is the change in the number of lines
413 from the initial and final character rows that need reading by the ULA, which
414 would need to maintain this state information (although this is a relatively
415 trivial change). Another pitfall is the complication that might be introduced
416 to software writing bitmaps of character height to the screen.
417
418 See: http://pastraiser.com/computers/acornelectron/acornelectron.html
419
420 Enhancement: Mode Layouts
421 -------------------------
422
423 Merely changing the screen memory mappings in order to have Archimedes-style
424 row-oriented screen addresses (instead of character-oriented addresses) could
425 be done for the existing modes, but this might not be sufficiently beneficial,
426 especially since accessing regions of the screen would involve incrementing
427 pointers by amounts that are inconvenient on an 8-bit CPU.
428
However, instead of using an Archimedes-style mapping, column-oriented screen
addresses could be more feasibly employed: incrementing the address would
reference the vertical screen location below the currently-referenced location
(just as occurs within characters using the existing ULA); instead of
returning to the top of the character row and referencing the next horizontal
location after eight bytes, the address would reference the next character row
and continue to reference locations downwards over the height of the screen
until reaching the bottom; at the bottom, the next location would be the next
horizontal location at the top of the screen.
438
439 In other words, the memory layout for the screen would resemble the following
440 (for MODE 2):
441
  &3000 &3100 ... &7F00
  &3001 &3101
  ...   ...
  &3007
  &3008
  ...
  ...   ...
  &30FF       ... &7FFF
450
451 Since there are 256 pixel rows, each column of locations would be addressable
452 using the low byte of the address. Meanwhile, the high byte would be
453 incremented to address different columns. Thus, addressing screen locations
454 would become a lot more convenient and potentially much more efficient for
455 certain kinds of graphical output.
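
The difference between the two layouts can be illustrated with a pair of
address calculations for MODE 2. This is a sketch with hypothetical function
names; columns here are byte columns (0 to 79) and rows are pixel rows (0 to
255).

```python
# Sketch comparing screen address calculations for MODE 2.
SCREEN_BASE = 0x3000   # start of MODE 2 screen memory


def column_oriented_address(column, row, base=SCREEN_BASE):
    """Proposed layout: the low byte is the row, the high byte steps columns."""
    return base + (column << 8) + row


def character_oriented_address(column, row, base=SCREEN_BASE,
                               bytes_per_char_row=640):
    """Existing layout: eight bytes per character cell, cells laid out in rows."""
    return (base + (row // 8) * bytes_per_char_row
            + column * 8 + (row % 8))
```

In the column-oriented case, moving down one pixel row always adds 1 and
moving right always adds 256, whereas the character-oriented case needs a
division and a multiplication by 640 whenever a character row boundary is
crossed.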
456
457 One potential complication with this simplified addressing scheme arises with
458 hardware scrolling. Vertical hardware scrolling by one pixel row (not supported
459 with the existing ULA) would be achieved by incrementing or decrementing the
460 screen start address; by one character row, it would involve adding or
461 subtracting 8. However, the ULA only supports multiples of 64 when changing the
462 screen start address. Thus, if such a scheme were to be adopted, three
463 additional bits would need to be supported in the screen start register (see
464 "Hardware Scrolling (and Enhancement)" for more details). However, horizontal
465 scrolling would be much improved even under the severe constraints of the
466 existing ULA: only adjustments of 256 to the screen start address would be
467 required to produce single-location scrolling of as few as two pixels in MODE 2
468 (four pixels in MODEs 1 and 5, eight pixels otherwise).
469
470 More disruptive is the effect of this alternative layout on software.
471 Presumably, compatibility with the BBC Micro was the primary goal of the
472 Electron's hardware design. With the character-oriented screen layout in
473 place, system software (and application software accessing the screen
474 directly) would be relying on this layout to run on the Electron with little
475 or no modification. Although it might have been possible to change the system
476 software to use this column-oriented layout instead, this would have incurred
477 a development cost and caused additional work porting things like games to the
478 Electron. Moreover, a separate branch of the software from that supporting the
479 BBC Micro and closer derivatives would then have needed maintaining.
480
The decision to use the character-oriented layout in the BBC Micro may have
been related to the choice of circuitry and intended to facilitate a
convenient hardware implementation, and by the time the Electron was planned,
it was too late to do anything about this somewhat unfortunate choice.
485
486 Pixel Layouts
487 -------------
488
489 The pixel layouts are as follows:
490
  Modes         Depth (bpp)    Pixels (from bits)
  -----         -----------    ------------------
  0, 3, 4, 6    1              7 6 5 4 3 2 1 0
  1, 5          2              73 62 51 40
  2             4              7531 6420
497 Since the ULA reads a half-byte at a time, one might expect it to attempt to
498 produce pixels for every half-byte, as opposed to handling entire bytes.
499 However, the pixel layout is not conducive to producing pixels as soon as a
500 half-byte has been read for a given full-byte location: in 1bpp modes the
501 first four pixels can indeed be produced, but in 2bpp and 4bpp modes the pixel
502 data is spread across the entire byte in different ways.
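
The way in which pixel values are gathered from the bits of a byte in the
existing layouts can be sketched as follows. The function name is hypothetical,
and the returned values are logical colour numbers before any palette mapping.

```python
# Sketch: extract pixel values from a screen byte in the existing layouts.
def decode_pixels(byte, bpp):
    """Return the pixel values held in a byte for a depth of 1, 2 or 4 bpp."""
    if bpp not in (1, 2, 4):
        raise ValueError("depth must be 1, 2 or 4 bits per pixel")
    pixels_per_byte = 8 // bpp
    pixels = []
    for pixel in range(pixels_per_byte):
        value = 0
        for k in range(bpp):
            # Successive bits of a pixel are spaced pixels_per_byte apart,
            # giving bits 7 and 3 for the first 2bpp pixel, for example.
            bit = 7 - pixel - k * pixels_per_byte
            value = (value << 1) | ((byte >> bit) & 1)
        pixels.append(value)
    return pixels
```

For example, decode_pixels(0b10001000, 2) yields a first pixel of 3 followed
by three zero pixels, showing that both half-bytes are needed before even the
first 2bpp pixel can be produced.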
503
504 An alternative arrangement might be as follows:
505
  Modes         Depth (bpp)    Pixels (from bits)
  -----         -----------    ------------------
  0, 3, 4, 6    1              7 6 5 4 3 2 1 0
  1, 5          2              76 54 32 10
  2             4              7654 3210
511
Just as the mode layouts were presumably decided by compatibility with the BBC
Micro, the pixel layouts will have been maintained for similar reasons.
Unfortunately, the existing pixel layout prevents any optimisation of the ULA
for handling half-byte pixel data generally.
516
517 Enhancement: The Missing MODE 4
518 -------------------------------
519
The Electron inherits its screen mode selection from the BBC Micro, where MODE
3 is a text version of MODE 0, and where MODE 6 is a text version of MODE 4.
Neither MODE 3 nor MODE 6 is a genuine character-based text mode like MODE 7,
however, and they are merely implemented by leaving two blank scanlines after
the eight required to produce each character line, so that each row occupies
ten scanlines. Thus, such modes provide a 25-row display occupying 250 of the
256 scanlines.
526
527 In principle, nothing prevents this "text mode" effect being applied to other
528 modes. The 20-column modes are not well-suited to displaying text, which
529 leaves MODE 1 which, unlike MODEs 3 and 6, can display 4 colours rather than
530 2. Although the need for a non-monochrome 40-column text mode is addressed by
531 MODE 7 on the BBC Micro, the Electron lacks such a mode.
532
If the 4-colour, 25-row variant of MODE 1 were to be provided, logically it
would occupy MODE 4 instead of the current MODE 4:

  Screen mode  Size (kilobytes)  Colours  Rows  Resolution
  -----------  ----------------  -------  ----  ----------
  0            20                2        32    640x256
  1            20                4        32    320x256
  2            20                16       32    160x256
  3            16                2        25    640x256
  4 (new)      16                4        25    320x256
  4 (old)      10                2        32    320x256
  5            10                4        32    160x256
  6            8                 2        25    320x256
546
547 Thus, for increasing mode numbers, the size of each mode would be the same or
548 less than the preceding mode.
549
550 Enhancement: 2MHz RAM Access
551 ----------------------------
552
Although the CPU and ULA both access RAM at 2MHz, the CPU only accesses RAM
every other 2MHz cycle when not competing with the ULA (as if the ULA still
needed to access the RAM). One useful enhancement would therefore be a
mechanism to let the CPU take over the ULA cycles outside the ULA's period of
activity, comparable to the way the ULA takes over the CPU cycles in MODE 0 to
3.
559
560 Thus, the RAM access cycles would resemble the following in MODE 0 to 3:
561
562 Upon a transition from display cycles: UUUUCCCC (instead of UUUUC_C_)
563 On a non-display line: CCCCCCCC (instead of C_C_C_C_)
564
565 In MODE 4 to 6:
566
567 Upon a transition from display cycles: CUCUCCCC (instead of CUCUC_C_)
568 On a non-display line: CCCCCCCC (instead of C_C_C_C_)
569
570 This would improve CPU bandwidth as follows:
571
                Standard ULA    Enhanced ULA
  MODE 0, 1, 2  9728 bytes      19456 bytes
  MODE 3        11968 bytes     23936 bytes
  MODE 4, 5     19968 bytes     29696 bytes
  MODE 6        19968 bytes     31936 bytes
577
578 With such an enhancement, MODE 0 to 3 experience a doubling of CPU bandwidth
579 because all access opportunities to RAM are doubled. Meanwhile, in the other
580 modes, some CPU accesses occur alongside ULA accesses and thus cannot be
581 doubled, but the CPU bandwidth increase is still significant.
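
The enhanced figures can be derived by letting the CPU use every cycle that
the ULA does not use. The following Python sketch uses hypothetical names and
takes the ULA cycles per line and the number of display lines as parameters.

```python
# Sketch of the per-field CPU bandwidth with the 2MHz RAM access enhancement.
CYCLES_PER_LINE = 128   # 2MHz cycles per scanline
TOTAL_LINES = 312       # scanlines per field


def cpu_bytes_enhanced(ula_cycles, display_lines):
    """RAM accesses available to the CPU when it may use every free cycle."""
    free_on_display_line = CYCLES_PER_LINE - ula_cycles
    blank_lines = TOTAL_LINES - display_lines
    return (free_on_display_line * display_lines
            + CYCLES_PER_LINE * blank_lines)
```

Here, cpu_bytes_enhanced(80, 256) gives 19456 for MODE 0, 1 and 2, and
cpu_bytes_enhanced(40, 256) gives 29696 for MODE 4 and 5.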
582
583 Enhancement: Region Blanking
584 ----------------------------
585
586 The problem of permitting character-oriented blitting in programs whilst
587 scrolling the screen by sub-character amounts could be mitigated by permitting
588 a region of the display to be blank, such as the final lines of the display.
589 Consider the following vertical scrolling by 2 bytes that would cause an
590 initial character row of 6 lines and a final character row of 2 lines:
591
592 6 lines - initial, partial character row
593 248 lines - 31 complete rows
594 2 lines - final, partial character row
595
596 If a routine were in use that wrote 8 line bitmaps to the partial character
597 row now split in two, it would be advisable to hide one of the regions in
598 order to prevent content appearing in the wrong place on screen (such as
599 content meant to appear at the top "leaking" onto the bottom). Blanking 6
600 lines would be sufficient, as can be seen from the following cases.
601
602 Scrolling up by 2 lines:
603
604 6 lines - initial, partial character row
605 240 lines - 30 complete rows
606 4 lines - part of 1 complete row
607 -----------------------------------------------------------------
608 4 lines - part of 1 complete row (hidden to maintain 250 lines)
609 2 lines - final, partial character row (hidden)
610
611 Scrolling down by 2 lines:
612
613 2 lines - initial, partial character row
614 248 lines - 31 complete rows
615 ----------------------------------------------------------
616 6 lines - final, partial character row (hidden)
617
618 Thus, in this case, region blanking would impose a 250 line display with the
619 bottom 6 lines blank.
620
See the description of the display suspend enhancement for a way of blanking
lines that is more efficient than merely blanking the palette, since it allows
the CPU to perform useful work during the blanking period.
624
To control the blanking or suspending of lines at the top and bottom of the
display, a memory location could be dedicated to the task: the upper 4 bits
could define a blanking region of up to 15 lines at the top of the screen,
whereas the lower 4 bits could define such a region at the bottom of the
screen. If more lines were required, two locations could be employed, allowing
the top and bottom regions to occupy the entire screen.
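
A single-location blanking register of this kind might be packed and unpacked
as in the following sketch. The register is hypothetical, as are the function
names, and each 4-bit field is assumed to hold a line count directly, giving a
range of 0 to 15 lines.

```python
# Sketch of a hypothetical region blanking register: top count in the upper
# 4 bits, bottom count in the lower 4 bits.
def pack_blanking(top_lines, bottom_lines):
    """Return the register value for the given numbers of blanked lines."""
    for count in (top_lines, bottom_lines):
        if not 0 <= count <= 15:
            raise ValueError("each 4-bit field holds 0 to 15 lines")
    return (top_lines << 4) | bottom_lines


def unpack_blanking(value):
    """Return (top_lines, bottom_lines) from a register value."""
    return (value >> 4) & 0x0F, value & 0x0F
```

The 250 line display with the bottom 6 lines blank would then correspond to
pack_blanking(0, 6), which is the value 6.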
631
632 Enhancement: Screen Height Adjustment
633 -------------------------------------
634
635 The height of the screen could be configurable in order to reduce screen
636 memory consumption. This is not quite done in MODE 3 and 6 since the start of
637 the screen appears to be rounded down to the nearest page, but by reducing the
638 height by amounts more than a page, savings would be possible. For example:
639
  Screen width  Depth  Height  Bytes per line  Saving in bytes  Start address
  ------------  -----  ------  --------------  ---------------  -------------
  640           1      252     80              320              &3140 -> &3100
  640           1      248     80              640              &3280 -> &3200
  320           1      240     40              640              &5A80 -> &5A00
  320           2      240     80              1280             &3500
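
The table entries can be reproduced with a short calculation. In this sketch,
the function name is hypothetical, and rounding the start address down to a
page (256 byte) boundary follows the behaviour described above for MODE 3 and
6.

```python
# Sketch: memory saved and resulting start address for a reduced-height screen.
def reduced_screen_start(full_start, bytes_per_line, height, full_height=256):
    """Return (saving, exact_start, page_aligned_start) for a given height."""
    saving = (full_height - height) * bytes_per_line
    exact_start = full_start + saving
    page_aligned_start = exact_start & ~0xFF   # round down to a page boundary
    return saving, exact_start, page_aligned_start
```

For instance, reduced_screen_start(0x3000, 80, 252) gives a saving of 320
bytes with a start address of &3140 rounded down to &3100.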
646
647 Screen Mode Selection
648 ---------------------
649
Bits 3, 4 and 5 of address &FE*7 control the selected screen mode. For a wider
range of modes, the other bits of &FE*7 (related to sound, cassette
input/output and the Caps Lock LED) would need to be reassigned, with bit 0
potentially being made available for use.
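
Extracting and updating the mode field can be sketched as follows. The
function names are hypothetical, and the sketch assumes that the three bits
hold the mode number as a plain binary value, which the text above does not
state explicitly.

```python
# Sketch: read and modify the screen mode field (bits 3, 4 and 5) of &FE*7.
MODE_SHIFT = 3
MODE_MASK = 0b111 << MODE_SHIFT


def screen_mode(register_value):
    """Return the 3-bit mode field from a register value."""
    return (register_value & MODE_MASK) >> MODE_SHIFT


def with_screen_mode(register_value, mode):
    """Return the register value with the mode field replaced."""
    if not 0 <= mode <= 7:
        raise ValueError("mode field holds values 0 to 7")
    return (register_value & ~MODE_MASK & 0xFF) | (mode << MODE_SHIFT)
```

Since only three bits are available, at most eight modes can be selected
without reassigning other bits of the register.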
654
655 Enhancement: Palette Definition
656 -------------------------------
657
658 Since all memory accesses go via the ULA, an enhanced ULA could employ more
659 specific addresses than &FE*X to perform enhanced functions. For example, the
660 palette control is done using &FE*8-F and merely involves selecting predefined
661 colours, whereas an enhanced ULA could support the redefinition of all 16
662 colours using specific ranges such as &FE18-F (colours 0 to 7) and &FE28-F
663 (colours 8 to 15), where a single byte might provide 8 bits per pixel colour
664 specifications similar to those used on the Archimedes.
665
666 The principal limitation here is actually the hardware: the Electron has only
667 a single output line for each of the red, green and blue channels, and if
668 those outputs are strictly digital and can only be set to a "high" or "low"
669 value, then only the existing eight colours are possible. If a modern ULA were
670 able to output analogue values (or values at well-defined points between the
671 high and low values, such as the half-on value supported by the Amstrad CPC
672 series), it would still need to be assessed whether the circuitry could
673 successfully handle and propagate such values. Various sources indicate that
674 only "TTL levels" are supported by the RGB output circuit, and since there are
675 74LS08 AND logic gates involved in the RGB component outputs from the ULA, it
676 is likely that the ULA is expected to provide only "high" or "low" values.
677
678 Short of adding extra outputs from the ULA (either additional red, green and
679 blue outputs or a combined intensity output), another approach might involve
680 some kind of modulation where an output value might be encoded in multiple
681 pulses at a higher frequency than the pixel frequency. However, this would
682 demand additional circuitry outside the ULA, and component RGB monitors would
683 probably not be able to take advantage of this feature; only UHF and composite
684 video devices (the latter with the composite video colour support enabled on
685 the Electron's circuit board) would potentially benefit.
686
687 Flashing Colours
688 ----------------
689
690 According to the Advanced User Guide, "The cursor and flashing colours are
691 entirely generated in software: This means that all of the logical to physical
692 colour map must be changed to cause colours to flash." This appears to suggest
693 that the palette registers must be updated upon the flash counter - read and
694 written by OSBYTE &C1 (193) - reaching zero and that some way of changing the
695 colour pairs to be any combination of colours might be possible, instead of
696 having colour complements as pairs.
697
698 It is conceivable that the interrupt code responsible does the simple thing
699 and merely inverts the current values for any logical colours (LC) for which
700 the associated physical colour (as supplied as the second parameter to the VDU
701 19 call) has the top bit of its four bit value set. These top bits are not
702 recorded in the palette registers but are presumably recorded separately and
703 used to build bitmaps as follows:
704
705 LC  2 colour  4 colour  16 colour  4-bit value for inversion
706 --  --------  --------  ---------  -------------------------
707 0   00010001  00010001  00010001   1, 1, 1
708 1   01000100  00100010  00010001   4, 2, 1
709 2             01000100  00100010   4, 2
710 3             10001000  00100010   8, 2
711 4                       00010001   1
712 5                       00010001   1
713 6                       00100010   2
714 7                       00100010   2
715 8                       01000100   4
716 9                       01000100   4
717 10                      10001000   8
718 11                      10001000   8
719 12                      01000100   4
720 13                      01000100   4
721 14                      10001000   8
722 15                      10001000   8
723
724 Inversion value calculation:
725
726 2 colour formula: 1 << (colour * 2)
727 4 colour formula: 1 << colour
728 16 colour formula: 1 << (((colour & 2) >> 1) + ((colour & 8) >> 2))
729
730 For example, where logical colour 0 has been mapped to a physical colour in
731 the range 8 to 15, a bitmap of 00010001 would be chosen as its contribution to
732 the inversion operation. (The lower three bits of the physical colour would be
733 used to set the underlying colour information affected by the inversion
734 operation.)
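
The inversion values can be checked with a small sketch. The function names
are illustrative only, and the 16 colour expression is written so that it
reproduces the values given in the table above, with bit 1 and bit 3 of the
logical colour selecting one of four bit positions.

```python
def inversion_value(colour, colours_in_mode):
    """Return the 4-bit inversion value for a logical colour."""
    if colours_in_mode == 2:
        return 1 << (colour * 2)
    if colours_in_mode == 4:
        return 1 << colour
    if colours_in_mode == 16:
        # Bit 1 of the colour supplies bit 0 of the shift, bit 3 supplies bit 1.
        return 1 << (((colour & 2) >> 1) + ((colour & 8) >> 2))
    raise ValueError("unsupported number of colours")


def inversion_bitmap(colour, colours_in_mode):
    """Duplicate the 4-bit value into both nibbles of a byte."""
    value = inversion_value(colour, colours_in_mode)
    return (value << 4) | value
```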
735
736 An operation in the interrupt code would then combine the bitmaps for all
737 logical colours in 2 and 4 colour modes, with the 16 colour bitmaps being
738 combined for groups of logical colours as follows:
739
740 Logical colours
741 ---------------
742 0, 2, 8, 10
743 4, 6, 12, 14
744 5, 7, 13, 15
745 1, 3, 9, 11
746
747 These combined bitmaps would be EORed with the existing palette register
748 values in order to perform the value inversion necessary to produce the
749 flashing effect.
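
A minimal sketch of the combining and EOR steps follows. It operates on plain
byte values; the function names are hypothetical and no attempt is made to
model the actual layout of the palette registers.

```python
from functools import reduce


def combined_mask(bitmaps):
    """OR together the inversion bitmaps of the flashing logical colours."""
    return reduce(lambda acc, bitmap: acc | bitmap, bitmaps, 0)


def flash(register_value, mask):
    """EOR the mask into a register value, toggling the selected bits."""
    return register_value ^ mask
```

Applying flash twice with the same mask restores the original value, which is
what allows the interrupt code to alternate between the two states without
storing both sets of register values.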
750
751 Thus, in the VDU 19 operation, the appropriate inversion value would be
752 calculated for the logical colour, and this value would then be combined with
753 other inversion values in a dedicated memory location corresponding to the
754 colour's group as indicated above. Meanwhile, the palette channel values would
755 be derived from the lower three bits of the specified physical colour and
756 combined with other palette data in dedicated memory locations corresponding
757 to the palette registers.
758
759 Interestingly, although flashing colours on the BBC Micro are controlled by
760 toggling bit 0 of the &FE20 control register location for the Video ULA, the
761 actual colour inversion is done in hardware.
762
763 Enhancement: Palette Definition Lists
764 -------------------------------------
765
766 It can be useful to redefine the palette in order to change the colours
767 available for a particular region of the screen, particularly in modes where
768 the choice of colours is constrained, and if an increased colour depth were
769 available, palette redefinition would be useful to give the illusion of more
770 than 16 colours in MODE 2. Traditionally, palette redefinition has been done
771 by using interrupt-driven timers, but a more efficient approach would involve
772 presenting lists of palette definitions to the ULA so that it can change the
773 palette at a particular display line.
774
775 One might define a palette redefinition list in a region of memory and then
776 communicate its contents to the ULA by writing the address and length of the
777 list, along with the display line at which the palette is to be changed, to
778 ULA registers such that the ULA buffers the list and performs the redefinition
779 at the appropriate time. Throughput/bandwidth considerations might impose
780 restrictions on the practical length of such a list, however.
781
782 Enhancement: Display Synchronisation Interrupts
783 -----------------------------------------------
784
785 When completing each scanline of the display, the ULA could trigger an
786 interrupt. Since this might impact system performance substantially, the
787 feature would probably need to be configurable, and it might be sufficient to
788 have an interrupt only after a certain number of display lines instead.
789 Permitting the CPU to take action after eight lines would allow palette
790 switching and other effects to occur on a character row basis.
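
To give a rough idea of the time available between such interrupts, the
following sketch uses the figures from the timing section (64 microseconds per
scanline and a 2MHz CPU clock). The result is an upper bound: it ignores
cycles lost to the ULA's own memory accesses and to interrupt handling
overhead, and the function name is illustrative.

```python
SCANLINE_US = 64   # duration of one scanline in microseconds
CPU_MHZ = 2        # nominal CPU clock rate


def cycles_between_interrupts(lines):
    """Upper bound on CPU cycles between interrupts raised every `lines` scanlines."""
    return lines * SCANLINE_US * CPU_MHZ
```

An interrupt on every scanline would leave at most 128 cycles per interrupt,
whereas one every eight lines would leave at most 1024.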
791
792 The ULA provides an interrupt at the end of the display period, presumably so
793 that software can schedule updates to the screen, avoid flickering or tearing,
794 and so on. However, some applications might benefit from an interrupt at, or
795 just before, the start of the display period so that palette modifications or
796 similar effects could be scheduled.
797
798 Enhancement: Palette-Free Modes
799 -------------------------------
800
801 Palette-free modes might be defined where bit values directly correspond to
802 the red, green and blue channels, although this would mostly make sense only
803 for modes with depths greater than the standard 4 bits per pixel, and such
804 modes would require more memory than MODE 2 if they were to have an acceptable
805 resolution.
806
807 Enhancement: Display Suspend
808 ----------------------------
809
810 Especially when writing to the screen memory, it could be beneficial to be
811 able to suspend the ULA's access to the memory, instead producing blank values
812 for all screen pixels until a program is ready to reveal the screen. This is
813 different from palette blanking since with a blank palette, the ULA is still
814 reading screen memory and translating its contents into pixel values that end
815 up being blank.
816
817 This function is reminiscent of a capability of the ZX81, albeit necessary on
818 that hardware to reduce the load on the system CPU which was responsible for
819 producing the video output. By allowing display suspend on the Electron, the
820 performance benefit would be derived from giving the CPU full access to the
821 memory bandwidth.
822
823 The region blanking feature mentioned above could be implemented using this
824 enhancement instead of employing palette blanking for the affected lines of
825 the display.
826
827 Enhancement: Memory Filling
828 ---------------------------
829
830 A capability that could be given to an enhanced ULA is that of permitting the
831 ULA to write to screen memory as well as being able to read from it. Although
832 such a capability would probably not be useful in conjunction with the
833 existing read operations when producing a screen display, and insufficient
834 bandwidth would exist to do so in high-bandwidth screen modes anyway, the
835 capability could be offered during a display suspend period (as described
836 above), permitting a more efficient mechanism to rapidly fill memory with a
837 predetermined value.
838
839 This capability could also support block filling, where the limits of the
840 filled memory would be defined by the position and size of a screen area,
841 although this would demand the provision of additional registers in the ULA to
842 retain the details of such areas and additional logic to control the fill
843 operation.
844
845 Enhancement: Region Filling
846 ---------------------------
847
848 An alternative to memory writing might involve indicating regions using
849 additional registers or memory where the ULA fills regions of the screen with
850 content instead of reading from memory. Unlike hardware sprites which should
851 realistically provide varied content, region filling could employ single
852 colours or patterns, and one advantage of doing so would be that the ULA need
853 not access memory at all within a particular region.
854
855 Regions would be defined on a row-by-row basis. Instead of reading memory and
856 blitting a direct representation to the screen, the ULA would read region
857 definitions containing a start column, region width and colour details. There
858 might be a certain number of definitions allowed per row, or the ULA might
859 just traverse an ordered list of such definitions with each one indicating the
860 row, start column, region width and colour details.
861
862 One could even compress this information further by requiring only the row,
863 start column and colour details with each subsequent definition terminating
864 the effect of the previous one. However, one would also need to consider the
865 convenience of preparing such definitions and whether efficient access to
866 definitions for a particular row might be desirable. It might also be
867 desirable to avoid having to prepare definitions for "empty" areas of the
868 screen, effectively making the definition of the screen contents a form of
869 run-length encoding that employs only colour plus length information.
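
The run-length form of such definitions can be sketched as follows. The
representation of a row as a list of (colour, length) pairs and the padding of
any remainder with a background colour are assumptions made for illustration.

```python
def expand_row(runs, width, background=0):
    """Expand (colour, length) runs into a list of pixel colours for one row."""
    pixels = []
    for colour, length in runs:
        if length <= 0:
            raise ValueError("run lengths must be positive")
        pixels.extend([colour] * length)
    if len(pixels) > width:
        raise ValueError("runs exceed the row width")
    # Pad any remainder of the row with the background colour.
    pixels.extend([background] * (width - len(pixels)))
    return pixels
```

A row consisting of one long run needs only a single definition, whereas
alternating single pixels would need one definition per pixel.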
870
871 One application of region filling is that of simple 2D and 3D shape rendering.
872 Although it is entirely possible to plot such shapes to the screen and have
873 the ULA blit the memory contents to the screen, such operations consume
874 bandwidth both in the initial plotting and in the final transfer to the
875 screen. Region filling would reduce such bandwidth usage substantially.
876
877 This way of representing screen images would make certain kinds of images
878 unfeasible to represent - consider alternating single pixel values which could
879 easily occur in some character bitmaps - even if an internal queue of regions
880 were to be supported such that the ULA could read ahead and buffer such
881 "bandwidth intensive" areas. Thus, the ULA might be better served providing
882 this feature for certain areas of the display only as some kind of special
883 graphics window.
884
885 Enhancement: Hardware Sprites
886 -----------------------------
887
888 An enhanced ULA might provide hardware sprites, but this would be done in a
889 way that is incompatible with the standard ULA, since no &FE*X locations are
890 available for allocation. To keep the facility simple, hardware sprites would
891 have a standard byte width and height.
892
893 The specification of sprites could involve the reservation of 16 locations
894 (for example, &FE20-F) specifying a fixed number of eight sprites, with each
895 location pair referring to the sprite data. By limiting the ULA to dealing
896 with a fixed number of sprites, the work required inside the ULA would be
897 reduced since it would avoid having to deal with arbitrary numbers of sprites.
898
899 The principal limitation on providing hardware sprites is that of having to
900 obtain sprite data, given that the ULA is usually required to retrieve screen
901 data, and given the lack of memory bandwidth available to retrieve sprite data
902 (particularly from multiple sprites supposedly at the same position) and
903 screen data simultaneously. Although the ULA could potentially read sprite
904 data and screen data in alternate memory accesses in screen modes where the
905 bandwidth is not already fully utilised, this would result in a degradation of
906 performance.
907
908 Enhancement: Additional Screen Mode Configurations
909 --------------------------------------------------
910
911 Alternative screen mode configurations could be supported. The ULA has to
912 produce 640 pixel values across the screen, with pixel doubling or quadrupling
913 employed to fill the screen width:
914
915 Screen width  Columns  Scaling  Depth  Bytes
916 ------------  -------  -------  -----  ------
917 640           80       x1       1      80
918 320           40       x2       1, 2   40, 80
919 160           20       x4       2, 4   40, 80
920
921 It must also use at most 80 byte-sized memory accesses to provide the
922 information for the display. Given that characters must occupy an 8x8 pixel
923 array, if a configuration featuring anything other than 20, 40 or 80 character
924 columns is to be supported, compromises must be made such as the introduction
925 of blank pixels either between characters (such as occurs between rows in MODE
926 3 and 6) or at the end of a scanline (such as occurs at the end of the frame
927 in MODE 3 and 6). Consider the following configuration:
928
929 Screen width  Columns  Scaling  Depth  Bytes   Blank
930 ------------  -------  -------  -----  ------  -----
931 208           26       x3       1, 2   26, 52  16
932
933 Here, if the ULA can triple pixels, a 26 column mode with either 2 or 4
934 colours could be provided, with 16 blank pixel values (out of a total of 640)
935 generated either at the start or end (or split between the start and end) of
936 each scanline.
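
The two constraints (640 output pixels and at most 80 bytes per scanline) can
be checked for any candidate configuration. The sketch below assumes
characters that are 8 pixels wide, as stated above; the function name is
illustrative.

```python
OUTPUT_PIXELS = 640   # pixel values the ULA must produce per scanline
MAX_BYTES = 80        # byte-sized memory accesses available per scanline


def mode_configuration(columns, scaling, depth):
    """Return (screen width, bytes per scanline, blank pixel values)."""
    width = columns * 8
    bytes_per_line = width * depth // 8
    blank = OUTPUT_PIXELS - width * scaling
    if blank < 0:
        raise ValueError("scaled width exceeds the available output pixels")
    if bytes_per_line > MAX_BYTES:
        raise ValueError("configuration needs more than 80 bytes per scanline")
    return width, bytes_per_line, blank
```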
937
938 Enhancement: Character Attributes
939 ---------------------------------
940
941 The BBC Micro MODE 7 employs something resembling character attributes to
942 support teletext displays, but depends on circuitry providing a character
943 generator. The ZX Spectrum, on the other hand, provides character attributes
944 as a means of colouring bitmapped graphics. Although such a feature is very
945 limiting as the sole means of providing multicolour graphics, in situations
946 where the choice is between low resolution multicolour graphics or high
947 resolution monochrome graphics, character attributes provide a potentially
948 useful compromise.
949
950 For each byte read, the ULA must deliver 8 pixel values (out of a total of
951 640) to the video output, doing so by either emptying its pixel buffer on a
952 pixel per cycle basis, or by multiplying pixels and thus holding them for more
953 than one cycle. For example, for a screen mode having 640 pixels in width:
954
955 Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
956 Reads:   B                       B
957 Pixels:  0  1  2  3  4  5  6  7  0  1  2  3  4  5  6  7
958
959 And for a screen mode having 320 pixels in width:
960
961 Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
962 Reads:   B
963 Pixels:  0  0  1  1  2  2  3  3  4  4  5  5  6  6  7  7
964
965 However, in modes where fewer than 80 bytes are required to generate the pixel
966 values, an enhanced ULA might be able to read additional bytes between those
967 providing the bitmapped graphics data:
968
969 Cycle:   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
970 Reads:   B                       A
971 Pixels:  0  0  1  1  2  2  3  3  4  4  5  5  6  6  7  7
972
973 These additional bytes could provide colour information for the bitmapped data
974 in the following character column (of 8 pixels). Since it would be desirable
975 to apply attribute data to the first column, the initial 8 cycles might be
976 configured to not produce pixel values.
977
978 For an entire character, attribute data need only be read for the first row of
979 pixels for a character. The subsequent rows would have attribute information
980 applied to them, although this would require the attribute data to be stored
981 in some kind of buffer. Thus, the following access pattern would be observed:
982
983 Cycle: A B ... _ B ... _ B ... _ B ... _ B ... _ B ... _ B ... _ B ...
984
985 A whole byte used for colour information for a whole character would result in
986 a choice of 256 colours, and this might be somewhat excessive. By only reading
987 attribute bytes at every other opportunity, a choice of 16 colours could be
988 applied individually to two characters.
989
990 Cycle:  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
991 Reads:  B               A               B               -
992 Pixels: 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7
993
994 Further reductions in attribute data access, offering 4 colours for every
995 character in a four character block, for example, might also be worth
996 considering.
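
Sharing one attribute byte between several characters amounts to splitting
the byte into equal bit fields. The following sketch shows one way this might
be done; the ordering of the fields (most significant first) is an arbitrary
assumption, as is the function name.

```python
def split_attribute(byte, characters):
    """Split an attribute byte into equal bit fields, one per character."""
    if characters not in (1, 2, 4):
        raise ValueError("a byte is shared by 1, 2 or 4 characters here")
    bits = 8 // characters
    mask = (1 << bits) - 1
    # Most significant field first (an arbitrary choice for this sketch).
    return [(byte >> (bits * (characters - 1 - i))) & mask
            for i in range(characters)]
```

With two characters per byte, each field has 4 bits and so selects one of 16
colours; with four characters per byte, each field has 2 bits and selects one
of 4 colours.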
997
998 Consider the following configurations for screen modes with a colour depth of
999 1 bit per pixel for bitmap information:
1000
1001 Screen width  Columns  Scaling  Bytes (B)  Bytes (A)  Colours  Screen start
1002 ------------  -------  -------  ---------  ---------  -------  --------------
1003 320           40       x2       40         40         256      &5300
1004 320           40       x2       40         20         16       &5580 -> &5500
1005 320           40       x2       40         10         4        &56C0 -> &5600
1006 208           26       x3       26         26         256      &62C0 -> &6200
1007 208           26       x3       26         13         16       &6460 -> &6400
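
The screen start addresses in the table are consistent with the following
calculation, which assumes 256 lines of bitmap data, 32 character rows each
needing one set of attribute bytes, and screen memory ending just below &8000.
The function name and defaults are illustrative.

```python
SCREEN_END = 0x8000   # assumption: screen memory ends just below this address


def attribute_mode_start(bitmap_bytes, attribute_bytes, lines=256, rows=32):
    """Return (screen start, page-aligned screen start) for an attribute mode."""
    # Bitmap data is needed for every line, attribute data once per character row.
    total = bitmap_bytes * lines + attribute_bytes * rows
    start = SCREEN_END - total
    return start, start & 0xFF00
```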
1008
1009 Enhancement: MODE 7 Emulation using Character Attributes
1010 --------------------------------------------------------
1011
1012 If the scheme of applying attributes to character regions were employed to
1013 emulate MODE 7, in conjunction with the MODE 6 display technique, the
1014 following configuration would be required:
1015
1016 Screen width  Columns  Rows  Bytes (B)  Bytes (A)  Colours  Screen start
1017 ------------  -------  ----  ---------  ---------  -------  --------------
1018 320           40       25    40         20         16       &5ECC -> &5E00
1019 320           40       25    40         10         4        &5FC6 -> &5F00
1020
1021 Although this requires much more memory than MODE 7 (8500 bytes versus MODE
1022 7's 1000 bytes), it does not need much more memory than MODE 6, and it would
1023 at least make a limited 40-column multicolour mode available as a substitute
1024 for MODE 7.
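
The memory figures quoted here follow from storing 8 lines of bitmap data for
each of the 25 rows, plus one set of attribute bytes per row. A brief sketch
of the arithmetic, with illustrative names and screen memory assumed to end
just below &8000:

```python
ROWS = 25             # character rows
LINES_PER_ROW = 8     # stored pixel lines per character row
SCREEN_END = 0x8000   # assumption: screen memory ends just below this address


def mode7_emulation_memory(bitmap_bytes, attribute_bytes):
    """Return (total bytes required, unaligned screen start address)."""
    bitmap = bitmap_bytes * LINES_PER_ROW * ROWS
    attributes = attribute_bytes * ROWS
    total = bitmap + attributes
    return total, SCREEN_END - total
```

With 40 bitmap bytes and no attribute bytes, the total is 8000 bytes, so the
16 colour configuration adds 500 bytes of attribute data to that figure.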
1025
1026 Enhancement: High Resolution Graphics
1027 -------------------------------------
1028
1029 Screen modes with higher resolutions and larger colour depths might be
1030 possible, but this would in most cases involve the allocation of more screen
1031 memory, and the ULA would probably then be obliged to page in such memory for
1032 the CPU to be able to sensibly access it all.
1033
1034 Enhancement: Genlock Support
1035 ----------------------------
1036
1037 The ULA generates a video signal in conjunction with circuitry producing the
1038 output features necessary for the correct display of the screen image.
1039 However, it appears that the ULA drives the video synchronisation mechanism
1040 instead of reacting to an existing signal. Genlock support might be possible
1041 if the ULA were made to be responsive to such external signals, resetting its
1042 address generators upon receiving synchronisation events.
1043
1044 Enhancement: Improved Sound
1045 ---------------------------
1046
1047 The standard ULA reserves &FE*6 for sound generation and cassette input/output
1048 (with bits 1 and 2 of &FE*7 being used to select either sound generation or
1049 cassette I/O), thus making it impossible to support multiple channels within
1050 the given framework. The BBC Micro employs &FE40-&FE4F (the system VIA) for
1051 sound control, and an enhanced ULA could adopt this interface.
1052
1053 The BBC Micro uses the SN76489 chip to produce sound, and the entire
1054 functionality of this chip could be emulated for enhanced sound, with a subset
1055 of the functionality exposed via the &FE*6 interface.
1056
1057 See: http://en.wikipedia.org/wiki/Texas_Instruments_SN76489
1058 See: http://www.smspower.org/Development/SN76489
1059
1060 Enhancement: Waveform Upload
1061 ----------------------------
1062
1063 As with a hardware sprite function, waveforms could be uploaded or referenced
1064 using locations as registers referencing memory regions.
1065
1066 Enhancement: Sound Input/Output
1067 -------------------------------
1068
1069 Since the ULA already controls audio input/output for cassette-based data, it
1070 would have been interesting to entertain the idea of sampling and output of
1071 sounds through the cassette interface. However, a significant amount of
1072 circuitry is employed to process the input signal for use by the ULA and to
1073 process the output signal for recording.
1074
1075 See: http://bbc.nvg.org/doc/A%20Hardware%20Guide%20for%20the%20BBC%20Microcomputer/bbc_hw_03.htm#3.11
1076
1077 Enhancement: BBC ULA Compatibility
1078 ----------------------------------
1079
1080 Although some new ULA functions could be defined in a way that is also
1081 compatible with the BBC Micro, the BBC ULA is itself incompatible with the
1082 Electron ULA: &FE00-7 is reserved for the video controller in the BBC memory
1083 map, but controls various functions specific to the 6845 video controller;
1084 &FE08-F is reserved for the serial controller. It therefore becomes possible
1085 to disregard compatibility where compatibility is already disregarded for a
1086 particular area of functionality.
1087
1088 &FE20-F maps to video ULA functionality on the BBC Micro which provides
1089 control over the palette (using address &FE21, compared to &FE08-F on the
1090 Electron) and other system-specific functions. Since the location usage is
1091 generally incompatible, this region could be reused for other purposes.
1092
1093 Enhancement: Increased RAM, ULA and CPU Performance
1094 ---------------------------------------------------
1095
1096 More modern implementations of the hardware might feature faster RAM coupled
1097 with an increased ULA clock frequency in order to increase the bandwidth
1098 available to the ULA and to the CPU in situations where the ULA is not needed
1099 to perform work. A ULA employing a 32MHz clock would be able to complete the
1100 retrieval of a byte from RAM in only 250ns and thus be able to enable the CPU
1101 to access the RAM for the following 250ns even in display modes requiring the
1102 retrieval of a byte for the display every 500ns. The CPU could, subject to
1103 timing issues, run at 2MHz even in MODE 0, 1 and 2.
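
The timing argument can be illustrated with a small calculation. It assumes
that fetching a byte occupies eight ULA clock cycles, which is what a 16MHz
clock and a rate of one byte per 2MHz cycle imply; the names used are
illustrative.

```python
DISPLAY_SLOT_NS = 500   # one display byte is needed every 500ns in MODE 0, 1 and 2


def byte_fetch_ns(clock_mhz, cycles_per_byte=8):
    """Time in nanoseconds to fetch one byte at the given ULA clock rate."""
    return cycles_per_byte * 1000 / clock_mhz


def cpu_window_ns(clock_mhz):
    """Time left in each display slot for a CPU access to the RAM."""
    return DISPLAY_SLOT_NS - byte_fetch_ns(clock_mhz)
```

At 16MHz the fetch consumes the whole 500ns slot, leaving nothing for the CPU,
whereas at 32MHz half of each slot would remain available.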
1104
1105 A scheme such as that described above would have a similar effect to the
1106 scheme employed in the BBC Micro, although the latter made use of RAM with a
1107 wider bandwidth in order to complete memory transfers within 250ns and thus
1108 permit the CPU to run continuously at 2MHz.
1109
1110 Higher bandwidth could potentially be used to implement exotic features such
1111 as RAM-resident hardware sprites or indeed any feature demanding RAM access
1112 concurrent with the production of the display image.
1113
1114 Enhancement: Multiple CPU Stacks and Zero Pages
1115 -----------------------------------------------
1116
1117 The 6502 maintains a stack for subroutine calls and register storage in page
1118 &01. Although the stack register can be manipulated using the TSX and TXS
1119 instructions, thereby permitting the maintenance of multiple stack regions and
1120 thus the potential coexistence of multiple programs each using a separate
1121 region, only programs that make little use of the stack (perhaps avoiding
1122 deeply-nested subroutine invocations and significant register storage) would
1123 be able to coexist without overwriting each other's stacks.
1124
1125 One way that this issue could be alleviated would involve the provision of a
1126 facility to redirect accesses to page &01 to other areas of memory. The ULA
1127 would provide a register that defines a physical page for the use of the CPU's
1128 "logical" page &01, and upon any access to page &01 by the CPU, the ULA would
1129 change the asserted address lines to redirect the access to the appropriate
1130 physical region.
1131
1132 By providing an 8-bit register, mapping to the most significant byte (MSB) of
1133 a 16-bit address, the ULA could then replace any MSB equal to &01 with the
1134 register value before the access is made. Where multiple programs coexist,
1135 upon switching programs, the register would be updated to point the ULA to the
1136 appropriate stack location, thus providing a simple memory management unit
1137 (MMU) capability.
1138
1139 In a similar fashion, zero page accesses could also be redirected so that code
1140 could run from sideways RAM and have zero page operations redirected to "upper
1141 memory" - for example, to page &BE (with stack accesses redirected to page
1142 &BF, perhaps) - thereby permitting most CPU operations to occur without
1143 inadvertent accesses to "lower memory" (the RAM) which would risk stalling the
1144 CPU as it contends with the ULA for memory access.
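
The address translation described above can be sketched as a simple function
of the CPU address and two page registers. The register names and their
defaults (which leave accesses unchanged) are assumptions made for
illustration.

```python
def redirect(address, stack_page=0x01, zero_page=0x00):
    """Translate a 16-bit CPU address using hypothetical page registers."""
    msb, lsb = address >> 8, address & 0xFF
    if msb == 0x01:
        # Stack accesses go to the configured physical page.
        msb = stack_page
    elif msb == 0x00:
        # Zero page accesses go to the configured physical page.
        msb = zero_page
    return (msb << 8) | lsb
```

For example, with the stack register set to &BF and the zero page register set
to &BE, an access to &01FF would be redirected to &BFFF and an access to &0070
to &BE70, while all other addresses would pass through unchanged.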
1145
1146 Such facilities could also be provided by a separate circuit between the CPU
1147 and ULA in a fashion similar to that employed by a "turbo" board, but unlike
1148 such boards, no additional RAM would be provided: all memory accesses would
1149 occur as normal through the ULA, albeit redirected when configured
1150 appropriately.
1151
1152 ULA Pin Functions
1153 -----------------
1154
1155 The functions of the ULA pins are described in the Electron Service Manual. Of
1156 interest to video processing are the following:
1157
1158 CSYNC (low during horizontal or vertical synchronisation periods, high
1159 otherwise)
1160
1161 HS (low during horizontal synchronisation periods, high otherwise)
1162
1163 RED, GREEN, BLUE (pixel colour outputs)
1164
1165 CLOCK IN (a 16MHz clock input, 4V peak to peak)
1166
1167 PHI OUT (a 1MHz, 2MHz or stopped clock signal for the CPU)
1168
1169 More general memory access pins:
1170
1171 RAM0...RAM3 (data lines to/from the RAM)
1172
1173 RA0...RA7 (address lines for sending both row and column addresses to the RAM)
1174
1175 RAS (row address strobe setting the row address on a negative edge - see the
1176 timing notes)
1177
1178 CAS (column address strobe setting the column address on a negative edge -
1179 see the timing notes)
1180
1181 WE (sets write enable with logic 0, read with logic 1)
1182
1183 ROM (select data access from ROM)
1184
1185 CPU-oriented memory access pins:
1186
1187 A0...A15 (CPU address lines)
1188
1189 PD0...PD7 (CPU data lines)
1190
1191 R/W (indicates CPU write with logic 0, CPU read with logic 1)
1192
1193 Interrupt-related pins:
1194
1195 NMI (CPU request for uninterrupted 1MHz access to memory)
1196
1197 IRQ (signal event to CPU)
1198
1199 POR (power-on reset, resetting the ULA on a positive edge and asserting the
1200 CPU's RST pin)
1201
1202 RST (master reset for the CPU signalled on power-up and by the Break key)
1203
1204 Keyboard-related pins:
1205
1206 KBD0...KBD3 (keyboard inputs)
1207
1208 CAPS LOCK (control status LED)
1209
1210 Sound-related pins:
1211
1212 SOUND O/P (sound output using internal oscillator)
1213
1214 Cassette-related pins:
1215
1216 CAS IN (cassette circuit input, between 0.5V and 2V peak to peak)
1217
1218 CAS OUT (pseudo-sinusoidal output, 1.8V peak to peak)
1219
1220 CAS RC (detect high tone)
1221
1222 CAS MO (motor relay output)
1223
1224 ÷13 IN (~1200 baud clock input)
1225
1226 ULA Socket
1227 ----------
1228
1229 The socket used for the ULA is a 3M/TexTool 268-5400 68-pin socket.
1230
1231 References
1232 ----------
1233
1234 See: http://bbc.nvg.org/doc/A%20Hardware%20Guide%20for%20the%20BBC%20Microcomputer/bbc_hw.htm
1235
1236 About this Document
1237 -------------------
1238
1239 The most recent version of this document and accompanying distribution should
1240 be available from the following location:
1241
1242 http://hgweb.boddie.org.uk/ULA
1243
1244 Copyright and licence information can be found in the docs directory of this
1245 distribution - see docs/COPYING.txt for more information.