What should the future of MSX be?

     The last enhancements to the basic MSX hardware were made long ago, in 1990. MSX, and especially MSX2, were good computers: maybe not the fastest of their time, but with good graphics and sound. Since then the "big brothers" have moved far ahead. Almost nobody wants to live in a command-line environment now, and windowed systems want lots of RAM and fast processors. Of course, good coding can give you a working windowed system even on an 8-bitter, but here we run into the performance limits of the system. Good games also need a faster processor, and MSX is famous for good games. So what shall we do next? The answer is: make a faster MSX.

The history of MSX380
     I was always interested in speeding up my MSX. Working on huge projects like BASIC or C compilers, heavyweight system utilities and so on, I desperately needed to cut compile and build times. And, since I'm no stranger to hardware design, I worked on the problem.
     Naturally, I was the first in Russia (back then, in the USSR) to make a working turbo version of a regular Yamaha MSX. At first it ran at 8 MHz, and after a half-year testing period it became 10.5 MHz, triple the speed of the original. This seemed to be the upper limit of what MSX RAM and ROM can handle. Any further speedup seemed to require major hardware modifications, and I didn't want to torture my beloved computer, which had already suffered a lot during my experiments. But triple speed was not enough for me. What next?

MSX180. The first step.
     After discussing the problem with Max Vlasov, an MSX fan, hardware designer and programmer well known in Russia, we decided to make an accelerator board for MSX. If the RAM doesn't work at higher speeds, why not put faster RAM together with the CPU on a separate board and connect it to the MSX in such a way that it emulates the signals of the original CPU? In this case the CPU works at top speed all the time, until it needs to access some standard MSX peripheral. We also thought about using new, faster peripherals, so we added an expansion bus to our board. You can see what we made in the picture below.

This is the MSX 380 prototype. You can see the 20 MHz Z180 in a PLCC package; the bigger chip is a FlexLogic FPGA, which holds most of the circuitry. The smaller DIP packages are bus buffers.

     We decided to start with the Z180, and there were several reasons for that. First, the Z380 was hard to find. There is only one major Zilog dealer in Russia, and it didn't stock Z380s because nobody was using them. Second, the PLCC package of the Z180 is much easier to experiment with than the Z380's PQFP. And finally, we didn't have any documentation on the Z380 back then. No pinout, no instruction set description, no timings!
     The work was hard. Since the Z180 was intended for embedded systems, it had a lousy DRAM interface. It couldn't work with DRAM at 20 MHz. Period. After examining the problem, Max came up with a cool design that implements the DRAM's static page mode, and it solved all the problems! Our final version works at 20 MHz with ZERO wait states, using simple, plain 30-pin SIMMs. An IDE interface was also added to the board, so you can directly attach an HDD or ATAPI CD-ROM. It also works at top speed, which means up to 5 megabytes per second (the Z180 DMA channel is used in this case)! The computing speed of this little monster is 1666 Dhrystones (tested with Dhrystone v1.0; a normal MSX scores below 200).

How does it work?
     I publish only a simplified diagram here. I could put up the complete circuit diagram, but I think nobody needs it, and it would only take up lots of space on my server.

The block diagram of MSX180.

     A 20 MHz Z180 is the heart of the design (a 33 MHz part can also be used, but it requires either EDO chips mounted without a SIMM or one wait state per memory access). The data and address buses are buffered and fed to the pins of a 40-pin connector, which has the Z80 pinout and is inserted in place of the standard Z80 inside your MSX. You don't even have to remove the original CPU, because it is shut off by means of the HOLD signal, which is continuously fed to the appropriate leg of the Z80. All control signals are fed into the FPGA, translated, and output to the Z80 connector, emulating the exact timing of a 3.58 MHz Z80. This bus speed can be raised, up to the full 20 MHz, but not all MSX mainboards can handle that. Some can (the Yamaha YIS-805 and Daewoo CPC-400 are among them; I still don't completely understand how). Special care is taken when writing to the VDP: additional wait states are inserted to ensure the documented 8 us and 2 us delays. Of course, they are inserted only when required, to achieve the fastest possible CPU<->VDP transfer.
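The "insert wait states only when required" idea can be illustrated with a little arithmetic. This is only a sketch, not the actual FPGA logic: it assumes the FPGA counts CPU clocks since the previous VDP access, and the function name and parameters are invented for illustration.

```python
CPU_CLOCK_HZ = 20_000_000  # Z180 clock from the text (50 ns per cycle)

def vdp_wait_states(elapsed_cycles: int, required_gap_ns: int,
                    clock_hz: int = CPU_CLOCK_HZ) -> int:
    """Wait states needed before the next VDP access (hypothetical helper).

    elapsed_cycles  -- CPU clocks since the previous VDP access
    required_gap_ns -- minimum gap the VDP needs, e.g. 8000 or 2000 ns
    Returns 0 when enough time has already passed.
    """
    # Round the required gap up to a whole number of CPU clocks.
    required_cycles = -(-required_gap_ns * clock_hz // 1_000_000_000)
    return max(0, required_cycles - elapsed_cycles)
```

At 20 MHz an 8 us gap is 160 clocks, so a program that already spent 100 clocks on other work between two VDP accesses would be stalled for only the remaining 60.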
     As mentioned earlier, the DRAM works in static page mode. This means 20-30 ns access times for a normal 30-pin SIMM. This is enough not only for zero wait state operation at 20 MHz but also for hidden refresh, which does not eat CPU cycles like the normal Z80 refresh does. The Z180 works in zero wait state, 3-clocks-per-instruction mode, which gives you almost 7 million instructions per second.
     The SIMM is a conventional 30-pin one; these are very cheap now. All sizes (256K, 1M or 4M) can be used, with auto-detection of the size. The mapper is built into the FPGA chip, as are copies of the slot-selection registers. On-board RAM appears in slot 3-2, which is common for many MSX designs. Since the FPGA is reprogrammable, the slot address of the RAM can be changed to any other. The ROM BIOS is used from the mainboard, as are all other peripherals. This is a cost-effective solution, because most programs run under DOS now, and the BIOS can be copied to on-board RAM for speed. When you have 4 megs, why not spend 64K on the BIOS?
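To make the SIMM sizes concrete: with the standard 16K MSX mapper page, 256K, 1M and 4M give 16, 64 and 256 pages. The text does not say how the size is auto-detected, so the probe below is only one common approach (looking for address wrap-around), shown against a simulated SIMM. The class and function names are hypothetical.

```python
PAGE_SIZE = 16 * 1024  # standard MSX mapper page size

def mapper_pages(simm_bytes: int) -> int:
    """Number of 16K mapper pages a SIMM of the given size provides."""
    return simm_bytes // PAGE_SIZE

class FakeSimm:
    """Simulated RAM: page numbers wrap around at the real size."""
    def __init__(self, size_bytes: int):
        self.pages = mapper_pages(size_bytes)
        self.first_byte = [0] * self.pages

    def write(self, page: int, value: int) -> None:
        self.first_byte[page % self.pages] = value

    def read(self, page: int) -> int:
        return self.first_byte[page % self.pages]

def detect_pages(ram, candidates=(16, 64, 256)) -> int:
    """Assumed detection method: find the smallest size that aliases page 0.

    Destroys the first byte of the probed pages, so it is meant to run
    before anything useful is stored in RAM.
    """
    for pages in candidates[:-1]:
        ram.write(0, 0x55)
        ram.write(pages, 0xAA)  # hits page 0 if only `pages` pages exist
        if ram.read(0) == 0xAA:
            return pages
    return candidates[-1]       # no aliasing seen: largest supported size
```

For example, `detect_pages(FakeSimm(1024 * 1024))` probes page 16 (no alias), then page 64, which wraps onto page 0 and identifies a 1M module.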
     The IDE interface is built according to the ATA level 3 standard, which includes DMA operation. When the Z180 works in parallel DMA mode, it can read or write up to 2 megabytes of data per second while the CPU does something else in parallel. The CPU then runs at half speed, but that's still 10 MHz, almost three times the MSX speed! Cool for games, I think.

MSX380
     The MSX380 accelerator, which uses an 18 MHz Z380 CPU, is the successor to the MSX180 project. Its structure is very similar to the MSX180, with only a few differences. Since the Z380 has a 16-bit external bus, we redesigned the IDE interface to remove the 8-to-16-bit bridge. For the same reason we added such a bridge on the Z80 connector. We also changed the SIMM to a modern 72-pin one and added a fast 16-bit on-board BIOS ROM. The Dhrystone score of the accelerator is approximately 2700.

MSX380, the first prototype.

     Since the Z380 is a 16-bit processor with 4G of address space, it would be foolish to waste such a good address space on 100% MSX compatibility. This led to the following design innovations.

Multi-layer mapper
This is used for two tasks: accessing memory modules larger than 4M, and easy multitasking with memory space protection. The FPGA contains 256 registers, each holding a long (11-bit) physical address for the given logical mapper page. Each register also contains a modification bit, so the operating system can see whether the page contents were altered by the CPU. By initializing these registers you can make the Z380 see anything in its lower 64K of RAM, from a 64K to a 4M mapper: exactly what your program requires!
Bootstrap ROM
This was added for two purposes: exiting from the Z380's Extended Mode, which can be done only by reset, and implementing some memory protection (this uses NMI, which is activated when the program does something wrong, and, oops! you get that "Guru meditation" message on the screen).
Extended mode BIOS
This is not written yet. Located in the upper gigabyte of the address space, it will contain basic I/O routines, memory allocation/deallocation and anything else required for stable and fast operation of the Z380 in Extended Mode.
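The multi-layer mapper described above can be modelled in a few lines. This is a sketch under stated assumptions, not the FPGA design: it assumes 16K pages (the standard MSX mapper page size), that the 8-bit value written to a normal mapper register selects one of the 256 table entries, and that the modification bit is set on any write. All names are hypothetical.

```python
PAGE_SIZE = 16 * 1024   # assumed: standard MSX mapper page size
PHYS_PAGE_MAX = 0x7FF   # 11-bit physical page number, per the text

class MultiLayerMapper:
    """Toy model: 256 logical mapper pages -> 11-bit physical pages."""

    def __init__(self):
        self.phys_page = list(range(256))  # start with identity mapping
        self.modified = [False] * 256      # one modification bit per entry

    def map_page(self, logical: int, physical: int) -> None:
        if not 0 <= logical < 256:
            raise ValueError("logical page must fit in 8 bits")
        if not 0 <= physical <= PHYS_PAGE_MAX:
            raise ValueError("physical page must fit in 11 bits")
        self.phys_page[logical] = physical
        self.modified[logical] = False     # fresh mapping starts clean

    def translate(self, logical: int, offset: int, write: bool = False) -> int:
        """Return the physical address; record writes in the modification bit."""
        if not 0 <= offset < PAGE_SIZE:
            raise ValueError("offset must be inside a 16K page")
        if write:
            self.modified[logical] = True
        return self.phys_page[logical] * PAGE_SIZE + offset
```

Under the 16K-page assumption, 11 bits of page number cover 2048 pages, i.e. 32M of physical RAM, while a task still sees an ordinary mapper of up to 256 pages. An operating system could give each task its own table contents and check the modification bits to decide which pages need saving.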

The memory map of MSX380 is as shown below:

00000000-0000FFFF  Standard Z80 MSX address space. Slot selection and mapper can work here, like on normal MSX.
00010000-3FFFFFFF  This is reserved for Extended mode RAM. Nobody wants to place a gigabyte on MSX, but RAM becomes cheaper and cheaper, so... who knows?
40000000-BFFFFFFF  This HUGE hole is reserved for PCI bridge which will probably appear in the future.
C0000000-FFFFFFFF  And this - for Extended mode BIOS and (probably) for future BIOS extensions.
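The memory map above can also be written down as a small lookup table, which makes it easy to check that the four regions cover the whole 4G space without gaps. The region labels and the function name are just illustrative.

```python
# (first address, last address, purpose), taken from the memory map above
REGIONS = [
    (0x00000000, 0x0000FFFF, "Z80 MSX address space"),
    (0x00010000, 0x3FFFFFFF, "Extended mode RAM"),
    (0x40000000, 0xBFFFFFFF, "reserved for PCI bridge"),
    (0xC0000000, 0xFFFFFFFF, "Extended mode BIOS"),
]

def region_of(address: int) -> str:
    """Return the purpose of the region containing a 32-bit address."""
    for first, last, purpose in REGIONS:
        if first <= address <= last:
            return purpose
    raise ValueError("address outside the 32-bit address space")

def is_contiguous() -> bool:
    """True if each region starts right after the previous one ends."""
    return all(REGIONS[i][1] + 1 == REGIONS[i + 1][0]
               for i in range(len(REGIONS) - 1))
```

For instance, `region_of(0x8000)` falls in the Z80 MSX address space, and `region_of(0x10000)` is the first byte of Extended mode RAM.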

The mapping of the various I/O registers has been determined, but it is subject to change. We definitely don't want to interfere with any currently used MSX extension hardware.

MSX380 paused... What about the future?
     We got bad news recently... Zilog doesn't want to produce the Z380 any more. Bye-bye, plans for a 33 or 40 MHz Z380-powered MSX. They seem set to discontinue it in the near future in favour of the newer Z382 CPU. Unfortunately the Z382, while running at 33 MHz, is absolutely unusable as an MSX CPU, because all of its I/O space is already used by various internal devices, like serial I/O, DMA controllers and so on. This makes it a good embedded processor, but not an MSX one. Z380s are still available, but who knows for how long. This, as well as some financial difficulties (read: Russians never have enough money), caused us to PAUSE the project, as well as the design of a brand new MSX motherboard.

     Where do we go if the Z380 becomes unavailable? We have some alternatives:
First, we can take a modern, zillion-megahertz CPU and use it for Z80 emulation. Everything else would be handled either by standard hardware or by another zillion-MHz CPU emulating it. RISC processors come to mind for the CPU, and DSPs for peripheral emulation.
Second, we can still use a RISC chip for the CPU and build the peripherals from scratch (oops!!! sorry, from widely available FPGA chips). They are fast, easy to simulate and, most significantly, cheap. And they keep getting cheaper.
And lastly, if you are a really anal type like me or Max, you can build your own fast Z80. Why not? A VHDL description of a basic Z80 is available (not widely, but you can get it if you really want to play with it). FPGA chips capable of holding the Z80 structure are also available (not so cheap, however; your brand new Z80 will cost as much as a Pentium). So all you need is time, time, time...

What can be done with the Z80 to optimize its performance (not to mention the higher clock rates possible on those 150 or 200 MHz FPGAs)?

+ Built-in MSX memory selection circuits
Why put off-chip what can be put on-chip? Besides, locating the MMU on the same chip as the CPU will improve timing and help to optimize memory access.

+ DRAM access in static page or EDO mode
Why not put the address multiplexers on the chip as well? Modern DRAM can work with 10 ns cycles; that's faster than conventional static RAM. (Cache memory is another topic, and it is expensive.)

+ 32-bit data path
ANY Z80 instruction in a single memory cycle. Sounds good, doesn't it? And since all we have now are 32-bit, 72-pin SIMMs, this would be a wise enhancement.

+ Internal prefetch queue
This works together with the 32-bit memory access. Since we already have the MMU on-chip, we will not have any problems with purging the queue when switching slots or mapper pages.

+ Jump prediction
Another logical solution. Why not look for the 0C3h opcode in the prefetch queue and modify the prefetch address accordingly?
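The jump prediction idea can be sketched as a scan of the prefetch queue. 0C3h is the Z80 unconditional "JP nn" opcode, followed by the 16-bit target in low-byte-first order. The function below is a deliberately naive illustration with an invented name: it treats any 0C3h byte as an opcode, while a real design would have to track instruction boundaries, because 0C3h can also occur as operand data.

```python
JP_NN = 0xC3  # Z80 unconditional "JP nn" opcode

def next_prefetch_address(queue: bytes, queue_addr: int) -> int:
    """Pick the address to prefetch from next (naive sketch).

    queue      -- bytes already fetched, starting at queue_addr
    queue_addr -- 16-bit address of the first byte in the queue
    If a complete JP nn is found in the queue, prefetch continues at its
    target; otherwise it continues sequentially after the queue.
    """
    for i in range(len(queue) - 2):          # need opcode + 2 operand bytes
        if queue[i] == JP_NN:
            return queue[i + 1] | (queue[i + 2] << 8)  # little-endian target
    return (queue_addr + len(queue)) & 0xFFFF          # wrap at 64K
```

With a 4-byte queue holding 00h, C3h, 34h, 12h fetched from 0100h, the next prefetch would go to 1234h instead of 0104h.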

Sounds cool... The only problem is to sit down and do it. Maybe... sometime. I wish I could do this; MSX should not die in the 21st century.


Egor G. Voznessenski
June, 1997
Moscow.
