What's the proper implementation for hardware emulation?
I'm going to programme a Game Boy emulator (Z80 is the CPU in case somebody is not familiar with it), and while I was doing my research, I've found some things I'm not so sure about.
The first one was that C is the programming language to choose here. That's not so much of a problem, but I'd like to hear your opinion from today's point of view. Even C++ was not recommended.
The second thing I found out was that everybody was using one function per opcode. That seems logical since it's just one function call and probably better optimised than having one function for the "ADD" instruction and then you've got to find out what registers are used here. But how necessary is that today? Is it something I should stick to or should I rather rewrite my emulator if I notice that another way which might be more convenient just doesn't cut it (more or less modern gaming consoles pop into my mind right now)?
Also, it's kind of demotivating to write a function for "add that register to this register" over and over again. Is there a way to automate that from an opcode map or something like that?
I mostly agree with WingsOfIcarus. I wrote a few emulators already so here is my insight:
- The use of function pointers is a good idea (for speed and clarity of code)
-
OOP is not a problem
Yes, member calls are a little bit slower, but if you are careful it will not affect performance too much. On the other hand, OOP emulation code is much better to manage/read/understand.
-
Use an instruction database instead of fixed instruction decoding.
I am using a single text file which consist of all the necessary information for all instructions. The emulator parses it during initialization (feeds the arrays of function pointers and operands...). In this architecture it is very easy to correct errors in the instruction set without any code change.
Complex instruction sets documentation are almost always faulty to some point. The worst case is Z80 (I have never see a 100% error-free instruction set). So use more instruction sets, compare them and create an error-free set (if you can).
-
Add sound, video, keyboard and mouse to your emulation
This is usually not a problem. On Windows use WaveOut instead of DirectSound. It's more stable, much faster (usable latencies of DSound are sometimes even > 400 ms). With WaveOut I was able to lover latency to 20-80 ms which is OK.
-
Apply limit speed by T cycles of emulated CPU per second
I am using machine cycle correct timings which is much slower, but allows me to correctly implement any hardware periphery emulations as (FDC, DMAC, sound chips,...without any hacks)
Apply load/save of files for the emulated platform
For example, this is part of my instruction set (which is directly fed to CPU emulation:
opc T0 T1 MC1 MC2 MC3 MC4 MC5 MC6 MC7 mnemonic
B8 04 00 M1R 4 ... 0 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,B
B9 04 00 M1R 4 ... 0 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,C
BA 04 00 M1R 4 ... 0 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,D
BB 04 00 M1R 4 ... 0 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,E
BC 04 00 M1R 4 ... 0 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,H
BD 04 00 M1R 4 ... 0 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,L
BE 07 00 M1R 4 MRD 3 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,(HL)
BF 04 00 M1R 4 ... 0 ... 0 ... 0 ... 0 ... 0 ... 0 CP A,A
C0 11 05 M1R 5 MRD 3 MRD 3 ... 0 ... 0 ... 0 ... 0 RET NZ
C1 10 00 M1R 4 MRD 3 MRD 3 ... 0 ... 0 ... 0 ... 0 POP BC
C2L2H2 10 10 M1R 4 MRD 3 MRD 3 ... 0 ... 0 ... 0 ... 0 JP NZ,U16
C3L1H1 10 00 M1R 4 MRD 3 MRD 3 ... 0 ... 0 ... 0 ... 0 JP U16
C4L2H2 17 10 M1R 4 MRD 3 MRD 4 MWR 3 MWR 3 ... 0 ... 0 CALL NZ,U16
C5 11 00 M1R 5 MWR 3 MWR 3 ... 0 ... 0 ... 0 ... 0 PUSH BC
C6U2 07 00 M1R 4 MRD 3 ... 0 ... 0 ... 0 ... 0 ... 0 ADD A,U8
C7 11 00 M1R 5 MWR 3 MWR 3 ... 0 ... 0 ... 0 ... 0 RST 00H
C8 11 05 M1R 5 MRD 3 MRD 3 ... 0 ... 0 ... 0 ... 0 RET Z
C9 10 00 M1R 4 MRD 3 MRD 3 ... 0 ... 0 ... 0 ... 0 RET
CAL2H2 10 10 M1R 4 MRD 3 MRD 3 ... 0 ... 0 ... 0 ... 0 JP Z,U16
opc: operation code [hex]
L1,H1,U1,S1 means first operand direct number or address
L2,H2,U2,S2 means second operand direct number or address
L3,H3,U3,S3 means third operand direct number or address
H,L ... U16 high and low byte
U ... U8 unsigned byte
S ... S8 signed byte
T0 normal instruction duration [T] always 2 decimal digits
T1 instruction duration if condition not met [T] always 2 decimal digits
MC1++ Machine cycle first is type,second is duration [T] always 1 decimal digit
... unused
M1R M1 cycle
MRD memory read
MWR memory write
IOR IO read
IOW IO write
NON no external operation (internal computation)
INT interrupt cycle
mnem instruction text (mnemonic)
-
opc
is used for the address in an array of pointers -
mnemonic
is used to select the proper function pointer, and operands type -
T0
andT1
are used for instructions timing (this is enough for rough emulations) -
MC1++
are used for correct MC timings (to implement correct hardware emulation and contentions timing)
Here is my Zilog Z80A complete instruction set with machine cycle timing link for download. Feel free to use (just mention my nick somewhere). After porting to this I was finally able to 100% pass the ZEXALL test. For more info see Writing a graphical Z80 emulator in C or C++.
First suggestion, you shouldn't use nested switch statements, you should rather use array of function pointers, alot faster -> better emulation, and nicer code, nested switch-es can also get a bit messy, here are some links where you can read more about these arrays
http://www.newty.de/fpt/fpt.html
http://www.multigesture.net/wp-content/uploads/mirror/zenogais/FunctionPointers.htm
Second suggestion, Yes you can do it in C#, Java, C++, but since you want every single bit of your CPU cycles so you can get as close emulation as possible - emulating one CPU cycle of target architecture with least number of CPU cycles on curret architecture, and OOP isn't so good in this case from what I heard/read from people. One of the things is performance, and second is pretty much obvious, emulation is, as you probably noticed, really complex task and wraping it in OOP can be unnecessary pain in the neck.
Here's a pretty cool implementation of working with some opcodes for an NES emulator:
http://bisqwit.iki.fi/jutut/kuvat/programming_examples/nesemu1/
Here's the accompanying youtube videos that have a little more explanation as to what's going on
http://www.youtube.com/watch?v=y71lli8MS8s
It uses C++ templates and some additional C++11 features. As to whether you choose C++ or C that is up to you but it shouldn't really matter a whole lot. If you're just emulating a gameboy I doubt that speed is going to be an issue on modern processors so try to just use whatever you're comfortable with.