Visual Boy Advance-M


VBA-M Contributor

About Exophase

  • Birthday 08/07/1983
  1. Exophase

    any chance for N900 build?

    VBA is too slow for OMAP3530, ports to Pandora also confirm that. For right now you're better off waiting for a port of gpSP instead (but I'm not going to be doing one).
  2. Yes, if you're willing to invite all of the challenges associated with dynamic recompilers, like dealing with self-modifying code and branches. When you just want a one-to-one correspondence you can use a direct mapped decode cache. This is what the author of Gemulator likes to do (see emulators.com). This makes it so you don't have to perform any block tracking and mapping, but you still have to invalidate cache lines on writes to accommodate self-modifying code. In this case, it's actually just as well to use a lookup table to convert any Thumb opcode to an ARM equivalent (or a special undefined ARM code where there's no equivalent). The input is only 16 bits, and the output is nominally 32 bits. You'll likely need another bit of output, though, to determine whether the PC offsets should be halfword aligned or not, because Thumb behavior regarding this is inconsistent. This method would probably be superior to any kind of cached conversion. For your purposes, that is, only using interpreters, eliminating code is really the only benefit. Performance would be substantially worse because of the time needed to convert from Thumb to ARM, and also because ARM instructions are just a lot more expensive to emulate in general. In a usual GBA game over half the instructions executed will be Thumb, so you'd be hurting speed a ton across the board. I went with this technique in the emulator I'm currently writing, but only because I was never planning on relying on the interpreter for speed (did a recompiler for that) and I wanted to have fewer points of failure. For recompilation, where the conversion cost isn't that big of a deal, it's nice to have less code to work with, and having one instruction set instead of two means you don't have to write a bunch of analysis and optimization code twice. Another benefit is that it gives you room to perform propagation from Thumb to ARM, resulting in ultimately better code. So in short, I think you shouldn't do it.
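A rough sketch of the lookup-table idea, for illustration only: it translates just two Thumb formats (MOV Rd, #imm8 and three-register ADD) and maps everything else to a sentinel. The names thumb_table, build_thumb_table, thumb_to_arm and the ARM_UNDEFINED value are made up for this example, and the extra PC-alignment bit mentioned above is left out.

```c
#include <stdint.h>

/* Hypothetical sentinel: an encoding from the ARM undefined-instruction
   space, used to mark Thumb opcodes this sketch does not translate. */
#define ARM_UNDEFINED 0xE7F000F0u

/* Translate one Thumb opcode to an ARM equivalent (two formats only). */
static uint32_t convert_thumb(uint16_t op)
{
    if ((op & 0xF800u) == 0x2000u) {            /* Thumb MOV Rd, #imm8 */
        uint32_t rd  = (op >> 8) & 7u;
        uint32_t imm = op & 0xFFu;
        return 0xE3B00000u | (rd << 12) | imm;  /* ARM MOVS Rd, #imm8 */
    }
    if ((op & 0xFE00u) == 0x1800u) {            /* Thumb ADD Rd, Rs, Rn */
        uint32_t rn = (op >> 6) & 7u;
        uint32_t rs = (op >> 3) & 7u;
        uint32_t rd = op & 7u;
        return 0xE0900000u | (rs << 16) | (rd << 12) | rn; /* ARM ADDS Rd, Rs, Rn */
    }
    return ARM_UNDEFINED;
}

/* One 32-bit ARM word per possible 16-bit Thumb opcode (256 KB total). */
static uint32_t thumb_table[65536];

void build_thumb_table(void)
{
    for (uint32_t op = 0; op < 65536u; op++)
        thumb_table[op] = convert_thumb((uint16_t)op);
}

/* At run time the conversion is a single indexed load. */
uint32_t thumb_to_arm(uint16_t op)
{
    return thumb_table[op];
}
```

Because the table is keyed on the opcode rather than the address, nothing has to be invalidated when code is overwritten, which is the advantage over a decode cache.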
  3. Exophase

    VBA2 progress

    You should have taken the comments on this matter that I posted several months ago more to heart. It's possible to emulate all of GBA video using shaders, but it'd require some damn complex ones with lots of texture lookups and multiple passes. A DS scener by the name of WolfgangSt is currently writing a DS emulator (or at least he was, haven't seen any public updates in a few months) that renders in hardware. But it's still very incomplete, and I told him a lot of the same things I told you. At any rate, he's doing very little caching, meaning that he fetches tile maps and paletted pixels using shader code. This means that rendering a quad with that shader results in it actually rendering those pixels from a layer like a DS would. Raster effects weren't a problem because it could render a line at a time this way. But I still don't know how he intends on handling blending or OBJ windows. I think that you should step back and determine what your basic high level goals are. If your main goal is something faster than VBA's current renderer then that's very attainable. If you want something cleaner then you're probably going to have to make a compromise somewhere. If you want it on hardware so that you can tie in shader effects at an earlier stage of the rendering pipeline then you're probably not going to get it.
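To make the "fetch tile maps and paletted pixels with no caching" idea concrete, here is the same lookup chain written as plain C rather than shader code. It is only a sketch and assumes the usual GBA text-mode layout for a single 256x256, 4bpp background (32x32 map entries, 32-byte tiles, 16-colour palette banks). The function name and parameters are invented for the example and are not taken from any emulator.

```c
#include <stdint.h>

/* Fetch one pixel of a 256x256, 4bpp text background directly from
   VRAM-like arrays. Returns a 15-bit colour, or -1 if the pixel is
   transparent (colour index 0). */
int32_t fetch_text_bg_pixel(const uint16_t *map,     /* 32x32 map entries     */
                            const uint8_t *tiles,    /* 32 bytes per 4bpp tile */
                            const uint16_t *palette, /* 256 colour entries    */
                            unsigned x, unsigned y)
{
    x &= 255u;
    y &= 255u;

    /* Step 1: map lookup gives tile number, flip flags and palette bank. */
    uint16_t entry = map[(y / 8u) * 32u + (x / 8u)];
    unsigned tile = entry & 0x03FFu;
    unsigned bank = entry >> 12;

    /* Step 2: position inside the 8x8 tile, honouring the flip bits. */
    unsigned px = x & 7u;
    unsigned py = y & 7u;
    if (entry & 0x0400u) px = 7u - px;  /* horizontal flip */
    if (entry & 0x0800u) py = 7u - py;  /* vertical flip   */

    /* Step 3: two pixels per byte, low nibble is the left pixel. */
    uint8_t byte = tiles[tile * 32u + py * 4u + px / 2u];
    unsigned index = (px & 1u) ? (unsigned)(byte >> 4) : (unsigned)(byte & 0x0Fu);
    if (index == 0)
        return -1;

    /* Step 4: palette lookup. */
    return palette[bank * 16u + index];
}
```

Every output pixel costs three dependent memory reads here, which is what makes the shader version heavy on texture lookups.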
  4. Exophase

    VBA2 progress

    The projection effect is accomplished by changing the affine transformation step values throughout the frame, up to every scanline. This is usually done with HDMA but you can do it with HIRQs. I don't know specifically about the sky in F-Zero X right now, but the part above the horizon line is often accomplished by changing the screen mode in the middle of the screen and using a normal text BG for it - that's how Mario Kart does it. It can also be done with sprites, or by changing the affine parameters to point to it (but this is a lot trickier to manage). In mode 1 you can use one of the text BGs on top of the affine one. Rendering the entire screen at once is going to give you massive compatibility problems. As you can see from the two examples in this post alone, a lot of games change what's on the screen while it's being drawn. Games use it for lots of other things too, like color gradients and wavy scrolling. That's why emulators of just about every 2D platform use per-line renderers instead of per-frame ones. In fact, in some ways per-frame is slower because you're working with frame sized buffers instead of line sized ones. This applies to priority combining and alpha blending. You'll end up with a lot more L1 misses. One of the big speed hits from typical line based renderers vs frame based ones is that they will check every sprite every line to see if it should be drawn. You can improve upon this approach by caching which sprites are on which lines, then only rebuilding that table if OAM has been modified before the line. Usually OAM is only modified in vblank so you end up recreating the table once per frame instead of once per line. Even the few games that perform sprite multiplexing only do so a few times per frame. I've mentioned these things in another thread, but I think that trying to do the layer compositing in OpenGL is just asking for trouble.
You're not going to be able to correctly emulate blending so easily (or mosaic, but I think most people don't really care about mosaic anyway). The only way I can think of doing it is with a relatively complex depth peeling approach that would require at least three passes. OBJ windows are also going to be pretty problematic. Caching the backgrounds into bitmaps (correct me if this is not what you're doing) might also prove a little unpleasant - basically, what is your strategy for dealing with modifications to them? Do you plan to recreate the entire background when any part of the map is modified? Or just a part of it? Unfortunately, in OpenGL modifying textures can be slow, even if it's just a part of it (sometimes especially if it's just a part of it). An even bigger issue is what happens when the tiles themselves are modified, since it can be expensive to track where in a map a tile is being used. You may end up having to re-cache every map any time any tile is modified. Some games modify tiles to animate them, and if you recache the entire maps you'll end up with something that's probably substantially slower than a full software renderer. Oh yeah, forgot about palette modifications, another common per-frame mod. I guess you're basically going to be stuck recaching the whole map a lot no matter what. Layers are at least 256x256, already a decent amount more than the work rendering them to 240x160, but they can be up to 512x512. Affine maps at up to 1024x1024 are quite a bit worse, although there you do save on not having to transform them. One more comment: if you found the bitmap modes extremely easy then you're probably not emulating them completely. I say this because the bitmaps are affine transformed like the affine layers in modes 1 and 2.
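The sprite-per-line caching mentioned earlier in this post could look roughly like the sketch below. It assumes a heavily simplified sprite record (just a Y position, height and enable flag, with no wrap-around at the bottom of the screen), and all names (sprite_line_cache, oam_write, sprites_on_line) are hypothetical. The rebuilds counter is only there to show how often the table is recreated.

```c
#include <stdint.h>
#include <string.h>

#define NUM_SPRITES 128
#define NUM_LINES   160

/* Simplified sprite record; real OAM entries hold far more state. */
struct sprite { int y; int height; int enabled; };

struct sprite_line_cache {
    uint8_t ids[NUM_LINES][NUM_SPRITES]; /* sprite indices per scanline  */
    uint8_t count[NUM_LINES];            /* valid entries per scanline   */
    int dirty;                           /* set whenever OAM is written  */
    int rebuilds;                        /* statistics only              */
};

/* Every OAM write goes through here so the cache knows it is stale. */
void oam_write(struct sprite_line_cache *c, struct sprite *oam,
               int id, struct sprite s)
{
    oam[id] = s;
    c->dirty = 1;
}

/* Recreate the per-line lists from scratch. */
static void rebuild(struct sprite_line_cache *c, const struct sprite *oam)
{
    memset(c->count, 0, sizeof c->count);
    for (int id = 0; id < NUM_SPRITES; id++) {
        if (!oam[id].enabled)
            continue;
        for (int line = oam[id].y; line < oam[id].y + oam[id].height; line++) {
            if (line < 0 || line >= NUM_LINES)
                continue;
            c->ids[line][c->count[line]++] = (uint8_t)id;
        }
    }
    c->dirty = 0;
    c->rebuilds++;
}

/* Called once per scanline by the renderer; rebuilds only if OAM changed
   since the previous call. Returns the number of sprites on the line. */
int sprites_on_line(struct sprite_line_cache *c, const struct sprite *oam,
                    int line, const uint8_t **ids)
{
    if (c->dirty)
        rebuild(c, oam);
    *ids = c->ids[line];
    return c->count[line];
}
```

If OAM is only written during vblank, the rebuild runs once per frame and each scanline just walks its own short list instead of testing all 128 sprites.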
  5. Exophase

    Minimum requirements

    Dual core Atoms are plenty powerful enough to run VBA...
  6. Exophase

    Minimum requirements

    Are you trying to save on power consumption? My gut feeling tells me that you'd be better off not using a core at all than using both at full CPU with a lower clock speed. I could of course be dead wrong, I have seen platforms that consume much less power at lower clock speeds than while halting, but x86 platforms these days should be saving a lot while the CPU isn't in use. If this isn't your goal then I have a hard time imagining a multicore x86 CPU that can't already run VBA well enough. Are you thinking about embedded non-x86 platforms or something?
  7. Exophase

    Strange "Illegal word write" at Metroid Fusion intro

    GBA games read from and write to messed up places all the time. It's the result of sloppy buggy coding and a lack of memory protection on the GBA to tell them that they screwed up. A majority of it probably occurs from accidentally using uninitialized or NULL pointers. The stores are harmless and can be ignored. The loads, on the other hand, although not deliberate, still have to be emulated correctly because shockingly some games not only make them but then crash if the results are wrong. Zelda: Minish Cap is notorious for doing this in several places. It's a miracle these games ever worked in the first place. It probably just needs some of the bits to be right, but who knows which ones for which circumstances. A lot of memory accesses happen at [0, offset], further suggesting NULL pointer de-referencing. These return the last thing that was on the prefetch buffer when you left the BIOS, if you're not currently executing in the BIOS. Some accesses happen way out past the first 256MB of address space where nothing sits on the bus, and these "open bus" reads return the last fetched instruction on the prefetch buffer (basically like doing an ldr reg, [pc]). One exception to this is DMAs that come from the BIOS region: those are actually just read as zero by the DMA controller since it doesn't have access to the BIOS at all. GBA games do this intentionally in order to set an area of memory to zero. But copying to 0 will do nothing and was not intentional. Probably the programmers of Metroid Fusion just wanted to initialize those DMA registers to 0 and didn't realize that they were triggering the DMA at the same time.
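As an illustration of the three read cases described above (BIOS reads from outside the BIOS, open bus past 256MB, and DMA sourcing from the BIOS region), here is a small sketch. The bus_state structure, its field names and read32_special are all hypothetical, it ignores alignment and rotation details, and it hands everything else back to whatever normal memory path the emulator has.

```c
#include <stdint.h>

#define BIOS_SIZE      0x4000u      /* 16 KB BIOS region at address 0      */
#define OPEN_BUS_START 0x10000000u  /* nothing is mapped past 256 MB       */

enum accessor { ACCESS_CPU, ACCESS_DMA };

struct bus_state {
    uint32_t pc;                 /* address of the executing instruction   */
    uint32_t prefetch;           /* most recently fetched opcode           */
    uint32_t bios_last_prefetch; /* prefetch latched when leaving the BIOS */
    const uint8_t *bios;         /* BIOS image, BIOS_SIZE bytes            */
};

/* Little-endian 32-bit load from a byte array. */
static uint32_t load32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Handles only the odd cases. Returns 1 and stores the value in *out if
   the address is one of them, or 0 if the normal memory path should run. */
int read32_special(const struct bus_state *b, uint32_t addr,
                   enum accessor who, uint32_t *out)
{
    if (addr < BIOS_SIZE) {
        if (who == ACCESS_DMA)
            *out = 0;                            /* DMA cannot see the BIOS */
        else if (b->pc < BIOS_SIZE)
            *out = load32(b->bios + (addr & ~3u)); /* executing inside BIOS */
        else
            *out = b->bios_last_prefetch;        /* protected BIOS read     */
        return 1;
    }
    if (addr >= OPEN_BUS_START) {
        *out = b->prefetch;                      /* open bus                */
        return 1;
    }
    return 0;
}
```

A NULL-pointer load from game code would take the third branch of the BIOS case, which is why the value depends on what the BIOS last fetched rather than on the BIOS contents.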
  8. Exophase

    C/C++ macro expander?

    If you want anyone to read expanded code I suggest you run it through some kind of formatting tool like indent. Macros suck because they're single line, so expansions contain a bunch of really long lines. It'd be great if mainstream C compilers had extensions to support multi-line macros.. no more line joins would be necessary, macro line numbers could be forwarded to compilation error messages - good luck finding an error in a several thousand line macro - and they could be expanded in a way that doesn't look as awful.
  9. Exophase

    How the Asterix 3D game renders.

    Neat information. I remember when I was looking at the performance of this game, if you let the GBA execute 1 instruction every cycle (never going to happen on a real GBA) until an idle loop or halt SWI, then what you see is one frame where it uses the CPU full stop, then one or two (can't remember exactly.. two would fit your explanation) where it does close to nothing. Even on a real GBA it'd probably still be doing at least 150,000 or so instructions during that first frame, since it's optimized ARM code that's running. This is quite a bit more than typical GBA games will run at. Having auto-frameskip helps hide the unbalanced performance due to that first frame.. and it doesn't make the gameplay experience worse if it's synchronized right, since it's not a 60fps game to begin with. But it's also possible that the game isn't really locked at 20Hz and just runs at that since it takes that long to render - it would be interesting to see if the frame rate went up if you overclocked the emulated ARM. I'm going to guess that nearby enemies appear on top of the protagonist due to a very crude polygon sorting. This would be vital if you expected decent performance. In general this kind of 3D rendering is very rudimentary, you can see that there is no lighting whatsoever.
  10. Exophase

    C/C++ macro expander?

    Yes, you can usually use the compiler to do it. If you have gcc try this: gcc file.c -E -o file_expanded.c It won't matter if the file is actually C/C++, so long as its preprocessor directives are valid.
  11. Exophase

    GBC boot code support

    It's a part of the CPU die like it was on the original Gameboy, it can't be dumped short of a CPU exploit that probably would have turned up by now if it existed. The only way to get it is to decap it (burn off the chip packaging) and take pictures of the transistors under a microscope, then use software and/or hand techniques to convert this to data. That's what they did with the Gameboy, and I believe this has been done with some other chips. Last I heard an 8192 byte ROM was considered very difficult to learn by decapping, but I think things might have changed since then.
  12. Exophase

    Cycle accuracy causes fights.

    I.S.T., I think you should definitely read the rest of the thread if you haven't. I don't see how a lot of the things he has said can be anything but insulting. For what it's worth, I linked Charles MacDonald (as you know, big in the SMS scene and many others) to the thread and he was less than impressed with the dude.
  13. Exophase

    Cycle accuracy causes fights.

    Geez, this guy is something else ;p (the RetroCopy dude)
  14. Exophase

    Cycle accuracy causes fights.

    I agree that "hack" is a pretty open term. I don't like using per-game hacks in my emulator. What these consist of are certain pieces of code added to detect certain complex states that the real machine could never detect, usually added because the emulator authors have no idea why something isn't working and kept trying out hackish things until it worked. I don't like these because they distract from understanding what the machine is supposed to be doing, which means that you could end up having that bug in other games. You could also very easily end up having that bug in other places in the same game, where your hack no longer catches it. It also makes the code more complex and messier, requires you to have some kind of per-game identification system which is prone to failure with modified versions of that game, and can even make the code slower. It reminds me of some ancient Greek astronomers who were certain that the planets orbited in paths that consisted of circular components. When they observed the planets and found that a simple circular model didn't work they added in all kinds of weird circular fixtures to it to try to compensate, including having the planets move backwards and go in little sub-circular loops at various points. In reality planets move in elliptical orbits, which is a much simpler model. They ended up making things really complicated because they were forcing their observations to fit their misconceptions, and even then things only worked some of the time. Other times, however, it's okay to implement something that's not a faithful representation of the underlying hardware when you do actually know/understand how the hardware works and have a good grasp on why the software would never rely on it, or feel that it's not worth the tradeoff. I think that you have to make careful decisions about this, but you always need to know what's actually going on as much as possible.
  15. Exophase

    Cycle accuracy causes fights.

    A good deal of what these people say doesn't even apply to a good deal of platforms. They need to understand that not everything runs a CPU derived from Z80 or 6502. Also, while the author of this emulator explains the approach taken and insists several times how accurate it is, he doesn't actually describe a scenario where sub-instruction granular interleaving (a real description of what these people mean when they say "cycle accurate") will necessarily generate a different vital emulation state than an instruction granular interleaving. On a single CPU system like Master System w/o any real kind of tight feedback loops between components I don't even know if this is possible. Would be nice if someone would prove that instead of just blindly implementing it this way. I mean, you could implement something at the gate level instead and you could claim that since it's lower level it COULD be more accurate, but if you don't come with an example it's pretty moot. By the way, it's also my opinion that it's much easier to do an emulator like this than it is to, say, schedule interrupts and what have you. People like this just want praise for nothing. SMS is dead simple to emulate as far as emulators go and he has to be the millionth person to write one. I suppose something is necessary for him to differentiate himself to the people who don't know any better? Unfortunately far fewer people are in emulation for optimization these days, which tends to have much more pragmatic results than improving accuracy where 100% of games worked fine w/o hacks to begin with.