The Great Bug Hunt
I feel like I say this a lot, but… my motivation has been all over the place lately, so I didn’t get as much done on NooDS this month as I would have liked. That being said, I do have a few interesting things that I think are worth writing about, and a fairly long story about a bug that’s eluded me for months. So, as always, here we go!
Before I get into the main focus of this post, I’d like to talk about a strange graphical priority issue in Golden Sun for the GBA. An issue report brought this to my attention, and my initial look into it yielded some puzzling results. On one of the background layers, there were graphics meant to be displayed above the character sprites. That’s all well and good, but the same layer also contained a carpet that was meant to be displayed below the characters! Since each layer can only have a single priority value, this didn’t seem to make any sense. And yet, on actual hardware it worked! NooDS was rendering the carpet above the characters, which hurts my brain to look at. NO$GBA has the same issue, so GBATEK wasn’t much help here. I let the issue sit for a while and worked on other things, until fleroviux, creator of NanoboyAdvance, chimed in with some incredibly helpful information. Basically, the GBA has a weird quirk where transparent objects can still update the priority of a pixel, even though they don’t update the color value. I had to rework my object rendering a bit for this (I previously had each object priority level as a separate layer, which wouldn’t allow changing the priority of an existing pixel), but I managed to get it working and even got a bit of a speed boost from the refactor! Everything looked good until I noticed new priority issues on the title screen of Mario & Luigi: Partners in Time. I was worried that I messed something up, but after looking at how that game renders its graphics, it only made sense for this issue to occur. So why doesn’t it on hardware? I had assumed that since the GBA and DS have near-identical 2D engines, the quirk must apply to both. However, I started doubting this, and after loading up Golden Sun in GBARunner2 to see if the game’s quirk exploitation would work in DS mode, I confirmed that it is, in fact, unique to the GBA. This isn’t a revolutionary discovery by any means, but I certainly found it interesting.
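To make the quirk a bit more concrete, here is a minimal sketch of a per-pixel object buffer where a transparent object pixel can change the stored priority without touching the color. This is not NooDS's actual renderer: the `ObjPixel` struct, the `drawObjPixel` and `objAboveBg` helpers, the sentinel priority of 4, and the rule that a pixel only takes effect when its priority value is lower than the stored one are all assumptions made for illustration. The exact conditions on hardware may differ.

```cpp
#include <cassert>
#include <cstdint>

// One entry of a hypothetical per-scanline object buffer
struct ObjPixel {
    uint16_t color = 0;   // Last opaque color written
    bool opaque = false;  // Whether an opaque object pixel has been written
    uint8_t priority = 4; // 0 = frontmost; 4 = "no object yet" (assumed sentinel)
};

// Draw a single object pixel into the buffer
// gbaQuirk enables the GBA behavior; a DS engine would pass false
void drawObjPixel(ObjPixel &px, uint16_t color, bool transparent,
                  uint8_t priority, bool gbaQuirk) {
    assert(priority < 4);

    // Assumption: only pixels with a lower priority value than the stored one matter
    if (priority >= px.priority) return;

    if (!transparent) {
        // Opaque pixels update both the color and the priority
        px.color = color;
        px.opaque = true;
        px.priority = priority;
    } else if (gbaQuirk && px.opaque) {
        // The quirk: a transparent pixel keeps the old color but still changes the priority
        px.priority = priority;
    }
}

// Objects win ties against a background layer with the same priority value
bool objAboveBg(const ObjPixel &px, uint8_t bgPriority) {
    return px.opaque && px.priority <= bgPriority;
}
```

With a character pixel drawn at priority 2 and a transparent pixel drawn over it at priority 1, the character ends up above a priority 1 background when `gbaQuirk` is true and below it when `gbaQuirk` is false, which mirrors the difference between GBA and DS mode described above.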
So, now let’s talk about that bug I mentioned in the beginning. If you’ve been reading my previous progress posts, you might know that Mario Kart DS and the DS Zelda games have managed to stay unbootable on NooDS for a long time. Mario Kart just ended up needing misaligned memory accesses handled properly, but the Zelda games continued to leave me stumped. They were getting stuck in a code loop, but they passed through that loop hundreds of times before actually getting stuck, making it pretty hard to debug. Eventually I enlisted the help of melonDS; I made it spit out debug information that I could directly compare to what I had NooDS spitting out. After a lot of searching and backtracking, I finally found a load instruction that was getting a different value on NooDS than it was on melonDS. I tracked writes to that memory address, and found that the value the game expects is put there by a cartridge DMA transfer, but it’s overwritten by another transfer before being accessed by the code. My first thought was that it must be a timing issue, so I threw a quick hack together to delay cartridge reads. This did manage to make the games boot, but I was a little disappointed because it meant that I’d need to set up that scheduler in order to implement a proper fix (which I’ve still been putting off, bleh). I was later talking about it with Arisotura, who mentioned that for a long time DeSmuME didn’t have cartridge timing either. I thought that was a little strange; if the issue really was timing, then the Zelda games shouldn’t have worked on DeSmuME either. PSISP also told me that he didn’t experience any timing issues with the Zelda games when he was working on CorgiDS. I started to doubt my conclusion, so I looked into it deeper.
I tracked down the exact moment when the overwriting DMA transfer occurred, and immediately knew something was wrong. For context, I’ll first describe how cartridge DMA transfers work. To receive data from the cartridge, a game must set parameters such as the size of the transfer, and then issue a command to start it. The cartridge will then begin sending data, which can be read one word at a time from a specific memory address. This data can be read in manually, but a DMA channel can be set up to make things easier. When the DMA is set to cartridge mode, it’s only triggered when data from the cartridge is ready to be read. The length of the DMA should be set to one word, the source address should be fixed to the cartridge read address, and the destination address should increment after every transfer. It should also be set to repeat so that it keeps transferring every time the cartridge event is triggered, which would continue until the cartridge has sent its previously set amount of data. However, even after the transfer has finished, the DMA channel remains active, waiting for another trigger. It’s therefore a good idea to disable the channel until it’s needed again. Now that that’s established, let’s take a look at what Zelda is doing. It transfers data from the cartridge to memory in 512-byte blocks, using the DMA like I described. Everything is fine up until the point where it disables the channel. For some reason, it first clears the mode and repeat bits, but leaves the channel enabled. It then disables the channel separately afterwards. This leaves the channel enabled for a short time while set to immediate mode with no repeat. During this time, NooDS would perform an immediate DMA transfer of one word based on the settings of the channel, and then automatically disable it. Now, there are a few problems with this. First of all, it’s this extra transfer that’s overwriting the data the game wants. 
Secondly, if this extra transfer were supposed to happen, why would the game bother explicitly clearing the enable bit, when non-repeating transfers clear it themselves upon completion? It was very clear that this transfer was not supposed to happen.
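For reference, the channel setup and the two-step disable described above can be sketched as plain register values. The constant and function names here are hypothetical, and the bit positions and the cartridge read address follow the ARM9 DMA control layout as documented in GBATEK, so treat them as assumptions rather than something taken from NooDS or the games.

```cpp
#include <cassert>
#include <cstdint>

// Assumed ARM9 DMA control register layout (per GBATEK)
constexpr uint32_t DMA_WORD_COUNT_MASK = 0x001FFFFF; // Bits 0-20
constexpr uint32_t DMA_DST_INCREMENT   = 0u << 21;   // Destination increments
constexpr uint32_t DMA_SRC_FIXED       = 2u << 23;   // Source stays fixed
constexpr uint32_t DMA_REPEAT          = 1u << 25;
constexpr uint32_t DMA_32BIT           = 1u << 26;   // Transfer whole words
constexpr uint32_t DMA_MODE_MASK       = 7u << 27;
constexpr uint32_t DMA_MODE_IMMEDIATE  = 0u << 27;
constexpr uint32_t DMA_MODE_CARTRIDGE  = 5u << 27;
constexpr uint32_t DMA_ENABLE          = 1u << 31;

// Assumed address that cartridge data is read from, one word at a time
constexpr uint32_t CART_READ_ADDRESS = 0x04100010;

// Control value for a cartridge DMA: one word per trigger, fixed source,
// incrementing destination, repeating until the channel is disabled
uint32_t cartDmaControl(uint32_t wordCount = 1) {
    assert(wordCount <= DMA_WORD_COUNT_MASK);
    return DMA_ENABLE | DMA_MODE_CARTRIDGE | DMA_32BIT | DMA_REPEAT |
           DMA_SRC_FIXED | DMA_DST_INCREMENT | wordCount;
}

// First write of the disable sequence: mode and repeat are cleared,
// but the channel is still enabled (now immediate mode, no repeat)
uint32_t clearModeAndRepeat(uint32_t control) {
    return control & ~(DMA_MODE_MASK | DMA_REPEAT);
}

// Second write of the disable sequence: the enable bit is finally cleared
uint32_t disableChannel(uint32_t control) {
    return control & ~DMA_ENABLE;
}
```

The value returned by `clearModeAndRepeat` is the interesting one: it describes an enabled, non-repeating, immediate-mode channel, which is exactly the state that caused the extra one-word transfer.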
I asked Arisotura how melonDS handles DMA parameters being changed while active, and she said that it only updates a channel’s parameters when that channel is restarted. This makes sense, because it would mean that the channel would stay in cartridge mode and not transfer the extra word. Just to make sure, I wrote a quick hardware test to confirm. Surprisingly, I found that DMA parameters can actually be changed while the channel is running, and it does take effect. So then, how do the Zelda games work? I worked on my test some more, looking to see how various parameter changes affect the DMA while it’s running. I found that it mostly works the way I already had it implemented in NooDS; all parameters aside from source address, destination address, and word count can be changed and take effect while the channel is running. There was one exception though; one edge case that does not behave the way I expected. When changing a repeating DMA channel in any mode to immediate mode with no repeat, an immediate transfer does not occur. Instead, the channel stays enabled indefinitely, as if waiting for a trigger event that will never come. I’m not exactly sure why this happens, but it could be related to how immediate transfers are triggered. Perhaps the trigger is the channel being enabled, in which case it wouldn’t be triggered if switching to immediate mode while the channel is already enabled. Regardless, that’s how it works on hardware, and that’s why the Zelda games can do what they do (even if I still don’t really know why they do it).
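Here is a rough model of that edge case, contrasting the old behavior (any write that leaves the channel enabled in immediate mode starts a transfer) with the hypothesis that an immediate transfer is only triggered by the enable bit going from 0 to 1. The `DmaChannel` struct and the `edgeTriggered` flag are made up for this sketch, the bit positions are assumed from GBATEK's ARM9 layout, and the edge-trigger rule is only one possible explanation that happens to match the test results described above. It is not a confirmed description of the hardware.

```cpp
#include <cassert>
#include <cstdint>

// Assumed ARM9 DMA control bits (per GBATEK)
constexpr uint32_t DMA_REPEAT         = 1u << 25;
constexpr uint32_t DMA_MODE_MASK      = 7u << 27; // 0 = immediate
constexpr uint32_t DMA_MODE_CARTRIDGE = 5u << 27;
constexpr uint32_t DMA_ENABLE         = 1u << 31;

// Hypothetical channel model that only counts immediate transfers
struct DmaChannel {
    uint32_t control = 0;
    int immediateTransfers = 0;

    // edgeTriggered = false: old behavior, transfer on any qualifying write
    // edgeTriggered = true: transfer only when the enable bit goes from 0 to 1
    void writeControl(uint32_t value, bool edgeTriggered) {
        assert(immediateTransfers >= 0);
        bool wasEnabled = (control & DMA_ENABLE) != 0;

        // Mode and repeat changes take effect even while the channel is running
        control = value;

        bool enabled = (control & DMA_ENABLE) != 0;
        bool immediate = (control & DMA_MODE_MASK) == 0;
        if (!enabled || !immediate) return;

        // Edge case: already enabled, so no trigger; the channel just stays enabled
        if (edgeTriggered && wasEnabled) return;

        immediateTransfers++;

        // Non-repeating transfers disable the channel upon completion
        // (repeat combined with immediate mode is not modeled here)
        if (!(control & DMA_REPEAT)) control &= ~DMA_ENABLE;
    }
};
```

Feeding the Zelda sequence through this model (enable in cartridge mode with repeat, clear mode and repeat, then clear enable) gives one stray transfer with `edgeTriggered` set to false and none with it set to true, while a fresh enable in immediate mode still transfers in both cases.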
That was a pretty lengthy story for such a small bug; I feel like I just wrote one of endrift’s famous “Holy Grail” bug posts! I wouldn’t say it’s quite that significant, though. Most, if not all, games should work fine with melonDS’ implementation, because changing the DMA parameters while a channel is running is pretty dangerous and probably not very useful. Maybe other emulators are doing something similar to melonDS, or maybe this weird behavior is already known and I just missed the memo; I haven’t checked, so I’m not sure. Either way, the Zelda games now work on NooDS, so that’s all I care about! Aside from these bugs, there were also some smaller fixes and improvements, including a timer optimization that gave a pleasing speed boost. I’ll end this post with a somewhat big new addition that isn’t directly related to emulation improvements or anything like that. Instead, it’s a port to a new platform: Android! It’s nothing special yet, but it has a basic file browser and I whipped up some button images so it doesn’t look completely terrible. The biggest problem right now is that it’s, of course, slow. And also the fact that it’s on Android, which isn’t exactly my go-to platform for playing games. Well, I suppose I have even more reason now to look into writing an ARM64 JIT :)