While trying to decode the master game file, I stumbled across the root level-loading function - sub_46BE30 on PC and sub_800511F0 on GameCube. It is a very useful starting point for understanding the decompression function, the file loading and more.
The first thing I noticed was that the function takes a single argument, an integer. Naturally I had to play around with this and, lo and behold, it was the ID of the level to load. With the ability to control which level is loaded, we can plough through numbers and find the IDs of every level in the game, including debug levels and story mode levels. I have made GameCube codes to replace Dragon's Gate with each of the hidden levels here. Highlights include "Ships Ahoy", a very unfinished level, and "APB Two", a pinball level?
This function is also called with a value of -1 (referred to in-game as 'Shell Level') to load the menus, and various other numbers correspond to story cutscenes.
With that out of the way, the focus can shift deeper into the function. My current goal is to reverse the decompression function (sub_40FF40 on PC) and get to the game's files.
Short update on this: my theory on file decompression works, so hopefully we're not too far off getting all of the game's files. I won't post the source yet because it uses a lot of hardcoded values and thus decodes exactly one file, but it does do what it's meant to. It works by reading MASTER.DAT at the correct offset, setting up the memory in a running Shrek.exe process, launching the game's own decompress function on that data, and then reading the result back out of memory. Shoutout to Microsoft for literally just handing me remote code execution and thinking that's a good idea (tons of malware uses the same technique I described to run malicious code from legitimate processes haha). The current goal is now working out how the game calculates the offsets to read from in MASTER.DAT, so we can get any file, and how much memory to allocate for each one. Aiming for that this weekend.
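For anyone following along, the first step (pulling a compressed blob out of MASTER.DAT at a known offset) can be sketched in a few lines of Python. This is only a rough illustration, not the tool's actual source: the helper name `read_chunk` is made up, and the offset and size in the usage comment are placeholders, since working out the real values is exactly the part that is still open.

```python
def read_chunk(path, offset, size):
    """Return exactly `size` bytes of the file at `path`, starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(size)
    # A short read means the offset/size guess ran past the end of the file.
    if len(data) != size:
        raise ValueError(
            f"wanted {size} bytes at offset {offset:#x}, got {len(data)}"
        )
    return data


# Hypothetical usage (placeholder offset and size, not real values):
# blob = read_chunk("MASTER.DAT", 0x1000, 0x800)
```

The bytes returned here are what would then be copied into the game process for the decompress function to work on.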
And file extraction works! I think there are a couple of kinks that need sorting out, but for the most part it seems pretty solid. The downside is that most of the files seem to be in proprietary formats, so they'll need further analysis. Nonetheless, the code and a compiled executable have been released here, so give it a look!