Restoring Voices into Ultima VI With FM Towns

Thursday, August 22, 2024 at 4:21 PM

Video: Shamino in Ultima VI saying his name, Shamino Salle Dacil.

I own a copy of the FM Towns version of Ultima VI (insanely expensive, barf, like $600 on eBay nowadays, but I got mine a while back for like $200; wish it was on gog.com), and I want to extract the audio clips to test the speech system in my Ultima VI fan recreation project, #AgeOfSingularity. Obviously I can't include the original copyrighted asset files in a release of my free fan game, but if the system works, in theory any voice actor's files could be dropped in seamlessly or modded in later.

So I needed to figure out how to extract the audio files to .ogg. It'd be great to be able to test with the real files, because the alternative, going through the entire game and manually recording the sound of each keyword response, might be too tall an order at this juncture.

Then I spent several hours examining the open-source Nuvie codebase, finding leads about LZW decompression and hex headers (all with the "assistance" of ChatGPT, i.e. me asking it to solve all my problems), but we -- I mean I, definitely I -- was not able to crack it, and my Python scripts kept failing to produce proper output files.

Then someone pointed me to an extractor tool that had already been made, here. So if you also have a legal copy of FM Towns Ultima VI, you can use that tool to extract the audio for use in my Age of Singularity Ultima VI fan recreation project. Does this make Ultima VI as legit as Skyrim? Yes!

The tool works, which is insane! Now I have the task of matching audio files to keywords in my databases and coding them to play at the right times, etc. It's insane how much work has already been done.

Next I will need to find a way to easily match keywords to the audio files. (I later used a local Pinokio installation of Whisper along with GPT Python scripts to do this.)
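
For anyone curious what that matching step can look like, here's a minimal sketch. It is not the actual scripts. It assumes Whisper has already turned each clip into a text transcript and that the keyword responses are available as plain text. The sample strings, the `normalize` and `best_match` helpers, and the 0.6 threshold are all made up for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and drop punctuation so small transcription quirks matter less."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def best_match(transcript, responses, threshold=0.6):
    """Return (keyword, score) for the closest response text.

    `responses` maps a conversation keyword to the text the NPC says.
    If nothing scores at least `threshold`, the keyword is None.
    """
    target = normalize(transcript)
    best_key, best_score = None, 0.0
    for keyword, response in responses.items():
        score = SequenceMatcher(None, target, normalize(response)).ratio()
        if score > best_score:
            best_key, best_score = keyword, score
    if best_score < threshold:
        return None, best_score
    return best_key, best_score


# Hypothetical data: the response strings and the transcript are placeholders.
responses = {
    "name": "Shamino Salle Dacil.",
    "job": "I am a ranger.",
}
keyword, score = best_match("Shamino Sally Dacil", responses)
print(keyword, round(score, 3))
```

Fuzzy matching matters here because speech-to-text rarely reproduces fantasy names exactly, so an exact string comparison would miss most clips. Anything that falls under the threshold can be left for manual review.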

Here's what ChatGPT-4o said when I asked it to examine the working version, to see where the attempts I'd made with ChatGPT-4o had gone wrong:

"Here's a specific breakdown of how the working code handles each of the crucial steps:

Sample Conversion and Interpolation: The convert_sample method converts raw 8-bit samples to 16-bit PCM samples, applying endian correction where necessary. It also interpolates between samples by calculating additional intermediate values, ensuring smoother transitions in the audio output.

WAV File Initialization: The wav_init_header method sets up the WAV file's header by writing the necessary RIFF and WAVE chunk information, including the file size, audio format (PCM), sample rate, and bit depth, making sure the output file conforms to the WAV format standard.

Handling Edge Cases in Decompression: During LZW decompression, the code checks for unexpected codewords (e.g., codewords that don’t yet exist in the dictionary) and handles them by creating fallback entries based on previously known samples. This ensures that the decompression process continues even when encountering anomalies.

Complete Integration of Audio Components: The code integrates various SDL operations to manage audio playback. It loads decompressed data into memory with SDL_RWFromMem and plays it using SDL’s audio functions. The integration includes proper memory management, ensuring that all resources are freed after use to prevent leaks and ensure smooth playback.

This combination of careful handling of data, error checking, and integration with audio libraries is what allows the code to successfully convert the .SAM files into playable audio."
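
To make those steps a bit more concrete, here's a rough Python sketch of the same pipeline shape: LZW decode, 8-bit to 16-bit conversion with interpolation, then a WAV header. It is not the extractor's code. The assumptions are mine: textbook LZW over a plain list of integer codes (the game's real bit packing and any special codewords are not modelled), unsigned 8-bit source samples centred on 128, midpoint interpolation that roughly doubles the sample count, and a placeholder sample rate of 22050 Hz. All function names are hypothetical.

```python
import io
import struct
import wave


def lzw_decode(codes):
    """Textbook LZW decode of a list of integer codes (simplified, see assumptions)."""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    out = bytearray()
    prev = None
    for code in codes:
        if code in table:
            entry = table[code]
        elif code == next_code and prev is not None:
            # Edge case: the codeword is not in the dictionary yet.
            # It can only be the previous entry plus its own first byte.
            entry = prev + prev[:1]
        else:
            raise ValueError(f"invalid LZW code {code}")
        out += entry
        if prev is not None:
            table[next_code] = prev + entry[:1]
            next_code += 1
        prev = entry
    return bytes(out)


def to_pcm16(raw, interpolate=True):
    """Convert bytes (assumed unsigned 8-bit) to a list of signed 16-bit samples."""
    samples = [(b - 128) * 256 for b in raw]
    if not interpolate or len(samples) < 2:
        return samples
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) // 2)  # midpoint between neighbouring samples
    out.append(samples[-1])
    return out


def write_wav(dest, samples, sample_rate=22050):
    """Write mono 16-bit PCM; the wave module fills in the RIFF/WAVE header."""
    with wave.open(dest, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


# Tiny demo on made-up data, written to memory instead of disk.
raw = lzw_decode([65, 66, 256, 258])  # the last code hits the edge case
buf = io.BytesIO()
write_wav(buf, to_pcm16(raw))
print(raw, len(buf.getvalue()))
```

Using the standard library's `wave` module avoids writing the header fields by hand, which is one of the easier places to produce a file that looks fine in a hex editor but won't play.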

Here's the original fb UDIC post.
