TECHNICAL POST
Automating U6 Dialogue Matching in my #AgeOfSingularity Ultima VI Fan Recreation Project
I've been working on the Age Of Singularity Ultima VI fan recreation project, and one of the key challenges right now is matching 6,500 transcribed audio files to the correct keyword/response pairs in the U6 dialogue SQL database. This is crucial for ensuring the game’s characters speak the right lines at the right moments, and I got it done in a day using ChatGPT-4o, Whisper, and Python.
Here’s how we tackled it:
I used the Pinokio installer to install Whisper locally. Whisper is an advanced FOSS speech-to-text model, and I used it to transcribe the dialogue from the more than 6,500 U6 FM Towns audio files into text. The transcriptions were saved as .srt files, containing both the text and timestamps.
From the .srt files, we extracted just the dialogue text, ignoring the timestamps and sequence numbers, to prepare it for matching with the corresponding entries in the SQL database. We may use the timestamps later to play audio in sync with the displayed text, but for now they aren't needed.
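The extraction step above can be sketched roughly like this. It is a minimal illustration, not the project's actual script: the function name srt_to_text is made up for this example, and it assumes the usual .srt layout of a sequence number, a timing line, and then one or more lines of text per block.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
TIMESTAMP = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)

def srt_to_text(srt_content: str) -> str:
    """Return only the spoken text from SRT content, joined into one line."""
    kept = []
    # Subtitle blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_content.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        # Drop the leading sequence number, if present.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line, if present.
        if lines and TIMESTAMP.match(lines[0]):
            lines = lines[1:]
        kept.extend(lines)
    return " ".join(kept)
```

Joining everything into a single line is a choice made for this sketch, on the assumption that each audio file holds one response that gets compared as a whole.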
To match the transcriptions with the correct responses in the SQL database, we made a Python script that used a fuzzy matching technique called Ratcliff/Obershelp pattern recognition, which scores how similar two strings are based on the matching runs of characters they share.
We applied it through the fuzz.partial_ratio function in Python, which scores the shorter string against the best-matching portion of the longer one. That is exactly what we needed: it let us compare a transcription to the dialogue responses and identify matches even when the transcription was only a part of a larger response.
We set a threshold (e.g., 80%) to determine how close the match needed to be before it was considered valid.
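To give a feel for how partial matching with a threshold works, here is a rough standard-library sketch. It is not fuzz.partial_ratio itself: it is a stand-in built on Python's difflib.SequenceMatcher (which uses a Ratcliff/Obershelp-style comparison), so scores can differ from the library's. The names normalize, partial_score, and best_match are made up for this example.

```python
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't hurt the score."""
    return " ".join(text.lower().split())

def partial_score(a: str, b: str) -> float:
    """Score (0-100) the shorter string against its best-aligned window in the longer one."""
    shorter, longer = sorted((normalize(a), normalize(b)), key=len)
    if not shorter:
        return 0.0
    best = 0.0
    # autojunk=False: difflib's junk heuristic can distort scores on long strings.
    matcher = SequenceMatcher(None, shorter, longer, autojunk=False)
    for block in matcher.get_matching_blocks():
        # Line the shorter string up with this matching block in the longer string.
        start = max(0, block.b - block.a)
        window = longer[start:start + len(shorter)]
        score = SequenceMatcher(None, shorter, window, autojunk=False).ratio() * 100
        best = max(best, score)
    return best

def best_match(transcription, responses, threshold=80.0):
    """Return (index, score) of the best response at or above threshold, else None."""
    scored = [(partial_score(transcription, r), i) for i, r in enumerate(responses)]
    if not scored:
        return None
    score, index = max(scored)
    return (index, score) if score >= threshold else None
```

Lowering the threshold accepts more imperfect transcriptions but also raises the risk of linking a file to the wrong response, so it is worth spot-checking matches that land close to the cutoff.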
Once a match was found, we automatically updated the AudioFile column in our SQL database with the corresponding file number, so each response is linked to the appropriate audio file. The audio files are all stored in an audio directory in the persistent data path of the Unity project, which also means they can be easily replaced with other recordings if desired, as long as the naming scheme matches.
Also remember that these FM Towns audio files will NOT be included in Age Of Singularity by default, but having a working framework in place will make it easy to add voice lines later.
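The database update could look something like the sketch below. Only the AudioFile column comes from the description above. Everything else is an assumption made for illustration: SQLite as the engine, a table named dialogue, an id primary key, the Keyword and Response columns, and the placeholder row.

```python
import sqlite3

def link_audio(conn, response_id, file_number):
    """Store file_number in the AudioFile column of one response row; return rows changed."""
    cur = conn.execute(
        "UPDATE dialogue SET AudioFile = ? WHERE id = ?",  # hypothetical table and key
        (file_number, response_id),
    )
    conn.commit()
    return cur.rowcount

# Tiny in-memory demo with a hypothetical schema and placeholder data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dialogue ("
    "id INTEGER PRIMARY KEY, Keyword TEXT, Response TEXT, AudioFile TEXT)"
)
conn.execute(
    "INSERT INTO dialogue (Keyword, Response) VALUES (?, ?)",
    ("name", "placeholder response text"),
)
changed = link_audio(conn, 1, "0042")
```

Using query parameters (the ? placeholders) rather than string formatting keeps quotes and apostrophes in the data from breaking the statement.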
In the Python script we added logging at each step so we could track the process and ensure everything was working as expected.
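As a rough idea of what that kind of logging can look like, here is a small sketch using Python's standard logging module. The logger name, message wording, and the report helper are all made up for this example. It assumes each match result is either None or a (row id, score) pair.

```python
import logging
from collections import Counter

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("dialogue_match")  # hypothetical logger name

def report(file_name, match, stats):
    """Log the outcome for one audio file and tally it in stats."""
    if match is None:
        log.warning("%s: no response above threshold", file_name)
        stats["unmatched"] += 1
    else:
        row_id, score = match
        log.info("%s: matched row %s (score %.1f)", file_name, row_id, score)
        stats["matched"] += 1

stats = Counter()
report("0001.srt", (12, 93.5), stats)  # placeholder file name and result
report("0002.srt", None, stats)
log.info("done: %d matched, %d unmatched", stats["matched"], stats["unmatched"])
```

Logging the unmatched files as warnings makes it easy to pull out the list that still needs a manual look after a full run.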
After testing with a small batch of files, we scaled up the process to handle all characters, ensuring that the entire dialogue database was updated accurately.
This automated approach has just saved about three months of tedious manual work linking the audio files to the in-game dialogue. Now all I have to do is update my Dialogue C# script for Unity in Visual Studio Code so it loads audio files as needed and plays them. Then we have a system on par with Skyrim's, which can be used in Age Of Singularity and in any other game I'm working on.
Again, all of this was enabled by machine learning tools like Whisper and ChatGPT-4o. It’s another step forward in bringing the world of Ultima VI to life in Age Of Singularity.
Here's the original fb UDIC post.