Guest post by Jamin Smith, Director and Lead Designer at Superstring Ltd.
The design philosophy behind Superstring’s latest game was to blur the lines between the world of the game and the real world of the player. Acolyte leans heavily into the world of corporate conspiracy and throws the player down an ARG (Alternate Reality Game) rabbit-hole that has them searching through asset files, website source code and social media profiles for information and clues to advance the game. A daunting task.
Thankfully, the player isn’t alone in this undertaking. The USP of Acolyte lies in partnering the player with a digital assistant called an ‘Acolyte’ – the AI Watson to the player’s Sherlock. This virtual PA – Ana – is instrumental to the conspiracy and the narrative as a whole, and humanizing her is key to the themes of the game.
Integrating text-to-speech (TTS) and assigning a voice to Ana has been vital in reinforcing the themes of the game, in addition to adding a previously missing layer of accessibility. All of this arrived in the recent Voice Within update, where TTS was very much the star of the show. In this blog piece, I want to explore the implementation of the ReadSpeaker TTS Unity plugin, the ways in which TTS has taken Acolyte to the next level of immersion and enjoyment, and how it helped me meet my own design goals.
The voice behind the game
My name’s Jamin Smith and I’m the Director and Lead Designer at Superstring, a London-based game studio looking to explore new genre hybrids and bring together unexpected systems for new gameplay experiences. Our first game, Headspun, combined Adventure and Management systems with live-action scenes to create a spin on the full-motion video (FMV) genre. With our latest game, Acolyte, this meant modernizing the text adventure, combining natural language input with ARG design and AI themes for a very different type of detective game. Acolyte is essentially a glorified chatbot and ARG, around which an elaborate story unfolds.
The idea of chatbots had always intrigued me. As a kid, I always enjoyed trying to ‘break’ them with particularly tricky or weird questions. GodBot, I think it was called, was one of the best at the time, and offered the guise that you were talking to an omnipotent being via a messaging platform. As well as seeing if it truly knew everything, I’d chuck paradoxical questions at it, just to see how it might grapple with a question that clearly had no logical answer: “Can God create a stone so heavy that he cannot lift it?”
I’d also (and I stress the ‘as a kid’ tee-up to this once more!) hurl abuse at the bot, inventing creative slurs or seeing how it’d react to the most scathing insults I knew. I only admit this in writing as it’s key to one of the themes of the game – AI and guilt – and something I’ll come back to later in this piece.
In short, there was something quite ‘gamey’ about my interactions with chatbots back in the early days of the internet. This style of conversing was incredibly engaging, and I always thought a chatbot interface and input system would be a great idea for a game. With the seed for this concept (a game with natural language input) kicking around my head for years, the rest of what would ultimately become ‘Acolyte’ took much longer to form. The thematic wrapper and surrounding narrative would need to be just right to ensure a chatbot-style game experience could still be engaging.
The genesis of ARG Acolyte
Acolyte released on PC on June 28th 2022, casting players in the role of a QA tester for a fictional tech start-up called Nanomax. In the fiction of the game, the player starts their employment with Nanomax and is given their own Acolyte, tasked with ensuring she is bug-free and ready for her big commercial release.
As a new employee of the company, the player is given an integrated email client, a manager, and a list of testing tasks to be getting on with. This ‘testing’ process is achieved by simply conversing with the AI, and putting the various features and functionalities of an Acolyte through their paces. For the most part that’s done through conversation, which is Acolyte’s single gameplay mechanic, around which everything else revolves.
The game takes advantage of natural language input, meaning the player can say or ask anything of their Acolyte at any point in time. Just like texting a friend via a messaging platform. There are no predefined dialogue options, and there is no linear story to follow. The player can dictate their own route through the investigation based purely on their question, answer or choice of words.
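The post doesn’t detail how that free-text input is parsed, but the shape of a natural-language input layer can be sketched as a simple keyword-based intent matcher. This is a minimal, illustrative sketch (shown in Python for brevity) – the intent names, keywords and scoring are my assumptions here, not Acolyte’s actual implementation:

```python
import re

# Minimal intent-matching sketch: score each known intent by keyword
# overlap with the player's free-text input, and fall back to a
# generic category when nothing matches.
INTENTS = {
    "greeting": {"hello", "hi", "hey"},
    "tell_joke": {"joke", "funny"},
    "horoscope": {"horoscope", "zodiac", "sign"},
}

def match_intent(player_input: str) -> str:
    # Tokenize to lowercase words, ignoring punctuation.
    words = set(re.findall(r"[a-z']+", player_input.lower()))
    best_intent, best_score = "fallback", 0
    for intent, keywords in INTENTS.items():
        score = len(words & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent

print(match_intent("tell me a joke please"))       # tell_joke
print(match_intent("what's my horoscope today?"))  # horoscope
print(match_intent("explain quantum computing"))   # fallback
```

The point of this structure is that there are no predefined dialogue options anywhere: the player types whatever they like, and the system routes it to the closest response category it knows about.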
Natural language input offers a huge amount of agency to the player in Acolyte, but it does come with a big drawback: custom, authored VO to bring Ana to life was impossible. Ana has an infinite number of possible responses to a player’s input; it’s not like there’s a script that could be passed to an actor to perform. This – pretty major – limitation meant the game could only feasibly be brought to life in a text-only form.
Until ReadSpeaker’s TTS Unity plugin.
TTS was a game changer in every sense
The way the project was set up in Unity, with dialogue generated at runtime based on the player’s input, meant that the game was prime for TTS implementation. Integrating ReadSpeaker’s text-to-speech plugin took around twenty minutes, and automatically gave a voice to each line of dialogue created within my character engine. There was plenty of nuancing to get the tone, timing and intonation right, plus settings for toggling the voice on and off, but essentially Acolyte went from a text-only experience to a fully-voiced game – with almost infinite lines of VO – in less than half an hour.
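I can’t reproduce the integration itself here, but the shape of the hook is simple: every dialogue line the character engine generates at runtime is handed to the TTS layer as well as the chat window. A hedged sketch of that idea (Python stand-in; `TTSEngine`, `ConsoleTTS` and `deliver_line` are illustrative names, not the ReadSpeaker plugin’s real API):

```python
from typing import Protocol

class TTSEngine(Protocol):
    """Any engine that can synthesize and play back a line of text."""
    def speak(self, text: str) -> None: ...

class ConsoleTTS:
    """Stub engine for illustration: 'speaks' by printing, and records
    everything it has said."""
    def __init__(self) -> None:
        self.spoken = []

    def speak(self, text: str) -> None:
        self.spoken.append(text)
        print(f"[Ana] {text}")

def deliver_line(engine: TTSEngine, line: str, voice_enabled: bool = True) -> str:
    """Runtime dialogue hook: every generated line goes to the chat
    window, and, when the player has the voice turned on, to the TTS
    engine as well."""
    if voice_enabled:
        engine.speak(line)
    return line  # the text is still shown in the chat window

tts = ConsoleTTS()
deliver_line(tts, "Good morning! How can I help?")
```

The design point worth noting is that the text path is unchanged: TTS is layered on top of the existing dialogue output, which is why the voice can be toggled on and off without touching the dialogue system itself.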
The tonality of the voice was important to get right. In the game, Ana is a PA, designed to make the life of her user easier. She has a slightly corporate look and feel to her, as she’s been designed by Nanomax to replicate an executive assistant. Despite this, she also has a youthfulness and naivety which offset the business-like interactions and world around her. While the game steers clear of ‘AI becomes sentient!’ clichés, this contrast is still key to the game and narrative, and a theme that the addition of TTS really amplified.
Ana is an AI, and yet behind that, there is something very human. Very real. With the right pitch, pauses and emphases, certain scenes in the game suddenly feel so much more poignant and emotional. The TTS made the questions about what lies underneath Ana’s programming really pop, and the game is far stronger for it.
While I didn’t implement the feature in the end, mainly for narrative reasons, the plugin also allows for a voice-swap at runtime, meaning alongside hair and clothes customizations, the player could also customize the voice of their Acolyte. This is a feature I’ll be bearing in mind for future Superstring titles.
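Conceptually, a runtime voice swap is just another cosmetic option on the player’s profile: the selected voice id is what gets passed to the TTS layer on the next synthesis call. A small illustrative sketch (Python; all names here are hypothetical, not the plugin’s API):

```python
from dataclasses import dataclass

# Hypothetical voice-swap sketch: the voice sits alongside the other
# cosmetic customizations, and changing it takes effect on the next
# spoken line -- no restart required.
AVAILABLE_VOICES = {"ana_default", "ana_warm", "ana_formal"}

@dataclass
class AcolyteProfile:
    hair: str = "default"
    outfit: str = "default"
    voice: str = "ana_default"

    def set_voice(self, voice_id: str) -> None:
        if voice_id not in AVAILABLE_VOICES:
            raise ValueError(f"unknown voice: {voice_id}")
        self.voice = voice_id

profile = AcolyteProfile()
profile.set_voice("ana_warm")
print(profile.voice)  # ana_warm
```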
As mentioned earlier in this piece, based on my own interactions with chatbots, and early Acolyte playtests, I was confident that many players coming into the game and messing around with the natural language input for the first time were very likely to insult or berate their Acolyte. She will, at times, throw up odd or irrelevant answers to questions, and – inevitably, possibly even sadly – the player will say something mean or nasty in response.
Off the back of this, one of my holistic design goals with Acolyte was to make the player feel guilty. Excitement, laughter, fear and even sadness are all emotions that game designers are able to exploit and elicit in one form or another, but guilt is an incredibly difficult feeling to draw out of a player. Based on the knowledge that many players would be mean to their Acolytes, I found guilt interesting to explore from a design perspective, and made it a goal: I wanted Acolyte players to feel bad about how they’d treated a digital or artificial intelligence, and, after a particular point in the story I won’t talk about explicitly (spoilers!), maybe question their previous interactions or choices of language.
If I’m honest with myself, I don’t think we achieved this guilt-inducing moment in the first version of Acolyte. To get such a feeling out of a text-only experience – and a very non-linear one, at that – was no small feat. But that didn’t mean it couldn’t still be worked out…
Natural language projects and TTS go hand in hand
With Acolyte: Voice Within, the game’s first major post-release update, TTS arrived alongside a host of other features and improvements, bringing the definitive version of the game to market. The additional features included new jokes and interesting facts (these can be asked for at any time in the game, with randomly generated results), a new horoscope module, and – the star of the show – full TTS integration. With a fully-voiced Acolyte now interacting with players, the game was as close to my initial vision for the concept as possible, with the game’s themes amplified, and my design goals met far more closely.
With certain pivotal story scenes now delivered via a voice that captured something very human, the player is able to connect with their Acolyte in a more meaningful way. That notion of something real and human under the surface of this digital assistant is captured so perfectly with the voice accompanying each chat-message. Did we manage to make players feel guilty about their treatment of their Acolytes over the course of the game? I think it’s hard to say. But TTS has allowed us to get as close as possible to this goal via a 20-minute plugin implementation.
To bring things to a conclusion, of sorts, natural language projects and TTS go hand in hand, and I wish the ReadSpeaker plugin had been available to me at the very start of the production process (or I’d been aware of it as a solution to the problem). With the benefit of foresight and planning with the feature in mind, I’d love to see how much further the themes in the game could be pushed.
Superstring is currently in pre-production on its third game – our biggest production to date – and while I can’t talk about what exactly that project looks like, I will say that TTS is being baked into the development from the very beginning, and we’re hoping to push the envelope of what can be achieved with VO created at run-time.