Testing Tuesday: Escape the Room Alexa Skill



Florian and I tested this Skill yesterday, pretty cool!

Here’s the video:

And here is the Skill: https://www.amazon.com/gp/product/B075J914W2


Yeah, this was a good one indeed!

Something I had in mind but didn’t bring up in the session… It’s interesting to contrast this #VoiceGame with the ‘Jack Ryan - November Morning’ Alexa Skill by the amazing folks of Earplay, whom I’m quite the fanboy of. The interesting aspect, at least to me, is not how much of a difference the pre-recorded audio for ‘Jack Ryan’ makes (because that’s quite obvious, but difficult to achieve for hobby developers), but the actual voice interface.

Here’s a list of the voice commands for ‘Jack Ryan’:

So, quite a bit more complex than the four commands for ‘Escape the Room’.

What are you folks’ opinions on this? Does the more complex interface of ‘Jack Ryan’ increase immersion by allowing for a more realistic and fine-grained interaction with the game world? Or is it too complex, pulling the user’s attention away from the game world and onto the interface? Curious to hear your opinions!


Ah yes, we should test that one as well!

That’s definitely interesting. I like that, with “Escape the Room,” I know what to say most of the time, so the limited interaction model could be an advantage. However, I understand that having more options can feel more immersive and a little more natural in the interaction.

I think designers and developers need to make a choice when working on the interaction model: “Do we want people to remember ‘commands’ or should it feel like natural language?” The more complex a game, the more difficult it is to build for open/natural language requests. This is why I believe commands make sense for a game like this. And the more commands there are, the more users have to remember; they might even get the impression that it’s not “command-based” and try out things that don’t work.

Curious to see how the Jack Ryan Skill solves this.


Good point! So, a tentative summary is:

  • A simpler voice interface is easier to learn & master
  • A more complex voice interface feels more natural and ‘empowered’, and thus yields a higher degree of immersion

If this is true, it would result in about the following learning curve:
:warning: This is not a judgement about the quality of the Skills, more a discussion about the merit of different approaches to designing voice interfaces!
If this hypothesis that a more complex interface allows for higher immersion when mastered is true, there’s still a challenge in leading the user through a potentially frustrating learning phase.
And I think this is one point where working with a popular media franchise and high-quality audio pays a dividend: The story and the esthetics of the Skill are so satisfying that the user is more forgiving of a longer training phase and a potential lack of early successes.
Also, ‘Jack Ryan’ has a tutorial scenario with very few game items and relatively obvious actions to take, which also counterbalances the steeper learning curve.

What do you folks think about this hypothesis?


Really interesting theory…I wonder if the ideal isn’t the basic “level up” concept of traditional games like Mario.

Basically you learn one ‘skill’ at a time…you then spend most of the level mastering that skill (along with using the other ones you’ve already learned).

In many ways, I think they were already walking the line on that within Escape the room because they gave you the basic commands…but then when you saw a color lock, they taught you that could say four colors in a row to try and “solve” that puzzle.

It didn’t go to fully teaching you new commands b/c they had you back out of those interactions when you were done with a specific puzzle…so it’s more of a hybrid approach that I think, here, helps keep the game feeling simple.

I don’t know that I’ve seen/played a really good voice example of “leveling up” the command knowledge of the player (though maybe Six Swords does do this?)…but I think there’s a lot of room and opportunity for more people to explore and play with this general concept (and I think it would help add to the engagement/fun for the user – unlocking new powers/commands as they get further into games).



First of all: Thanks for replying, @falicon, and welcome to the community forum! :hugs:

Great point, Kevin, this is definitely the best way to educate users and keep a balance between their skill level and the difficulty of the game! I know because I’ve read about it in my beloved book about game design. :sweat_smile:

In the case of ‘Escape the Room’ I don’t really see it applied - the situation with the colors or the numbers is more like a separate mini-game, and it’s not a way to interact with the game world once you’ve backed off the safe or circuit breaker. But then, with the very limited inventory of commands, I think it’s not really necessary. It would be interesting to know if the different rooms have different degrees of difficulty, with simpler rooms just having a simpler sequence of actions to take to solve the puzzle.

For ‘Jack Ryan’ I think the up-ramping takes place only via the tutorial scenario… But then I don’t know because I got stuck in the first room. :grimacing:

What do you think would be good ways to introduce new skills into an Escape The Room-type game? Looking at the commands in the picture above, it’s hard to imagine introducing them one at a time. And introducing coarse-grained commands like ‘use’ first and more fine-grained commands like ‘touch’, ‘open’, and ‘turn on’ later would require the user to un-learn things.

Curious to hear what you think!


Thanks - great to jump in and share my random thoughts! :slight_smile:

Agree with you about Escape the Room being more mini-games vs. leveling up…I was trying to say it was related, but not really the same thing…you did a better job of explaining what I meant!

In a game like this, I’m not sure they need the “level up” approach because it seems like they can accomplish a lot with just the small set of commands they have told you about at the start…and then breaking into mini-games gives them unlimited opportunities to support additional commands as needed.

That being said…if they were to try the level up approach, I think it might be more to do with the state of this game…it wasn’t clear to me from the demo if the game “remembers” that you saw the letter on the desk…so do you have to be facing the letter to be able to re-examine it? Also, instead of having to back out of the mini-games, they could teach you commands like the color combination thing for the room…and once they did, if you use that command it would know you meant to try the color lock again (rather than make you jump back to examine the lock, then learn the color commands again, then try your new combination).

…but these are all small things that may or may not actually help here…I think this skill is already a top performer, so these sorts of changes might actually hurt in this case.

The reason a lot of this is on my mind though is because my adventure game plugin (almost ready for v1 release) comes with support for something like 50+ commands…clearly you can’t teach a user all of that before starting the game, and they aren’t going to remember it all right away even if you did…so it’s going to have to be some version of leveling up (or the game dev will have to pick and choose a tiny subset of commands they want to actually offer up when using the plugin).

Hopefully we’ll see (and learn) what works best before too long :wink:


Hey guys,
Great discussion here. Although I haven’t played it, I believe the Escape the Room model is perfect, and I think that learning from room to room is the best way to develop gameplay.
As an example, who is to say that you can’t progress from look left or right, up or down, to other actions like duck down, crawl, evade, run, and jump? If you learn that a spoken action is the same as controlling a joystick, then any action available in a video game is also available in a voice game: the stick is placed in one of four positions, and any type of advanced move combines one of those four positions with another button push.


Hey guys,

I haven’t actually played the Jack Ryan skill yet (I will try it soon), but I can give you some context on how/why I decided on the voice model for Escape the Room. I think you basically covered it with that chart you made, but I wanted it to really be as easy to learn as possible.

Part of what I’ve realized myself is that if a user is playing a game and Alexa doesn’t understand them once, they might stay; if it happens two or three times in a row, chances are they will quit the skill and give up. People don’t have too much patience these days.

That’s why I focused on narrowing things down to 3 main commands (looking, inspecting objects, and using items on objects). I had originally considered letting users pick up items, but realized that would result in users trying to pick up anything they see, which gets annoying when not everything is an item that can be picked up. Instead we just give a user items when they come across them.

When @Florian and @jan were testing the skill, it appears that they ‘followed the directions’ more than most people, by using the command ‘inspect’ on objects almost every time. In reality, the inspect command covers many use cases, and internally I think of it like ‘use’, similar to pressing the ‘E’ button in some video games. The voice model for the intent actually covers everything that the user might say when trying to look at that item. For example, saying ‘open the door’ will inspect the door, saying ‘press the button’ will inspect the button, etc. I have a similar setup for using an item on an object - here is a snippet of some of the intent for that one:

			"name": "UseItemOnObject",
			"phrases": [
				"Use {Item} on {Object}",
				"Use the {Item} on the {Object}",
				"Pour the {Item} in the {Object}",
				"Pour the {Item} into the {Object}",
				"Fix the {Object} with the {Item}",
				"Turn the {Object} with the {Item}",
				"Smash the {Object} with the {Item}",
				"Break the {Object} with the {Item}",
				"Connect the {Object} with the {Item}"

This allows me to cover almost anything the user will try to do with a valid response.
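To illustrate the idea of many phrasings funnelling into one intent, here is a minimal Python sketch. It reuses a few of the phrase templates from the snippet above, but the matching logic itself (`template_to_regex`, `match_use_item_on_object`) is a hypothetical stand-in written for this example; it is not how Alexa or Jovo resolve intents internally.

```python
import re

# A few phrase templates taken from the snippet above. More specific templates
# ("Use the ...") are listed first so the article "the" does not end up inside
# a slot value. The ordering and the matching code are assumptions.
USE_ITEM_ON_OBJECT_PHRASES = [
    "Use the {Item} on the {Object}",
    "Use {Item} on {Object}",
    "Pour the {Item} into the {Object}",
    "Fix the {Object} with the {Item}",
    "Smash the {Object} with the {Item}",
]


def template_to_regex(template):
    """Turn 'Use the {Item} on the {Object}' into a regex with named groups."""
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)      # literal text between slots
        else:
            pattern += f"(?P<{part}>.+?)"   # slot becomes a named group
    return re.compile(f"^{pattern}$", re.IGNORECASE)


PATTERNS = [template_to_regex(t) for t in USE_ITEM_ON_OBJECT_PHRASES]


def match_use_item_on_object(utterance):
    """Return the intent name and slot values, or None if nothing matches."""
    for regex in PATTERNS:
        match = regex.match(utterance.strip())
        if match:
            return {
                "intent": "UseItemOnObject",
                "item": match.group("Item").lower(),
                "object": match.group("Object").lower(),
            }
    return None
```

With this sketch, "Use the key on the door" and "Smash the window with the hammer" both land in the same intent, just with different slot values, so the game logic only has to handle one kind of request.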

Another way that I tried to increase immersion is by providing different responses for many of the incorrect actions users may take. For example, if a user tries to use an item on something that it won’t work with, there will be a custom response for each combination as opposed to just telling the user that it didn’t work. These kinds of things help users feel engaged in the world and feel that Alexa understands their commands, reducing their likelihood of getting bored and quitting.
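As a rough sketch of what per-combination responses could look like, the Python below keys replies on the (item, object) pair and only falls back to a generic line when no custom reply exists. All items, objects, response texts, and the function name `respond_to_use` are invented for illustration and are not taken from the actual skill.

```python
# Hypothetical game data, invented for this example.
SUCCESSES = {
    ("key", "door"): "The key turns and the door swings open.",
}

CUSTOM_FAILURES = {
    ("hammer", "door"): "You hit the door with the hammer. It barely leaves a dent.",
    ("key", "window"): "The window has no keyhole for the key.",
}


def respond_to_use(item, obj):
    """Pick a response for 'use item on obj', preferring a custom one."""
    pair = (item.lower(), obj.lower())
    if pair in SUCCESSES:
        return SUCCESSES[pair]
    # A custom failure keeps the player in the game world; the generic
    # sentence is only a last resort for combinations nobody wrote a line for.
    generic = f"Using the {pair[0]} on the {pair[1]} doesn't do anything."
    return CUSTOM_FAILURES.get(pair, generic)
```

The trade-off is authoring effort: the table grows with the number of item and object pairs, so a generic fallback is still useful for the long tail.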

About the tutorial - I have had a branch with a tutorial for many months - it’s just not quite finished yet. I ended up deciding to try to improve the game itself and allow the user to ramp up within the game in place of a tutorial, as I feel that sometimes a tutorial can signal to a user that this game will be very complex, and I wanted this to be more of a family game that anyone can pick up without having to ‘learn’.

Time to check out Jack Ryan and see what it is like!


Awesome, great to have you here @gshenar! :star_struck:
And thanks for sharing the rationale behind your design decisions, that’s greatly appreciated!

(Not to forget - Also glad to welcome @chucklapress here, welcome! :wave:)

Great observation about the patience / error tolerance of users… That’s indeed a serious risk that’s proportional to the complexity of the interface. I personally think this steep learning curve is also part of why the ‘Jack Ryan’ Skill doesn’t seem to have gotten a lot of traction (I’m basing this claim purely on the fact that the Skill has only 18 reviews), despite the beneficial factors of great IP (for discoverability) and high-end quality audio (for engagement). The users that mustered the patience to learn it seemed to have had a great time, though (similar to Six Swords, in that sense).

Thinking about it… There’s the point that a complex interface might feel more natural, but on the other hand it’s quite elegant to have only a few essential commands that you can do a lot with, and that have a different effect if applied to different situations. And the fact that users use the same command (“use X (with Y)”) over and over again doesn’t seem to irritate them, judging by the reviews.

Great point also about the tutorial… It’s easy to see how it indicates “This game is difficult to learn!”, even though I never thought about it like that.

If I may ask… What are your thoughts on what we discussed above with @falicon, about ramping up the challenge as the user’s expertise increases? Do the rooms in ETR have about the same level of difficulty? Do you observe a pattern that users get bored after three rooms and drop off?


I don’t currently have any sort of system to ‘ramp up the difficulty’, but in the skill description I do mention the difficulty of each of the rooms so that users can choose how much of a challenge they want at first. Each room generally has 2-3 puzzles and the difficulty of that room is just an assessment I made based on the difficulty of those puzzles.

I also have some cross promotion with my other games, Christmas Escape, and Escape the Airplane. Escape the Airplane is harder so I tend to promote it to users once they beat the rooms in Escape the Room. Christmas Escape is easier so I tend to promote it when users are quitting the game and haven’t managed to successfully beat a room yet.
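The promotion rule described here can be written down as a small decision function. This is only a sketch under assumptions: it reads "beat the rooms" as beating all of them, and the function name and parameters (`rooms_beaten`, `total_rooms`, `is_quitting`) are hypothetical.

```python
from typing import Optional


def pick_cross_promotion(rooms_beaten: int, total_rooms: int,
                         is_quitting: bool) -> Optional[str]:
    """Return the game to promote, or None if no promotion fits."""
    # Assumption: the harder game is offered once every room has been beaten.
    if rooms_beaten >= total_rooms:
        return "Escape the Airplane"
    # The easier game is offered to users who quit without beating any room.
    if is_quitting and rooms_beaten == 0:
        return "Christmas Escape"
    return None
```

For example, `pick_cross_promotion(0, 4, True)` suggests Christmas Escape, while a user who quits after beating two of four rooms gets no promotion in this sketch.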

I haven’t managed to determine a real pattern of users getting ‘bored’. I try to keep the difficulty level high enough that it won’t feel like you are just going through the motions; instead, I want you to always be paying attention to the room and trying to solve the different puzzles. Some users tend to finish all the rooms and ask for more; other users are content with just finishing once, having a good time with it, and not coming back.

While the game has been successful, one of the pitfalls is that there isn’t really a great path for user retention. Users usually finish the game and don’t come back day after day. I haven’t really found a great solution for that but it is something I’m definitely thinking about and would love to improve somehow.


For those interested, Jon and Dave of Earplay also gave us some insights into the design choices behind ‘Jack Ryan’ in the recent episode of the VUX.world podcast, starting at about minute 19.

To paraphrase, ‘Jack Ryan’ is consciously designed to be more challenging, i.e.

  • It’s targeted towards a more dedicated target group (Jack Ryan fans and gamers)
  • It’s not supposed to be played casually, but requires some more focus
  • It should create a high level of immersion, and thus feel highly rewarding when solved


Just read your post on that topic, @falicon, very interesting.

For the rest: Kevin wrote down some additional thoughts about this topic on his blog. Can’t wait to try out the adventure game plugin for Jovo. Is there an “ideal” Voice User Interaction approach?


Not nec. a perfect solution - but what if, as they solve a room, they earn a clue that is part of a bigger story line/puzzle…so that hopefully gives them a reason to play more rooms and come back day-over-day to see the larger story get revealed and solve the bigger mystery.

Would love to see if/how that would work out for you!


…or at the very least solving X number of rooms unlocks new/advanced rooms for them to be able to play…some of the ‘you pay to play’ or if you are willing to put the time in, you ‘earn to play’ concepts that more traditional games often use…
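A minimal sketch of how both ideas (a clue per solved room, plus advanced rooms unlocked after X solved rooms) might be tracked. The room names, thresholds, and the `PlayerProgress` class are all made up for illustration; a real skill would also need to persist this per user between sessions.

```python
from dataclasses import dataclass, field

# Hypothetical rooms and the number of solved rooms needed to unlock each.
ROOMS_NEEDED_TO_UNLOCK = {"Office": 0, "Garage": 0, "Vault": 2}


@dataclass
class PlayerProgress:
    solved_rooms: set = field(default_factory=set)
    clues: list = field(default_factory=list)

    def solve(self, room, clue):
        """Record a solved room and award its clue once."""
        if room not in self.solved_rooms:
            self.solved_rooms.add(room)
            self.clues.append(clue)

    def available_rooms(self):
        """Rooms the player may enter given how many rooms are solved."""
        solved = len(self.solved_rooms)
        return sorted(
            room
            for room, needed in ROOMS_NEEDED_TO_UNLOCK.items()
            if solved >= needed
        )
```

Here the "Vault" only shows up after two other rooms are solved, and the collected clues could feed the bigger mystery.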


Yes that is really interesting. Currently these games don’t really connect to an overarching storyline but I think that would make it really compelling to keep playing so you can learn more about the story and develop certain characters (there aren’t really characters in these games yet either), but it could be sort of like SAW, where there is a criminal mastermind creating the scenarios that leaves clues and maybe there is a final room where you have to play the others and get all the clues to beat it?

Sounds like an interesting avenue to explore


Ha! While reading halfway through your post I thought of SAW. Really interesting! The first few movies didn’t really feel like there’s an overarching story (at least to me 10 years or so ago) but I really enjoyed the last 3 movies when it really got into the backstory and how everything connected. (I’m not really a horror movie person)