Looking for method to insert a delay between audio clips with Alexa AudioPlayer


#1

Is there an elegant way to add a delay between audio files when using the Alexa AudioPlayer? I have a skill where the audio files currently play back-to-back with no real separation between them. I tried passing a .tell with an SSML delay (<break time="2s"/>) to this.$alexaSkill.$audioPlayer, but that only seems to work with the PlayIntent, not when I ENQUEUE the next clip in the PlaybackNearlyFinished handler. If I put one there, the skill simply stops without advancing to the next clip.

Unfortunately, I cannot change the audio files, because they are also used by a mobile app that we’ve developed. I was thinking of creating an audio file that contains a second or two of silence that I ENQUEUE between each clip, but that complicates the logic a bit (e.g. Next and Previous intents must now account for the blank clip). It also seems like a hack and not really what I’m looking for. However, maybe that’s my only option…
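For illustration, the Next/Previous complication could be handled by a small helper that maps whatever token is currently playing back to a chapter. This is a hypothetical sketch only: the "chapter_N" / "silence_N" token scheme and the targetChapter name are assumptions, not part of the actual skill.

```javascript
// Hypothetical token scheme (assumed, not from the thread):
//   "chapter_N" -> the Nth audio clip
//   "silence_N" -> the silent gap played just before chapter N
// Given the currently playing token, return the chapter token that a
// NextIntent / PreviousIntent should jump to, or null at the edges.
function targetChapter(currentToken, direction, chapterCount) {
  const idx = Number(currentToken.split('_')[1]);
  const isSilence = currentToken.startsWith('silence_');
  // A silence clip "belongs" to the chapter it precedes, so Next from
  // silence_N lands on chapter_N, while Next from chapter_N lands on N+1.
  const target = direction === 'next' ? (isSilence ? idx : idx + 1) : idx - 1;
  if (target < 0 || target >= chapterCount) {
    return null; // already at the first/last chapter
  }
  return `chapter_${target}`;
}
```

With a mapping like this, the blank clips stay invisible to the user: skipping from a chapter or from the gap in front of it lands on the same place.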

Any thoughts?


#2

For now, I decided to just use an MP3 with a one-second pause, and I use the incoming token (this.$alexaSkill.$audioPlayer.getToken()) to determine whether I should queue up the next chapter or a pause. So far, it seems to be working out OK.
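The token check could look something like the following. This is a minimal sketch under assumptions: the "chapter_N" / "silence_N" token scheme, the file names, and the nextQueueItem helper are all made up for illustration, not taken from the actual skill.

```javascript
// Assumed token scheme: "chapter_N" for clips, "silence_N" for the
// one-second gap that plays before chapter N. File names are invented.
const CHAPTERS = ['chapter_0.mp3', 'chapter_1.mp3', 'chapter_2.mp3'];
const SILENCE_MP3 = 'silence_1s.mp3';

// Given the token of the stream that is about to finish (e.g. the value
// read in the PlaybackNearlyFinished handler), decide what to enqueue
// next: a pause after a chapter, or the next chapter after a pause.
// Returns null when the last chapter has finished.
function nextQueueItem(currentToken) {
  const idx = Number(currentToken.split('_')[1]);
  if (currentToken.startsWith('silence_')) {
    // The pause before chapter `idx` just played: queue that chapter.
    return { url: CHAPTERS[idx], token: `chapter_${idx}` };
  }
  // Chapter `idx` just played: queue a pause, unless it was the last one.
  if (idx + 1 >= CHAPTERS.length) {
    return null;
  }
  return { url: SILENCE_MP3, token: `silence_${idx + 1}` };
}
```

The returned url/token pair would then be passed to the AudioPlayer's enqueue call; returning null signals that nothing more should be queued.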

However, I’m curious whether this would work with a Jovo webhook. I think in my earlier testing (before using Lambda), the Jovo environment always produced a token of ‘silence’, so I couldn’t use the incoming token to make this distinction.


#3

Hi Ben, a silent MP3 file would also have been my suggestion.

Regarding the webhook: hmm, usually the webhook and Lambda should behave in the same way. Do you have a sample request/response where you see a difference?


#4

Hi @jan, I tried the webhook again with a physical device and you’re correct, the behavior was the same as with the Lambda function. What had me confused is that beforehand I was using the Jovo Debugger to simulate the AudioPlayer (since Alexa’s simulator won’t play audio files). If I use the Jovo Debugger, the token is always “silence”. Here’s an example:

With the debugger:

+++++ AlexaSkill.PlaybackStarted
Playback started for token: silence
{
	"version": "1.0",
	"response": {
		"shouldEndSession": true
	},
	"sessionAttributes": {}
}
{
	"version": "1.0",
	"context": {
		"AudioPlayer": {
			"offsetInMilliseconds": 0,
			"token": "silence",
			"playerActivity": "PLAYING"
		},

Same code but with a physical device:

+++++ AlexaSkill.PlaybackStarted
Playback started for token: chapter_0
{
	"version": "1.0",
	"response": {
		"shouldEndSession": true
	},
	"sessionAttributes": {}
}
{
	"version": "1.0",
	"context": {
		"AudioPlayer": {
			"offsetInMilliseconds": 104,
			"token": "chapter_0",
			"playerActivity": "PLAYING"
		},

Is it expected behavior that the debugger produces a token of silence?


#5

Ah, thanks!

The Jovo Debugger uses static sample requests. For AudioPlayer requests, we could think about an update. cc @AlexSwe


#6

Oh, oops. That’s a funny coincidence. Like Jan said, it’s a static example request. I copied it (a long time ago) from another project. It’s not a real “silence” request 🙂