Google Action Audio Issue

google-assistant

#1

Hi, we have been having issues with Google Actions not playing our mp3s that we have stored in AWS S3 buckets. The Alexa skill and Google Action use the same Lambda function for fulfillment. We have no issue with Alexa, but for some reason Google will sometimes either not audibly play a file (time elapses as if it were playing) or skip over it completely. This happens at completely random moments in the user experience. We will have 2-3 sessions where there is no problem and then a consecutive stretch where a URL is totally skipped.
When we check the logs, there is no indication that anything is wrong. We have tested on both an iOS device and the Google Home Mini. The problem has persisted for a couple of weeks.
Do any ideas come to mind as to how we might address this?

Appreciate any help!


#2

Hi Craig,
I have indications we are experiencing the same issue.
The response JSON for our Google Action includes all the audio files (mp3s stored on S3), but it seems the Action sometimes does not play a file.
I suspect this is an issue with Google accessing files on S3.
Since there is no error log, it is really hard to tell why devices would not play the audio.
Let me know if you have any hints!
Best,
Dominik.


#3

Have you only observed the problem with files on S3, or does it happen with any storage service?


#4

@AlexSwe We only use S3, so I don’t have any cross-check.


#5

@AlexSwe Same for us. Our entire library of audio files is on S3 only. The problem is also incredibly random. It is not limited to one part of our experience; it is completely different each time, with no correlation to the duration of the audio file.

This happens on the speaker as well as in the iOS app and when testing from the Google Actions console.


#6

Hi,

Would anyone have a recommendation for the best place to store the audio files for a Google Action (while using a Lambda function for fulfillment), as S3 buckets do not seem to be a consistent solution? We would rather not migrate away from S3 completely, as there is no problem with the Alexa skill and our setup. Unfortunately, our Google experience is being affected and our logs are not providing the answer.

Thanks in advance for any suggestions you might have as to how to remedy this without needing to restructure everything.


#7

Hi @craig,

One suggestion I would offer is Google Firebase storage. I have an Alexa Skill (sorry, not a Google Action in this case) that accesses MP3s and background images from Firebase. The reason I used Firebase in this specific instance was that we have a mobile app that already used this storage, and we didn’t want to create a separate S3 bucket just for the voice app. It seems to work well: I just make a call to the Firebase API to get a URL for the audio file and pass that to the .enqueue() method of this.$alexaSkill.$audioPlayer. I’m assuming that you could do something similar with this.$googleAction.$mediaResponse.play().

Just my thoughts…

-Ben


#8

I am not sure this is an S3 issue at all. It might be a device-related issue, in which case moving to Firebase or any other hosting provider will not solve it. Another possibility is that an S3 setting causes the problem.


#9

Thanks for that suggestion @Ben_Hartman. We ended up moving the files from S3 to Firebase. Unfortunately, the problem persisted.

I am thinking that it might be down to how we use sequential mp3 files. While there is no issue at all on Alexa, would adding multiple mp3 URLs to a single speechbuilder cause an issue? We had been using the speechbuilder with multiple .addAudio(URLs).

I am guessing we should be using SSML if we are using multiple mp3 URLs. Would that be accurate?

Appreciate the suggestions so far.

Craig


#10

It is the same. The speechbuilder generates SSML in the end.

There might be differences in how Alexa and Google Assistant support audio, though. How many files are you playing in a row, and how long are they in total?


#11

Ok, appreciate the clarification.

3 files in a row. Shortest is 1 second. Longest might be 20 seconds. Total length would be 30 seconds.
Logs show nothing wrong. Unfortunately though, the audio is just as likely to skip at the welcome intent as it is through the action.

Would you recommend giving each .addAudio() URL a text, even if one is a chime?

Example of code below:

        this.$speech
          .addAudio(audio_url)
          .addBreak('250ms')
          .addAudio(this.t('.audio url'))
          .addBreak('250ms')
          .addAudio(this.t('audio url'));
        this.$reprompt.addAudio(this.t('audio url'));

        if (this.getType() === 'GoogleAction') {
          const text = this.t(`.TEXT`);
          this.$googleAction
            .displayText(text)
            .ask(this.$speech, this.$reprompt);
        } else {
          this.ask(this.$speech, this.$reprompt);
        }
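
On the question of giving each audio clip a text: as far as Google's SSML reference describes it, the `<audio>` element may contain inner text that is spoken when the file cannot be played, which would at least make a skipped clip audible rather than silent. Below is a minimal sketch in plain JavaScript that does not use the Jovo speech builder; `escapeXml`, `buildSsml`, the example URLs, and the fallback strings are all hypothetical.

```javascript
// Escape characters that must not appear raw inside XML attributes or text.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one SSML string from clips of the form { url, fallback }.
// The fallback text sits inside <audio> and is intended to be spoken
// only if the file fails to load (assumption based on Google's SSML docs).
function buildSsml(clips, breakTime = '250ms') {
  const separator = `<break time="${escapeXml(breakTime)}"/>`;
  const parts = clips.map(
    ({ url, fallback = '' }) =>
      `<audio src="${escapeXml(url)}">${escapeXml(fallback)}</audio>`
  );
  return `<speak>${parts.join(separator)}</speak>`;
}

// Hypothetical clips; replace with real URLs and texts.
const ssml = buildSsml([
  { url: 'https://example.com/chime.mp3', fallback: 'chime' },
  { url: 'https://example.com/welcome.mp3', fallback: 'Welcome back.' },
]);
console.log(ssml);
```

Whether the Jovo speech builder's .addAudio() accepts such a fallback text directly is worth checking in the Jovo docs before building the string by hand.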

#12

Could you maybe share the relevant part of your response JSON that shows the SSML response?


#13

You’ll notice below that there is SSML for three audio srcs. The first two played, but the third one did not.

        "payload": {
          "google": {
            "expectUserResponse": true,
            "richResponse": {
              "items": [
                {
                  "simpleResponse": {
                    "ssml": "<audio src=\"https://firebasestorage.googleapis.com/v0/b/randomskill-00000.appspot.com/o/US%20-%20English%20Audio%2FTell%20Me%20More%2FTMM%20.mp3?alt=media\"/> <break time=\"250ms\"/> <audio src=\"https://firebasestorage.googleapis.com/v0/b/randomskill-00000.appspot.com/o/US%20-%20English%20Audio%2FReturning%20You%20%2FNF%20Returning%20to%20where.mp3?alt=media\"/> <break time=\"250ms\"/> <audio src=\"https://firebasestorage.googleapis.com/v0/b/staging-00000.appspot.com/o/US%20-%20English%20Audio%2FOnly%20One%2FNF_9-%20IF%20ONLY%20ONE%20GIVER%201.mp3?alt=media\"/>",
                    "displayText": "Sample text here?"
                  }
                }
              ]
            },
            "noInputPrompts": [
              {
                "ssml": "<audio src=\"https://firebasestorage.googleapis.com/v0/b/randomskill-00000.appspot.com/o/US%20-%20English%20Audio%2FReprompt%20-%20Time%20Out%2FNeed%20More%20Time-.mp3?alt=media\"/>"
              }
            ],
            "userStorage": "{\"userId\":\"a1990f7b-cb2f-423c-82a3-980f6a08b306\"}"
          }
        }
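
Since the logs show nothing, one way to rule out hosting problems is to pull every audio URL out of the response SSML and request each one directly, noting the status, content type, and response time. Here is a rough sketch using only Node's built-in `https` module; `extractAudioSrcs`, `headAudio`, and `checkSsml` are hypothetical helper names, and the regex assumes straight double quotes around `src`.

```javascript
const https = require('https');

// Collect the src attribute of every <audio> tag in an SSML string.
function extractAudioSrcs(ssml) {
  const srcs = [];
  const pattern = /<audio\b[^>]*\bsrc="([^"]+)"/g;
  let match;
  while ((match = pattern.exec(ssml)) !== null) {
    srcs.push(match[1]);
  }
  return srcs;
}

// Send a HEAD request and resolve with the status, the headers of
// interest, and the elapsed time in milliseconds.
function headAudio(url) {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    const req = https.request(url, { method: 'HEAD' }, (res) => {
      res.resume(); // discard any body so the socket is released
      resolve({
        url,
        status: res.statusCode,
        contentType: res.headers['content-type'],
        contentLength: res.headers['content-length'],
        elapsedMs: Date.now() - started,
      });
    });
    req.on('error', reject);
    req.end();
  });
}

// Check all clips referenced by one SSML string.
async function checkSsml(ssml) {
  return Promise.all(extractAudioSrcs(ssml).map(headAudio));
}
```

Something like `checkSsml(ssmlFromResponse).then(console.table)` could be run at different times of day to see whether status codes or fetch times vary. Some hosts may not answer HEAD requests the same way as GET, so a clean result here does not prove the device can fetch the file.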


#14

I still think this is a device / firmware issue. Our response JSONs look beautiful. We do not come close to any limits (in terms of audio length, number of audio files, …). What I have discovered: the last audio in a response is more likely to be skipped than the audios at the beginning of the response.


#15

Appreciate the insight. Kind of wish we had a pattern as to which files wouldn’t play. It is fairly sporadic for us. Sometimes it happens when we only have one audio URL. One strange thing we have noticed is the time of day at which this happens. We are in the US Eastern time zone, and it seems to be far more prevalent from 9am to about 3pm; afterwards, it’s much, much better. I initially thought it was just my Google Home Mini, but the Google Actions console test has this issue a decent amount as well.
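
To test the time-of-day impression, skipped clips could be noted with a timestamp and tallied per hour. This is a small sketch only; the `countByHour` helper and the `skips` records are hypothetical, and hours are counted in UTC so the result does not depend on the machine's time zone.

```javascript
// Count events per UTC hour (0-23) from ISO 8601 timestamps.
// Entries that do not parse as dates are ignored.
function countByHour(timestamps) {
  const counts = new Array(24).fill(0);
  for (const ts of timestamps) {
    const date = new Date(ts);
    if (!Number.isNaN(date.getTime())) {
      counts[date.getUTCHours()] += 1;
    }
  }
  return counts;
}

// Hypothetical skip reports noted by hand during testing.
const skips = [
  '2020-01-15T14:05:00Z',
  '2020-01-15T14:40:00Z',
  '2020-01-15T18:10:00Z',
];
const perHour = countByHour(skips);
console.log(perHour[14], perHour[18]);
```

Comparing the tally against the total number of test sessions per hour would show whether skips really cluster in certain hours or simply follow when most testing happens.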