Multimodal apps with Jovo (Voice + Vision)


I am now planning to create simple apps for Echo Show and Google Home devices.
These apps will have two or three screens at most, and users can navigate by voice or by touch.
How can I go about this using Jovo for both platforms? I understand that I will need a separate app for each platform in this case.