Google has open-sourced its AI model for converting sequences of natural-language instructions into actions in a mobile device UI. The model is based on the Transformer deep-learning architecture and achieves 70% accuracy on a new benchmark dataset created for the project.
A team of scientists from Google Research published a paper describing the model at the recent Association for Computational Linguistics (ACL) conference. The goal of the project is to help develop natural-language interfaces for mobile device users who are visually impaired or who temporarily need a “hands-free” mode.