The Mozilla Actuator
The Mozilla actuator inherits from the abstract_actuator class to provide some limited voice control to a running Mozilla process. It provides support for opening URLs in the current pane, in a new tab, or in an entirely new window, as well as the ability to open new windows and tabs to a default URL. It also implements its own forward/back stacks; forward and back navigation can be requested directly, or through the inherited undo and redo functions.
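The forward/back bookkeeping described above can be pictured as two stacks around a current URL. The Python sketch below is illustrative only: the class name History and its methods are hypothetical and not taken from the actuator's source, and treating undo and redo as plain aliases for back and forward is an assumption.

```python
class History:
    """Tracks the current URL plus back and forward stacks (hypothetical names)."""

    def __init__(self, start_url):
        self.current = start_url
        self.back_stack = []
        self.forward_stack = []

    def visit(self, url):
        # A fresh navigation invalidates anything on the forward stack.
        self.back_stack.append(self.current)
        self.current = url
        self.forward_stack.clear()

    def back(self):
        # Returns the URL to reopen, or None if there is nothing to go back to.
        if not self.back_stack:
            return None
        self.forward_stack.append(self.current)
        self.current = self.back_stack.pop()
        return self.current

    def forward(self):
        # Returns the URL to reopen, or None if there is nothing to go forward to.
        if not self.forward_stack:
            return None
        self.back_stack.append(self.current)
        self.current = self.forward_stack.pop()
        return self.current

    # Assumption: undo and redo simply map onto back and forward.
    undo = back
    redo = forward
```

The URL returned by back or forward would then be handed to whatever routine asks Mozilla to open it.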
Because we haven’t yet found a way to send arbitrary messages to applications, we are using an alternate method of controlling Mozilla. Unix and Unix-like systems provide a remote-activation command line option for Mozilla. We are using system calls of the form
mozilla -remote "openurl(www.carleton.edu)"
to control Mozilla. Documentation for remote operation of Mozilla in this manner can be found on the Mozilla website.
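As a rough illustration of issuing such a call from a program, the Python sketch below builds the argument list and runs it. The function names are hypothetical, and the optional target values ("new-tab", "new-window") are an assumption based on Mozilla's remote-control documentation rather than on the actuator's source.

```python
import subprocess

def remote_command(url, target=None):
    """Build the argument list for a remote openurl request.

    `target` is an assumption of this sketch: None for the current pane,
    or "new-tab" / "new-window" for a new tab or window.
    """
    if target is None:
        action = f"openurl({url})"
    else:
        action = f"openurl({url},{target})"
    return ["mozilla", "-remote", action]

def open_url(url, target=None):
    # Passing a list (no shell) avoids quoting problems in the URL.
    # Returns True when the mozilla command exits with status 0.
    # Raises FileNotFoundError if no mozilla executable is on the PATH.
    return subprocess.run(remote_command(url, target)).returncode == 0
```

For example, remote_command("www.carleton.edu") yields the same command shown above, split into arguments.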
The actuator is designed to work with spoken-English representations of web sites, using two maps to convert English text to a valid URL. For instance, “google” will map to www.google.com. The first map is static: it is initialized from the data file mozillaAc.cfg, whose entries can be specified by the user. The second map is rebuilt dynamically for each web page and maps the text of each hyperlink to its destination. This allows the user to follow links just by speaking the text of the link.
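A minimal sketch of the lookup, in Python, might look like the following. The function name resolve is hypothetical, the maps are modeled as plain dictionaries, and the query order (static map first, then the link map) is an assumption; the real actuator may order or match its lookups differently.

```python
def resolve(phrase, static_map, link_map):
    """Return the URL for a spoken phrase, or None if neither map knows it."""
    key = phrase.strip().lower()
    # Assumption: the user-configured static map is consulted first,
    # then the map built from the links on the current page.
    if key in static_map:
        return static_map[key]
    return link_map.get(key)
```

With a static map of {"google": "www.google.com"}, the phrase “google” resolves to www.google.com regardless of what the current page links to.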
When it opens a new URL, the actuator first makes a system call to curl to download the raw HTML of the new page. The downloaded file is then scanned for links, and each link’s text and target are loaded into the link map. In addition, the text of each link is written out to a file, which is dynamically compiled into the parser’s grammar (again with a system call) so that future requests to follow those links will be recognized by the parser.
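The link-scanning step can be sketched in Python with the standard library's HTML parser. This is not the actuator's own scanner: the names LinkCollector and build_link_map are hypothetical, the HTML is assumed to have been downloaded already (for example by curl) and is passed in as a string, and the grammar-file step is omitted because its format is not described here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects {link text: absolute target URL} from anchor tags."""

    def __init__(self, base_url):
        super().__init__()  # character references are decoded by default
        self.base_url = base_url
        self.links = {}
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse newlines and runs of spaces, and lower-case the text.
            text = " ".join("".join(self._text).split()).lower()
            if text:
                self.links[text] = urljoin(self.base_url, self._href)
            self._href = None

def build_link_map(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links
```

Relative targets are resolved against the page's own URL so that every entry in the map can be opened directly.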
The mozillaAc.cfg file, which is located in the Data directory, lets the user specify a number of settings. These include the static mapping from English speech to valid URLs, the default URL to which new tabs and windows will open, and the path to the parser on the user’s machine. The file itself contains a fuller description of how it may be modified.
Known Bugs
- Link names whose first word matches the spoken-English name of one of the statically mapped URLs will open the static URL rather than the link target. This is probably caused by the order in which the maps are queried. It is unknown whether this case can be handled without greatly changing the current architecture.
- HTML character entities (sequences of the form &name;, such as &amp;) are not all handled properly.
- Unexpected formatting in the text of links (embedded newlines, for example) can cause the newly compiled grammar to fail to parse new utterances.
- If the user manipulates Mozilla without using the actuator, the actuator’s forward/back stacks will fall out of sync with the internal history kept by Mozilla.
- There is currently no support for vocally specifying valid URLs. This limits the sites that can be visited. Adding this feature is not difficult, provided the parser can be made to work with statements like “go to www dot Carleton dot edu”.
- Interaction with forms, buttons, surveys, images-as-links, Flash, and other non-text-based features of the web is not supported, nor can it easily be.
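
For the missing spoken-URL feature noted above, the conversion itself could be as simple as the Python sketch below once the parser delivers the words of the utterance. The function name spoken_to_url, the “go to” prefix handling, and the “slash” keyword are assumptions made for illustration; the hard part, getting the parser to accept such utterances, is not addressed here.

```python
def spoken_to_url(utterance):
    """Convert e.g. 'go to www dot Carleton dot edu' to 'www.carleton.edu'."""
    words = utterance.lower().split()
    # Drop a leading "go to", which is an assumed command prefix.
    if words[:2] == ["go", "to"]:
        words = words[2:]
    # Assumed spoken keywords for URL punctuation.
    replacements = {"dot": ".", "slash": "/"}
    return "".join(replacements.get(word, word) for word in words)
```

A site whose name contains the word “dot” or “slash” would be mangled by this scheme, so a real implementation would need some way to escape those words.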