This document presents techniques that enable mobile robots to be deployed as interactive agents in populated exhibitions and trade shows. The mobile robots can be tele-operated over the Internet and, in this way, provide remote access to distant users. Throughout this document we describe several key techniques that have been developed in this context. To support safe and reliable robot navigation, techniques for environment mapping, robot localization, obstacle detection and path planning have been developed. To support the interaction of both web and on-site visitors with the robot and its environment, appropriate software and hardware interfaces have been employed. By using advanced navigation capabilities and appropriate authoring tools, the time required for installing a robotic tour-guide in an exhibition or a commercial fair has been drastically reduced. The developed robotic systems have been thoroughly tested and validated under the real-world conditions offered on the premises of various sites. Such demonstrations ascertain the functionality of the employed techniques, establish the reliability of the complete systems, and provide useful evidence regarding the acceptance of tele-operated robotic tour-guides by the broader public.
The WebFAIR project utilizes state-of-the-art technologies in robotic navigation, remote robot control, web-interfaces and media communication to facilitate realistic and personalized tele-presence in large exhibition workspaces.
The main system components of the WebFAIR system are presented in Figure 1. Through a web-interface, users all over the world are able to tele-control the robotic avatar and specify exhibits or interesting places (e.g. the room where old cars are exhibited) that they may wish to visit. Real-time video as well as high-resolution photographs captured by cameras located on the robotic avatar facilitate detailed inspection on demand. The system also utilizes a multimedia information base providing a variety of information about the exhibition at various levels of detail (e.g. information on the participating organisations, on their products, links to their Web sites, etc.). Interaction with on-site attendants at exhibition booths is made possible through a visual/audio interface that establishes a natural dialogue between the two parties on request.
Besides offering interaction with remote users, WebFAIR robotic avatars also offer interaction with on-site visitors. Through touch screens specially mounted on top of the robotic avatars, on-site visitors can instruct the robotic avatars to guide them to the physical sites of specific exhibitors or to retrieve information from the WebFAIR system database.
Figure 1. WebFAIR system components
In order to navigate safely and reliably, mobile robots must be able to create suitable representations of the environment. Maps are also necessary for human-robot interaction, since they allow users to direct the robot to places in the environment. Our current system uses two different mapping techniques. The first approach is an incremental technique for simultaneous localization and mapping; it is highly efficient and uses grid maps to deal with environments of arbitrary shape, but lacks the capability of global optimization, especially when larger cycles have to be closed. The second approach relies on line features and corner points. It uses a combination of a discrete (Hidden Markov) model and a continuous (Kalman-filter) model and applies the EM-algorithm to learn globally consistent maps. Both methods offer very promising alternatives, each with its own merits. In environments with no clearly defined structure (walls, corridors, corners, etc.), the former method is more suitable, at the price of slightly decreased loop-closing capabilities. When the environment structure is evident, the latter method can be employed, rendering loop closing more robust. Both mapping approaches are described in the remainder of this section.
This first mapping technique uses occupancy grid maps and realizes incremental probabilistic mapping schemes that have previously been employed with great success. Mathematically, a sequence of robot poses and corresponding maps is calculated by maximizing the marginal likelihood of each pose and map relative to the previous pose and map. The overall approach can be summarized as follows: at any point in time, the robot is given an estimate of its pose and a map. After the robot moves further on and takes a new measurement, it determines its most likely new pose. It does this by trading off the consistency of the measurement with the map and the consistency of the new pose with the control action and the previous pose. The map is then extended by the new measurement, utilizing the most probable pose at which this measurement was taken.
To align a 2D measurement scan relative to the map constructed so far, an occupancy grid map is computed out of the previously obtained sensor scans. Additionally, we integrate over small Gaussian errors in the robot pose when computing the maps. This increases the smoothness of the map and of the likelihood function to be optimized and, thus, facilitates range registration. To maximize the likelihood of a scan with respect to this map, we apply a hill-climbing strategy.
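As an illustration of this registration step, the sketch below searches over small pose perturbations and keeps the most likely one. The grid representation, the endpoint-based likelihood, and all identifiers are simplifications assumed for this example, not the actual project code.

```java
// Minimal sketch of hill-climbing scan alignment against an occupancy grid.
// The names and the simple endpoint likelihood are assumptions for illustration.
public class ScanAligner {

    /** Log-likelihood of a scan at pose (x, y, theta): sum of log occupancy
     *  values of the grid cells hit by the scan endpoints. */
    static double logLikelihood(double[][] grid, double[][] scan,
                                double x, double y, double theta) {
        double score = 0.0;
        double c = Math.cos(theta), s = Math.sin(theta);
        for (double[] p : scan) {                       // p = (px, py) in robot frame
            int gx = (int) Math.round(x + c * p[0] - s * p[1]);
            int gy = (int) Math.round(y + s * p[0] + c * p[1]);
            if (gx >= 0 && gx < grid.length && gy >= 0 && gy < grid[0].length) {
                score += Math.log(Math.max(grid[gx][gy], 1e-6));
            }
        }
        return score;
    }

    /** Hill climbing: repeatedly try small pose perturbations and keep the
     *  best one until no neighbouring pose improves the likelihood. */
    static double[] align(double[][] grid, double[][] scan, double[] pose) {
        double[] best = pose.clone();
        double bestScore = logLikelihood(grid, scan, best[0], best[1], best[2]);
        double[] steps = {0.5, 0.5, 0.02};              // x, y (cells), theta (rad)
        boolean improved = true;
        while (improved) {
            improved = false;
            for (int d = 0; d < 3; d++) {
                for (double sign : new double[]{-1, 1}) {
                    double[] cand = best.clone();
                    cand[d] += sign * steps[d];
                    double sc = logLikelihood(grid, scan, cand[0], cand[1], cand[2]);
                    if (sc > bestScore) { bestScore = sc; best = cand; improved = true; }
                }
            }
        }
        return best;    // locally most likely pose for this scan
    }
}
```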
Figure 2 depicts an occupancy grid map obtained with the approach described above for the Belgioioso exhibition center.
Figure 2. An example occupancy grid map.
In the case of structured environments, localization accuracy can be increased by constructing and employing feature-based maps of the environment. Our feature-based mapping algorithm utilizes line segments and corner points, which are extracted from laser range measurements, and treats map features as parameters of the dynamical system according to which the robot's state evolves. The problem is therefore reformulated as the simultaneous determination of the state and the parameters of a dynamical system; that is, a learning problem, which is solved via a variant of the EM-algorithm. In the mapping context, the E-step is responsible for calculating the state of the robot at each point in time, while the M-step is responsible for utilizing the calculated states in order to recompute the map. Since the approach described here is an off-line mapping technique, all past and future observations are available and can be used during the E-step. The problem of estimating variables given both past and future observations is denoted as “smoothing”.
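The text does not detail the feature extractor itself; one common way to obtain line segments from an ordered laser scan is recursive splitting, sketched below. The tolerance parameter and all names are illustrative assumptions, not necessarily the consortium's exact extractor.

```java
import java.util.ArrayList;
import java.util.List;

// Generic recursive split step for extracting line segments from an ordered
// sequence of 2D laser points; shown only to illustrate the kind of
// feature extraction mentioned in the text.
public class LineExtractor {

    /** Perpendicular distance from point p to the line through a and b. */
    static double distToLine(double[] p, double[] a, double[] b) {
        double dx = b[0] - a[0], dy = b[1] - a[1];
        double norm = Math.hypot(dx, dy);
        if (norm < 1e-9) return Math.hypot(p[0] - a[0], p[1] - a[1]);
        return Math.abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / norm;
    }

    /** Recursively split the point run [from, to] at the point farthest from
     *  the chord until every segment fits within the given tolerance. */
    static void split(double[][] pts, int from, int to, double tol,
                      List<int[]> segments) {
        double maxDist = 0.0;
        int splitIdx = -1;
        for (int i = from + 1; i < to; i++) {
            double d = distToLine(pts[i], pts[from], pts[to]);
            if (d > maxDist) { maxDist = d; splitIdx = i; }
        }
        if (maxDist > tol && splitIdx != -1) {
            split(pts, from, splitIdx, tol, segments);   // split points double as corner candidates
            split(pts, splitIdx, to, tol, segments);
        } else {
            segments.add(new int[]{from, to});           // points fit one line segment
        }
    }

    static List<int[]> extract(double[][] pts, double tol) {
        List<int[]> segments = new ArrayList<>();
        if (pts.length >= 2) split(pts, 0, pts.length - 1, tol, segments);
        return segments;
    }
}
```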
To detect and close loops during mapping, our algorithm relies on the global localization capabilities of a hybrid method based on a switching state-space model. This approach applies multiple Kalman trackers assigned to multiple hypotheses about the robot’s state. It handles the probabilistic relations among these hypotheses using discrete Markovian dynamics. Hypotheses are dynamically generated by matching corner points extracted from measurements with corner points contained in the map. Hypotheses that cannot be verified by observations or sequences of observations become less likely and usually disappear quickly.
Our algorithm (Figure 3) iterates the E- and the M-step until the overall process has converged or until a certain number of iterations has been carried out. Our current system always starts with a single hypothesis about the state of the system. Whenever a corner point appears in the robot's measurements, new hypotheses will be created at corresponding positions. On the other hand, hypotheses that cannot be confirmed for a sequence of measurements typically vanish. The resulting map always corresponds to the most likely hypothesis.
Figure 3. Flow diagram of the feature mapping algorithm
The upper image of Figure 4 depicts a typical feature map of the same exhibition site as the one depicted in Figure 2 (Belgioioso exhibition center). The map is computed based on laser range information and the raw odometry data gathered with a mobile robot. As can be seen from the image, the odometry error becomes too high to generate a consistent map. The bottom image of Figure 4 shows the map obtained with the approach described above. As the figure illustrates, our algorithm was able to correctly eliminate the odometry error although it had to close several cycles.

Figure 4. Feature map extraction process. Top: initial map; bottom: final map.
The aim of the exhibition setup tool is to provide an easy and coherent way to set up the WebFAIR system at a particular exhibition site. The exhibition setup tool is used to organise the multimedia information to be visualised by the remote and on-site visitors. The information includes images, graphics, text, video, audio, web links, etc., as required by the particular exhibition. Additionally, the WebFAIR system needs a model of the environment, which is required to visualise the system's operation. More specifically, this tool provides the following functionality:
Figure 5 depicts an example occupancy grid map for navigation (top) and the corresponding interface picture with nine different points of interest marked on it (bottom).
Figure 5. An example grid navigation map (top) and the corresponding interface graphic representation (bottom)
The Avatar Navigation Modules implement the following competences:
The collision avoidance module reacts to unforeseen changes in the environment, based on data provided by the avatar’s sensors. The module receives input from the sonar and laser sensors. The consortium has developed a technique allowing mobile robots to reliably navigate in populated environments; it employs a model of the robot system and trades off the progress towards a goal location against the free space around the robot. This technique is augmented by the integration and forecasting of the motions of people in the environment.
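As a rough illustration of this trade-off, the sketch below scores candidate velocity commands by goal heading, forward speed, and obstacle clearance. The weights, velocity limits, and clearance model are assumptions made for this example; the consortium's actual method is not reproduced here.

```java
// Schematic velocity selection trading off progress toward the goal against
// free space around the robot. All constants are illustrative assumptions.
public class CollisionAvoidance {
    static final double DT = 1.0;   // look-ahead horizon in seconds (assumed)

    static double normalize(double a) {                 // wrap angle to [-pi, pi]
        while (a > Math.PI) a -= 2 * Math.PI;
        while (a < -Math.PI) a += 2 * Math.PI;
        return a;
    }

    /** Weighted trade-off between heading toward the goal, forward progress,
     *  and clearance from the nearest obstacle. Weights are illustrative. */
    static double score(double v, double w, double headingToGoal, double clearance) {
        double headingScore = Math.PI - Math.abs(normalize(headingToGoal - w * DT));
        return 0.8 * headingScore + 0.3 * v + 1.0 * clearance;
    }

    /** Search a small window of commands around the current velocity and
     *  return the admissible command with the best score. */
    static double[] pick(double vCur, double wCur, double headingToGoal,
                         java.util.function.DoubleBinaryOperator clearanceOf) {
        double[] best = {0.0, 0.0};
        double bestScore = Double.NEGATIVE_INFINITY;
        for (double v = Math.max(0.0, vCur - 0.2); v <= vCur + 0.2; v += 0.05) {
            for (double w = wCur - 0.4; w <= wCur + 0.4; w += 0.1) {
                double clearance = clearanceOf.applyAsDouble(v, w);
                if (clearance < 0.1) continue;          // would pass too close to an obstacle
                double s = score(v, w, headingToGoal, clearance);
                if (s > bestScore) { bestScore = s; best = new double[]{v, w}; }
            }
        }
        return best;    // (v, w) command; {0, 0} if nothing admissible was found
    }
}
```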
This module is responsible for determining the position and orientation (pose) of each robot using sensor measurements. Localization is a fundamental task in order for the robot to be able to efficiently plan paths (Path Planning Module) from the current position to arbitrary target positions. Two different probabilistic algorithms are used for localization, both developed by members of the consortium.
The first approach (Monte Carlo localization, or particle filters) utilizes an occupancy grid map in order to compute the probability distribution representing the robot's belief about its position. According to this approach, the probability distribution of the robot's belief is discretized by means of a certain number of weighted samples. Particle filters are approximate Bayes filters that use random samples for posterior estimation.
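A minimal particle filter skeleton in this spirit is sketched below; the motion noise levels and the externally supplied measurement likelihood are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;

// Minimal Monte Carlo localization skeleton: the belief is a set of weighted
// pose samples. Noise magnitudes and the likelihood model are assumptions.
public class ParticleFilter {
    double[][] particles;   // each row: x, y, theta
    double[] weights;
    final Random rng = new Random();

    ParticleFilter(int n) {
        particles = new double[n][3];
        weights = new double[n];
        Arrays.fill(weights, 1.0 / n);
    }

    /** Motion update: move every sample by the odometry reading plus noise. */
    void predict(double dx, double dy, double dtheta) {
        for (double[] p : particles) {
            p[0] += dx + rng.nextGaussian() * 0.02;
            p[1] += dy + rng.nextGaussian() * 0.02;
            p[2] += dtheta + rng.nextGaussian() * 0.01;
        }
    }

    /** Measurement update: weight each sample by the likelihood of the scan,
     *  then resample in proportion to the weights (low-variance resampling). */
    void correct(java.util.function.ToDoubleFunction<double[]> likelihood) {
        double total = 0.0;
        for (int i = 0; i < particles.length; i++) {
            weights[i] = likelihood.applyAsDouble(particles[i]);
            total += weights[i];
        }
        int n = particles.length;
        double[][] next = new double[n][];
        double step = total / n;
        double r = rng.nextDouble() * step, c = weights[0];
        for (int i = 0, j = 0; i < n; i++) {
            double u = r + i * step;
            while (u > c && j < n - 1) c += weights[++j];
            next[i] = particles[j].clone();             // duplicate likely samples
        }
        particles = next;
        Arrays.fill(weights, 1.0 / n);
    }
}
```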
According to the second approach, the robot's state space is discretized topologically, based on the corner points of the map. A number of Gaussian pose hypotheses are maintained, corresponding to the topological locations of the map. Kalman filter update equations are used to update each of the Gaussian pose hypotheses, while discrete Bayesian equations are used to update the probabilities of each of the discrete topological locations.
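The following sketch illustrates the per-hypothesis update just described, using a scalar state purely for brevity; in practice the pose is a vector, and after updating, the discrete probabilities would be normalized across all hypotheses.

```java
// Sketch of the per-hypothesis update: each Gaussian pose hypothesis gets a
// Kalman measurement update, and the discrete probability of its topological
// location is rescaled by the measurement likelihood. Scalar state for brevity.
public class HypothesisUpdate {
    double mean, var;     // Gaussian pose hypothesis (1-D here)
    double prob;          // discrete probability of this topological location

    /** Kalman measurement update with observation z and noise variance r,
     *  followed by the Bayesian reweighting of the discrete hypothesis. */
    void update(double z, double r) {
        double s = var + r;                               // innovation variance
        double innov = z - mean;                          // innovation (pre-update)
        double lik = Math.exp(-0.5 * innov * innov / s)
                   / Math.sqrt(2 * Math.PI * s);          // measurement likelihood
        double k = var / s;                               // Kalman gain
        mean += k * innov;                                // corrected pose estimate
        var *= (1 - k);                                   // reduced uncertainty
        prob *= lik;   // unnormalized; normalize across all hypotheses afterwards
    }
}
```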
In addition to the above reactive navigation modules, the WebFAIR system should be able to plan its trajectories from its current location to a target position. For this purpose, a standard technique, developed by members of the consortium, is employed. It has the advantage that paths can be computed very efficiently, in an on-line fashion, and can quickly be adapted to situations in which the collision avoidance module chooses a detour due to the presence of unforeseen obstacles. The planning module of the WebFAIR system is also capable of detecting dynamic changes in its representation of the environment. This allows the system to quickly react to situations in which entire passages are blocked and from which it would otherwise not be able to escape.
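One way to obtain such efficiently recomputable paths is a wavefront (dynamic programming) pass over the grid map, sketched below: every free cell receives its distance to the goal, so the robot can follow the gradient from any position and replan cheaply after a detour. The unit cost model and all names are assumptions, not necessarily the consortium's exact planner.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

// Breadth-first "wavefront" expanded from the goal over a grid map. Blocked
// cells keep MAX_VALUE; following strictly decreasing values reaches the goal.
public class Wavefront {
    static int[][] distances(boolean[][] free, int gx, int gy) {
        int h = free.length, w = free[0].length;
        int[][] dist = new int[h][w];
        for (int[] row : dist) Arrays.fill(row, Integer.MAX_VALUE);
        ArrayDeque<int[]> queue = new ArrayDeque<>();
        dist[gx][gy] = 0;
        queue.add(new int[]{gx, gy});
        int[][] nbrs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!queue.isEmpty()) {
            int[] c = queue.poll();
            for (int[] n : nbrs) {
                int x = c[0] + n[0], y = c[1] + n[1];
                if (x >= 0 && x < h && y >= 0 && y < w && free[x][y]
                        && dist[x][y] == Integer.MAX_VALUE) {
                    dist[x][y] = dist[c[0]][c[1]] + 1;   // one step farther from goal
                    queue.add(new int[]{x, y});
                }
            }
        }
        return dist;
    }
}
```

Because the distance field covers the whole map, a blocked passage only requires rerunning this pass on the updated grid rather than replanning a single path from scratch.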
In order to efficiently service the visualisation requests issued by several visitors and to minimize the travel time between different locations in the exhibition, the WebFAIR system needs to coordinate these visitors’ requests and accordingly drive the avatar.
Both operations of the module are transparent to the user.
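The coordination strategy for the visitors' requests is not spelled out in the text; as one plausible sketch, pending requests can be ordered greedily by always driving to the nearest unserved location, which keeps the travel between stops short. All names here are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Greedy nearest-neighbour ordering of pending visualisation requests,
// shown only as one plausible way to reduce travel time between stops.
public class RequestScheduler {
    /** Order the requested (x, y) locations by repeatedly visiting the one
     *  nearest to the robot's current position. */
    static List<double[]> order(double[] robot, List<double[]> requests) {
        List<double[]> pending = new ArrayList<>(requests);
        List<double[]> tour = new ArrayList<>();
        double[] pos = robot;
        while (!pending.isEmpty()) {
            double[] best = pending.get(0);
            for (double[] r : pending) {
                if (Math.hypot(r[0] - pos[0], r[1] - pos[1])
                        < Math.hypot(best[0] - pos[0], best[1] - pos[1])) {
                    best = r;
                }
            }
            pending.remove(best);
            tour.add(best);     // drive here next
            pos = best;
        }
        return tour;
    }
}
```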
The goals of the WebFAIR project lie in the development of an interactive robotic system, capable of helping distant (Internet) users to remotely access exhibition sites and trade fairs, as well as providing on-site guidance to physically present visitors. These capabilities are directly related to the way users communicate with the robots and, since the main channels of communication between robots and users are the interfaces, the WebFAIR interfaces have an important impact on the overall success of the project.
Although sharing many common properties, much of the required functionality is different for remote and on-site users of the WebFAIR system. Hardware-related restrictions (remote users typically use personal computers, equipped with keyboards and pointing devices such as mice; robots, on the other hand, cannot carry ordinary keyboards and mice) also impose the need for interface diversification. That is, different interfaces should be designed based on the needs and restrictions of each of the two major categories of users.
- Regarding remote (Internet) users, the proposed system is based on the concept of “robotic avatars”, i.e. mobile robotic agents that operate as the users’ remote representatives, able to carry out actions in their workspaces (exhibition/trade fair) and transmit information about them back to the users. Remote users should be able to perceive the exhibition through the “senses” of the robotic avatar and “act” in the remote environment according to the avatar’s capabilities. Additionally, remote users should be able to instruct the mobile robotic platform to rove around the exhibition site in a desired way, visit specific exhibits, guide them through specific parts of the exhibition (a series of exhibits), or simply wander around. The remote users’ interface should also be designed to run under any hardware/operating system configuration and should make no assumptions about special hardware or software (typical remote users do not possess special hardware or special pointing devices besides standard keyboards and mice).
- Regarding on-site (physically present) users, WebFAIR robots should act as robotic guides by accepting commands directly through their on-board interfaces and providing exhibit-related information on demand. One of the key features of the WebFAIR system lies in the ability of the mobile robots to interact in a natural and friendly way with on-site visitors. For this purpose, user input is achieved with touch screens specially mounted on top of the robots (the user touches the screen in order to give input), while robot output is provided by means of both the screen and a speech synthesis device.
Each visitor (either remote or physically present) typically spends a limited amount of time with the robot; therefore, means for an easy and reliable interaction between the robot and the visitor should be provided. The key challenge in the design of the WebFAIR user interfaces is to make them intuitive and easy to use, so that untrained and non-technical users can operate the robots without detailed instructions. The appearance of the robots themselves must also be appealing, so that users are attracted to them and tempted to spend time with them.
Similar layouts and design concepts have been adopted for the development of both the remote and the on-board interfaces. Although the on-board interface is based on an LCD touch screen and a speech interface, unlike the remote user interface, which typically runs on a standard personal computer with common input/output devices, both interfaces are web-based and run under ordinary web browsers. The actual interfaces consist of a number of web pages, with appropriate presentation, control and information fields, as described in the next sections.
The developed web-interface (the interface for remote users) has been designed to provide enhanced functionality and ease of use, allowing personalized control of the robot. It uses commercial live streaming video and broadcast software to provide continuous transmission of video recorded with the robot’s cameras to the remote user. Additionally, web users are provided with flexible control over the robot. They can control the robot exclusively for a fixed amount of time, which is generally set to 10 minutes per user. The user that has control over the robot can direct it to arbitrary points in the exhibition. The user can select from a list of predefined guided tours or direct the robot to visit particular exhibits or locations in the exhibition. At any point in time, the user can request a high-resolution image to be grabbed at the camera's maximal resolution. Furthermore, the interface allows control of the rotation of the robot, enabling the user to look in any desired direction; the user may also request that the robot move around an exhibit in order to view it from several directions. Finally, a video-conferencing feature improves the feeling of tele-presence by allowing the web user to initiate a tele-conference call with on-site attendants of the exhibition.
The web interface of the WebFAIR system relies on Java techniques to provide smooth and realistic visualisation of the avatar's actions. Input can be given by means of the keyboard and the mouse, so that no additional input device is necessary at the user's site. Output is in the form of online audiovisual streams and of images captured by the robot cameras. Additionally, text, graphics, images and sounds are offered to the user to provide necessary background information on the exhibits, the exhibitors and the exhibition site.
A diagram of the control page of the interface is depicted in Figure 6. The left side contains the predefined tours offered to the user as well as the list of exhibits that are known to the robot. The center shows the live stream as well as a Java applet animating the robot in a 2D floor plan. This map can also be used to directly move the robot to an exhibit or to an arbitrary location in the exhibition. Between the map and the live stream, the interface includes control buttons as well as a message window displaying system status messages. The right part of the interface shows multimedia information about the exhibit, including links to relevant background information.
Figure 6. The web-interface.
Besides interacting with remote users, WebFAIR robots communicate and interact with on-site visitors as well. For this purpose, a reduced version of the web interface is displayed on a touch screen appropriately mounted at the rear side of the robot. The on-board interface is designed so that users in the exhibition site have access to services similar to those offered to web visitors. The robot carries an LCD touch screen through which users can enter their requests; the software then delivers information about the exhibition in a browser environment similar to that of the web interface. Users enjoy a guided tour by following the robot around the exhibition site. At the exhibits, the robot uses audio to communicate with its audience. The users can either request additional information or, alternatively, let the robot proceed.
An important aspect of the WebFAIR on-board interface is its reaction to people. Besides interaction through the touch screen, the robot uses certain anthropomorphic characteristics to express its intentions and its internal status. As an example, the gazing direction is used to express the intended direction of motion. This way, on-site visitors can adapt their motion so as not to obstruct its path. Moreover, a mechanical mouth and eyebrows are used to express “robot emotions” such as dissatisfaction or friendliness, by adopting the facial expressions that we, humans, use to express the same feelings. As an example, if on-site visitors do not obstruct the path of the robot, the robot appears happy. In the opposite case, if people steadily obstruct the robot's intended direction of motion and repeatedly ignore its warning messages to clear its way, then the mechanical parts of its head indicate dissatisfaction.
Figure 7. The on-board interface.
All communication between the web/on-board user interfaces and the main navigation system of the robot is handled by the WebFAIR interface server. That is, the WebFAIR interface server handles all requests from users to robots and vice versa, serving as the main integrator between the robot navigation modules and the user interfaces.
The interface server is implemented using the Java programming language and a client-server approach. Accordingly, both the web and the on-board user interfaces are implemented as web pages created using HTML and containing Java applets.
Regarding the user interface (the web/on-board HTML pages), different Java applets correspond to the various sections of the interface depicted in Figures 6 and 7.
A main Java applet on the web page (the client) opens a connection to the WebFAIR interface server. All necessary information (the number of users waiting in the queue, in the case of the web interface; the location of the robot; messages to the user; control input from the user, etc.) is communicated through this connection. Inter-applet communication takes place between the main applet and all other applets in the same web page, so that they can use the main connection to the server and remain synchronized. The user selection applet, for instance, communicates to the main applet all the user choices, such as a specific exhibit, a specific tour, or a specific position for the robot to go to (translated into X, Y coordinates for the robot). This information is then sent to the WebFAIR interface server, which in turn processes it and sends it to the robot navigation system. Open communication links are needed between the applets and the WebFAIR interface server to convey relevant information (e.g. the location of the robot, the video stream, and commands to the robot). These modules are connected via a TCP/IP socket.
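A minimal sketch of this kind of socket exchange is shown below: a client opens a TCP connection to the interface server, sends a goal request, and reads back status lines. The host name, port, and line-based message format are invented for illustration and do not reflect the project's actual protocol.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

// Illustrative TCP client: send one command line, then print status lines
// coming back from the server (e.g. robot pose updates).
public class InterfaceClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical host, port and message format, assumed for this sketch.
        try (Socket socket = new Socket("webfair-server.example", 9000);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println("GOTO 12.5 4.2");                   // ask the robot to drive to (x, y)
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println("server: " + line);      // status messages from the server
            }
        }
    }
}
```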
The WebFAIR interface server (a Java application) is connected as a client to the robot navigation module (the server) using a TCP/IP socket connection. In other words, the Java application acts as a client to the robot navigation system and as a server to the user interface applets (Figure 8). The Java interface server performs multiple functions for both the web and the on-board interfaces, managing all user connections to the system, regardless of whether they come from the web or the on-board interface. A priority protocol has been established to better manage and integrate on-line web visitors and on-site users. Thus, on pre-determined days and times, priority is given by the Java server to either on-site or web visitors, depending on exhibition visiting hours or other factors. The Java application creates and manages the queue of user requests for connections accordingly and activates the first user in the queue when that user's turn comes. The duration of each user’s connection is also decided and handled by the server, which sends the appropriate messages to activate or terminate connections. As a client, the Java application constantly receives the robot’s coordinate information from the robot navigation module and conveys it to the applets. Additionally, the robot navigation module can send information regarding the exhibit the robot has arrived at, a string indicating the filename of a requested picture so that it can be displayed to the user through the web browser, and other messages concerning the state of the robot. The HTTP server runs on the same machine as the Java interface server for security reasons, since the applets are allowed to open a socket connection only back to the hostname from which they were loaded. Thus, the Java interface server acts as an intermediate stage between the web serving and the final output, integrating all modules together.
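The sketch below illustrates the kind of queue handling described above: two FIFO queues, with a switchable flag deciding which class of visitors is served first during the current time window. All class names are illustrative assumptions, not the server's actual code.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative session queue: users wait in FIFO order within their class,
// and a priority flag (flipped on pre-determined days/times) decides whether
// on-site or web visitors are activated first.
public class SessionQueue {
    record User(String id, boolean onSite) {}

    private final Queue<User> webUsers = new ArrayDeque<>();
    private final Queue<User> onSiteUsers = new ArrayDeque<>();
    private boolean onSitePriority;   // set according to visiting hours or other factors

    void enqueue(User u) { (u.onSite() ? onSiteUsers : webUsers).add(u); }

    /** Activate the next user: the prioritized class is drained first. */
    User next() {
        Queue<User> first = onSitePriority ? onSiteUsers : webUsers;
        Queue<User> second = onSitePriority ? webUsers : onSiteUsers;
        return !first.isEmpty() ? first.poll() : second.poll();
    }

    void setOnSitePriority(boolean p) { onSitePriority = p; }
}
```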
Figure 8. Block diagram of the WebFAIR interface server
For more details email to: webfair@ics.forth.gr