
Bartlett School of Architecture, UCL


Teaching a Robot To See: A Conversation in Eye Tracking in the Media Arts and Human-Robot Interaction

  • On October 8, 2019

Eye movements, particularly the point of gaze, can help predict social cues in communication. Participants in a conversation look for these cues for affirmation or feedback about how to behave and react. The point of gaze can signal intention, which can help elicit conversation in Human-Robot Interaction (HRI). By engaging each agent in a constant check of intention through nonverbal cues, communication can move beyond simple uncanny reactions toward deeper layers of information about how to navigate these feedback systems. By examining the calibration sequences of eye tracking technologies, we can create novel tasks as checkpoints in HRI communication. When paired with artistic expression, these novel tasks can create an illustrative environment which captures attention and intention more definitively and accurately.

1.1 Introduction

Thanks to recent advances in precision and affordability, eye tracking technologies have become available to visual media artists who aim to create a dialogue with robotic interactions. The aim of this paper is to introduce the reader to current trends in eye tracking measurement and how they have been applied to build tracking systems for interaction design between an environment and a participant. By referencing this technological agenda, my research project strives to bridge the gap between the pragmatic conditions set by these technologies and design choices and aesthetics. By studying calibration processes in the accuracy and precision of eye tracking, this paper uncovers how these techniques can aid collaborative novel tasks between humans and robots.

1.2 Research: 

How can processes in the accuracy and precision of eye tracking uncover techniques that aid collaborative novel tasks between humans and robots in the media arts?

This method asks questions of the current state of eye tracking and of Human-Robot Interaction in relation to eye tracking. Chapter 4 will focus on my research and on designing interactions to stimulate communication and collaboration, while chapter 5 will evaluate and speculate on future iterations.

1.3 Your Eye’s Motion, By Luna

Figure 01: Your Eye’s Motion, by Luna, 2019.

Your Eye’s Motion by Luna is an exploration of human perception through robotic motion. The installation tracks the position of your eye and the direction of gaze, or Point of Gaze (POG), as an input for controlling a small robotic arm situated inside an enclosed environment. The gaze of the viewer is mapped onto the range of motion of the robotic arm to create an extension of the eye and take the body to places it cannot access otherwise (similar to an optical device).

1.4 The Entangled Eye

Figure 02: The Entangled Eye promotional photograph by Dautel, Bugdayci, and Wuss, 2019.

The Entangled Eye is an exploration of visual perception through robotic motion. Luna and Laika are two robotic creatures with curious and animate behavior programmed to elicit your attention. As the whimsical creatures chase your eyes with unique kinematic expressions, the direction of your gaze orchestrates a conversation. The indeterminacy of attention coupled with the animacy of kinetic movement constitutes the basis for our attempt at reconfiguring and understanding the perceptual experience of behavioral artifacts and animism.

1.5 Goals

My research projects have become an extension of questions regarding the exploration of visual perception, aided by tracking of the eye and mapping its artifacts to robotics. Can media artists begin working with new models of creative research? By examining practices in eye tracking research, this paper addresses a significant technological beacon that can aid in this relationship, allowing researchers to understand how these technologies affect design situations. When applied to Human-Robot Interaction, can the gaze help systems analyze and anticipate the needs and attention of the participant?

Chapter 2: Eye Tracking and How it Works

Eye tracking has been used to understand attention, memory, language, problem solving, and decision making in many fields, including but not limited to medical research (ALS and glaucoma), professional performance such as the attention of air traffic controllers, infant behavioral analysis, and virtual reality. By understanding how participants visually perceive their environments, media artists and designers can begin to expose qualities of engagement rather than settling for surface-level analysis. The “eye-mind” hypothesis infers that attention can be measured through the cognitive processes connecting eye and mind (Just and Carpenter, 1976). By studying the point of gaze with eye tracking techniques, researchers seek to uncover truths about visual perception and eye motion.

2.1 Types of Measurements

In an effort to understand the different types of eye movement measurements, I would like to briefly cover the different fields of study within eye tracking research. These sets of dependent variables help quantify eye movements as properties of motion during a period of time: direction, amplitude, duration, velocity, and acceleration (Holmqvist et al., 2017). Questions asked of these variables include: In what direction did the eye move? How far did the eye move, and for how long? Was there any acceleration during these movements? These questions can be broken down into categories of measurement: Movement Measurements, Position Measurements, Numerosity Measurements, and Latency Measurements.
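These questions can be made concrete in code. The sketch below is a minimal illustration rather than any tracker's actual API: it derives amplitude, direction, duration, and peak velocity from a short run of gaze samples, with coordinates assumed to be in degrees of visual angle.

```python
import math

def movement_properties(samples, hz=250):
    """Derive basic eye-movement properties from raw gaze samples.

    `samples` is a list of (x, y) coordinates in degrees of visual
    angle recorded at `hz` samples per second. Names and units are
    illustrative, not taken from any specific tracker SDK.
    """
    dt = 1.0 / hz
    dx = samples[-1][0] - samples[0][0]
    dy = samples[-1][1] - samples[0][1]
    amplitude = math.hypot(dx, dy)                # how far did the eye move?
    direction = math.degrees(math.atan2(dy, dx))  # in what direction?
    duration = (len(samples) - 1) * dt            # for how long?
    velocities = [
        math.hypot(x1 - x0, y1 - y0) / dt         # was there acceleration?
        for (x0, y0), (x1, y1) in zip(samples, samples[1:])
    ]
    return amplitude, direction, duration, max(velocities)
```

A 250 Hz run of three samples one degree apart, for instance, yields a two-degree amplitude over 8 ms.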

Figure 03: Free Scan Path sweep with vertical direction
(Holmqvist et al., 2017)

Movement Measurements refer to eye movements in space and the properties of these movements. Movement direction takes into account the scan path, the linear or curved paths of the eye during saccades or drifts. Saccadic movements occur during visual perception as our eyes move quickly from object to object, building a mental picture for the brain. Saccades are important in understanding how images are formed on the horizontal and vertical planes. This motion can be plotted on a linear histogram and averaged to create an Overall Fixation Vector, though saccades very rarely take the shortest path between two points. The Overall Fixation Vector applied to the eyes' movement creates a scanpath. “Scanpath direction is a measure of the general direction of a sequence of fixations and saccades while scanning a stimulus” (Holmqvist et al., 2017, p.301).

While Movement Measurements track movement in space, Position Measurements focus on the stillness of gaze in one or many positions. This measurement asks: where does the participant look, and for how long? Position is measured in Cartesian X and Y coordinates and in Areas of Interest (AOI). By applying position dispersion to the coordinate data we can understand how focused a participant is while following a stimulus in an environment. When recording position data, raw samples are averaged to create a fixation data set, which reduces noise and is the form most commonly used in this field of measurement. Over the years many have attempted to gather data on a Z vector in a three-dimensional plane, which considers the relative distance to the object of fixated gaze. However, because this is calculated by averaging an already averaged set of data, it is not usually relied on, as the result can be quite noisy. Position Measurements also at times include pupil dilation, a form of data from which conclusions have been drawn about mental illness and arousal in response to stimuli in the environment (Holmqvist et al., 2017).
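The dispersion idea behind fixation detection can be sketched as follows. This is a simplified member of the dispersion-threshold (I-DT) family of algorithms; the threshold values are illustrative, not canonical figures from Holmqvist et al.

```python
def dispersion(window):
    """Bounding-box dispersion of a gaze-sample window (width + height)."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def idt_fixations(samples, max_dispersion=1.0, min_samples=5):
    """Simplified dispersion-threshold (I-DT style) fixation detection.

    Consecutive samples whose dispersion stays under `max_dispersion`
    degrees for at least `min_samples` samples are grouped into a
    fixation; the fixation centroids are returned.
    """
    fixations = []
    i = 0
    while i + min_samples <= len(samples):
        j = i + min_samples
        if dispersion(samples[i:j]) <= max_dispersion:
            # grow the window while the dispersion stays low
            while j < len(samples) and dispersion(samples[i:j + 1]) <= max_dispersion:
                j += 1
            window = samples[i:j]
            fixations.append((sum(p[0] for p in window) / len(window),
                              sum(p[1] for p in window) / len(window)))
            i = j
        else:
            i += 1
    return fixations
```

Averaging the raw samples into a centroid is exactly the noise-reduction step described above.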

Numerosity Measurements calculate how many eye movements occurred during event detection or observation of a stimulus. The frequency of saccades can help accurately detect how many events a participant registered. Blink rate is also attributed to this form of measurement (Holmqvist et al., 2017).

In an attempt to add time and space to the study of eye movements, Latency Measurements add distance (from one point to another) and time delay. The relationship of latency to the frequency of an eye tracker's camera is paramount to the accuracy of its data set. As Holmqvist et al. caution, “for values of latency measures to be correctly measured, it is crucial that your stimulus program and recording system are temporally accurate; if the flash or new stimulus picture is shown 48 ms later than the mark shows in your eye-tracking data file, due to slow loading and rendering, your latency value will be 48 ms too high…the sampling frequency on average causes a constant error of half a sample (e.g. 1 ms for a 500 Hz eyetracker) for all latency measures… and an 8 ms refresh time will reduce the efficient speed to 125 Hz for these measures, even if a much faster eye-tracker is being used” (Holmqvist et al., 2017, pp.427-428).

2.2 Scleral search coils

Figure 04: SSC Headset (Whitmire et al., 2019)

Video-based eye tracking dominates consumer products, mostly due to its unobtrusiveness compared with a helmet- or glasses-based system. In recent years, however, many eye research teams have argued that magnetic tracking using scleral search coils (SSCs) allows for much higher accuracy, though it requires a mounting system around the head. “In this technique, the head is positioned between large Helmholtz coils, which generate a uniform magnetic field. A wire loop embedded in a silicon annulus is placed on the sclera of the eye. The magnetic field induces a voltage in the scleral coil according to its orientation. By examining the magnitude of the voltages induced in the thin wires leading from the coil, the system estimates the eye’s orientation” (Whitmire et al., 2019). The orientation is then calculated using the Biot-Savart law, which describes the magnetic fields produced by currents: the law states how the value of the magnetic field at a specific point in space, due to one short segment of current-carrying conductor, depends on each factor that influences the field (Encyclopedia Britannica, 1998). However accurate, this technique requires invasive placement of the coil on the eye, the use of a local anesthetic, and a specialized technician (Whitmire et al., 2019).
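For reference, the Biot-Savart law invoked here can be written as below; combined with Faraday's law of induction, it links the coil's orientation to the measured voltage. This is a standard-physics sketch, not Whitmire et al.'s exact formulation.

```latex
% Magnetic field at a point due to a current-carrying loop (Biot-Savart):
\mathbf{B} \;=\; \frac{\mu_0 I}{4\pi} \oint \frac{d\boldsymbol{\ell} \times \hat{\mathbf{r}}}{r^{2}}
% Voltage induced in a scleral coil of N turns and area A inside an
% alternating field B(t), where \theta is the angle between the field
% and the coil's normal -- the orientation-dependent signal the tracker reads:
V(t) \;=\; -\,N A \cos\theta \,\frac{dB}{dt}
```

Because the induced voltage scales with $\cos\theta$, comparing voltage magnitudes across differently oriented field coils recovers the eye's orientation.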

In Figure 04 coiled electromagnets are placed on a VR headset to track magnetic fields and orientations between the coils embedded in the eye (Whitmire et al., 2019).

2.3 Eye corneal-reflection/pupil-centre

Figure 05: Dark pupil and glint (left), bright pupil (right)
(Gneo et al., 2012)

The most common and least invasive eye tracking technique is based on a stationary camera and illumination. By pointing infrared light toward an observer's eye, a vector can be calculated during the calibration stage between the corneal reflection, or “glint” (Holmqvist et al., 2017), and the pupil centroid. Under this illumination the pupil appears as a white or black blob which can then be detected through computer vision (blob detection); the white case is known as the “bright pupil” effect (Poole and Ball, 2004).
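As a toy illustration of this pipeline, assuming a thresholded infrared frame rather than any real camera API, blob detection reduces to a centre-of-mass computation, and the gaze-relevant signal is the glint-to-pupil vector:

```python
def blob_centroid(binary_image):
    """Centre of mass of a thresholded 'bright pupil' blob.

    `binary_image` is a 2D list of 0/1 values, a stand-in for a
    thresholded infrared camera frame.
    """
    xs, ys, n = 0.0, 0.0, 0
    for y, row in enumerate(binary_image):
        for x, v in enumerate(row):
            if v:
                xs += x
                ys += y
                n += 1
    return (xs / n, ys / n)

def pupil_glint_vector(pupil, glint):
    """The tracked signal: the vector from the corneal glint (roughly
    fixed as the eye rotates) to the pupil centre (which moves with
    the eye), in camera-image pixels."""
    return (pupil[0] - glint[0], pupil[1] - glint[1])
```

Because the glint stays nearly stationary while the pupil moves, this vector changes with eye rotation but is largely insensitive to small head translations.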

In Figure 06 below the camera is placed at different angles in front of the eye with an infrared light. The larger white circle represents the bright pupil while the small white circle represents the glint.

Figure 06: Three images showing glint and bright pupil reflection from infrared illumination.
(Poole and Ball,2004)

There are three types of corneal-reflection/pupil-centre techniques, distinguished by where the camera and illumination sit. The most commonly used system is the static eye tracker, which places the camera and light on a table in front of the participant. This system sits close to the observer while stimuli are presented, usually on a monitor, to represent the Point of Gaze (POG), much like the association one might have with a computer mouse clicking and navigating elements on a screen. The second common type places the camera and illumination device at a fixed location on the participant, in the form of a helmet or glasses; a scene camera is usually used to record and analyze the stimulus being observed. A third type places the tracker on the helmet or glasses in a way that keeps distance between the stimulus and the participant.

It is important to briefly understand the difference between remote systems and tower- or head-mounted systems. By allowing participants to engage freely with their environment, remote systems allow for free head movement and engagement. These systems are usually built into environments such as cars and flight simulators. Tower- and head-mounted systems require the participant to be secured into a device that stabilizes head and body movements. This technique usually requires a chin rest or other stabilization system to keep the head from moving from side to side (Holmqvist et al., 2017).

Figure 07: The SMI 1250 High Speed Tower calibration experiment (Holmqvist et al., 2017)

In Figure 07 the SMI 1250 High Speed Tower eye tracker fixes a participant's head in place to precisely detect eye movements in relation to stimuli on the screen (Holmqvist et al., 2017).

One of the main weaknesses of pupil-based corneal reflection systems is that accuracy may be disturbed by eyelashes or drooping eyelids. Many also argue that extreme gaze and reflection angles can cause a drop in accuracy.

2.4 Calibration

To determine the accuracy of a participant's Point of Gaze, it is important to calibrate each user's eyes to the monitored system. This calibration works by displaying a dot on the screen. If the eye fixates for longer than a certain threshold time and within a certain area, the system records that pupil-centre/corneal-reflection relationship as corresponding to a specific x,y coordinate on the screen. Once the image processing software has identified the centre of the pupil and the location of the corneal reflection, the vector between them is measured, and, with further trigonometric calculations, the point of regard can be found (Poole and Ball, 2004).
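A minimal version of this mapping can be sketched as an affine fit from three calibration dots. Real systems use many more points and higher-order polynomial fits; the function names here are illustrative.

```python
def fit_affine_calibration(eye_points, screen_points):
    """Fit screen = A * eye + b from exactly three (vector, target)
    pairs, mirroring the calibration dots shown on screen. Solved
    with Cramer's rule; a sketch, not a production calibration.
    """
    (x0, y0), (x1, y1), (x2, y2) = eye_points
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    coeffs = []
    for axis in (0, 1):  # solve screen-x and screen-y independently
        s0, s1, s2 = (p[axis] for p in screen_points)
        a = ((s1 - s0) * (y2 - y0) - (s2 - s0) * (y1 - y0)) / det
        b = ((x1 - x0) * (s2 - s0) - (x2 - x0) * (s1 - s0)) / det
        coeffs.append((a, b, s0 - a * x0 - b * y0))
    return coeffs

def apply_calibration(coeffs, eye_point):
    """Map a pupil-centre/corneal-reflection vector to screen x, y."""
    x, y = eye_point
    return tuple(a * x + b * y + c for a, b, c in coeffs)
```

Once fitted, every subsequent pupil-glint vector is pushed through `apply_calibration` to yield an on-screen point of regard.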

Figure 08: Visualization of data plotted during a DIY eye tracking calibration
(Janthanasub and Meesad, 2015)

However, this system lacks the ability to calibrate for three-dimensional spaces and stimuli which require focus along the Z axis. I will explore this process in the following chapter as it pertains to my research and design project. By using robotics to create a three-dimensional calibration platform, my team hopes to understand and extend the constraints and abilities of focus in more than two axes.


2.5 Convolutional Neural Networks

Until recently, applying trained neural networks to regression models of gaze was largely speculative. Over the past five years, machine learning algorithms have captured the interest of technology firms and businesses on a global scale. Convolutional Neural Networks (CNNs) are trained with gradient descent to fit regression models over the data (Kogan, 2018). Figure 09 shows a general scheme for calculating point of gaze prior to advances in neural networks (Gneo et al., 2012).

Figure 09: An example of a general scheme for calculating Point of Gaze
(Papoutsaki et al., 2016)

The power of Convolutional Neural Networks supports the idea of training a system to recognize pixel data using image classification. From a data set, we can apply the training algorithm to each individual pixel. Let f represent a function whose aim is to accurately guess the value of a pixel in response to the pixel values it was fed when prompted. It does this over and over, improving the more data it is fed, until its hypothesis succeeds. By compartmentalizing image pixels, CNNs can break the image down into smaller samples for high predictability, using “weights,” a mathematical breakdown of the data set, to achieve the highest accuracy. These models are investigated thoroughly by data scientists whose work relies on computational practices in gradient descent, which descends through the valleys and arcs of the error surface. By repeating this computation over a data set of roughly 10,000 points, a representation can be formed that looks close to the original data (Kogan, 2018).
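The gradient descent procedure Kogan describes reduces to a few lines. This sketch fits a single weight and bias by regression rather than a full CNN, but the update rule, stepping each parameter down the gradient of the mean squared error, is the same idea scaled up.

```python
def gradient_descent_fit(points, lr=0.01, steps=5000):
    """Fit y = w*x + b by gradient descent on the mean squared error.

    A CNN trains the same way, just with far more parameters
    arranged in convolutional layers. `lr` and `steps` are
    illustrative hyperparameters.
    """
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(steps):
        # gradient of mean((w*x + b - y)^2) with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Run on points drawn from y = 2x + 1, the descent converges to roughly those coefficients.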

In eye tracking we can assume a centroid calculated on the pupil. Using these CNNs, an algorithm can attempt to map Point of Gaze to pixel data on a screen in a fashion similar to image classification. When applying neural network analysis, an independent model (linear regression) is used rather than the feature-extracted model seen in Figure 09. These independent models rely on Artificial Neural Networks combined with Convolutional Neural Networks to efficiently analyze calibration data sets, bringing accuracy from roughly one degree of deviation to 0.06 degrees, which has been achieved in the industry today (Gneo et al., 2012).

Researchers at Brown University and the Virginia Institute of Technology have begun research on web-based eye tracking methods. By accessing participants' webcams, Papoutsaki and collaborators used linear regression models on trained Convolutional Neural Network data sets. During the calibration sequences, the user is assumed to be looking at a particular Area of Interest, and an image sequence is therefore created to train a data set. This assumption was methodically tested against the accuracy of a Tobii EyeX, a 50 Hz tracker, and the researchers concluded that when applied to a task during a calibration sequence, eye movements can be predicted and compared using machine-learned data sets (Papoutsaki et al., 2016).

2.6 Choosing the right Eye Tracker

While browsing the market for the right eye tracker for research or artistic purposes, it is important to consider how the tracker will be used in relation to the data it provides. Companies like Tobii, SMI, and EyeLink offer a variety of cameras, all acquiring data at different frequencies with single- or multi-camera alignment. More expensive products offer higher frequency rates as well as data from multiple cameras. Sampling frequency is measured in Hertz (Hz) and determines how many times per second the participant's gaze is recorded. A 50 Hz sample rate takes an image 50 times per second, which in eye tracking is a low number; in research conditions it is most common to find a professional tracker running at approximately 250 Hz (Holmqvist et al., 2017). The higher the sampling frequency, the more infrared illumination is needed, much like a camera's shutter speed and ISO in photography. Sampling frequency matters because lower-end cameras may not be able to measure saccadic movements and may offset Latency Measurements, as mentioned previously in this chapter.

Many eye tracking companies also use artificial eye models as a means of measuring precision. These models apply standard-deviation analysis as well as the Root Mean Square (RMS), calculated over angular distance. Poor eye trackers have an RMS of one degree or higher, while high-end trackers have a precision better than 0.1 degree. As a rule of thumb, studies of microsaccades and gaze-contingent paradigms should use trackers with an RMS lower than 0.03 degrees; higher RMS values mean more noise in the data set (Holmqvist et al., 2017).
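Computed on samples recorded while the (artificial) eye is still, RMS precision amounts to the root mean square of successive angular differences. A one-dimensional sketch, illustrative rather than any vendor's formula:

```python
import math

def rms_precision(angles):
    """Sample-to-sample RMS precision in degrees: the root mean
    square of the angular steps between successive gaze samples
    recorded during steady fixation. Lower is better: around one
    degree is poor, below 0.1 degree is high-end."""
    steps = [b - a for a, b in zip(angles, angles[1:])]
    return math.sqrt(sum(s * s for s in steps) / len(steps))
```

A trace that jitters by 0.1 degree between samples thus scores an RMS of 0.1, well above the 0.03-degree rule of thumb for microsaccade work.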


Chapter 3: Applications in the Arts

3.1 Eye Tracking in the Media Arts

Figure 10: An example of facial detection using Gene Kogan’s FaceOSC from ML4A
(Wuss, 2019)

Over the last few decades, designers have approached eye movements and design forms as novel sites of interaction and engagement. Although affordable trackers have been on the market for the last few years, eye tracking has often been approximated through facial detection, prioritizing head gaze to acquire a generic Point of Gaze. Many artists included in this chapter work with facial recognition software, such as the open source FaceOSC (Kogan, 2018) and the Microsoft Kinect facial tracking libraries. Head gaze has become synonymous with eye gaze and has been systematically applied in robotics research in the laboratory setting as well as in the media arts. This chapter seeks to identify a few artists and researchers whose work embodies Point of Gaze within its interactive stimulus.

3.2 Practices of Human-Robot Interaction using Eye Tracking

Applied to robotics, researchers have been attempting to bridge the gap between research on gaze and responsive systems. Many Human-Robot Interaction (HRI) systems apply visual feedback mainly for task acquisition, such as learning to walk or to grab objects in space. These systems often use low-resolution cameras, reserving computing power for the task itself rather than for a high-resolution eye tracker. Researchers have recently shown that when a person is tasked with moving objects, their gaze anticipates the physical move. Applied to HRI, the gaze can therefore help systems analyze and anticipate the needs and attention of the participant (Palinko et al., 2016).

In a study conducted at the Istituto Italiano di Tecnologia in Genova, Italy, led by Oscar Palinko, an experiment examined human-robot collaboration in stacking toy building blocks in a particular order using only eye movements. The team gave gaze-reading abilities to a robot seated across from a participant in an effort to understand collaborative relationships. By allowing the participant to trace out instructions with their eye movements, the robot was able to stack the blocks in a specific order (Palinko et al., 2016). The team succeeded in creating a proof of concept of how gaze can expose naturally occurring phenomena to robotic systems, which can then use gaze detection as a social cue for certain behaviors, not only for task performance but for distributing their own engagement across the environment and its participants.

In their research, Dziemian, Abbott, and Faisal conducted a set of experiments mapping the point of gaze from a Tobii EyeX tracker onto a writing pencil on the end of a six-axis UR-10 industrial robot. By creating a three-dimensional vector based on fixation and dwell time, the team was able to train individuals to write letters on a canvas. By detecting fixations longer than 600 ms, the team achieved continuous end-point goal orientation through the non-trivial tasks of reading and writing (Dziemian, Abbott and Faisal, 2016). The integration of an industrial robotic arm with eye coordinates in space begins a fascinating conversation about tracking and transposing these coordinates and fixations into three-dimensionally designed environments.
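A 600 ms fixation trigger of the kind Dziemian et al. describe can be sketched as a dwell-time selector over a gaze stream. The parameter names, the 60 Hz rate, and the one-degree radius are illustrative assumptions, not the paper's values.

```python
def dwell_select(gaze_stream, target, radius=1.0, dwell_ms=600, hz=60):
    """Return the sample index at which gaze has stayed within
    `radius` degrees of `target` for `dwell_ms` milliseconds of
    consecutive samples, or None if the dwell never completes."""
    needed = int(dwell_ms / 1000 * hz)  # consecutive samples required
    run = 0
    for i, (x, y) in enumerate(gaze_stream):
        dx, dy = x - target[0], y - target[1]
        run = run + 1 if (dx * dx + dy * dy) ** 0.5 <= radius else 0
        if run >= needed:
            return i
    return None
```

In a robot-control loop, the returned index would be the moment the fixated target is issued as a command to the arm.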

In creating safety measures for HRI, engineers have used eye tracking as a means of establishing bidirectional communication between humans and robots. This implicit and explicit communication is vital in human-human interaction and must be explored in environments where robots can move abruptly, failing to meet the standards of perceived safety. In one study of safety and gaze intention, a human was placed near a moving forklift and asked to detect certain dangers in the moving object. Using spatial augmented reality, the team was able to highlight perceived dangers and analyze how robots can learn from human gaze intention to react to hazards placed in a path. The training of reactionary motion sequences proved successful, as participants made split-second decisions about their own safety around the robotic lift. Using this data, the projected motion paths of the robot were adjusted to create real-time, highly precise detection systems (Chadalavada et al., 2019). Across these studies, the intention of the gaze and areas of interest remain a common thread.

3.3 Media Arts responds to eye tracking technology

3.31 Golan Levin and Greg Baltus’s Opto-Isolator (2007)

Figure 11: Opto-Isolator, a mechanical eye by Golan Levin and Greg Baltus, 2007

The project “inverts the condition of spectatorship by exploring the questions: ‘What if artworks could know how we were looking at them? And, given this knowledge, how might they respond to us?’ The sculpture presents a solitary mechatronic blinking eye, at human scale, which responds to the gaze of visitors with a variety of psychosocial eye-contact behaviors that are at once familiar and unnerving. Among other forms of feedback, Opto-Isolator looks its viewer directly in the eye; appears to intently study its viewer’s face; looks away coyly if it is stared at for too long; and blinks precisely one second after its visitor blinks” (Levin, 2009).

The novelty of behavior in a mechanical eye allows for surface-level interaction but lacks the intellectual ambitions of more complex eye tracking techniques. The project's system intently studies the visitor's face through recognition software such as FaceOSC. Levin's mechanical Point of Gaze creates a feedback loop of awareness between participants and their perception of their own eyes' motion. A leap into eye tracking technologies could not only enhance the project's design iterations but also help participants actively engage with their eye movements rather than passively experience them. Even so, as a novel expression the project sheds light on optical reflection as a means of navigating interaction design. It creates moments and content that stimulate areas of attention and could help address the “unnerving” moments Levin seeks to evoke. Drawing on Palinko's participation experiments, the project could explore more responsive behaviors to address the unfamiliar relationship with a physicalized eye.

3.32 Takayuki Todo SEER: Simulative Emotional Expression Robot

Figure 12: SEER by Takayuki Todo, 2018

“SEER” is a compact humanoid robot developed as a result of deep research on gaze and human facial expression.The robot is able to focus the gaze directions on a certain point, without being fooled by the movement of the neck. As a result, the robot seems as if it has its own intentions in following and paying attention to it’s surrounding people and environment. Using a camera sensor, whilst tracking eyes it has interactive gaze.(Takayuki, 2018)

In his essay The Uncanny Valley, Masahiro Mori describes a phenomenon arising from the expectations stimulated by interaction with a robotic system. Since humans are breathing and moving, we expect signs of life from our environment. Mori argues that we extend empathy to physical gestures we embody; this empathy can surprise us when dealing with robotic movement, making it easy to detect micro-movements that seem human or inhuman. Uncanny relationships can also form with non-human forms onto which we project gesture, much as a pair of reading glasses is not designed to look like an eye yet still carries personality in style and charm (Mori, 2012).

When engaging a participant with a humanoid, it may be important to ask questions about the uncanniness or unfamiliarity of these visual perceptions. Takayuki's robotic system uses a camera with a facial detection system that turns facial gesture into a novel feedback loop with the participant. SEER offers a mirroring of facial engagement rather than an analysis of the participant's visual needs. Could this robotic feedback not only mirror expression but also return it empathetically, in a form a human can perceive?

3.33 Graffiti Research Labs 

Figure 13: Graffiti Research Lab’s Eye Writer glasses, 2009

In the early 2000s, a member of the Graffiti Research Lab (New York City) became ill with amyotrophic lateral sclerosis (ALS), also known as motor neurone disease (MND) or Lou Gehrig's disease, which causes the death of the neurons controlling voluntary muscles. This shock accelerated a research program the group had been developing to reimagine how graffiti can be created through eye tracking. “Eye Writer” is an optical device in the format of reading glasses worn by participants; see Figure 13 (Graffiti Research Labs, 2009). The work uses corneal reflection and illumination techniques to allow observers to paint graffiti using a video projector and digital software.

3.34 Seiko Mikami

In her work Molecular Informatics, Seiko Mikami uses X,Y coordinate data as an event for particle systems to inhabit scan paths as the participant wears a virtual reality headset. These molecular scan paths move the content from image-centric art toward the body-centric. By interacting directly with their own scan paths, participants undergo a training simulation in which they must learn how to see within their environment, especially under the conditions set by virtual reality. In the piece “Eye-Tracking Informatics,” Mikami expands on Graffiti Research Lab's Eye Writer project by playfully exploring how the observer can inform the behavior of the observed. By studying visitors' lines of sight across areas of attention, the work establishes a connection between the transformation of space and biological phenomena attributed to participants (Mikami, 2009).

Chapter 4: Eye-tracking Responsive Robotics 

4.1 Investigation

As an investigation into eye tracking, my initial research began with a Tobii 4C eye tracking device. This 90 Hz gaming tracker sits at the low-cost but high-accuracy end of the trackers developed by Tobii. Costing 149 GBP, it sits below the user's visual line and performs corneal-reflection and illumination analysis on Tobii's onboard EyeChip. This processor allows fast, accurate access to Tobii's data set and artificial eye model. In Figure 14 below, the Tobii 4C exposes its camera (at center) and its illumination lights (far right) (B&H Retailer, 2017).

4.2 Project Overview

I'd like to begin by playfully exploring the history of the eye and perception from its earliest recorded forms. Around 400 BCE, Plato developed a theory of vision in which perception arises from emissions of visual light (or fire) stretching from the body, most importantly the eyes, to the physical object. He spoke of a gentle fire soft enough to pass through bodies and forms in space. In Plato's Timaeus, sensations of color are given importance by acknowledging motion as a system of particles within a visual ray (Lindberg, 1978). These concepts of color and motion inform the design iterations of the following two projects, whose aim is to emit our point of gaze as a visual artifact and create a meaningful response to our visual perception.

Luna as a robotic creature fabricated out of 3mm aluminum by Dautel, Bugdayci, and Wuss, 2019

Our robotic system, named Luna, is a robotic creature modelled after a playful lamp, a small beacon of curiosity. To create fluidity and proper mapping of the eye to robotic motion, Luna contains three Dynamixel XM430-W210-T servo motors, each with a 4096-step position resolution. These high-end, robust motors give the research not only speed and precision but a highly programmable interface. The kinematic relationship of motion to motors is physically constrained by the extensions of the creature's arms. Each motor represents an axis along a three-dimensional plane for goal tracking of the behavior's motion path. The X, Y, and Z coordinates allow the user, once engaged in eye tracking mode, to directly place their extended gaze into the space.
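The mapping described above, from normalized gaze coordinates onto each motor's 0-4095 position range, can be sketched as follows. The function names and the simple clamp-and-scale linear mapping are illustrative assumptions, not the project's actual code:

```python
def gaze_to_goal_position(norm_coord, min_pos=0, max_pos=4095):
    """Map a normalized gaze coordinate (0.0-1.0) onto the Dynamixel's
    0-4095 goal position range, clamping out-of-range input."""
    clamped = max(0.0, min(1.0, norm_coord))
    return int(round(min_pos + clamped * (max_pos - min_pos)))

def gaze_to_axes(x, y, z):
    """Map a normalized 3-D gaze point to three motor goal positions,
    one per axis of the creature's arm."""
    return tuple(gaze_to_goal_position(c) for c in (x, y, z))
```

In practice the resulting goal positions would be written to the motors over the Dynamixel SDK; clamping keeps a gaze sample that drifts outside the calibrated region from commanding the arm past its physical limits.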

Our systems seek to use eye tracking's Point of Gaze to map behavior and relationships onto our robotic creatures. By attempting to analyze engagement, the projects raise questions of agency while accessing direct eye movement data. Is the creature instructing the participant how to look through the space, or is the participant orchestrating the collaboration? These questions guide the research's design details. Can these meaningful, curious interactions help stimulate data research into fixations and dwell times? By accessing movement and position data from eye tracking, the designer can push layers of interaction beyond novel head movements and contribute to the wider conversation around eye tracking development.

 4.3 Your Eye’s Motion, By Luna

Your Eye's Motion by Luna is an exploration of human perception through robotic motion. The installation tracks the position of your eye and the direction of gaze, or Point of Gaze (POG), as an input for controlling a small robotic arm situated inside an enclosed environment. The gaze of the viewer is mapped onto the range of motion of the robotic arm to create an extension of the eye and take the body to places it cannot otherwise access (similar to an optical device).

In early iterations of the project, the group conducted experiments in bright pupil detection and illumination using computer vision. The team purchased a 1080p, 50 Hz web camera, an infrared filter, and an LED ring of infrared lights. The observer placed their chin on a viewing port/phoropter designed to keep the head at rest, much like tower systems in conventional eye tracking. Figure 17 below shows a participant using our own viewing station, with the web camera and illumination device close in proximity. Using Isadora, a media integration software, the group created a proof of concept exploring the bright pupil effect. The team successfully mapped eye movements to easing functions on the robotic system, but was unable to robustly compute the relationship between the pupil and the glint, the reflection of the illumination device. Figure 18 confirms the accuracy issues within the calibration process in Isadora.

To map the eye more efficiently, the team explored different options, which led to the purchase of a commercial eye tracker (the Tobii 4C) in order to get more accurate Point of Gaze data. In Figure 19, the eye tracker is positioned between the robotic lamp and the participant. A viewport became an actualization of the viewing angles needed by the eye tracker, as well as the 70 cm distance needed for optimal tracking. After a few design iterations, the team removed the chin-holding mechanism and allowed a freer range of motion within a rectangular viewing geometry. The final design attached itself to a steel cube, with dichroic film surrounding the enclosure to create light artifacts in color and form.
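The bright-pupil idea the early prototype relied on can be reduced to a toy sketch: under coaxial IR illumination the pupil reflects the light back at the camera, so it appears as the brightest blob in the frame, and its center can be approximated by thresholding and taking the centroid of the bright pixels. The function name and threshold value below are illustrative; the frame is a grayscale image given as a nested list, standing in for a camera frame:

```python
def detect_bright_pupil(frame, threshold=200):
    """Approximate the pupil center in a grayscale frame captured under
    coaxial IR illumination: threshold the bright pixels and return the
    centroid as (x, y), or None if nothing exceeds the threshold."""
    xs, ys = [], []
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```

A production pipeline would add blob filtering and glint detection (the step the team could not make robust in Isadora), but the centroid alone already yields a usable coarse gaze signal.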

Figure 19: A participant using the Tobii 4C eye tracker with the gaze tracking robot, by Dautel, Bugdayci, and Wuss, 2019.

Figure 17: A tower head system is represented with a chin rest and viewing point.

Figure 18: Isadora blob detection.

4.4 The Entangled Eye

Project Description

The Entangled Eye is an exploration of visual perception through robotic motion. Luna and Laika are two robotic creatures with curious and animate behavior programmed to elicit your attention. As the whimsical creatures chase your eyes with unique kinematic expressions, the direction of your gaze orchestrates a conversation. The indeterminacy of attention coupled with the animacy of kinetic movement constitutes the basis for our attempt at reconfiguring and understanding the perceptual experience of behavioral artifacts and animism.

Both robots' behaviors include a catalogue of human-trained interactions based on theatrical concepts of puppeteering. Our research in HRI led to the decision to physically create motion with the creatures by recording the current position data from the Dynamixel (Robotis) motors. A Python script opens the motors' communications port and stores the position data at 60 fps in a text file. The write speed and the Dynamixel's 4096-step resolution are important as the puppeteers accelerate the motion and ease the creatures into positions; this produces very smooth motions and transitions between each movement of the robotic conversation.
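A minimal sketch of that recording loop is below. The real script talks to the motors over the Dynamixel SDK; here the hardware read is abstracted behind a callable so the sampling and text-line format can be shown on their own. The function name and line format are illustrative assumptions:

```python
def record_positions(read_positions, duration_s, rate_hz=60):
    """Sample motor positions at rate_hz for duration_s seconds and
    return one text line per sample ('timestamp pos1 pos2 ...'), ready
    to be written to a file for later playback. `read_positions` stands
    in for the Dynamixel present-position read (hardware omitted)."""
    interval = 1.0 / rate_hz
    lines = []
    for i in range(int(duration_s * rate_hz)):
        positions = read_positions()
        lines.append("%.4f %s" % (i * interval,
                                  " ".join(str(p) for p in positions)))
    return lines
```

Playback then simply replays the stored positions at the same 60 fps rate, preserving the puppeteer's accelerations and easings.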

Our aim is to present the project's system design as an interactive state machine which traverses dynamic processes created by the relationship between the observer and their ability to give the robotic creatures attention. Can Point of Gaze inform the traversal of an interactive system design? The ambition of this project was to understand how eye movements can interpolate behaviors. Through the "eye-mind" hypothesis we can assume attention is given and solicited in our environments. By having our robotic creatures react to attention and curiosity, this project questions the motion and animism attached to qualia, or stimuli, in an environment.

No Presence: In this state the two robots interpolate between pre-recorded behaviors created by the team, which manifest as forms of behavioral narrative. The puppeteering through human interaction creates a sense of wonder as the animations try to grab the participant's gaze.

Presence: In this state the system waits for the observer to step up to the eye tracker and communicate with the system through visual perception. Once the system has detected gaze through presence, a timer begins to calculate how long the observer holds attention on one robot. By creating a digital bounding box around the specific locations of the two robots, we can create areas of attention (AOI). By detecting saccadic eye movement we can process an event detection function while playfully acknowledging the movement behavior between Luna and Laika. Once a participant has given a single robot their fixed gaze, that robot becomes an extension of their gaze, its movement controlled by the X and Y coordinates of their gaze as previously demonstrated in "Your Eye's Motion by Luna."
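The bounding-box and dwell-timer logic of the Presence state can be sketched as below. The class name, the AOI coordinate convention, and the 0.5 s fixation threshold are illustrative assumptions rather than the installation's actual parameters:

```python
class AttentionTracker:
    """Accumulate gaze dwell time inside each robot's area of attention
    (AOI), modelled as an axis-aligned bounding box in gaze coordinates."""

    def __init__(self, aois):
        # aois: {"luna": (x_min, y_min, x_max, y_max), ...}
        self.aois = aois
        self.dwell = {name: 0.0 for name in aois}

    def hit(self, x, y):
        """Return the name of the AOI containing the gaze sample, or None."""
        for name, (x0, y0, x1, y1) in self.aois.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                return name
        return None

    def update(self, x, y, dt):
        """Feed one gaze sample and its sample interval; return the AOI
        name once its accumulated dwell crosses the fixation threshold
        (0.5 s here, an illustrative value), else None."""
        name = self.hit(x, y)
        if name is None:
            return None
        self.dwell[name] += dt
        return name if self.dwell[name] >= 0.5 else None
```

Once `update` reports a sustained fixation on one robot, the system can hand that robot's motors over to direct gaze control, as in Your Eye's Motion.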

Chapter 5: Conclusion

5.1 Evaluation

The Entangled Eye was presented as a prototype as part of the Life Rewired Hub at the Barbican Centre in London in August 2019. Through design iterations and mechanical developments, the research team was able to deliver a working prototype: two animate robotic creatures that whimsically tried to elicit one's attention. If the user looked at robot 1, it paused its playback while robot 2 continued to animate. To tighten the narrative, the group pulled back on the interpolation state machine, allowing direct mapping of the robotic motion in the environment. A panel of industry critics provided feedback regarding the sensation of narrative and the objectification of the robots. Because of the large physical distance between the robots and the user, some participants experienced a sense of isolation. Once they had stepped into the eye tracking zone and assumed some agency over the robots, participants engaged playfully with Luna and Laika.

During their exhibition at Ars Electronica (September 5-9, 2019, Linz, Austria), Luna and Laika performed over a collective five days of exhibition time and navigated over 1000 participants in conversations about art and technology. Although the installation proved fruitful in terms of human responses to the performing lamp creatures, the technology only achieved a surface-level acquisition of further interaction. Limitations in the robustness of the group's code prevented interaction between the point of gaze and the X, Y movements of the robots; however, the exhibition's main playback system worked without flaws.

5.2 Speculation

For future iterations of the project, I would like to contribute to real-time calibration measures in three-dimensional space, using robotics as a tool to elicit the area of attention (AOI), much as artificial eye models are used to analyze precision. Using the devices of design and staging produced in the media arts, the project will inhabit an aesthetically planned installation revolving around the creation of light artifacts from the robotic system. It will use the Tobii Pro SDK and the Tobii X120, which at a 120 Hz sample rate provides a timestamp, eye position, gaze point, pupil diameter, and validity code for each sample.

Figure 22: Relationship of a scene camera and participant to create a custom calibration, from the Tobii X60 user manual, 2004.

By adding a scene camera to the system (see Figure 22), the installation can invite guests into a calibration moment which teaches the participant and robot to collaboratively inform each other, much like previous research in HRI. By creating a game which facilitates light artifacts in the installation, the participant gets to follow their own curiosity and understanding of their own visual perception. Many modern developments occur in the calibration process, such as the addition of trained data sets. By creating a creative training exercise, the researcher can eliminate the sterile lab environment and replace the task with novel forms of communication practice, as seen in HRI. By improving the attention and intention of the participant, the work hopes to achieve higher accuracy and precision within its prediction models. The gains in accuracy can be fed back into the system to create a conversation between human and robot which more closely represents human-to-human nonverbal communication. The novel communication of a calibration sequence can thus become a continuous approach to these interactions.
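One standard way to quantify the accuracy such a calibration game aims to improve is the mean angular offset between a known validation target and the recorded gaze samples, using the tracker's nominal 70 cm viewing distance. This small helper is an illustrative sketch, not part of the Tobii SDK; coordinates are assumed to be in millimetres on the screen plane:

```python
import math

def gaze_accuracy_deg(target_mm, samples_mm, eye_distance_mm=700.0):
    """Mean angular error (degrees) between a validation target point
    and a set of recorded gaze samples, both on the screen plane.
    Smaller is better; research-grade trackers typically report
    accuracies well under one degree."""
    errors = []
    for gx, gy in samples_mm:
        offset = math.hypot(gx - target_mm[0], gy - target_mm[1])
        errors.append(math.degrees(math.atan2(offset, eye_distance_mm)))
    return sum(errors) / len(errors)
```

Running this per validation point after the playful calibration task would give a directly comparable before/after accuracy figure for the prediction models described above.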


5.3 Conclusion

My research aims to combine cues of intent and curiosity through eye tracking technologies and Human-Robot Interaction. This work identifies system design as a communications interface for collaboration between participants and their robotic counterparts. By identifying and analyzing behavioral cues through eye movements, designers can inhabit spaces that were not easily accessible to, or even perceived by, the participant. Rather than applying computer vision to teach a robot a task, the work takes a bottom-up approach, treating unspoken communication as a mechanism for improving accuracy and calibration in eye movement research. By adding a novel environment and task system, the participant's enthusiasm can be tracked and embodied through new design iterations. Expanding upon these future iterations, the work seeks to present further research into behavioral cues exaggerated by Point of Gaze and Areas of Attention.



Chadalavada, R., Andreasson, H., Schindler, M., Palm, R. and Lilienthal, A. (2019). Bi-directional navigation intent communication using spatial augmented reality and eye-tracking glasses for improved safety in human-robot interaction. Robotics and Computer-Integrated Manufacturing, [online] 61, p.101830. Available at: [Accessed 15 Sep. 2019].

Dziemian, S., Abbott, W. and Faisal, A. (2016). Gaze-based teleprosthetic enables intuitive continuous control of complex robot arm use: Writing & drawing. 2016 6th IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob), [online] pp.1277-1282. Available at: [Accessed 13 Sep. 2019].

Encyclopedia Britannica. (1998). Biot-Savart law | physics. [online] Available at: [Accessed 4 Sep. 2019].

Gneo, M., Schmid, M., Conforto, S. and D’Alessio, T. (2012). A free geometry model-independent neural eye-gaze tracking system. Journal of NeuroEngineering and Rehabilitation, 9(1), p.82.

Holmqvist, K., Andersson, R., Nyström, M., Dewhurst, R., Jarodzka, H. and van de Weijer, J. (2017). Eye Tracking: A comprehensive guide to methods and measures. Lund, Sweden: Lund Eye-Tracking Research Institute.

Janthanasub, V. and Meesad, P. (2015). Evaluation of a Low-cost Eye Tracking System for Computer Input. KMUTNB International Journal of Applied Science and Technology, pp.1-12.

Just, M. and Carpenter, P. (1976). Eye fixations and cognitive processes. Cognitive Psychology, 8(4), pp.441-480.

Kogan, G (2018). ITP Machine Learning Lecture series [online] Available at: [Accessed 9 Jun. 2019]

Levin, G. (2009). Opto-Isolator – Interactive Art by Golan Levin and Collaborators. [online] Available at: [Accessed 13 Sep. 2019].

Lindberg, D. (1976). Theories of vision from Al-Kindi to Kepler. Chicago: The University of Chicago Press.

Mori, M. (2012). The Uncanny Valley. IEEE Robotics & Automation Magazine, [online] pp.98-100. Available at: [Accessed 23 Aug. 2019].

Mikami, S. (2009). Molecular Informatics. Morphogenic substance via eye tracking, Database of Virtual Art, substance-via-eye-tracking.html [last accessed 30 Aug. 2019].

Palinko, O., Rea, F., Sandini, G. and Sciutti, A. (2016). Eye tracking for human robot interaction. Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications - ETRA '16.

Palinko, O., Rea, F., Sandini, G. and Sciutti, A. (2016). Robot reading human gaze: Why eye tracking is better than head tracking for human-robot collaboration. 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

Papoutsaki, A., Sangkloy, P., Laskey, J., Daskalova, N., Huang, J. and Hays, J. (2016). WebGazer: Scalable Webcam Eye Tracking Using User Interactions. Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI), pp.3839-3845.


Poole, A. and Ball, L. (2004). Eye Tracking in Human-Computer Interaction and Usability Research: Current Status and Future Prospects . [ebook] Psychology Department, Lancaster University, UK. Available at: [Accessed 4 Sep. 2019].

Takayuki, T. (2018). SEER: Simulative Emotional Expression Robot. [online] Takayuki Todo 藤堂高行. Available at: [Accessed 13 Sep. 2019].

Whitmire, E., Trutoiu, L., Cavin, R., Perek, D., Scally, B., Phillips, J. and Patel, S. (2016). EyeContact: Scleral Coil Eye Tracking for Virtual Reality. ISWC '16. [online] Available at: [Accessed 3 Sep. 2019].

Image References

Figure 1: Dautel, A., Bugdayci, I. and Wuss, R. (2019). Luna in her environment. [image] Available at: [Accessed 15 Sep. 2019].

Figure 02: Dautel, A., Bugdayci, I. and Wuss, R. (2019). The Entangled Eye [image] Available at: [Accessed 15 Sep. 2019].

Figure 03: Holmqvist, K., Andersson, R., Nyström, M., Dewhurst, R., Jarodzka, H. and van de Weijer, J. (2017). Scan path [image] [Accessed 15 Sep. 2019].

Figure 04: Whitmire, E., Trutoiu, L., Cavin, R., Perek, D., Scally, B., Phillips, J. and Patel, S. (2019). SCC Headset [image] Available at: [Accessed 3 Sep. 2019].

Figure 05: Gneo, M., Schmid, M., Conforto, S. and D'Alessio, T. (2012). Bright pupil and reflections [image] Available at: Journal of NeuroEngineering and Rehabilitation, 9(1), p.82.

Figure 06: Poole, A. and Ball, L. (2004). 3 directions of illumination reflection [image] Available at: [Accessed 4 Sep. 2019].

Figure 07: Holmqvist, K., Andersson, R., Nyström, M., Dewhurst, R., Jarodzka, H. and van de Weijer, J. (2017). SMI Tower [image] [Accessed 15 Sep. 2019].

Figure 08: Janthanasub, V. and Meesad, P. (2015). The visualization of obtained data from the calibration part with the DIY eye tracker [image] Available at: st_Eye_Tracking_System_for_Computer_Input.pdf [Accessed 15 Sep. 2019].

Figure 09: Papoutsaki, A., Sangkloy, P., Laskey, J., Daskalova, N., Huang, J. and Hays, J. (2016). Point of gaze mapping [image] Available at: [Accessed 15 Sep. 2019].

Figure 10: Wuss, R. (2019). An example of facial detection using Gene Kogan's FaceOSC [image].
Figure 11: Levin, G. (2007). Opto-Isolator 2 [image] Available at: [Accessed 13 Sep. 2019].
Figure 12: Vog.Photo (2018). SEER: Simulative Emotional Expression Robot [image] Available at: [Accessed 13 Sep. 2019].

Figure 13: Graffiti Research Labs (2009). EyeWriter. [image] Available at: [Accessed 15 Sep. 2019].

Figure 14: B&H Retailer (2017). Tobii 4C Eye Tracker [image] Available at: [Accessed 15 Sep. 2019].

Figure 15: Dautel, Bugdayci, Wuss (2019). Luna as a robotic creature [image].
Figure 16: Dautel, Bugdayci, Wuss (2019). Your Eye's Motion, By Luna [image].

Figure 17: Dautel, Bugdayci, Wuss (2019). A tower head system represented as a viewing port with web camera and infrared light [image].

Figure 18: Dautel, Bugdayci, Wuss (2019). Isadora Blob Detection [image].
Figure 19: Dautel, Bugdayci, Wuss (2019). Shaune using Luna [image].
Figure 20: Dautel, Bugdayci, Wuss (2019). Luna and Laika as full-sized robots [image].
Figure 21: Dautel, Bugdayci, Wuss (2019). The Entangled Eye concept drawing [image].
Figure 22: Tobii Eye Tracking (2008). Addition of a scene camera [image] Tobii User Manual X60 and X120

