
Bartlett School of Architecture, UCL




Real-Time Object Tracking Systems

  • On August 20, 2016

“We live in a complex world, filled with myriad objects, tools, toys, and people. Our lives are spent in diverse interaction with this environment. Yet, for the most part, our computing takes place sitting in front of, and staring at, a single glowing screen attached to an array of buttons and a mouse” (Wellner, Mackay and Gold, 1993). It is common to see people sitting in front of a computer screen for learning or entertainment, in both living and working spaces. This type of Graphical User Interface fixes the behaviour of the user for long periods of time; however, as technology has developed, this no longer has to be a limitation. A new type of interactive interface, the Tangible User Interface, has been in development since 1997 by MIT’s Tangible Media Group. “The notion of a ‘Tangible User Interface’ constitutes an alternative vision for computer interfaces that brings computing back ‘into the real world’” (Wellner, Mackay and Gold, 1993). It means that any everyday object can become a screen for showing digital information and the responses of its users. Thus, the concepts of being tangible and interactive have become popular for future Human-Computer Interaction. This type of interaction not only liberates human behaviour, but also establishes a bridge between the physical world and cyberspace (Ishii and Ullmer, 1997).

One of the main problems with tangible interaction is how to establish the bridge between the real world and the digital space. The main approach is Augmented Reality, a form of interaction where digital images are overlaid onto real-world objects and environments in real time (Carmigniani et al., 2010). Normally, augmented reality systems use a smartphone as the platform to merge digital information with the physical world. However, novel approaches mean that augmented reality is no longer dependent on a computer or smartphone screen. Tangible user interfaces project the image directly onto the surface of physical objects, and this is called Spatial Augmented Reality (Bimber and Raskar, 2005). By using a projector rather than a traditional computer screen as the output device, together with a variety of input sensors, the physical world and digital space are able to merge visibly.

On its own, the augmented reality method is only a one-way bridge that projects information from the virtual space onto the real object. No feedback is established through which information from the real world can reach the digital world in real time. The virtual space does not notice the movement of the physical object, and keeps projecting the digital information onto the old location. This happens, for example, when an object surface has been set up as a tangible user interface and the object then shifts from its original position, either under its own power or through some external force. In order to solve this problem, feedback should be provided from the physical world. A feedback loop is a crucial element of any interaction interface. It improves the adaptability of the interaction by making the system self-adjusting (Mindell, 2002). In the case above, if a feedback system is applied, the virtual space can adjust its projection area automatically by tracking the location of the object.

This report explores how digital spaces can notice and detect changes in the position of the objects they project images onto. We believe that until this problem is solved, the virtual space and the physical world are not merged in any real sense. ‘Physical Pixels’ is a spatial augmented reality environment that seeks to establish a real information feedback loop, in which the digital space is projected onto a moving object in real time. The system aims to adapt the projection to the movement of the object concerned, which may also be affected by feedback from the user. In summary, the aim of this paper is to find the best tracking method for the ‘Physical Pixels’ environment.


2. Physical Pixels

As mentioned previously, the aim of this project is to establish a feedback loop between virtual spaces and the physical world. The loop has three separate areas: the virtual area, the physical area and the area that merges the virtual and physical space (See Figure 1). In the virtual area, we designed a program that analyses the information obtained from the physical world and makes decisions in order to respond to the outside world. In the physical area, we produced several self-organised robots which behave differently depending on the stimulus. These can be applied as modular furniture or interactive robots.


Figure 1. Relationship between virtual space and physical world

Between the virtual space and the physical world, the middle area depicted in Figure 1 is where information is exchanged. This area is made up of two parts. The first part transmits information from the physical area to the virtual area, detecting and tracking the location and movement of the robots and reporting them to the virtual space in real time. The other part runs from the virtual space to the physical world: it projects the digital image onto the robot and automatically changes the projection to follow the behaviour of the robot. The users in this loop can control the system not only through the virtual space, by changing the projection, but also through the physical robots, by programming them to perform a specific behaviour or go to a specific location.


2.1 Equipment in Physical World

In the physical world, we designed reconfigurable objects, which not only have a virtual presence but also an ability to sense their own environment. The robot has the shape of a cyclic quadrilateral (See Figure 2). Angles A and C are 90 degrees, angle B is 60 degrees and angle D is 120 degrees. Sides AB and AD have the same length. The robots' morphology allows them to respond to external stimuli by modifying their behaviour. The shape of the robot aims to maximise the possibilities for spatial reconfiguration. Figure 3 displays several of these possibilities.


Figure 2 & 3. Shape of robot  & The reconfiguration possibility


In order for the robots to reconfigure their shape automatically, three main problems must be solved: the movement of the robot, wireless communication between the robot and the computer, and altering the real-time projection onto the desired surface. An Arduino board, a microcontroller board which connects the robot to the computer program through sensors and actuators, was used as the main control element (Arduino, 2016). To control the robot’s movement, there are three omnidirectional wheels with 6V gear motors, which enable the robot to move in any direction. A radio frequency device is used to send messages from the main computer to the robot (See Figure 4). Finally, a tracking system and a projector were used to create the real-time virtual projection system (See Figure 5).


Figure 4 & 5. Elements of robot & The working system


2.2 Projection in Virtual Space

Programs in the digital space differ widely depending on their practical purpose. For the ‘Physical Pixels’ environment, we designed groups of points that float randomly within a range and are projected onto the top surface of the robots. The points on each robot have a different colour so that the robots can be distinguished from one another (Figure 6). When the robots assemble into a pattern, the point clouds on the robots merge and float across the combined surface of the robots (Figure 7).


This program was designed to test the feedback loop system in spatial augmented reality environments. It has a simple input, the location of the robots, and an output which is either a separated or a merged pattern. For future applications of this system, the program in the virtual space will be redesigned for other purposes.


Figure 6 & 7. Separated pattern in virtual space & Merged pattern in virtual space


2.3 The Connection of Two Areas

The connections between these two areas are divided into two parts. The bridge from the virtual area to the physical area has been built by the projection techniques, which project the result of the virtual program onto the surface of the physical objects. The reverse connection required an appropriate object tracking system to be developed.


2.4 Object Tracking Systems for Spatial Augmented Reality Environments

“Object tracking is the process of detecting a number of points automatically from frame to frame in a sequence, allowing the user to stabilise, track or solve object or camera movement” (Seymour, 2004). In this process, the object is represented by a number of points. By tracking the location and other information of these points, the whole object can be detected. Object tracking is applied in research areas such as augmented reality, human-computer interaction, medical imaging and video editing (Seymour, 2004).

A necessary precondition for tracking objects accurately is an environment with stable and bright ambient light (Yilmaz, Javed, and Shah, 2006). However, the spatial augmented reality environment used in this project produces unstable, colourful ambient light. Spatial augmented reality, also known as projection mapping, uses mapping software to turn often irregularly shaped objects, from simple furniture to complex buildings, into a display surface for video projection (Bimber and Raskar, 2005). This changes the real environment into a mixed augmented reality space. Thus, tracking methods that need stable ambient light are inappropriate in a spatial augmented reality environment.


3. Augmented reality environment examples

The research began by exploring several previous works that projected digital images onto dynamic objects. In the Augmented Reality Sandbox and OMOTE, the dynamic objects are sand and human faces respectively. In these works, the tracking systems not only control the graphics on the surface of the object, but also detect the projection area as the objects move. Both of these projection mapping case studies work in real time, giving frequent feedback to keep the projection mapping dynamic.


3.1 Augmented Reality Sandbox

The Augmented Reality Sandbox prototype was designed by UC Davis scientists Oliver Kreylos, Burak Yikilmaz and Peter Gold as part of an NSF-funded project on lake and watershed science education (Kreylos, 2012). The aim of this project was to develop a real-time integrated augmented reality system in which topography models are created physically and scanned into a computer in real time, and then used as the background for a variety of graphics effects and simulations (Kreylos, 2012). The projected image followed the height of the sand (Figure 8), using a Kinect camera to detect that height. When the level of sand was below the standard height, virtual water would flow into the space. The tracking method included 2D colour tracking and was also able to detect distance in 3D (Figure 9). Despite the dark ambient light, the tracking system was able to detect the height of the sand. This showed that tracking systems can also work with colour-based cameras. However, the tracking system only detected height, which is different from tracking a moving object. This will be discussed further in Section 5.


3.2  OMOTE

Designed by Japanese artist Nobumichi Asai in 2014, OMOTE is a program that combines real-time face tracking techniques with a projection mapping system (Figure 10). During a performance, the system uses a model’s face as a projection surface and projects a creative image, such as an animal’s face or a geometric pattern, onto it while the models, and their faces, move around. The work was inspired by Nogaku, Japan’s classical musical theatre, where performers use Omote masks to express a multitude of dramatic emotions (Asai, 2014). ‘Omote’ is a Japanese word for face, or mask, and this work explores the idea that the face is a mirror of the human soul, with a separation between the ‘Omote’ (exterior) and ‘Ura’ (interior) (Asai, 2014).

The face tracking technique used in this project is based on an infrared tracking system that uses an infrared camera and reflective markers. This work shows that reflective markers of this kind can work in any ambient light and be used on any object. However, the system has the limitation that the markers remain visible on the face, as they cannot be detected if they are covered.



4. Real-Time Tracking Approaches

Both projects discussed in Section 3 utilise real-time tracking, of 2D camera images as well as 3D objects. Most tracking systems use only a camera to detect the location of the object. There are other methods, based on sensors such as GPS or a robot arm, that can be used to track object location (Yilmaz, Javed, and Shah, 2006). Depending on the tracking system, the accuracy and range of the cameras differ. Furthermore, the cost of the cameras varies considerably. This section introduces four tracking methods that can be used within a projection mapping environment: colour tracking, marker tracking, Kinect tracking and optical tracking.


4.1 Colour Tracking

The quickest and easiest method to track an object is based on colour. In this method, the object and the background must differ significantly in colour for the object to be detected successfully (Rasmussen, Toyama, and Hager, 1996). The main library used is OpenCV, a library of programming functions mainly aimed at real-time computer vision. OpenCV captures images in RGB format. In other words, an image handled by OpenCV can be considered as three matrices, RED, GREEN and BLUE, with values ranging from 0 to 255 (Oliveira and Conci, 2009). It is therefore easy to detect and track one colour or a range of colours in the image.

We tested this form of tracking against a changing projected background (Figure 11). We used a green piece of paper as our object. Through the OpenCV library, the camera can detect the green paper automatically. In order to check the result visually, we projected a white square onto the detected centre of the green object. The size of the white square had no significance. In theory, because the green paper and the background differ in colour, the system should place the white square accurately. However, as the green colour was disturbed by the pattern in the background, the tracking system was not able to detect the central point accurately. When the object moved or the background projection changed, the location of the central white square would jump.
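As a rough illustration of the idea of thresholding the three colour matrices, the sketch below finds the centroid of all pixels whose red, green and blue values fall inside a chosen range. It is written in plain Python rather than with OpenCV, and the function name `track_colour`, the tiny test frame and the threshold values are all hypothetical choices made for this example, not the code used in the test described here.

```python
def track_colour(image, lower, upper):
    """Return the centroid (x, y) of all pixels whose (R, G, B) values lie
    inside the inclusive range [lower, upper], or None if nothing matches.

    `image` is a list of rows, each row a list of (R, G, B) tuples (0-255).
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if (lower[0] <= r <= upper[0]
                    and lower[1] <= g <= upper[1]
                    and lower[2] <= b <= upper[2]):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# Hypothetical 4x4 frame: dark background with a 2x2 green patch.
BLACK, GREEN = (0, 0, 0), (20, 200, 30)
frame = [[BLACK] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (2, 3):
        frame[y][x] = GREEN

# Assumed threshold for "green": low red, high green, low blue.
centre = track_colour(frame, lower=(0, 150, 0), upper=(80, 255, 80))
```

Because the centroid is an average over every matching pixel, any background pixel that happens to fall inside the colour range shifts the result, which is one way to understand why a projected pattern can make the detected centre jump.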

Figure 11 & 12. Colour Tracking & Marker Tracking


4.2 Marker Tracking

Marker tracking uses 2D RGB camera images to detect the 3D movement of the chosen object. It builds on the basic methods of 2D colour tracking to calculate the position and orientation of the physical camera by detecting physical markers in real time (Gaschler, 2011). Once the position of the physical camera has been detected, a virtual camera can be placed at the same position. As a result, 3D models in the virtual space are overlaid exactly on the physical marker. The main tracking library is ARToolKit, an open-source library for augmented reality applications. Marker-based tracking approaches are fast and accurate in a stable environment (ARToolKit documentation, 2015).

In our test (Figure 12), the marker was printed out and placed in the same environment as the colour tracking test. The green square is the object that the camera detected. Compared to colour tracking, the marker tracking system can not only track the location of the object, but can also detect its 3D rotation and size. However, because this test took place in a spatial augmented reality environment, the printed marker reflected the light of the projection. Thus, the camera could not always detect the whole shape of the marker. At some specific angles, the camera lost the object.


4.3 Kinect Tracking

The Kinect tracking camera uses a structured light technique to generate real-time depth maps that contain discrete range measurements of the physical scene (Gasparrini et al., 2014). The data collected from the Kinect camera can then be transformed into a set of 3D point clouds. Three main components are used to track the position of a 3D object. The first is a graphical user interface (GUI) that reports the state of the Kinect camera. The second is an RGB camera, a 2D tracker that separates objects by their colour. The third is the distance-detecting component, the 3D object tracker. The system then uses the iterative closest point (ICP) algorithm to search for the original point cloud in the live depth image (Gasparrini et al., 2014). The 3D tracker builds on the result of the RGB camera, as it transforms the 2D colour image into a 3D point cloud.
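To make the step from depth map to point cloud more concrete, the following sketch back-projects each depth pixel into 3D using the standard pinhole camera model and then keeps only the points inside a distance window. The intrinsic parameters (`fx`, `fy`, `cx`, `cy`), the tiny depth map and both function names are assumptions made for illustration; they are not Kinect calibration values or part of any Kinect SDK.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into a list of (X, Y, Z) points.

    `depth` is a list of rows of distances in metres; 0 means "no reading".
    fx, fy are focal lengths in pixels and (cx, cy) is the principal point.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels without a valid measurement
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def within_range(points, near, far):
    """Keep only the points whose distance Z lies in [near, far]."""
    return [p for p in points if near <= p[2] <= far]


# Hypothetical 3x3 depth map: a near object (1 m) in front of a 2 m background.
depth = [[0.0, 2.0, 2.0],
         [2.0, 1.0, 2.0],
         [2.0, 2.0, 0.0]]
cloud = depth_to_points(depth, fx=2.0, fy=2.0, cx=1.0, cy=1.0)
close = within_range(cloud, near=0.5, far=1.5)
```

A distance window like this isolates everything at a given depth, but it cannot by itself tell one object from another at the same distance.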


Figure 13 & 14. The environment image by Kinect & Short range of detection


As Kinect tracking is not affected by the background colour, our Kinect test was not carried out in the projection mapping environment. It is easy to obtain a 3D image of an environment (Figure 13). After changing the range of distances displayed in the software, an area of objects can be selected (Figure 14). However, it is impossible to detect a specific object only by changing the range of distances. This is the main difference between Physical Pixels and the Augmented Reality Sandbox mentioned above. In the Augmented Reality Sandbox, the system only divides the range of distances, changing the height range of the graphic projection. In Physical Pixels, obtaining the location of an object from the Kinect system requires a large number of floating-point calculations. As a result, the tests could not be completed.


4.4 Optical Tracking System

Optical tracking is a system that detects the position of an object in real time (Kurihara et al., 2002) by means of infrared markers. These markers can be detected by an infrared camera (IR camera). An infrared camera is a video camera that can detect infrared light, including far infrared light with wavelengths as long as 14,000 nm. (The range of a visible light camera is from 400 to 1000 nm.) In this approach, the markers are fitted onto the object in a pattern that can be detected by the software. The markers can be either passive or active (Boger, 2003). Active markers are typically infrared light sources, such as infrared LEDs, whose points the infrared camera can detect directly. Passive markers are retro-reflectors that reflect infrared light from a light source back to the infrared camera (Boger, 2003). Infrared cameras produced for tracking purposes are equipped with an infrared flash around the main lens, whose light is reflected off the markers.

Due to the high cost of infrared cameras, we were unable to test optical tracking. However, this method of tracking was used in the OMOTE performance (Asai, 2014), which indicates that this tracking system can be used in Physical Pixels. The only limitation is that the markers have to be visible in order to reflect the infrared light. This may result in an uneven surface for the projection. Nonetheless, optical tracking, especially the infrared marker reflection system, is a very useful tracking system that would suit the Physical Pixels environment.


5. Tracking in Physical Pixels

In this project, the primary aim of the tracking system is to detect the location, degree of rotation and size of a robot in real time. Once the tracking system has obtained the robots’ location, the projector can project the digital graphic onto the surface of the robot. Tracking also has a practical benefit: by monitoring the robot's path in real time, it can help the robot reach its target position. The simplest tracking method this project could use is a camera that identifies the position of an object through the colour difference between the object and the environment. However, as this work includes graphic projections, there is a colourful image on the surface of the robot which could confuse the tracking system. To reduce this confusion and locate the robot accurately, the whole ambient environment would have to be kept dark. Therefore, a simple 2D colour tracking system is not suitable here. The next option would be a 3D Kinect tracking system, to distinguish the object from the environment. However, 3D Kinect tracking requires a lot of floating-point arithmetic, which makes it unsuitable for tracking fast-moving objects. As a result, the infrared optical tracking system is the most suitable choice.

The most reliable and sophisticated infrared tracking system is the OptiTrack camera system; however, as previously mentioned, infrared cameras are extremely expensive. The OptiTrack system needs a number of tracking cameras in order to cover 3D space. The robot being tracked, however, only moves on a flat surface. A large number of cameras would add little, so this may not be the best option.

As a result, the design team decided to build their own infrared tracking system. It employs the same infrared tracking method, but uses normal cameras with an infrared filter instead of dedicated infrared cameras. These cameras are able to detect the infrared LEDs that were used to mark the object. The rest of this paper describes the real-time tracking process using our infrared camera system.


5.1 Tracking Process

The infrared tracking system used in this project differs from the infrared tracking systems commonly sold on the market. Commercial systems use an infrared camera that can only detect infrared light. In this project, to reduce costs, the design team used normal cameras with an infrared filter.


5.1.1 Basic Principle

The range of light visible to the human eye is about 380nm to 750nm (Vision Glossary, n.d.) (Figure 15). The infrared light of interest here has wavelengths longer than 750nm and shorter than 1000nm (Vision Glossary, n.d.). The vast majority of digital cameras use CCD or CMOS sensors to detect ambient light, with a photosensitive range of about 300nm to 1000nm. The tracking system cameras are therefore able to detect this short-wavelength (near) infrared light, from 750nm to 1000nm, and use it to track objects.


Figure 15. Light Wave in human eyes and camera sensors (Spectral response, n.d.)


5.1.2 Camera and Filter

In order to obtain sufficient transmission speed and excellent image quality, the team used the Logitech C920 Webcam (Shown in Figure 16), which has a 1080P HD video quality. However, a suitable filter is necessary in order to track the infrared light. An infrared filter is a piece of glass that blocks and absorbs the short light waves (visible light), allowing the long wavelength infrared light through.

The most common infrared filters on the market have 720nm, 760nm, 850nm and 950nm wavelength thresholds (Kolari Vision, n.d.). The 720nm filter blocks most visible light but still lets some through, so that visible light and infrared light co-exist in the image. This is popular in the realm of infrared photography. The 950nm filter only passes light above 950nm, leaving a narrow band up to the sensor's 1000nm limit that does not cover most infrared LEDs, and as a result it was not used. Both the 760nm and 850nm filters are suitable for our project, and as the infrared LEDs used in this project emit above 850nm, we used the 850nm filter to obtain a purer infrared image.
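The filter choice can be checked with simple arithmetic. The sketch below treats each filter as an idealised long-pass filter with a sharp cut-on wavelength and the sensor as blind above 1000nm, which is a simplification: real filters and LEDs have gradual spectral curves. The LED peak of 890nm is the value reported for the chosen marker in 5.1.4; the function names are hypothetical.

```python
SENSOR_LIMIT_NM = 1000  # upper end of the sensor range quoted in 5.1.1


def band_width(cut_on_nm, sensor_limit_nm=SENSOR_LIMIT_NM):
    """Width (nm) of the band between the filter cut-on and the sensor limit."""
    return max(0, sensor_limit_nm - cut_on_nm)


def led_is_visible(cut_on_nm, led_peak_nm, sensor_limit_nm=SENSOR_LIMIT_NM):
    """True if the LED peak lies inside the band seen through the filter
    (idealised: the filter is modelled as a perfect step)."""
    return cut_on_nm <= led_peak_nm <= sensor_limit_nm


FILTERS_NM = (720, 760, 850, 950)
LED_PEAK_NM = 890  # peak wavelength of the marker chosen in 5.1.4

# Map each filter to (usable band width, whether the LED is seen through it).
report = {f: (band_width(f), led_is_visible(f, LED_PEAK_NM)) for f in FILTERS_NM}
```

Under these assumptions the 850nm filter leaves a 150nm band that still contains the LED peak, while the 950nm filter leaves only 50nm and excludes it.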

Figure 16. Installation. The table at the bottom is the surface on which the robots move. The camera and filter are above the table. The projector is on the right side.


5.1.3 Installation

The setup created by the design team is shown in Figure 16. It consists of a table which is 1300mm long, 900mm wide and 1000mm high. The camera and infrared filter are placed 1500mm above the surface of the table. The projector is located on the right side of the table, 2300mm above it, mounted upside down and rotated to a 30° angle to ensure that the projection range completely covers the surface of the table.


The system has two coordinate systems, both shown in Figure 17. The first is an absolute coordinate system with the upper left corner of the table as the origin. The length of the table is the X-axis, the width is the Y-axis and the height is the Z-axis. The second is a relative coordinate system for the robot. It is an independent two-dimensional coordinate system, with the chassis shaft of the robot as its centre point.
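The relationship between the two coordinate systems can be written as a 2D rotation followed by a translation. The sketch below is one possible formulation, assuming positions in millimetres on the table plane and a robot angle measured from the table's X-axis to the robot's X-axis; the function names and the sample pose are hypothetical. Note that with the origin at the upper left corner, the visual sense of rotation depends on which way the Y-axis is drawn.

```python
import math


def robot_to_table(point, robot_pos, robot_angle_deg):
    """Convert a point from the robot's relative frame to the table's
    absolute frame: rotate by the robot angle, then translate."""
    a = math.radians(robot_angle_deg)
    x, y = point
    return (robot_pos[0] + x * math.cos(a) - y * math.sin(a),
            robot_pos[1] + x * math.sin(a) + y * math.cos(a))


def table_to_robot(point, robot_pos, robot_angle_deg):
    """Inverse transform: translate back to the chassis centre, then
    rotate by the negative robot angle."""
    a = math.radians(robot_angle_deg)
    dx = point[0] - robot_pos[0]
    dy = point[1] - robot_pos[1]
    return (dx * math.cos(a) + dy * math.sin(a),
            -dx * math.sin(a) + dy * math.cos(a))


# Hypothetical pose: chassis centre at (500, 300) mm, rotated by 90 degrees.
pose_pos, pose_angle = (500.0, 300.0), 90.0
on_table = robot_to_table((100.0, 0.0), pose_pos, pose_angle)
back = table_to_robot(on_table, pose_pos, pose_angle)
```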


Figure 17. Absolute and relative axes coordinates


5.1.4 Markers on Objects

After setting up the installation, infrared LED markers were placed on the robot so that it could be detected. We tested three LED markers, which emit at 890nm, 875nm and 940nm, with angles of 44°, 20° and 20° respectively (shown in Figure 18, from left to right). Figure 18 shows the test results for the three LEDs. The best-performing LED marker was the one with the widest angle, 44°, and an 890nm wavelength, and it was therefore chosen as the tracker in our project.


In order to detect the position and angle of the object, the LED markers were arranged in the basic shape shown in Figure 19. The centre point is the central rotation point of the entire robot, and the origin of the relative coordinate system mentioned above. The farthest point indicates the rotation direction of the robot, which is perpendicular to the hypotenuse. The remaining two points, which lie near the centre point, are auxiliary points whose role is explained below.


Figure 18 & 19. The test result of three LED & Marker pattern


5.1.5 Mapping camera range

Once the setup in the physical world was completed, we used the VVVV software for tracking and follow-up work. VVVV is a toolkit for real-time video programming in large media environments that allows rapid development (VVVV, 2014). Before testing the LED markers, we first had to map the camera image so that its size matched the size of the real space. This is a crucial step because the system uses only a single 2D camera: if the mapping is not precise, the system will be unable to map the object to its actual location. Several factors, including the wide angle of the camera, may deform the image and lead to erroneous results. The four corners of the table in the video image were matched to the four corners of the render window by stretching the image (See Figures 20 and 21). The software was able to change the position of the camera so that the perspective matched that of the table. The rendering window could then use its own coordinate system to map the virtual image to the actual object.
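Matching four image corners to four table corners can be expressed mathematically as a planar homography. The sketch below estimates one from four point pairs and applies it to a camera pixel; it is a generic formulation offered for illustration and is not a description of how VVVV performs the stretch internally. The pixel coordinates of the corners are hypothetical, while the table size comes from 5.1.3. Lens distortion is ignored.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corner points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """3x3 matrix (bottom-right entry fixed to 1) mapping four src points
    onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map an image point to table coordinates (perspective divide included)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical pixel positions of the table corners in the camera image,
# listed clockwise from the upper left, and the table size in millimetres.
camera_corners = [(100, 80), (540, 90), (600, 420), (60, 400)]
table_corners = [(0, 0), (1300, 0), (1300, 900), (0, 900)]
H = homography(camera_corners, table_corners)
```

Any detected marker pixel can then be passed through `apply_homography(H, pixel)` to obtain its position in the absolute coordinate system of the table.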


Figure 20 & 21. Before & after mapping the camera range


5.1.6 Contour of one single object

After the LED markers were placed, the software was able to identify these points and translate them into graphics. The rendered image in the software showed a black background with four white points. If the LED markers are tilted too far, or if the exposure setting of the video is too high, the camera may detect an extra point, which is problematic. Therefore, the settings of the camera were adjusted at the beginning. Once the locations of the four points had been detected on the screen, it was necessary to tell the points apart through a series of calculations. The first step was to identify the chassis centre point. The positions of the four points were averaged to find their mean location, and the distance between this mean point and each of the four marker points was calculated. As the chassis centre point is located nearest to the mean point, it was identified by choosing the smallest distance. This is the purpose of the two auxiliary points mentioned above: they keep the mean close to the chassis centre. With the centre point identified, the position of the robot was obtained.

After obtaining the robot chassis centre, the next step was to detect the rotation angle of the robot. As mentioned above, the point used to detect the rotation of the robot is the point farthest from the chassis centre point. It can be found by calculating the distance from the centre point to each of the other points. The point with the greatest distance marks the Y-axis of the relative coordinate system. By tracking the movement of the Y-axis, the rotation of the robot was detected. Through these steps, the tracking system was able to obtain the position and degree of rotation with sufficient accuracy, thus tracking the robot in real time.
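The centre-finding step can be sketched in a few lines: average the four marker positions, then pick the marker closest to that average. The marker coordinates below are hypothetical values in table millimetres, chosen to resemble a layout with one distant direction marker and two auxiliary markers near the centre; the function name is likewise an assumption.

```python
import math


def find_centre(markers):
    """Return the marker closest to the mean position of all markers."""
    mean_x = sum(x for x, _ in markers) / len(markers)
    mean_y = sum(y for _, y in markers) / len(markers)
    return min(markers,
               key=lambda p: math.hypot(p[0] - mean_x, p[1] - mean_y))


# Hypothetical detections: direction marker, two auxiliary markers, centre.
markers = [(500, 380), (480, 290), (520, 290), (500, 300)]
centre = find_centre(markers)
```

In this example the mean position is (500, 315), which lies 15mm from the centre marker and more than 30mm from every other marker, so the centre is selected unambiguously.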
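The rotation step can be sketched in the same way: choose the marker farthest from the chassis centre and take the angle of the line joining the two. The coordinates are hypothetical table positions in millimetres, and the convention of measuring the angle from the table's X-axis with `atan2` is an assumption made for this example.

```python
import math


def find_heading(markers, centre):
    """Return (direction_marker, angle_deg), where the direction marker is
    the one farthest from the centre and angle_deg is the angle of the line
    from the centre to that marker, measured from the table's X-axis."""
    direction = max(
        markers,
        key=lambda p: math.hypot(p[0] - centre[0], p[1] - centre[1]))
    angle = math.degrees(math.atan2(direction[1] - centre[1],
                                    direction[0] - centre[0]))
    return direction, angle


# Hypothetical detections for one robot; the centre is assumed known.
markers = [(500, 380), (480, 290), (520, 290), (500, 300)]
direction, angle_deg = find_heading(markers, centre=(500, 300))
```

Comparing `angle_deg` between successive frames gives the change in rotation of the robot.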


5.1.7 Contours of more than one object

Section 5.1.6 described the detection method for a single robot. However, in the practical application of this project, one robot is not enough. When the number of robots increases to two or more, some of the methods above no longer apply directly, such as averaging all the locations. It is therefore necessary to add another formula that can process the whole point cloud. The formula splits the point cloud into one group per robot. The calculation takes a random point as the first reference point, then calculates the distance between this reference point and every other point. Because the maximum distance between the four points on one robot is less than the minimum distance between that robot and any other robot, it is easy to find the other three points of the same robot: they are the three points nearest to the first reference point. After that, another random point is taken from the rest of the point cloud as the second reference point and the calculation is repeated, and so on. This method can be used to detect any number of robots. After splitting the point cloud, the program can use the method described in 5.1.6 to calculate the position and rotation of each robot.
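The grouping procedure can be sketched as follows. For reproducibility this version takes the first remaining point as the reference instead of a random one, which does not change the result as long as the distance condition stated above holds. All coordinates and names are hypothetical.

```python
import math


def split_into_robots(points, markers_per_robot=4):
    """Group detected points into robots: take a reference point, attach its
    nearest neighbours, remove the group, and repeat on the remainder.
    Leftover points that cannot form a full group are ignored."""
    remaining = list(points)
    groups = []
    while len(remaining) >= markers_per_robot:
        ref = remaining.pop(0)
        # Sort the remaining points by distance to the reference point.
        remaining.sort(
            key=lambda p: math.hypot(p[0] - ref[0], p[1] - ref[1]))
        groups.append([ref] + remaining[:markers_per_robot - 1])
        remaining = remaining[markers_per_robot - 1:]
    return groups


# Two hypothetical robots whose markers arrive interleaved from the camera.
robot_a = [(500, 380), (480, 290), (520, 290), (500, 300)]
robot_b = [(900, 580), (880, 490), (920, 490), (900, 500)]
detected = [robot_a[0], robot_b[0], robot_a[1], robot_b[1],
            robot_a[2], robot_b[2], robot_a[3], robot_b[3]]
groups = split_into_robots(detected)
```

Each resulting group of four points can then be processed with the single-robot method to obtain that robot's centre and rotation.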


5.2 Other functions

The main function of tracking in this paper is to detect the location of robots. However, this can also be extended to some other functions, such as projection mapping on the surface of the robot and guiding the robot.


5.2.1 Projection Mapping

Importantly, we needed to project the visual images onto the surface of the robot. To do so, we took advantage of the VVVV software, in which a projection can easily be mapped onto a specified area by simply switching projection surfaces in that area on or off. The patch contains a physical geometry area and a virtual image area. We put the shape of the robot into the geometry area, so that the movie image could be transformed into the virtual image area. We could therefore see a simulation of real-time tracking and projection mapping in the rendering screen, as shown in Figure 22.


Figure 22 & 23. projection mapping in the rendering screen & projection mapping on the robots


5.2.2 Robot Guiding

However, in order to project this onto the real surface, we needed an additional step. This was to map the real environment into the software. In order to do so, we first needed to select five points in the real environment which were not on the same plane. We then marked the five points in the same position in the virtual environment. We were then able to match the two five-points following a calculation of the position of the projector (See Figure 23).

Another function of the tracking system in our project was to check the real-time location of each robot and provide it with a target, making sure that it moves along the correct path. Most omnidirectional robots determine their position using gyroscopes and encoders. Our robots, however, can rely on the tracking system to detect their position. By specifying target locations in the 3D model and calculating the speed of each of the three wheels separately, we could work out the required time of arrival. Meanwhile, as the robot moves, the tracking system can detect whether it is following the correct path. If there is any deviation, the wheel speeds can be recalculated.
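The guiding loop described above can be sketched as follows. This is a minimal illustration rather than the code running on our robots: it assumes the standard kinematic model of a three-wheeled omnidirectional base, and the wheel angles, body radius, tolerance and function names are all hypothetical values chosen for the example.

```python
import math

# Assumed geometry (hypothetical values, not taken from the robots in this project).
WHEEL_ANGLES = [math.radians(a) for a in (0, 120, 240)]  # wheel positions around the body
BODY_RADIUS = 0.1  # distance from the robot centre to each wheel, in metres

def wheel_speeds(vx, vy, omega, heading):
    """Rim speed of each wheel for a world-frame velocity (vx, vy) and spin rate omega."""
    return [
        -math.sin(heading + a) * vx + math.cos(heading + a) * vy + BODY_RADIUS * omega
        for a in WHEEL_ANGLES
    ]

def guide(position, heading, target, speed, tolerance=0.01):
    """Return (wheel speeds, time of arrival) for driving straight to the target."""
    dx, dy = target[0] - position[0], target[1] - position[1]
    distance = math.hypot(dx, dy)
    if distance <= tolerance:
        return [0.0, 0.0, 0.0], 0.0  # already at the target
    vx, vy = speed * dx / distance, speed * dy / distance
    return wheel_speeds(vx, vy, 0.0, heading), distance / speed

def cross_track_error(position, start, target):
    """Perpendicular distance from the tracked position to the line start -> target."""
    px, py = position[0] - start[0], position[1] - start[1]
    lx, ly = target[0] - start[0], target[1] - start[1]
    return abs(px * ly - py * lx) / math.hypot(lx, ly)

# Plan a move from the origin to (1, 0) at 0.5 m/s.
speeds, eta = guide((0.0, 0.0), 0.0, (1.0, 0.0), 0.5)
# Later, the tracking system reports the robot slightly off the line.
deviation = cross_track_error((0.5, 0.1), (0.0, 0.0), (1.0, 0.0))
if deviation > 0.05:
    # Recalculate the wheel speeds from the position the tracker actually saw.
    speeds, eta = guide((0.5, 0.1), 0.0, (1.0, 0.0), 0.5)
```

Because the position comes from the external tracking system on every frame, the correction simply calls `guide` again with the latest tracked position instead of integrating encoder readings.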


6. Conclusion

In this report, we presented different approaches to tracking objects in spatial augmented reality environments and illustrated the use of a real-time tracking system. Through our exemplar project, we explained how to track a robot using a cheap infrared tracking system, and explored how the tracking system can be used to position the projection mapping and guide the robot to a specified location.

However, we only tested this using the VVVV software. Many other professional software packages can be used to track objects accurately. The algorithms described in this paper were all designed to solve problems specific to this project, for example how to find the central point. They therefore cannot simply be reused in other programs. It would be useful to explore the existing algorithms that other researchers have produced and to learn the methods that other software uses.

Compared with the popular infrared tracking systems currently on the market, our tracking system is cheaper and can easily be used in everyday design. The whole system can track dynamic objects in a dark or unstable environment. Furthermore, the tracking process in this software is relatively simple. We therefore believe that this system can easily be used in many augmented reality environments.


7. References

Arduino, 2016. BoardAnatomy. [online] Available at: <> [Accessed: 20 July 2016].

ARToolKit, 2015. ARToolKit documentation. [online] Available at: <> [Accessed: 20 July 2016].

Asai, N., 2014. Nobumichiasai.Com. [online] Available at: <> [Accessed: 12 May 2016].

Bimber, O. and Raskar, R., 2005. Spatial augmented reality: Merging real and virtual worlds. United States: A K Peters.

Boger, Y., 2003. Overview of Positional Tracking Technologies for Virtual Reality. [online] Available at: <> [Accessed: 20 July 2016].

Carmigniani, J., Furht, B., Anisetti, M., Ceravolo, P., Damiani, E. and Ivkovic, M., 2010. Augmented reality technologies, systems and applications. Multimedia Tools and Applications, 51(1), pp.341-377.

CMSoft, n.d. Case study: color tracking. [online] Available at: <> [Accessed: 20 July 2016].

Fleder, M., Pillai, S. and Scott, J., n.d. 3D Object Tracking Using the Kinect.

Gaschler, A., 2011. Real-Time Marker-Based Motion Tracking: Application to Kinematic Model Estimation of a Humanoid Robot. [e-book] Available at: <> [Accessed: 20 July 2016].

Gasparrini, S., Cippitelli, E., Spinsante, S. and Gambi, E., 2014. A depth-based fall detection system using a Kinect® sensor. Sensors, 14(2), pp.2756-2775.

Hornecker, E., 2002. Tangible Interaction. In: Soegaard, M. and Dam, R.F., 2002. The glossary of human computer interaction.

Ishii, H. and Ullmer, B., 1997. Tangible Bits: Towards Seamless Interfaces between People, Bits and Atoms. In: Proceedings of CHI ’97.

Kolari Vision, n.d. Choosing a filter. [online] Available at: <> [Accessed: 20 July 2016].

Kreylos, O., 2012. Oliver Kreylos’ research and development Homepage – augmented reality sandbox. [online] Available at: <> [Accessed: 12 May 2016].

Kreylos, O., 2014. AR Sandbox: Cutting-Edge Earth Science Visualization. [online] Available at: <> [Accessed: 20 July 2016].

Krevelen, D.W.F. and Poelman, R., 2010. A survey of augmented reality technologies, applications and limitations. The International Journal of Virtual Reality, 9(2), pp.1-20.

Kurihara, K., Hoshino, S., Yamane, K. and Nakamura, Y., 2002. Optical Motion Capture System with Pan-Tilt Camera Tracking and Realtime Data Processing.

Mindell, D.A., 2002. Between human & machine: Feedback, Control, and Computing Before Cybernetics. United States: Johns Hopkins University Press.

Oliveira, V.A. and Conci, A., n.d. Skin Detection using HSV color space.

Pickering, A., 2010. The cybernetic brain: Sketches of another future. Chicago: University of Chicago Press.

Rasmussen, C., Toyama, K. and Hager, G.D., 1996. Tracking objects by color alone.

Seymour, M., 2004. Art of tracking part 1: History of tracking. [online] Available at: <> [Accessed: 18 July 2016].

Vision Glossary, n.d. Spectral response. [online] Available at: <> [Accessed: 20 July 2016].

VVVV, 2014. VVVV – A multipurpose toolkit. [online] Available at: <> [Accessed: 21 July 2016].

Wellner, P., Mackay, W. and Gold, R., 1993. Computer-Augmented Environments: Back to the Real World. Communications of the ACM, 36(7), pp.24-26.

Yilmaz, A., Javed, O. and Shah, M., 2006. Object tracking: A Survey. ACM Computing Surveys, 38(4), p.13.


8. List of Figures

  • Wang, M., 2016. Relationship between virtual space and physical world. [chart].
  • Wang, M., 2016. Shape of robot. [sketch].
  • Wang, M. and Chaturvedi, S., 2016. The reconfiguration possibility. [computer graphic].
  • Wang, M. and Chaturvedi, S., 2016. Elements of robot. [photograph].
  • Wang, M. and Chaturvedi, S., 2016. The working system. [sketch].
  • Wang, M., 2016. Separated pattern in virtual space. [computer graphic].
  • Wang, M., 2016. Merged pattern in virtual space. [computer graphic].
  • Kreylos, O., 2014. Augmented Reality Sandbox. [image online] Available at: <> [Accessed: 20 July 2016].
  • Kreylos, O., 2014. The setup of AR Sandbox. [image online] Available at: <> [Accessed: 20 July 2016].
  • Asai, N., 2014. [image online] Available at: <> [Accessed: 12 May 2016].
  • Wang, M., 2016. Colour Tracking. [photograph].
  • Wang, M., 2016. Marker Tracking. [photograph].
  • Wang, M., 2016. The environment image by Kinect. [computer graphic].
  • Wang, M., 2016. Short range of detection. [computer graphic].
  • Seymour, M., 2004. Light waves in human eyes and camera sensors. [image online] Available at: <> [Accessed: 18 July 2016].
  • Wang, M. and Chaturvedi, S., 2016. There is a table at bottom for surface of robot. The camera and filter is above the table. On the right side, there is the projector. [photograph].
  • Wang, M., 2016. 3D environment and 2D robot coordinate axes. [photograph].
  • Wang, M., 2016. The test result of four LED. [computer graphic].
  • Wang, M., 2016. Marker pattern. [computer graphic].
  • Wang, M., 2016. Before mapping the camera range. [computer graphic].
  • Wang, M., 2016. After mapping the camera range. [computer graphic].
  • Wang, M., 2016. Projection mapping in the rendering screen. [computer graphic].
  • Wang, M. and Chaturvedi, S., 2016. Projection mapping on the robots. [photograph].

