The multi-touch See ColOr interface


Proceedings Chapter

Reference

The multi-touch See ColOr interface

BOLOGNA, Guido, et al.

BOLOGNA, Guido, et al. The multi-touch See ColOr interface. In: The 2nd International Conference on Information and Communication Technologies and Accessibility, ICTA 2009. 2009. p. 139-142

Available at:

http://archive-ouverte.unige.ch/unige:47658

Disclaimer: layout of this document may differ from the published version.



The Multi-Touch See ColOr Interface

Guido Bologna¹, Stéphane Malandain¹, Benoît Deville², Thierry Pun²

¹University of Applied Science

Rue de la prairie 4, 1202 Geneva, Switzerland E-mail:{guido.bologna, stephane.malandain}@hesge.ch

²Computer Science Department, University of Geneva, Route de Drize 7, 1227 Carouge, Switzerland E-mail:{benoit.deville, thierry.pun}@unige.ch

Abstract

This article presents the multi-touch See ColOr interface that will permit the interpretation of static pictures by visually impaired persons. Images are represented on a special paper emphasizing object contours by palpable roughness. With this multi-modal interface, user fingers can touch image pixels represented by laterally spatialized sound sources encoding colors by classical musical instruments. Compared to previously developed See ColOr interfaces, here the novelty is the multi-touch interaction with touch and audition.

1. Introduction

Several authors have proposed devices that replace vision with the auditory pathway in the context of real-time navigation for visually impaired individuals. The “K Sonar-Cane” combines a cane and a torch with ultrasounds [7]. With the K Sonar-Cane, it is possible to perceive the environment by listening to a sound coding the distance.

“The vOICe” is another experimental vision substitution system that uses auditory feedback. An image is represented by 64 columns of 64 pixels [8]. Every image is processed from left to right and each column is listened to for about 15 ms. Specifically, every pixel gray level in a column is represented by a sinusoidal wave with a distinct frequency. High frequencies are at the top of the column and low frequencies are at the bottom.
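The column-wise encoding described above can be sketched in Python as follows. This is only an illustration of the principle: the frequency range (500 Hz to 5 kHz), the geometric spacing of the frequencies, the sample rate and the function names are assumptions made for this sketch and are not taken from [8].

```python
import math


def column_frequencies(n_rows=64, f_low=500.0, f_high=5000.0):
    """Return one frequency per row; row 0 (top of the image) is the highest.

    The range and the geometric spacing are assumptions of this sketch.
    """
    if n_rows < 2:
        raise ValueError("at least two rows are needed")
    ratio = (f_high / f_low) ** (1.0 / (n_rows - 1))
    return [f_high / ratio ** row for row in range(n_rows)]


def sonify_column(gray_column, duration_s=0.015, sample_rate=44100,
                  f_low=500.0, f_high=5000.0):
    """Mix one sinusoid per pixel, weighted by the pixel gray level (0..1)."""
    freqs = column_frequencies(len(gray_column), f_low, f_high)
    n_samples = int(duration_s * sample_rate)
    samples = []
    for k in range(n_samples):
        t = k / sample_rate
        value = sum(level * math.sin(2.0 * math.pi * freq * t)
                    for level, freq in zip(gray_column, freqs)
                    if level > 0.0)
        # Normalize by the number of rows so the mix stays within [-1, 1].
        samples.append(value / len(gray_column))
    return samples
```

Sonifying a whole image would then amount to calling `sonify_column` on each of the 64 columns from left to right and concatenating the results.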

Gonzalez-Mora et al. developed a prototype using the spatialization of sounds in three-dimensional space [6]. Each sound is perceived as coming from somewhere in front of the user by means of head-related transfer functions (HRTFs). Their first device was capable of producing a virtual acoustic space of 17 × 9 × 8 gray level pixels covering a distance of up to 4.5 meters.

Our See ColOr interface encodes colored pixels by musical instrument sounds, in order to emphasize colored entities of the environment [2, 3, 4, 5]. The basic idea is to represent a pixel as a directional sound source with depth estimated by stereo-vision. Finally, each emitted sound is assigned to a musical instrument, depending on the color of the pixel.

In the present work we introduce a novel See ColOr interface for the interpretation of static pictures represented on a special paper with palpable roughness. The conjunction of touch and audition plays an important role in the interpretation of static pictures, as visually impaired individuals are very well acquainted with these two sensory channels. The essential new feature is the multi-touch modality, which replaces a previously developed mono-touch interface. This new interface will be useful for training purposes. In practice, potential users will be able to learn the color/sound associations and their spatialized rendering before using the camera-based See ColOr prototype. In the following sections we present several completed experiments and the new See ColOr interface, followed by the conclusion.

2. Achieved experiments

During the See ColOr project, we performed several experiments with six blindfolded persons who were trained to associate colors with musical instrument sounds [2, 3]. The participants were asked to identify major components of static pictures presented on a special paper lying on a T3 tactile tablet (http://www.rncb.ac.uk/t3/index.htm) representing pictures with embossed edges. When a participant touched the paper lying on the tablet, a small region below the finger was sonified and played back to the user. Color was helpful for the interpretation of image scenes, as it lessened ambiguity. As a consequence, several individuals participating in the experiments were able to identify a number of major components of images. As an example, if a large region “sounded” cyan at the top of the picture it was likely to be the sky. Finally, all participants in the experiment were successful when asked to find a bright red door in a picture representing a churchyard with trees, grass and a house.

The work described in [4] introduces an experiment during which ten blindfolded participants tried to match pairs of uniformly colored socks by pointing a head-mounted camera and by listening to the generated sounds. The results of this experiment demonstrated that matching similar colors through the use of a perceptual (auditory) language, such as that represented by instrument sounds, can be successfully accomplished.

In [5] the purpose was to validate the hypothesis that navigation in an outdoor environment can be performed by “listening” to a colored path. We introduced an experiment during which ten blindfolded participants and a blind person were asked to point the camera toward a red serpentine path painted on the ground and to follow it for more than 80 meters. Results demonstrated that following a sinuous colored path through the use of the auditory perceptual language was successful. A video illustrating this experiment is available at http://cvml.unige.ch/doku.php/demonstrations/home.

3. A multi-touch interface

The purpose of our new multi-touch interface is to explore with one or several fingers a picture represented on a special paper emphasizing object contours by palpable roughness. In this way, a visually impaired individual will be able to determine shapes in pictures by touch together with musical instrument sounds representing color. Each finger contact point is a potential sound source whose rendering depends on the color of the touched pixel. The sound of a sonified pixel lasts 300 ms and is located at a particular azimuth angle.

In our current implementation the maximal number of sonified points is equal to 8. Sound sources are distributed uniformly on the azimuth plane. For instance, with a single finger contact point the user perceives a sound source in the middle. When two pixels are sonified, the first sound source is on the left and the second on the right. With three contact points, the first sound source is on the left, the second in the middle and the last one on the right. A similar framework is applied for more than three sonified pixels. We reproduce the azimuth angles of sound sources with the use of the CIPIC database [1].
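As an illustration, the uniform distribution of sound sources described above can be sketched in Python as follows. The angular span of ±80 degrees and the function name `azimuth_angles` are assumptions made for this sketch; only the limit of 8 sources and the left/middle/right behavior come from the text.

```python
def azimuth_angles(n_sources, max_angle_deg=80.0, max_sources=8):
    """Return evenly spaced azimuth angles in degrees, from left to right.

    Negative angles are on the left, 0 is straight ahead. The +/-80 degree
    span is an assumption of this sketch.
    """
    if not 1 <= n_sources <= max_sources:
        raise ValueError("number of sources must be between 1 and max_sources")
    if n_sources == 1:
        return [0.0]  # a single contact point is rendered in the middle
    step = 2.0 * max_angle_deg / (n_sources - 1)
    return [-max_angle_deg + i * step for i in range(n_sources)]


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, azimuth_angles(n))
```

In a complete system each computed angle would presumably be mapped to the nearest azimuth for which a measured HRTF is available.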

The audio signal from a sonified pixel is a mixture of sounds from two musical instruments, whose parameters depend on the values of the hue, saturation and luminance variables of the HSL (Hue, Saturation and Luminance) color system. Specifically, the six basic hues are encoded with the timbre of a musical instrument (oboe, alto, pizzicato violin, flute, trumpet, piano, saxophone), saturation is one of four possible notes (Do, Sol, Sib, Mi), and luminance is also one of four possible notes, played by a bass when luminance is rather dark and by a singing voice when it is relatively bright. Note that with several simultaneously sonified pixels it would be difficult for the user to understand named colors rendered by voice.
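The structure of this color-to-sound mapping can be sketched in Python as follows. The quantization boundaries are not given here, so every threshold in the sketch is an assumption: the hue circle is split into equal sectors assigned to the instruments in the order listed, saturation is split into four equal intervals, and luminance is split at 0.5 between bass and singing voice. The ordering of the notes with increasing value is also assumed.

```python
import colorsys

# Instrument and note names as listed in the text; their assignment to
# particular hue sectors and value intervals is an assumption of this sketch.
INSTRUMENTS = ["oboe", "alto", "pizzicato violin", "flute",
               "trumpet", "piano", "saxophone"]
NOTES = ["Do", "Sol", "Sib", "Mi"]


def quantize(value, n_levels):
    """Map a value in [0, 1] to an integer level in 0..n_levels-1."""
    return min(int(value * n_levels), n_levels - 1)


def encode_pixel(r, g, b):
    """Describe the two-instrument mixture for an 8-bit RGB pixel."""
    # colorsys returns (hue, luminance, saturation), each in [0, 1].
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    hue_instrument = INSTRUMENTS[quantize(h, len(INSTRUMENTS))]
    saturation_note = NOTES[quantize(s, len(NOTES))]
    if l < 0.5:  # assumed boundary between "rather dark" and "bright"
        luminance_instrument = "bass"
        luminance_note = NOTES[quantize(l / 0.5, len(NOTES))]
    else:
        luminance_instrument = "singing voice"
        luminance_note = NOTES[quantize((l - 0.5) / 0.5, len(NOTES))]
    return {
        "hue": (hue_instrument, saturation_note),
        "luminance": (luminance_instrument, luminance_note),
    }


if __name__ == "__main__":
    print(encode_pixel(255, 0, 0))
```

The returned dictionary only names the two components of the mixture; synthesizing and spatializing the actual sounds is outside the scope of the sketch.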

The detection of a finger contact point is performed by image processing algorithms. In order to easily determine finger coordinates, we place small pieces of colored tape on the user's fingernails. With a webcam 40 cm above the special A3 paper with embossed object contours, it is possible to filter this color and to determine the centers of gravity of the pieces of colored tape. Subsequently, these points are sonified.
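A minimal Python sketch of such a detection step is given below, assuming a simple per-channel color threshold followed by a 4-connected flood fill. The tape color, the tolerance, the minimum blob size and the function names are hypothetical choices made for illustration; the image processing algorithms actually used are not detailed here.

```python
def is_tape_color(pixel, tape_rgb, tolerance):
    """True if every RGB channel is within `tolerance` of the tape color."""
    return all(abs(c - t) <= tolerance for c, t in zip(pixel, tape_rgb))


def find_contact_points(image, tape_rgb=(0, 255, 0), tolerance=40,
                        min_pixels=2):
    """Return the centers of gravity (x, y) of tape-colored blobs.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Blobs smaller than `min_pixels` are discarded as noise.
    """
    height, width = len(image), len(image[0])
    visited = [[False] * width for _ in range(height)]
    centers = []
    for y0 in range(height):
        for x0 in range(width):
            if visited[y0][x0]:
                continue
            if not is_tape_color(image[y0][x0], tape_rgb, tolerance):
                continue
            # Flood fill one 4-connected blob, collecting its pixels.
            stack = [(x0, y0)]
            visited[y0][x0] = True
            xs, ys = [], []
            while stack:
                x, y = stack.pop()
                xs.append(x)
                ys.append(y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not visited[ny][nx]
                            and is_tape_color(image[ny][nx], tape_rgb,
                                              tolerance)):
                        visited[ny][nx] = True
                        stack.append((nx, ny))
            if len(xs) >= min_pixels:
                centers.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return sorted(centers)  # left-to-right order
```

Sorting the centers by their horizontal coordinate gives one plausible way to decide which contact point is rendered as the first (leftmost) sound source.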

4. Conclusion

This work summarized the experiments carried out with See ColOr interfaces based on the mapping between colors and musical instrument sounds. Moreover, we introduced a new multi-touch interface that will be useful for training purposes. One possible development in the future would be the use of a small multi-touch touchpad integrated in a See ColOr interface with cameras.

5. References

[1] Algazi, V.R., Duda, R.O., Thompson, D.P., and Avendano, C., “The CIPIC HRTF Database”, In Proc. WASPAA'01, New Paltz, NY, USA (2001).

[2] Bologna, G., Deville, B., Pun, T., and Vinckenbosch, M., “Identifying major components of pictures by audio encoding of colors”, In Proc. IWINAC’07 (2007), (2) pp. 81-89.

[3] Bologna, G., Deville, B., Pun, T., and Vinckenbosch, M., “Transforming 3D coloured pixels into musical instrument notes for vision substitution applications”, Eurasip J. of Image and Video Processing, A. Caplier, T. Pun, D. Tzovaras, Guest Eds., (2007), Article ID 76204, 14 pages (Open access article).

[4] Bologna, G., Deville, B., Vinckenbosch, M., and Pun, T., “A perceptual interface for vision substitution in a color matching experiment”, In Proc. IEEE IJCNN, Int. Joint Conf. Neural Networks, Part of IEEE World Congress on Computational Intelligence, June 1-6, 2008, Hong Kong.

[5] Bologna, G., Deville, B., and Pun, T., “Blind navigation along a curved path by means of the See ColOr musical interface”, Submitted to CHI’09.

[6] Gonzalez-Mora, J.L., Rodriguez-Hernandez, A., Rodriguez-Ramos, L.F., Díaz-Saco, L., and Sosa, N., “Development of a new space perception system for blind people, based on the creation of a virtual acoustic space”, In Proc. IWANN’99 (1999), pp. 321-330.

[7] Kay, L., “A sonar aid to enhance spatial perception of the blind: engineering design and evaluation”, The Radio and Electronic Engineer, 44 (1974), pp. 605-627.


[8] Meijer, P.B.L., “An experimental system for auditory image representations”, IEEE Trans. Biomed. Eng., 39 (2) (1992), pp. 112-121.