A computer program that translates the pixels of digital images into musical notes has become a critical tool in the research of Victor Kai-Chiu Wong grad, electrical and computer engineering. After a little over a semester of testing, the program has allowed Wong, who has been blind since the age of seven, to quickly interpret photographs and other graphical data on upper atmospheric phenomena, the focus of much of his research.
“[The software] converts the actual pixel information in an image into sound,” said engineering student Ankur Moitra ’07, the author of the program. With a Wacom tablet and stylus, a mouse-like device that maps positions on the tablet to points on a computer screen, Moitra’s program translates the color of a point on the screen into a musical note.
“We used the sound, a pitch of the piano – that way we could actually explore a three-dimensional figure,” Wong said.
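As an illustration, a color-to-pitch mapping along these lines might look like the following sketch. The note range and the mapping formula are assumptions for the sake of the example; the article does not describe Moitra’s actual scheme.

```python
# Illustrative pixel-to-pitch mapping (assumed scheme, not Moitra's
# actual code). Maps an 8-bit grayscale intensity onto the 88 keys of
# a piano, expressed as a MIDI note number.

def intensity_to_midi_note(intensity):
    """Map a pixel intensity (0-255) to a piano key (MIDI notes 21-108)."""
    if not 0 <= intensity <= 255:
        raise ValueError("intensity must be in 0..255")
    # A standard piano spans MIDI notes 21 (A0) to 108 (C8): 88 keys.
    key = round(intensity / 255 * 87)  # 0..87
    return 21 + key

def midi_note_frequency(note):
    """Frequency in Hz of a MIDI note (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)

# A dark pixel sounds low, a bright pixel high:
low = intensity_to_midi_note(0)     # A0, the lowest piano key
high = intensity_to_midi_note(255)  # C8, the highest piano key
```

Under this kind of mapping, sweeping the stylus across a brightening region of the image would be heard as a rising scale.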
For instance, during his inspection of upper atmospheric emissions, the color dimension gave Wong crucial information about the wavelength and intensity of the emitted radiation – information that for visible light is conveyed through color and brightness, and that for invisible light is often converted to color for interpretation.
“These [emission] lines … come from the transition of the state of molecules and atoms,” Wong said. The wavelengths provided information about the chemical makeup of the upper atmosphere, while the perturbations in intensity implied changes in “certain parameters such as electron density,” Wong added. Collectively, the data provided Wong with knowledge of the current state of the high-altitude weather surrounding the earth, which affects communication, the Global Positioning System and even radio broadcasting.
The work on the software began this past summer, when other techniques of translating visual data for Wong became too difficult and time-consuming.
“We started with the basic research question of how to represent a detailed color-scaled image to someone who is blind,” said research associate James Ferwerda in a press release. “The most natural approach was to try sound, since color and pitch can be directly related and sensitivity to changes in pitch is quite good,” Ferwerda added.
Nonetheless, when writing the program, Moitra faced other conceptual and technical challenges.
“It [was] very difficult to determine the edge of the image, and that is the first thing you want to determine,” Moitra said. The problem was solved with an “advanced edge-detection algorithm” that prints out a three-dimensional copy of the image’s edges. That copy could then be overlaid on the tablet, so that Wong could feel the outlines of an image.
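For a sense of how edge detection works, here is a basic Sobel filter in Python. The article says only that an “advanced edge-detection algorithm” was used, so this simple stand-in should not be taken as Moitra’s method.

```python
# Illustrative edge detector: a basic Sobel filter (a simplified
# stand-in for the unnamed "advanced edge-detection algorithm").

def sobel_edges(image, threshold=128):
    """Return a binary edge map for a 2-D grayscale image (list of lists)."""
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical intensity gradients.
            gx = (image[y-1][x+1] + 2*image[y][x+1] + image[y+1][x+1]
                  - image[y-1][x-1] - 2*image[y][x-1] - image[y+1][x-1])
            gy = (image[y+1][x-1] + 2*image[y+1][x] + image[y+1][x+1]
                  - image[y-1][x-1] - 2*image[y-1][x] - image[y-1][x+1])
            # Mark pixels where the gradient magnitude is large.
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges

# A sharp vertical black/white boundary yields a column of edge pixels:
img = [[0] * 4 + [255] * 4 for _ in range(8)]
edge_map = sobel_edges(img)
```

An edge map like this, rendered as a raised printout, is the kind of outline Wong could trace by touch on the tablet.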
A second challenge arose from the need to simplify the copious amounts of data that are encoded in extremely high-resolution images.
“Oftentimes it is more beneficial to get a less fine resolution,” Moitra said. Accordingly, the translating software reduces the number of points but retains most of the image’s information through a process called a Gaussian blur.
“What you actually do is average the points around so that it gives you more of a smooth feel,” Moitra explained.
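That averaging step can be sketched as a small Gaussian blur: each pixel is replaced by a weighted average of itself and its neighbors, with nearer pixels weighted more heavily. The 3x3 kernel size and weights below are illustrative choices, not details from the article.

```python
# Sketch of the smoothing step: a 3x3 Gaussian blur that averages each
# pixel with its neighbors, nearer pixels counting more.

KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]  # weights sum to 16

def gaussian_blur(image):
    """Blur a 2-D grayscale image (list of lists); border pixels unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total = sum(KERNEL[dy + 1][dx + 1] * image[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = total // 16
    return out

# A single bright pixel is spread over its neighborhood:
img = [[0] * 5 for _ in range(5)]
img[2][2] = 160
blurred = gaussian_blur(img)
```

The blur trades one harsh spike for a smooth gradient, which is exactly the “smooth feel” Moitra describes when the image is heard point by point.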
Wong’s success with the program thus far has inspired many ideas for improving the software. For instance, “we are trying to get a pattern recognition done – have the software automatically scan for major features of an image,” Wong explained. Currently, despite the benefits of the Gaussian blur, Wong has to explore an image point by point to identify its major features.
“If there are lots of data and the data are complex, this process [can] be very time consuming,” Wong said. However, with pattern recognition, the translating program would be able to identify large blobs of a single color and provide the user with a faster global interpretation of an image.
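One simple way such a blob search could work is flood-filling connected regions of a single color and reporting the large ones. Since the article describes this feature as planned rather than built, the approach below is only a plausible sketch.

```python
# Sketch of the proposed blob search: flood-fill connected same-color
# regions and report those above a size threshold. (This feature was
# planned, not implemented; the approach here is one simple option.)
from collections import deque

def find_blobs(image, min_size=4):
    """Return (color, size) pairs for connected same-color regions >= min_size."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx]:
                continue
            color = image[sy][sx]
            queue, size = deque([(sy, sx)]), 0
            seen[sy][sx] = True
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and image[ny][nx] == color):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if size >= min_size:
                blobs.append((color, size))
    return blobs

# Two regions: a 2x2 block of color 1 and an L-shape of color 0.
img = [[1, 1, 0],
       [1, 1, 0],
       [0, 0, 0]]
blobs = find_blobs(img)
```

Reporting only the large regions would give the user the fast global summary of an image that Wong describes, instead of a point-by-point search.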
Archived article by David Andrade
Sun Staff Writer