In any discussion of opening black boxes, someone will bring up the analogy of cars: not everyone needs to know how a car works inside. I would agree that drivers no longer need to understand how a car works, but automotive engineers do need to know. The problem isn’t users who don’t understand how software works; it’s software developers who don’t understand how software works.

 

Have a comment? Contact me on Twitter or email me.






As one example, researchers from Georgia Institute of Technology presented a Two-Thumb-Chording approach where a character is produced by pressing several keys at once like playing a chord. It takes some time and effort to master, but then text input is clearly faster.

More details: http://nirmalpatel.com/research/keypad_chording.html




A few pics from my recent trip to Bonn, Germany for Mobile HCI 2009.

1) Glass ceiling of Frankfurt airport train station
2) View while on train
3) Crazy tiny hotel room (with complimentary gummy bears)
4) A recent Jun Rekimoto project (Pet Lifelogging)
5) DJ Ludwig - the best of the Ludwig statues in Bonn
6) Some pics of the rooms used for presentations

                                     
Click here to download:
mobile_hci_pics.zip (4961 KB)







I attended a talk today given by Ray Kurzweil. I started to tweet about it but soon found that the time I took to correct typos in the tweets was detracting from actually listening to Kurzweil speak. I switched over to using Notes on my iPod.

He spent a lot of time upfront telling us why his predictions seem crazy and why we should believe him. To start with, our innate ability to predict is based on a linear function. This served us well when we saw a predator in the distance and had to figure out how long it would take to reach us. Linear thinking is fine in the short term but hinders our ability to think or design for the future. Second, many critics use limitations in existing technologies as counterpoints. For example, many say that the age of Moore's Law has come to an end: we cannot increase the number of transistors per area anymore, so the exponential progress is slowing down. To these critics he says that there was a time when vacuum tubes were not going to get any smaller; then along came the transistor, and the curve kept going as predicted. The take-away for me was that his predictions are not about the physical manifestation of the technology, but about the raw capabilities of the technology.

The second major point of his talk was that as technologies are subsumed by information technologies, they too will be subject to exponential gains. His poster child for this point is the health sciences. Before the use of simulations, drug discovery was more hit-and-miss. New drugs were manufactured and then tested on subjects, which made for a lengthy process. With advances in drug interaction simulation, researchers can now model a drug molecule and test many possible outcomes before moving to lengthy clinical trials. He then discussed what advances in health could mean for increasing the capabilities of the human body and extending the human lifespan. His basic premise was that our genes are "outdated software" and that in the future we will be able to re-engineer various aspects of our bodies. One example he gave is some recent research in artificial red blood cells. One such cell is capable of increasing the ability of our blood to store oxygen. Imagine being able to stay underwater for an hour without any oxygen storage apparatus other than yourself.

Throughout his presentation he showed many graphs detailing trends and how all of the curves transcend the technologies of their time. The curve that stood out the most to me was the progress of raw computational power. The section showing present developments wasn't the interesting part; what was interesting was that the evolution of organic computational power also fit onto the curve. Our next steps in AI are to combine the incredible processing power offered by inorganic technologies with the right "software", unleashing the next age of intelligent organisms.

In the end, my main take-aways are:

  1. Exponential growth in technology is something independent of the physical manifestation of technology.
  2. As other technologies are subsumed by, become, or are enhanced by information technology, they too will be subject to the exponential gains.
  3. The exponential growth in computational power not only matches the progress of inorganic computation but is an extension of millions of years of evolutionary progress.

 




                                                                                                   
Click here to download:
trip_to_cali.zip (2809 KB)




Latest Update: See the research page for Deaf 911 for the latest details

This is a short writeup of some of the work I’ve been doing recently. I want to start this post with a story based on an actual event to highlight the problem we are tackling:

Bob, a deaf man, was making a late night grocery run and was heading to his car. Before he could reach the car, a mugger stopped him and threatened him with a knife. Even though Bob was carrying a gun, he handed over his wallet, but the mugger stabbed him anyway. Bob, in self-defense, shot the mugger. Bob realized that emergency help was required but could not simply call 911, so he SMS’d his hearing friend instead. This friend called 911 and related the problem to the 911 operator. This resulted, as per policy, in emergency help showing up at his friend’s location instead of Bob’s. By the time the emergency help was redirected to Bob’s location, the mugger had lost too much blood and could not be resuscitated.

Existing Practices

Deaf people make heavy use of SMS (text messages) to communicate with other people when mobile. However, 911 centers currently do not have the resources to support these technologies. Even if it were possible, it is not desirable for SMS to be the primary means of communication with 911 for a number of reasons, such as:

  1. SMS cannot be located in the same way that voice calls to 911 can.
  2. SMS messages are not necessarily delivered to the receiver as soon as they are sent (i.e. it is store-and-forward).
  3. SMS senders cannot be tracked by existing 911 cell-tower based location systems.
  4. SMS communication happens on a message-by-message basis and not immediately as each character is typed, unlike voice where every sound is sent as soon as it is uttered.

In 1990, the Americans with Disabilities Act mandated that all 911 centers be able to communicate with TDDs (telecommunications devices for the deaf) to provide real-time access to 911 services for deaf people. TDDs make use of 1400 Hz and 1800 Hz tones to encode text as a series of bits according to the Baudot system. With the explosion of mobile phones came policies which required mobile phone manufacturers to produce phones which were compatible with mobile TDDs and could reliably send Baudot coded signals to 911 centers. This allowed deaf people to access 911 services even when mobile if they had a mobile TDD. However, very few if any deaf people carry both their cellphone and a separate TDD device, given that text based messaging systems are the primary form of communication for the deaf community.

Our Solution

Our solution is to develop software for the mobile phone that does everything the mobile TDD would do. This includes encoding user-entered text according to the Baudot system and decoding the incoming audio to show text to the user. Doing this requires direct, real-time access to the incoming and outgoing voice streams of a phone. Encoding the entered text into the appropriate sounds is performed by playing pre-generated audio files for each letter, digit and symbol allowed by Baudot. Because these encodings are fixed, there is no need to have software generate the audio at runtime. Decoding is performed using the Goertzel algorithm. Because we already know which tones (1400 Hz and 1800 Hz) we are looking for, the Goertzel algorithm allows us to detect the tones faster than the more general FFT.
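To make the decoding step a little more concrete, here is a minimal Python sketch of tone detection with the Goertzel algorithm. It is not the code from our prototype. The 8000 Hz sample rate, the 176 samples per bit (roughly 8000 / 45.45 baud), the choice of 1400 Hz for a 1 bit and 1800 Hz for a 0 bit, and all of the function names are assumptions made for illustration. It also skips Baudot framing (start and stop bits, the character table) and only turns bits into tones and back.

```python
import math

SAMPLE_RATE = 8000                      # Hz; assumed telephony sampling rate
SAMPLES_PER_BIT = 176                   # assumed: about 8000 / 45.45 baud
TONE_FOR_BIT = {1: 1400.0, 0: 1800.0}   # assumed mark/space assignment


def tone(freq, n=SAMPLES_PER_BIT):
    """Generate n samples of a unit-amplitude sine wave at freq Hz."""
    return [math.sin(2.0 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


def goertzel_power(samples, freq):
    """Return the squared magnitude of one frequency component.

    This evaluates a single frequency with one multiply-add per sample,
    instead of computing every FFT bin.
    """
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def modulate(bits):
    """Turn a list of 0/1 bits into audio samples, one tone burst per bit."""
    out = []
    for b in bits:
        out.extend(tone(TONE_FOR_BIT[b]))
    return out


def demodulate(samples):
    """Recover bits by comparing the power of the two tones in each bit window."""
    bits = []
    last_start = len(samples) - SAMPLES_PER_BIT
    for start in range(0, last_start + 1, SAMPLES_PER_BIT):
        window = samples[start:start + SAMPLES_PER_BIT]
        p_one = goertzel_power(window, TONE_FOR_BIT[1])
        p_zero = goertzel_power(window, TONE_FOR_BIT[0])
        bits.append(1 if p_one > p_zero else 0)
    return bits


if __name__ == "__main__":
    sent = [0, 1, 0, 1, 1, 0, 1]
    print(demodulate(modulate(sent)) == sent)
```

A real decoder also has to find where each bit starts in a noisy stream, which this sketch sidesteps by assuming perfectly aligned windows. The appeal of Goertzel here is that only two frequencies matter, so two cheap per-sample filters are enough.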

Due to the security concerns of modern handset makers, almost all modern phones prevent a software developer from accessing the voice stream of an active phone call. Given this restriction, our current system has been implemented on the OpenMoko mobile phone. The OpenMoko is an open-source hardware and software platform. The Linux-based software platform provides us full access to the voice stream. Our prototype decoder is able to perform at 2x realtime. We have tested our prototype with a traditional acoustic coupler and Georgia Tech’s 911 center, with their much appreciated cooperation, and achieved high accuracy rates.

NENA 2009 TDC/ODC

Dr. Thad Starner and I were invited to attend the NENA 2009 TDC/ODC where Thad gave a talk on this to an audience of policy makers, equipment manufacturers and 911 center operators. During the talk I performed a live demo with the help of the Georgia Tech 911 center. We also brought along our acoustic coupler and were able to have a running demo during the break for attendees to try out. Thad muffled the acoustic coupler’s handset rest with his fleece coat, which helped mitigate the fact that we were talking while demoing.

Thanks to help from Richard Ray, Paul McLaren and Steve O’Conor, we were able to spend a considerable amount of time at the Orlando 911 center and were able to gather data that will help us further improve our prototype. We were also able to talk with many manufacturers of the 911 center equipment and will hopefully be able to procure a standard system to use for testing in the lab.

Future Work

Our current system is only a prototype and is therefore not ready for full deployment. We would like to gather more data from different types of 911 center equipment and use this to further tune our decoding algorithm. Additionally, we would like to work with handset manufacturers and recreate our application on mobile phones that are more popular in the deaf community such as the Sidekick. This would allow us to perform more user studies by leveraging the equipment that deaf people are already comfortable with using.

 




This is a satellite view of my dad’s hometown in India. It is a village called Dhatwa and is in the Surat District of Gujarat.





Mansi and I left home after lunch, and after a quick stop in Valdosta we were at the Florida border. The Florida welcome center was a good place to pick up some literature on Jacksonville and St. Augustine. They also give you free orange juice. They had these cool cutouts that I couldn’t resist posing in.

Mansi is actually this size
Closest I'll get to watersports

We finally got to our hotel around 4PM and checked in. It’s a pretty nice place and we got a room that is facing the lake. With limited daylight ahead of us we went to the mall to do some casual shopping and grab a bite to eat. After sharing a sub and a slice of Sbarro, Mansi and I wandered the mall. Everything was fine until Mansi spotted a 75% off sign at Victoria’s Secret. That captured her attention more than the rhythm got Estefan. We finally escaped and then I was captured by a 50% off all books sale at a closing bookstore. I made a few great purchases. I got all 3 of these for $25. Oh, and I saw an awesome fake iPod Shuffle, at least it has a screen :).

Pattern Recognition, Blink, Science Fiction Treasury
fake iPod Shuffle

 
