On my work, technology and related stuff....

Archive for October 2011 | Monthly archive page


Like the million-plus people who pre-ordered their iPhone 4S, I anxiously waited for my shiny new phone. And like most of them, I was most curious about Siri, the smart new voice recognition system unique to the iPhone 4S. I’ve not had much luck with voice recognition or voice activation systems in the past. Even rudimentary voice-control systems like a voice-activated dialer have been quite frustrating, to say the least. I was interested in how well Siri would fare with my accent (quite frankly, I don’t know what accent I have; it’s a blend) and how fast Siri would respond to my commands. I also wanted to know whether Siri cares about bad (English) grammar (not that I have it!).
Voice and natural language processing are highly resource-intensive, so the voice recognition features found on resource-constrained devices rely on servers in the cloud to do the processing. This means that network latency has to be factored into the response times. So I was curious to see whether Siri works in offline mode, for if it did, that would be stupendous!!
 
So when my phone arrived today, Siri was pretty much the first feature I checked out.
OK, so not surprisingly, Siri requires an Internet connection. It sends your commands to servers in the cloud (iCloud?) to do the processing. As a result, the response times vary depending on the speed of the connection.
 
And it responds in a woman's voice. I'm not sure why that's the case, but most voice command systems, including my GPS, do. (Update: I recently found this article that might have some answers on this topic: http://edition.cnn.com/2011/10/21/tech/innovation/female-computer-voices/)

 
First, I started off with the “typical” questions. Here are a few of them:
 
Me: “I am thinking of lunch at 12:00 today”
Siri: I found 16 restaurants whose reviews mention lunch. …14 of them are fairly close to you.
 
Me: “I have to remember to pick up my daughter at 5:30PM today”
Siri: What time shall I remind you?
Me: 5:30 PM
Siri: Shall I set the reminder for 5:30 today?

Me: What are new movie releases in the US?
Siri:  I found a number of movie theaters in Washington
 
And it provided me with a list of movie theaters. But why Washington?? That’s nowhere close to where I live!
 
Overall, it did a decent job of understanding my accent. In fact, it did something which most of my American friends never do: it pronounced my husband's last name (17 letters long) correctly!! It won brownie points right there!
 
It was not perfect. A couple of times it didn’t recognize words accurately, but it still understood the intent, which is probably the more important part.
The Artificial Intelligence system relies on certain keywords to accomplish its tasks, for instance “Remind” or “Remember” for setting up reminders. So a command like “I have to pick up my daughter at 5:30 today” does not trigger a response to set up the reminder. Instead, it responds, “I don’t know what you mean by ‘pick up my daughter at 5:30 today’”. Since it leverages information about your contacts, location, address book, etc. to do the processing and customize the results, the accuracy of responses may vary.
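To make the keyword idea concrete, here is a tiny sketch of what keyword-triggered intent matching could look like. This is purely my own illustration and not how Siri is actually implemented: the `INTENT_KEYWORDS` table and the `detect_intent` function are hypothetical names, and the only keywords in it are the two I noticed above.

```python
import re

# Hypothetical keyword table. "remind" and "remember" are the only trigger
# words observed in the post; everything else here is an assumption.
INTENT_KEYWORDS = {
    "set_reminder": ("remind", "remember"),
}


def detect_intent(utterance):
    """Return the first intent with a keyword that starts a word, else None."""
    # Lowercase the sentence and keep only alphabetic words.
    words = re.findall(r"[a-z']+", utterance.lower())
    for intent, keywords in INTENT_KEYWORDS.items():
        # str.startswith accepts a tuple, so "reminder" also matches "remind".
        if any(word.startswith(keywords) for word in words):
            return intent
    return None


if __name__ == "__main__":
    print(detect_intent("I have to remember to pick up my daughter at 5:30PM today"))
    print(detect_intent("I have to pick up my daughter at 5:30 today"))
```

With a matcher like this, the first sentence would be treated as a reminder request because it contains “remember”, while the second one would fall through with no intent at all, which is roughly the behavior I saw.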
 
Then, as any normal person would do, I decided to test a slew of ambiguous or weird questions for the fun of it:
Me: Are you my friend?
Siri: No comment!
 
Me: I am bored. What should I do?
Siri: I’ll see what I can do to entertain you.
 
I then waited eagerly, expecting it to, for instance, provide me with a list of movies or something. It didn’t do anything.
 
Me: I am sleepy
Siri:  I hope you are not driving!
 
Siri is funny too!
 
My initial reaction is that I am quite impressed with Siri. It handles routine commands very well. Unlike previous voice command systems, this is one I will be using often. Over time, I’ll find out whether it learns and responds better. I’m not sure if it’s a good thing, but long term, I can see this becoming a companion to a loner or a sounding board of sorts. When the APIs to Siri go public, this can open up a whole range of new possibilities for the developer community. For example, an app that takes images of you in your new dress and answers the question “Does this dress make me look fat?” with the pat response “Yes Siree!!”