Janine Lodato                                All rights reserved
1700 Sand Hill Rd. #405                      Approximately 2472 words
Palo Alto, CA 94304                          March 16, 2000
(650) 329-9461 or 533-3517                   ClarisWorks
e-mail: LaGiannina@aol.com

A New Way of Thinking -- High-Tech Style

                                                       

 

By Giannina Lodato Rakoczi 

 

Finding a voice recognizable to readers is tricky.  Finding a voice recognizable to a computer is even trickier.  After fighting off a 25-year assault of multiple sclerosis (MS), my hands can no longer type an entire document.  In order to continue writing, I must rely on voice recognition technology to do my typing for me.  Senior citizens with arthritic fingers or hands understand my plight only too well.

 

     Writers who use voice recognition technology to type written documents must keep in mind that the software has two modes of operation: dictate mode and command mode.  Dictate mode is the usual method people use when speaking into the microphone.  However, it is often necessary to access command mode to make changes in a document, say, to capitalize a word or spell it.

 

     To access command mode, one must first use a cue word.  In the case of my software, I have programmed the word "computer" to act as the cue that puts me into command mode.

     All I need to say is "computer select right (or left) one word," "computer capitalize this" or "computer move right (or left) one word," and the software goes right back into dictate mode.  If I say "computer begin spell," I have accessed command mode for spelling and am ready to spell.  When finished, I say, "computer return," and the software automatically switches back into dictate mode.
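
     For readers who think in code, the switching between modes described above can be pictured as a small state machine.  The sketch below is only an illustration in Python and is not how IBM ViaVoice is actually built.  The class name DictationSession, its hear method and the way commands are merely recorded rather than carried out are all assumptions of mine.

```python
class DictationSession:
    """Toy model of dictate mode and command mode (hypothetical, not a real ViaVoice API)."""

    def __init__(self, cue_word="computer"):
        self.cue_word = cue_word   # the word that signals a command
        self.mode = "dictate"      # either "dictate" or "spell"
        self.document = []         # words typed into the document so far
        self.commands = []         # commands heard so far (recorded, not executed)

    def hear(self, utterance):
        """Process one spoken phrase and return the mode the session ends up in."""
        words = utterance.lower().split()
        if not words:
            return self.mode

        if self.mode == "spell":
            if words == [self.cue_word, "return"]:
                # "computer return" ends spelling and resumes dictation
                self.mode = "dictate"
            else:
                # letters spoken one at a time are joined into a single word
                self.document.append("".join(words))
            return self.mode

        if words[0] == self.cue_word:
            # Assumption: any phrase that starts with the cue word is a command
            command = " ".join(words[1:])
            self.commands.append(command)
            if command == "begin spell":
                self.mode = "spell"
            # every other command runs once, and the mode stays "dictate"
            return self.mode

        # ordinary dictation: the words go straight into the document
        self.document.extend(words)
        return self.mode


session = DictationSession()
session.hear("my husband")
session.hear("computer capitalize this")
session.hear("computer begin spell")
session.hear("l a s z l o")
session.hear("computer return")
```

     In this sketch a one-shot command such as "computer capitalize this" never leaves dictate mode for longer than the command itself, while "computer begin spell" holds the session in spelling until "computer return" is heard.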

 

     Sitting in a room nearby, my husband is easily frustrated when he hears me dictate a few words, then stop to change into command mode to correct the words the computer thought it heard me say.  He doesn't like the fact that I dictate only a few words at a time.

 

     One day, he got up and placed a piece of paper on my computer screen so I couldn’t see what the computer was typing.  He told me to say a few sentences at a time in a natural way of speaking.  I did it and after a paragraph or so, I took down the piece of paper and looked at what the computer had typed.  Amazingly, the machine understood my words very well and I didn't have too many corrections to make.  The lesson is, speak in a normal manner and perhaps prepare several sentences on a piece of paper so you can read them into the microphone at a normal pace.

 

       When I worked  for a California State Senator as his Correspondence Secretary, I was responsible for transcribing his dictation into written form on a computer.  I must say, his communication to me through his tape recorder was much more efficient in producing written documents than is my communication to a voice recognition machine.  Simply put, human-to-human communication cannot be beat.         

 

     I know lawyers are trying to do away with secretaries and their salaries by using voice recognition technology, but I'm not sure they have the patience required to produce quality written documents.

 

     Lots of word-training, trial, error and patience are required when working with voice recognition software.  Once I master it though, it will be a real benefit to me as I write for class.  Until I get to know it intimately, I can produce only simple documents. 

 

     Magically, the new technology knows there are multiple spellings for some words -- to, too, two -- and gives me choices for spelling in the correction window off to the side of the document.  I need only pick the correct spelling and the technology inserts it into the document for me.  It's wonderful!

 

     I’ve always found typographical errors (typos) amusing, but my new software’s typos take the cake.  However, if I’m tired or the computer makes the same mistake several times, I completely lose my humor and become frustrated, determined not to let it happen again. 

 

    Among the most comical computer interpretations are:

·         eat March for “emerge”  

·         in edit a bowl for “inevitable”

·         not see for “Nazi”          

·         loss low for my husband’s name, “Laszlo”

·         multiple skull roses for “multiple sclerosis”

·         HBO sink receives for "idiosyncrasies" 

·         skits of frantic for “schizophrenic”

 

       The microphone into which I dictate sits right in front of my mouth, jutting out from a headset with one earphone.  The microphone is so sensitive, it even translates a heavy sigh into a, of, the, or what.  A loud sneeze ("achoo") from my husband in a room nearby inspires the computer to type aha.  Words unique to my writing must be trained.  Otherwise, if I control my breathing, monitor the whereabouts of my allergy-prone husband and enunciate clearly, the computer usually understands my words perfectly on only the second attempt.  Its first interpretations are nonetheless reminiscent of two episodes in my childhood.

 

     Back in the '50s when I was a young child, I learned a song named "Mares Eat Oats."  As the words to the song go, "Mares eat oats, and does eat oats, and little lambs eat ivy, a kid'll eat ivy too, wouldn't you?"  Granted, the words do not depict anything profound; however, they are at least logical to people wanting to know what these particular animals eat.

 

     My child's ears heard instead, Mares-ie dotes ‘n does-ie dotes, ‘n little lambs-ie divey, a kiddl-ie divey too, wouldn't you?  As far as I knew, this sequence of nonsensical sounds was attached to a catchy tune and that was the song.

 

    I am also reminded of a grace we so often said before dinner, "Gracious Father, please bless this food for its intended uses."  I understood the prayer to say Gracious Father, please bless this food for its tender juices.

This is exactly the way my computer hears my voice, as a child would.  Lots of patient word-training is needed to make the machine familiar with my way of speaking -- my vocabulary and pronunciation.

 

    My handicap no doubt motivates me to find a way of expressing myself in writing other than by typing.  As long as I am able to talk, voice recognition technology offers me a mode of communicating never before available to people in my position. 

 

     The technology I use is called IBM ViaVoice for Mac and it has increased my productivity 10-fold.  Imagine what it could do for businesspeople!  Thanks to this wonderful new technology, I can now finish a book I’ve been writing in my word processing software.  My e-mail is greatly improved and I know I've only begun to tap into the wonders of voice recognition.  I still need to know how to spell, but the machine types and spells very well on its own.  As I have said before, I need to check its accuracy in choosing to, too and two or by and buy.  The technology isn't perfect, but it does increase my productivity.

 

     At times, voice recognition even helps me spell.  When I wanted to write the word "onslaught," I sincerely did not know if the word was written with an “a” or an “o” in the middle of the word.  All I had to do was say the word into the microphone and the computer typed it for me on the screen.  I now know the word is written with an “a” and I am sure that is correct because I believe the programmer writing this software used a dictionary when

 

     Another word whose spelling I questioned was "consummate."  How many of the letter “m” should I use, one or two?  Again, all I had to do was say the word into the microphone and voila!  I was sure the spelling with two letters “m” was correct.

 

     Fascinated with my new voice recognition technology, I am compelled to spend as much time as possible with it, learning as much about it as I can.  In spite of my handicap, I am able to produce documents I can be proud of.  I predict that before long, everyone in the computer industry will opt for voice recognition over keyboarding.  It is the wave of the future, well worth the $80 software cost and the time and effort required to learn it.

      

      Given all the changes I must make in a document produced with voice recognition software, I can see the technology is still in its infancy.  In spite of this, I find it magical, wonderful and definitely worth the effort needed to learn and adjust it.  

 

      For someone with my handicap the technology is a dream and I can only encourage those working on it and hope improvements are made quickly and smoothly. I suppose when computers are upgraded to have higher speeds and more main memory (RAM), voice recognition will improve.  In the meantime, I need only be patient.  

 

     Because I have been teaching English to foreigners for the last 20 years, I know written English grammar better than most Americans -- maybe, just maybe, even better than the American who typed the program for this voice recognition software.  That person is an expert in computer programming, not in English writing.

 

      When we talk about particular decades, say the '60s or '70s, the apostrophe should be placed in front of the first number.  This prevents redundancy of the number 19, as in 1960s and 1970s.  Because we are talking of 10 years, we need to use a plural form tacking on an s.  Hence, the short written form of a particular decade looks like this:  '60s, '70s, NOT the way the software has it written, 60's and 70's.

              

     Another programming idiosyncrasy I spend a lot of time adjusting to my liking has to do with spaces placed between quotation marks and the text being quoted.  I suppose it is just a matter of taste, but I spend a lot of time eliminating spaces.

 

      When I sit in front of my computer, I feel completely naked unless I have my headset for voice recognition sitting on my head.  Once connected to my new technology, I feel connected to the world via e-mail and word processing.

 

      I never know if the word I want to use is already in the vocabulary or not.  When I wanted to use the word "marzipan," the computer first gave me Mars see pan.  After I got over the giggles, under my breath because the microphone picks up every little noise, I tried again a couple of times and the computer finally typed the right word on the screen.  I was shocked to learn “marzipan” was actually in the vocabulary.  You never know until you try.       

 

     My new software recognizes my voice best when I speak in half and full sentences.  Problem is, it's hard for me to be so organized in my thinking.  If the software mistakes one of my words for something else, I need only call up the correction window, which appears off to the side of my document, and pick the correct one from the alternatives the computer thinks it may have heard me say.  Once I choose the correct alternative, the software automatically substitutes it for the mistaken word.  It is magical!

 

     If my new software does not provide me with the correct alternative, I break down and just spell the word I want using command mode and the rules of voice recognition spelling.  I often resort to spelling just because I lose my patience trying to get the machine to recognize what I have said.

 

     My husband tells me I am like a parent spoiling a child when I fail to teach my new software the right spelling of the words I say.  For example, whenever I begin a letter or an e-mail with the salutation "Dear" the computer insists on typing "Der."  Apparently, I need to teach it the proper spelling by calling up the correction window and using the choices it gives me, or by typing the proper spelling as I know it.

 

     By using voice recognition, I am able to write letters that stand up to the corporate injustices perpetrated on Americans.  With a little help from my husband, I am able to write a scathing letter to the proper authorities and produce satisfactory results.

 

     Half the world's population is either handicapped or helping the handicapped.  Because of this, voice recognition technology is a real boon to the handicapped population if they are at all computer-savvy.  Fortunately for me, my husband is responsible for getting computers started in America, so he is quite computer-savvy.  Even though I represent that half of the world which is handicapped, voice recognition technology opens up the whole world to me.

 

     It has even broadened the scope of my marital situation.  Because my husband thinks so much like a computer, my understanding of the new technology contributes to the relationship between my husband and me.  The more I get to know my new software, the less I rely on the typing skills of my Hungarian husband.  He never fails to believe that whatever I say in writing can always be better said in HIS words.  When I exercise ample patience, voice recognition technology in combination with my husband's editorial input produces written documents I can be proud of.  Without patience, I'm sunk.

 

     When I first wrote this article in mid-March, 2000, three major companies produced voice recognition software: IBM ViaVoice, Dragon Systems and L&H from Belgium.  Dr. and Mrs. Baker of MIT developed Dragon Systems 20 years ago, forming one of the original research projects of the industry concentrating on voice software. 

 

     The Bakers’ research provided the basis for IBM ViaVoice and Dragon Systems’ NaturallySpeaking.  Another company named Kurzweil in New England provided the basis for L&H’s voice recognition software.

 

     After sitting on the article for a couple of weeks, I saw an industry update was in order.  That's how fast things move in the high-tech industry.  I only hope the biotech industry moves as quickly.

 

     I recently learned L&H in Belgium bought Dragon Systems, giving them an advantage in the already-limited market of voice recognition technology.  This meant only two major companies dominated the field of voice recognition: IBM ViaVoice and L&H.

 

     Yet another industry update is in order just nine months after L&H first bought Dragon Systems.  By December of the year 2000, L&H filed for bankruptcy in Belgium, leaving the field of voice recognition technology to IBM ViaVoice only.

  

     Considering both IBM ViaVoice and Dragon Systems' NaturallySpeaking, each software program is equally difficult or easy depending on the user.  IBM ViaVoice for Mac was my choice because I have an iMac, Special Edition.  The other software system runs only on personal computers (PCs).  I can do both e-mail and word processing with IBM ViaVoice. As time goes by, I will become more and more adept at using it.

 

      Writing is different when you dictate your thoughts instead of typing them out.  You must have everything organized in your mind before you open your mouth and tell somebody else how to put it down on paper.  It will be difficult for me to be so organized from the start.  It's just a new way of thinking, that's all.  What a dream to have someone else type for me. 

 

     IBM ViaVoice allows multiple users, something different for voice recognition programs.  It will be fun to watch my husband with his thick Hungarian accent try to use this program.  If I think the computer has problems understanding my English, I can't wait to see how it responds to my husband's English.  Until now, my husband has been my voice recognition machine.  But he talks back and I don't appreciate that.  Besides, his spelling is atrocious, living proof one doesn't have to know English well to make money in Silicon Valley.


I look forward to finding a voice that doesn't give me any lip.