iListen 1.7 – The Search for the Speech Recognition "Holy Grail" and the Power of Voice!
reviewed by Robert Pritchett
MacSpeech, Inc.
50A Northwestern Drive, Salem, NH 03079
1-504-628-3640, FAX: 1-928-222-7978
Released: This version, June 2006.
$150 USD with USB headset, $179 USD with Pro Express headset, Transcription Solution for $269 USD or Transcription Solution Professional 4000 for $549 USD.
http://www.macspeech.com/store/
Also available in Australian/New Zealand English, German, Italian and Spanish.
HelpDesk: http://macspeech.crmhelp.net/
For anyone interested in "hands-off" computing who would rather speak than type.
Requirements: Mac OS X 10.3 or later; 512 MB RAM (the more the better); high-quality noise-canceling mic. And the ability to speak.
Strengths: Runs in all apps, offers voice commands for mouse control, and can learn inside almost any app. Ubinaried.
Weaknesses: None found. Okay, one: built-in transcription from a file was removed (because folks figured it could also do interviews and speeches). It is coming in TranscriptionPaks soon and will be made freely available to current users of Transcription 1.6.8. And perhaps the Corrections function could be smoothed out a little.
Running on an iMac G5 (1.8 GHz, 1 GB DDR SDRAM) with Mac OS X 10.4.6 Tiger and either an AVID mic or VXI Parrot Pro mic.
Interview – MacVoices: http://www.macvoices.com/archives/2006/660.html
Other Reviews: iListen Rocks! And we discussed it in episode #61 of the Mac ReviewCast at http://www.macreviewcast.com
PakPlace: http://www.115volts.org/pakplace/
I've been beta-testing the latest version of iListen and have been impressed with the behind-the-scenes work that has made this the best Speech Recognition program for Mac OS X. Its strengths lie in three areas that improve functionality and get us closer to a speech-only way of running a Mac: TalkAnywhere (dictate anywhere keyboard entry is done); CorrectAnywhere (correct misrecognized words inside the app you are working in instead of having to go back to iListen); MouseAnywhere (mouse controls by mouth).
Its nearest competitor is either Nuance Communications' Dragon NaturallySpeaking http://www.nuance.com/naturallyspeaking/ for the PC or ViaVoice (originally from IBM) for the Mac http://www.nuance.com/viavoice/osx/ NaturallySpeaking costs much, much more and is broken out into specialty areas. It has about 25 years' worth of development effort in it. ViaVoice hasn't been updated in a long, long time.
Good Speech Recognition has been a "Holy Grail" of sorts for the computing industry for nearly 50 years. The MacSpeech Team http://www.macspeech.com/company/people/ has been fanatical about making it work correctly on the Mac and this version is the best yet!
The key to good functionality is threefold: a good Mac with lots of RAM and the latest version of Mac OS X, a great mic that has noise cancellation, and of course, a voice. If you can't speak, you can't use this app.
The latest mic certification with MacSpeech is VXI's Parrot unit. As much as I'd like to have one mic that "does it all", I've been discovering that certain mics handle certain functions better than others do. MacSpeech has certified mics from Andrea, Plantronics, Telex and VXI. Apparently the key is that the mics work with USB ports. I have found that mics connected through the line-in port don't deliver the quality of throughput that the USB ports do. iListen has a "Set up my Microphone" function that tests connectivity, volume, gain, etc. per mic. If the mic doesn't meet the minimum, it says so. For example, I could never get the USB-based Snowball Studio mic to pass muster (tried all 3 switch settings). And the VXI Parrot mic without the VXI USB Translator also didn't get past the test.
Other Speech Recognition apps over the years have required beefy PCs, lots of RAM and months of training to get them functional. For the last few years, MacSpeech has had iListen functional within 5 minutes of training it to recognize and use your voice. There are a number of training texts to choose from, but Chuck Rogers and company are working to improve the algorithms and phoneme functions so that iListen gets the most out of as few phrases as possible. These have been both fun and challenging to beta-test using new profiles. Initial accuracy is 80%, so some keyboarding is still required. It is possible to get up to 96% accuracy by going through a few of the training modules. There is also a process by which iListen learns your writing habits for even better accuracy. The documents have to be in either plain text (.txt) or Rich Text Format (RTF) in order for iListen to analyze them, but it works!
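To put those accuracy figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes accuracy is counted per word and uses a 1,000-word document purely as an illustration; neither assumption comes from MacSpeech, and the function name is made up for this example.

```python
def words_needing_correction(total_words, accuracy_percent):
    """Estimate how many dictated words would still need fixing by hand.

    Assumes accuracy is measured per word (an assumption for this sketch).
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return total_words * (100 - accuracy_percent) // 100


# Compare the initial accuracy with the accuracy reported after training.
for accuracy in (80, 96):
    fixes = words_needing_correction(1000, accuracy)
    print(f"{accuracy}% accurate: about {fixes} of 1000 words to correct")
```

Under those assumptions, the extra training cuts the corrections on a 1,000-word document from about 200 to about 40.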
Each profile learns one voice. So if you have one machine and multiple users, you only need one license to use the program for various people. However, I think most office environments that want to be productive are going to want to get more than one license instead of dedicating one Mac to be a dictation machine, right?
By the way, iListen works with ScriptPaks. These are app-specific AppleScript macros that the folks at MacSpeech have created to enhance interoperability with iListen, and they run anywhere from $10 USD to $60 USD, depending on sophistication. Of course, if you know AppleScript, you can "roll your own", and there is a feature in version 1.7 that remembers keystrokes, but I've found that most folks have neither the time nor the inclination to go the "remember my keyboard process" route. MacSpeech made it easy. The most recent releases of ScriptPaks can be seen at http://www.macspeech.com/store/default.php?cPath=11 And there is a website that discusses ScriptPak sets at: http://www.115volts.org/pakplace/
iListen plays nice with Spotlight. It also can generate text from WAV or AIFF audio files. Perhaps with the next release of Mac OS X, Automator will play nicer.
After loading iListen and adding and training a voice profile, the next step is running the program through the floating iListen Feedback window. Then it's just a matter of picking one of the buttons for dictation, commands or spelling correction. There is a long list of things that can be done to open apps, jump to websites or edit text in your favorite text editor.
Dictation requires that you use certain vocabulary, like: period, new line, exclamation mark, comma, colon, semicolon, question mark, open parenthesis, close parenthesis, quote, unquote, paragraph, etc. It is intelligent enough to know most of the time when a new sentence begins. It is good practice not to look at the screen while dictating, because it will drive you nuts watching your voice go from analog to text as iListen tries desperately to keep up with you.
Looking at the screen is good practice, however, when navigating around: searching, selecting, correcting or switching.
When iListen encounters new words, they can be added either by creating new phonemes or by spelling them out in the What You See and What You Say fields, so that such "words" as "macCompanion" are recognized the next time.
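For readers curious how spoken punctuation might be turned into text, here is a minimal Python sketch. It is not how iListen works internally (MacSpeech has not published that); the command tables and the render_dictation function are hypothetical names invented for this illustration, and the spacing rules are my own guess at sensible behavior.

```python
# Hypothetical tables of spoken commands, grouped by how they attach to text.
CLOSERS = {  # attach to the word before them
    "period": ".", "comma": ",", "colon": ":", "semicolon": ";",
    "question mark": "?", "exclamation mark": "!",
    "close parenthesis": ")", "unquote": '"',
}
OPENERS = {"open parenthesis": "(", "quote": '"'}  # attach to the word after
BREAKS = {"new line": "\n", "paragraph": "\n\n"}   # no spaces on either side
ALL_COMMANDS = {**CLOSERS, **OPENERS, **BREAKS}


def render_dictation(spoken):
    """Convert a string of recognized words into punctuated text."""
    words = spoken.split()
    pieces = []
    glue_next = True  # True when the next piece needs no leading space
    i = 0
    while i < len(words):
        # Prefer two-word commands such as "question mark" over single words.
        pair = " ".join(words[i:i + 2]).lower()
        if i + 1 < len(words) and pair in ALL_COMMANDS:
            key, step = pair, 2
        else:
            key, step = words[i].lower(), 1

        if key in CLOSERS:
            pieces.append(CLOSERS[key])
            glue_next = False
        elif key in BREAKS:
            pieces.append(BREAKS[key])
            glue_next = True
        elif key in OPENERS:
            pieces.append(("" if glue_next else " ") + OPENERS[key])
            glue_next = True
        else:
            # An ordinary word: keep the speaker's capitalization.
            pieces.append(("" if glue_next else " ") + words[i])
            glue_next = False
        i += step
    return "".join(pieces)


print(render_dictation("Hello comma world exclamation mark"))
```

One obvious limitation of a naive table like this is that you could never dictate the literal word "period", which hints at why a real product needs more context than a lookup.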
There are such things as text macros that can go up to 32,000 characters in length as a command set, assuming you will want to create your own application-specific commands.
And if something just doesn't jibe, you can always say, "Scratch That" and begin again.
I have noticed that if I use the mouse to go to the dropdown menus, the Feedback window becomes inactive until the dropdown menus go away. iListen doesn't want to get confused. The mic in the Feedback window won't turn on until it is moused first, so complete "hands off" will need a few more AppleScript functions added, at least to my machine, if I want to navigate my Mac with my hands tied behind my back.
There is an 18-page Tutorial for version 1.7 of iListen that is well written.
As the next generation (our children) becomes more adept at multi-tasking, they will perhaps learn to operate Macs the way helicopter or hovercraft pilots learn to use all their senses to run those sophisticated machines, using feet, hands, eyes, ears and mouths. With iListen, we can become more proficient with our Macs by combining the power of the Mac with the ingenuity of MacSpeech, not just using eye-hand coordination with screen, mouse and keyboard, but also using our mouths to get work done more quickly and efficiently. This isn't just a boon to those who have physical challenges, but to all who want to make the Mac experience even better. As Chuck Rogers said, "Anybody who uses a keyboard is a candidate for iListen".
Go listen to Chuck Joiner's interview of Chuck Rogers on MacVoices: http://www.macvoices.com/archives/2006/660.html
As I get older, my hands seem to suffer from dyslexia (letter transfers from the wrong fingers). As I learn to use iListen more and more, my dependency on the keyboard will become less and less, and my dyslexia will not be such a burden as before. That alone is worth the cost of this "way cool" enabling app!