Tokyo PC Users Group

wHo's HeaRING thINgs?

by Kurt Keller

Talking to your computer?

Many people talk to their computers, or rather shout at them. Usually this happens when something does not work as expected. Recently I took a different approach: I started telling one of my machines what I wanted it to do. Did I succeed? Read on.

Get it

It was the Speech Recognition feature of OS/2 4.0 that I was most curious about. Having worked with OS/2 2.0 and several versions of 3, I had a good idea of what to expect from the latest release. With 4.0, there's no need to decide on one version or even buy several. 3.0 came in packages with and without support for Windows applications, and in versions with and without network support. But 4.0 is a "one package has it all, install the components you need" product.

Originally I wanted to order it from the USA, but was told "Sorry, we can't ship OS/2 to abroad any more." Well, contacting the Japanese distributor was the next step. However, 40,000 Yen for the US version? That's quite tough. Having three previous packages and all of them being full versions, I sure was eligible for an upgrade instead. Bad news: the Japanese distributor only sells full versions.

That was last year, shortly after 4.0 had been released. In December, I finally got my copy. A fellow who had ordered the UK version received the US package instead and ended up with two copies, so I bought his unwanted US copy. At that time I knew it would be at least a couple of weeks before I was able to install it; a couple of Win95 and Unix systems needed to be installed and configured first.

Install it

My test machine has a SCSI hard disk, installed as a second drive, and a removable SCSI JAZ disk for booting various OSs. This works fine with DOS/Win31/Win95/WinNT/FreeBSD. However, whatever I did, OS/2 would not recognize the JAZ drive during installation. It is possible to have it recognize the JAZ once the system is installed, but there is no way to install to or boot from the JAZ. Finally I resorted to adding a 500 MB IDE drive to the system. Once that was done, the install went fine. No hiccups, nothing.

The NE2000 is the network card that just about every other network card is compatible with. If you don't have a driver for your card, find out the IRQ and address it uses and then use an NE2000 driver. This is also what I usually do. However, OS/2 does NOT have a generic driver for the NE2000! Luckily, none of my "no name and no or only few drivers" cards was in this machine. Installing drivers for the Etherlink III was a snap, but trying to get an unsupported card without drivers to work may well fail.

Installing everything, including Network utilities, VoiceType (Speech Recognition), Java etc., you'll need a couple of hundred MB. IBM says 100-300 MB of hard disk space (200 for the default installation), Intel 486/33 or higher (Pentium/100 when using Speech Navigation/Dictation) and 12-16 MB RAM (8-12 MB additional for Speech Navigation/Dictation).
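The memory figures quoted above can be combined into a quick check. The following is a minimal Python sketch, assuming the speech figures are simply added on top of the base requirement (as the word "additional" suggests). The names MBRange, ram_needed and meets_minimum are made up for this illustration and are not part of OS/2 or any IBM tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MBRange:
    """A low/high range in megabytes (hypothetical helper type)."""
    low: int
    high: int

    def __add__(self, other: "MBRange") -> "MBRange":
        # Adding two ranges adds their lower and upper bounds separately.
        return MBRange(self.low + other.low, self.high + other.high)


# Figures as quoted from IBM in the article.
BASE_RAM = MBRange(12, 16)          # OS/2 4.0 without speech
SPEECH_EXTRA_RAM = MBRange(8, 12)   # additional for Speech Navigation/Dictation


def ram_needed(with_speech: bool) -> MBRange:
    """Return the quoted RAM range, optionally including speech support."""
    return BASE_RAM + SPEECH_EXTRA_RAM if with_speech else BASE_RAM


def meets_minimum(installed_mb: int, with_speech: bool) -> bool:
    """True if the installed RAM reaches the low end of the quoted range."""
    return installed_mb >= ram_needed(with_speech).low


if __name__ == "__main__":
    print(ram_needed(with_speech=True))
    print(meets_minimum(16, with_speech=True))
```

Under that assumption, a machine that wants Speech Navigation or Dictation should have somewhere between 20 and 28 MB of RAM, so a 16 MB system that is comfortable for plain OS/2 4.0 falls short once speech is added.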

Using the Speech Recognition features keeps the hard drive quite busy, so you'll probably want to have a fast drive.

OS/2 can be installed to a partition other than your C: drive. Be warned, however: during installation it will create a directory \IBMINST\ on your C: drive. After installation, you can safely delete this directory again.

Aladdin's ghost in the computer

Well, time for some first hands-on experience with Speech Recognition. Quite honestly, even though I was very interested in it, I did not expect to actually use it much. And I did not even expect it to be all that usable. But I do believe that in a couple of years, we'll be using this on a regular basis.

Ten years ago, big banks in Switzerland invested heavily in hardware that was able to read digits from payment slips. There were two special fonts for this, OCR-A and OCR-B; anything else was not recognized.

Five years ago, you could spend a fortune on systems that attempted to convert scanned or faxed documents into editable text. And conversion was not all that reliable.

Today you can get reasonably priced and reasonably reliable software for converting faxes and scanned pages to text. And depending on what kind of and how much typing you do, a scanner and some good OCR engine can save you much work.

I expect Speech Recognition to follow a path similar to that of OCR. Today's Speech Recognition technology probably stands at about where OCR was five years ago. In a couple of years, while taking off your coat at the office, you can tell your computer "Get my e-mail, show me today's to-do list and download today's news from Yahoo's webpage." Hey, this will allow you to sleep five minutes longer in the morning!

Today it's not yet this simple. Many simple commands are recognized without problems: close - left - right - jump to desktop - open drives - up - down... But some things are still just beyond what the ghost in my lamp (err, computer) can understand. Whether it is due to my accent (I am a non-native speaker who learnt UK English rather than American) or something else, "cancel," for example, simply won't be recognized, no matter whether I pronounce it in UK or American English. Not even training the word yielded any better results.

Navigation by voice is quite possible, albeit slow and sometimes troublesome. The keyboard is still by far the fastest way to get around on your computer.


There are two types of dictation in OS/2 4.0: normal and spelling. With normal dictation, upon the command "begin dictation" a window pops up and you can start dictating a letter to your "secretary in a box." I wonder whether they will improve the interface, so you can choose blonde, brown, red; long hair, short hair... instead of simply another open window. :) While you speak, the words appear in the dictation window, and as you go on, previous words may change again, depending on context. Obviously, quite some research and logic has gone into this. But what about the quality of such input? Hmm, maybe I should ask IBM whether they have a version for non-native speakers (or invest some more hours in training the software and myself).

The other variant is spelling words letter by letter. There is a version of Netscape Navigator for OS/2 (hopefully IBM ships the next version with Netscape only and abandons WebExplorer), which is what I used to try spelling, by entering URLs by voice. Most letters work fine, but once in a while something is completely wrong and you have to say "backspace." Accessing .com sites was a big problem, though. "n" and "m" simply sound too similar.

Speech Recognition, is it useful?

I'm a fast typist, much faster at the keyboard than I am with the current dictation tools. But things are improving. And don't forget, even today's technology, which is still quite some way from perfect, can be a godsend for people with disabilities.

If you are looking at Speech Navigation only, then we're already close. Some more improvement and you can control most of your computer's functions by voice. Or if you hook up your TV and VCR to the computer, you don't need to worry any more if the remote control unit can't be found. (Just hope none of the characters in a film have something like "Stop the video recorder and turn off the TV" in their script.)

Going slim

Installing OS/2 4.0 on my test machine was no problem; the resources are sufficient. But what about slimmer environments? I have a couple of 200 MB disks for my notebook, and on my main system there is an OS/2 3.0 installation, without Windows and network support. The whole OS/2 system resides on a 70 MB partition, with a further 30 MB for miscellaneous things and the remaining 100 MB for my DOS work stuff. What I wanted was Windows support, so I could use Eudora, and TCP/IP networking, for doing internet work from the notebook using the LAN, while still keeping my DOS environment. It was clear that fitting all this into 70 MB would be impossible. So I tried a network installation to a 100 MB partition. Had it not been for the more than 20 MB the TCP/IP utilities require, I could have succeeded. In the end I deleted things I did not really need from the DOS partitions, so that 150 MB could be assigned to OS/2. Installation was tight, but successful. Once it was installed, deleting help files and utilities I did not need gave me an installation of OS/2 4.0 with Windows support and TCP/IP networking, occupying just over 100 MB and working reasonably well on my 486/33 notebook.

I'm keeping it

After I had worked with the new setup for a couple of days, the hard disk died. (Well, this disk had given me some errors before and I don't think the murderer was OS/2.) My notebook can't handle hard disks greater than 540 MB, but the smallest drive available was 820 MB, so I bought that and reinstalled OS/2 4.0, this time with a luxurious 200 MB for DOS and another 100 MB dedicated to the Windows part. Even though the BIOS won't recognize the hard disk's real estate over 540 MB, I can still use it under OS/2. (If I had known, I'd have bought an even bigger drive.)
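The article does not say where the notebook's limit comes from. A minimal sketch, assuming it is the classic ceiling of older PC BIOSes that address disks by cylinder, head and sector (at most 1024 cylinders, 16 heads and 63 sectors per track, with 512-byte sectors), shows the arithmetic in Python. The function name chs_limit_bytes is made up for this illustration.

```python
# Assumed geometry limits of the classic BIOS/IDE CHS addressing scheme.
MAX_CYLINDERS = 1024
MAX_HEADS = 16
MAX_SECTORS_PER_TRACK = 63
BYTES_PER_SECTOR = 512


def chs_limit_bytes(cylinders: int = MAX_CYLINDERS,
                    heads: int = MAX_HEADS,
                    sectors: int = MAX_SECTORS_PER_TRACK,
                    sector_size: int = BYTES_PER_SECTOR) -> int:
    """Largest capacity addressable with the given CHS geometry, in bytes."""
    return cylinders * heads * sectors * sector_size


if __name__ == "__main__":
    limit = chs_limit_bytes()
    print(f"{limit} bytes")
    print(f"{limit / 1_000_000:.1f} MB (decimal)")
    print(f"{limit // 2**20} MiB (binary)")
```

The result is 528,482,304 bytes, which is about 528 MB in decimal units or exactly 504 MiB. That is close to, but not the same as, the 540 MB figure given in the text, so treat the connection as a plausible explanation rather than a confirmed one. An operating system that talks to the drive with its own driver instead of going through the BIOS is not bound by this geometry, which would fit the observation that OS/2 can use the space beyond the limit.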

The OS/2 3.0 installation on my notebook was mainly used as a superior task switcher for DOS. Now, with 4.0, I can do almost all of my work on the notebook, as I have gained support for Windows 3.x programs (including Win32s) and can access the internet and my Unix machines from the notebook via the LAN, all under OS/2 without rebooting. Some keyboard issues present in 3.0 have also been resolved. The only problem so far is that the screen of a full-screen command line session is occasionally distorted when a Windows application starts. However, resetting the screen with a simple "mode" command brings it back to normal.

Personally, I find OS/2 much more stable than Win95. It has never locked up the whole machine when one process went crazy or when I hung a DOS session during some software debugging, while Win95 does this regularly. With Win95 I often find myself rebooting the machine, but in OS/2 I simply switch to the desktop, get the list of running programs and close the offender, then go on working.

© Algorithmica Japonica Copyright Notice: Copyright of material rests with the individual author. Articles may be reprinted by other user groups if the author and original publication are credited. Any other reproduction or use of material herein is prohibited without prior written permission from TPC. The mention of names of products without indication of Trademark or Registered Trademark status in no way implies that these products are not so protected by law.

Algorithmica Japonica

June, 1997

The Newsletter of the Tokyo PC Users Group

Tokyo PC Users Group, Post Office Box 103, Shibuya-Ku, Tokyo 150-8691, JAPAN