Kistaro Windrider, Reptillian Situation Assessor

Unfortunately, I Really Am That Nerdy

Start Semagic; click login; go to Subject; Dear Aunt, let's set so double...
kistaro
I configured Voice Recognition on Vista and then used it for most of my Parallel Architecture homework. Despite my only using minimal training instead of full training, it's a lot more accurate than XP recognition was; this is actually really usable, while XP required so much correction it wasn't worth the trouble. I really like not having to switch between Dictation and Command modes, although I'll have to look up a workaround for when what I want to write would be otherwise recognized as a command. (Although Vista figures it out in most cases by my tone of voice, which is a remarkably clever bit of code. I've tested this- say a command word, but change the inflection, and it will change whether it dictates it (in Word) or performs the command.)

Unfortunately, I've gotten slightly too used to its voice recognition, in a very short amount of time. At least when I'm in that mindset. Voice recognition is really convenient for me with how my computers are set up; it's not that convenient to type on my Tablet PC given how my desktop takes up my entire desk space (with its two monitors), so I probably actually will be using it; it's convenient to have homework in one computer and references and the assignment in the other. (And by keeping my homework on my Tablet, I take it with me to class- if I forget to print out an assignment, I can switch the most recent copy to a jump drive and print it off a lab computer with relatively little trouble.) But getting used to voice recognition can be a problem. Let's go back a few hours to when I decided I wanted a hard copy of the general voice command reference.

"What can I say?"
ding!
"Maximize that."
ding!
"Show all."
ding!
"Print."
doot?
"1." (It wasn't sure whether I meant Print or Print Preview, so it labeled the buttons with numbers to ask for clarification.)
ding!
"OK." (Numbered commands have to be confirmed.)
ding!
"Print."
doot?
"2." (This was ambiguity between Print and Print Settings.)
ding!
"OK."
ding!
*whirrr... shoof, shoof, shoof* (well, what do you think a laser printer sounds like, if not whirr shoof shoof shoof?)
*rustle rustle*
"Staple that."
blat?
"Staple that."
blat?
"St- wait, never mind."
blat?
"Stop listening."
donk!
*kchunk*

Maybe I need to implement voice recognition on my stapler next.

My laser printer (HP 1320n) sounds more like "whrrrine chucka chucka fwoosh kachunk." Maybe it speaks a different dialect?

Somehow, I had to read this a couple times before I noticed the joke.

WHAT? You ganked my post topic of the night. But I'm using Dragon 9. :>

Telepathic intellectual property theft! *waves fingers spookily*

I find it mildly amusing that the one of us who does not seem to be a dragon is the one using speech recognition software by that name. Then again, I am easily amused.

There exists a picture of me with a dragon alt. *nod* Somewhere. *whistles*

I want one of the ooooold silk banners that Dragon used to give their VARs in the '90s. Oh my, those were pretty. Pity their software sucked compared to VoiceType Dictation/2 at the time. IBM just owned that market utterly.

I remember VoiceType! OS/2 was my family's operating system of choice then. I remember it working very, very badly. Then again, my age was in the single-digit range at the time, and I know I didn't have the patience to do the Kirk-like   split   word   articulation    required   for   it   to   figure   out   what   I   said. I mean, you've observed how quickly and incoherently I naturally talk, and that doesn't compare to what was required back with speech technology then! The hit rate for Vista is, relatively speaking, absolutely phenomenal.

It definitely didn't help that I had a speech impediment. Most people had trouble understanding me, so it's not a surprise that it was basically a lost cause for my computer. Apparently, VoiceType cannot adapt to someone who nasalizes the letter-sound "R". (If you try to figure out how to nasalize an R- if it feels like you're choking, you're on the right track. At the time- and it took years to really learn otherwise- I thought that was correct.)

And, of course, yay for dragons. But then, I'm biased... (*is one of those scary insane spiritualist people*)

"Maybe I need to implement voice recognition on my stapler next."

That would be keen. Then when I get asked by 328420394 students every day do we have a stapler, or where is the stapler, it would just zoom up and present itself to them.

Implementing the hoverjets for the stapler so that it could in fact "zoom up" is left as an exercise for the reader.

Ignoring automatic activation of the stapler, semi-automated stapler retrieval can be done with floor decorations and a standard LEGO Mindstorms set. A line-follower that detects and counts intersections can follow a black grid (flanked by white so the light sensor can see when it's gone off track, with intersections marked with white spots) to any destination in a regular grid of desks- or any intersection, should some other arrangement be more practical.

Voice activation can be done by a computerized base station, although in the non-quiet environment it would likely be in, it would probably have too many false alarms to be practical. Individual call buttons at possible stapler-retrieval points would work better.

All things considered, hoverjets would probably be more practical.

...but darn it, don't little plastic robots have more style?
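For the curious, the intersection-counting part of that line-follower can be sketched in a few lines of Python. This is only a toy simulation, not actual Mindstorms code: the three reading names and the `drive_until` function are made up for illustration, and it assumes something upstream has already turned raw light-sensor values into "on the line", "off track", or "on an intersection spot".

```python
# Hypothetical, pre-classified sensor readings (names invented for this sketch).
ON_LINE = "on_line"            # sensor sees the black line
OFF_TRACK = "off_track"        # sensor sees the white flank beside the line
INTERSECTION = "intersection"  # sensor sees a white spot marking a crossing


def drive_until(readings, target_count):
    """Follow the line until `target_count` intersections have been reached.

    Returns the list of actions taken. Consecutive INTERSECTION readings
    count as one intersection, since the marker spot has some width.
    """
    if target_count <= 0:
        return ["stop"]
    actions = []
    seen = 0
    on_spot = False
    for reading in readings:
        if reading == INTERSECTION:
            if not on_spot:
                # First reading of a new spot: count it once.
                seen += 1
                on_spot = True
                if seen == target_count:
                    actions.append("stop")
                    return actions
            actions.append("forward")
        elif reading == OFF_TRACK:
            on_spot = False
            actions.append("steer_back")
        else:
            on_spot = False
            actions.append("forward")
    return actions


readings = [ON_LINE, INTERSECTION, INTERSECTION, ON_LINE,
            OFF_TRACK, ON_LINE, INTERSECTION, ON_LINE]
print(drive_until(readings, 2))
```

Getting to a desk at (column, row) would then be two of these runs with a turn in between. Telling a white spot on the line apart from the white flank with a single light sensor is the hard part, and this sketch skips it entirely.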

*tap tippity tap*
Ding!
*tap*
Ding!
*click*
*tap tippity tip tap... tap*
Ding!

I was going to try to dictate what printing something does, but it just goes *whrrrr* with no end in sight. Time to fix my printer...

I fear it wouldn't be very useful for my essays, as it won't know a lot of scientific words (judging by Word's dictionary, at any rate). I probably type faster than I talk, too.

I think the largest usability problems they have, from when I've used them before, are to do with punctuation, though. Do you have to say "Open quotes stop listening period close quotes" or the like? How about "It wasn't sure whether I meant Print capitalise that or Print capitalise that Preview capitalise that comma..."?

I also wonder if they localise for non-Americans...

I type and talk at about equal speeds- I talk slightly more quickly naturally, but waiting for the recognizer slows me down. The advantage is in the way my computers are set up on my desk- in a word, clumsily; there's no way to set them up so I can comfortably type on my laptop while still being able to move the mouse of my desktop computer. (This is a full desk. College dorms are like that.) Voice recognition makes that much less of a problem.

Yes, punctuation is inconvenient. You are almost verbatim correct about the punctuation- except I added a few "literal" in there to make sure it didn't interpret my command as a command. ("Literal" is the escape, forcing it to interpret the next word as dictation; "literal comma" is required to get "comma" instead of ",".)
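A minimal sketch of how an escape word like that behaves, in Python. This is not how Vista actually implements it; the `transcribe` function and the two-entry punctuation table are assumptions made up for illustration. The only behavior taken from the above is that "literal" forces the next word to be dictated as-is.

```python
# Hypothetical subset of spoken punctuation; the real recognizer knows far more.
PUNCTUATION = {"comma": ",", "period": "."}


def transcribe(words):
    """Turn a list of recognized words into text, honoring the 'literal' escape."""
    out = []
    force_literal = False
    for word in words:
        if force_literal:
            # The previous word was "literal": keep this word exactly as spoken.
            out.append(word)
            force_literal = False
        elif word == "literal":
            force_literal = True
        elif word in PUNCTUATION:
            # Attach the symbol to the preceding word, if there is one.
            if out:
                out[-1] += PUNCTUATION[word]
            else:
                out.append(PUNCTUATION[word])
        else:
            out.append(word)
    return " ".join(out)


print(transcribe("say literal comma to get a comma".split()))
```

Note that "literal literal" is how you would get the word "literal" itself in this sketch, since the escape applies to whatever comes next.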

I know there's a setting that would understand you, although other localizations aren't too far along- other than a few variants of English, you've got Japanese and Chinese. Guess what countries the original researchers were from...

It does have a really clever feature to dramatically enhance recognition: if you let it, it will read all your e-mail and documents and look for words it doesn't know. It will tentatively add them to the dictionary as possible things to recognize, with unknown pronunciation or part-of-speech. This is how it got the words "draconity", "draconid", "Otherkin", and "motherfucker" right the first time I said each of them.

Obviously, it can also be explicitly taught new words. If it misrecognizes something and doesn't list the right word as an alternate, and saying the word again doesn't bring it up as an option, I can spell it; if it still doesn't know the word, it will add it to the dictionary, explicitly asking about capitalization and for a very clear reference pronunciation to maximize the chance of recognition.

Inconvenient exactly once, but then it tends to remain correct!
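The document-scanning trick described above is easy to imagine in miniature. Here is a rough Python sketch, assuming nothing about how Vista really does it: `harvest_candidates` and the entry fields are hypothetical names, and the "known words" set stands in for the recognizer's real dictionary.

```python
import re


def harvest_candidates(documents, known_words):
    """Collect words the dictionary doesn't know, as tentative entries.

    Each candidate keeps its first-seen spelling and leaves pronunciation
    and part of speech unknown (None), to be filled in later.
    """
    candidates = {}
    for text in documents:
        for word in re.findall(r"[A-Za-z']+", text):
            key = word.lower()
            if key not in known_words and key not in candidates:
                candidates[key] = {
                    "spelling": word,
                    "pronunciation": None,
                    "part_of_speech": None,
                }
    return candidates


docs = ["Draconity is a word.", "Otherkin and draconity"]
known = {"is", "a", "word", "and"}
print(list(harvest_candidates(docs, known)))
```

The tentative entries only widen what the recognizer is willing to guess; the explicit teaching step is still what pins down capitalization and pronunciation.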
