Project - Voice-Activated eBlock
Team Members: Eric Frohnhoefer
CS 179J: Senior Design Project in Architecture / Embedded Systems
Project Complexity and Design Description:
We want to create a voice eBlock that can interface with the rest of the eBlock set. This eBlock would take voice commands from either any user or a specified user, depending on the mode selected. In normal mode, the block would listen for a specific two-word phrase from any user. The first word would be its “name”, which is programmed by the user or set through DIP switch settings. Once the device hears its name, it awaits a command: either “On” (device outputs a “Yes”) or “Off” (device outputs a “No”). Because each device is named, the user does not need to worry about multiple voice eBlocks interfering with each other. All of this is done with no training and no need for a PC.
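The name-then-command behavior of normal mode can be sketched as a two-state machine. This is only an illustrative sketch, not the final control software: the word IDs, the `voice_block` struct, the two-bit DIP switch mapping, and the function names are all hypothetical, and a real recognizer IC would report its own word codes. The sketch also assumes that any word heard after the name ends the phrase, and it leaves out a timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical word IDs; a real recognizer IC would define its own codes. */
enum word { WORD_NONE = 0, WORD_ON = 1, WORD_OFF = 2, WORD_NAME_BASE = 16 };

enum listen_state { WAIT_FOR_NAME, WAIT_FOR_COMMAND };

struct voice_block {
    enum listen_state state;
    uint8_t name_id;  /* word ID this block answers to */
    bool output;      /* true = "Yes", false = "No" */
};

/* Assumption: the low two DIP switch bits select one of four preset names. */
static void voice_block_init(struct voice_block *vb, uint8_t dip_switches)
{
    vb->state = WAIT_FOR_NAME;
    vb->name_id = (uint8_t)(WORD_NAME_BASE + (dip_switches & 0x03u));
    vb->output = false;
}

/* Feed one recognized word to the block; returns the block's current output. */
static bool voice_block_step(struct voice_block *vb, uint8_t word)
{
    if (vb->state == WAIT_FOR_NAME) {
        /* Commands are ignored until the block hears its own name. */
        if (word == vb->name_id)
            vb->state = WAIT_FOR_COMMAND;
        return vb->output;
    }

    /* WAIT_FOR_COMMAND: the word after the name decides the output. */
    if (word == WORD_ON)
        vb->output = true;
    else if (word == WORD_OFF)
        vb->output = false;

    /* Any word, valid command or not, ends the two-word phrase. */
    vb->state = WAIT_FOR_NAME;
    return vb->output;
}
```

Because each block only leaves `WAIT_FOR_NAME` on its own `name_id`, two blocks initialized with different DIP switch settings would not react to each other's phrases.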
In security mode, the device does the same thing, but only for a specified user. That user would first have to train the block to recognize their voice; after that, the block would respond only to that user. Some security measures will be put in place to prevent someone from retraining the block to circumvent security mode.
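One way the mode check could look is sketched below. The security measures are not yet decided, so the retraining rule here (refuse training once a user is enrolled and security mode is on) is purely an assumption for illustration, as are the `security_state` struct and both function names. The `speaker_matches` flag stands in for whatever speaker verification the recognition hardware provides.

```c
#include <stdbool.h>

enum block_mode { MODE_NORMAL, MODE_SECURITY };

struct security_state {
    enum block_mode mode;
    bool user_enrolled;  /* true once a user has trained the block */
};

/* Should a recognized phrase be acted on?
 * speaker_matches: hypothetical result of speaker verification. */
static bool phrase_allowed(const struct security_state *s, bool speaker_matches)
{
    if (s->mode == MODE_NORMAL)
        return true;  /* normal mode accepts any speaker */
    return s->user_enrolled && speaker_matches;
}

/* May voice training start? Assumed rule: refuse retraining while
 * security mode is active and a user is already enrolled. */
static bool training_allowed(const struct security_state *s)
{
    return !(s->mode == MODE_SECURITY && s->user_enrolled);
}
```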
The range of the device would be approximately 3-4 meters with a slightly raised voice. Background noise, within reason, would not affect the device.
Possible Features (Time permitting):
Trade Off Analysis:
| Technology | Technology Description | Cons | Pros | Design Decision |
|---|---|---|---|---|
| Speaker Independent vs. Speaker Dependent | Speaker-independent voice recognition requires no training on the part of the user, whereas speaker-dependent recognition requires some initial training. | Speaker Independent:<br>Speaker Dependent: | Speaker Independent:<br>Speaker Dependent: | We decided to go with the speaker-independent technology because the product will be easier for the consumer to use. In addition, the time-to-market will not be greatly affected if we go with the ASSP (Application Specific Standard Part) option. In fact, it may be shorter because we would not have to worry about the extra logic and programming required for voice training. |
| Software Recognition vs. Integrated Circuit Recognition | IC recognition uses an IC to handle voice recognition alongside a general-purpose processor for control functions. Software recognition uses a general-purpose processor for both voice recognition and control functions. | Software Recognition: | Software Recognition:<br>IC Recognition: | We decided to go with IC recognition due to the short time-to-market constraint. This comes at the expense of the power and cost constraints. However, we may be able to shut the PIC down when no words are being recognized, saving some power. |
| Unique Block vs. Common Block | Can multiple blocks be controlled independently when in close proximity? A unique block has an additional word to differentiate it from other blocks, whereas with a common block there would be no way to differentiate between voice blocks. | Unique Block:<br>Common Block: | Unique Block:<br>Common Block: | We decided to go with the unique block because it adds flexibility to the product without making it extremely difficult to use. As far as engineering goes, we would have more complex control software and additional hardware (DIP switches). |
| Continuous Listening vs. Non-Continuous Listening | With continuous listening, the voice block would listen constantly for commands, whereas with non-continuous listening it would only start to listen when given some “yes” input, such as a button press from a button block. | Continuous Listening:<br>Non-Continuous Listening: | Continuous Listening:<br>Non-Continuous Listening: | We decided to go with continuous listening mode; however, we may add a low-power mode so that both are possible, time permitting. |
| Multiple Word Recognition vs. Isolated Word Recognition | With multiple word recognition, small phrases are possible, with more natural speech, whereas with isolated word recognition the user would have to speak more slowly and could use only single words. | Multiple Word Recognition:<br>Isolated Word: | Multiple Word Recognition:<br>Isolated Word: | We decided to go with multiple word recognition because it allows for more natural speech, which would make the block easier to use. Currently, most voice ASSPs support multiple word recognition. |

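The listening-mode trade-off in the table above could reduce to a single check in the control software if both modes end up being supported. The following is a minimal sketch under that assumption; the `listen_mode` enum and `recognizer_should_run` are hypothetical names, and `enable_input` stands for a “yes” arriving on the block's input, such as from a button block.

```c
#include <stdbool.h>

enum listen_mode {
    LISTEN_CONTINUOUS,  /* always listening (the chosen default) */
    LISTEN_ON_ENABLE    /* possible low-power mode: listen only on a "yes" input */
};

/* Decide whether the recognizer should be powered and listening.
 * enable_input: true when the block's input currently reads "yes". */
static bool recognizer_should_run(enum listen_mode mode, bool enable_input)
{
    return mode == LISTEN_CONTINUOUS || enable_input;
}
```

In the low-power mode, the control loop could keep the recognition hardware idle whenever this function returns false.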
Cost:
$150 for development kit and speech hardware.