Voice Capturing Utility

Started by Ali, April 02, 2005, 05:53:30 PM

Ali

I have been assigned the task of making a Voice Capturing Utility using MASM615 as my semester project. Can I find some help or guidance as to the best way to go about it?

OceanJeff32

Assuming you can access a microphone, that's all you need there, just record? You've captured a voice.

Or are you talking about Voice Recognition?

Hmmm, much more complicated if you are....?

Later,

Jeff C
::)
Any good programmer knows, every large and/or small job, is equally large, to the programmer!

Ali

Well, I studied the assigned task a bit today and discussed it too. What is required is that I capture sound from the microphone and save it to a file in WAV format. Once this is done, I have to remove unnecessary disturbances from the sound file as far as possible, meaning I have to make the recorded voice clearer.

Suggestions will be appreciated! Thanks a lot! Just as a reminder, I am working with MASM615 (32-bit).
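
For reference, the WAV format mentioned above is just a short RIFF header followed by the raw PCM samples. Here is a minimal sketch in Python (chosen only for brevity; it is not the MASM solution the project needs). The `write_wav` helper, the 8000 Hz rate and the generated test tone are assumptions for illustration; a real recorder would pass in the bytes delivered by the sound card.

```python
import io
import math
import struct
import wave

def write_wav(target, samples, sample_rate=8000):
    """Write 16-bit mono PCM samples (ints in -32768..32767) as a WAV file.

    `target` may be a file name or a writable, seekable file object.
    """
    with wave.open(target, "wb") as w:
        w.setnchannels(1)            # mono
        w.setsampwidth(2)            # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate)  # samples per second
        # "<h" = little-endian signed 16-bit, the PCM layout WAV expects
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))

# Stand-in for microphone data: one second of a 440 Hz tone.
tone = [int(12000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]

buf = io.BytesIO()                   # in-memory "file" so nothing touches disk
write_wav(buf, tone)
wav_bytes = buf.getvalue()
```

The header written here is 44 bytes; everything after it is sample data, which is why the same layout is easy to produce by hand from assembly.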

Robert Collins

Quote from: Ali on April 02, 2005, 05:53:30 PM
I have been assigned the task of making a Voice Capturing Utility using MASM615 as my semester project. Can I find some help or guidance as to the best way to go about it?

I can't tell you how to write a sound recording application in assembly but you can check into the MS MMCI.DLL which is a complete wave developing API. Recording and playing sound is not too complicated but editing it is a different story. I did a sound application but it was in VB using mmci.dll but as far as editing it I had to use an application called 'Cool Edit'. I wouldn't have the slightest idea how to write my own editing software.

Infro_X

None of this code has been checked, and I wouldn't have the slightest idea whether it works, but this should get you started. :)

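LOCAL WF:WAVEFORMATEX
LOCAL WHDR:WAVEHDR
invoke waveInGetNumDevs
or eax,eax
jz nodevices_error
;WF must be filled in before opening (e.g. WAVE_FORMAT_PCM, mono, 8000 Hz, 16-bit)
;WHDR.lpData / WHDR.dwBufferLength must describe your buffer, WHDR.dwFlags = 0
  invoke waveInOpen,addr hWave,WAVE_MAPPER,addr WF,NULL,NULL,CALLBACK_NULL
  or eax,eax                    ;MMSYSERR_NOERROR is 0
  jnz open_error
  invoke waveInPrepareHeader,[hWave],addr WHDR,sizeof WHDR
  invoke waveInAddBuffer,[hWave],addr WHDR,sizeof WHDR
;to start recording
  invoke waveInStart,[hWave]
;wait here until WHDR.dwFlags has WHDR_DONE set (buffer full), then stop
  invoke waveInStop,[hWave]
  invoke waveInUnprepareHeader,[hWave],addr WHDR,sizeof WHDR
  invoke waveInClose,[hWave]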

gabor

Hi!

Removing the noise is a somewhat complicated process. I think the most important thing is to know exactly what counts as noise. Removing noise means removing or attenuating certain frequency components of the sound:
1. Transform the samples into the frequency domain (Fourier transform).
2. Remove or reduce the components that are likely to be noise.
3. Transform the sound back into the time domain.
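
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: it uses a naive O(n²) DFT instead of a real FFT, the `keep_band` helper and the 200 to 3400 Hz limits are made up for the example, and the "noise" is a single low-frequency hum chosen to fall exactly on a DFT bin.

```python
import cmath
import math

def dft(x):
    """Step 1: naive discrete Fourier transform (slow, but short)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Step 3: inverse transform; the input was real, so keep the real part."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def keep_band(x, sample_rate, lo_hz, hi_hz):
    """Step 2: zero every frequency bin outside [lo_hz, hi_hz]."""
    n = len(x)
    X = dft(x)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies.
        freq = min(k, n - k) * sample_rate / n
        if not (lo_hz <= freq <= hi_hz):
            X[k] = 0
    return idft(X)

rate, n = 8000, 256
voice = [math.sin(2 * math.pi * 500 * t / rate) for t in range(n)]        # wanted tone
hum = [0.5 * math.sin(2 * math.pi * 62.5 * t / rate) for t in range(n)]   # unwanted hum
noisy = [v + h for v, h in zip(voice, hum)]

cleaned = keep_band(noisy, rate, 200, 3400)
max_error = max(abs(c - v) for c, v in zip(cleaned, voice))
```

Real recordings will not line up with the bins this neatly, so hard zeroing tends to cause audible ringing; processing overlapping windowed blocks and attenuating rather than zeroing is the usual refinement.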

I think lots of Fourier algos can be found on the net, so this is not the problem. To achieve a good result you have to create an adjustable "filter" and test the results by listening to them. The more people listen, the better your results will be, statistically. This is not a very exact approach, but as far as I know there is no other method for evaluating such test results. After all, how noisy a sample is, is not so much a mathematical problem as a philosophical question  :bg

I hope this can help to begin the work! Good luck!

Greets, Gábor

Mark Jones

Regarding filtering, one thing you could try is a bandpass filter: remove any frequencies outside, say, 200 Hz to 6 kHz. This is easy to do in the electronics domain but seems kinda difficult in software. Perhaps try a Google search for existing code; somebody out there must have already done this. Running the sound through an FFT would seem to be an easy way to "clip off" the unwanted frequencies, but I haven't touched FFTs yet, so... :)
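
A band-pass can also be done directly on the samples, without any FFT, by copying what an RC circuit does. The sketch below chains a one-pole high-pass and a one-pole low-pass; the `band_pass` and `gain_at` names are invented for this example, and such a first-order filter rolls off gently (about 6 dB per octave), so treat it as a starting point rather than a finished voice filter.

```python
import math

def band_pass(samples, sample_rate, low_hz=200.0, high_hz=6000.0):
    """Software equivalent of an RC high-pass followed by an RC low-pass."""
    dt = 1.0 / sample_rate
    rc_hp = 1.0 / (2 * math.pi * low_hz)    # time constant for the 200 Hz corner
    rc_lp = 1.0 / (2 * math.pi * high_hz)   # time constant for the 6 kHz corner
    a_hp = rc_hp / (rc_hp + dt)
    a_lp = dt / (rc_lp + dt)
    out = []
    prev_x = hp = lp = 0.0
    for x in samples:
        hp = a_hp * (hp + x - prev_x)       # high-pass: blocks slow changes
        prev_x = x
        lp += a_lp * (hp - lp)              # low-pass: smooths fast changes
        out.append(lp)
    return out

def gain_at(freq_hz, sample_rate=44100, n=8820):
    """Measure how much of a unit sine at freq_hz survives the filter."""
    tone = [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]
    tail = band_pass(tone, sample_rate)[n // 2:]    # skip the start-up transient
    rms = math.sqrt(sum(v * v for v in tail) / len(tail))
    return rms * math.sqrt(2)                       # a unit sine has RMS 1/sqrt(2)

gains = {f: gain_at(f) for f in (50, 1000, 18000)}
```

The inner loop is only a few multiplies and adds per sample, which makes it a reasonable candidate for a hand-written MASM routine.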

Then if you felt like [re]compressing the audio, it should compress better after being bandpass-filtered because there are fewer frequencies involved. There are also some codecs designed specifically for voice, like CCITT A-Law and u-Law, GSM, CELP, G.723.1, etc., which use very little bandwidth. Come to think of it, if you record the audio as a .WAV and simply recompress it into one of these formats, that should clip a similar frequency range, because those low-bitrate codecs achieve such low rates by having a narrow bandwidth.
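
To give a feel for the u-Law idea, here is a rough sketch of the companding curve that squeezes a 16-bit sample into 8 bits. It uses the continuous textbook formula with mu = 255, not the bit-exact segmented encoding a real G.711 codec uses, and the function names are made up for the example.

```python
import math

MU = 255.0

def mulaw_encode(sample):
    """Map a 16-bit sample (-32768..32767) to an 8-bit code (0..255)."""
    x = max(-1.0, min(1.0, sample / 32768.0))
    # Logarithmic curve: quiet sounds get more of the 256 codes than loud ones.
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255.0))

def mulaw_decode(code):
    """Approximate inverse of mulaw_encode."""
    y = code / 255.0 * 2.0 - 1.0
    x = math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
    return max(-32768, min(32767, int(round(x * 32768.0))))

test_samples = [-32768, -5000, -100, 0, 100, 1000, 20000, 32767]
round_trip = [mulaw_decode(mulaw_encode(s)) for s in test_samples]
```

Each sample now takes one byte instead of two, at the cost of a small error that grows with loudness, which the ear tolerates well for speech.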
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08