My note detection algorithm is failing in a few cases

I am using a simple FFT-based approach in Python to identify the musical notes in a recording. The steps are:

1. Read the sound file (.wav).
2. Detect silence in the file (by summing the squared samples that fall within a window).
3. Locate the notes using the data obtained from (2).
4. Calculate the frequency of each detected note using the DFT.
5. Match the calculated frequency to the standard note frequencies to identify the note being played.

However, in a case where the note should come out as A4 (440 Hz), I get a result that is far off (about 2 kHz). Is there a fundamental error in my approach?

UPDATE: How can I pass my audio.wav file to this frequency estimator?

The complete Python code is here:

window_size = 2000      # Size of window to be used for detecting silence
beta = 1                # Silence detection parameter
max_notes = 100         # Maximum number of notes in file, for efficiency
sampling_freq = 44100   # Sampling frequency of audio signal
threshold = 200

# traversing sound_square array with a fixed window_size
while(i<=len(sound_square)-window_size):
    s = 0.0
    j = 0
    while(j<=window_size):
        s = s + sound_square[i+j]
        j = j + 1
    # detecting the silence waves
    if s < threshold:
        if(i-k>window_size*4):
            dft = np.array(dft) # applying fourier transform function
            dft = np.fft.fft(sound[k:i])
            dft = np.argsort(dft)

            if(dft[0]>dft[-1] and dft[1]>dft[-1]):
                i_max = dft[-1]
            elif(dft[1]>dft[0] and dft[-1]>dft[0]):
                i_max = dft[0]
            else :
                i_max = dft[1]
            # calculating frequency
            frequency.append((i_max*sampling_freq)/(i-k))
            dft = []
            k = i+1
    i = i + window_size

edited Nov 14 '18 at 13:42

asked Nov 14 '18 at 11:52

John

145

2

You are assuming that the frequency of the highest magnitude peak in your spectrum corresponds to the pitch of the musical note - this may be true in some cases, but it is not true in general. See numerous other similar questions here on StackOverflow for a full discussion of why pitch detection is much more complicated than you might think.

– Paul R
Nov 14 '18 at 12:19

1

@PaulR That is what we were instructed to do. I would appreciate it if you could tell me which questions you are referring to. Can you provide a link?

– John
Nov 14 '18 at 12:24

1

There are a lot of previous questions - it seems that implementing guitar tuners and other similar pitch detection apps is a popular project choice for undergraduates. Just search for "guitar tuner" or "pitch detection" along with the [fft] tag and you should find lots of relevant info.

– Paul R
Nov 14 '18 at 12:37

1

@PaulR Can you tell me how I can pass my audio.wav to freq_estimator.py to see whether it works? github.com/endolith/waveform_analysis/blob/master/…

– John
Nov 14 '18 at 13:43

Please adhere to the "one question per question" rule - it seems like your update should really be a separate question.

– Paul R
Nov 14 '18 at 15:49

python python-2.7 signal-processing fft audio-processing

2 Answers

Pitch is not the same as the peak-magnitude frequency bin of an FFT. Pitch is a human psychoacoustic phenomenon. A pitched sound can have a missing or very weak fundamental (common in some voice, piano, and guitar sounds) and/or many powerful overtones in its spectrum that overwhelm the pitch frequency, yet a human will still hear it as that note. So any FFT peak-frequency detector (even one that includes windowing and interpolation, which your code does not) will not be a robust method of musical pitch estimation. An FFT also quantizes frequency to a bin resolution (perhaps coarser than you require) that depends on the FFT (or window) length.
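
To make the points about windowing, interpolation, and bin resolution concrete, here is a minimal sketch of an FFT peak estimator. It is not the asker's code: the function name `fft_peak_frequency` and the synthetic test tones are assumptions chosen for illustration. It applies a Hann window, takes the largest-magnitude bin of the real FFT, and refines that bin with parabolic interpolation on the log magnitude. The second tone has a weak 440 Hz fundamental and a strong 5th harmonic, so the peak detector should report a frequency near 2200 Hz rather than the pitch.

```python
import numpy as np


def fft_peak_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest spectral peak.

    Note: this is the strongest peak, which is not necessarily the pitch.
    """
    samples = np.asarray(samples, dtype=float)
    # A Hann window reduces spectral leakage from the segment edges.
    windowed = samples * np.hanning(len(samples))
    magnitude = np.abs(np.fft.rfft(windowed))

    # Ignore the DC bin when searching for the peak.
    peak = int(np.argmax(magnitude[1:])) + 1

    # Parabolic interpolation around the peak gives a sub-bin estimate.
    offset = 0.0
    if peak < len(magnitude) - 1:
        a, b, c = np.log(magnitude[peak - 1:peak + 2] + 1e-12)
        denom = a - 2.0 * b + c
        if denom != 0.0:
            offset = 0.5 * (a - c) / denom

    # Bin index -> Hz: each bin is sample_rate / N wide.
    return (peak + offset) * sample_rate / len(samples)


sample_rate = 44100
t = np.arange(8192) / sample_rate

# Hypothetical test signals: a pure A4, and an A4 whose 5th harmonic dominates.
pure_tone = np.sin(2 * np.pi * 440 * t)
rich_tone = 0.2 * np.sin(2 * np.pi * 440 * t) + 1.0 * np.sin(2 * np.pi * 2200 * t)

print("bin width (Hz):", sample_rate / len(t))
print("pure tone peak:", fft_peak_frequency(pure_tone, sample_rate))
print("rich tone peak:", fft_peak_frequency(rich_tone, sample_rate))
```

With 8192 samples at 44.1 kHz the bin spacing is about 5.4 Hz, so without interpolation the estimate could be off by up to half a bin. Interpolation improves the resolution, but it does nothing about the peak being a harmonic instead of the fundamental.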

An answer to this Stack Overflow question includes a list of some alternate methods of estimating pitch that might produce better results.
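
One widely used alternative to FFT peak picking is autocorrelation. The sketch below is illustrative only and is not taken from the linked answer: the function name `autocorr_pitch`, the 50 to 1000 Hz search range, and the synthetic tone are all assumptions. It computes the autocorrelation of a segment and picks the lag with the strongest self-similarity inside the allowed range. The test tone has a very weak 440 Hz fundamental and much stronger 2nd to 4th harmonics, so its largest FFT peak would sit at 880 Hz, yet the signal still repeats every 1/440 s.

```python
import numpy as np


def autocorr_pitch(samples, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate pitch (Hz) from the strongest autocorrelation lag.

    fmin and fmax bound the search; they are illustrative defaults.
    """
    samples = np.asarray(samples, dtype=float)
    samples = samples - samples.mean()  # remove any DC offset
    n = len(samples)

    # Autocorrelation via the FFT; zero-padding to 2n avoids wrap-around.
    spectrum = np.fft.rfft(samples, 2 * n)
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]

    # Only consider lags that correspond to frequencies in [fmin, fmax].
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)
    lag = min_lag + int(np.argmax(acf[min_lag:max_lag + 1]))

    # The period is an integer number of samples, so the result is quantized.
    return sample_rate / lag


sample_rate = 44100
t = np.arange(8192) / sample_rate

# Hypothetical A4 with a weak fundamental and strong 2nd to 4th harmonics.
tone = (0.1 * np.sin(2 * np.pi * 440 * t)
        + 1.0 * np.sin(2 * np.pi * 880 * t)
        + 0.8 * np.sin(2 * np.pi * 1320 * t)
        + 0.6 * np.sin(2 * np.pi * 1760 * t))

print("autocorrelation estimate:", autocorr_pitch(tone, sample_rate))
```

Because the lag is a whole number of samples, the estimate lands close to, but not exactly on, 440 Hz. Real recordings also need care with noise, octave errors, and the choice of search range, so treat this as a starting point and not as a robust pitch tracker.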

answered Nov 16 '18 at 16:33

hotpaw2

61.3k


Pitch tracking is implemented in librosa as librosa.piptrack:
https://librosa.github.io/librosa/generated/librosa.core.piptrack.html#librosa.core.piptrack

answered Dec 2 '18 at 2:16

jonnor

72349
