Check if a specific value occurs multiple times in a NumPy array

I'm working on a simple KNN algorithm and want to add an if statement that resolves ties, i.e. cases where several classes have an equal number of neighbors around a test point. To detect a tie I need to check whether the maximum value of an array occurs more than once, but I can't find a function that does this. What I want:

unique, counts = np.unique(k_nearest_labels, return_counts=True)

if (len(unique)>1) and (frequency of max(counts) in counts > 1)
    return the nearest of the tied points

Here counts holds the frequency of each value in unique. How do I write the second condition of the if statement? Or is there a different solution I'm overlooking?
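For readers unfamiliar with the call in the question, here is a minimal sketch of what np.unique returns with return_counts=True. The label values are hypothetical and chosen only for illustration; they are assumed to be sorted from the closest neighbor to the furthest.

```python
import numpy as np

# Hypothetical labels of the 7 nearest neighbors (closest first).
k_nearest_labels = np.array([1, 1, 1, 4, 2, 2, 2])

unique, counts = np.unique(k_nearest_labels, return_counts=True)

# unique holds the sorted distinct labels,
# counts holds how often each of them occurs.
print(unique)  # [1 2 4]
print(counts)  # [3 3 1]
```

Note that unique is sorted by label value, so the neighbor ordering of the input is not preserved in it.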

asked 2 days ago by Øystein Skogvold (new contributor), edited yesterday by desertnaut

Welcome to Stack Overflow! I suggest that you edit your question removing the answer part and create a post below answering your own question.
– Hemerson Tacon
2 days ago

python numpy


2 Answers

You can actually skip the use of np.unique (which is fairly computationally expensive) and still get what you want:

# number of entries equal to the largest value in the array
maxcount = (k_nearest_labels == k_nearest_labels.max()).sum()

if k_nearest_labels.size > maxcount and maxcount > 1:
    ...  # do stuff

Also: yaaay! You answered your own question while you were writing it. That's always fun. You should definitely take Hemerson's suggestion and split your edit with the answer into a proper answer (it'll make it easier to find for others).

answered 2 days ago by tel, edited 2 days ago

I just tried this, but unfortunately it doesn't apply. I need to know the frequency of the labels in the k_nearest_labels array. The max value of this array will always be the class with the highest label. Say I had a 7NN result for a single test point (sorted from closest to furthest): 7_nearest_labels = [1, 1, 1, 4, 2, 2, 2]. Here classes 1 and 2 are tied, but the max value is class 4. In this case I would choose 1 as the predicted class, as it's closer than 2. But thanks for the answer!
– Øystein Skogvold
yesterday

@ØysteinSkogvold Ah. Your example makes it much clearer: you're trying to make a decision based on the modes of k_nearest_labels. In that case, unique is definitely the function you want to use. The only thing I'd add is that the first part of your conditional statement is redundant: whenever np.sum(counts == np.max(counts)) > 1 is True, at least two labels share the top count, so len(unique)>1 is necessarily True as well.
– tel
yesterday
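To make the difference discussed in these comments concrete, here is a small sketch using the 7NN example above. The variable names n_modes and is_tie are introduced here for illustration only; they do not appear in the original posts.

```python
import numpy as np

# The 7NN example from the comment, sorted from closest to furthest.
k_nearest_labels = np.array([1, 1, 1, 4, 2, 2, 2])

# Max-based check: counts how often the largest label value (4) occurs.
maxcount = (k_nearest_labels == k_nearest_labels.max()).sum()

# Mode-based check: counts how many labels share the highest frequency.
unique, counts = np.unique(k_nearest_labels, return_counts=True)
n_modes = np.sum(counts == np.max(counts))
is_tie = n_modes > 1  # no separate len(unique) > 1 test is needed

print(maxcount)  # 1
print(n_modes)   # 2
```

The max-based check looks at label values, so it reports on class 4, while the mode-based check looks at frequencies and finds the tie between classes 1 and 2.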


I solved it: I had forgotten that I could create a boolean array and take its sum. Here's my solution for anyone who stumbles upon this.

 if (len(unique)>1) and (frequency of max(counts) in counts > 1)

can be written as:

 if (len(unique)>1) and (np.sum(counts == np.max(counts)) > 1):
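The condition above only detects the tie. One possible way to also return the nearest of the tied classes is sketched below. This is not code from the original posts: the function name predict_label is hypothetical, and the sketch assumes k_nearest_labels is sorted from the closest neighbor to the furthest.

```python
import numpy as np

def predict_label(k_nearest_labels):
    """Majority vote; ties go to the tied class with the nearest neighbor.

    Assumes k_nearest_labels is sorted from closest to furthest.
    """
    labels = np.asarray(k_nearest_labels)
    unique, counts = np.unique(labels, return_counts=True)

    # Every label whose count equals the highest count.
    tied = unique[counts == counts.max()]
    if tied.size == 1:
        return tied[0]

    # np.isin marks neighbors whose label is one of the tied classes;
    # argmax on a boolean array returns the index of the first True,
    # i.e. the closest neighbor belonging to a tied class.
    nearest_idx = np.argmax(np.isin(labels, tied))
    return labels[nearest_idx]

print(predict_label([1, 1, 1, 4, 2, 2, 2]))  # 1
print(predict_label([2, 1, 1, 4, 2, 2, 1]))  # 2
print(predict_label([3, 3, 1]))              # 3
```

Checking tied.size covers both conditions of the original if statement, since a tie implies more than one unique label.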

answered yesterday by Øystein Skogvold (new contributor), edited yesterday by desertnaut
