Check if a specific value occurs multiple times in a NumPy array
I'm working on a simple KNN algorithm, where I want to add an if statement that resolves a tie (when there's an equal number of neighbors from several different classes around a test point). The problem occurs when I want to find out whether the maximum value of an array occurs more than once, but I can't seem to find a function that does this. What I want:
unique, counts = np.unique(k_nearest_labels, return_counts=True)
if (len(unique)>1) and (frequency of max(counts) in counts > 1)
    return the nearest of the tied points
Where counts is the frequency of the numbers in unique. How do I solve the second condition in the if statement? Or is there a different solution I'm overlooking?
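For readers unfamiliar with `return_counts`, here is a minimal sketch of what the two returned arrays look like. The label values are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical labels of the 5 nearest neighbours (illustrative values only).
k_nearest_labels = np.array([0, 2, 2, 1, 0])

# unique holds the sorted distinct labels; counts[i] is how often unique[i] occurs.
unique, counts = np.unique(k_nearest_labels, return_counts=True)
print(unique)  # [0 1 2]
print(counts)  # [2 1 2]
```

Here labels 0 and 2 both occur twice, which is the kind of tie the second condition is meant to detect.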
python numpy
asked 2 days ago by Øystein Skogvold (new contributor), edited yesterday by desertnaut
3
Welcome to Stack Overflow! I suggest that you edit your question removing the answer part and create a post below answering your own question.
– Hemerson Tacon
2 days ago
2 Answers
You can actually skip the use of np.unique (which is fairly computationally expensive) and still get what you want:
# Count how many entries equal the largest label value in the array
maxcount = (k_nearest_labels == k_nearest_labels.max()).sum()
if k_nearest_labels.size > maxcount and maxcount > 1:
    ...  # do stuff
Also: yaaay! You answered your own question while you were writing it. That's always fun. You should definitely take Hemerson's suggestion and split your edit with the answer into a proper answer (it'll make it easier to find for others).
answered 2 days ago by tel, edited 2 days ago
I just tried this, but unfortunately it doesn't apply. I need to know the frequency of the labels in the k_nearest_labels array. The max value of this array will always be the class with the highest label. So suppose I had a 7NN result for a single test point (sorted from closest to furthest): 7_nearest_labels = [1, 1, 1, 4, 2, 2, 2]. Here classes 1 and 2 are tied, but the max value is class 4. So here I would choose 1 as the predicted class, as it's closer than 2. But thanks for the answer!
– Øystein Skogvold
yesterday
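To make the difference in the comment above concrete, the following sketch runs both approaches on the 7NN example. The variable name `nearest_labels` is an assumption, chosen because `7_nearest_labels` is not a valid Python identifier.

```python
import numpy as np

# The 7 nearest labels from the comment, sorted from closest to furthest.
nearest_labels = np.array([1, 1, 1, 4, 2, 2, 2])

# Approach from the answer: count occurrences of the largest label value.
maxcount = int((nearest_labels == nearest_labels.max()).sum())
print(maxcount)  # 1, because label 4 appears only once

# Approach based on np.unique: count how many labels share the top frequency.
unique, counts = np.unique(nearest_labels, return_counts=True)
n_modes = int(np.sum(counts == counts.max()))
print(unique[counts == counts.max()])  # [1 2]
print(n_modes)  # 2, so classes 1 and 2 are tied
```

The first approach looks at the largest label, while the second looks at the largest frequency, which is what a majority vote needs.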
@ØysteinSkogvold Ah. Your example makes it much clearer: you're trying to make a decision based on the modes of k_nearest_labels. In that case, unique is definitely the function you want to use. The only thing I'd add is that the first part of your conditional statement is redundant, since there's no condition in which len(unique)>1 will be False when np.sum(counts == np.max(counts)) > 1 is True.
– tel
yesterday
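The redundancy claim can be checked empirically. This is a rough sketch, not a proof: the helper names `has_tie` and `has_tie_with_len_check` are hypothetical, and the random test data is arbitrary.

```python
import numpy as np

def has_tie(labels):
    """Return True when two or more labels share the highest frequency."""
    _, counts = np.unique(labels, return_counts=True)
    return bool(np.sum(counts == counts.max()) > 1)

def has_tie_with_len_check(labels):
    """Same test, with the extra len(unique) > 1 guard from the question."""
    unique, counts = np.unique(labels, return_counts=True)
    return bool(len(unique) > 1 and np.sum(counts == counts.max()) > 1)

# Compare both versions on random, non-empty label arrays (arbitrary seed).
rng = np.random.default_rng(0)
samples = [rng.integers(0, 4, size=rng.integers(1, 8)) for _ in range(1000)]
all_agree = all(has_tie(s) == has_tie_with_len_check(s) for s in samples)
print(all_agree)  # True
```

The two always agree because at least two entries of counts must equal the maximum for a tie, which already implies that unique has at least two entries.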
I solved it: I had forgotten that I could create a boolean array and take its sum. Here's my solution for people who stumble upon this.
if (len(unique)>1) and (frequency of max(counts) in counts > 1)
can be written as:
if (len(unique)>1) and (np.sum(counts == np.max(counts)) > 1):
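Putting the pieces together, a full tie-break might look like the sketch below. It assumes that k_nearest_labels is sorted from closest to furthest and interprets "the nearest of the tied points" as the closest neighbour belonging to any tied class. The function name `predict_label` is hypothetical.

```python
import numpy as np

def predict_label(k_nearest_labels):
    """Majority vote; on a tie, pick the tied class whose neighbour is closest.

    Assumes k_nearest_labels is sorted from closest to furthest.
    """
    labels = np.asarray(k_nearest_labels)
    unique, counts = np.unique(labels, return_counts=True)
    tied = unique[counts == counts.max()]  # all classes with the top count
    if len(tied) == 1:
        return tied[0]
    # np.isin marks neighbours that belong to a tied class; argmax returns
    # the index of the first (closest) such neighbour.
    first_tied = np.argmax(np.isin(labels, tied))
    return labels[first_tied]

print(predict_label([1, 1, 1, 4, 2, 2, 2]))  # 1
print(predict_label([2, 1, 1, 4, 2]))        # 2
print(predict_label([3, 3, 1]))              # 3
```

In the first call classes 1 and 2 are tied with three votes each, and class 1 wins because its neighbour comes first in the distance-sorted array.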
answered yesterday by Øystein Skogvold (new contributor), edited yesterday by desertnaut