Check if a specific value occurs multiple times in a NumPy array























I'm working on a simple KNN algorithm, and I want to add an if statement that resolves ties (when several classes have an equal number of neighbors around a test point). The problem is that I need to check whether the maximum value of an array occurs more than once, and I can't seem to find a function that does this. What I want:



unique, counts = np.unique(k_nearest_labels, return_counts=True)

if (len(unique)>1) and (frequency of max(counts) in counts > 1)
return the nearest of the tied points


Here counts holds the frequency of each value in unique. How do I write the second condition of the if statement? Or is there a different solution I'm overlooking?


























  • 3




    Welcome to Stack Overflow! I suggest that you edit your question removing the answer part and create a post below answering your own question.
    – Hemerson Tacon
    2 days ago















python numpy






asked 2 days ago by Øystein Skogvold, edited yesterday by desertnaut




2 Answers






























You can actually skip the use of np.unique (which is fairly computationally expensive) and still get what you want:



# count how many times the largest label value occurs
maxcount = (k_nearest_labels == k_nearest_labels.max()).sum()
if k_nearest_labels.size > maxcount and maxcount > 1:
    ...  # do stuff


Also: yaaay! You answered your own question while you were writing it. That's always fun. You should definitely take Hemerson's suggestion and split your edit with the answer into a proper answer (it'll make it easier to find for others).
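As a small illustration of what the snippet above computes, the sketch below counts how often one particular value appears in an array. The helper name `occurs_multiple_times` and the sample labels are assumptions made for this example and are not part of the original answer.

```python
import numpy as np

def occurs_multiple_times(arr, value):
    """Return True if `value` appears more than once in `arr` (hypothetical helper)."""
    # Comparing an array with a scalar gives a boolean mask; count its True entries.
    return np.count_nonzero(np.asarray(arr) == value) > 1

labels = np.array([1, 1, 1, 4, 2, 2, 2])
largest_label = labels.max()  # the highest class label, not the most frequent one

print(occurs_multiple_times(labels, largest_label))  # False: 4 appears once
print(occurs_multiple_times(labels, 1))              # True: 1 appears three times
```

Note that this checks a single, known value; it says nothing about which label is the most frequent.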





























  • I just tried this, but unfortunately it doesn't apply. I need to know the frequency of the labels in the k_nearest_labels array. The max value of this array will always be the class with the highest label. So if I had a 7NN result for a single test point (sorted from closest to furthest): 7_nearest_labels = [1,1,1, 4, 2, 2, 2]. Here classes 1 and 2 are tied, but the max value is class 4. So here I would choose 1 as the predicted class, as it's closer than 2. But thanks for the answer!
    – Øystein Skogvold
    yesterday










  • @ØysteinSkogvold Ah. Your example makes it much clearer: you're trying to make a decision based on the modes of k_nearest_labels. In that case, unique is definitely the function you want to use. The only thing I'd add is that the first part of your conditional statement is redundant, since there's no case in which len(unique)>1 will be False when np.sum(counts == np.max(counts)) > 1 is True.
    – tel
    yesterday
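The 7NN example from the comments can be checked directly. The sketch below is only an illustration, and the variable names `labels`, `max_label_count` and `n_modes` are assumptions: it shows that counting the largest label misses the tie, while counting how many classes share the top frequency finds it.

```python
import numpy as np

# Labels of the 7 nearest neighbours, sorted from closest to furthest.
labels = np.array([1, 1, 1, 4, 2, 2, 2])

# Counting the largest label value only tells us that class 4 appears once.
max_label_count = (labels == labels.max()).sum()

# np.unique returns the sorted distinct labels and how often each occurs.
unique, counts = np.unique(labels, return_counts=True)

# Number of classes that share the highest frequency (the modes).
n_modes = np.sum(counts == np.max(counts))

print(max_label_count)  # 1
print(unique, counts)   # [1 2 4] [3 3 1]
print(n_modes > 1)      # True: classes 1 and 2 are tied
```

Since two classes can only share the top frequency when at least two distinct labels exist, `n_modes > 1` already implies `len(unique) > 1`.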































I solved it: I had forgotten that I could build a boolean array and take its sum. Here's my solution for anyone who stumbles upon this.



 if (len(unique)>1) and (frequency of max(counts) in counts > 1)


can be written as:



 if (len(unique)>1) and (np.sum(counts == np.max(counts)) > 1):
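Putting the pieces together, a tie-break along the lines described in the question might look like the sketch below. It assumes `k_nearest_labels` is a non-empty one-dimensional array already sorted from closest to furthest neighbour; the function name `predict_label` is hypothetical.

```python
import numpy as np

def predict_label(k_nearest_labels):
    """Majority vote; on a tie, return the tied class whose neighbour is closest.

    Assumes the labels are ordered from nearest to furthest neighbour.
    """
    k_nearest_labels = np.asarray(k_nearest_labels)
    unique, counts = np.unique(k_nearest_labels, return_counts=True)

    # Classes whose frequency equals the highest frequency.
    tied = unique[counts == counts.max()]
    if tied.size == 1:
        return tied[0]  # clear winner, no tie to resolve

    # Mark neighbours that belong to a tied class; argmax gives the index of
    # the first True, i.e. the nearest neighbour from any tied class.
    is_tied = np.isin(k_nearest_labels, tied)
    return k_nearest_labels[np.argmax(is_tied)]

print(predict_label([1, 1, 1, 4, 2, 2, 2]))  # 1
print(predict_label([4, 2, 2, 1, 2]))        # 2
```

In the first call classes 1 and 2 both occur three times, and class 1 wins because its first neighbour is nearer; in the second call class 2 is the only mode, so no tie-break is needed.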























The first answer (skipping np.unique) was posted 2 days ago by tel and edited 2 days ago.












The second answer (summing a boolean array) was posted yesterday by Øystein Skogvold and edited yesterday by desertnaut.



