Bash: iterate through repeated values in a file




















I have a file with this format:



User_ID , Place_ID , Rating 
U32 , 1305 , 2
U32 , 1276 , 2
U32 , 1789 , 3
U65 , 1985 , 1
U65 , 1305 , 1
U65 , 1276 , 2


I would like to sort this file by Place_ID, iterate over the rows that share the same Place_ID, and add up their ratings. Once the last row for a Place_ID has been added, I want to check whether the average is greater than some value x and, if it is, push the Place_ID into an array.



Ex: Place_ID 1305: (2 + 1) / 2 = 1.5 > 1 ----> ids+=($id)

Place_ID 1276: (2 + 2) / 2 = 2 > 1 ----> ids+=($id)



This is what I have tried:



test5 () {
    id=0
    count=0
    rating=0
    ids=()
    ratings=()
    for i in `sort -t',' -k 2 ratings.csv`
    do
        aux=`echo "$i"| cut -f2 -d','`
        if (( $id != $aux )); then
            if (( $rating != 0 )); then
                rating=`echo "scale=1; $rating / $count" | bc -l`
                if (( $(echo "$rating >= 1" | bc -l) )); then
                    ids+=($id)
                    ratings+=($rating)
                fi
            fi
            id=$aux
            count=0
            rating=0
        else
            rating=$(($rating + `echo "$i"| cut -f3 -d','`))
            count=$(($count + 1))
        fi
    done

    echo ${#ids[@]}
    echo ${#ratings[@]}
}


EDIT: I think it works, but is there a way to make it better? Ideally something that doesn't need as many ifs or a manual count.



Thanks for the help.



































  • This sounds like a bad fit for Bash, but a very good fit for Awk.

    – tripleee
    Nov 16 '18 at 12:41











  • Your code seems to be meant for a comma-delimited file, not a pipe-delimited file like in your example data.

    – tripleee
    Nov 16 '18 at 12:47











  • I'm not allowed to use Awk. True, sorry, going to edit the data.

    – Roger Piera
    Nov 16 '18 at 12:50











  • Is specifically Awk off limits, or are you not allowed to use Python, Perl, Ruby etc either? That sounds like a really bad assignment; use the correct tool for the job.

    – tripleee
    Nov 16 '18 at 12:56






  • @RogerPiera I'm a bit confused: you say you are limited to bash, but you call sort. Why couldn't you also call awk from your script? It's just another utility (you could use it to fill ids and ratings).

    – David C. Rankin
    Nov 16 '18 at 13:21



























bash














edited Nov 16 '18 at 13:20







Roger Piera

















asked Nov 16 '18 at 12:23









Roger Piera



























1 Answer














Here is another option that uses fewer ifs:



#!/bin/bash

sum=()
count=()

while read -r line; do
    place=$(echo "$line" | cut -d',' -f2)
    rating=$(echo "$line" | cut -d',' -f3)

    sum[$place]=$(echo "$rating + ${sum[$place]-0}" | bc -l)
    count[$place]=$((count[$place] + 1))
done < <( sed 1d ratings.csv | sort -t',' -k 2 | tr -d '[:blank:]' )

ratings=()
for place in "${!sum[@]}"; do
    ratings[$place]=$(echo "scale=1; ${sum[$place]} / ${count[$place]}" | bc -l)
done

# ratings at this point has the ratings for each place
echo "${!ratings[@]}" # place ids
echo "${ratings[@]}" # ratings


I'm assuming your ratings.csv has a header row, which is why the pipeline starts with sed 1d ratings.csv.
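
The script above computes the averages but stops short of the threshold check from the question. Below is a minimal sketch of that last step, not a definitive implementation. It assumes integer ratings, numeric Place_IDs without leading zeros, and an integer threshold; the function name filter_places and its two arguments are made up for illustration. It uses only bash builtins (no cut, sed, or bc), and it tests sum > threshold * count instead of comparing a rounded average.

```shell
#!/usr/bin/env bash
# Sketch: collect the Place_IDs whose average rating exceeds a threshold.
# Assumptions: integer ratings, numeric Place_IDs, integer threshold.

# filter_places FILE THRESHOLD   (hypothetical helper, not from the question)
# Fills the global arrays ids and ratings.
filter_places() {
    local file=$1 threshold=$2
    local -a sum=() count=()
    local _header _user place rating avg10
    ids=()
    ratings=()

    {
        read -r _header    # discard the header row
        # Let read split each row on commas.
        while IFS=, read -r _user place rating; do
            # Remove the blanks that surround each field.
            place=${place//[[:blank:]]/}
            rating=${rating//[[:blank:]]/}
            [[ -n $place ]] || continue    # skip empty lines
            sum[$place]=$(( ${sum[$place]:-0} + rating ))
            count[$place]=$(( ${count[$place]:-0} + 1 ))
        done
    } < "$file"

    for place in "${!sum[@]}"; do
        # average > threshold  is the same as  sum > threshold * count
        if (( sum[$place] > threshold * count[$place] )); then
            ids+=("$place")
            # Average times 10, truncated, to print one decimal place.
            avg10=$(( sum[$place] * 10 / count[$place] ))
            ratings+=("$(( avg10 / 10 )).$(( avg10 % 10 ))")
        fi
    done
}
```

With the sample file from the question, calling filter_places ratings.csv 1 and then echo "${ids[@]}" should print 1276 1305 1789. Place 1985 averages exactly 1, so the strict comparison leaves it out; change > to >= if you want the behaviour of the >= 1 test in the original function. No sort is needed because the sums are keyed by Place_ID.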

































































        edited Nov 16 '18 at 13:32

























        answered Nov 16 '18 at 13:25









        ssemilla































