Replace value with the average of it's column - many columns











up vote
2
down vote

favorite












I have an excel sheet with over 1000 columns and 11000 rows - all with numeric data. Within the data, there are missing values represented with '*'.



I would like to replace all of the '*' values with the average of the column that it is in.



Doing this manually would take a long time, so is there a formula that would achieve this?



Thanks so much in advanced for any help.










share|improve this question


























    up vote
    2
    down vote

    favorite












    I have an excel sheet with over 1000 columns and 11000 rows - all with numeric data. Within the data, there are missing values represented with '*'.



    I would like to replace all of the '*' values with the average of the column that it is in.



    Doing this manually would take a long time, so is there a formula that would achieve this?



    Thanks so much in advanced for any help.










    share|improve this question
























      up vote
      2
      down vote

      favorite









      up vote
      2
      down vote

      favorite











      I have an excel sheet with over 1000 columns and 11000 rows - all with numeric data. Within the data, there are missing values represented with '*'.



      I would like to replace all of the '*' values with the average of the column that it is in.



      Doing this manually would take a long time, so is there a formula that would achieve this?



      Thanks so much in advanced for any help.










      share|improve this question













      I have an excel sheet with over 1000 columns and 11000 rows - all with numeric data. Within the data, there are missing values represented with '*'.



      I would like to replace all of the '*' values with the average of the column that it is in.



      Doing this manually would take a long time, so is there a formula that would achieve this?



      Thanks so much in advanced for any help.







      excel machine-learning






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 10 at 14:36









      Sam

      213




      213
























          2 Answers
          2






          active

          oldest

          votes

















          up vote
          1
          down vote













          I can give you a three sheet solution Sam?:



          Sheet 2:
          Cell A1=



          =AVERAGE(Sheet1!A:A)


          Paste that along the top row for each of 1000 columns in sheet 2.



          Sheet 3:
          Cell A1=



          =IF(Sheet1!A1="*",Sheet2!A$1,Sheet1!A1)


          Copy that and then paste it into the entire worksheet 3 (i.e., that top left corner symbol that allows you to do that). It's gonna take a while to update but will deliver what you want!






          share|improve this answer























          • thank you sooo much - that works brilliantly!!
            – Sam
            Nov 10 at 15:25










          • You're welcome :-) ...please could you mark it as answered then :-) link
            – RichardBJ
            Nov 10 at 15:29




















          up vote
          1
          down vote













          As you have mentioned machine learning I thought I would introduce you to how you could do this with Azure Machine Learning Studio (AML) using a free account.



          By using AML you gain access to a number of methods for replacing missing values which are extremely quick. AML has a Clean Missing Data module which exposes methods of replacement such as Multivariate Imputation using Chained Equation, Mean, Median and several others. The great thing here is you can visualize the dataset columns by right clicking on the dataset and see which columns have skew. You can then select on a column by column basis which replacement method to use. If you have heavily skewed columns you might use median instead for instance. This also offers great opportunities for data normalization (scale and reduce). You also gain access to using Python and R with your dataset.



          I don't know if there is a method for directly treating "*" as missing values, I am trying to find that out, but if you do a little processing in advance of load then all is fine. The step before loading requires:




          1. Export the sheet as a CSV and save it.

          2. Use Ctrl+ F to bring up the find and replace dialog and enter "~*" for Find and leave Replace blank




          Then login into AML and click the + New at the bottom of the screen





          Select New > DATASET > FROM LOCAL FILE and select your file





          When selecting type ensure to select CSV with no header if you data has no header row or with header if it does:





          Your dataset will start uploading as shown by progress bar at bottom of screen and then appear in the SAVED DATASETS collection.



          Click the + New button again and select EXPERIMENT > BLANK EXPERIMENT





          Drag and drop your saved dataset onto the canvas on the right:





          In the Search experiment items box on the right, type: Clean Missing Data
          then drag the module that appears onto the canvas





          Join the 2 boxes by clicking the dot at the bottom of the top box and dragging to the other box





          Select the bottom box and then input the following parameters on the right (here is where you can choose which method to apply for missing values e.g. replace missing with mean, or perhaps median if your column data is skewed.







          Right click the bottom module and select Run selected





          Right click again and select Cleaned dataset > Save as Dataset





          The progress bar at the bottom will inform you when complete



          Type in the Search experiment items box again: convert to csv and drag that onto the canvas and connect the left hand side bottom of the second module to the top of the newly added third:







          Select the bottom module and right click > Run selected



          Wait for the progress bar to complete.





          Right-click the bottom module and hit Download. Done.








          share|improve this answer























            Your Answer






            StackExchange.ifUsing("editor", function () {
            StackExchange.using("externalEditor", function () {
            StackExchange.using("snippets", function () {
            StackExchange.snippets.init();
            });
            });
            }, "code-snippets");

            StackExchange.ready(function() {
            var channelOptions = {
            tags: "".split(" "),
            id: "1"
            };
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function() {
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled) {
            StackExchange.using("snippets", function() {
            createEditor();
            });
            }
            else {
            createEditor();
            }
            });

            function createEditor() {
            StackExchange.prepareEditor({
            heartbeatType: 'answer',
            convertImagesToLinks: true,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: 10,
            bindNavPrevention: true,
            postfix: "",
            imageUploader: {
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            },
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            });


            }
            });














             

            draft saved


            draft discarded


















            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53239988%2freplace-value-with-the-average-of-its-column-many-columns%23new-answer', 'question_page');
            }
            );

            Post as a guest
































            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            1
            down vote













            I can give you a three sheet solution Sam?:



            Sheet 2:
            Cell A1=



            =AVERAGE(Sheet1!A:A)


            Paste that along the top row for each of 1000 columns in sheet 2.



            Sheet 3:
            Cell A1=



            =IF(Sheet1!A1="*",Sheet2!A$1,Sheet1!A1)


            Copy that and then paste it into the entire worksheet 3 (i.e., that top left corner symbol that allows you to do that). It's gonna take a while to update but will deliver what you want!






            share|improve this answer























            • thank you sooo much - that works brilliantly!!
              – Sam
              Nov 10 at 15:25










            • You're welcome :-) ...please could you mark it as answered then :-) link
              – RichardBJ
              Nov 10 at 15:29

















            up vote
            1
            down vote













            I can give you a three sheet solution Sam?:



            Sheet 2:
            Cell A1=



            =AVERAGE(Sheet1!A:A)


            Paste that along the top row for each of 1000 columns in sheet 2.



            Sheet 3:
            Cell A1=



            =IF(Sheet1!A1="*",Sheet2!A$1,Sheet1!A1)


            Copy that and then paste it into the entire worksheet 3 (i.e., that top left corner symbol that allows you to do that). It's gonna take a while to update but will deliver what you want!






            share|improve this answer























            • thank you sooo much - that works brilliantly!!
              – Sam
              Nov 10 at 15:25










            • You're welcome :-) ...please could you mark it as answered then :-) link
              – RichardBJ
              Nov 10 at 15:29















            up vote
            1
            down vote










            up vote
            1
            down vote









            I can give you a three sheet solution Sam?:



            Sheet 2:
            Cell A1=



            =AVERAGE(Sheet1!A:A)


            Paste that along the top row for each of 1000 columns in sheet 2.



            Sheet 3:
            Cell A1=



            =IF(Sheet1!A1="*",Sheet2!A$1,Sheet1!A1)


            Copy that and then paste it into the entire worksheet 3 (i.e., that top left corner symbol that allows you to do that). It's gonna take a while to update but will deliver what you want!






            share|improve this answer














            I can give you a three sheet solution Sam?:



            Sheet 2:
            Cell A1=



            =AVERAGE(Sheet1!A:A)


            Paste that along the top row for each of 1000 columns in sheet 2.



            Sheet 3:
            Cell A1=



            =IF(Sheet1!A1="*",Sheet2!A$1,Sheet1!A1)


            Copy that and then paste it into the entire worksheet 3 (i.e., that top left corner symbol that allows you to do that). It's gonna take a while to update but will deliver what you want!







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Nov 10 at 15:12

























            answered Nov 10 at 14:59









            RichardBJ

            1265




            1265












            • thank you sooo much - that works brilliantly!!
              – Sam
              Nov 10 at 15:25










            • You're welcome :-) ...please could you mark it as answered then :-) link
              – RichardBJ
              Nov 10 at 15:29




















            • thank you sooo much - that works brilliantly!!
              – Sam
              Nov 10 at 15:25










            • You're welcome :-) ...please could you mark it as answered then :-) link
              – RichardBJ
              Nov 10 at 15:29


















            thank you sooo much - that works brilliantly!!
            – Sam
            Nov 10 at 15:25




            thank you sooo much - that works brilliantly!!
            – Sam
            Nov 10 at 15:25












            You're welcome :-) ...please could you mark it as answered then :-) link
            – RichardBJ
            Nov 10 at 15:29






            You're welcome :-) ...please could you mark it as answered then :-) link
            – RichardBJ
            Nov 10 at 15:29














            up vote
            1
            down vote













            As you have mentioned machine learning I thought I would introduce you to how you could do this with Azure Machine Learning Studio (AML) using a free account.



            By using AML you gain access to a number of methods for replacing missing values which are extremely quick. AML has a Clean Missing Data module which exposes methods of replacement such as Multivariate Imputation using Chained Equation, Mean, Median and several others. The great thing here is you can visualize the dataset columns by right clicking on the dataset and see which columns have skew. You can then select on a column by column basis which replacement method to use. If you have heavily skewed columns you might use median instead for instance. This also offers great opportunities for data normalization (scale and reduce). You also gain access to using Python and R with your dataset.



            I don't know if there is a method for directly treating "*" as missing values, I am trying to find that out, but if you do a little processing in advance of load then all is fine. The step before loading requires:




            1. Export the sheet as a CSV and save it.

            2. Use Ctrl+ F to bring up the find and replace dialog and enter "~*" for Find and leave Replace blank




            Then login into AML and click the + New at the bottom of the screen





            Select New > DATASET > FROM LOCAL FILE and select your file





            When selecting type ensure to select CSV with no header if you data has no header row or with header if it does:





            Your dataset will start uploading as shown by progress bar at bottom of screen and then appear in the SAVED DATASETS collection.



            Click the + New button again and select EXPERIMENT > BLANK EXPERIMENT





            Drag and drop your saved dataset onto the canvas on the right:





            In the Search experiment items box on the right, type: Clean Missing Data
            then drag the module that appears onto the canvas





            Join the 2 boxes by clicking the dot at the bottom of the top box and dragging to the other box





            Select the bottom box and then input the following parameters on the right (here is where you can choose which method to apply for missing values e.g. replace missing with mean, or perhaps median if your column data is skewed.







            Right click the bottom module and select Run selected





            Right click again and select Cleaned dataset > Save as Dataset





            The progress bar at the bottom will inform you when complete



            Type in the Search experiment items box again: convert to csv and drag that onto the canvas and connect the left hand side bottom of the second module to the top of the newly added third:







            Select the bottom module and right click > Run selected



            Wait for the progress bar to complete.





            Right-click the bottom module and hit Download. Done.








            share|improve this answer



























              up vote
              1
              down vote













              As you have mentioned machine learning I thought I would introduce you to how you could do this with Azure Machine Learning Studio (AML) using a free account.



              By using AML you gain access to a number of methods for replacing missing values which are extremely quick. AML has a Clean Missing Data module which exposes methods of replacement such as Multivariate Imputation using Chained Equation, Mean, Median and several others. The great thing here is you can visualize the dataset columns by right clicking on the dataset and see which columns have skew. You can then select on a column by column basis which replacement method to use. If you have heavily skewed columns you might use median instead for instance. This also offers great opportunities for data normalization (scale and reduce). You also gain access to using Python and R with your dataset.



              I don't know if there is a method for directly treating "*" as missing values, I am trying to find that out, but if you do a little processing in advance of load then all is fine. The step before loading requires:




              1. Export the sheet as a CSV and save it.

              2. Use Ctrl+ F to bring up the find and replace dialog and enter "~*" for Find and leave Replace blank




              Then login into AML and click the + New at the bottom of the screen





              Select New > DATASET > FROM LOCAL FILE and select your file





              When selecting type ensure to select CSV with no header if you data has no header row or with header if it does:





              Your dataset will start uploading as shown by progress bar at bottom of screen and then appear in the SAVED DATASETS collection.



              Click the + New button again and select EXPERIMENT > BLANK EXPERIMENT





              Drag and drop your saved dataset onto the canvas on the right:





              In the Search experiment items box on the right, type: Clean Missing Data
              then drag the module that appears onto the canvas





              Join the 2 boxes by clicking the dot at the bottom of the top box and dragging to the other box





              Select the bottom box and then input the following parameters on the right (here is where you can choose which method to apply for missing values e.g. replace missing with mean, or perhaps median if your column data is skewed.







              Right click the bottom module and select Run selected





              Right click again and select Cleaned dataset > Save as Dataset





              The progress bar at the bottom will inform you when complete



              Type in the Search experiment items box again: convert to csv and drag that onto the canvas and connect the left hand side bottom of the second module to the top of the newly added third:







              Select the bottom module and right click > Run selected



              Wait for the progress bar to complete.





              Right-click the bottom module and hit Download. Done.








              share|improve this answer

























                up vote
                1
                down vote










                up vote
                1
                down vote









                As you have mentioned machine learning I thought I would introduce you to how you could do this with Azure Machine Learning Studio (AML) using a free account.



                By using AML you gain access to a number of methods for replacing missing values which are extremely quick. AML has a Clean Missing Data module which exposes methods of replacement such as Multivariate Imputation using Chained Equation, Mean, Median and several others. The great thing here is you can visualize the dataset columns by right clicking on the dataset and see which columns have skew. You can then select on a column by column basis which replacement method to use. If you have heavily skewed columns you might use median instead for instance. This also offers great opportunities for data normalization (scale and reduce). You also gain access to using Python and R with your dataset.



                I don't know if there is a method for directly treating "*" as missing values, I am trying to find that out, but if you do a little processing in advance of load then all is fine. The step before loading requires:




                1. Export the sheet as a CSV and save it.

                2. Use Ctrl+ F to bring up the find and replace dialog and enter "~*" for Find and leave Replace blank




                Then login into AML and click the + New at the bottom of the screen





                Select New > DATASET > FROM LOCAL FILE and select your file





                When selecting type ensure to select CSV with no header if you data has no header row or with header if it does:





                Your dataset will start uploading as shown by progress bar at bottom of screen and then appear in the SAVED DATASETS collection.



                Click the + New button again and select EXPERIMENT > BLANK EXPERIMENT





                Drag and drop your saved dataset onto the canvas on the right:





                In the Search experiment items box on the right, type: Clean Missing Data
                then drag the module that appears onto the canvas





                Join the 2 boxes by clicking the dot at the bottom of the top box and dragging to the other box





                Select the bottom box and then input the following parameters on the right (here is where you can choose which method to apply for missing values e.g. replace missing with mean, or perhaps median if your column data is skewed.







                Right click the bottom module and select Run selected





                Right click again and select Cleaned dataset > Save as Dataset





                The progress bar at the bottom will inform you when complete



                Type in the Search experiment items box again: convert to csv and drag that onto the canvas and connect the left hand side bottom of the second module to the top of the newly added third:







                Select the bottom module and right click > Run selected



                Wait for the progress bar to complete.





                Right-click the bottom module and hit Download. Done.








                share|improve this answer














                As you have mentioned machine learning I thought I would introduce you to how you could do this with Azure Machine Learning Studio (AML) using a free account.



                By using AML you gain access to a number of methods for replacing missing values which are extremely quick. AML has a Clean Missing Data module which exposes methods of replacement such as Multivariate Imputation using Chained Equation, Mean, Median and several others. The great thing here is you can visualize the dataset columns by right clicking on the dataset and see which columns have skew. You can then select on a column by column basis which replacement method to use. If you have heavily skewed columns you might use median instead for instance. This also offers great opportunities for data normalization (scale and reduce). You also gain access to using Python and R with your dataset.



                I don't know if there is a method for directly treating "*" as missing values, I am trying to find that out, but if you do a little processing in advance of load then all is fine. The step before loading requires:




                1. Export the sheet as a CSV and save it.

                2. Use Ctrl+ F to bring up the find and replace dialog and enter "~*" for Find and leave Replace blank




                Then login into AML and click the + New at the bottom of the screen





                Select New > DATASET > FROM LOCAL FILE and select your file





                When selecting type ensure to select CSV with no header if you data has no header row or with header if it does:





                Your dataset will start uploading as shown by progress bar at bottom of screen and then appear in the SAVED DATASETS collection.



                Click the + New button again and select EXPERIMENT > BLANK EXPERIMENT





                Drag and drop your saved dataset onto the canvas on the right:





                In the Search experiment items box on the right, type: Clean Missing Data
                then drag the module that appears onto the canvas





                Join the 2 boxes by clicking the dot at the bottom of the top box and dragging to the other box





                Select the bottom box and then input the following parameters on the right (here is where you can choose which method to apply for missing values e.g. replace missing with mean, or perhaps median if your column data is skewed.







                Right click the bottom module and select Run selected





                Right click again and select Cleaned dataset > Save as Dataset





                The progress bar at the bottom will inform you when complete



                Type in the Search experiment items box again: convert to csv and drag that onto the canvas and connect the left hand side bottom of the second module to the top of the newly added third:







                Select the bottom module and right click > Run selected



                Wait for the progress bar to complete.





                Right-click the bottom module and hit Download. Done.









                share|improve this answer














                share|improve this answer



                share|improve this answer








                edited 2 days ago

























                answered Nov 10 at 19:07









                QHarr

                25.3k81839




                25.3k81839






























                     

                    draft saved


                    draft discarded



















































                     


                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function () {
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53239988%2freplace-value-with-the-average-of-its-column-many-columns%23new-answer', 'question_page');
                    }
                    );

                    Post as a guest




















































































                    Popular posts from this blog

                    Bressuire

                    Vorschmack

                    Quarantine