Combine imputed and non-imputed data




















I have a question about merging datasets after multiple imputation. I have created an example to explain my problem:



id <- c(1,2,3,4,5,6,7,8,9,10)
age <- c(60,NA,90,55,60,61,77,67,88,90)
bmi <- c(30,NA,NA,23,24,NA,27,23,26,21)
time <- c(62,88,85,NA,68,62,89,62,70,99)
dat <- data.frame(id, age, bmi, time)
dat

id <- c(1,2,3,4,5,6,7,8,9,10)
m1 <- c(60,78,90,55,60,61,77,67,88,90)
m2 <- c(30,44,35,23,24,22,27,23,26,21)
m3 <- c(62,88,85,78,68,62,89,62,70,99)
dat2 <- data.frame(id, m1, m2, m3)
dat2


I have two datasets, dat and dat2. The dataset dat contains missing values, so I impute it with multiple imputation (using the mice package):



library(mice)
impdat <- mice(dat, maxit = 0)
methdat <- impdat$method
preddat <- impdat$predictorMatrix
preddat["id",] <- 0
preddat[,"id"] <- 0
impdat <- mice(dat, method = methdat, predictorMatrix = preddat,
               seed = 2018, maxit = 10, m = 5)


Now I want to merge the imputed dataset impdat with the dataset dat2, and that is where my problem arises. I tried the following:



completedat <- complete(impdat, include = TRUE, action = 'long')
finaldat <- merge(completedat, dat2, by = "id")

finaldat <- as.mids(finaldat)
Error in `[<-.data.frame`(`*tmp*`, j, value = c(61, 88)) : replacement has 2 rows, data has 1


However, this gives me the error shown above. The merge itself succeeds, and the merged data frame finaldat looks the way I want. The problem is that I cannot convert it back to a mids object.
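One plausible explanation for the error, offered as a sketch rather than a confirmed diagnosis: merge() sorts its result on the key column by default, so the rows are no longer grouped by .imp, which is the layout that complete(..., action = 'long') produces and that as.mids() appears to rely on. The toy data frame below is a hand-built stand-in for the long format (its values are made up) and needs only base R.

```r
# Stand-in for complete(impdat, action = "long", include = TRUE):
# rows grouped by .imp (0 = original data), then ordered by .id.
long <- data.frame(
  .imp = rep(0:2, each = 3),
  .id  = rep(1:3, times = 3),
  id   = rep(1:3, times = 3),
  age  = c(60, NA, 90,  60, 78, 90,  60, 61, 90)
)
extra <- data.frame(id = 1:3, m1 = c(60, 78, 90))

# merge() sorts on the key, so the .imp blocks end up interleaved.
merged <- merge(long, extra, by = "id")
is.unsorted(merged$.imp)

# Re-sorting by .imp and .id restores the block layout.
restored <- merged[order(merged$.imp, merged$.id), ]
rownames(restored) <- NULL
```

If the row order is indeed the cause, applying the same re-sort to the merged data before calling as.mids() would be the thing to try.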



I know I can add the variables from dat2 one by one. That does work:



completedat <- complete(impdat, include = TRUE, action = 'long')
completedat$m1 <- dat2$m1
finaldat2 <- as.mids(completedat)


In this example that is fine, because dat2 has only 4 variables. In my real data I have approximately 200 variables that I want to add to my multiply imputed dataset, so I hope there is an easier way to add them all at once. Can somebody help me?
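A minimal sketch of how the one-by-one assignment could be generalised to any number of columns, assuming every id in the long data also occurs in dat2. The helper name add_columns_by_key is hypothetical (it is not part of mice), and the small data frames are made-up stand-ins so that the block runs with base R alone. Because the columns are matched on id and assigned in place, the row order of the long data is left untouched.

```r
# Hypothetical helper (not part of mice): append every non-key column of
# `extra` to a long-format data frame, matching rows on `key`.
add_columns_by_key <- function(long, extra, key = "id") {
  new_cols <- setdiff(names(extra), key)
  idx <- match(long[[key]], extra[[key]])  # row of `extra` for each row of `long`
  long[new_cols] <- extra[idx, new_cols, drop = FALSE]
  long
}

# Small stand-in for complete(impdat, action = "long", include = TRUE)
long <- data.frame(
  .imp = rep(0:1, each = 3),
  .id  = rep(1:3, times = 2),
  id   = rep(1:3, times = 2),
  age  = c(60, NA, 90, 60, 78, 90)
)
# Extra variables, deliberately in a different row order
dat2 <- data.frame(id = c(3, 1, 2), m1 = c(90, 60, 78), m2 = c(35, 30, 44))

finaldat <- add_columns_by_key(long, dat2)
finaldat
```

With the real data, the same call on completedat and dat2 would add all columns in one step, after which as.mids(finaldat) can be tried exactly as in the single-column version.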










      r merge r-mice
















      asked Nov 16 '18 at 13:11









      Anna_70

























          1 Answer














          Wouldn't cbind work provided that you want to combine imputed and non-imputed data?



          id <- c(1,2,3,4,5,6,7,8,9,10)
          age <- c(60,NA,90,55,60,61,77,67,88,90)
          bmi <- c(30,NA,NA,23,24,NA,27,23,26,21)
          time <- c(62,88,85,NA,68,62,89,62,70,99)
          dat <- data.frame(id, age, bmi, time)
          dat

          id <- c(1,2,3,4,5,6,7,8,9,10)
          m1 <- c(60,78,90,55,60,61,77,67,88,90)
          m2 <- c(30,44,35,23,24,22,27,23,26,21)
          m3 <- c(62,88,85,78,68,62,89,62,70,99)
          dat2 <- data.frame(id, m1, m2, m3)
          dat2

          # install.packages("mice")
          library(mice)
          impdat <- mice(dat,
                         seed = 2018,
                         maxit = 10,
                         m = 5)
          impdat
          # Class: mids
          # Number of multiple imputations: 5
          # Imputation methods:
          #    id   age   bmi  time
          #    "" "pmm" "pmm" "pmm"
          # PredictorMatrix:
          #      id age bmi time
          # id    0   1   1    1
          # age   1   0   1    1
          # bmi   1   1   0    1
          # time  1   1   1    0

          impdat <- complete(impdat)
          impdat

          #    id age bmi time
          # 1   1  60  30   62
          # 2   2  60  24   88
          # 3   3  90  24   85
          # 4   4  55  23   89
          # 5   5  60  24   68
          # 6   6  61  24   62
          # 7   7  77  27   89
          # 8   8  67  23   62
          # 9   9  88  26   70
          # 10 10  90  21   99

          final_data <- cbind(impdat, dat2)
          final_data
          #    id age bmi time id m1 m2 m3
          # 1   1  60  30   62  1 60 30 62
          # 2   2  60  24   88  2 78 44 88
          # 3   3  90  24   85  3 90 35 85
          # 4   4  55  23   89  4 55 23 78
          # 5   5  60  24   68  5 60 24 68
          # 6   6  61  24   62  6 61 22 62
          # 7   7  77  27   89  7 77 27 89
          # 8   8  67  23   62  8 67 23 62
          # 9   9  88  26   70  9 88 26 70
          # 10 10  90  21   99 10 90 21 99
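Two things are worth noting about the cbind() result above. First, complete() with its default arguments returns only the first completed data set, not all five. Second, cbind() pairs rows by position and keeps both id columns. The sketch below uses small made-up stand-ins (not the imputed values above) to contrast that with a keyed merge(), which matches rows on id and keeps a single id column.

```r
# Stand-ins for one completed data set and the extra variables
completed <- data.frame(id = 1:3, age = c(60, 60, 90))
dat2      <- data.frame(id = 1:3, m1 = c(60, 78, 90))

# Positional bind: the key column is duplicated
both <- cbind(completed, dat2)
names(both)

# Keyed join: one id column, rows matched on id rather than on position
joined <- merge(completed, dat2, by = "id")
names(joined)
```

For a single completed data set either form works when both data frames are in the same row order; the keyed join is the safer choice when that is not guaranteed.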


                answered Nov 16 '18 at 14:08









                kon_u
































