Dimensions problem: linear regression with Python scikit-learn



























I'm implementing a function in which I have to perform a linear regression using scikit-learn.



These are the shapes when I run it on an example:



X_train.shape=(34,3)
X_test.shape=(12,3)
Y_train.shape=(34,1)
Y_test.shape=(12,1)


Then



lm.fit(X_train,Y_train)
Y_pred = lm.predict(X_test)


However, Python reports an error at this line:



 dico['R2 value']=lm.score(Y_test, Y_pred)


What Python tells me:



 ValueError: shapes (12,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)


Thanks in advance for the help anyone could bring me :)



Alex










python-2.7 scikit-learn linear-regression






edited Nov 16 '18 at 11:42 by Vivek Kumar
      asked Nov 16 '18 at 11:36









Alex
























1 Answer




















          To use lm.score(), you need to pass X_test and Y_test:



          dico['R2 value'] = lm.score(X_test, Y_test)


          See the documentation for score():




          score(X, y, sample_weight=None)



          X : array-like, shape = (n_samples, n_features). Test samples.
          For some estimators this may be a precomputed kernel matrix instead,
          shape = (n_samples, n_samples_fitted), where n_samples_fitted is the
          number of samples used in fitting the estimator.

          y : array-like, shape = (n_samples,) or (n_samples, n_outputs). True values for X.

          sample_weight : array-like, shape = (n_samples,), optional. Sample weights.



          You are trying to use the score method as a metric function, which is wrong. An estimator's score() method computes the predictions itself and then passes them to the appropriate metric scorer.



          If you want to use Y_test and Y_pred yourself, then you can do this:



          from sklearn.metrics import r2_score
          dico['R2 value'] = r2_score(Y_test, Y_pred)
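          To make the equivalence concrete, here is a minimal runnable sketch using random data with the same shapes as in the question (the data itself is made up for illustration): the estimator's score(X_test, Y_test) and the metric r2_score(Y_test, Y_pred) give the same number.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data with the shapes from the question: (34, 3) / (12, 3).
rng = np.random.RandomState(0)
X_train, X_test = rng.rand(34, 3), rng.rand(12, 3)
coef = rng.rand(3, 1)
Y_train, Y_test = X_train @ coef, X_test @ coef  # noiseless linear targets

lm = LinearRegression().fit(X_train, Y_train)
Y_pred = lm.predict(X_test)

s_estimator = lm.score(X_test, Y_test)  # predicts internally, then scores
s_metric = r2_score(Y_test, Y_pred)     # scores precomputed predictions
assert np.isclose(s_estimator, s_metric)
```

          Both paths produce the same R² because score() is just predict() followed by r2_score() for regressors.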





          answered Nov 16 '18 at 11:42 by Vivek Kumar


























          • Thanks a lot for your help! It seems I was a bit confused :) However, now I don't get why the R² score is so low (0.11) when the dataset I used is the iris one...

            – Alex
            Nov 16 '18 at 11:58











          • @Alex Iris is a classification dataset and you are using a regression model (LinearRegression with R-squared), hence it is not working. Use models that have Classifier in their names.

            – Vivek Kumar
            Nov 16 '18 at 12:04











          • Hmm, I don't see why, because I only kept the setosa type of iris so that the regression would make sense. My features were SepalLengthCm, SepalWidthCm, PetalLengthCm and I wanted to predict PetalWidthCm. So why wouldn't the linear regression be legitimate?

            – Alex
            Nov 16 '18 at 12:59











          • @Alex Well, in that case the regression makes sense. But then you need to consider whether it actually makes sense to predict petal width from the other features. Regression will only perform well if the dependent variable (petal width in this case) actually depends on the other variables, which I don't think it does.

            – Vivek Kumar
            Nov 19 '18 at 10:02
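          The setup being discussed can be reproduced with sklearn's built-in iris data; this is a minimal sketch of the setosa-only regression (it prints the in-sample R² rather than claiming any particular value):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression

iris = load_iris()
setosa = iris.data[iris.target == 0]  # class 0 is setosa
X = setosa[:, :3]                     # sepal length, sepal width, petal length
y = setosa[:, 3]                      # petal width, the target
lm = LinearRegression().fit(X, y)
print(lm.score(X, y))                 # in-sample R^2
```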











          • One last question: can I still use LinearRegression from sklearn if there are also some nominal/ordinal features? (Obviously I would encode them before performing the regression.)

            – Alex
            Nov 22 '18 at 9:21
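          Yes, that works: nominal features can be one-hot encoded before fitting. A minimal sketch with a hypothetical toy frame (the 'color' column and all values are invented for illustration), using pandas' get_dummies for the encoding:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data: 'color' is nominal, 'width' is the target.
df = pd.DataFrame({
    "length": [1.4, 1.3, 4.7, 4.5, 5.1, 1.5],
    "color":  ["red", "blue", "blue", "red", "blue", "red"],
    "width":  [0.2, 0.2, 1.4, 1.5, 1.9, 0.1],
})
# One-hot encode 'color'; drop_first avoids a redundant dummy column.
X = pd.get_dummies(df[["length", "color"]], drop_first=True)
y = df["width"]
lm = LinearRegression().fit(X, y)
print(lm.score(X, y))
```

          sklearn's OneHotEncoder does the same job inside a Pipeline if the encoding needs to be refit on new data.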











