Dimensions problem: linear regression with Python scikit-learn



























I'm implementing a function in which I have to perform a linear regression using scikit-learn.



These are the shapes when I run it on an example:



X_train.shape=(34,3)
X_test.shape=(12,3)
Y_train.shape=(34,1)
Y_test.shape=(12,1)


Then



lm.fit(X_train,Y_train)
Y_pred = lm.predict(X_test)


However, Python reports an error at this line:



 dico['R2 value']=lm.score(Y_test, Y_pred)


What Python tells me:



 ValueError: shapes (12,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)


Thanks in advance for the help anyone could bring me :)



Alex










python-2.7 scikit-learn linear-regression






edited Nov 16 '18 at 11:42 by Vivek Kumar
      asked Nov 16 '18 at 11:36









Alex
























1 Answer




















          To use lm.score(), you need to pass X_test and Y_test:



          dico['R2 value'] = lm.score(X_test, Y_test)


          See the documentation for score():




          score(X, y, sample_weight=None)



          X : array-like, shape = (n_samples, n_features). Test samples.
          For some estimators this may be a precomputed kernel matrix instead,
          shape = (n_samples, n_samples_fitted), where n_samples_fitted is the
          number of samples used in fitting the estimator.

          y : array-like, shape = (n_samples,) or (n_samples, n_outputs). True values for X.

          sample_weight : array-like, shape = (n_samples,), optional. Sample weights.



          You are trying to use the score method as a metric function, which is wrong. An estimator's score() method computes the predictions itself and then passes them to the appropriate metric scorer.



          If you want to use Y_test and Y_pred yourself, then you can do this:



          from sklearn.metrics import r2_score
          dico['R2 value'] = r2_score(Y_test, Y_pred)
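          To make the equivalence concrete, here is a minimal runnable sketch using random data with the same shapes as in the question (the data itself is made up for illustration): the estimator's score(X_test, Y_test) and the metric r2_score(Y_test, Y_pred) give the same number.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data with the shapes from the question: (34, 3) / (12, 3).
rng = np.random.RandomState(0)
X_train, X_test = rng.rand(34, 3), rng.rand(12, 3)
coef = rng.rand(3, 1)
Y_train, Y_test = X_train @ coef, X_test @ coef  # noiseless linear targets

lm = LinearRegression().fit(X_train, Y_train)
Y_pred = lm.predict(X_test)

s_estimator = lm.score(X_test, Y_test)  # predicts internally, then scores
s_metric = r2_score(Y_test, Y_pred)     # scores precomputed predictions
assert np.isclose(s_estimator, s_metric)
```

          Both paths produce the same R² because score() is just predict() followed by r2_score() for regressors.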





          answered Nov 16 '18 at 11:42 by Vivek Kumar


























          • Thanks a lot for your help! It seems I was a bit confused :) However, now I don't get why the R² score is so low (0.11) when the dataset I used is the iris one...

            – Alex
            Nov 16 '18 at 11:58











          • @Alex Iris is a classification dataset and you are using a regression model (LinearRegression with R-squared), hence it is not working. Use models that have Classifier in their names.

            – Vivek Kumar
            Nov 16 '18 at 12:04











          • Hmm, I don't see why, because I only kept the setosa type of iris so that the regression would make sense. My features were SepalLengthCm, SepalWidthCm, PetalLengthCm and I wanted to predict PetalWidthCm. So why wouldn't the linear regression be legitimate?

            – Alex
            Nov 16 '18 at 12:59











          • @Alex Well, in that case the regression makes sense. But then you need to consider whether it actually makes sense to predict petal width from the other features. Regression will only perform well if the dependent variable (petal width in this case) actually depends on the other variables, which I don't think it does.

            – Vivek Kumar
            Nov 19 '18 at 10:02
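          The setup being discussed can be reproduced with sklearn's built-in iris data; this is a minimal sketch of the setosa-only regression (it prints the in-sample R² rather than claiming any particular value):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression

iris = load_iris()
setosa = iris.data[iris.target == 0]  # class 0 is setosa
X = setosa[:, :3]                     # sepal length, sepal width, petal length
y = setosa[:, 3]                      # petal width, the target
lm = LinearRegression().fit(X, y)
print(lm.score(X, y))                 # in-sample R^2
```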











          • One last question: can I still use LinearRegression from sklearn if there are also some nominal/ordinal features? (Obviously I would encode them before performing the regression.)

            – Alex
            Nov 22 '18 at 9:21
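          Yes, that works: nominal features can be one-hot encoded before fitting. A minimal sketch with a hypothetical toy frame (the 'color' column and all values are invented for illustration), using pandas' get_dummies for the encoding:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data: 'color' is nominal, 'width' is the target.
df = pd.DataFrame({
    "length": [1.4, 1.3, 4.7, 4.5, 5.1, 1.5],
    "color":  ["red", "blue", "blue", "red", "blue", "red"],
    "width":  [0.2, 0.2, 1.4, 1.5, 1.9, 0.1],
})
# One-hot encode 'color'; drop_first avoids a redundant dummy column.
X = pd.get_dummies(df[["length", "color"]], drop_first=True)
y = df["width"]
lm = LinearRegression().fit(X, y)
print(lm.score(X, y))
```

          sklearn's OneHotEncoder does the same job inside a Pipeline if the encoding needs to be refit on new data.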











