Dimensions problem: linear regression with Python scikit-learn
I'm implementing a function that performs a linear regression using scikit-learn.
Here are the array shapes when I run it on an example:
X_train.shape=(34,3)
X_test.shape=(12,3)
Y_train.shape=(34,1)
Y_test.shape=(12,1)
Then
lm.fit(X_train,Y_train)
Y_pred = lm.predict(X_test)
However, Python reports an error at this line:
dico['R2 value']=lm.score(Y_test, Y_pred)
The error message:
ValueError: shapes (12,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)
Thanks in advance for the help anyone could bring me :)
Alex
python-2.7 scikit-learn linear-regression
edited Nov 16 '18 at 11:42 by Vivek Kumar · asked Nov 16 '18 at 11:36 by Alex
1 Answer
To use lm.score(), you need to pass X_test and Y_test:
dico['R2 value'] = lm.score(X_test, Y_test)
See the score documentation:
score(X, y, sample_weight=None)
X : array-like, shape = (n_samples, n_features). Test samples. For some estimators this may be a precomputed kernel matrix instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in fitting the estimator.
y : array-like, shape = (n_samples) or (n_samples, n_outputs). True values for X.
sample_weight : array-like, shape = (n_samples), optional. Sample weights.
You are trying to use the score method as a metric method, which is wrong. The score() method on any estimator will itself compute the predictions and then pass them to the appropriate metric scorer. If you want to use Y_test and Y_pred yourself, you can do this:
from sklearn.metrics import r2_score
dico['R2 value'] = r2_score(Y_test, Y_pred)
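The equivalence of the two approaches can be sketched as follows. The data here is synthetic (generated only to match the shapes from the question); only the fit/predict/score calls mirror the original code:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data matching the shapes in the question: 3 features,
# 34 training rows, 12 test rows (the values themselves are made up).
rng = np.random.default_rng(0)
coef = np.array([[1.0], [2.0], [-0.5]])
X_train = rng.normal(size=(34, 3))
Y_train = X_train @ coef + rng.normal(scale=0.1, size=(34, 1))
X_test = rng.normal(size=(12, 3))
Y_test = X_test @ coef + rng.normal(scale=0.1, size=(12, 1))

lm = LinearRegression()
lm.fit(X_train, Y_train)
Y_pred = lm.predict(X_test)

# Correct: score() takes features and true targets, and predicts internally.
r2_a = lm.score(X_test, Y_test)
# Equivalent: compute the predictions yourself and call the metric function.
r2_b = r2_score(Y_test, Y_pred)
```

Passing (Y_test, Y_pred) to lm.score() fails because score() treats its first argument as a feature matrix: it tries to multiply the (12, 1) array by the fitted (3, 1) coefficient matrix, which is exactly the shape mismatch in the traceback.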
Thanks a lot for your help! Seems I was a bit confused :) However, now I don't get why the R² score is really low (0.11), whereas the dataset I used is the iris one... – Alex, Nov 16 '18 at 11:58
@Alex Iris is a classification dataset and you are using a regression model (LinearRegression with R-squared), hence it's not working. Use models which have Classifier in their names. – Vivek Kumar, Nov 16 '18 at 12:04
Hmm, I don't see why, because I only kept the setosa type of iris so that the regression would make sense. My features were SepalLengthCm, SepalWidthCm, PetalLengthCm and I wanted to predict PetalWidthCm. So why wouldn't the linear regression be legitimate? – Alex, Nov 16 '18 at 12:59
@Alex Well, in that case the regression makes sense. But then you need to consider whether it actually makes sense to predict petal width from the other features. Regression will only perform well if the dependent variable (petal width in this case) actually depends on the other variables, which I don't think it does. – Vivek Kumar, Nov 19 '18 at 10:02
One last question: can I still use LinearRegression from sklearn if there are also some nominal/ordinal features? (Obviously, I would encode them before performing the regression.) – Alex, Nov 22 '18 at 9:21
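Yes: the usual approach is to one-hot encode the nominal column and stack the resulting 0/1 columns next to the numeric features. A minimal sketch, using entirely made-up data (the column names and values are hypothetical, not from the question):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: two numeric features plus one nominal "color" feature.
num = np.array([[5.1, 3.5], [4.9, 3.0], [4.7, 3.2],
                [4.6, 3.1], [5.0, 3.6], [5.4, 3.9]])
color = np.array([["red"], ["blue"], ["red"],
                  ["green"], ["blue"], ["green"]])
y = np.array([0.2, 0.2, 0.2, 0.2, 0.4, 0.4])

# One-hot encode the nominal column, then stack it next to the numeric ones.
enc = OneHotEncoder(handle_unknown="ignore")
color_ohe = enc.fit_transform(color).toarray()  # dense 0/1 matrix, one column per category
X = np.hstack([num, color_ohe])

lm = LinearRegression().fit(X, y)
score = lm.score(X, y)  # training R-squared
```

In newer scikit-learn releases, a Pipeline with a ColumnTransformer is a cleaner way to bundle the encoding and the regression into a single estimator.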
answered Nov 16 '18 at 11:42, edited Nov 16 '18 at 11:48, by Vivek Kumar