automatic variable selection
I have a data set with the following columns:
Acres, FamilyType, NumBedrooms, NumChildren, NumPeople, NumRooms, NumUnits, NumVehicles, NumWorkers, OwnRent, YearBuilt, HouseCosts, ElectricBill, FoodStamp, HeatingFuel, Insurance, Language, above_150K
I ran:
fit <- glm(above_150K ~ Acres + FamilyType + NumBedrooms + NumChildren + NumPeople + NumRooms + NumUnits + NumVehicles + NumWorkers + OwnRent + YearBuilt + HouseCosts + ElectricBill + FoodStamp + HeatingFuel + Insurance + Language, data = df)
summary(fit)
The output breaks several columns down further into sub-columns, like below:
Term                      Abbreviation
Acres10+                  A
AcresSub 1                A1
FamilyTypeMale Head       FH
FamilyTypeMarried         FT
NumBedrooms               NB
NumChildren               NC
NumPeople                 NP
NumRooms                  NR
NumUnitsSingle attached   Na
NumUnitsSingle detached   Nd
NumVehicles               NV
NumWorkers                NW
OwnRentOutright           ORO
OwnRentRented             ORR
YearBuilt1940-1949        YB194
YearBuilt1950-1959        YB195
YearBuilt1960-1969        YB196
YearBuilt1970-1979        YB197
YearBuilt1980-1989        YB198
YearBuilt1990-1999        YB199
YearBuilt2000-2004        YB2000
YearBuilt2005             YB2005
YearBuilt2006             YB2006
YearBuilt2007             YB2007
YearBuilt2008             YB2008
YearBuilt2009             YB2009
YearBuilt2010             YB201
YearBuiltBefore 1939      Y1
HouseCosts                HC
ElectricBill              E
FoodStampYes              FS
HeatingFuelElectricity    HFE
HeatingFuelGas            HFG
HeatingFuelNone           HFN
HeatingFuelOil            HtngFlOl
HeatingFuelOther          HtngFlOt
HeatingFuelSolar          HFS
HeatingFuelWood           HFW
Insurance                 I
LanguageEnglish           LnE
LanguageOther             LO
LanguageOther European    LOE
LanguageSpanish           LS
As you can see, a single column such as HeatingFuel is broken down into:
HeatingFuelElectricity HFE
HeatingFuelGas HFG
HeatingFuelNone HFN
HeatingFuelOil HtngFlOl
HeatingFuelOther HtngFlOt
HeatingFuelSolar HFS
HeatingFuelWood HFW
Why is this happening?
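For illustration only, here is a minimal sketch of the behaviour described above, on made-up data (the values and the "Coal" level are assumptions, not taken from the asker's data set). It suggests that R expands a factor column into one indicator (dummy) column per level, except for a baseline level, when it builds the design matrix for a model formula.

```r
# Toy data: column names mimic the question, values are invented.
toy <- data.frame(
  HeatingFuel = factor(c("Coal", "Gas", "Oil", "Gas", "Wood", "Coal")),
  NumRooms    = c(4, 6, 5, 7, 3, 8)
)

# Factor levels are sorted alphabetically; the first one is the baseline.
levels(toy$HeatingFuel)

# model.matrix() builds the design matrix for a formula, using the
# default treatment contrasts for factors.
X <- model.matrix(~ HeatingFuel + NumRooms, data = toy)

# One column per non-baseline level, named <column><level>;
# the numeric column NumRooms stays a single column.
colnames(X)
```

Under this reading, each "sub-column" coefficient is the effect of that level relative to the baseline level, which is the one level that does not appear in the output.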
I want to select the variables for predicting above_150K. I tried stepwise and all-subsets automatic variable selection, and both suggest using all the variables. Could someone please clarify?
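As a hedged sketch of one way to run the stepwise part, the example below uses simulated data (the names `fuel`, `rooms`, `noise` and `above` are hypothetical stand-ins, not the asker's columns). It fits a logistic regression with `family = binomial` and then uses `drop1()` and `step()` from base R, both of which work on whole model terms, so a factor is kept or dropped as a unit rather than one dummy column at a time.

```r
set.seed(1)
n <- 200

# Simulated predictors: one factor and two numeric columns.
toy <- data.frame(
  fuel  = factor(sample(c("Gas", "Oil", "Wood"), n, replace = TRUE)),
  rooms = rpois(n, 5),
  noise = rnorm(n)
)

# Simulated binary outcome that depends only on rooms.
toy$above <- rbinom(n, 1, plogis(-2.5 + 0.5 * toy$rooms))

# family = binomial requests a logistic regression;
# without it, glm() defaults to a Gaussian (ordinary linear) model.
full <- glm(above ~ fuel + rooms + noise, data = toy, family = binomial)

# drop1() evaluates each term as a whole: fuel is a single row with 2 df.
d <- drop1(full, test = "Chisq")
d

# step() performs AIC-based stepwise selection over whole terms.
sel <- step(full, trace = 0)
formula(sel)
```

Which terms survive depends on the data and on the selection criterion, so a result that keeps every variable is possible and is not by itself a sign of an error.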
r logistic-regression forecasting feature-selection variable-selection
edited Nov 22 '18 at 9:39
Vivek Kumar
asked Nov 14 '18 at 21:49
RajRaj