Find other rows most closely associated with a column
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}
I'm sorry, I'm not sure how to phrase this question.
I have a table with two columns: tag
and activity_id
used to track a many-to-many relationship between activities and tags. (In reality, tag is actually a numeric tag_id
, but I'm simplifying this for the purposed of the question - I can figure out the JOIN later.)
Sample data:
tag, activity_id
"Ideation",52698
"Adult",52698
"Trans man",52698
"USA - Northwest",52698
"Transfeminine",52699
"Ideation",52699
"Adult",52702
"Trans man",52702
"USA - Northwest",52702
"Ideation",52702
"PTSD",52702
"Religious abuse / trauma",52702
"Adult",52709
"Ideation",52709
What I want to find is which tags appear most with others. For example, in the above Ideation and Adult appear with the same activity_id multiple times. Ideation and Trans Man also show up together. What I'd like is a query to show which tags are clustered together based on activity_id, ideally with some sort of rank based on how many times they appear together.
Thanks for any assistance - Please comment if I haven't explained this clearly enough!
sql database postgresql
add a comment |
I'm sorry, I'm not sure how to phrase this question.
I have a table with two columns: tag
and activity_id
used to track a many-to-many relationship between activities and tags. (In reality, tag is actually a numeric tag_id
, but I'm simplifying this for the purposed of the question - I can figure out the JOIN later.)
Sample data:
tag, activity_id
"Ideation",52698
"Adult",52698
"Trans man",52698
"USA - Northwest",52698
"Transfeminine",52699
"Ideation",52699
"Adult",52702
"Trans man",52702
"USA - Northwest",52702
"Ideation",52702
"PTSD",52702
"Religious abuse / trauma",52702
"Adult",52709
"Ideation",52709
What I want to find is which tags appear most with others. For example, in the above Ideation and Adult appear with the same activity_id multiple times. Ideation and Trans Man also show up together. What I'd like is a query to show which tags are clustered together based on activity_id, ideally with some sort of rank based on how many times they appear together.
Thanks for any assistance - Please comment if I haven't explained this clearly enough!
sql database postgresql
add a comment |
I'm sorry, I'm not sure how to phrase this question.
I have a table with two columns: tag
and activity_id
used to track a many-to-many relationship between activities and tags. (In reality, tag is actually a numeric tag_id
, but I'm simplifying this for the purposed of the question - I can figure out the JOIN later.)
Sample data:
tag, activity_id
"Ideation",52698
"Adult",52698
"Trans man",52698
"USA - Northwest",52698
"Transfeminine",52699
"Ideation",52699
"Adult",52702
"Trans man",52702
"USA - Northwest",52702
"Ideation",52702
"PTSD",52702
"Religious abuse / trauma",52702
"Adult",52709
"Ideation",52709
What I want to find is which tags appear most with others. For example, in the above Ideation and Adult appear with the same activity_id multiple times. Ideation and Trans Man also show up together. What I'd like is a query to show which tags are clustered together based on activity_id, ideally with some sort of rank based on how many times they appear together.
Thanks for any assistance - Please comment if I haven't explained this clearly enough!
sql database postgresql
I'm sorry, I'm not sure how to phrase this question.
I have a table with two columns: tag
and activity_id
used to track a many-to-many relationship between activities and tags. (In reality, tag is actually a numeric tag_id
, but I'm simplifying this for the purposed of the question - I can figure out the JOIN later.)
Sample data:
tag, activity_id
"Ideation",52698
"Adult",52698
"Trans man",52698
"USA - Northwest",52698
"Transfeminine",52699
"Ideation",52699
"Adult",52702
"Trans man",52702
"USA - Northwest",52702
"Ideation",52702
"PTSD",52702
"Religious abuse / trauma",52702
"Adult",52709
"Ideation",52709
What I want to find is which tags appear most with others. For example, in the above Ideation and Adult appear with the same activity_id multiple times. Ideation and Trans Man also show up together. What I'd like is a query to show which tags are clustered together based on activity_id, ideally with some sort of rank based on how many times they appear together.
Thanks for any assistance - Please comment if I haven't explained this clearly enough!
sql database postgresql
sql database postgresql
asked Nov 16 '18 at 17:38
Chris GaraffaChris Garaffa
897
897
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
I think you want a self-join and aggregation:
select s1.tag, s2.tag, count(*)
from sample s1 join
sample s2
on s1.activity_id = s2.activity_id and s1.tag < s2.tag
group by s1.tag, s2.tag
order by count(*) desc;
This is amazing, thank you. It's exactly what I needed. Here's the full query I used, with the JOIN I left out above to simplify the example:SELECT t1.tag, t2.tag, s1.tag_id, s2.tag_id, count(*) FROM activities_tags s1 JOIN activities_tags s2 ON s1.activity_id = s2.activity_id AND s1.tag_id < s2.tag_id JOIN tags t1 ON s1.tag_id = t1.id JOIN tags t2 ON s2.tag_id = t2.id GROUP BY t1.tag, t2.tag, s1.tag_id, s2.tag_id ORDER BY count(*) desc;
– Chris Garaffa
Nov 16 '18 at 17:57
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53342826%2ffind-other-rows-most-closely-associated-with-a-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
I think you want a self-join and aggregation:
select s1.tag, s2.tag, count(*)
from sample s1 join
sample s2
on s1.activity_id = s2.activity_id and s1.tag < s2.tag
group by s1.tag, s2.tag
order by count(*) desc;
This is amazing, thank you. It's exactly what I needed. Here's the full query I used, with the JOIN I left out above to simplify the example:SELECT t1.tag, t2.tag, s1.tag_id, s2.tag_id, count(*) FROM activities_tags s1 JOIN activities_tags s2 ON s1.activity_id = s2.activity_id AND s1.tag_id < s2.tag_id JOIN tags t1 ON s1.tag_id = t1.id JOIN tags t2 ON s2.tag_id = t2.id GROUP BY t1.tag, t2.tag, s1.tag_id, s2.tag_id ORDER BY count(*) desc;
– Chris Garaffa
Nov 16 '18 at 17:57
add a comment |
I think you want a self-join and aggregation:
select s1.tag, s2.tag, count(*)
from sample s1 join
sample s2
on s1.activity_id = s2.activity_id and s1.tag < s2.tag
group by s1.tag, s2.tag
order by count(*) desc;
This is amazing, thank you. It's exactly what I needed. Here's the full query I used, with the JOIN I left out above to simplify the example:SELECT t1.tag, t2.tag, s1.tag_id, s2.tag_id, count(*) FROM activities_tags s1 JOIN activities_tags s2 ON s1.activity_id = s2.activity_id AND s1.tag_id < s2.tag_id JOIN tags t1 ON s1.tag_id = t1.id JOIN tags t2 ON s2.tag_id = t2.id GROUP BY t1.tag, t2.tag, s1.tag_id, s2.tag_id ORDER BY count(*) desc;
– Chris Garaffa
Nov 16 '18 at 17:57
add a comment |
I think you want a self-join and aggregation:
select s1.tag, s2.tag, count(*)
from sample s1 join
sample s2
on s1.activity_id = s2.activity_id and s1.tag < s2.tag
group by s1.tag, s2.tag
order by count(*) desc;
I think you want a self-join and aggregation:
select s1.tag, s2.tag, count(*)
from sample s1 join
sample s2
on s1.activity_id = s2.activity_id and s1.tag < s2.tag
group by s1.tag, s2.tag
order by count(*) desc;
answered Nov 16 '18 at 17:40
Gordon LinoffGordon Linoff
797k37318423
797k37318423
This is amazing, thank you. It's exactly what I needed. Here's the full query I used, with the JOIN I left out above to simplify the example:SELECT t1.tag, t2.tag, s1.tag_id, s2.tag_id, count(*) FROM activities_tags s1 JOIN activities_tags s2 ON s1.activity_id = s2.activity_id AND s1.tag_id < s2.tag_id JOIN tags t1 ON s1.tag_id = t1.id JOIN tags t2 ON s2.tag_id = t2.id GROUP BY t1.tag, t2.tag, s1.tag_id, s2.tag_id ORDER BY count(*) desc;
– Chris Garaffa
Nov 16 '18 at 17:57
add a comment |
This is amazing, thank you. It's exactly what I needed. Here's the full query I used, with the JOIN I left out above to simplify the example:SELECT t1.tag, t2.tag, s1.tag_id, s2.tag_id, count(*) FROM activities_tags s1 JOIN activities_tags s2 ON s1.activity_id = s2.activity_id AND s1.tag_id < s2.tag_id JOIN tags t1 ON s1.tag_id = t1.id JOIN tags t2 ON s2.tag_id = t2.id GROUP BY t1.tag, t2.tag, s1.tag_id, s2.tag_id ORDER BY count(*) desc;
– Chris Garaffa
Nov 16 '18 at 17:57
This is amazing, thank you. It's exactly what I needed. Here's the full query I used, with the JOIN I left out above to simplify the example:
SELECT t1.tag, t2.tag, s1.tag_id, s2.tag_id, count(*) FROM activities_tags s1 JOIN activities_tags s2 ON s1.activity_id = s2.activity_id AND s1.tag_id < s2.tag_id JOIN tags t1 ON s1.tag_id = t1.id JOIN tags t2 ON s2.tag_id = t2.id GROUP BY t1.tag, t2.tag, s1.tag_id, s2.tag_id ORDER BY count(*) desc;
– Chris Garaffa
Nov 16 '18 at 17:57
This is amazing, thank you. It's exactly what I needed. Here's the full query I used, with the JOIN I left out above to simplify the example:
SELECT t1.tag, t2.tag, s1.tag_id, s2.tag_id, count(*) FROM activities_tags s1 JOIN activities_tags s2 ON s1.activity_id = s2.activity_id AND s1.tag_id < s2.tag_id JOIN tags t1 ON s1.tag_id = t1.id JOIN tags t2 ON s2.tag_id = t2.id GROUP BY t1.tag, t2.tag, s1.tag_id, s2.tag_id ORDER BY count(*) desc;
– Chris Garaffa
Nov 16 '18 at 17:57
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53342826%2ffind-other-rows-most-closely-associated-with-a-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown