Efficient way to find top K products given two lists
Given two lists of equal length N
, I want to find the K
largest products that can be made by multiplying an element from each list. For example, if
> A = [7, 9, 4, 1, 6]
> B = [8, 1, 3, 10, 7]
> K = 3
the result is [90, 72, 70]
or [9*10, 9*8, 7*10]
, found by
> sorted([x*y for x in A for y in B], reverse=True)[:K]
[90, 72, 70]
Is there a more efficient algorithm that doesn't involve multiplying out all N^2
pairs?
python algorithm big-o
add a comment |
Given two lists of equal length N
, I want to find the K
largest products that can be made by multiplying an element from each list. For example, if
> A = [7, 9, 4, 1, 6]
> B = [8, 1, 3, 10, 7]
> K = 3
the result is [90, 72, 70]
or [9*10, 9*8, 7*10]
, found by
> sorted([x*y for x in A for y in B], reverse=True)[:K]
[90, 72, 70]
Is there a more efficient algorithm that doesn't involve multiplying out all N^2
pairs?
python algorithm big-o
You do not need to multiply every elements of the lists. If you need K values, you can just retrieve first K max values of each lists and multiply each pair. But I think there is an even more optimized solution.
– LostReality
Nov 15 '18 at 10:49
add a comment |
Given two lists of equal length N
, I want to find the K
largest products that can be made by multiplying an element from each list. For example, if
> A = [7, 9, 4, 1, 6]
> B = [8, 1, 3, 10, 7]
> K = 3
the result is [90, 72, 70]
or [9*10, 9*8, 7*10]
, found by
> sorted([x*y for x in A for y in B], reverse=True)[:K]
[90, 72, 70]
Is there a more efficient algorithm that doesn't involve multiplying out all N^2
pairs?
python algorithm big-o
Given two lists of equal length N
, I want to find the K
largest products that can be made by multiplying an element from each list. For example, if
> A = [7, 9, 4, 1, 6]
> B = [8, 1, 3, 10, 7]
> K = 3
the result is [90, 72, 70]
or [9*10, 9*8, 7*10]
, found by
> sorted([x*y for x in A for y in B], reverse=True)[:K]
[90, 72, 70]
Is there a more efficient algorithm that doesn't involve multiplying out all N^2
pairs?
python algorithm big-o
python algorithm big-o
edited Nov 15 '18 at 15:09
Peter B
13.3k52045
13.3k52045
asked Nov 15 '18 at 10:30
FlashFlash
8,40385382
8,40385382
You do not need to multiply every elements of the lists. If you need K values, you can just retrieve first K max values of each lists and multiply each pair. But I think there is an even more optimized solution.
– LostReality
Nov 15 '18 at 10:49
add a comment |
You do not need to multiply every elements of the lists. If you need K values, you can just retrieve first K max values of each lists and multiply each pair. But I think there is an even more optimized solution.
– LostReality
Nov 15 '18 at 10:49
You do not need to multiply every elements of the lists. If you need K values, you can just retrieve first K max values of each lists and multiply each pair. But I think there is an even more optimized solution.
– LostReality
Nov 15 '18 at 10:49
You do not need to multiply every elements of the lists. If you need K values, you can just retrieve first K max values of each lists and multiply each pair. But I think there is an even more optimized solution.
– LostReality
Nov 15 '18 at 10:49
add a comment |
3 Answers
3
active
oldest
votes
As already noted, the first step is to sort both lists A
and B
in descending order (or just the K largest of both lists). Then, all the max K products will sit in a roughly triangular area in the top-left corner, the max product being A[0]*B[0]
. In other words, if A[i]*B[j]
is in the top K, then so must be both A[i-1]*B[j]
and A[i]*B[j-1]
(assuming i, j > 0
).
Thus, you can start in the top-left corner and then use a Heap to expand both the "lower" and the "right" neighbor of the current element and put those onto the heap, too, until you have all the K elements you need. Or start with all the K largest elements of A paired with the largest from B already on the heap and only expand in one direction.
Example in Python, using the heapq
module, but the same will work in almost any other language. Note that we are adding negative products to the heap as the heap will be sorted smallest-first.
def top_k_prod(A, B, k):
A = heapq.nlargest(k, A)
B = heapq.nlargest(k, B)
result =
heap = [(-A[i] * B[0], i, 0) for i in range(len(A))]
while heap and len(result) < k:
p, a, b = heapq.heappop(heap)
result.append(-p)
if b < len(B)-1:
heapq.heappush(heap, (-A[a] * B[b+1], a, b+1))
return result
Example:
import random
A = [random.randint(0, 100) for _ in range(100)]
B = [random.randint(0, 100) for _ in range(100)]
K = 20
result = top_k_prod(A, B, K)
test = sorted([x*y for x in A for y in B], reverse=True)[:K]
print(result)
# [9900, 9702, 9603, 9600, 9504, 9408, 9405, 9405, 9400, 9400, 9312, 9306, 9300, 9216, 9212, 9212, 9207, 9200, 9120, 9120]
print(result == test)
# True
The complexity should be about O(NlogN + KlogK)
for sorting A
and B
and then about K
iterations with heap-operations in the loop. Each cell in the triangular "target" region will only be expanded once from its left neighbor, and cells added to the heap but not used are also limited to K (one in each "row"), giving a maximum of 2*K elements inspected.
This can be improved toO(N + KlogK + K)
by usingnumpy.partition(A,n-k)[-k:]
to find the K largest values in O(N), and only then sorting them.
– wowserx
Nov 15 '18 at 15:22
Can also useheapq.nlargest(k, A)
to find the largest K values in order inO(KlogK)
.
– wowserx
Nov 15 '18 at 15:26
@wowserx Maybe, but originally the question was not even tagged as Python but just a generic algorithm question, so I wanted to keep it generic. You can use a heap in (almost) any language.
– tobias_k
Nov 15 '18 at 15:36
Right, and you can use a heap in exactly the same way: O(N) heapify, then K heappops for O(N + KlogK + K). I'm saying the algorithmic performance is better if you only sort the K largest elements, instead of callingA = sorted(A, reverse=True)
right away.
– wowserx
Nov 15 '18 at 16:13
add a comment |
Practical solution:
Find largest K
elements from list A and K
largest elements from list B by using partial_sort (this is a well-known modification of quick sort, and I am sure python has the same in its library). Largest products formed by these new lists are also the largest products of the original lists. Then use max-heap (priority queue) to find K
largest products from new lists.
add a comment |
If we would find out K max values from both the lists, we would have the max K
products from both the lists.
I would suggest two approaches to find out K max values:
- If
K <<< N
(K
in 10s andN
in millions )
Here you have couple of options.
- You can use selection algorithm K times for both the lists. That would take
O(N*K)
- K iterations of either Selection Sort or Bubble Sort. You would have
K
max values at either at the beginning or at the end of the array depending on the type of implementation. Even that would beO(N*K)
- You can use selection algorithm K times for both the lists. That would take
Note that because
K <<< N
you can say thatO(N*K)
is almostO(N)
K
can be as same asN
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
O(N*lgN)
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
"because K <<< N you can say that O(N * K) is almost O(N)" Well, by that reasoning you could also say that sorting the entire list in O(nlogn) is almost O(n). And I agree, for mot practical purposes, it probably is. But then, why not just sort the entire list? In fact, with K=10 and N=1000000, nlogn is still smaller than n*k.
– tobias_k
Nov 15 '18 at 13:31
Also, even if you have the max K values from both lists, you still have to find the max K of those K² pairs/products of numbers. This part is not really addressed.
– tobias_k
Nov 15 '18 at 13:35
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53317379%2fefficient-way-to-find-top-k-products-given-two-lists%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
As already noted, the first step is to sort both lists A
and B
in descending order (or just the K largest of both lists). Then, all the max K products will sit in a roughly triangular area in the top-left corner, the max product being A[0]*B[0]
. In other words, if A[i]*B[j]
is in the top K, then so must be both A[i-1]*B[j]
and A[i]*B[j-1]
(assuming i, j > 0
).
Thus, you can start in the top-left corner and then use a Heap to expand both the "lower" and the "right" neighbor of the current element and put those onto the heap, too, until you have all the K elements you need. Or start with all the K largest elements of A paired with the largest from B already on the heap and only expand in one direction.
Example in Python, using the heapq
module, but the same will work in almost any other language. Note that we are adding negative products to the heap as the heap will be sorted smallest-first.
def top_k_prod(A, B, k):
A = heapq.nlargest(k, A)
B = heapq.nlargest(k, B)
result =
heap = [(-A[i] * B[0], i, 0) for i in range(len(A))]
while heap and len(result) < k:
p, a, b = heapq.heappop(heap)
result.append(-p)
if b < len(B)-1:
heapq.heappush(heap, (-A[a] * B[b+1], a, b+1))
return result
Example:
import random
A = [random.randint(0, 100) for _ in range(100)]
B = [random.randint(0, 100) for _ in range(100)]
K = 20
result = top_k_prod(A, B, K)
test = sorted([x*y for x in A for y in B], reverse=True)[:K]
print(result)
# [9900, 9702, 9603, 9600, 9504, 9408, 9405, 9405, 9400, 9400, 9312, 9306, 9300, 9216, 9212, 9212, 9207, 9200, 9120, 9120]
print(result == test)
# True
The complexity should be about O(NlogN + KlogK)
for sorting A
and B
and then about K
iterations with heap-operations in the loop. Each cell in the triangular "target" region will only be expanded once from its left neighbor, and cells added to the heap but not used are also limited to K (one in each "row"), giving a maximum of 2*K elements inspected.
This can be improved toO(N + KlogK + K)
by usingnumpy.partition(A,n-k)[-k:]
to find the K largest values in O(N), and only then sorting them.
– wowserx
Nov 15 '18 at 15:22
Can also useheapq.nlargest(k, A)
to find the largest K values in order inO(KlogK)
.
– wowserx
Nov 15 '18 at 15:26
@wowserx Maybe, but originally the question was not even tagged as Python but just a generic algorithm question, so I wanted to keep it generic. You can use a heap in (almost) any language.
– tobias_k
Nov 15 '18 at 15:36
Right, and you can use a heap in exactly the same way: O(N) heapify, then K heappops for O(N + KlogK + K). I'm saying the algorithmic performance is better if you only sort the K largest elements, instead of callingA = sorted(A, reverse=True)
right away.
– wowserx
Nov 15 '18 at 16:13
add a comment |
As already noted, the first step is to sort both lists A
and B
in descending order (or just the K largest of both lists). Then, all the max K products will sit in a roughly triangular area in the top-left corner, the max product being A[0]*B[0]
. In other words, if A[i]*B[j]
is in the top K, then so must be both A[i-1]*B[j]
and A[i]*B[j-1]
(assuming i, j > 0
).
Thus, you can start in the top-left corner and then use a Heap to expand both the "lower" and the "right" neighbor of the current element and put those onto the heap, too, until you have all the K elements you need. Or start with all the K largest elements of A paired with the largest from B already on the heap and only expand in one direction.
Example in Python, using the heapq
module, but the same will work in almost any other language. Note that we are adding negative products to the heap as the heap will be sorted smallest-first.
def top_k_prod(A, B, k):
A = heapq.nlargest(k, A)
B = heapq.nlargest(k, B)
result =
heap = [(-A[i] * B[0], i, 0) for i in range(len(A))]
while heap and len(result) < k:
p, a, b = heapq.heappop(heap)
result.append(-p)
if b < len(B)-1:
heapq.heappush(heap, (-A[a] * B[b+1], a, b+1))
return result
Example:
import random
A = [random.randint(0, 100) for _ in range(100)]
B = [random.randint(0, 100) for _ in range(100)]
K = 20
result = top_k_prod(A, B, K)
test = sorted([x*y for x in A for y in B], reverse=True)[:K]
print(result)
# [9900, 9702, 9603, 9600, 9504, 9408, 9405, 9405, 9400, 9400, 9312, 9306, 9300, 9216, 9212, 9212, 9207, 9200, 9120, 9120]
print(result == test)
# True
The complexity should be about O(NlogN + KlogK)
for sorting A
and B
and then about K
iterations with heap-operations in the loop. Each cell in the triangular "target" region will only be expanded once from its left neighbor, and cells added to the heap but not used are also limited to K (one in each "row"), giving a maximum of 2*K elements inspected.
This can be improved toO(N + KlogK + K)
by usingnumpy.partition(A,n-k)[-k:]
to find the K largest values in O(N), and only then sorting them.
– wowserx
Nov 15 '18 at 15:22
Can also useheapq.nlargest(k, A)
to find the largest K values in order inO(KlogK)
.
– wowserx
Nov 15 '18 at 15:26
@wowserx Maybe, but originally the question was not even tagged as Python but just a generic algorithm question, so I wanted to keep it generic. You can use a heap in (almost) any language.
– tobias_k
Nov 15 '18 at 15:36
Right, and you can use a heap in exactly the same way: O(N) heapify, then K heappops for O(N + KlogK + K). I'm saying the algorithmic performance is better if you only sort the K largest elements, instead of callingA = sorted(A, reverse=True)
right away.
– wowserx
Nov 15 '18 at 16:13
add a comment |
As already noted, the first step is to sort both lists A
and B
in descending order (or just the K largest of both lists). Then, all the max K products will sit in a roughly triangular area in the top-left corner, the max product being A[0]*B[0]
. In other words, if A[i]*B[j]
is in the top K, then so must be both A[i-1]*B[j]
and A[i]*B[j-1]
(assuming i, j > 0
).
Thus, you can start in the top-left corner and then use a Heap to expand both the "lower" and the "right" neighbor of the current element and put those onto the heap, too, until you have all the K elements you need. Or start with all the K largest elements of A paired with the largest from B already on the heap and only expand in one direction.
Example in Python, using the heapq
module, but the same will work in almost any other language. Note that we are adding negative products to the heap as the heap will be sorted smallest-first.
def top_k_prod(A, B, k):
A = heapq.nlargest(k, A)
B = heapq.nlargest(k, B)
result =
heap = [(-A[i] * B[0], i, 0) for i in range(len(A))]
while heap and len(result) < k:
p, a, b = heapq.heappop(heap)
result.append(-p)
if b < len(B)-1:
heapq.heappush(heap, (-A[a] * B[b+1], a, b+1))
return result
Example:
import random
A = [random.randint(0, 100) for _ in range(100)]
B = [random.randint(0, 100) for _ in range(100)]
K = 20
result = top_k_prod(A, B, K)
test = sorted([x*y for x in A for y in B], reverse=True)[:K]
print(result)
# [9900, 9702, 9603, 9600, 9504, 9408, 9405, 9405, 9400, 9400, 9312, 9306, 9300, 9216, 9212, 9212, 9207, 9200, 9120, 9120]
print(result == test)
# True
The complexity should be about O(NlogN + KlogK)
for sorting A
and B
and then about K
iterations with heap-operations in the loop. Each cell in the triangular "target" region will only be expanded once from its left neighbor, and cells added to the heap but not used are also limited to K (one in each "row"), giving a maximum of 2*K elements inspected.
As already noted, the first step is to sort both lists A
and B
in descending order (or just the K largest of both lists). Then, all the max K products will sit in a roughly triangular area in the top-left corner, the max product being A[0]*B[0]
. In other words, if A[i]*B[j]
is in the top K, then so must be both A[i-1]*B[j]
and A[i]*B[j-1]
(assuming i, j > 0
).
Thus, you can start in the top-left corner and then use a Heap to expand both the "lower" and the "right" neighbor of the current element and put those onto the heap, too, until you have all the K elements you need. Or start with all the K largest elements of A paired with the largest from B already on the heap and only expand in one direction.
Example in Python, using the heapq
module, but the same will work in almost any other language. Note that we are adding negative products to the heap as the heap will be sorted smallest-first.
def top_k_prod(A, B, k):
A = heapq.nlargest(k, A)
B = heapq.nlargest(k, B)
result =
heap = [(-A[i] * B[0], i, 0) for i in range(len(A))]
while heap and len(result) < k:
p, a, b = heapq.heappop(heap)
result.append(-p)
if b < len(B)-1:
heapq.heappush(heap, (-A[a] * B[b+1], a, b+1))
return result
Example:
import random
A = [random.randint(0, 100) for _ in range(100)]
B = [random.randint(0, 100) for _ in range(100)]
K = 20
result = top_k_prod(A, B, K)
test = sorted([x*y for x in A for y in B], reverse=True)[:K]
print(result)
# [9900, 9702, 9603, 9600, 9504, 9408, 9405, 9405, 9400, 9400, 9312, 9306, 9300, 9216, 9212, 9212, 9207, 9200, 9120, 9120]
print(result == test)
# True
The complexity should be about O(NlogN + KlogK)
for sorting A
and B
and then about K
iterations with heap-operations in the loop. Each cell in the triangular "target" region will only be expanded once from its left neighbor, and cells added to the heap but not used are also limited to K (one in each "row"), giving a maximum of 2*K elements inspected.
edited Nov 15 '18 at 17:25
answered Nov 15 '18 at 11:18
tobias_ktobias_k
58.9k969108
58.9k969108
This can be improved toO(N + KlogK + K)
by usingnumpy.partition(A,n-k)[-k:]
to find the K largest values in O(N), and only then sorting them.
– wowserx
Nov 15 '18 at 15:22
Can also useheapq.nlargest(k, A)
to find the largest K values in order inO(KlogK)
.
– wowserx
Nov 15 '18 at 15:26
@wowserx Maybe, but originally the question was not even tagged as Python but just a generic algorithm question, so I wanted to keep it generic. You can use a heap in (almost) any language.
– tobias_k
Nov 15 '18 at 15:36
Right, and you can use a heap in exactly the same way: O(N) heapify, then K heappops for O(N + KlogK + K). I'm saying the algorithmic performance is better if you only sort the K largest elements, instead of callingA = sorted(A, reverse=True)
right away.
– wowserx
Nov 15 '18 at 16:13
add a comment |
This can be improved toO(N + KlogK + K)
by usingnumpy.partition(A,n-k)[-k:]
to find the K largest values in O(N), and only then sorting them.
– wowserx
Nov 15 '18 at 15:22
Can also useheapq.nlargest(k, A)
to find the largest K values in order inO(KlogK)
.
– wowserx
Nov 15 '18 at 15:26
@wowserx Maybe, but originally the question was not even tagged as Python but just a generic algorithm question, so I wanted to keep it generic. You can use a heap in (almost) any language.
– tobias_k
Nov 15 '18 at 15:36
Right, and you can use a heap in exactly the same way: O(N) heapify, then K heappops for O(N + KlogK + K). I'm saying the algorithmic performance is better if you only sort the K largest elements, instead of callingA = sorted(A, reverse=True)
right away.
– wowserx
Nov 15 '18 at 16:13
This can be improved to
O(N + KlogK + K)
by using numpy.partition(A,n-k)[-k:]
to find the K largest values in O(N), and only then sorting them.– wowserx
Nov 15 '18 at 15:22
This can be improved to
O(N + KlogK + K)
by using numpy.partition(A,n-k)[-k:]
to find the K largest values in O(N), and only then sorting them.– wowserx
Nov 15 '18 at 15:22
Can also use
heapq.nlargest(k, A)
to find the largest K values in order in O(KlogK)
.– wowserx
Nov 15 '18 at 15:26
Can also use
heapq.nlargest(k, A)
to find the largest K values in order in O(KlogK)
.– wowserx
Nov 15 '18 at 15:26
@wowserx Maybe, but originally the question was not even tagged as Python but just a generic algorithm question, so I wanted to keep it generic. You can use a heap in (almost) any language.
– tobias_k
Nov 15 '18 at 15:36
@wowserx Maybe, but originally the question was not even tagged as Python but just a generic algorithm question, so I wanted to keep it generic. You can use a heap in (almost) any language.
– tobias_k
Nov 15 '18 at 15:36
Right, and you can use a heap in exactly the same way: O(N) heapify, then K heappops for O(N + KlogK + K). I'm saying the algorithmic performance is better if you only sort the K largest elements, instead of calling
A = sorted(A, reverse=True)
right away.– wowserx
Nov 15 '18 at 16:13
Right, and you can use a heap in exactly the same way: O(N) heapify, then K heappops for O(N + KlogK + K). I'm saying the algorithmic performance is better if you only sort the K largest elements, instead of calling
A = sorted(A, reverse=True)
right away.– wowserx
Nov 15 '18 at 16:13
add a comment |
Practical solution:
Find largest K
elements from list A and K
largest elements from list B by using partial_sort (this is a well-known modification of quick sort, and I am sure python has the same in its library). Largest products formed by these new lists are also the largest products of the original lists. Then use max-heap (priority queue) to find K
largest products from new lists.
add a comment |
Practical solution:
Find largest K
elements from list A and K
largest elements from list B by using partial_sort (this is a well-known modification of quick sort, and I am sure python has the same in its library). Largest products formed by these new lists are also the largest products of the original lists. Then use max-heap (priority queue) to find K
largest products from new lists.
add a comment |
Practical solution:
Find largest K
elements from list A and K
largest elements from list B by using partial_sort (this is a well-known modification of quick sort, and I am sure python has the same in its library). Largest products formed by these new lists are also the largest products of the original lists. Then use max-heap (priority queue) to find K
largest products from new lists.
Practical solution:
Find largest K
elements from list A and K
largest elements from list B by using partial_sort (this is a well-known modification of quick sort, and I am sure python has the same in its library). Largest products formed by these new lists are also the largest products of the original lists. Then use max-heap (priority queue) to find K
largest products from new lists.
answered Nov 15 '18 at 10:47
gudokgudok
2,78121223
2,78121223
add a comment |
add a comment |
If we would find out K max values from both the lists, we would have the max K
products from both the lists.
I would suggest two approaches to find out K max values:
- If
K <<< N
(K
in 10s andN
in millions )
Here you have couple of options.
- You can use selection algorithm K times for both the lists. That would take
O(N*K)
- K iterations of either Selection Sort or Bubble Sort. You would have
K
max values at either at the beginning or at the end of the array depending on the type of implementation. Even that would beO(N*K)
- You can use selection algorithm K times for both the lists. That would take
Note that because
K <<< N
you can say thatO(N*K)
is almostO(N)
K
can be as same asN
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
O(N*lgN)
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
"because K <<< N you can say that O(N * K) is almost O(N)" Well, by that reasoning you could also say that sorting the entire list in O(nlogn) is almost O(n). And I agree, for mot practical purposes, it probably is. But then, why not just sort the entire list? In fact, with K=10 and N=1000000, nlogn is still smaller than n*k.
– tobias_k
Nov 15 '18 at 13:31
Also, even if you have the max K values from both lists, you still have to find the max K of those K² pairs/products of numbers. This part is not really addressed.
– tobias_k
Nov 15 '18 at 13:35
add a comment |
If we would find out K max values from both the lists, we would have the max K
products from both the lists.
I would suggest two approaches to find out K max values:
- If
K <<< N
(K
in 10s andN
in millions )
Here you have couple of options.
- You can use selection algorithm K times for both the lists. That would take
O(N*K)
- K iterations of either Selection Sort or Bubble Sort. You would have
K
max values at either at the beginning or at the end of the array depending on the type of implementation. Even that would beO(N*K)
- You can use selection algorithm K times for both the lists. That would take
Note that because
K <<< N
you can say thatO(N*K)
is almostO(N)
K
can be as same asN
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
O(N*lgN)
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
"because K <<< N you can say that O(N * K) is almost O(N)" Well, by that reasoning you could also say that sorting the entire list in O(nlogn) is almost O(n). And I agree, for mot practical purposes, it probably is. But then, why not just sort the entire list? In fact, with K=10 and N=1000000, nlogn is still smaller than n*k.
– tobias_k
Nov 15 '18 at 13:31
Also, even if you have the max K values from both lists, you still have to find the max K of those K² pairs/products of numbers. This part is not really addressed.
– tobias_k
Nov 15 '18 at 13:35
add a comment |
If we would find out K max values from both the lists, we would have the max K
products from both the lists.
I would suggest two approaches to find out K max values:
- If
K <<< N
(K
in 10s andN
in millions )
Here you have couple of options.
- You can use selection algorithm K times for both the lists. That would take
O(N*K)
- K iterations of either Selection Sort or Bubble Sort. You would have
K
max values at either at the beginning or at the end of the array depending on the type of implementation. Even that would beO(N*K)
- You can use selection algorithm K times for both the lists. That would take
Note that because
K <<< N
you can say thatO(N*K)
is almostO(N)
K
can be as same asN
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
O(N*lgN)
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
If we would find out K max values from both the lists, we would have the max K
products from both the lists.
I would suggest two approaches to find out K max values:
- If
K <<< N
(K
in 10s andN
in millions )
Here you have couple of options.
- You can use selection algorithm K times for both the lists. That would take
O(N*K)
- K iterations of either Selection Sort or Bubble Sort. You would have
K
max values at either at the beginning or at the end of the array depending on the type of implementation. Even that would beO(N*K)
- You can use selection algorithm K times for both the lists. That would take
Note that because
K <<< N
you can say thatO(N*K)
is almostO(N)
K
can be as same asN
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
O(N*lgN)
- In this case, The best bet would be to just sort both the lists using Merge Sort or Quick Sort. That would be
answered Nov 15 '18 at 11:55
Anand UndaviaAnand Undavia
1,7361923
1,7361923
"because K <<< N you can say that O(N * K) is almost O(N)" Well, by that reasoning you could also say that sorting the entire list in O(nlogn) is almost O(n). And I agree, for mot practical purposes, it probably is. But then, why not just sort the entire list? In fact, with K=10 and N=1000000, nlogn is still smaller than n*k.
– tobias_k
Nov 15 '18 at 13:31
Also, even if you have the max K values from both lists, you still have to find the max K of those K² pairs/products of numbers. This part is not really addressed.
– tobias_k
Nov 15 '18 at 13:35
add a comment |
"because K <<< N you can say that O(N * K) is almost O(N)" Well, by that reasoning you could also say that sorting the entire list in O(nlogn) is almost O(n). And I agree, for mot practical purposes, it probably is. But then, why not just sort the entire list? In fact, with K=10 and N=1000000, nlogn is still smaller than n*k.
– tobias_k
Nov 15 '18 at 13:31
Also, even if you have the max K values from both lists, you still have to find the max K of those K² pairs/products of numbers. This part is not really addressed.
– tobias_k
Nov 15 '18 at 13:35
"because K <<< N you can say that O(N * K) is almost O(N)" Well, by that reasoning you could also say that sorting the entire list in O(nlogn) is almost O(n). And I agree, for mot practical purposes, it probably is. But then, why not just sort the entire list? In fact, with K=10 and N=1000000, nlogn is still smaller than n*k.
– tobias_k
Nov 15 '18 at 13:31
"because K <<< N you can say that O(N * K) is almost O(N)" Well, by that reasoning you could also say that sorting the entire list in O(nlogn) is almost O(n). And I agree, for mot practical purposes, it probably is. But then, why not just sort the entire list? In fact, with K=10 and N=1000000, nlogn is still smaller than n*k.
– tobias_k
Nov 15 '18 at 13:31
Also, even if you have the max K values from both lists, you still have to find the max K of those K² pairs/products of numbers. This part is not really addressed.
– tobias_k
Nov 15 '18 at 13:35
Also, even if you have the max K values from both lists, you still have to find the max K of those K² pairs/products of numbers. This part is not really addressed.
– tobias_k
Nov 15 '18 at 13:35
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53317379%2fefficient-way-to-find-top-k-products-given-two-lists%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
You do not need to multiply every elements of the lists. If you need K values, you can just retrieve first K max values of each lists and multiply each pair. But I think there is an even more optimized solution.
– LostReality
Nov 15 '18 at 10:49