Slow query performance: MongoDB with $lookup and $and/$or












2














I'm trying to find a way to create an engine that translates GraphQL query filters to MongoDB aggregations while keeping the performance. Our application has the requisite of limiting the results from collection A by applying filters to collection B, C and even D sometimes.



For better understanding, here's a sample about how a filter is translated to MongoDB.



This:



{
"filter": {
"return": null,
"AND": [{
"customer_WITH": {
"OR": [{
"code": "CUSTOMER NAME"
}, {
"commercialName_LIKE": "CUSTOMER NAME"
}, {
"corporateName_LIKE": "CUSTOMER NAME"
}]
}
}],
"OR": [{
"dispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}, {
"redispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}],
"reversal": null
}
}


Gets translated to this:



[{
"$match": {
"return": {
"$eq": null
},
"reversal": {
"$eq": null
},
"company": {
"$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
}
}
}, {
"$lookup": {
"as": "dispatcher",
"from": "shippers",
"localField": "dispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "dispatcher.company",
"from": "companies",
"localField": "dispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher",
"from": "shippers",
"localField": "redispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher.company",
"from": "companies",
"localField": "redispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "customer",
"from": "customers",
"localField": "customer",
"foreignField": "_id"
}
}, {
"$match": {
"$or": [{
"dispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}, {
"redispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}],
"$and": [{
"$or": [{
"customer.code": {
"$eq": "CUSTOMER NAME"
}
}, {
"customer.commercialName": {
"$regex": /CUSTOMERsNAME/
}
}, {
"customer.corporateName": {
"$regex": /CUSTOMERsNAME/
}
}]
}]
}
}, {
"$unwind": {
"path": "$customer",
"preserveNullAndEmptyArrays": true
}
}, {
"$group": {
"_id": "$invoiceNo",
"__rootId": {
"$first": "$_id"
},
"company": {
"$first": "$company"
},
"customer": {
"$first": "$customer._id"
},
"dispatcher": {
"$first": "$dispatcher._id"
},
"redispatcher": {
"$first": "$redispatcher._id"
},
"driverPlate": {
"$first": "$driverPlate"
},
"key": {
"$first": "$key"
},
"activities": {
"$first": "$activities"
},
"serialNo": {
"$first": "$serialNo"
},
"invoiceNo": {
"$first": "$invoiceNo"
},
"incidents": {
"$first": "$incidents"
},
"deliveries": {
"$first": "$deliveries"
},
"return": {
"$first": "$return"
}
}
}, {
"$project": {
"_id": "$__rootId",
"company": "$company",
"customer": "$customer",
"dispatcher": "$dispatcher",
"redispatcher": "$redispatcher",
"driverPlate": "$driverPlate",
"key": "$key",
"activities": "$activities",
"serialNo": "$serialNo",
"invoiceNo": "$invoiceNo",
"incidents": "$incidents",
"deliveries": "$deliveries",
"return": "$return"
}
}, {
"$sort": {
"invoiceNo": -1
}
}, {
"$limit": 51
}]


The engine is smart enough to reallocate to the first position $match properties that don't require $lookups and right after $lookups if they do, however if they are within a $and/$or condition block, then they are reallocated after the last $lookup, regardless of what properties are there.



I could scan for what is used inside the $and and deconstruct it into new reallocated $match phases, but I need to figure how to handle the $or operator: I can't apply the same desconstruction idea on it because this would invalidate the condition.



So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?



Creating more indexes won't help because they're not used for the $lookup. Moving up $match phases, as the MongoDB team would suggest is also not possible because it would break the conditions. So I'm out of ideas now.



Best regards.










share|improve this question
























  • Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.
    – Neil Lunn
    Nov 12 at 20:46












  • Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.
    – Rubens Felipe
    Nov 13 at 0:10












  • Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.
    – Neil Lunn
    Nov 13 at 0:15










  • Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.
    – Neil Lunn
    Nov 13 at 0:17
















2














I'm trying to find a way to create an engine that translates GraphQL query filters to MongoDB aggregations while keeping the performance. Our application has the requisite of limiting the results from collection A by applying filters to collection B, C and even D sometimes.



For better understanding, here's a sample about how a filter is translated to MongoDB.



This:



{
"filter": {
"return": null,
"AND": [{
"customer_WITH": {
"OR": [{
"code": "CUSTOMER NAME"
}, {
"commercialName_LIKE": "CUSTOMER NAME"
}, {
"corporateName_LIKE": "CUSTOMER NAME"
}]
}
}],
"OR": [{
"dispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}, {
"redispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}],
"reversal": null
}
}


Gets translated to this:



[{
"$match": {
"return": {
"$eq": null
},
"reversal": {
"$eq": null
},
"company": {
"$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
}
}
}, {
"$lookup": {
"as": "dispatcher",
"from": "shippers",
"localField": "dispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "dispatcher.company",
"from": "companies",
"localField": "dispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher",
"from": "shippers",
"localField": "redispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher.company",
"from": "companies",
"localField": "redispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "customer",
"from": "customers",
"localField": "customer",
"foreignField": "_id"
}
}, {
"$match": {
"$or": [{
"dispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}, {
"redispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}],
"$and": [{
"$or": [{
"customer.code": {
"$eq": "CUSTOMER NAME"
}
}, {
"customer.commercialName": {
"$regex": /CUSTOMERsNAME/
}
}, {
"customer.corporateName": {
"$regex": /CUSTOMERsNAME/
}
}]
}]
}
}, {
"$unwind": {
"path": "$customer",
"preserveNullAndEmptyArrays": true
}
}, {
"$group": {
"_id": "$invoiceNo",
"__rootId": {
"$first": "$_id"
},
"company": {
"$first": "$company"
},
"customer": {
"$first": "$customer._id"
},
"dispatcher": {
"$first": "$dispatcher._id"
},
"redispatcher": {
"$first": "$redispatcher._id"
},
"driverPlate": {
"$first": "$driverPlate"
},
"key": {
"$first": "$key"
},
"activities": {
"$first": "$activities"
},
"serialNo": {
"$first": "$serialNo"
},
"invoiceNo": {
"$first": "$invoiceNo"
},
"incidents": {
"$first": "$incidents"
},
"deliveries": {
"$first": "$deliveries"
},
"return": {
"$first": "$return"
}
}
}, {
"$project": {
"_id": "$__rootId",
"company": "$company",
"customer": "$customer",
"dispatcher": "$dispatcher",
"redispatcher": "$redispatcher",
"driverPlate": "$driverPlate",
"key": "$key",
"activities": "$activities",
"serialNo": "$serialNo",
"invoiceNo": "$invoiceNo",
"incidents": "$incidents",
"deliveries": "$deliveries",
"return": "$return"
}
}, {
"$sort": {
"invoiceNo": -1
}
}, {
"$limit": 51
}]


The engine is smart enough to reallocate to the first position $match properties that don't require $lookups and right after $lookups if they do, however if they are within a $and/$or condition block, then they are reallocated after the last $lookup, regardless of what properties are there.



I could scan for what is used inside the $and and deconstruct it into new reallocated $match phases, but I need to figure how to handle the $or operator: I can't apply the same desconstruction idea on it because this would invalidate the condition.



So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?



Creating more indexes won't help because they're not used for the $lookup. Moving up $match phases, as the MongoDB team would suggest is also not possible because it would break the conditions. So I'm out of ideas now.



Best regards.










share|improve this question
























  • Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.
    – Neil Lunn
    Nov 12 at 20:46












  • Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.
    – Rubens Felipe
    Nov 13 at 0:10












  • Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.
    – Neil Lunn
    Nov 13 at 0:15










  • Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.
    – Neil Lunn
    Nov 13 at 0:17














2












2








2







I'm trying to find a way to create an engine that translates GraphQL query filters to MongoDB aggregations while keeping the performance. Our application has the requisite of limiting the results from collection A by applying filters to collection B, C and even D sometimes.



For better understanding, here's a sample about how a filter is translated to MongoDB.



This:



{
"filter": {
"return": null,
"AND": [{
"customer_WITH": {
"OR": [{
"code": "CUSTOMER NAME"
}, {
"commercialName_LIKE": "CUSTOMER NAME"
}, {
"corporateName_LIKE": "CUSTOMER NAME"
}]
}
}],
"OR": [{
"dispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}, {
"redispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}],
"reversal": null
}
}


Gets translated to this:



[{
"$match": {
"return": {
"$eq": null
},
"reversal": {
"$eq": null
},
"company": {
"$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
}
}
}, {
"$lookup": {
"as": "dispatcher",
"from": "shippers",
"localField": "dispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "dispatcher.company",
"from": "companies",
"localField": "dispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher",
"from": "shippers",
"localField": "redispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher.company",
"from": "companies",
"localField": "redispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "customer",
"from": "customers",
"localField": "customer",
"foreignField": "_id"
}
}, {
"$match": {
"$or": [{
"dispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}, {
"redispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}],
"$and": [{
"$or": [{
"customer.code": {
"$eq": "CUSTOMER NAME"
}
}, {
"customer.commercialName": {
"$regex": /CUSTOMERsNAME/
}
}, {
"customer.corporateName": {
"$regex": /CUSTOMERsNAME/
}
}]
}]
}
}, {
"$unwind": {
"path": "$customer",
"preserveNullAndEmptyArrays": true
}
}, {
"$group": {
"_id": "$invoiceNo",
"__rootId": {
"$first": "$_id"
},
"company": {
"$first": "$company"
},
"customer": {
"$first": "$customer._id"
},
"dispatcher": {
"$first": "$dispatcher._id"
},
"redispatcher": {
"$first": "$redispatcher._id"
},
"driverPlate": {
"$first": "$driverPlate"
},
"key": {
"$first": "$key"
},
"activities": {
"$first": "$activities"
},
"serialNo": {
"$first": "$serialNo"
},
"invoiceNo": {
"$first": "$invoiceNo"
},
"incidents": {
"$first": "$incidents"
},
"deliveries": {
"$first": "$deliveries"
},
"return": {
"$first": "$return"
}
}
}, {
"$project": {
"_id": "$__rootId",
"company": "$company",
"customer": "$customer",
"dispatcher": "$dispatcher",
"redispatcher": "$redispatcher",
"driverPlate": "$driverPlate",
"key": "$key",
"activities": "$activities",
"serialNo": "$serialNo",
"invoiceNo": "$invoiceNo",
"incidents": "$incidents",
"deliveries": "$deliveries",
"return": "$return"
}
}, {
"$sort": {
"invoiceNo": -1
}
}, {
"$limit": 51
}]


The engine is smart enough to reallocate to the first position $match properties that don't require $lookups and right after $lookups if they do, however if they are within a $and/$or condition block, then they are reallocated after the last $lookup, regardless of what properties are there.



I could scan for what is used inside the $and and deconstruct it into new reallocated $match phases, but I need to figure how to handle the $or operator: I can't apply the same desconstruction idea on it because this would invalidate the condition.



So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?



Creating more indexes won't help because they're not used for the $lookup. Moving up $match phases, as the MongoDB team would suggest is also not possible because it would break the conditions. So I'm out of ideas now.



Best regards.










share|improve this question















I'm trying to find a way to create an engine that translates GraphQL query filters to MongoDB aggregations while keeping the performance. Our application has the requisite of limiting the results from collection A by applying filters to collection B, C and even D sometimes.



For better understanding, here's a sample about how a filter is translated to MongoDB.



This:



{
"filter": {
"return": null,
"AND": [{
"customer_WITH": {
"OR": [{
"code": "CUSTOMER NAME"
}, {
"commercialName_LIKE": "CUSTOMER NAME"
}, {
"corporateName_LIKE": "CUSTOMER NAME"
}]
}
}],
"OR": [{
"dispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}, {
"redispatcher_WITH": {
"company_WITH": {
"corporateName_LIKE": "COMPANY NAME"
}
}
}],
"reversal": null
}
}


Gets translated to this:



[{
"$match": {
"return": {
"$eq": null
},
"reversal": {
"$eq": null
},
"company": {
"$eq": ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx")
}
}
}, {
"$lookup": {
"as": "dispatcher",
"from": "shippers",
"localField": "dispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "dispatcher.company",
"from": "companies",
"localField": "dispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$dispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher",
"from": "shippers",
"localField": "redispatcher",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "redispatcher.company",
"from": "companies",
"localField": "redispatcher.company",
"foreignField": "_id"
}
}, {
"$unwind": {
"path": "$redispatcher.company",
"preserveNullAndEmptyArrays": true
}
}, {
"$lookup": {
"as": "customer",
"from": "customers",
"localField": "customer",
"foreignField": "_id"
}
}, {
"$match": {
"$or": [{
"dispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}, {
"redispatcher.company.corporateName": {
"$regex": /sCOMPANYsNAME/
}
}],
"$and": [{
"$or": [{
"customer.code": {
"$eq": "CUSTOMER NAME"
}
}, {
"customer.commercialName": {
"$regex": /CUSTOMERsNAME/
}
}, {
"customer.corporateName": {
"$regex": /CUSTOMERsNAME/
}
}]
}]
}
}, {
"$unwind": {
"path": "$customer",
"preserveNullAndEmptyArrays": true
}
}, {
"$group": {
"_id": "$invoiceNo",
"__rootId": {
"$first": "$_id"
},
"company": {
"$first": "$company"
},
"customer": {
"$first": "$customer._id"
},
"dispatcher": {
"$first": "$dispatcher._id"
},
"redispatcher": {
"$first": "$redispatcher._id"
},
"driverPlate": {
"$first": "$driverPlate"
},
"key": {
"$first": "$key"
},
"activities": {
"$first": "$activities"
},
"serialNo": {
"$first": "$serialNo"
},
"invoiceNo": {
"$first": "$invoiceNo"
},
"incidents": {
"$first": "$incidents"
},
"deliveries": {
"$first": "$deliveries"
},
"return": {
"$first": "$return"
}
}
}, {
"$project": {
"_id": "$__rootId",
"company": "$company",
"customer": "$customer",
"dispatcher": "$dispatcher",
"redispatcher": "$redispatcher",
"driverPlate": "$driverPlate",
"key": "$key",
"activities": "$activities",
"serialNo": "$serialNo",
"invoiceNo": "$invoiceNo",
"incidents": "$incidents",
"deliveries": "$deliveries",
"return": "$return"
}
}, {
"$sort": {
"invoiceNo": -1
}
}, {
"$limit": 51
}]


The engine is smart enough to reallocate to the first position $match properties that don't require $lookups and right after $lookups if they do, however if they are within a $and/$or condition block, then they are reallocated after the last $lookup, regardless of what properties are there.



I could scan for what is used inside the $and and deconstruct it into new reallocated $match phases, but I need to figure how to handle the $or operator: I can't apply the same desconstruction idea on it because this would invalidate the condition.



So my question is: Is there an alternative way to use the phase $lookup along with $and/$or and improve the performance drastically?



Creating more indexes won't help because they're not used for the $lookup. Moving up $match phases, as the MongoDB team would suggest is also not possible because it would break the conditions. So I'm out of ideas now.



Best regards.







javascript mongodb performance graphql gql






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 12 at 18:00

























asked Nov 12 at 17:40









Rubens Felipe

112




112












  • Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.
    – Neil Lunn
    Nov 12 at 20:46












  • Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.
    – Rubens Felipe
    Nov 13 at 0:10












  • Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.
    – Neil Lunn
    Nov 13 at 0:15










  • Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.
    – Neil Lunn
    Nov 13 at 0:17


















  • Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.
    – Neil Lunn
    Nov 12 at 20:46












  • Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.
    – Rubens Felipe
    Nov 13 at 0:10












  • Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.
    – Neil Lunn
    Nov 13 at 0:15










  • Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.
    – Neil Lunn
    Nov 13 at 0:17
















Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.
– Neil Lunn
Nov 12 at 20:46






Seems from the content of the question that you already asked this question elsewhere, being either MongoDB support or the MongoDB mailing list. I'm not sure why you would be expecting a different response here. Ideally the $match block(s) ( other than the initial one ) would be directly after each $unwind, however it's an $or condition over multiple sub-document paths ( now linked and extracted ). Bottom line is that this structure probably should not be denormalized into separate collections in the first place. The modelling appears very relational, and that's very wrong.
– Neil Lunn
Nov 12 at 20:46














Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.
– Rubens Felipe
Nov 13 at 0:10






Actually not, but I did my fair amount of research before posting it here, but I still didn't noticed any question that would fit 100% my criteria. Now about the relationships: if you're talking about how I should use embedded documents would that improve the performance greatly? If you say so I could run a function that overwrites each ObjectId reference with the document itself. That would take a bit of time and testing but I don't think it would be impossible to be worked out. However there are some few references that should be kept to prevent data duplication.
– Rubens Felipe
Nov 13 at 0:10














Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.
– Neil Lunn
Nov 13 at 0:15




Would that improve performance greatly? Well most likely so considering the data would already be in one collection and sort of "pre-joined" as a simple way of putting it. Point here is that if you use MongoDB just the same as an RDBMS then A. There was probably little point in not using an RDBMS. 2. You just end up with the same performance problems or indeed worse, considering the thing is not really designed to be "relational" in the first place. $lookup is just "a way to get around things", and should not be used as a "central design point". This seems your basic problem.
– Neil Lunn
Nov 13 at 0:15












Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.
– Neil Lunn
Nov 13 at 0:17




Of course "How do you measure such performance?", well you test of course, and keep testing until you see the structure which actually gives the best performance/functionality split. Which is actually why you use something other than an RDBMS in the first place.
– Neil Lunn
Nov 13 at 0:17

















active

oldest

votes











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53267388%2fslow-query-performance-mongodb-with-lookup-and-and-or%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown






























active

oldest

votes













active

oldest

votes









active

oldest

votes






active

oldest

votes
















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.





Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


Please pay close attention to the following guidance:


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53267388%2fslow-query-performance-mongodb-with-lookup-and-and-or%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Xamarin.iOS Cant Deploy on Iphone

Glorious Revolution

Dulmage-Mendelsohn matrix decomposition in Python