[BUG] Order of elements in list[Link] is not preserved
moonrail opened this issue · comments
Describe the bug
As #414 and #527 were closed without any interaction by a stale bot (yay...) here another attempt.
When using DBRefs/Links, the ordering is stable when using MongoDB directly or any of its lower level libraries.
This is however not always the case with beanie.
Upon insert/update the order is preserved, as defined.
Upon link-fetching via e.g. fetch_links
or fetch_all_links
, it is not.
A possible solution can be found here: #414 (comment)
This just copies all Ids to a list with the original order and sorts the fetched documents later based on this order.
To Reproduce
Examples can be found in both older issues.
Expected behavior
Using fetch_all_links
and fetch_links
keeps the order of items as defined as DBRefs in the database.
Additional context
Note that in addition to not preserving the order of elements, fetch_links
may even change the length of the returned list by removing duplicate elements. In summary, fetch_links
behaves like an $in
query, not like a list fetch.
The offending code is here, in the LinkTypes.LIST
branch of the construct_query
function in beanie.odm.utils.find
:
beanie/beanie/odm/utils/find.py
Lines 270 to 275 in 51e73eb
I see this as more of a correctness issue than a convenience issue. As a mongodb datastructure, arrays can contain duplicates and their order is part of their semantics. Users will expect fetching an array to fetch elements in order, including duplicates. I believe there are ways to dereference an array of ids efficiently, but even if it turns out that this is a fundamentally slow operation, then I think the solution would be to warn users of the conditions under which the operation is slow rather than to break array semantics.
As a proof of concept, here is an example query that does an in-order list fetch that could be modified to replace the existing $lookup
stage in construct_query
(playground link: https://mongoplayground.net/p/t7TyJK1oOEI)
db.orders.aggregate([
{
$unwind: "$items"
},
{
$lookup: {
from: "inventory",
localField: "items.$id",
foreignField: "_id",
as: "items"
}
},
{
$group: {
_id: "$_id",
items: {
$push: {
$first: "$items"
}
},
__oldRoot: {
$last: "$$ROOT"
}
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$__oldRoot",
{
items: "$items"
}
]
}
}
}
])
The idea is to unwind the array, perform an efficient lookup on the id, group the results into an object containing the fetched array and the old document, then merge the old document with a new document containing the fetched items.
@roman-right How do you feel about the solution proposed here? Have I missed anything or do you have other concerns that I could help address? Before submitting a PR I would want to have you on-board with the general plan.
This issue is stale because it has been open 30 days with no activity.