are there any libraries out there that allow efficiently moving to a given position of a query result without keeping the entire result set in memory?
That seems like a strange thing to want to do. I don’t believe SQL guarantees the ordering of results (unless you explicitly use an ORDER BY
clause), so I’m not sure what use case you’d need to select a specific index.
You can set a maximum number of queries to return with LIMIT
or TOP
(depending on your server), and there are some tricks you can do to specify an offset if you really need to - see this relevant Stack Overflow question.
The db library doesn’t keep everything in memory - it loads chunks of data at a time, filling a buffer. Once you hit Next() enough times, the buffer is run through, and it fetches the next chunk.
I think you can do what you want by simply calling Next() X times, where X is the row you want to end up on?
How the result set is actually buffered depends on the implementation of a given driver. What I’m asking is about moving to any position in the result set, not just moving in one direction through the set.
Thats a good point - the buffering is up to the driver.
But I believe the sql interface is such that you can only go forwards, not backwards, and thus there is no arbitrary seek.
I’m not concerned about ordering, just moving to a specific position within a result set. The set would either need to exist entirely in memory or be cached in some other fashion in case the memory required to hold the set isn’t available so as to maintain a “view” of the set as it were when the query was executed.
I’m not expecting this of any db driver, just wondering if there was a package out there, somewhere, that facilitated this.
What exactly is your use-case for seeking within a result set?
Do you know how big the result is going to be?
Some servers allow specifying a read order on the cursor, and jdbc includes functions for it, but I haven’t used it yet.
I think a python linecache module equivalent would work, with marshaling if cannot fit into memory.
If it fits in memory, then just use a slice to store the already read part; otherwise an ordered kv store can help.
If you’re looking for array-like random reads in the result set you’ll probably need to cache the whole result as you read through it. Hopefully it fits in memory (and will always fit in memory…). I don’t know of a library that would do this for you.
But it sounds like you’re trying to do client-side looping logic that is typically better off done at the db layer. Can you restructure the query to give you exactly what you want as you read through it the first time? If you want a random subsample you might be looking for reservoir sampling. Or for a simpler in-db solution for randomness you can use ORDER BY RAND()
or variations (this is sometimes not recommended depending on the db; google should help here).
That’s likely all I can say without more details.
I don’t know the answer, unfortunately.
However I’m curious as to why you are seeking this capability, as others have asked.
If the order of results returned is random, what’s the point in selecting a specific starting point? Sort of like sticking your hand into a bucket of randomly distributed raffle tickets and expecting to magically pick out the right one.
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.