Move to position of sql.Rows result

dskinner · September 30, 2015, 4:28pm

are there any libraries out there that allow efficiently moving to a given position of a query result without keeping the entire result set in memory?

fwip · September 30, 2015, 5:06pm

That seems like a strange thing to want to do. I don’t believe SQL guarantees the ordering of results (unless you explicitly use an ORDER BY clause), so I’m not sure what use case you’d need to select a specific index.

You can set a maximum number of queries to return with LIMIT or TOP (depending on your server), and there are some tricks you can do to specify an offset if you really need to - see this relevant Stack Overflow question.

Stabby · September 30, 2015, 6:20pm

The db library doesn’t keep everything in memory - it loads chunks of data at a time, filling a buffer. Once you hit Next() enough times, the buffer is run through, and it fetches the next chunk.

I think you can do what you want by simply calling Next() X times, where X is the row you want to end up on?

dskinner · September 30, 2015, 6:30pm

How the result set is actually buffered depends on the implementation of a given driver. What I’m asking is about moving to any position in the result set, not just moving in one direction through the set.

Stabby · September 30, 2015, 6:40pm

Thats a good point - the buffering is up to the driver.

But I believe the sql interface is such that you can only go forwards, not backwards, and thus there is no arbitrary seek.

dskinner · September 30, 2015, 6:59pm

I’m not concerned about ordering, just moving to a specific position within a result set. The set would either need to exist entirely in memory or be cached in some other fashion in case the memory required to hold the set isn’t available so as to maintain a “view” of the set as it were when the query was executed.

I’m not expecting this of any db driver, just wondering if there was a package out there, somewhere, that facilitated this.

riking · October 1, 2015, 4:12am

What exactly is your use-case for seeking within a result set?

Do you know how big the result is going to be?

Tamas_Gulacsi · October 1, 2015, 4:25am

Some servers allow specifying a read order on the cursor, and jdbc includes functions for it, but I haven’t used it yet.

I think a python linecache module equivalent would work, with marshaling if cannot fit into memory.

If it fits in memory, then just use a slice to store the already read part; otherwise an ordered kv store can help.

infogulch · October 2, 2015, 4:57pm

If you’re looking for array-like random reads in the result set you’ll probably need to cache the whole result as you read through it. Hopefully it fits in memory (and will always fit in memory…). I don’t know of a library that would do this for you.

But it sounds like you’re trying to do client-side looping logic that is typically better off done at the db layer. Can you restructure the query to give you exactly what you want as you read through it the first time? If you want a random subsample you might be looking for reservoir sampling. Or for a simpler in-db solution for randomness you can use ORDER BY RAND() or variations (this is sometimes not recommended depending on the db; google should help here).

That’s likely all I can say without more details.

thoffen · October 2, 2015, 5:35pm

I don’t know the answer, unfortunately.

However I’m curious as to why you are seeking this capability, as others have asked.

If the order of results returned is random, what’s the point in selecting a specific starting point? Sort of like sticking your hand into a bucket of randomly distributed raffle tickets and expecting to magically pick out the right one.

system · December 31, 2015, 5:35pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.