Add support for pagination or scrolling in Dataset.get_data #162

Open
theavey opened this issue Feb 25, 2021 · 3 comments

Comments

theavey commented Feb 25, 2021

Describe the problem.
For requests that are apparently too large, the API returns a timeout error. It isn't clear beforehand which requests will be too large, and a timeout error is not particularly helpful.

I opened a ticket with Marquee support asking for the best way to handle this or for a fix, but haven't heard back in a couple of days.

Describe the solution you'd like
If the API already supports pagination or scrolling, it could be made accessible in the method. Then I could write a wrapper that iterates over chunks and combines the results.

Describe alternatives you've considered
I have some code that iterates over years, but that sometimes fails. I could use smaller date ranges, but that would be overkill for smaller requests. I think the biggest issue with the alternatives is that I have to pick a chunk size before I know which requests will fail, and each additional call adds latency to my code.
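One way to avoid choosing a chunk size up front is to try the full range first and split it only when a request times out. The following is a minimal sketch of that idea, not part of the package: `fetch_adaptive`, the `fetch` callable, and the use of the built-in `TimeoutError` are all assumptions for illustration. The exception the API client actually raises on a timeout may differ and can be passed in via `timeout_errors`.

```python
from datetime import date, timedelta


def fetch_adaptive(fetch, start, end, timeout_errors=(TimeoutError,)):
    """Fetch [start, end] in one call; on timeout, bisect the range and retry.

    `fetch` is a hypothetical callable taking (start, end) as inclusive dates.
    Returns a list of results, one per successful call, in date order.
    """
    try:
        return [fetch(start, end)]
    except timeout_errors:
        if start >= end:
            # A single day cannot be split further, so surface the error.
            raise
        half = timedelta(days=(end - start).days // 2)
        mid = start + half
        left = fetch_adaptive(fetch, start, mid, timeout_errors)
        right = fetch_adaptive(fetch, mid + timedelta(days=1), end, timeout_errors)
        return left + right
```

Small requests still cost a single call, and only ranges that actually fail are split. If `fetch` returns DataFrames, the resulting list could be combined with `pandas.concat`.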

Are you willing to contribute
Yes

Additional context
I can provide examples of requests that timed out if that's helpful, though running the examples might require access to our paid datasets.

Dhavin commented Aug 20, 2021

Hello @theavey, I would like to contribute to this project by solving this issue. Can I?

Author

theavey commented Aug 20, 2021

I am not an admin of this repo, but that would be great. I've had to implement other workarounds, but a more "native" solution within the package would be helpful.

Cruppelt commented

Hey @theavey and @Dhavin, we will look into this request. Currently, our Data APIs don't have a scroll/pagination API. If you are seeing timeouts for larger range queries, we currently recommend making smaller date/time range requests. These queries can be parallelized via threads for potentially significant speed improvements. We also have a utility class (ThreadPoolManager) that helps manage the threads, sessions, and contexts.
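As a rough illustration of the recommendation above, the sketch below splits a date range into fixed-size chunks and fetches them in parallel. It uses the standard library's `concurrent.futures.ThreadPoolExecutor` rather than `ThreadPoolManager`, whose interface is not described in this thread. `split_range`, `fetch_parallel`, and the `fetch` callable are hypothetical names, and session or context handling inside the worker threads is not covered here.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta


def split_range(start, end, chunk_days):
    """Yield inclusive (chunk_start, chunk_end) date pairs covering [start, end]."""
    if chunk_days < 1:
        raise ValueError("chunk_days must be at least 1")
    one_day = timedelta(days=1)
    cur = start
    while cur <= end:
        chunk_end = min(cur + timedelta(days=chunk_days) - one_day, end)
        yield cur, chunk_end
        cur = chunk_end + one_day


def fetch_parallel(fetch, start, end, chunk_days=30, max_workers=4):
    """Call the hypothetical fetch(start, end) once per chunk, in parallel threads."""
    chunks = list(split_range(start, end, chunk_days))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input chunks.
        return list(pool.map(lambda c: fetch(*c), chunks))
```

Here `fetch` would be a small function that requests data for one date range and returns the result. The list comes back in date order, so DataFrame results could be combined with `pandas.concat`.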
