Run provider getters in parallel to reduce startup time#118

Open
michaelasp wants to merge 1 commit into kubernetes-sigs:main from michaelasp:providerParallel

Conversation

@michaelasp
Contributor

When reviewing #116 I saw that we don't run these getters in parallel. Since each of them times out after ~5 seconds, in the worst case startup takes 5 × (number of providers) seconds. Running them in parallel should add minimal network overhead, and it already cuts the worst-case startup time by 10 seconds.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: michaelasp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 30, 2026
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 30, 2026
@michaelasp michaelasp requested a review from aojea March 30, 2026 18:02
@aojea
Contributor

aojea commented Mar 30, 2026

Can we identify whether the URLs are the same? Running a bunch of HTTP requests in parallel will hit some rate limit and make it worse... We should also have a metric for startup latency, and one that specifically measures this time, to identify the instance...

I want to avoid premature optimization here

@michaelasp
Contributor Author

Can we identify whether the URLs are the same? Running a bunch of HTTP requests in parallel will hit some rate limit and make it worse... We should also have a metric for startup latency, and one that specifically measures this time, to identify the instance...

I want to avoid premature optimization here

169.254.169.254 is the backend for all of them, so we will likely get a response, but from what I can tell we will poll until timeout on all of them.

Let me take a look. Either way, just looking at it heuristically, if we use OKE today then we have to time out twice on startup. 10s isn't too bad, but it will grow as we add AWS, Alibaba, etc.

I can add startup latency metrics just to see whether this hypothesis is correct.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants