-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add development runs #522
Add development runs #522
Conversation
- Add "mode: debug" property for environments - Add "--deploy", "--compute", "--no-wait" CLI flags
cmd/bundle/deploy.go
Outdated
b.Config.Bundle.Lock.Force = forceDeploy | ||
|
||
if computeID == "" { | ||
computeID = os.Getenv("DATABRICKS_CLUSTER_ID") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this to DATABRICKS_CLUSTER_ID for now since we have that precedent. Likely we need to change both of them over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field should also be called cluster_id
if we pull the cluster ID. We can have one for each of clusters, compute, warehouses. The flag parsing + environment overrides should happen centrally and not by reusing globals from other commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compute and compute_id are the general terms we're using for serverless jobs / interactive to indicate both clusters and "serverless" clusters. Warehouses are separate. I'd rather not change the name to cluster_id if it's really an id that also works for serverless compute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, computeID
works. Can you remove the getenv check though? It will mean that deployment fails if that value is set if it's not in dev mode, which would be unexpected behavior to me. Having the flag alone captures the intent without that failure mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated it so we only fail if the override is set in the bundle or CLI, and not when it's set in the environment.
cmd/bundle/deploy.go
Outdated
b.Config.Bundle.Lock.Force = forceDeploy | ||
|
||
if computeID == "" { | ||
computeID = os.Getenv("DATABRICKS_CLUSTER_ID") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The field should also be called cluster_id
if we pull the cluster ID. We can have one for each of clusters, compute, warehouses. The flag parsing + environment overrides should happen centrally and not by reusing globals from other commands.
bundle/config/bundle.go
Outdated
Mode Mode `json:"mode,omitempty"` | ||
|
||
// Overrides the compute used for jobs and other supported assets. | ||
Compute string `json:"override_compute,omitempty" bundle:"readonly"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a different name for the JSON field and the Go field?
Does the compute field in the jobs API apply to clusters as well? Or do you take this value and plan to use it in both the compute and cluster fields in the job specs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it is used as ID, so should be called ComputeID
to match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the expected behavior if this value is set by the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this to just say computeId/ComputeID. Is "-id" the customer-facing convention we have throughout the spec?
Note that this wasn't meant to set from the config, hence the "readonly," but that doesn't seem to do what I expected it to. I updated it so it can be set at the environment level now.x
In terms of naming, what I can do is also change the parameter name to say "--compute-id"/"-c" instead of "--compute"? It's a bit longer but I tentatively made that change.
@@ -18,6 +18,10 @@ func (m *populateCurrentUser) Name() string { | |||
} | |||
|
|||
func (m *populateCurrentUser) Apply(ctx context.Context, b *bundle.Bundle) error { | |||
if b.Config.Workspace.CurrentUser != nil { | |||
return nil | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious: has this run twice in your testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it would, but I think that may have only happened with databricks deploy --run
. So I'll remove this.
We should actually have a general fix to make sure w.CurrentUser.Me()
is cached, since it is very very slow and was still being called a second time in some other cases. This could be part of a general initiative to make things a bit faster.
r.Jobs[i].MaxConcurrentRuns = debugConcurrentRuns | ||
} | ||
if r.Jobs[i].Schedule != nil { | ||
r.Jobs[i].Schedule.PauseStatus = "PAUSED" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These "PAUSED" strings should be the constant jobs.PauseStatusPaused
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Is that another SDK name that should be changed, btw? :( That doesn't seem like a very idiomatic name.
cmd/bundle/deploy.go
Outdated
b.Config.Bundle.Lock.Force = forceDeploy | ||
|
||
if computeID == "" { | ||
computeID = os.Getenv("DATABRICKS_CLUSTER_ID") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, computeID
works. Can you remove the getenv check though? It will mean that deployment fails if that value is set if it's not in dev mode, which would be unexpected behavior to me. Having the flag alone captures the intent without that failure mode.
529ab3f
to
6b221c0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. See comment for failing test explanation.
* Add development runs ([#522](#522)). * Support tab completion for profiles ([#572](#572)). * Correctly use --profile flag passed for all bundle commands ([#571](#571)). * Disallow notebooks in paths where files are expected ([#573](#573)). * Improve auth login experience ([#570](#570)). * Remove base path checks during sync ([#576](#576)). * First look for databricks.yml before falling back to bundle.yml ([#580](#580)). * Integrate with auto-release infra ([#581](#581)). API Changes: * Removed `databricks metastores maintenance` command. * Added `databricks metastores enable-optimization` command. * Added `databricks tables update` command. * Changed `databricks account settings delete-personal-compute-setting` command with new required argument order. * Changed `databricks account settings read-personal-compute-setting` command with new required argument order. * Added `databricks clean-rooms` command group. OpenAPI SHA: 850a075ed9758d21a6bc4409506b48c8b9f93ab4, Date: 2023-07-18 Dependency updates: * Bump golang.org/x/term from 0.9.0 to 0.10.0 ([#567](#567)). * Bump golang.org/x/oauth2 from 0.9.0 to 0.10.0 ([#566](#566)). * Bump golang.org/x/mod from 0.11.0 to 0.12.0 ([#568](#568)). * Bump github.com/databricks/databricks-sdk-go from 0.12.0 to 0.13.0 ([#585](#585)).
* Add development runs ([#522](#522)). * Support tab completion for profiles ([#572](#572)). * Correctly use --profile flag passed for all bundle commands ([#571](#571)). * Disallow notebooks in paths where files are expected ([#573](#573)). * Improve auth login experience ([#570](#570)). * Remove base path checks during sync ([#576](#576)). * First look for databricks.yml before falling back to bundle.yml ([#580](#580)). * Integrate with auto-release infra ([#581](#581)). API Changes: * Removed `databricks metastores maintenance` command. * Added `databricks metastores enable-optimization` command. * Added `databricks tables update` command. * Changed `databricks account settings delete-personal-compute-setting` command with new required argument order. * Changed `databricks account settings read-personal-compute-setting` command with new required argument order. * Added `databricks clean-rooms` command group. OpenAPI SHA: 850a075ed9758d21a6bc4409506b48c8b9f93ab4, Date: 2023-07-18 Dependency updates: * Bump golang.org/x/term from 0.9.0 to 0.10.0 ([#567](#567)). * Bump golang.org/x/oauth2 from 0.9.0 to 0.10.0 ([#566](#566)). * Bump golang.org/x/mod from 0.11.0 to 0.12.0 ([#568](#568)). * Bump github.com/databricks/databricks-sdk-go from 0.12.0 to 0.13.0 ([#585](#585)).
* Add development runs ([#522](#522)). * Support tab completion for profiles ([#572](#572)). * Correctly use --profile flag passed for all bundle commands ([#571](#571)). * Disallow notebooks in paths where files are expected ([#573](#573)). * Improve auth login experience ([#570](#570)). * Remove base path checks during sync ([#576](#576)). * First look for databricks.yml before falling back to bundle.yml ([#580](#580)). * Integrate with auto-release infra ([#581](#581)). API Changes: * Removed `databricks metastores maintenance` command. * Added `databricks metastores enable-optimization` command. * Added `databricks tables update` command. * Changed `databricks account settings delete-personal-compute-setting` command with new required argument order. * Changed `databricks account settings read-personal-compute-setting` command with new required argument order. * Added `databricks clean-rooms` command group. OpenAPI SHA: 850a075ed9758d21a6bc4409506b48c8b9f93ab4, Date: 2023-07-18 Dependency updates: * Bump golang.org/x/term from 0.9.0 to 0.10.0 ([#567](#567)). * Bump golang.org/x/oauth2 from 0.9.0 to 0.10.0 ([#566](#566)). * Bump golang.org/x/mod from 0.11.0 to 0.12.0 ([#568](#568)). * Bump github.com/databricks/databricks-sdk-go from 0.12.0 to 0.13.0 ([#585](#585)).
(This is a new copy of #493, as my fork was broken when this repository was made public.)
This implements the "debug run" or "development run" functionality that we desire for DABs in the workspace / IDE.
bundle.yml changes
In bundle.yml, there should be a "dev" environment that is marked as
mode: debug
:Setting
mode
todevelopment
indicates that this environment is used just for running things for development. This results in several changes to deployed assets:--compute-id
parameter in the CLIOther accepted values for
mode
aredefault
(which does nothing) andpull-request
(which is reserved for future use).CLI changes
To run a single job called "shark_sighting" on existing compute, use the following commands:
which would deploy and run a job called "[dev] shark_sightings" on the compute provided. Note that
--compute-id
is not accepted in production environments, so we show an error ifmode: development
is not used.The
run --deploy
command offers a convenient shorthand for the common combination of deploying & running:The
--deploy
addition isn't really essential and I welcome feedback 🤔 I played with the idea of a "debug" or "dev" command but that seemed to only make the option space even broader for users. The above could work well with an IDE or workspace that automatically sets the target compute.One more thing I added is
run --no-wait
can now be used to run something without waiting for it to be completed (useful for IDE-like environments that can display progress themselves).