-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MAINTENANCE] Refactor usage_stats_opt_out
method in DataContext
#5339
[MAINTENANCE] Refactor usage_stats_opt_out
method in DataContext
#5339
Conversation
…ure/GREAT-953/migration-part1-move-to-abc
…ure/GREAT-953/migration-part1-move-to-abc * bugfix/GREAT-953/tests-deleting-env-variable: pushing fix for test
…ure/GREAT-953/migration-part1-move-to-abc * bugfix/GREAT-953/tests-deleting-env-variable: removed extra fixture
* develop: [MAINTENANCE] Suppress persisting of temporary ExpectationSuite configurations in Rule-Based Profiler computations (#5305) [DOCS] Update standard arguments doc for Expectations to not reference datasets. (#5052) [MAINTENANCE] Handle compare Expectation in presence of high precision floating point numbers and NaN values (#5298) [MAINTENANCE] `DataAssistantResult` cleanup and extensibility enhancements (#5259) chore: check for mypy existence (#5291)
…ure/GREAT-953/migration-part1-move-to-abc * bugfix/GREAT-953/tests-deleting-env-variable: Update migration_guide_postgresql_v3_api.py [MAINTENANCE] Suppress persisting of temporary ExpectationSuite configurations in Rule-Based Profiler computations (#5305) [DOCS] Update standard arguments doc for Expectations to not reference datasets. (#5052) [MAINTENANCE] Handle compare Expectation in presence of high precision floating point numbers and NaN values (#5298) [MAINTENANCE] `DataAssistantResult` cleanup and extensibility enhancements (#5259) chore: check for mypy existence (#5291)
…r-usage-stats-opt * develop: [MAINTENANCE] In ExecutionEngine: Make variable names and usage more descriptive of their purpose. (#6342) [MAINTENANCE] Refactor variable aggregation/substitution logic into `ConfigurationProvider` hierarchy (#6321) [MAINTENANCE] Add missing one-line docstrings and try to make the others consistent (#6340) [BUGFIX] Fix issue with misaligned indentation in docs snippets (#6339)
@@ -2437,6 +2440,21 @@ def _get_metric_configuration_tuples( | |||
|
|||
return metric_configurations_list | |||
|
|||
@classmethod | |||
def validate_config( | |||
cls, project_config: Union[DataContextConfig, Mapping] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
validate_config()
's behavior has been extended to always return a DataContextConfig
or error out. This helps with mypy
errors, which were complaining that parameters were being Union[DataContextConfig, Mapping]
in one place and DataContextConfig
in another.
It has also been moved up to AbstractDataContext
because BaseDataContext
and CloudDataContext
both use it.
This is the previous implementation
@classmethod
def validate_config(cls, project_config: Union[DataContextConfig, Mapping]) -> bool:
if isinstance(project_config, DataContextConfig):
return True
try:
dataContextConfigSchema.load(project_config)
except ValidationError:
raise
return True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this name still appropriate? Is there perhaps a better name to signal what it is doing now that it isn't returning a bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
) | ||
|
||
project_data_context_config: DataContextConfig = ( | ||
BaseDataContext.validate_config(project_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For static/class methods called in instances, do we want to use the class name or self
? Good either way but let's be consistent. I think I'm the one who added this class call but I might be the outlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great question @cdkini. I had a look, and I think it's a mix of class name and self 😅 My opinion is that we should use class name to distinguish between these and instance methods
@@ -88,8 +88,13 @@ def __init__( | |||
project_config = self.retrieve_data_context_config_from_ge_cloud( | |||
ge_cloud_config=self._ge_cloud_config, | |||
) | |||
|
|||
project_data_context_config: DataContextConfig = ( | |||
CloudDataContext.validate_config(project_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same note
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replied above
@@ -1668,7 +1669,7 @@ def convert_ndarray_decimal_to_float_dtype(data: np.ndarray) -> np.ndarray: | |||
|
|||
|
|||
def get_context( | |||
project_config: Optional[Union["DataContextConfig", dict]] = None, | |||
project_config: Optional[Union["DataContextConfig", Mapping]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we fix up this DataContextConfig
type annotation by using from __future__ import annotations
and if TYPE_CHECKING
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call :) doing now, and for a few more parameters below.
@@ -94,7 +94,7 @@ def test_common_usage_stats_are_sent_no_mocking( | |||
context._usage_statistics_handler._close_worker() | |||
|
|||
# Make sure usage stats are enabled | |||
assert not context._check_global_usage_statistics_opt_out() | |||
assert context._get_global_usage_statistics_override() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we had this issue before but I don't love directly accessing a private method here - is it possible to get around this at all? Do we need this assertion?
Feels like we might be testing an implementation detail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can understand this concern. I think we can get around this by checking the boolean flag on anonymous_usage_statistics
property, which we do directly below. If that is the case then we can just remove this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some misc comments but mainly LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a non blocking rename suggestion.
@@ -2530,21 +2545,24 @@ def _load_config_variables(self) -> Dict: | |||
return config_var_provider.get_values() | |||
return {} | |||
|
|||
def _check_global_usage_statistics_opt_out(self) -> bool: | |||
@staticmethod | |||
def _get_global_usage_statistics_override() -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something like _is_usage_stats_enabled() -> bool:
since this method returns a boolean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this suggestion :) changing now
@@ -83,7 +83,7 @@ def test_consistent_name_anonymization( | |||
|
|||
|
|||
@pytest.mark.base_data_context | |||
def test_opt_out_environment_variable_base_data_context( | |||
def test_global_override_environment_variable_base_data_context( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
- great_expectations.yml (YML) | ||
- GE_USAGE_STATS environment variable (env) | ||
- conf file (conf) | ||
If it is set as `False` in *any* of the 3 places, the global value is set to `False` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
…r-usage-stats-opt
…r-usage-stats-opt
* develop: [MAINTENANCE] Update column_reflection_fallback to also use schema name for Trino (#6350) [MAINTENANCE] Misc cleanup of GX Cloud helpers (#6352) [MAINTENANCE] Pin `mypy` to `0.990` (#6361) Subject: Support to include ID/PK in validation result for each row t… (#5876) [MAINTENANCE] Fix computed metrics type hint in ExecutionEngine.resolve_metrics() method (#6347) [MAINTENANCE] Refactor `usage_stats_opt_out` method in DataContext (#5339) Zep PostgresDatasource returns a list of batches. (#6341) [DOCS] Add `yarn snippet-check` command (#6351) [MAINTENANCE] Refactor out `termcolor` dependency (#6348) [MAINTENANCE] Move Cloud-specific enums to `cloud_constants.py` (#6349)
Changes proposed in this pull request:
Follow up to [MAINTENANCE] Move
_apply_global_config_overrides()
to AbstractDataContext #5285Small refactor to
_check_global_usage_statistics_opt_out()
so that we dont have to "flip" the boolean in our minds. Before ifopt_out
=True
, thenusage_statistics_enabled
=False
.Now we use
_get_global_usage_statistics_override()
which returnsTrue
ifusage_statistics_enabled
=True
, andFalse
ifusage_statistics_enabled
=False
.GREAT-953
Update 2022-08-26
anonymous_usage_stats_enabled
can be set in the following 3 places-
great_expectations.yml
(YML)-
GE_USAGE_STATS
environment variable (env)-
conf
file (conf)If it is set as
False
in any of the 3 places, the global value is set toFalse
environment variable > home folder > /etc > yml
False
will set the global usage_statistics flag toFalse
.Definition of Done
Please delete options that are not relevant.