Make error messages on typos and missing/incorrect data types more informative #2824

Merged
merged 10 commits into from
Feb 21, 2023
17 changes: 17 additions & 0 deletions altair/utils/core.py
@@ -532,6 +532,23 @@ def parse_shorthand(
if isinstance(attrs["type"], tuple):
attrs["sort"] = attrs["type"][1]
attrs["type"] = attrs["type"][0]

+ # If an unescaped colon is still present, it's often due to an incorrect data type specification
+ # but could also be due to using a column name with ":" in it.
+ if (
+ "field" in attrs
+ and ":" in attrs["field"]
+ and attrs["field"][attrs["field"].rfind(":") - 1] != "\\"
+ ):
+ raise ValueError(
+ '"{}" '.format(attrs["field"].split(":")[-1])
+ + "is not one of the valid encoding data types: {}.".format(
+ ", ".join(TYPECODE_MAP.values())
+ )
+ + "\nFor more details, see https://altair-viz.github.io/altair-docs/user_guide/encodings/index.html#encoding-data-types. "
+ + "If you are trying to use a column name that contains a colon, "
+ + 'prefix it with a backslash; for example "column\\:name" instead of "column:name".'
+ )
return attrs


9 changes: 6 additions & 3 deletions altair/vegalite/v5/schema/channels.py
@@ -38,9 +38,12 @@ def to_dict(self, validate=True, ignore=(), context=None):
parsed.pop('type', None)
elif not (type_in_shorthand or type_defined_explicitly):
if isinstance(context.get('data', None), pd.DataFrame):
- raise ValueError("{} encoding field is specified without a type; "
- "the type cannot be inferred because it does not "
- "match any column in the data.".format(shorthand))
+ raise ValueError(
+ 'Unable to determine data type for the field "{}";'
+ " verify that the field name is not misspelled."
+ " If you are referencing a field from a transform,"
+ " also confirm that the data type is specified correctly.".format(shorthand)
+ )
else:
raise ValueError("{} encoding field is specified without a type; "
"the type cannot be automatically inferred because "
3 changes: 2 additions & 1 deletion doc/releases/changes.rst
@@ -21,7 +21,7 @@ Enhancements
- Saving charts with HTML inline is now supported without having altair_saver installed (#2807).
- The documentation page has been revamped, both in terms of appearance and content.
- More informative autocompletion by removing deprecated methods (#2814) and adding support for completion in method chains for editors that rely on type hints (e.g. VS Code) (#2846)
- - Improved error messages (#2842)
+ - Substantially improved error handling. Both in terms of finding the more relevant error (#2842), and in terms of improving the formatting and clarity of the error messages (#2824, #2568).
- Include experimental support for the DataFrame Interchange Protocol (through `__dataframe__` attribute). This requires `pyarrow>=11.0.0` (#2888).

Grammar Changes
@@ -45,6 +45,7 @@ Bug Fixes
Backward-Incompatible Changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+ - Colons in column names must now be escaped to remove any ambiguity with encoding types. You now need to write ``"column\:name"`` instead of ``"column:name"`` (#2824).
- Removed the Vega (v5) wrappers and deprecate rendering in Vega mode (save Chart as Vega format is still allowed) (#2829).
- Removed the Vega-Lite 3 and 4 wrappers (#2847).
- In regards to the grammar changes listed above, the old terminology will still work in many basic cases. On the other hand, if that old terminology gets used at a lower level, then it most likely will not work. For example, in the current version of :ref:`gallery_scatter_with_minimap`, two instances of the key ``param`` are used in dictionaries to specify axis domains. Those used to be ``selection``, but that usage is not compatible with the current Vega-Lite schema.
17 changes: 12 additions & 5 deletions tests/utils/tests/test_core.py
@@ -79,7 +79,7 @@ def check(s, **kwargs):

# Fields alone
check("foobar", field="foobar")
- check("blah:(fd ", field="blah:(fd ")
+ check(r"blah\:(fd ", field=r"blah\:(fd ")

# Fields with type
check("foobar:quantitative", type="quantitative", field="foobar")
@@ -101,14 +101,14 @@ def check(s, **kwargs):

# check that invalid arguments are not split-out
check("invalid(blah)", field="invalid(blah)")
- check("blah:invalid", field="blah:invalid")
- check("invalid(blah):invalid", field="invalid(blah):invalid")
+ check(r"blah\:invalid", field=r"blah\:invalid")
+ check(r"invalid(blah)\:invalid", field=r"invalid(blah)\:invalid")

# check parsing in presence of strange characters
check(
- "average(a b:(c\nd):Q",
+ r"average(a b\:(c\nd):Q",
aggregate="average",
- field="a b:(c\nd",
+ field=r"a b\:(c\nd",
type="quantitative",
)

@@ -281,3 +281,10 @@ def test_infer_encoding_types_with_condition():
),
)
assert infer_encoding_types(args, kwds, channels) == expected


+ def test_invalid_data_type():
+ with pytest.raises(
+ ValueError, match=r'"\(fd " is not one of the valid encoding data types'
+ ):
+ parse_shorthand(r"blah:(fd ")
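
As a rough, standalone illustration of the colon check this PR adds to `parse_shorthand`, the sketch below applies the same rule to a bare field string. The helper name `check_unescaped_colon` and the `ASSUMED_TYPE_CODES` tuple are hypothetical and not part of Altair; the tuple stands in for `TYPECODE_MAP.values()` and its contents are an assumption. One deliberate difference from the diff: the sketch handles a colon at index 0 explicitly, because in Python an index of -1 refers to the last character of the string.

```python
# Assumption: the one-letter type codes accepted in shorthand strings.
# This tuple is a stand-in for TYPECODE_MAP.values() in the diff.
ASSUMED_TYPE_CODES = ("O", "N", "Q", "T", "G")


def check_unescaped_colon(field: str) -> None:
    """Raise ValueError if the last colon in `field` is not backslash-escaped.

    Hypothetical helper for illustration only; it is not part of Altair.
    """
    idx = field.rfind(":")
    if idx == -1:
        return  # no colon, nothing to check
    # Guard idx == 0 explicitly: field[idx - 1] would otherwise be field[-1],
    # which is the last character of the string, not a preceding character.
    if idx > 0 and field[idx - 1] == "\\":
        return  # escaped colon, treated as part of the column name
    raise ValueError(
        '"{}" is not one of the valid encoding data types: {}.'.format(
            field.split(":")[-1], ", ".join(ASSUMED_TYPE_CODES)
        )
    )


check_unescaped_colon("foobar")       # passes: no colon
check_unescaped_colon(r"blah\:(fd ")  # passes: the colon is escaped
try:
    check_unescaped_colon("blah:(fd ")
except ValueError as err:
    print(err)
```

The last call mirrors the new `test_invalid_data_type` test: the text after the final unescaped colon is reported as an invalid data type.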