Describe the bug
The extract API raises an error on a valid regex.
Steps/Code to reproduce bug
>>> import cudf
>>> s = cudf.Series(['a1', 'b2', 'c3'])
>>> s.str.extract(r'(?P<letter>[ab])(?P<digit>\d)')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/conda/envs/cudf/lib/python3.7/site-packages/cudf/core/column/string.py", line 452, in extract
    out = cpp_extract(self._column, pat)
  File "cudf/_lib/strings/extract.pyx", line 32, in cudf._lib.strings.extract.extract
RuntimeError: cuDF failure at: /cudf/cpp/src/strings/regex/regcomp.cpp:398: invalid regex pattern: nothing to repeat at position 1
>>> s.to_pandas().str.extract(r'(?P<letter>[ab])(?P<digit>\d)')
  letter digit
0      a     1
1      b     2
2    NaN   NaN
# The above regex seems to be a valid one.
>>> import re
>>> re.compile(r'(?P<letter>[ab])(?P<digit>\d)')
re.compile('(?P<letter>[ab])(?P<digit>\\d)')
Expected behavior
I think we shouldn't raise an error for this regex. As a follow-up, we may also need an API that extracts the column names from the regex's named groups, as in this case.
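For the follow-up on column names, here is a minimal sketch of how they could be recovered on the Python side using only the standard re module. The helper name named_group_columns is hypothetical and not part of cuDF. It assumes the pattern compiles under Python's re, and it labels unnamed groups by their zero-based position, which is my understanding of how pandas labels extract() output.

```python
import re


def named_group_columns(pattern):
    """Return one column label per capture group, in group order.

    Hypothetical helper (not a cuDF API): named groups use their name,
    unnamed groups fall back to their zero-based position.
    """
    compiled = re.compile(pattern)
    # groupindex maps group name -> 1-based group number; invert it.
    names_by_number = {num: name for name, num in compiled.groupindex.items()}
    return [names_by_number.get(i, i - 1) for i in range(1, compiled.groups + 1)]


columns = named_group_columns(r'(?P<letter>[ab])(?P<digit>\d)')
print(columns)  # ['letter', 'digit']
```

These labels could then be assigned to the columns of the extracted frame once the pattern itself is accepted.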
Environment overview (please complete the following information)
branch-0.14
Additional context
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.extract.html