Can't make regexp work #96

Open
wabernat opened this issue Sep 1, 2020 · 14 comments
Labels
bug Something isn't working

Comments

@wabernat

wabernat commented Sep 1, 2020

When I try to use regular expressions in the copybutton_prompt_text field (with copybutton_remove_prompts and copybutton_prompt_is_regexp set to True), sphinx-copybutton does not remove prompts.

To reproduce the behavior:

  1. Set conf.py as follows:
copybutton_prompt_text = r"\$\  |\#\ "
copybutton_prompt_is_regexp = True
copybutton_remove_prompts = True

  2. Run sphinx-build

Expectations

I expect sphinx-copybutton to detect the user "$ " or root "# " prompt and omit it from the copied text (i.e., exclude both prompt strings from copying).

Environment

A virtual environment (Dockerized) running:

  • Python version 2.7 or 3.7
  • Sphinx version 1.8.3 or 3.1.2
  • Operating System: OS X 10.15.5

Additional context

I am embarrassed to open this as a bug report, as my assumption is that this is either my ineptitude with regexp (though I have tried every permutation that I can imagine) or a weird setup issue. Is there a canonical environment in which regexps are functional?

@wabernat wabernat added the bug Something isn't working label Sep 1, 2020
@welcome

welcome bot commented Sep 1, 2020

Thanks for opening your first issue here! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.

If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).

Welcome to the EBP community! 🎉

@chrisjsewell
Member

Oh no, regexes can be tricky. It looks like you have two spaces after the dollar sign; have you tried using the whitespace class instead: r"\$\s|\#\s"?
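
The two-space observation can be checked with a minimal sketch using Python's re module. This only tests the pattern itself; sphinx-copybutton evaluates the pattern in the browser, so agreement with Python's engine is an assumption, and the sample line is hypothetical.

```python
import re

# Pattern exactly as written in the report:
# "\$", then an escaped space, then a second, literal space.
original = re.compile(r"\$\  |\#\ ")

# Suggested variant: one whitespace character after either prompt symbol.
suggested = re.compile(r"\$\s|\#\s")

line = "$ ls -la"  # hypothetical sample line with ONE space after the prompt

original_hit = original.match(line)    # the first alternative needs two spaces after "$"
suggested_hit = suggested.match(line)  # needs only one whitespace character

print("original matches:", original_hit is not None)
print("suggested matches:", suggested_hit is not None)
```

If the original pattern fails here while the suggested one succeeds, the extra space is at least one reason the first configuration could not strip a "$ " prompt.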

@wabernat
Author

wabernat commented Sep 2, 2020

Thanks for your reply!

I tried:
copybutton_prompt_text = r"\$\s|\#\s"

On the built document, the Copy button copies the entire block, but does not omit the "$ " prompt.

@wabernat
Author

wabernat commented Sep 2, 2020

FWIW, the JavaScript regex "or" pattern looks to involve parentheses: https://www.w3schools.com/jsref/jsref_regexp_xy.asp.

but I'm not having much luck permutating on that, either...

@chrisjsewell
Member

@choldgraf just remembered that I left @wabernatScality hanging a bit here lol. I'm a bit out of ideas, without perhaps going into the code and trying to add more test cases

@choldgraf
Member

@wabernatScality do you still have these issues? this is also something I don't know how to debug either, I am terrible with regexes :-/ it sounds like it might be a bit of time before we can figure out a solution, just FYI

@wabernat
Author

wabernat commented Oct 5, 2020

We have not found a solution. I've rolled out sphinx-copybutton with regexp turned off, hoping our own talent might be sufficiently bent out of shape to fix it as a challenge. In the meantime, we make do with '$ ' as our default "cropped" prompt.

@choldgraf
Member

Sounds good - if anybody can figure out a fix for why this isn't working I am happy to review and cut a release quickly 👍

@diego-sueiro

Any updates on this issue? I can also reproduce it with and without copybutton_prompt_is_regexp. I'm using Python==3.6.9 sphinx==4.0.2 sphinx_rtd_theme==0.5.2 sphinx-copybutton==0.4.0.

@stefanodavid
Contributor

I am facing a similar issue (perhaps exactly the same). I was asked to set up a quite complex regexp, which validates at regex101.com, to remove a number of prompts from the docs.

After quite a lot of debugging, my guess is that @wabernat was on the right track: ORs using ( ) are apparently not supported in this extension. I stripped down my prompt text to set up a counterexample.

  • copybutton_prompt_text = r"cluster#\s|single#\s" this works
  • copybutton_prompt_text = r"(cluster|single)#\s" this does not work, regardless of any escape used on the parentheses (e.g., \(, \\(, \\\( etc.).

Both regexps validate at regex101.com and match against the same set of examples.

  • The issue arose because we wanted to match not only # as the prompt-ending character but also $, so I used (cluster |single)(#|\$)\s as the regexp.

To be honest, I have no idea how to fix this issue, since my regexp knowledge is for sure lower than @choldgraf's 😅
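
One possible explanation, not confirmed anywhere in this thread, is that the extension wraps the configured pattern in its own capture groups and reads the rest of the line from a fixed group index. The sketch below models that assumption in Python; strip_prompt is a hypothetical stand-in, not the extension's actual code, and the sample line is invented.

```python
import re

def strip_prompt(line, prompt_pattern):
    """Hypothetical model: prompt in group 1, rest of the line read from group 2."""
    match = re.match("^(" + prompt_pattern + ")(.*)", line)
    return match.group(2) if match else line

line = "cluster# show status"  # hypothetical sample line

# Flat alternation: no extra groups, so group 2 is the rest of the line.
flat = strip_prompt(line, r"cluster#\s|single#\s")

# Capturing group inside the prompt: it becomes group 2 and shifts the rest to group 3.
grouped = strip_prompt(line, r"(cluster|single)#\s")

# Non-capturing groups (?:...) add no group numbers, so group 2 is the rest again.
non_capturing = strip_prompt(line, r"(?:cluster|single)(?:#|\$)\s")

print(repr(flat))
print(repr(grouped))
print(repr(non_capturing))
```

If this model is close to what the extension does, it would explain why escaping the parentheses never helped, and rewriting the groups as (?:...) would be worth trying before enumerating every prompt.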

@Samy-Oubouaziz

Samy-Oubouaziz commented Feb 21, 2023

Hello all, I am currently working on the same project as @wabernat was 3 years ago for the same company, and ran into the same problems, which led me to this issue!

We use a variety of prompts in the following formats:

  • #
  • $
  • [foo@bar]#
  • {{fooBar}}#

(each one has a trailing space)

I managed to get the regex to work by adding the following lines in our conf.py file:

copybutton_prompt_text = '\# |\$ |\[.*\@.*\]\# |\{\{.*\}\}\# '

copybutton_prompt_is_regexp = True

copybutton_remove_prompts = True
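
As a quick sanity check, the pattern above can be run against the four prompt formats with Python's re module. This is only a sketch: the sample commands are hypothetical, and it assumes the browser-side regex engine used by the extension treats this pattern the same way Python does.

```python
import re

# Same pattern as in the conf.py lines above, written as a raw string.
prompt = re.compile(r"\# |\$ |\[.*\@.*\]\# |\{\{.*\}\}\# ")

# Hypothetical sample lines, one per prompt format listed above.
samples = ["# reboot", "$ ls", "[foo@bar]# uptime", "{{fooBar}}# df -h"]

def strip(line):
    """Drop a leading prompt if the pattern matches at the start of the line."""
    match = prompt.match(line)
    return line[match.end():] if match else line

stripped = [strip(sample) for sample in samples]
for before, after in zip(samples, stripped):
    print(repr(before), "->", repr(after))
```

Note that .* is greedy, so a line that contains "]# " or "}}# " a second time further along would lose everything up to the last occurrence. A narrower class such as [^\]]* may be safer if that can happen in your docs.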

The documentation is a bit unclear and reads as though we have to enclose the regular expression in a raw string for it to work, but that does not seem to be the case.

Can anyone confirm or deny?

Also, the documentation should display basic examples of regex to copy-paste as is to the conf.py of a project, at least for testing purposes.

Example:

The following configuration will treat the string "# " or "$ " as prompts which will not be copied.

copybutton_prompt_is_regexp = True

copybutton_remove_prompts = True

copybutton_prompt_text = '\# |\$ '

I can open a documentation PR if you agree with this proposal.

Thanks!

Edit: behavior seems to be the same with or without the r at the beginning of the regex prompt text.
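
A likely reason the r prefix makes no difference here: "\#" and "\$" are not escape sequences that Python recognises, so the backslash is kept in a normal string too. The sketch below illustrates this. eval is used only so the non-raw literal can be compiled with warnings silenced, because Python warns about unrecognised escapes and its documentation says they will eventually become a syntax error.

```python
import warnings

# Compile the non-raw literal '\# |\$ ' with the invalid-escape warning silenced.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    plain = eval(r"'\# |\$ '")

raw = r"\# |\$ "

# "\#" and "\$" are not Python escapes, so both spellings give the same string.
same_for_prompts = plain == raw

# "\b" IS a Python escape (backspace), so here the r prefix changes the string.
differs_for_backslash_b = "\bfoo" != r"\bfoo"

print(same_for_prompts, differs_for_backslash_b)
```

So the raw string is not required for these particular prompts, but it remains the safer habit: a pattern containing \b, \n or \t would silently change meaning without it.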

@Samy-Oubouaziz

Samy-Oubouaziz commented Feb 21, 2023

@stefanodavid You may have solved your issue already, but maybe try the following:

copybutton_prompt_text = r"cluster\# |single\# |cluster\$ |single\$ "

I do not know the regex best practices, but if it is either "cluster" or "single" + "# " or "$ " (which is only 4 possible prompts), maybe it is not worth factorizing everything!

@stefanodavid
Contributor

@Samy-Oubouaziz in the end I did use an approach similar to the one you propose: enumerating all possible prompts. However, as I mentioned, I had quite a number of different prompts (around 12-14 IIRC, I have left that project now so can't check), each with # and $, so it became a really complex enumeration.
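
When the enumeration grows this large, it can be generated in conf.py rather than typed by hand. The following is a minimal sketch: the names and endings lists are hypothetical placeholders, and it assumes the escapes produced by re.escape are also accepted by the browser's regex engine.

```python
import itertools
import re

names = ["cluster", "single"]  # hypothetical prompt names; extend as needed
endings = ["#", "$"]           # prompt-ending characters

# Build every "name + ending + space" prompt and escape it for use in a regex.
prompts = [name + ending + " " for name, ending in itertools.product(names, endings)]

# Join with "|" only, so the resulting pattern contains no parentheses at all.
copybutton_prompt_text = "|".join(re.escape(prompt) for prompt in prompts)
copybutton_prompt_is_regexp = True
copybutton_remove_prompts = True

print(copybutton_prompt_text)
```

This keeps the flat alternation that was reported to work, while the list of prompts stays readable and easy to extend.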

My comment however was more to point out that this extension currently does not support complex regexp with ( ), to help spare time in case someone encounters the same problem.

@Samy-Oubouaziz

Thank you @stefanodavid! This might be a good time to close this issue @choldgraf, and maybe open a doc PR with better regex explanations for users who are not familiar with them, including:

  • basic regex examples
  • a list of characters that need to be escaped maybe?
  • an admonition regarding the limitations of regex in this context

Thanks again!

6 participants