Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new method to $dom, $dom->each_text. #2067

Closed
wants to merge 1 commit into from

Conversation

tcheukueppo
Copy link

@tcheukueppo tcheukueppo commented May 11, 2023

Hello,

Summary

  1. New $dom method $dom->each_text: which returns text content of every descendant node in a list.
  2. Tests on $dom->each_text.

Motivation

I'd like to have this feature to be table to perform context sensitive parsing on the text content of a given tag(with all it descendant in the most frequent cases) without interference with contents of other tags(which is something that $dom->all_text does not provide), I'm totally aware of $dom->text but that do require me to walk through the $dom data structure and use css selectors which is quite tedious.

@kraih
Copy link
Member

kraih commented May 11, 2023

Please don't combine multiple independent changes into a single PR. It lowers the chances of it being accepted significantly. Keep it to one commit per PR.

@tcheukueppo
Copy link
Author

Sorry, I think it is the perltidy, let me undo that.

@kraih
Copy link
Member

kraih commented May 11, 2023

I specifically meant that adding each_text and changing the parser regex are unrelated changes.

@tcheukueppo tcheukueppo changed the title Update HTML regex and add new method to $dom, $dom->each_text. Add new method to $dom, $dom->each_text. May 11, 2023
@tcheukueppo
Copy link
Author

After performing some tests I think modifying parser regex isn't worth it and thus no PR for that.

@tcheukueppo tcheukueppo reopened this May 11, 2023
@tcheukueppo
Copy link
Author

done

@@ -468,6 +486,16 @@ excluded.
# "foo\nbarbaz\n"
$dom->parse("<div>foo\n<p>bar</p>baz\n</div>")->at('div')->all_text;

=head2 each_text

my @text = $dom->each_text;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mojo::Headers's to_hash() can take "1" as an argument, so maybe we could add that to all_text() instead of adding a new method? Also, how about returning a Mojo::Collection?

@kraih
Copy link
Member

kraih commented May 20, 2023

I don't really see the use of this feature i'm afraid. Maybe some more specific examples for use cases would help.

@tcheukueppo
Copy link
Author

Really sorry for bothering you guys, I was working on a project and I came across this, I thought that do be useful but turns out that my approach was not the right approach, it is obviously not useful to me now and probably be never useful to anyone out there, I found another technique that did not even required me to monkey_patch anything, thanks for your attention,

@tcheukueppo
Copy link
Author

Closing this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants