Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor change code, add TextRange and TextSize types #638

Closed
wants to merge 14 commits into from

Conversation

oberblastmeister
Copy link
Contributor

I have put the new Change in a separate file and haven't updated everything yet.

@kirawi
Copy link
Member

kirawi commented Aug 22, 2021

I'm confused, was there prior discussion about this on Matrix? I don't quite understand the reason for this change 😅

@cessen
Copy link
Contributor

cessen commented Aug 22, 2021

I really appreciate your enthusiasm to contribute to Helix. However, it's not at all clear to me what the benefit of this refactor is, nor the new types you're adding. It feels like abstraction for abstraction's sake, and to my eye at least actually makes things less straight-forward/obvious.

Maybe I missed where this was discussed, but I'd really recommend opening an issue to discuss sweeping changes like this and get some consensus before actually coding them up. That way you don't spend a lot of effort that just gets rejected. (Again, maybe I just missed where that discussion happened, in which case nevermind.)

@oberblastmeister
Copy link
Contributor Author

I did talk on matrix about something similar but that was just for me to understand things. There was no prior discussion matrix about this refactor, The reason for this change is that this greatly simplifies the code. A single Change can now be taken in isolation. There are no unnecessary conversions between changes and Operation and then back. Changing to Operation doesn't even help because it makes processing even harder as it cannot be taken in isolation which means processing them must repeat the same mutable code. This code also enforces stronger invariants. It makes sure that every Change is disjoint and that they are sorted. Not doing this can cause issues. In addition, this pr adds TextSize and TextRange which are more ergonomic types. They inforce more invariants than representing them as tuples and have multiple useful functions that I have seen repeated in code.

@oberblastmeister
Copy link
Contributor Author

Thanks for appreciating my enthusiasm! I should definitely post an issue next time, sorry. I think I definitely am not overabstracting though. I removed the Operation abstraction, which I thought was unnecessary for the reasons above.TextSize and TextRange are nice abstractions, especially when we are in an editor that uses these sort of things all the time. The stuff in the one module I am not sure about yet. I think they are nice because they allow types to be 'correct by construction'.

@oberblastmeister oberblastmeister changed the title refactor change code, add TextRange and TextSize primitives refactor change code, add TextRange and TextSize types Aug 22, 2021
@oberblastmeister
Copy link
Contributor Author

@cessen Thanks for your suggestions. I have simplified the code a lot, it is now really nice. I think I did overabstract the Change struct, so now it is much better. Please give this another chance!

Copy link
Member

@kirawi kirawi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still not overtly clear to me what benefits this has over the existing code, but I'm not really familiar with this area of the codebase either. (I would also suggest marking this PR as a draft since I noticed a lot of commented code; there should be an option on the right-hand sidebar of this page)

pub use smallvec::SmallVec;
pub use syntax::Syntax;

pub use diagnostic::Diagnostic;
pub use state::State;

pub use line_ending::{LineEnding, DEFAULT_LINE_ENDING};
pub use transaction::{Assoc, Change, ChangeSet, Operation, Transaction};
pub use transaction::{Assoc, Change, ChangeSet, Operation, Transaction};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing newline here.

helix-core/src/text_size/offset.rs Outdated Show resolved Hide resolved
///
/// It is a logic error for `start` to be greater than `end`.
#[derive(Default, Copy, Clone, Eq, PartialEq, Hash)]
pub struct TextRange {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this basically just SelectionRange?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelectionRange can be flipped

@@ -0,0 +1,25 @@
use super::size::TextSize;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also confused here, I feel like this is also already a covered case w/ RopeGraphemes (I might be wrong though).

helix-core/src/transaction.rs Outdated Show resolved Hide resolved
fn text_len(&self) -> TextSize {
*self
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing newline.

fn sum<I: Iterator<Item = A>>(iter: I) -> TextSize {
iter.fold(0.into(), Add::add)
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing newline.

helix-core/src/transaction.rs Show resolved Hide resolved
@oberblastmeister
Copy link
Contributor Author

Could you explain what you still don't think has benefits over the existing code? I think I can keep TextRange and remove TextSize and TextOffset as those don't seem to be very useful. I think we can just use a type alias instead for TextSize.

@oberblastmeister oberblastmeister marked this pull request as draft August 22, 2021 17:58
@kirawi
Copy link
Member

kirawi commented Aug 22, 2021

From the cursory look I gave, it seems to me that you can achieve the same benefit for less abstraction/work by improving upon the already-existing code in transaction.rs rather than pulling in all this external code. If you are able to flatten the Change processing through this PR, I think that would be helpful. Again, I'm not really familiar with this area of the codebase (I only worked with it for helix-core/src/diff.rs).

@oberblastmeister
Copy link
Contributor Author

Like I explained above, the existing Change code is much larger with more abstractions and isn't as safe as this code. It has to convert from and to Operation which is a worse representation in my opinion. For example, look at the function invert, in the new code it is like 2 lines of code. The existing implementation takes 20 something lines. This is similar when comparing many functions. Additionaly, the new code has the option to ensure that everything is disjoint and sorts the changes. This guarantees that everything happens properly, it is even listed as a todo in the code by blaz. Thats why I created this pr, to refactor the code and make it simpler. As for the other types I added, I think we can remove TextSize and TextOffset because those don't make much sense. I like TextRange however because it makes stuff nicer to work with. It also ensures that you cannot subtract with overflow which is currently not prevented. However it is not absolutely necessary and I can remove it if people really want me to.

@archseer
Copy link
Member

I agree with @cessen's sentiment: I appreciate the effort put into this, but a sweeping change that affects all open PRs needs a prior discussion (see #362 for an example). In particular, it's good to talk through why the code is currently the way it is, because there's often a reason behind it.

The reason for this change is that this greatly simplifies the code.

The diff is currently +1200 -500. It's also very noisy, most of the diff is renaming various existing structs, I'd prefer to keep those names as is.

There are no unnecessary conversions between changes and Operation and then back. Changing to Operation doesn't even help because it makes processing even harder as it cannot be taken in isolation which means processing them must repeat the same mutable code.

There are of course reasons why we have two types right now. The editing primitive is an Operation which models an operational transform and it's map and compose (map currently being unimplemented) allow for far greater flexibility in the future (async formatting after some user edits? map the formatting transaction over the edits and apply it. It's also the foundation for collaborative editing).

But Operations aren't very ergonomic, so the change/change_by_range functions expose an API that's easier to use. I did attempt to merge the two before, but it either made the OT operations more convoluted, or it made the API less easy to use. I would be open to such a change if we can come up with a proposal that's acceptable.

Additionaly, the new code has the option to ensure that everything is disjoint and sorts the changes. This guarantees that everything happens properly, it is even listed as a todo in the code by blaz.

Since the changeset builder is more low-level and we control all the methods I was thinking there's no need to do a check for ordering and this should actually be a debug_assert. Kakoune has built-in assertions in debug builds that verify a bunch of invariants before and after every command.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants