-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Buffer encoding #19
Merged
Merged
Buffer encoding #19
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
2a86ded
using buffer string in binary encoding, as String.new is utf-8, and y…
HoneyryderChuck 621e18e
use String#bytesize, as #length returns the length size in its encoding
HoneyryderChuck 3d97b83
use String#clear instead of String#replace, as the latter changes the…
HoneyryderChuck 9a1b911
do not use rercarpet for jruby
HoneyryderChuck 3bc3787
jruby accusing something other than ASCII-8bit sometimes (US-ASCII), …
HoneyryderChuck f08d80c
hash rocketz
HoneyryderChuck eeff06c
forcing the buffer to binary, as the buffer encoding might change
HoneyryderChuck File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -18,7 +18,7 @@ group :test do | |
end | ||
|
||
group :doc do | ||
gem "redcarpet" | ||
gem "redcarpet", :platform => :mri | ||
gem "yard" | ||
end | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though this is likely a bug in the previous version as well,
#to_s
may make a new object, at which point the purpose of passingoutbuf
is defeated.Perhaps give outbuf a default value (e.g.
outbuf = "".b
instead ofoutbuf = nil
as the argument definition), and raise if it's not aString
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#to_s will not create a new object if the object is a string. But the replace call might change its encoding. That was the reasoning behind it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hence "may make a new object". We never want to make a new object for the buffer... that defeats the point.
How about raising if the supplied buffer is not a
String
type withEncoding::BINARY
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, I see what you mean.
I'm still not sure. The encoding requirement is usually not mandatory, and the buffer passed may be a specialized buffer. Besides "fixing" jruby, I'm not even sure that forcing binary encoding there is the right thing to do.
By the way, what's the reasoning behind allowing
length = nil
? The length argument is usually required forIO#read
andFile#read
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what case would it make sense for it to be anything but a
String
? Ultimately it needs to thunk to a Ruby I/O function which also accepts a preallocated buffer, and those all need to be aString
unless JRuby has some fancy behavior I don't know about where it lets you pass a Java ByteBuffer or something.For these types of buffers, I think mandating
Encoding::BINARY
is the only right option. They should really be aByteArray
/ByteBuffer
type, but Ruby doesn't have those, so we need the next best thing, which is aString
withEncoding::BINARY
.For other encodings there are all sorts of pathological performance implications, and the contract at this point in the code is to handle arbitrary binary data that may not be whatever-encoding-was-passed-in.
Given this is a rarely used power-user feature, I think it should have some pretty exacting semantics, which are:
#to_s
or anything else that might potentially allocate memoryString
is probably the only argument type that makes senseEncoding::BINARY
(i.e.Encoding::ASCII_8BIT
) is probably the only sensible encoding for the bufferString
. If you think other encodings should be supported, do you really think the answer to the question "Willoutbuf
always be a valid string under encoding X?" is yes?No idea, like the
outbuf
beingnil
I think that probably doesn't make sense.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this change, I'll try to answer the questions below.