Don't choke on invalid UTF-8 in `file` output #1298

tdsmith · 2016-10-16T04:22:43Z

Sometimes `file` output contains data from the file under examination, which may include binary data that does not represent valid UTF-8 codepoints. String#split dies if it doesn't understand the encoding, so tell Ruby to treat `file` output as a bytestring.

tdsmith · 2016-10-16T04:23:16Z

cc @jawshooah

tdsmith · 2016-10-16T04:42:02Z

A failing line looks like:
"/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/__pycache__/reprlib.cpython-35.opt-2.pyc\u0000: dBase III DBT, version number 0, next free block index 168627478, 1st item \"\x83\u0001\"\n"
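To make the failure mode concrete, here is a minimal sketch of the idea in the description. It is not the code from this PR: the sample `line` is a shortened, hypothetical stand-in for the failing line above, and `String#b` is just one way to get a bytestring copy. Whether `split` actually raises can depend on the Ruby version, so the sketch records the outcome instead of assuming it.

```ruby
# Hypothetical sample modelled on the failing line above (path shortened).
# The "\x83" byte is not valid UTF-8 on its own.
line = "reprlib.cpython-35.opt-2.pyc\0: dBase III DBT, 1st item \"\x83\x01\"\n"

# The literal is tagged UTF-8 but its bytes are not valid UTF-8.
line.valid_encoding? # false for this sample

# Record what split does with the invalid string rather than assuming it.
split_error = begin
  line.split("\0")
  nil
rescue ArgumentError => e
  e
end

# String#b returns a copy tagged ASCII-8BIT (binary), so every byte
# sequence is acceptable and split has nothing to reject.
bytes = line.b
path, info = bytes.split("\0", 2)
```

After this, `path` holds the file name and `info` holds the rest of the description, including the raw binary bytes.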

jawshooah · 2016-10-16T06:12:41Z

Library/Homebrew/keg_relocate.rb

-        path, info = line.split("\0")
-        next unless info.to_s.include?("text")
+        path, info = line.split("\0", 2)
+        next unless info.include?("text")


The to_s needs to stay for the reasons given in #1273 (comment).

oh, gross, okay! this is surprising enough i think being more explicit about it is good

Comments addressed.
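A small sketch of why `info` can be `nil` and what the limit argument changes. The helper name `text_file?` and the sample lines are hypothetical, and the specific reasoning from #1273 is not reproduced here; this only shows the general Ruby behaviour.

```ruby
# Hypothetical helper mirroring the loop body in the diff above.
def text_file?(line)
  # Limit 2 keeps everything after the first NUL together in info.
  _path, info = line.split("\0", 2)
  # info is nil when the line contains no NUL separator;
  # nil.to_s is "", so include? is safe to call and returns false.
  info.to_s.include?("text")
end

with_separator    = "notes.txt\0: ASCII text\n"
without_separator = "a line with no separator\n"
extra_nul         = "blob\0: data \0 text\n"

text_file?(with_separator)    # true
text_file?(without_separator) # false, and no NoMethodError on nil

# Without the limit, the second NUL splits the description again,
# so the second element no longer contains "text".
unlimited_info = extra_nul.split("\0")[1] # ": data "
```

Dropping `to_s` would make the second call raise `NoMethodError`, because `nil` does not respond to `include?`.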

tdsmith · 2016-10-16T16:58:50Z

Thanks for the reviews!

tdsmith added the "in progress" label Oct 16, 2016

tdsmith mentioned this pull request Oct 16, 2016

bump python3 devel to python 3.6.0b2 Homebrew/homebrew-core#5954

Merged

jawshooah previously requested changes Oct 16, 2016

Explain why info could be nil

22a64aa

MikeMcQuaid approved these changes Oct 16, 2016

jawshooah approved these changes Oct 16, 2016

tdsmith merged commit de880f1 into Homebrew:master Oct 16, 2016

tdsmith deleted the no-no-bad-unicode branch October 16, 2016 16:47

tdsmith removed the "in progress" label Oct 16, 2016

Homebrew locked and limited conversation to collaborators May 3, 2018
