Merge pull request Homebrew#1298 from tdsmith/no-no-bad-unicode
Don't choke on invalid UTF-8 in `file` output
tdsmith authored Oct 16, 2016
2 parents 87fad25 + 22a64aa commit de880f1
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions Library/Homebrew/keg_relocate.rb
@@ -84,9 +84,16 @@ def text_files
   }
   output, _status = Open3.capture2("/usr/bin/xargs -0 /usr/bin/file --no-dereference --print0",
                                    stdin_data: files.to_a.join("\0"))
+  # `file` output sometimes contains data from the file, which may include
+  # invalid UTF-8 entities, so tell Ruby this is just a bytestring
+  output.force_encoding(Encoding::ASCII_8BIT)
   output.each_line do |line|
-    path, info = line.split("\0")
-    next unless info.to_s.include?("text")
+    path, info = line.split("\0", 2)
+    # `file` sometimes prints more than one line of output per file;
+    # subsequent lines do not contain a null-byte separator, so `info`
+    # will be `nil` for those lines
+    next unless info
+    next unless info.include?("text")
     path = Pathname.new(path)
     next unless files.include?(path)
     text_files << path
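
The two behaviours this change relies on can be tried in isolation. The following is a minimal sketch, not the Homebrew implementation: the `text_paths` helper and the sample `raw` string are hypothetical stand-ins for real `file --print0` output, and the `Pathname` and `files.include?` checks from the diff are left out to keep the example self-contained.

```ruby
# Hypothetical sample imitating `file --print0` output (not captured from a real run).
# The first record carries a byte (\xFF) that is not valid UTF-8, and the
# second line is a continuation line with no NUL separator.
raw = "/tmp/a.txt\0ASCII text, with \xFF in preview\n" \
      "- continuation without separator\n" \
      "/tmp/b.bin\0data\n"

# Hypothetical helper mirroring the loop in the diff.
def text_paths(output)
  # Treat the output as raw bytes so invalid UTF-8 cannot trip string methods.
  # `dup` keeps the caller's string untouched (the diff mutates in place).
  bytes = output.dup.force_encoding(Encoding::ASCII_8BIT)
  paths = []
  bytes.each_line do |line|
    # Limit of 2: split only at the first NUL, leaving the description intact.
    path, info = line.split("\0", 2)
    next unless info                  # continuation line: no NUL, so info is nil
    next unless info.include?("text")
    paths << path
  end
  paths
end

utf8 = raw.dup.force_encoding(Encoding::UTF_8)
puts utf8.valid_encoding?                                           # false
puts utf8.dup.force_encoding(Encoding::ASCII_8BIT).valid_encoding?  # true
p text_paths(utf8)
```

`force_encoding` only relabels the string and does not transcode any bytes, which is why it is safe to apply to output that may contain arbitrary file data.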
