Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Asciidoctor: Copy referenced images #541

Merged
merged 5 commits into from
Jan 30, 2019
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 2 additions & 6 deletions lib/ES/Util.pm
Original file line number Diff line number Diff line change
Expand Up @@ -79,8 +79,6 @@ sub build_chunked {
$output = run(
'asciidoctor', '-v', '--trace',
'-r' => dir('resources/asciidoctor/lib/extensions.rb')->absolute,
# TODO figure out resource
# ( map { ( '--resource' => $_ ) } @$resources ),
'-b' => 'docbook45',
'-d' => 'book',
'-a' => 'showcomments=1',
Expand All @@ -94,6 +92,7 @@ sub build_chunked {
# missing attributes!
# '-a' => 'attribute-missing=warn',
'-a' => 'asciidoc-dir=' . $asciidoc_dir,
$resources ? ( '-a' => 'resources=' . join(',', @$resources)) : (),
'--destination-dir=' . $dest,
docinfo($index),
$index
Expand All @@ -108,7 +107,6 @@ sub build_chunked {
file('resources/website_chunked.xsl')->absolute,
"$dest/index.xml"
);
# TODO copy_resources?
# TODO clean the xml files
1;
} or do { $output = $@; $died = 1; };
Expand Down Expand Up @@ -205,15 +203,14 @@ sub build_single {
$output = run(
'asciidoctor', '-v', '--trace',
'-r' => dir('resources/asciidoctor/lib/extensions.rb')->absolute,
# TODO figure out resource
# ( map { ( '--resource' => $_ ) } @$resources ),
'-b' => 'docbook45',
'-d' => $type,
'-a' => 'showcomments=1',
'-a' => "lang=$lang",
'-a' => 'root_dir=' . $root_dir,
$private ? () : ( '-a' => "edit_url=$edit_url" ),
'-a' => 'asciidoc-dir=' . $asciidoc_dir,
$resources ? ( '-a' => 'resources=' . join(',', @$resources)) : (),
# Disable warning on missing attributes because we have
# missing attributes!
# '-a' => 'attribute-missing=warn',
Expand All @@ -231,7 +228,6 @@ sub build_single {
file('resources/website.xsl')->absolute,
"$dest/index.xml"
);
# TODO copy_resources?
# TODO clean the xml file
1;
} or do { $output = $@; $died = 1; };
Expand Down
69 changes: 69 additions & 0 deletions resources/asciidoctor/lib/copy_images/extension.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
require 'csv'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to parse a comma separated list and the simplest way looks to be to use the built in CSV parsing....

require 'fileutils'
require 'set'
require_relative '../scaffold.rb'

include Asciidoctor

##
# Copies images that are referenced into the same directory as the output files.
#
class CopyImages < TreeProcessorScaffold
include Logging

def initialize name
super
@copied = Set[]
end

def process_block block
return unless block.context == :image
uri = block.image_uri(block.attr 'target')
return if Helpers.uriish? uri # Skip external images
return unless @copied.add? uri # Skip images we've copied before
source = find_source block, uri
return unless source # Skip images we can't find
logger.info message_with_context "copying #{uri}", :source_location => block.source_location
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if info is right here but it doesn't seem too bad. We don't have too many images.

copy_image_proc = block.document.attr 'copy_image'
if copy_image_proc
# Delegate to a proc for copying if one is defined. Used for testing.
copy_image_proc.call(uri, source)
else
destination = ::File.join block.document.options[:to_dir], uri
destination_dir = ::File.dirname destination
FileUtils.mkdir_p destination_dir
FileUtils.cp source, destination
end
end

##
# Does a breadth first search starting at the base_dir of the document and
# any referenced resources. This isn't super efficient but it is how a2x works
# and we strive for compatibility.
#
def find_source block, uri
to_check = [block.document.base_dir]
checked = []

resources = block.document.attr 'resources'
if resources
to_check += CSV.parse_line(resources)
end

while (dir = to_check.shift)
checked << block.normalize_system_path(uri, dir)
return checked.last if File.readable? checked.last
next unless Dir.exist?(dir)
Dir.new(dir).each { |f|
next if f == '.' || f == '..'
f = File.join(dir, f)
to_check << f if File.directory?(f)
}
end

# We'll skip images we can't find but we should log something about it so
# we can fix them.
logger.warn message_with_context "can't read image at any of #{checked}", :source_location => block.source_location
nil
end
end
2 changes: 2 additions & 0 deletions resources/asciidoctor/lib/extensions.rb
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
require_relative 'added/extension'
require_relative 'copy_images/extension'
require_relative 'cramped_include/extension'
require_relative 'edit_me/extension'
require_relative 'elastic_compat_tree_processor/extension'
Expand All @@ -11,6 +12,7 @@
document.sourcemap = true
preprocessor CrampedInclude
preprocessor ElasticCompatPreprocessor
treeprocessor CopyImages
treeprocessor EditMe
treeprocessor ElasticCompatTreeProcessor
include_processor ElasticIncludeTagged
Expand Down
225 changes: 225 additions & 0 deletions resources/asciidoctor/spec/copy_images_spec.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,225 @@
require 'copy_images/extension'
require 'fileutils'
require 'tmpdir'

RSpec.describe CopyImages do
before(:each) do
Extensions.register do
tree_processor CopyImages
end
end

after(:each) do
Extensions.unregister_all
end

private
def copy_attributes copied
{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the funky formatting meant to communicate something here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're returning a map. Would it be clearer if I said return {? I don't ruby too well so I'm not sure.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would avoid confusion with things like block formatting. It makes the reader have to think about whether there are any | | present or something.

'copy_image' => Proc.new { |uri, source|
copied << [uri, source]
}
}
end

spec_dir = File.dirname(__FILE__)

it "copies a file when directly referenced" do
copied = []
attributes = copy_attributes copied
input = <<~ASCIIDOC
== Example
image::resources/copy_images/example1.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this testing idiom unintuitive. By description this appears to be a positive test, one which should do a thing (and behave doing so) and then verify the thing has been done. But this is looking for an error, and then that error is actually an INFO... it's got a lot to tease apart just to line up what it says it does with how it checks doing it.

I don't know yet if I am requesting a change or not, but even if I figure there's probably a really good explanation for it, I'd still like to hear it because it definitely made me go "...huh?" reviewing it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It exists because the tests throw a ConvertError if we log anything during the conversion. That way when we're testing that we don't log anything we don't have to do anything - we get that behavior by not catching the exception. It made more sense when I was catching warnings to be honest.

I think I can make this more clear. I'll give it a shot on Monday.

expect(error.message).to match(/INFO: <stdin>: line 2: copying resources\/copy_images\/example1.png/)
}
expect(copied).to eq([
["resources/copy_images/example1.png", "#{spec_dir}/resources/copy_images/example1.png"]
])
end

it "copies a file when it can be found in a sub tree" do
copied = []
attributes = copy_attributes copied
input = <<~ASCIIDOC
== Example
image::example1.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/INFO: <stdin>: line 2: copying example1.png/)
}
expect(copied).to eq([
["example1.png", "#{spec_dir}/resources/copy_images/example1.png"]
])
end

it "copies a path when it can be found in a sub tree" do
copied = []
attributes = copy_attributes copied
input = <<~ASCIIDOC
== Example
image::copy_images/example1.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/INFO: <stdin>: line 2: copying copy_images\/example1.png/)
}
expect(copied).to eq([
["copy_images/example1.png", "#{spec_dir}/resources/copy_images/example1.png"]
])
end

it "warns when it can't find a file" do
copied = []
attributes = copy_attributes copied
input = <<~ASCIIDOC
== Example
image::not_found.jpg[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/
WARN:\ <stdin>:\ line\ 2:\ can't\ read\ image\ at\ any\ of\ \[
"#{spec_dir}\/not_found.jpg",\s
"#{spec_dir}\/resources\/not_found.jpg",\s
.+
"#{spec_dir}\/resources\/copy_images\/not_found.jpg"
.+
\]/x)
expect(error.message).not_to match(/INFO: <stdin>/)
}
expect(copied).to eq([])
end

it "only attempts to copy each file once" do
copied = []
attributes = copy_attributes copied
input = <<~ASCIIDOC
== Example
image::resources/copy_images/example1.png[]
image::resources/copy_images/example1.png[]
image::resources/copy_images/example2.png[]
image::resources/copy_images/example1.png[]
image::resources/copy_images/example2.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/INFO: <stdin>: line 2: copying resources\/copy_images\/example1.png/)
expect(error.message).to match(/INFO: <stdin>: line 4: copying resources\/copy_images\/example2.png/)
}
expect(copied).to eq([
["resources/copy_images/example1.png", "#{spec_dir}/resources/copy_images/example1.png"],
["resources/copy_images/example2.png", "#{spec_dir}/resources/copy_images/example2.png"],
])
end

it "skips external images" do
copied = []
attributes = copy_attributes copied
input = <<~ASCIIDOC
== Example
image::https://f.cloud.github.com/assets/4320215/768165/19d8b1aa-e899-11e2-91bc-6b0553e8d722.png[]
ASCIIDOC
convert input, attributes
expect(copied).to eq([])
end

it "can find files using a single valued resources attribute" do
Dir.mktmpdir {|tmp|
FileUtils.cp(
::File.join(spec_dir, 'resources', 'copy_images', 'example1.png'),
::File.join(tmp, 'tmp_example1.png')
)

copied = []
attributes = copy_attributes copied
attributes['resources'] = tmp
input = <<~ASCIIDOC
== Example
image::tmp_example1.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/INFO: <stdin>: line 2: copying tmp_example1.png/)
# NOCOMMIT full paths in logs too, I think
}
expect(copied).to eq([
["tmp_example1.png", "#{tmp}/tmp_example1.png"]
])
}
end

it "can find files using a multi valued resources attribute" do
Dir.mktmpdir {|tmp|
FileUtils.cp(
::File.join(spec_dir, 'resources', 'copy_images', 'example1.png'),
::File.join(tmp, 'tmp_example1.png')
)

copied = []
attributes = copy_attributes copied
attributes['resources'] = "dummy1,#{tmp},/dummy2"
input = <<~ASCIIDOC
== Example
image::tmp_example1.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/INFO: <stdin>: line 2: copying tmp_example1.png/)
}
expect(copied).to eq([
["tmp_example1.png", "#{tmp}/tmp_example1.png"]
])
}
end

it "has a nice error message when it can't find a file with single valued resources attribute" do
Dir.mktmpdir {|tmp|
copied = []
attributes = copy_attributes copied
attributes['resources'] = tmp
input = <<~ASCIIDOC
== Example
image::not_found.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/
WARN:\ <stdin>:\ line\ 2:\ can't\ read\ image\ at\ any\ of\ \[
"#{spec_dir}\/not_found.png",\s
"#{tmp}\/not_found.png"
.+
\]/x)
expect(error.message).not_to match(/INFO: <stdin>/)
}
expect(copied).to eq([])
}
end

it "has a nice error message when it can't find a file with multi valued resources attribute" do
Dir.mktmpdir {|tmp|
copied = []
attributes = copy_attributes copied
attributes['resources'] = "#{tmp},/dummy2"
input = <<~ASCIIDOC
== Example
image::not_found.png[]
ASCIIDOC
expect { convert input, attributes }.to raise_error { |error|
expect(error).to be_a(ConvertError)
expect(error.message).to match(/
WARN:\ <stdin>:\ line\ 2:\ can't\ read\ image\ at\ any\ of\ \[
"#{spec_dir}\/not_found.png",\s
"#{tmp}\/not_found.png",\s
"\/dummy2\/not_found.png"
.+
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We list out all of the directories that we try and I don't think that list is worth maintaining here. I do think it is helpful to see it when it fails. Though I don't like how long the list gets.....

\]/x)
expect(error.message).not_to match(/INFO: <stdin>/)
}
expect(copied).to eq([])
}
end
end
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.