embedart: write candidate image only if it is "similar" to already embedded one (suite) #974

Kraymer · 2014-09-21T12:41:14Z

Original issue: #848
Continuation of the work started in PR #966

Taken from the docs:

The compare_threshold option defines how similar candidate art must be to the embedded art in order to be written to the file.
This works by computing the perceptual hashes (PHASH_) of the two images and checking that the difference between them does not exceed a given threshold.
The threshold is set with the compare_threshold option:

use '0' to always embed the image (disables the similarity check)

use any positive integer to define a similarity threshold. The smaller the value, the more similar the images must be. A value in the range [10,100] is recommended.
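The two cases described above can be sketched as a small decision function. This is only an illustration of the documented rule, not the code from this PR: `should_embed` and `phash_diff` are hypothetical names, and the difference value is assumed to have been computed elsewhere.

```python
def should_embed(phash_diff, compare_threshold):
    """Decide whether candidate art may be written over the embedded art.

    phash_diff: perceptual-hash difference between the two images
        (hypothetical input; computing it is out of scope here).
    compare_threshold: value of the compare_threshold option.
    """
    if compare_threshold == 0:
        # 0 disables the similarity check: always embed.
        return True
    # Otherwise embed only if the difference does not exceed the threshold.
    return phash_diff <= compare_threshold
```

With a threshold of 10, a difference of 10 would pass and a difference of 10.5 would not.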

Additional commit info:

26ec2b8: I kind of hijacked ArtResizer's original purpose, using it only to tell me whether IM (ImageMagick) is installed. In the long term we might want to do other basic operations, and renaming the class to ArtProcessor or something similar would fit better, I think.
a06c27: what's tricky here is that compare's return code is 1 when the images do not match, yet in that case the two images may still be judged similar from a human point of view. So I needed to interpret the printed diff value even when CalledProcessError is raised. Maybe this edit to command_output could be done more elegantly?
187497c: I don't know what range of values compare -metric PHASH can return ([0, ?]), so I decided to let the user set the threshold and hint at a reasonable max in the docs (100).
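To illustrate the return-code issue from a06c27, here is a minimal sketch that calls `compare` directly instead of going through command_output. It is not the code from this PR. It assumes ImageMagick's `compare` is on the PATH, that it prints the metric on stderr, and that exit code 2 (or higher) signals a real error; `parse_phash_diff` and `phash_diff` are hypothetical helper names.

```python
import subprocess


def parse_phash_diff(compare_stderr):
    """Parse the metric printed by `compare`.

    Assumes the first whitespace-separated token is the number.
    """
    return float(compare_stderr.split()[0])


def phash_diff(path_a, path_b):
    """Return the PHASH difference between two image files."""
    cmd = ['compare', '-metric', 'PHASH', path_a, path_b, 'null:']
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    _, stderr = proc.communicate()
    # Exit code 1 only means "dissimilar"; the metric is still printed,
    # so it is read for both 0 and 1. Other codes are treated as failures.
    if proc.returncode not in (0, 1):
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return parse_phash_diff(stderr.decode('utf-8', 'ignore'))
```

Reading stderr for both exit codes keeps the "is it similar enough?" decision with the user-supplied threshold rather than with `compare` itself.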

Uses the new API from the previous commit and fixes beetbox#963. There is a possible issue with backwards compatibility: Changes to the item in the 'write' event do not propagate to the tags anymore. But I'm not aware of other plugins that use the API in that way.

Kraymer · 2014-09-21T12:53:33Z

Problem with Python 2.6:

  File "/home/travis/build/sampsyo/beets/beets/util/__init__.py", line 643, in command_output
    raise subprocess.CalledProcessError(proc.returncode, cmd, stderr)
TypeError: __init__() takes exactly 3 arguments (4 given)

If someone has an idea how to work around that... maybe by raising a custom error (matching the py2.7 error signature) that we would define at the top of the file?
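One possible shape for such a custom error, as a sketch only: `CommandError` is a hypothetical name, not something defined in beets. It passes the base class only the two arguments that the traceback above shows Python 2.6 accepting, and stores the output itself.

```python
import subprocess


class CommandError(subprocess.CalledProcessError):
    """CalledProcessError variant that always accepts an `output` argument.

    Hypothetical helper for illustration only.
    """

    def __init__(self, returncode, cmd, output=None):
        # Pass only (returncode, cmd), which every version accepts.
        subprocess.CalledProcessError.__init__(self, returncode, cmd)
        # Store the captured output ourselves.
        self.output = output
```

Callers that already catch `subprocess.CalledProcessError` would keep working, since the new class is a subclass.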

sampsyo · 2014-09-21T19:03:16Z

Hmm... it looks like you're using this to capture the stderr of the command when the invocation fails. You probably already thought about this, but is this really necessary? Could you invoke the comparison command so that, when it fails, you know the images are too dissimilar?

Alternatively, you could consider just not using the command_output utility here; other uses of the utility don't care about the stderr so the use case is different enough to justify reimplementation.

Kraymer · 2014-09-21T21:28:11Z

> Could you invoke the comparison command so that, when it fails, you know the images are too dissimilar?

I thought not, but after checking compare --help it seems setting -dissimilarity-threshold when calling could help.

Edit (after testing): actually, no. Changing -dissimilarity-threshold has no effect on the return code.

Kraymer · 2014-09-24T18:58:03Z

@sampsyo if it's OK with u, I'd like to merge that PR. Got another one comin up =]

sampsyo · 2014-09-25T22:34:23Z

LGTM! All merged up. See minor changes above (mainly docs, some style).

Kraymer · 2014-09-26T05:27:04Z

Added a note concerning the doc.
And yeah, thanks for getting rid of all the passive forms I use when I write.

sampsyo and others added 30 commits September 15, 2014 10:25

replaygain: Check for bad mp3gain output (beetbox#961)

84c0f90

convert: Fix beetbox#962, extensions in auto mode

08b9b90

dbcore: parse_sorted_query (beetbox#953)

0f37737

Factor out parse_query from get_query_sort

9c93c06

Introduce parse_query_string for the common case

80116cc

Most clients other than Library._fetch know what type they have!

The demise of get_query_sort (beetbox#953)

eb89d3a

The type tests now live where they ought to live.

Remove SortedQuery (beetbox#953)

e2b3faf

This turned out to be less useful than I was hoping.

Move SmartArtistSort to library (beetbox#953)

f9c6dd6

Models can now have a dict of special sort classes.

Add slow sort to SmartArtistSort

f5e1846

Hide task specific code from importer stage

e579db6

Test importing unmatched tracks

9a382eb

Fix mocking in Spotify tests

0bdd0c7

The mock wasn't being triggered; these tests were going to the network. Now we don't match on the query string and instead test that it was correct by actually parsing it.

Default sort configuration is global (beetbox#953)

2795a01

It would be nice to cache the Sort object so we didn't have to re-parse this every time...

Rename smartartist to artist/albumartist (beetbox#953)

5f2ca0b

This is basically always what you want, so now you can just use the name of the field without "smart".

More on sorting in the changelog

c3f9b08

Extend the documentation of the ImportTask

89d7c5d

Add NoneQuery

ad71af2

This makes fast SQL queries for singletons possible

Decode bytestring paths to unicode when logging

a38a6b2

Fixes beetbox#964.

Media file tags can be customized with the write event

0bf7c06

Merge pull request beetbox#965 from geigerzaehler/write-hook-mutate

66f952b

Zero plugin can modify tags without changing the item

Changelog cleanup

893f4c5

Date for 1.3.8 release

c16c90b

Version bump: 1.3.9

9aa05bd

embed_item function does not raise if image file not found

1e45ba5

Fixes beetbox#968

Fix oldstyle object initializer for py26

d0ac66b

convert: Test skip existing files

56aba87

Reproduces beetbox#970

convert: Check the correct path when determening whether to skip

dc3c488

Fixes regression from 3197795 and makes tests from 56aba87 pass. Fixes beetbox#970

Remove unnecessary method on ImportTask

79d1203

Update type of last_played to library.DateType().

606d47a

sampsyo and others added 5 commits September 18, 2014 15:21

Merge pull request beetbox#971 from zacharydenton/patch-1

bec6d0f

Update type of last_played to library.DateType().

Changelog for beetbox#971

6ac568c

release.py: Fix bumping setup.py version

dcc63e9

release.py: Prettify foo-cmd in ReST changelog

4f3a52a

Merge branch 'fetchart_issue848'

c1224ca

Conflicts: test/test_embedart.py

Kraymer mentioned this pull request Sep 21, 2014

embedart: write candidate image only if it is "similar" to already embedded one #966

Closed

fix flake8

e4180e4

Kraymer added 2 commits September 22, 2014 13:45

artresizer: parse output of identify command to check IM version

a0c38a0

we used to parse `convert` output but `convert` happens to be a Windows cli command too. using `identify` is less error prone.
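As a rough sketch of the idea in this commit note (not the actual artresizer code): the output of `identify -version` can be matched against an ImageMagick version pattern, so that output from an unrelated program, such as Windows' own `convert`, is rejected. `parse_im_version` is a hypothetical name and the sample strings are illustrative.

```python
import re

# Matches e.g. "Version: ImageMagick 6.8.9-9 Q16 x86_64".
IM_VERSION_RE = re.compile(r'ImageMagick (\d+)\.(\d+)\.(\d+)')


def parse_im_version(output):
    """Return the ImageMagick version as a tuple of ints, or None.

    `output` is the text printed by the version command.
    """
    match = IM_VERSION_RE.search(output)
    if match is None:
        # Not ImageMagick (or not installed): no usable version found.
        return None
    return tuple(int(part) for part in match.groups())
```

Returning None for unrecognized output lets the caller treat "wrong program" and "not installed" the same way.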

restore command_output() implementation as of 0ec285f

d2cf41f

sampsyo merged commit d2cf41f into beetbox:master Sep 25, 2014

sampsyo added a commit that referenced this pull request Sep 25, 2014

Merge pull request #974 from KraYmer/fetchart_issue848_2

4f2d7e0

embedart: write candidate image only if it is "similar" to already embedded one (suite)

sampsyo added a commit that referenced this pull request Sep 25, 2014

Minor fixes, changelog for #974

d17c148

Kraymer mentioned this pull request Oct 31, 2014

fetchart: "upgrade" existing embedded art #848

Closed

Kraymer deleted the fetchart_issue848_2 branch February 13, 2016 12:23
