Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use https and http appropriately when connecting to MusicBrainz #450

Merged
merged 2 commits into from
Jan 29, 2020

Conversation

ABCbum
Copy link
Contributor

@ABCbum ABCbum commented Jan 15, 2020

Submitted as a part of GCI competition

Fixed some bugs:

  • MusicBrainz submit URL always has https as protocol: hardcoded, even when
    inappropriate. It's just a graphical issue.
  • Whipper appears to always communicate with MusicBrainz using musicbrainzngs
    over http. The musicbrainzngs.set_hostname(server).
  • musicbrainzngs.set_hostname(server) always defaults to http. Since musicbrainzngs
    version 0.7 the method set_hostname takes an optional argument named use_https
    (defaults to False) which whipper never passes.

Changed behaviour of server option (musicbrainz section of whipper's configuration file).
Now it expects an URL with a valid scheme (scheme must be http or http, empty scheme isn't allowed anymore).
Only the scheme and netloc parts of the URL are taken into account.

Fixes #437.

@JoeLametta
Copy link
Collaborator

I think that the best way to solve this is to add support for a boolean option named https to section musicbrainz of the config file so that if the MusicBrainz server is running on a non-standard port whipper will know and make a correct decision (the user is the only one who can know if it's an http or https server).

Summary:

  • If both server and https options aren't defined, then default to server='musicbrainz.org', https=True
  • If server is defined as musicbrainz.org and https is not, then assume https=True
  • If server is defined (it's not musicbrainz.org) and https is not defined: then assume https=False
  • In all other cases just use the values as defined
  • If https=True then the correct scheme for the MusicBrainz lookup URL is https, otherwise it must be set to http.

This way it should be possible to solve all three sub-issues in a reliable manner. 😉

EDIT:

Another possible solution (I think this is better) would be to redefine server to also include the scheme so that introducing a new variable isn't required anymore.

@ABCbum
Copy link
Contributor Author

ABCbum commented Jan 15, 2020

Another possible solution (I think this is better) would be to redefine server to also include the scheme so that introducing a new variable isn't required anymore.

I don't know if this is possible since musicbrainzngs.set_hostname accepts host:[port] format.

I think that the best way to solve this is to add support for a boolean option named https to section musicbrainz of the config file so that if the MusicBrainz server is running on a non-standard port whipper will know and make a correct decision (the user is the only one who can know if it's an http or https server).

Yeah i think this is a great idea since user will be in control but do we need to inform them about the possible bug of using a wrong scheme ?

Also is the current way of handling not reliable since in order to accomplish

If server is defined (it's not musicbrainz.org): then assume https=False

assuming to use http here might not always be appropriate (?) since server still can be beta.musicbrainz.org. Despite that, at the moment i can't think of any other way to detect when whipper in that case should use https or http besides the portSpecified way.

Or maybe we just add a config option https as you suggested and only use https if the user set it to true ? Then whipper won't have to assume anything.

I need confirmation on this, the current test that i add seems to be modifying the config file (this means if user run test their config might be changed), should i somehow hackily mock the function instead since i can't really come up with a way to get around the config right now.😉

@JoeLametta JoeLametta marked this pull request as ready for review January 16, 2020 11:04
@JoeLametta JoeLametta changed the title Use https and http approriately in Musicbrainz URL(s) WIP: Use https and http approriately in Musicbrainz URL(s) Jan 16, 2020
@JoeLametta
Copy link
Collaborator

I've tried to push my changes to this branch but it seems it's not allowed.
Here's the patch (inline):

From 07f7e273c53b8b47553b571b2b30362f836ac5dc Mon Sep 17 00:00:00 2001
From: JoeLametta <JoeLametta@users.noreply.github.com>
Date: Thu, 16 Jan 2020 10:19:54 +0000
Subject: [PATCH 1/4] Revert "Use https and http approriately in Musicbrainz
 URL"

This reverts commit 769a222f5897292f8c62635ff05bba77d12d7eca.
---
 whipper/command/main.py | 3 +--
 whipper/image/table.py  | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/whipper/command/main.py b/whipper/command/main.py
index 1fa5dbd..9571eb3 100644
--- a/whipper/command/main.py
+++ b/whipper/command/main.py
@@ -20,8 +20,7 @@ logger = logging.getLogger(__name__)
 
 def main():
     server = config.Config().get_musicbrainz_server()
-    portNotSpecified = (':' not in server)
-    musicbrainzngs.set_hostname(server, use_https=portNotSpecified)
+    musicbrainzngs.set_hostname(server)
 
     # Find whipper's plugins paths (local paths have higher priority)
     plugins_p = [directory.data_path('plugins')]  # local path (in $HOME)
diff --git a/whipper/image/table.py b/whipper/image/table.py
index 6873479..56ec8ae 100644
--- a/whipper/image/table.py
+++ b/whipper/image/table.py
@@ -352,7 +352,6 @@ class Table:
 
     def getMusicBrainzSubmitURL(self):
         host = config.Config().get_musicbrainz_server()
-        portSpecified = (':' in host)
 
         discid = self.getMusicBrainzDiscId()
         values = self._getMusicBrainzValues()
@@ -362,10 +361,9 @@ class Table:
             ('tracks', self.getAudioTracks()),
             ('id', discid),
         ])
-        scheme = 'http' if portSpecified else 'https'
 
         return urlunparse((
-            scheme, host, '/cdtoc/attach', '', query, ''))
+            'https', host, '/cdtoc/attach', '', query, ''))
 
     def getFrameLength(self, data=False):
         """
-- 
2.25.0


From 0d603e0a2fb17afc43c589041103d274c0ad241d Mon Sep 17 00:00:00 2001
From: JoeLametta <JoeLametta@users.noreply.github.com>
Date: Thu, 16 Jan 2020 10:44:10 +0000
Subject: [PATCH 2/4] Use https and http approriately in Musicbrainz URL

The tests need to be updated.

Fixes:
- MusicBrainz submit URL always has https as protocol: hardcoded, even when
inappropriate. It's just a graphical issue.
- Whipper appears to always communicate with MusicBrainz using musicbrainzngs
over http. The musicbrainzngs.set_hostname(server) method is called every
time (even when no custom server has been defined).
-  always defaults to http. With musicbrainzngs
version 0.7, the method set_hostname takes an optional argument named use_https
(it defaults to False) which whipper never passes.

Signed-off-by: JoeLametta <JoeLametta@users.noreply.github.com>
---
 whipper/command/main.py  |  8 +++++++-
 whipper/common/config.py | 10 ++++++----
 whipper/image/table.py   |  4 ++--
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/whipper/command/main.py b/whipper/command/main.py
index 9571eb3..6c27fbb 100644
--- a/whipper/command/main.py
+++ b/whipper/command/main.py
@@ -20,7 +20,13 @@ logger = logging.getLogger(__name__)
 
 def main():
     server = config.Config().get_musicbrainz_server()
-    musicbrainzngs.set_hostname(server)
+    https_enabled = True if server['scheme'] == 'https' else False
+    try:
+        musicbrainzngs.set_hostname(server['netloc'], https_enabled)
+    # Parameter 'use_https' is missing in versions of musicbrainzngs < 0.7
+    except TypeError as e:
+        logger.warning(e)
+        musicbrainzngs.set_hostname(server['netloc'])
 
     # Find whipper's plugins paths (local paths have higher priority)
     plugins_p = [directory.data_path('plugins')]  # local path (in $HOME)
diff --git a/whipper/common/config.py b/whipper/common/config.py
index 8774c86..49e5c41 100644
--- a/whipper/common/config.py
+++ b/whipper/common/config.py
@@ -75,10 +75,12 @@ class Config:
     # musicbrainz section
 
     def get_musicbrainz_server(self):
-        server = self.get('musicbrainz', 'server') or 'musicbrainz.org'
-        server_url = urlparse('//' + server)
-        if server_url.scheme != '' or server_url.path != '':
-            raise KeyError('Invalid MusicBrainz server: %s' % server)
+        conf = self.get('musicbrainz', 'server') or 'https://musicbrainz.org'
+        if not conf.startswith(('http://', 'https://')):
+            raise KeyError('Invalid MusicBrainz server: unsupported '
+                           'or missing scheme')
+        scheme, netloc, _, _, _, _ = urlparse(server)
+        server = {'scheme':scheme, 'netloc':netloc}
         return server
 
     # drive sections
diff --git a/whipper/image/table.py b/whipper/image/table.py
index 56ec8ae..b4804e1 100644
--- a/whipper/image/table.py
+++ b/whipper/image/table.py
@@ -351,7 +351,7 @@ class Table:
         return disc.id
 
     def getMusicBrainzSubmitURL(self):
-        host = config.Config().get_musicbrainz_server()
+        serv = config.Config().get_musicbrainz_server()
 
         discid = self.getMusicBrainzDiscId()
         values = self._getMusicBrainzValues()
@@ -363,7 +363,7 @@ class Table:
         ])
 
         return urlunparse((
-            'https', host, '/cdtoc/attach', '', query, ''))
+            serv['scheme'], server['netloc'], '/cdtoc/attach', '', query, ''))
 
     def getFrameLength(self, data=False):
         """
-- 
2.25.0


From 0b41a6e1c06e5f60a9c62990c40b7ce160f448d5 Mon Sep 17 00:00:00 2001
From: JoeLametta <JoeLametta@users.noreply.github.com>
Date: Thu, 16 Jan 2020 10:52:09 +0000
Subject: [PATCH 3/4] Update ABCbum's test case

Also removed trailing whitespace

Signed-off-by: JoeLametta <JoeLametta@users.noreply.github.com>
---
 whipper/test/test_image_table.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/whipper/test/test_image_table.py b/whipper/test/test_image_table.py
index c063bff..c85cc49 100644
--- a/whipper/test/test_image_table.py
+++ b/whipper/test/test_image_table.py
@@ -63,10 +63,10 @@ class LadyhawkeTestCase(tcommon.TestCase):
         self.config = config.Config()
         self.config._parser.add_section('musicbrainz')
         self.config._parser.set('musicbrainz', 'server',
-                                'musicbrainz.org:80')
+                                'http://musicbrainz.org')
         self.config.write()
         self.assertEqual(self.table.getMusicBrainzSubmitURL(),
-                         "http://musicbrainz.org:80/cdtoc/attach?toc=1+12+1958"
+                         "http://musicbrainz.org/cdtoc/attach?toc=1+12+1958"
                          "56+150+15687+31841+51016+66616+81352+99559+116070+13"
                          "3243+149997+161710+177832&tracks=12&id=KnpGsLhvH.lPr"
                          "Nc1PBL21lb9Bg4-")
@@ -75,7 +75,7 @@ class LadyhawkeTestCase(tcommon.TestCase):
 
         self.config._parser.remove_section('musicbrainz')
         print("Removed ", self.config._parser.sections())
-  
+
     def testAccurateRip(self):
         self.assertEqual(self.table.accuraterip_ids(), (
             "0013bd5a", "00b8d489"))
-- 
2.25.0


From 26df7f743ae8a61767665ba177e2f2f7efe1ac52 Mon Sep 17 00:00:00 2001
From: JoeLametta <JoeLametta@users.noreply.github.com>
Date: Thu, 16 Jan 2020 11:02:15 +0000
Subject: [PATCH 4/4] Update `test_get_musicbrainz_server()` test case

Signed-off-by: JoeLametta <JoeLametta@users.noreply.github.com>
---
 whipper/test/test_common_config.py | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/whipper/test/test_common_config.py b/whipper/test/test_common_config.py
index d0bcb44..68231f8 100644
--- a/whipper/test/test_common_config.py
+++ b/whipper/test/test_common_config.py
@@ -69,25 +69,25 @@ class ConfigTestCase(tcommon.TestCase):
 
     def test_get_musicbrainz_server(self):
         self.assertEqual(self._config.get_musicbrainz_server(),
-                         'musicbrainz.org',
+                         {'scheme':'https', 'netloc':'musicbrainz.org'},
                          msg='Default value is correct')
 
         self._config._parser.add_section('musicbrainz')
 
         self._config._parser.set('musicbrainz', 'server',
-                                 '192.168.2.141:5000')
+                                 'http://192.168.2.141:5000')
         self._config.write()
         self.assertEqual(self._config.get_musicbrainz_server(),
-                         '192.168.2.141:5000',
+                         {'scheme':'http', 'netloc':'192.168.2.141:5000'},
                          msg='Correctly returns user-set value')
 
-        self._config._parser.set('musicbrainz', 'server',
-                                 '192.168.2.141:5000/hello/world')
+        # Test for unsupported scheme
+        self._config._parser.set('musicbrainz', 'server', 'ftp://example.com')
         self._config.write()
         self.assertRaises(KeyError, self._config.get_musicbrainz_server)
 
-        self._config._parser.set('musicbrainz', 'server',
-                                 'http://192.168.2.141:5000')
+        # Test for absent scheme
+        self._config._parser.set('musicbrainz', 'server', 'example.com')
         self._config.write()
         self.assertRaises(KeyError, self._config.get_musicbrainz_server)
 
-- 
2.25.0

@ABCbum
Copy link
Contributor Author

ABCbum commented Jan 16, 2020

I've tried to push my changes to this branch but it seems it's not allowed.
Here's the patch (inline):

Ahh my bad, i forgot to check the allow edit from maintainers box.

Edit: i think it should be possible now.

@ABCbum
Copy link
Contributor Author

ABCbum commented Jan 16, 2020

Agh, @JoeLametta seems like my push overlapped yours, btw i fixed a typo in getSubmitUrl :
serv['scheme'], server['netloc'], '/cdtoc/attach', '', query, ''))
i think it should be serv instead 😉
Edit: it's here.

@JoeLametta JoeLametta force-pushed the develop branch 8 times, most recently from eab745c to 05a1ba2 Compare January 16, 2020 17:48
@JoeLametta
Copy link
Collaborator

JoeLametta commented Jan 16, 2020

I've fixed the bug in a hacky way. The environment variable hijack only applies to the process which runs that instruction and to its child so it shouldn't cause issues (other processes aren't affected).

@JoeLametta JoeLametta changed the title WIP: Use https and http approriately in Musicbrainz URL(s) Use https and http approriately in Musicbrainz URL(s) Jan 16, 2020
@JoeLametta JoeLametta requested a review from Freso January 16, 2020 17:55
README.md Outdated Show resolved Hide resolved
Fixed some bugs:
- MusicBrainz submit URL always has https as protocol: hardcoded, even when
inappropriate. It's just a graphical issue.
- Whipper appears to always communicate with MusicBrainz using musicbrainzngs
over http. The musicbrainzngs.set_hostname(server).
- `musicbrainzngs.set_hostname(server)` always defaults to http. Since musicbrainzngs
version 0.7 the method `set_hostname` takes an optional argument named `use_https`
(defaults to False) which whipper never passes.

Changed behaviour of `server` option (`musicbrainz` section of whipper's configuration file).
Now it expects an URL with a valid scheme (scheme must be `http` or `http`, empty scheme isn't allowed anymore).
Only the scheme and netloc parts of the URL are taken into account.

Fixes whipper-team#437.

Signed-off-by: JoeLametta <JoeLametta@users.noreply.github.com>
Applies to a default configuration (no custom MusicBrainz server specified).

Co-authored-by: JoeLametta <JoeLametta@users.noreply.github.com>
Signed-off-by: JoeLametta <JoeLametta@users.noreply.github.com>
Signed-off-by: ABCbum <kimlong221002@gmail.com>
@JoeLametta JoeLametta changed the title Use https and http approriately in Musicbrainz URL(s) Use https and http appropriately when connecting to MusicBrainz Jan 17, 2020
@JoeLametta JoeLametta merged commit 7b8a20b into whipper-team:develop Jan 29, 2020
@JoeLametta
Copy link
Collaborator

Merged, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug: MusicBrainz lookup URL is hardcoded to always use https
3 participants