Import log should probably force a UTF-8 encoding #4693

Closed
shrippen opened this issue Mar 7, 2023 · 1 comment · Fixed by #4730
Labels
bug bugs that are confirmed and actionable

Comments

@shrippen

shrippen commented Mar 7, 2023

Problem

When trying to import a specific folder I get a UnicodeEncodeError. It is most likely related to the pathname/filename.
Running this command in verbose (-vv) mode:

$ beet -vv -c appdata\Roaming\beets\config.yaml import V:\Lupin\

Led to this problem:

--- Logging error ---
--- Logging error ---
Traceback (most recent call last):
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\logging\__init__.py", line 1113, in emit
    stream.write(msg + self.terminator)
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode characters in position 14-58: character maps to <undefined>
Call stack:
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 995, in _bootstrap
    self._bootstrap_inner()
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\site-packages\beets\util\pipeline.py", line 311, in run
    out = self.coro.send(msg)
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\site-packages\beets\util\pipeline.py", line 170, in coro
    task = func(*(args + (task,)))
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\site-packages\beets\importer.py", line 1400, in user_query
    task.choose_match(session)
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\site-packages\beets\importer.py", line 861, in choose_match
    session.log_choice(self)
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\site-packages\beets\importer.py", line 281, in log_choice
    self.tag_log('skip', paths)
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\site-packages\beets\importer.py", line 260, in tag_log
    self.logger.info('{0} {1}', status, displayable_path(paths))
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\logging\__init__.py", line 1489, in info
    self._log(INFO, msg, args, **kwargs)
  File "C:\Users\arian\AppData\Local\Programs\Python\Python311\Lib\site-packages\beets\logging.py", line 88, in _log
    return super()._log(level, m, (), exc_info, extra)
Message: <beets.logging.StrFormatLogger._LogMessage object at 0x00000236564C5F90>
Arguments: ()
Sending event: import_task_choice
Sending event: import
Sending event: cli_exit

Here's a link to the music files that trigger the bug (if relevant):
https://nxtc.arianw.de/s/m2sQx35S37nG8e9

Setup

  • OS: Windows
  • Python version: 3.11.1
  • beets version: 1.6.0
  • Turning off plugins made problem go away (yes/no): no

My configuration (output of beet config) is:

lyrics:
    bing_lang_from: []
    auto: yes
    bing_client_secret: REDACTED
    bing_lang_to:
    google_API_key: REDACTED
    google_engine_ID: REDACTED
    genius_api_key: REDACTED
    fallback:
    force: no
    local: no
    sources:
    - google
    - musixmatch
    - genius
    - tekstowo

plugins: embedart scrub chroma lyrics lastgenre discogs plexupdate VGMplug
directory: F:\Music
library: C:\Users\arian\AppData\Roaming\beets\library.db
art_filename: albumart
threaded: yes

clutter:
- Thumbs.DB
- .DS_Store
- '*.jpg'
- '*.png'

paths:
    default: $albumartist/$album%aunique{}/$disc.$track - $title
    singleton: Non-Album/$artist - $title
    comp: Various Artists/$album%aunique{}/$disc.$track - $title
    albumtype_soundtrack: Soundtracks/$album/$disc.$track - $title

import:
    write: yes
    copy: no
    move: yes
    resume: ask
    incremental: no
    quiet_fallback: skip
    timid: no
    log: C:\Users\arian\AppData\Roaming\beets\beet.log

replace:
    ^\.: _
    '[\x00-\x1f]': _
    '[<>:"\?\*\|]': _
    '[\xE8-\xEB]': e
    '[\xEC-\xEF]': i
    '[\xE2-\xE6]': a
    '[\xF2-\xF6]': o
    '[\xF8]': o
    \.$: _
    \s+$: ''
web:
    host: 0.0.0.0
    port: 8337
scrub:
    auto: yes
plex:
    host: px.arianw.de
    port: 443
    token: REDACTED
    secure: yes
    library_name: Music
    ignore_cert_errors: no
VGMplug:
    lang-priority: en, ja-latn, ja
    source_weight: 0.0
    artist-priority: composers,performers,arrangers
chroma:
    auto: yes
embedart:
    maxwidth: 0
    auto: yes
    compare_threshold: 0
    ifempty: no
    remove_art_file: no
    quality: 0
discogs:
    apikey: REDACTED
    apisecret: REDACTED
    tokenfile: discogs_token.json
    source_weight: 0.5
    user_token: REDACTED
    separator: ', '
    index_tracks: no
lastgenre:
    whitelist: yes
    min_weight: 10
    count: 1
    fallback:
    canonical: no
    source: album
    force: yes
    auto: yes
    separator: ', '
    prefer_specific: no
    title_case: yes
@sampsyo sampsyo added the needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." label Mar 8, 2023
@sampsyo
Member

sampsyo commented Mar 8, 2023

As ever, Python 3's automatic handling of encodings can be really annoying, especially on Windows!

It looks like this error is coming from the import log (which goes to a file), not our normal logging stream (which goes to stderr). So I believe the fix on our end might be to force this stream to be UTF-8 here:

loghandler = logging.FileHandler(logpath)

Namely, that could become FileHandler(logpath, encoding="utf-8") to force UTF-8, which is probably what we want, but I'm not entirely sure.
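To make the suggestion concrete, here is a minimal standalone sketch of a file handler opened with an explicit UTF-8 encoding. It is not the beets code: the helper names, the logger name, and the sample path are all made up for illustration, and the only real API used is the standard library's `logging.FileHandler(filename, encoding=...)`.

```python
import logging
import os
import tempfile


def make_import_log_handler(logpath):
    """Hypothetical helper: open the import log as UTF-8 regardless of locale."""
    # Without encoding=..., FileHandler uses the locale's default encoding
    # (cp1252 on many Windows setups), which cannot represent every character.
    return logging.FileHandler(logpath, encoding="utf-8")


def write_and_read_back(path_to_log):
    """Log one line containing path_to_log, then return the log file's text."""
    logpath = os.path.join(tempfile.mkdtemp(), "beet.log")
    logger = logging.getLogger("import_log_demo")  # demo logger, not beets' own
    logger.setLevel(logging.INFO)
    logger.propagate = False

    handler = make_import_log_handler(logpath)
    logger.addHandler(handler)
    try:
        logger.info("skip %s", path_to_log)
    finally:
        logger.removeHandler(handler)
        handler.close()

    with open(logpath, encoding="utf-8") as f:
        return f.read()


# Made-up path with characters that cp1252 has no mapping for.
SAMPLE_PATH = "V:\\Lupin\\ルパン三世"
```

Calling `write_and_read_back(SAMPLE_PATH)` should return the logged line with the non-Latin characters intact, whereas encoding the same string with cp1252 raises the UnicodeEncodeError seen in the traceback.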

In the meantime, you can work around this problem on your system by setting the environment variable PYTHONUTF8=1.
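If you want to confirm that the workaround took effect, a small check like the following may help. It is only a sketch (the function name is made up); it relies on `sys.flags.utf8_mode` and `locale.getpreferredencoding`, both of which are in the standard library. The variable has to be set before Python starts, since UTF-8 mode is decided at interpreter startup.

```python
import locale
import sys


def describe_text_encoding():
    """Hypothetical helper: report whether Python's UTF-8 mode is active."""
    return {
        # True when Python was started with PYTHONUTF8=1 or -X utf8.
        "utf8_mode": bool(sys.flags.utf8_mode),
        # The default encoding that open() and FileHandler fall back to
        # when no explicit encoding is given.
        "preferred_encoding": locale.getpreferredencoding(False),
    }
```

With UTF-8 mode active, `utf8_mode` should be true and the preferred encoding should be UTF-8 rather than cp1252, so file handlers created without an explicit encoding write UTF-8 as well.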

@sampsyo sampsyo changed the title Unicode Problem Import log should probably force a UTF-8 encoding Mar 8, 2023
@sampsyo sampsyo added bug bugs that are confirmed and actionable and removed needinfo We need more details or follow-up from the filer before this can be tagged "bug" or "feature." labels Mar 8, 2023