Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

SIGHUP handler can throw exception due to reentrant call, which stops all future logging #8769

Closed
michaelkaye opened this issue Nov 17, 2020 · 5 comments · Fixed by #8817
Closed
Assignees
Labels
A-Logging Synapse's logs (structured or otherwise). Not metrics. z-bug (Deprecated Label) z-p2 (Deprecated Label)

Comments

@michaelkaye
Copy link
Contributor

When doing a SIGHUP to reload logging, this exception was thrown, the logs were not rotated, and no further logging was added to the log (the logging infrastructure seemed to remove the old loggers but never create the new ones?)

Seen on matrix.org

--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1038, in emit
    self.flush()
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1018, in flush
    self.stream.flush()
  File "/home/synapse/src/synapse/app/_base.py", line 253, in handle_sighup
    i(*args, **kwargs)
  File "/home/synapse/src/synapse/config/logger.py", line 289, in _reload_logging_config
    _load_logging_config(log_config_path)
  File "/home/synapse/src/synapse/config/logger.py", line 278, in _load_logging_config
    logging.config.dictConfig(log_config)
  File "/usr/local/lib/python3.7/logging/config.py", line 799, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/local/lib/python3.7/logging/config.py", line 535, in configure
    _clearExistingHandlers()
  File "/usr/local/lib/python3.7/logging/config.py", line 272, in _clearExistingHandlers
    logging.shutdown(logging._handlerList[:])
  File "/usr/local/lib/python3.7/logging/__init__.py", line 2038, in shutdown
    h.flush()
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1018, in flush
    self.stream.flush()
RuntimeError: reentrant call inside <_io.BufferedWriter name='XXX.log'>
Call stack:
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/synapse/src/synapse/app/federation_sender.py", line 24, in <module>
    start(sys.argv[1:])
  File "/home/synapse/src/synapse/app/generic_worker.py", line 988, in start
    _base.start_worker_reactor("synapse-generic-worker", config)
  File "/home/synapse/src/synapse/app/_base.py", line 79, in start_worker_reactor
    run_command=run_command,
  File "/home/synapse/src/synapse/app/_base.py", line 132, in start_reactor
    run()
  File "/home/synapse/src/synapse/app/_base.py", line 116, in run
    run_command()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/base.py", line 1283, in run
    self.mainLoop()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/base.py", line 1295, in mainLoop
    self.doIteration(t)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/epollreactor.py", line 235, in doPoll
    log.callWithLogger(selectable, _drdw, selectable, fd, event)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
    why = selectable.doRead()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/endpoints.py", line 132, in dataReceived
    return self._wrappedProtocol.dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/protocols/tls.py", line 330, in dataReceived
    self._flushReceiveBIO()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/protocols/tls.py", line 295, in _flushReceiveBIO
    ProtocolWrapper.dataReceived(self, bytes)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/protocols/policies.py", line 120, in dataReceived
    self.wrappedProtocol.dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 1693, in dataReceived
    self._parser.dataReceived(bytes)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 391, in dataReceived
    HTTPParser.dataReceived(self, data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/protocols/basic.py", line 579, in dataReceived
    why = self.rawDataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 304, in rawDataReceived
    self.bodyDecoder.dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 1889, in dataReceived
    data = getattr(self, '_dataReceived_%s' % (self.state,))(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 1857, in _dataReceived_TRAILER
    self.finishCallback(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 456, in _finished
    self.finisher(rest)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 1050, in dispatcher
    return func(*args, **kwargs)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 1647, in _finishResponse_WAITING
    self._disconnectParser(reason)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 1673, in _disconnectParser
    parser.connectionLost(reason)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 567, in connectionLost
    self.response._bodyDataFinished()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 1050, in dispatcher
    return func(*args, **kwargs)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/_newclient.py", line 1306, in _bodyDataFinished_CONNECTED
    self._bodyProtocol.connectionLost(reason)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/treq/content.py", line 39, in connectionLost
    self.finished.callback(None)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 460, in callback
    self._startRunCallbacks(result)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 568, in _startRunCallbacks
    self._runCallbacks()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/synapse/src/synapse/util/async_helpers.py", line 517, in success_cb
    new_d.callback(val)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 460, in callback
    self._startRunCallbacks(result)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 568, in _startRunCallbacks
    self._runCallbacks()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1475, in gotResult
    _inlineCallbacks(r, g, status)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/synapse/src/synapse/metrics/background_process_metrics.py", line 212, in run
    result = await result
  File "/home/synapse/src/synapse/federation/sender/per_destination_queue.py", line 332, in _transaction_transmission_loop
    self._destination, pending_pdus, pending_edus
  File "/home/synapse/src/synapse/util/metrics.py", line 92, in measured_func
    r = await func(self, *args, **kwargs)
  File "/home/synapse/src/synapse/federation/sender/transaction_manager.py", line 163, in send_new_transaction
    logger.info("TX [%s] {%s} got %d response", destination, txn_id, code)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1383, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1519, in _log
    self.handle(record)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1529, in handle
    self.callHandlers(record)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/sentry_sdk/integrations/logging.py", line 77, in sentry_patched_callhandlers
    return old_callhandlers(self, record)
Message: 'TX [%s] {%s} got %d response'

Argument were present but not listing in public issue: it did not inherently look like bad log data.

Not sure if related to recent SIGHUP change #8697

This was caused by an automatic midnight SIGHUP as part of log rotation. This failed and no further logs were written until a second manual SIGHUP restarted logging again.

@richvdh
Copy link
Member

richvdh commented Nov 17, 2020

I suspect, without verification, that this is unrelated to any recent changes and is due to the unlikely event of the SIGHUP arriving during an active flush.

@clokep clokep added z-bug (Deprecated Label) A-Logging Synapse's logs (structured or otherwise). Not metrics. z-p2 (Deprecated Label) labels Nov 17, 2020
@richvdh
Copy link
Member

richvdh commented Nov 20, 2020

this happened again last night.

@benbz
Copy link
Contributor

benbz commented Nov 20, 2020

--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1038, in emit
    self.flush()
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1018, in flush
    self.stream.flush()
  File "/home/synapse/src/synapse/app/_base.py", line 253, in handle_sighup
    i(*args, **kwargs)
  File "/home/synapse/src/synapse/config/logger.py", line 289, in _reload_logging_config
    _load_logging_config(log_config_path)
  File "/home/synapse/src/synapse/config/logger.py", line 278, in _load_logging_config
    logging.config.dictConfig(log_config)
  File "/usr/local/lib/python3.7/logging/config.py", line 799, in dictConfig
    dictConfigClass(config).configure()
  File "/usr/local/lib/python3.7/logging/config.py", line 535, in configure
    _clearExistingHandlers()
  File "/usr/local/lib/python3.7/logging/config.py", line 272, in _clearExistingHandlers
    logging.shutdown(logging._handlerList[:])
  File "/usr/local/lib/python3.7/logging/__init__.py", line 2038, in shutdown
    h.flush()
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1018, in flush
    self.stream.flush()
RuntimeError: reentrant call inside <_io.BufferedWriter name='XXX.log'>
Call stack:
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/synapse/src/synapse/app/generic_worker.py", line 993, in <module>
    start(sys.argv[1:])
  File "/home/synapse/src/synapse/app/generic_worker.py", line 988, in start
    _base.start_worker_reactor("synapse-generic-worker", config)
  File "/home/synapse/src/synapse/app/_base.py", line 79, in start_worker_reactor
    run_command=run_command,
  File "/home/synapse/src/synapse/app/_base.py", line 132, in start_reactor
    run()
  File "/home/synapse/src/synapse/app/_base.py", line 116, in run
    run_command()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/base.py", line 1283, in run
    self.mainLoop()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/base.py", line 1295, in mainLoop
    self.doIteration(t)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/epollreactor.py", line 235, in doPoll
    log.callWithLogger(selectable, _drdw, selectable, fd, event)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
    why = selectable.doRead()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/tcp.py", line 243, in doRead
    return self._dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
    rval = self.protocol.dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 3024, in dataReceived
    return self._channel.dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 2305, in dataReceived
    return basic.LineReceiver.dataReceived(self, data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/protocols/basic.py", line 579, in dataReceived
    why = self.rawDataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 2312, in rawDataReceived
    self._transferDecoder.dataReceived(data)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 1755, in dataReceived
    finishCallback(data[contentLength:])
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 2171, in _finishRequestBody
    self.allContentReceived()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 2284, in allContentReceived
    req.requestReceived(command, path, version)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/http.py", line 946, in requestReceived
    self.process()
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/server.py", line 235, in process
    self.render(resrc)
  File "/home/synapse/src/synapse/http/site.py", line 145, in render
    Request.render(self, resrc)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/web/server.py", line 302, in render
    body = resrc.render(self)
  File "/home/synapse/src/synapse/http/server.py", line 216, in render
    defer.ensureDeferred(self._async_render_wrapper(request))
  File "/home/synapse/src/synapse/logging/context.py", line 669, in g
    return run_in_background(f, *args, **kwargs)
  File "/home/synapse/src/synapse/logging/context.py", line 700, in run_in_background
    res = defer.ensureDeferred(res)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 911, in ensureDeferred
    return _cancellableInlineCallbacks(coro)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/synapse/src/synapse/http/server.py", line 168, in wrapped_async_request_handler
    await h(self, request)
  File "/home/synapse/src/synapse/http/server.py", line 228, in _async_render_wrapper
    callback_return = await self._async_render(request)
  File "/home/synapse/src/synapse/http/server.py", line 401, in _async_render
    raw_callback_return = callback(request, **kwargs)
  File "/home/synapse/src/synapse/replication/http/_base.py", line 263, in _cached_handler
    return self.response_cache.wrap(txn_id, self._handle_request, request, **kwargs)
  File "/home/synapse/src/synapse/util/caches/response_cache.py", line 156, in wrap
    d = run_in_background(callback, *args, **kwargs)
  File "/home/synapse/src/synapse/logging/context.py", line 700, in run_in_background
    res = defer.ensureDeferred(res)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 911, in ensureDeferred
    return _cancellableInlineCallbacks(coro)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/home/synapse/src/synapse/replication/http/send_event.py", line 121, in _handle_request
    "Got event to send with ID: %s into room: %s", event.event_id, event.room_id
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1383, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1519, in _log
    self.handle(record)
  File "/usr/local/lib/python3.7/logging/__init__.py", line 1529, in handle
    self.callHandlers(record)
  File "/home/synapse/env-py37/lib/python3.7/site-packages/sentry_sdk/integrations/logging.py", line 77, in sentry_patched_callhandlers
    return old_callhandlers(self, record)
Message: '%s - %s - {%s} Processed request: %.3fsec/%.3fsec (%.3fsec, %.3fsec) (%.3fsec/%.3fsec/%d) %sB %s "%s %s %s" "%s" [%d dbevts]'

^ Is last night's instance. Sequence of events (mm:ss):

  • 00:40 - Last non-error log line seen in the old log file prior to the above
  • 00:42 - SIGHUP sent to worker by reload process
  • 01:30 - event persistance rates drop by more than half since 00:30
  • 02:12 - reload process initially times out. Reload process is attempted to be killed & repeats every 90s
  • 03:54 - This error is recorded and is the last thing in the old log file
  • 04:00 - event peristance rates still at same level as they dropped to at 01:30
  • 04:30 - event persistance rates spike to more than 2x what they were at 00:30
  • 05:30/06:00 - event persistance rates back to pre-issue levels
  • Some point after 12:26 - SIGHUP manually sent to worker
  • 15:43 - final repetition of reloadp rocess timing out
  • 15:44 - First line in new log file

@benbz
Copy link
Contributor

benbz commented Nov 24, 2020

This happened again overnight, this time on a pusher. If I'm reading the graphs correctly it didn't stop the pusher making requests. We have no logs for ~8hrs though

@erikjohnston
Copy link
Member

Probably the easiest thing to do here is to not flush in the signal handler, but instead queue it up to be done next reactor tick.

@erikjohnston erikjohnston self-assigned this Nov 25, 2020
erikjohnston added a commit that referenced this issue Nov 25, 2020
We can get a SIGHUP at any point, including times where we are not in a
sane state. By deferring calling the handlers until the next reactor
tick we ensure that we don't get unexpected conflicts, e.g. trying to
flush logs from the signal handler while the code was in the process of
writing a log entry.

Fixes #8769.
erikjohnston added a commit that referenced this issue Nov 26, 2020
We can get a SIGHUP at any point, including times where we are not in a
sane state. By deferring calling the handlers until the next reactor
tick we ensure that we don't get unexpected conflicts, e.g. trying to
flush logs from the signal handler while the code was in the process of
writing a log entry.

Fixes #8769.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Logging Synapse's logs (structured or otherwise). Not metrics. z-bug (Deprecated Label) z-p2 (Deprecated Label)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants