
Lingering renderer processes #2

Closed
mihe opened this issue Mar 6, 2018 · 13 comments

@mihe
Contributor

mihe commented Mar 6, 2018

Just gave this build a try in our application and in your sample cefmixer application. In both cases I have excellent performance. Disabling vsync in Chromium definitely helped with responsiveness.
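(The thread doesn't show how vsync was disabled; one common way, purely as an assumption here, is appending Chromium's disable-gpu-vsync switch from the CefApp handler. MyApp is a placeholder name:)

#include "include/cef_app.h"

// Hypothetical sketch: disable GPU vsync via a standard Chromium switch.
// OnBeforeCommandLineProcessing and AppendSwitch are stock CEF API, but the
// patched build may well disable vsync internally instead.
class MyApp : public CefApp {
 public:
  void OnBeforeCommandLineProcessing(
      const CefString& process_type,
      CefRefPtr<CefCommandLine> command_line) override {
    command_line->AppendSwitch("disable-gpu-vsync");
  }

 private:
  IMPLEMENT_REFCOUNTING(MyApp);
};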

I do have an issue that shows up in both applications, though: the --type=renderer processes hang around after the browser process has closed, and they all seem to be spinning on one thread (each pegging one full core, i.e. 25% of my 4-core CPU). It seems somewhat random when this happens.

Any idea what might be causing this?

Might be worth mentioning that I have an AMD RX 580 graphics card.

@wesselsga
Contributor

When you saw the issue with the cefmixer application, were you just using the stock aquarium URL? I've tried to reproduce this here with no luck yet. Just wondering if there's a particular page you were navigating to?

@mihe
Contributor Author

mihe commented Mar 6, 2018

I am able to reproduce the issue with the default fishgl URL in cefmixer. It's very random though.

The only way I've been able to reproduce it somewhat consistently is if I launch 6+ instances of the application simultaneously and close them all after a couple of seconds. But even then there's no guarantee that it happens.

@wesselsga
Contributor

wesselsga commented Mar 14, 2018

I still have not been able to reproduce this issue locally. I'm curious whether you still see it if you build the latest cefmixer project. I recently modified how the lifetime of CEF is handled in the test app (I moved away from using the multi_threaded_message_loop option). The new code can be found in:

void CefModule::startup() { ... }
void CefModule::shutdown() { ... }
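
A minimal sketch of what that dedicated-thread lifetime might look like (an assumption based on the description above, not the actual cefmixer code; CefModule's real internals may differ):

#include <thread>
#include "include/cef_app.h"
#include "include/base/cef_bind.h"
#include "include/wrapper/cef_closure_task.h"

class CefModule {
 public:
  void startup(HINSTANCE instance) {
    // Run CEF on its own thread rather than letting CEF spawn one
    // internally via multi_threaded_message_loop.
    thread_ = std::thread([instance]() {
      CefSettings settings;
      settings.multi_threaded_message_loop = false;  // this thread owns the loop
      settings.windowless_rendering_enabled = true;
      CefMainArgs main_args(instance);
      CefInitialize(main_args, settings, nullptr, nullptr);
      CefRunMessageLoop();  // blocks until CefQuitMessageLoop() runs
      CefShutdown();        // clean shutdown happens on the same thread
    });
  }

  void shutdown() {
    // Post the quit to the CEF UI thread, then wait for CefShutdown() to finish.
    CefPostTask(TID_UI, base::Bind(&CefQuitMessageLoop));
    if (thread_.joinable())
      thread_.join();
  }

 private:
  std::thread thread_;
};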

If it still happens, do you have a DEBUG build of CEF that you can attach to the hung process?

@mihe
Contributor Author

mihe commented Mar 14, 2018

I'll give the new cefmixer a try as soon as I can to see if I can still reproduce the issue there.

As I've mentioned, we experience this issue in our own application as well. We only recently switched multi_threaded_message_loop from false to true, and I've had the issue appear with both configurations.

I haven't compiled my own build of your patched CEF, but instead made use of your binary distributions, so unfortunately I'm not able to debug it.

You don't happen to have built your distribution with debug symbols, by any chance? Otherwise I'll see about compiling CEF/Chromium myself if the new cefmixer doesn't work out.

@wesselsga
Contributor

It will take me a little while, but I will also get the symbols together for the sample distribution.

@mihe
Contributor Author

mihe commented Mar 15, 2018

So I finally managed to get my own build going, with debug symbols and everything, and I managed to reproduce the issue in the latest cefmixer.

Here's the call stack for the lingering render process (on the CrRendererMain thread):

NtCreateEvent()
CreateEventW()
base::WaitableEvent::WaitableEvent(base::WaitableEvent::ResetPolicy reset_policy, base::WaitableEvent::InitialState initial_state) Line 27
ui::Gpu::EstablishGpuChannelSync() Line 343
content::RenderThreadImpl::EstablishGpuChannelSync() Line 1972
content::RenderThreadImpl::RequestNewLayerTreeFrameSink(int routing_id, scoped_refptr<content::FrameSwapMessageQueue> frame_swap_message_queue, const GURL & url, const base::RepeatingCallback<void (std::unique_ptr<cc::LayerTreeFrameSink,std::default_delete<cc::LayerTreeFrameSink> >)> & callback) Line 2062
content::RenderWidget::RequestNewLayerTreeFrameSink(const base::RepeatingCallback<void (std::unique_ptr<cc::LayerTreeFrameSink,std::default_delete<cc::LayerTreeFrameSink> >)> & callback) Line 1002
content::RenderWidgetCompositor::RequestNewLayerTreeFrameSink() Line 1220
base::debug::TaskAnnotator::RunTask(const char * queue_function, base::PendingTask * pending_task) Line 53
blink::scheduler::TaskQueueManager::ProcessTaskFromWorkQueue(blink::scheduler::internal::WorkQueue * work_queue, blink::scheduler::LazyNow time_before_task, base::TimeTicks * time_after_task) Line 543
blink::scheduler::TaskQueueManager::DoWork(blink::scheduler::internal::Sequence::WorkType work_type) Line 343
base::debug::TaskAnnotator::RunTask(const char * queue_function, base::PendingTask * pending_task) Line 53
blink::scheduler::internal::ThreadControllerImpl::DoWork(blink::scheduler::internal::Sequence::WorkType work_type) Line 99
base::debug::TaskAnnotator::RunTask(const char * queue_function, base::PendingTask * pending_task) Line 53
base::MessageLoop::RunTask(base::PendingTask * pending_task) Line 399
base::MessageLoop::DoWork() Line 462
base::MessagePumpDefault::Run(base::MessagePump::Delegate * delegate) Line 37
base::RunLoop::Run() Line 136
content::RendererMain(const content::MainFunctionParams & parameters) Line 218
content::ContentMainRunnerImpl::Run() Line 717
service_manager::MainRun(service_manager::MainParams & params) Line 467
service_manager::Main(service_manager::MainParams & params) Line 514
content::ContentMain(const content::ContentMainParams & params) Line 19
CefExecuteProcess(const CefMainArgs & args, scoped_refptr<CefApp> application, void * windows_sandbox_info) Line 200
cef_execute_process(const _cef_main_args_t * args, _cef_app_t * application, void * windows_sandbox_info) Line 194

Let me know if you need anything else.

@mihe
Contributor Author

mihe commented Mar 16, 2018

Can confirm that I get the same call stack in our own application as well. It's constantly running the RenderWidgetCompositor::RequestNewLayerTreeFrameSink task and failing, which sends another request, forever and ever.
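
Conceptually, the spin looks something like this (illustrative pseudocode reconstructed from the call stack above, not actual Chromium source):

// The renderer needs a GPU channel to create a new frame sink, but the
// browser process that brokers the channel is already gone. The failure
// path re-posts the request, so CrRendererMain never goes idle and the
// process never exits.
void RequestNewLayerTreeFrameSink() {
  auto channel = EstablishGpuChannelSync();   // fails: browser is gone
  if (!channel) {
    PostTask(&RequestNewLayerTreeFrameSink);  // retried immediately, forever
    return;
  }
  // normal path: create the LayerTreeFrameSink on the channel
}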

@wesselsga
Contributor

wesselsga commented Mar 16, 2018

With DEBUG builds I have seen a few issues with shutdown and DCHECKs failing. I also added a --grid option to cefmixer and noticed more DCHECK failures with multiple HTML view instances. The --grid option lets you specify something like --grid=2x2, and cefmixer will tile multiple HTML views. It works well for stress testing.

I believe the code I added to CEF originally is problematic:

compositor->SetAcceleratedWidget(gfx::kNullAcceleratedWidget);

This code was intended to get Chromium to use the offscreen FBO, but it seems to have negative consequences. I removed it, since it turned out to be unnecessary: there is another flag on the compositor, set via the EnableSharedTextures method, that tells it to use shared textures.
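
In other words, the change is roughly this (a sketch of the edit described above; the exact EnableSharedTextures signature isn't shown in this thread and is assumed here to take a bool):

// Before: forcing the offscreen FBO path via a null accelerated widget.
// compositor->SetAcceleratedWidget(gfx::kNullAcceleratedWidget);  // removed

// After: rely only on the shared-texture flag already on the compositor.
compositor->EnableSharedTextures(true);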

Anyway, I did reproduce the hung process issue here, and my call stack was the same as yours. I'm testing the above change now to see if the results are more stable; it did resolve the DCHECK startup failures with multiple instances.

@mihe
Contributor Author

mihe commented Mar 17, 2018

I am unfortunately still able to reproduce the issue with your latest commit, both with and without the --grid=2x2 argument.

@wesselsga
Contributor

wesselsga commented Mar 18, 2018

The only way I can seem to reproduce this here is by running a DEBUG build in the debugger and then terminating it prematurely with a Stop Debugging command. I have yet to reproduce this locally with a Release build.

Do you have any tips for reproducing this with a Release build? I've been attempting to launch several instances and quickly close them, with no luck yet. Even force-killing the browser process with Task Manager does not seem to leave a lingering render process.

@mihe
Contributor Author

mihe commented Mar 18, 2018

One thing that made it easier for me to reproduce was to do...

// ...
while (msg.message != WM_QUIT)
{
	// Simulate an abrupt, premature exit after ~1000 iterations of the
	// message loop, deliberately skipping all CEF shutdown work.
	static int counter = 0;
	if (counter++ > 1000)
		exit(0);

	if (PeekMessage(&msg, nullptr, 0, 0, PM_REMOVE))
// ...

... in main.cpp, and then launch 5-6 of them in parallel and keep launching them as they disappear.

You can probably come up with better ways to do the same thing, but that worked well enough for me, and lets me reproduce the issue within 20-30 launches at most.
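
For example, one hypothetical variant (untested against this bug) is to trigger the abrupt exit on wall-clock time instead of an iteration count, so the abort point doesn't depend on how busy the message loop is:

	// Inside the same message loop, replacing the counter above
	// (requires #include <chrono>):
	static const auto start = std::chrono::steady_clock::now();
	if (std::chrono::steady_clock::now() - start > std::chrono::seconds(2))
		exit(0);  // abrupt exit, deliberately skipping CEF shutdown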

I have had this appear without doing Stop Debugging or exit(0), but using those methods does seem to make the issue appear more frequently.

@mihe
Contributor Author

mihe commented Mar 18, 2018

Alright, so I just downloaded the 3.3325.1750 build from the Spotify automated builds website, pointed cefmixer at it, and commented out the OnAcceleratedPaint/shared_textures_enabled code. I can happily/unfortunately report that I see the same issue there.

So I assume that means it can't be an issue with your branch. I'm gonna go back and try a couple more builds from their archive and see if I can find when it started happening.

Apologies for not trying this earlier on. We jumped straight from 3.3112.1650 to your branch, so I can only assume that this bug appeared somewhere in between.

@mihe mihe closed this as completed Mar 18, 2018
@mihe
Contributor Author

mihe commented Mar 18, 2018

For posterity's sake: it seems to have been introduced with Chromium 64. I don't see the issue in CEF 3.3239.1723.g071d1c1 / Chromium 63.0.3239.132, and I do see it in CEF 3.3282.1728.g2171fc7 / Chromium 64.0.3282.119.

I'll try and track down what caused it and post an issue in the appropriate forum.
