Steel Battalion & Line of Contact: pgraph_surface_invalidate: Assertion `surface != d->pgraph.zeta_binding' failed. #893

Closed
HadetTheUndying opened this issue May 8, 2022 · 11 comments · Fixed by #935
Labels
bug Something isn't working

Comments

@HadetTheUndying

Title

https://xemu.app/titles/43430002/#Steel-Battalion
https://xemu.app/titles/43430009/#Steel-Battalion-Line-of-Contact

Bug Description

The game loads to the intro screen. Waiting there for the gameplay "Demo" to load (the preview of gameplay/rendering) causes the crash/freeze.

Steps to Reproduce:
Launch Game
Wait for the Gameplay Demo to begin

Expected Behavior

Demo to Load and not crash.

I assume it does finish loading and just can't figure out how to render. It does get past this in CXBX, but rendering is not correct. (Not sure if you share any render code.)

https://i.imgur.com/kmLMx3N.png

xemu Version

Version: 0.7.0
Branch: master
Commit: 9c06980
Date: Sun May 8 01:23:05 AM UTC 2022

System Information

CPU: AMD Ryzen 7 2700X Eight-Core Processor
OS Platform: Linux
OS Version: Void Linux
Manufacturer: AMD
GPU Model: AMD Radeon RX Vega (VEGA10, DRM 3.44.0, 5.16.20_1, LLVM 12.0.1)
Driver: 4.6 (Core Profile) Mesa 21.3.7
Shader: 4.60

Additional Context

gdb output:

[New Thread 0x7ffefe7fc640 (LWP 5096)]
[New Thread 0x7ffefdffb640 (LWP 5097)]
[New Thread 0x7ffefcdb6640 (LWP 5098)]
[Thread 0x7ffefcdb6640 (LWP 5098) exited]
[New Thread 0x7ffefcdb6640 (LWP 5099)]
[New Thread 0x7ffefc5b5640 (LWP 5100)]
[Thread 0x7ffefcdb6640 (LWP 5099) exited]
[Thread 0x7ffefc5b5640 (LWP 5100) exited]
[New Thread 0x7ffefc5b5640 (LWP 5101)]
[New Thread 0x7ffefcdb6640 (LWP 5102)]
[Thread 0x7ffefc5b5640 (LWP 5101) exited]
[New Thread 0x7ffefc5b5640 (LWP 5103)]
[New Thread 0x7ffefbdb4640 (LWP 5104)]
[New Thread 0x7ffefb5b3640 (LWP 5106)]
[New Thread 0x7ffefadb2640 (LWP 5107)]
[Thread 0x7ffefcdb6640 (LWP 5102) exited]
[New Thread 0x7ffefcdb6640 (LWP 5108)]
[New Thread 0x7ffefa5b1640 (LWP 5109)]
[Thread 0x7ffefcdb6640 (LWP 5108) exited]
[Thread 0x7ffefa5b1640 (LWP 5109) exited]
[New Thread 0x7ffefa5b1640 (LWP 5110)]
[New Thread 0x7ffefcdb6640 (LWP 5111)]
[Thread 0x7ffefcdb6640 (LWP 5111) exited]
[Thread 0x7ffefa5b1640 (LWP 5110) exited]
[New Thread 0x7ffefa5b1640 (LWP 5112)]
[New Thread 0x7ffefcdb6640 (LWP 5113)]
[Thread 0x7ffefcdb6640 (LWP 5113) exited]
[Thread 0x7ffefa5b1640 (LWP 5112) exited]
[New Thread 0x7ffefa5b1640 (LWP 5130)]
[New Thread 0x7ffefcdb6640 (LWP 5131)]
[New Thread 0x7ffef769e640 (LWP 5132)]
[New Thread 0x7ffef6e9d640 (LWP 5133)]
[New Thread 0x7ffef669c640 (LWP 5134)]
[Thread 0x7ffef6e9d640 (LWP 5133) exited]
[Thread 0x7ffef669c640 (LWP 5134) exited]
[Thread 0x7ffef769e640 (LWP 5132) exited]
[Thread 0x7ffefa5b1640 (LWP 5130) exited]
[Thread 0x7ffefe7fc640 (LWP 5096) exited]
[Thread 0x7ffefbdb4640 (LWP 5104) exited]
[Thread 0x7ffefcdb6640 (LWP 5131) exited]
[Thread 0x7ffefc5b5640 (LWP 5103) exited]
[Thread 0x7ffefdffb640 (LWP 5097) exited]
[Thread 0x7ffefadb2640 (LWP 5107) exited]
[Thread 0x7ffefb5b3640 (LWP 5106) exited]
[Thread 0x7ffefffff640 (LWP 5091) exited]
xemu: ../hw/xbox/nv2a/pgraph.c:5227: pgraph_surface_invalidate: Assertion `surface != d->pgraph.zeta_binding' failed.

Thread 33 "xemu" received signal SIGABRT, Aborted.
[Switching to Thread 0x7fff2ebf6640 (LWP 5011)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
@HadetTheUndying HadetTheUndying added the bug Something isn't working label May 8, 2022
@abaire
Contributor

abaire commented May 17, 2022

Can you get a backtrace (in a debug build) when that assert hits? pgraph_surface_invalidate is called in a handful of scenarios and it'd be helpful to know exactly what path leads to this case.

@HadetTheUndying
Author

HadetTheUndying commented May 17, 2022

> Can you get a backtrace (in a debug build) when that assert hits? pgraph_surface_invalidate is called in a handful of scenarios and it'd be helpful to know exactly what path leads to this case.

bt
xemu: ../hw/xbox/nv2a/pgraph.c:5249: pgraph_surface_invalidate: Assertion `surface != d->pgraph.zeta_binding' failed.

Thread 33 "xemu" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffeeebf6640 (LWP 6242)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1  0x00007ffff649b536 in __GI_abort () at abort.c:79
#2  0x00007ffff649b41f in __assert_fail_base (
    fmt=0x7ffff6603ea8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x5555560c2e48 "surface != d->pgraph.zeta_binding", 
    file=0x5555560c16e0 "../hw/xbox/nv2a/pgraph.c", line=5249, function=<optimized out>)
    at assert.c:92
#3  0x00007ffff64aa792 in __GI___assert_fail (
    assertion=assertion@entry=0x5555560c2e48 "surface != d->pgraph.zeta_binding", 
    file=file@entry=0x5555560c16e0 "../hw/xbox/nv2a/pgraph.c", line=line@entry=5249, 
    function=function@entry=0x5555560c0a10 <__PRETTY_FUNCTION__.95> "pgraph_surface_invalidate") at assert.c:101
#4  0x0000555555a8b6c5 in pgraph_surface_invalidate (d=d@entry=0x7fff794cb010, 
    surface=surface@entry=0x7ffec4c54580) at ../hw/xbox/nv2a/pgraph.c:5249
#5  0x0000555555a8dd05 in pgraph_surface_put (d=d@entry=0x7fff794cb010, 
    addr=<optimized out>, surface_in=surface_in@entry=0x7ffeeebf1fc0)
    at ../hw/xbox/nv2a/pgraph.c:5195
#6  0x0000555555a8e32b in pgraph_update_surface_part (d=d@entry=0x7fff794cb010, 
    upload=upload@entry=true, color=color@entry=true) at ../hw/xbox/nv2a/pgraph.c:5873
#7  0x0000555555a8eba2 in pgraph_update_surface (d=d@entry=0x7fff794cb010, 
    upload=upload@entry=true, color_write=color_write@entry=true, 
    zeta_write=zeta_write@entry=false) at ../hw/xbox/nv2a/pgraph.c:5969
#8  0x0000555555a91cc5 in pgraph_NV097_SET_BEGIN_END_handler (d=d@entry=0x7fff794cb010, 
    pg=pg@entry=0x7fff794edcd0, subchannel=subchannel@entry=0, method=method@entry=6140, 
    parameter=parameter@entry=6, parameters=parameters@entry=0x7ffee7d55ffc, 
    num_words_available=1, num_words_consumed=0x7ffeeebf2260, inc=true)
    at ../hw/xbox/nv2a/pgraph.c:2799
#9  0x0000555555a94309 in pgraph_method (d=d@entry=0x7fff794cb010, 
    subchannel=subchannel@entry=0, method=method@entry=6140, parameter=parameter@entry=6, 
    parameters=parameters@entry=0x7ffee7d55ffc, 
    num_words_available=num_words_available@entry=1, max_lookahead_words=1073314425, 
    inc=true) at ../hw/xbox/nv2a/pgraph.c:1118
#10 0x0000555555a8500f in pfifo_run_puller (d=d@entry=0x7fff794cb010, 
    method_entry=<optimized out>, parameter=6, parameters=0x7ffee7d55ffc, 
    num_words_available=1, max_lookahead_words=1073314425) at ../hw/xbox/nv2a/pfifo.c:226
#11 0x0000555555a851ae in pfifo_run_pusher (d=d@entry=0x7fff794cb010)
    at ../hw/xbox/nv2a/pfifo.c:337
#12 0x0000555555a85855 in pfifo_thread (arg=arg@entry=0x7fff794cb010)
    at ../hw/xbox/nv2a/pfifo.c:489
#13 0x0000555555c4ac3b in qemu_thread_start (args=0x7fff8019ee80)
    at ../util/qemu-thread-posix.c:541
#14 0x00007ffff6645eae in start_thread (arg=0x7ffeeebf6640) at pthread_create.c:463
#15 0x00007ffff65732ff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

@HadetTheUndying
Author

Adding a pgraph trace as well
pgraghtrace.txt

@abaire
Contributor

abaire commented May 17, 2022

It looks like the surface_put is evicting the overlapping zeta surface. Interestingly the zeta surface offset is set to 0 and depth testing and writes are both disabled, so it may be that the intent of the game is to ignore the zeta surface entirely and xemu is being unnecessarily protective of keeping a valid surface bound.

I can't tell from the pgraph log what the actual address of the zeta surface is, can you run again with

#define DBG_SURFACES 1
#define DBG_SURFACE_SYNC 1

and post the last 20 lines or so of output? I expect that it'll tell us the zeta surface is being evicted for overlapping, and it'll also tell us the actual vram addr.

@HadetTheUndying
Author

nv2a: Target: [COLOR @ 3a18000] (ln) aa:0 clip:x=0,w=1280,y=0,h=720
nv2a:  Match: [COLOR @ 3a18000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a:    Hit: [COLOR @ 3a18000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a: Target: [COLOR @ 3694000] (ln) aa:0 clip:x=0,w=1280,y=0,h=720
nv2a:  Match: [COLOR @ 3694000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a:    Hit: [COLOR @ 3694000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a: Target: [COLOR @ 3a18000] (ln) aa:0 clip:x=0,w=1280,y=0,h=720
nv2a:  Match: [COLOR @ 3a18000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a:    Hit: [COLOR @ 3a18000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a: Target: [COLOR @ 35c0000] (ln) aa:0 clip:x=0,w=256,y=0,h=256
nv2a: Create: [COLOR @ 35c0000 (256x256)] (ln) aa:0, clip:x=0,w=256,y=0,h=256
nv2a: [RAM->GPU] COLOR (lin) surface @ 35c0000 (w=256,h=256,p=1024,bpp=4)
nv2a: Target: [COLOR @ 3694000] (ln) aa:0 clip:x=0,w=1280,y=0,h=720
nv2a:  Match: [COLOR @ 3694000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a:    Hit: [COLOR @ 3694000 (1280x720)] (ln) aa:0, clip:x=0,w=1280,y=0,h=720
nv2a: Target: [ ZETA @ 2f7c000] (ln) aa:0 clip:x=0,w=1800,y=0,h=1800
nv2a: Create: [ ZETA @ 2f7c000 (1800x1800)] (ln) aa:0, clip:x=0,w=1800,y=0,h=1800
nv2a: [RAM->GPU] ZETA (lin) surface @ 2f7c000 (w=1800,h=1800,p=3648,bpp=2)
nv2a: [GPU->RAM] COLOR (lin) surface @ 35c0000 (w=256,h=256,p=1024,bpp=4)
nv2a: Target: [ ZETA @ 2f7c000] (ln) aa:0 clip:x=0,w=1800,y=0,h=1800
nv2a:  Match: [ ZETA @ 2f7c000 (1800x1800)] (ln) aa:0, clip:x=0,w=1800,y=0,h=1800
nv2a:    Hit: [ ZETA @ 2f7c000 (1800x1800)] (ln) aa:0, clip:x=0,w=1800,y=0,h=1800
nv2a: Target: [COLOR @ 303c000] (sz) aa:0 clip:x=0,w=64,y=0,h=64
nv2a: Evicting overlapping surface @ 2f7c000 (1800x1800)
nv2a: [GPU->RAM] ZETA (lin) surface @ 2f7c000 (w=1800,h=1800,p=3648,bpp=2)
xemu: ../hw/xbox/nv2a/pgraph.c:5249: pgraph_surface_invalidate: Assertion `surface != d->pgraph.zeta_binding' failed.
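
For reference, the "Evicting overlapping surface" line in this log is consistent with plain address arithmetic on the logged values: the ZETA surface at 2f7c000 with p=3648 and h=1800 spans up to 35bf200, which contains the new COLOR target at 303c000 and ends just below the COLOR surface at 35c0000. The following is a minimal sketch of that check in C. The struct and function names are hypothetical and are not xemu's own, the surface size is assumed to be pitch * height, and the pitch of the 64x64 surface is not in the log, so 256 bytes is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified description of a surface in VRAM.
 * Field names are illustrative and are not xemu's own structures. */
typedef struct {
    uint32_t addr;   /* VRAM start address */
    uint32_t pitch;  /* bytes per row */
    uint32_t height; /* number of rows */
} SurfaceRange;

/* One-past-the-end address, assuming the surface occupies
 * pitch * height bytes. */
uint32_t surface_end(const SurfaceRange *s)
{
    return s->addr + s->pitch * s->height;
}

/* Two half-open ranges [a, a_end) and [b, b_end) overlap when each one
 * starts before the other one ends. */
bool surfaces_overlap(const SurfaceRange *a, const SurfaceRange *b)
{
    return a->addr < surface_end(b) && b->addr < surface_end(a);
}

/* Values taken from the log above. */
const SurfaceRange zeta_2f7c000  = { 0x2F7C000, 3648, 1800 };
const SurfaceRange color_35c0000 = { 0x35C0000, 1024, 256 };
/* ASSUMPTION: pitch of 256 bytes (64 pixels * 4 bytes); not in the log. */
const SurfaceRange color_303c000 = { 0x303C000, 256, 64 };
```

Under these assumptions, `surfaces_overlap(&zeta_2f7c000, &color_303c000)` is true regardless of the assumed pitch, because the start address 303c000 already lies inside the ZETA range. This only shows why the cached surfaces collide by address; it says nothing about whether the game still intends the ZETA surface to be in use.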

@abaire
Contributor

abaire commented May 18, 2022

Thanks!

What I see from the pgraph log (filtered for relevance) is:

It sets Zeta to 0x2F7C000 in an operation where Color is set to 0. Color is masked off, depth is masked on.

NV097_SET_SURFACE_PITCH<0x20C> (0xE400E40)
NV097_SET_SURFACE_COLOR_OFFSET<0x210> (0x0)
NV097_SET_SURFACE_ZETA_OFFSET<0x214> (0x2F7C000)
NV097_SET_COLOR_MASK<0x358> (0x0 {Red:RO, Green:RO, Blue:RO, Alpha:RO})
NV097_SET_DEPTH_MASK<0x35C> (NV097_SET_DEPTH_MASK_V_TRUE<0x1>)
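
As a side note on reading these values, the NV097_SET_SURFACE_PITCH parameter can be unpacked into two pitches. The sketch below assumes the color pitch is in the low 16 bits and the zeta pitch in the high 16 bits; the macro and function names are illustrative, not copied from xemu. With 0xE400E40 both fields come out as 0xE40 = 3648, which matches the p=3648 reported for the ZETA surface in the surface debug output.

```c
#include <stdint.h>

/* ASSUMED field layout of the NV097_SET_SURFACE_PITCH parameter:
 * low 16 bits = color pitch, high 16 bits = zeta pitch.
 * Names are illustrative only. */
#define SURFACE_PITCH_COLOR_MASK 0x0000FFFFu
#define SURFACE_PITCH_ZETA_MASK  0xFFFF0000u
#define SURFACE_PITCH_ZETA_SHIFT 16

/* Color surface pitch in bytes. */
uint32_t surface_pitch_color(uint32_t param)
{
    return param & SURFACE_PITCH_COLOR_MASK;
}

/* Zeta (depth/stencil) surface pitch in bytes. */
uint32_t surface_pitch_zeta(uint32_t param)
{
    return (param & SURFACE_PITCH_ZETA_MASK) >> SURFACE_PITCH_ZETA_SHIFT;
}
```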

A large number of draw operations occur (presumably it's building up a depth map for shadows or something)

It then briefly sets

NV097_SET_SURFACE_COLOR_OFFSET<0x210> (0x3694000)
NV097_SET_SURFACE_ZETA_OFFSET<0x214> (0x2BF8000)
NV097_SET_DEPTH_TEST_ENABLE<0x30C> (NV097_SET_DEPTH_TEST_ENABLE_V_TRUE<0x1>)
NV097_SET_STENCIL_TEST_ENABLE<0x32C> (NV097_SET_STENCIL_TEST_ENABLE_V_FALSE<0x0>)
NV097_SET_COLOR_MASK<0x358> (0x1010101 {Red:W, Green:W, Blue:W, Alpha:W})

but sets Color to 0x303C000, Zeta to 0 and disables zeta writing/testing before doing any actual drawing.

NV097_SET_SURFACE_COLOR_OFFSET<0x210> (0x303C000)
NV097_SET_SURFACE_ZETA_OFFSET<0x214> (0x0)
NV097_SET_DEPTH_TEST_ENABLE<0x30C> (NV097_SET_DEPTH_TEST_ENABLE_V_TRUE<0x1>)
NV097_SET_STENCIL_TEST_ENABLE<0x32C> (NV097_SET_STENCIL_TEST_ENABLE_V_FALSE<0x0>)
NV097_SET_DEPTH_MASK<0x35C> (NV097_SET_DEPTH_MASK_V_FALSE<0x0>)
NV097_SET_DEPTH_TEST_ENABLE<0x30C> (NV097_SET_DEPTH_TEST_ENABLE_V_FALSE<0x0>)

As you can see from the last output before the assertion, it is attempting to evict 2f7c000, but for some reason it is still bound as the zeta surface despite the fact that the offset was set to 0. I assume this is the ultimate cause of the issue and will see if I can craft a test case to exercise this assertion.
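
To make the suspected failure mode concrete, here is a toy model in C. It is a sketch under loud assumptions: every type and function is hypothetical and far simpler than xemu's pgraph code, and it is not a description of how the issue was eventually fixed. It only shows that a surface which is still recorded as the zeta binding cannot be evicted without tripping a check of the same shape as the failing assertion, and that dropping a binding whose address no longer matches the programmed offset lets the eviction proceed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; not xemu's data structures. */
typedef struct {
    uint32_t addr; /* VRAM address the surface was created at */
    bool valid;    /* false once the surface has been evicted */
} ToySurface;

typedef struct {
    ToySurface *zeta_binding; /* surface currently bound as zeta, or NULL */
    uint32_t zeta_offset;     /* last value written to the zeta offset */
} ToyPgraph;

/* Same shape as the failing check: a surface that is still bound must
 * not be invalidated. Returns false where the real code would assert. */
bool toy_can_invalidate(const ToyPgraph *pg, const ToySurface *s)
{
    return s != pg->zeta_binding;
}

/* Evict a surface if allowed; report whether the eviction happened. */
bool toy_evict(ToyPgraph *pg, ToySurface *s)
{
    if (!toy_can_invalidate(pg, s)) {
        return false;
    }
    s->valid = false;
    return true;
}

/* HYPOTHETICAL remedy: drop the binding when the bound surface's address
 * no longer matches the offset the guest programmed. */
void toy_unbind_stale_zeta(ToyPgraph *pg)
{
    if (pg->zeta_binding != NULL &&
        pg->zeta_binding->addr != pg->zeta_offset) {
        pg->zeta_binding = NULL;
    }
}
```

In this model, a surface created at 0x2F7C000 that stays bound after the offset is set to 0 cannot be evicted until `toy_unbind_stale_zeta` has run, which mirrors the sequence seen in the log.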

@Triticum0

Note to abaire: once you figure out the issue, please also close #349 and #405.

@Triticum0

Triticum0 commented May 18, 2022

Also the original issue #116

@abaire
Contributor

abaire commented May 18, 2022

> Also the original issue #116

I think this is likely a separate problem from #116. From what I see in the pgraph log, there isn't really any overlap at all and xemu is erroneously asserting.

@abaire
Contributor

abaire commented May 19, 2022

I was able to create a test case that reproduces the conditions I see from the pgraph log (and in #932)
Test case
HW results

@HadetTheUndying
Author

> I was able to create a test case that reproduces the conditions I see from the pgraph log (and in #932) Test case HW results

Can confirm Line of Contact gets in game. The performance is awful at the moment, but the assert is no longer happening.
