Reconstruct "erased" tags from DWARF debugging info #51

Merged
merged 5 commits into from
May 16, 2024

Conversation

cadmic
Contributor

@cadmic cadmic commented May 14, 2024

The DWARF info for the OOT GC emulator has useful info that has been partially overwritten by padding tags. Specifically, the overwritten info seems to be tags for things stripped by the linker, mainly unused functions or functions that have been inlined away. This has been incredibly helpful so far for figuring out potential inline functions when decomping.

I learned about this because https://github.com/LuigiBlood/dwarfone attempts to include these tags too. Parsing them requires some heuristics because the format is not standard and the first tag's type and length have been overwritten by the padding tag. The erased data is also little-endian, unlike the rest of the tags, and there seems to be some Metrowerks-specific stuff too.

I tried to improve the heuristics a little when porting this to dtk. I have no idea if it works for other games, other compiler versions, or C++, so erased tag parsing is hidden behind an --include-erased flag. Currently only erased functions are output; erased local variables seem to be things like static double _half$localstatic0$sqrtf__Ff, which are more confusing than useful IMO.

Is "erased" a good name for this? Maybe "stripped" or something is better.

Example:

$ dtk dwarf dump --include-erased oot-gc/SIM_original.elf
...
// Erased
static int xlFileLoad(char * szFileName /* r30 */, void * ppBuffer /* r31 */) {
    // Local variables
    int nSize; // r1+0x18
    struct tXL_FILE * pFile; // r1+0x14

    // References
    // -> static int (* gpfOpen)(char *, struct DVDFileInfo *);
    // -> struct _XL_OBJECTTYPE gTypeFile;
}
...

Comment on lines 556 to 648
- FormKind::Addr => AttributeValue::Address(u32::from_reader(reader, e)?),
- FormKind::Ref => AttributeValue::Reference(u32::from_reader(reader, e)?),
+ FormKind::Addr => AttributeValue::Address(u32::from_reader(reader, Endian::Big)?),
+ FormKind::Ref => AttributeValue::Reference(u32::from_reader(reader, Endian::Big)?),
Contributor Author

hmm, I think we might have to pass in both the "default endianness" and the "current endianness" for this

Owner

I think this will break PS2 DWARF, which is little endian. Can we fix this?

Contributor Author

I fixed this by plumbing both "data" and "address" endianness through but I worry it's getting a bit messy. I tested this on Sonic Heroes and the output is identical (and unfortunately it did not discover any erased tags).
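
To make the "data" versus "address" split concrete, here is a minimal sketch of the idea, not the code from this PR. The `Endianness` struct and the `read_u32` and `read_data_then_addr` helpers are hypothetical names for illustration. The assumption is that erased tags on GC store ordinary values little-endian while addresses stay big-endian, whereas PS2 DWARF is little-endian for both.

```rust
use std::io::{self, Read};

/// Byte order of a single field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Endian {
    Big,
    Little,
}

/// Hypothetical pair of byte orders carried through the parser:
/// `data` for ordinary attribute values, `addr` for address/reference forms.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Endianness {
    pub data: Endian,
    pub addr: Endian,
}

/// Read a `u32` in the requested byte order.
pub fn read_u32<R: Read>(reader: &mut R, e: Endian) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?;
    Ok(match e {
        Endian::Big => u32::from_be_bytes(buf),
        Endian::Little => u32::from_le_bytes(buf),
    })
}

/// Read one 32-bit data value followed by one 32-bit address,
/// each with its own byte order.
pub fn read_data_then_addr<R: Read>(
    reader: &mut R,
    e: Endianness,
) -> io::Result<(u32, u32)> {
    let data = read_u32(reader, e.data)?;
    let addr = read_u32(reader, e.addr)?;
    Ok((data, addr))
}
```

With this shape, a PS2 file would pass `Little` for both fields, and a regular GC tag would pass `Big` for both, so only erased tags need a mixed pair.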

src/util/dwarf.rs (outdated, resolved)
Owner

@encounter encounter left a comment

Cool stuff! Recovering partially overwritten data is a bit complicated, so I have a few questions while trying to follow the logic.

pub fn parse_int(value: u16) -> Result<Self, TryFromPrimitiveError<Self>> {
    if value >> 8 == 0x1 {
        // Can appear in erased tags
        Self::try_from(value & 0xFF)
Owner

Interesting, can the user types not appear here?

Contributor Author

User types can still appear here; this is a hack to parse these values:

FundType::parse_int: 0101
FundType::parse_int: 0103
FundType::parse_int: 0105
FundType::parse_int: 0106
FundType::parse_int: 0107
FundType::parse_int: 0108
FundType::parse_int: 0109
FundType::parse_int: 010E
FundType::parse_int: 010F

maybe it would be more robust to add enums for all these explicitly, like MwSignedShort = 0x0105?
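
For comparison with the mask hack, the explicit-variant idea could look roughly like the sketch below. It is not the code from this PR: it covers only three of the observed values, `MwSignedShort` is the only name taken from the comment above, and the other variant names, the hand-written `TryFrom` impl (instead of a derive), and the `base` helper are illustrative assumptions. It also assumes a `0x01xx` value means the same type as the standard value with the same low byte, which is what the mask implies.

```rust
use std::convert::TryFrom;

/// A few fundamental types, plus explicit variants for the 0x01xx
/// values seen in erased tags (variant names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FundType {
    SignedShort,   // 0x0005
    Integer,       // 0x0007
    Float,         // 0x000E
    MwSignedShort, // 0x0105
    MwInteger,     // 0x0107
    MwFloat,       // 0x010E
}

impl TryFrom<u16> for FundType {
    /// The unrecognized raw value is returned as the error.
    type Error = u16;

    fn try_from(value: u16) -> Result<Self, Self::Error> {
        Ok(match value {
            0x0005 => FundType::SignedShort,
            0x0007 => FundType::Integer,
            0x000E => FundType::Float,
            0x0105 => FundType::MwSignedShort,
            0x0107 => FundType::MwInteger,
            0x010E => FundType::MwFloat,
            other => return Err(other),
        })
    }
}

impl FundType {
    /// Map an erased-tag variant to the standard type with the same
    /// low byte (assumed equivalent); other variants are unchanged.
    pub fn base(self) -> FundType {
        match self {
            FundType::MwSignedShort => FundType::SignedShort,
            FundType::MwInteger => FundType::Integer,
            FundType::MwFloat => FundType::Float,
            other => other,
        }
    }
}
```

The trade-off is that explicit variants reject any `0x01xx` value that has not been observed, whereas the mask accepts every low byte that happens to be a valid standard type.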

}

#[derive(Debug, Eq, PartialEq, Copy, Clone, IntoPrimitive, TryFromPrimitive)]
#[repr(u8)]
pub enum Modifier {
    MwPointerTo = 0x00, // Used in erased tags
Owner

How does this happen?

Contributor Author

@cadmic cadmic May 16, 2024

There are 3 values we need to handle in the OOT elf:

  • 00: used for all pointers except signed int (not sure what to call this)
  • 80: used for pointers to signed int (maybe should be called MwPointerTo?)
  • 82: not sure, it's only used twice, for a local variable signed int & pDL in two different functions (maybe should be called MwReferenceTo? I'm still confused about pointer vs reference here)
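
Since the meaning of these bytes is still uncertain, one option is to decode them under neutral names until the semantics are settled. The sketch below only restates the three observations above. `ErasedModifier`, its variant names, and `parse_erased_modifier` are hypothetical and are not part of this PR.

```rust
/// Modifier bytes observed in erased tags, under placeholder names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErasedModifier {
    /// 0x00: used for all pointers except pointers to signed int.
    Pointer,
    /// 0x80: used for pointers to signed int.
    PointerToSignedInt,
    /// 0x82: seen only twice; pointer vs. reference is unclear.
    Unknown82,
}

/// Decode one modifier byte from an erased tag.
/// Returns `None` for any value not listed above.
pub fn parse_erased_modifier(byte: u8) -> Option<ErasedModifier> {
    match byte {
        0x00 => Some(ErasedModifier::Pointer),
        0x80 => Some(ErasedModifier::PointerToSignedInt),
        0x82 => Some(ErasedModifier::Unknown82),
        _ => None,
    }
}
```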

@encounter
Owner

Looks good. Thanks for the PR!

@encounter encounter merged commit 876b78b into encounter:main May 16, 2024
16 checks passed