Reconstruct "erased" tags from DWARF debugging info #51
src/util/dwarf.rs
Outdated
- FormKind::Addr => AttributeValue::Address(u32::from_reader(reader, e)?),
- FormKind::Ref => AttributeValue::Reference(u32::from_reader(reader, e)?),
+ FormKind::Addr => AttributeValue::Address(u32::from_reader(reader, Endian::Big)?),
+ FormKind::Ref => AttributeValue::Reference(u32::from_reader(reader, Endian::Big)?),
hmm, I think we might have to pass in both the "default endianness" and the "current endianness" for this
I think this will break PS2 DWARF, which is little endian. Can we fix this?
I fixed this by plumbing both "data" and "address" endianness through but I worry it's getting a bit messy. I tested this on Sonic Heroes and the output is identical (and unfortunately it did not discover any erased tags).
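For readers following the endianness discussion, here is a minimal sketch of what carrying separate "data" and "address" byte orders could look like. It is not dtk's actual code: the `Endianness` struct and the `read_u32`, `read_address`, and `read_data4` helpers are hypothetical names invented for illustration, and only the idea of two independent byte orders comes from the comment above.

```rust
/// Byte order of a multi-byte field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Endian {
    Big,
    Little,
}

/// Hypothetical pair of byte orders: one for ordinary data fields,
/// one for address and reference forms.
#[derive(Debug, Clone, Copy)]
pub struct Endianness {
    pub data: Endian,
    pub addr: Endian,
}

/// Reads a u32 from the first four bytes of `bytes` in the given byte order.
/// Returns None if fewer than four bytes are available.
pub fn read_u32(bytes: &[u8], e: Endian) -> Option<u32> {
    if bytes.len() < 4 {
        return None;
    }
    let raw = [bytes[0], bytes[1], bytes[2], bytes[3]];
    Some(match e {
        Endian::Big => u32::from_be_bytes(raw),
        Endian::Little => u32::from_le_bytes(raw),
    })
}

/// Reads an address-form value using the address byte order.
pub fn read_address(bytes: &[u8], e: Endianness) -> Option<u32> {
    read_u32(bytes, e.addr)
}

/// Reads a four-byte data value using the data byte order.
pub fn read_data4(bytes: &[u8], e: Endianness) -> Option<u32> {
    read_u32(bytes, e.data)
}
```

With this shape, a fully little-endian target such as PS2 would set both fields to `Endian::Little`, while a parser for erased tags could set `data` to little-endian and keep `addr` big-endian, so neither case needs a hardcoded `Endian::Big`.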
Cool stuff! Recovering partially overwritten data is a bit complicated, so I have a few questions while trying to follow the logic.
src/util/dwarf.rs
pub fn parse_int(value: u16) -> Result<Self, TryFromPrimitiveError<Self>> {
    if value >> 8 == 0x1 {
        // Can appear in erased tags
        Self::try_from(value & 0xFF)
Interesting, can the user types not appear here?
User types can still appear here, this is a hack to parse these values:
FundType::parse_int: 0101
FundType::parse_int: 0103
FundType::parse_int: 0105
FundType::parse_int: 0106
FundType::parse_int: 0107
FundType::parse_int: 0108
FundType::parse_int: 0109
FundType::parse_int: 010E
FundType::parse_int: 010F
Maybe it would be more robust to add enum variants for all of these explicitly, like `MwSignedShort = 0x0105`?
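To make the two options concrete, here is a small sketch comparing the masking hack with explicit variants. It uses a plain `match` instead of the `TryFromPrimitive` derive so that it compiles on its own. The variant subset, the `from_raw` and `parse_masked` names, and the numeric values other than `0x0105` are illustrative assumptions, not dtk's actual enum.

```rust
/// Illustrative subset of fundamental types (not dtk's full enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FundType {
    Char,
    SignedShort,
    Float,
    /// Hypothetical explicit variant for a value seen in erased tags.
    MwSignedShort,
}

impl FundType {
    /// Maps an exact raw value to a variant, with no masking.
    pub fn from_raw(value: u16) -> Option<Self> {
        match value {
            0x0001 => Some(Self::Char),
            0x0005 => Some(Self::SignedShort),
            0x000E => Some(Self::Float),
            0x0105 => Some(Self::MwSignedShort),
            _ => None,
        }
    }

    /// Masking approach: treat 0x01xx as the standard type 0x00xx.
    pub fn parse_masked(value: u16) -> Option<Self> {
        if value >> 8 == 0x1 {
            Self::from_raw(value & 0xFF)
        } else {
            Self::from_raw(value)
        }
    }
}
```

The masked path folds `0x0105` into `SignedShort`, which keeps the enum small but loses the fact that the value came from an erased tag. The explicit path keeps that distinction at the cost of one extra variant per observed value.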
}

#[derive(Debug, Eq, PartialEq, Copy, Clone, IntoPrimitive, TryFromPrimitive)]
#[repr(u8)]
pub enum Modifier {
    MwPointerTo = 0x00, // Used in erased tags
How does this happen?
There are 3 values we need to handle in the OOT elf:
`00`: used for all pointers except `signed int` (not sure what to call this)
`80`: used for pointers to `signed int` (maybe should be called `MwPointerTo`?)
`82`: not sure, it's only used twice, for a local variable `signed int & pDL` in two different functions (maybe should be called `MwReferenceTo`? I'm still confused about pointer vs reference here)
Looks good. Thanks for the PR!
The DWARF info for the OOT GC emulator has useful info that has been partially overwritten by padding tags. Specifically, the overwritten info seems to be tags from things stripped by the linker, mainly unused functions or functions that have been inlined away. So far this has been incredibly helpful for figuring out potential inline functions when decomping.
I learned about this because https://github.com/LuigiBlood/dwarfone attempts to include these tags too. It requires some heuristics to parse because the format is not standard, and the first tag's type and length have been overwritten by the padding tag. It's also little-endian, unlike the rest of the tags, and there seems to be some Metrowerks-specific stuff too.
I tried to improve the heuristics a little when porting this to dtk. I have no idea if it works for other games, other compiler versions, or C++, so erased tag parsing is hidden behind an `--include-erased` flag. Currently only erased functions are output; erased local variables seem to be things like `static double _half$localstatic0$sqrtf__Ff`, which are more confusing than useful IMO.
Is "erased" a good name for this? Maybe "stripped" or something is better.
Example: