Ghidra analysis misinterprets the IP-relative x86 calls #6684

Open
NancyAurum opened this issue Jul 1, 2024 · 1 comment
NancyAurum commented Jul 1, 2024

Describe the bug
Near call analysis fails when the target offset, computed by adding the operand (the 16-bit displacement) to the offset of the instruction's end, comes out as 0.
Instead of offset 0, Ghidra comes up with some unrelated address as the call target.
For example:

6ffe:0579 e8 84 fa

0x579 + 3 - (0xFFFF - 0xFA84 + 1) = 0, so the target should be 6ffe:0000, but Ghidra jumps to 7e7c:181f
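
The arithmetic above can be sketched in plain Python (outside Ghidra). This is only an illustration of how a 3-byte `E8 rel16` near call resolves in 16-bit code; the function name `near_call_target` is hypothetical and not part of any Ghidra API.

```python
def near_call_target(call_offset, insn_bytes):
    """Return the 16-bit target offset of a 3-byte E8 rel16 near call."""
    if len(insn_bytes) != 3 or insn_bytes[0] != 0xE8:
        raise ValueError("expected a 3-byte E8 rel16 call")
    # The displacement is stored little-endian after the opcode byte.
    disp = insn_bytes[1] | (insn_bytes[2] << 8)
    # Target = offset of the next instruction + displacement, wrapped to 16 bits.
    return (call_offset + 3 + disp) & 0xFFFF

# The call from this report: 6ffe:0579  e8 84 fa
target = near_call_target(0x0579, [0xE8, 0x84, 0xFA])
print("{:04X}".format(target))  # prints 0000
```

Adding the unsigned displacement and masking with 0xFFFF is equivalent to subtracting its two's complement, which is the form used in the line above.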

To Reproduce
Steps to reproduce the behavior:

  1. Analyze the attached st.exe with the default settings
  2. Press G and enter 6ffe:0579

Expected behavior
Ghidra should do the arithmetic correctly, at least in 16-bit x86 code, where offset 0 (a NULL near pointer) can hold legitimate code.

Attachments
st.zip

Environment (please complete the following information):

Additional context
It doesn't appear often, so the current workaround is to override the reference manually. A simple Python script could be written to sanity-check all near calls that Ghidra has resolved. The bug appears to be related to segmented memory models: in flat models the NULL pointer is treated specially, so Ghidra doesn't expect calls to go there.
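
For context on the segmented model, a real-mode `seg:ofs` pair maps to the linear address `seg * 16 + ofs`. The sketch below (plain Python, with a hypothetical helper `linear`) converts the expected target 6ffe:0000 and the address Ghidra reported, 7e7c:181f, to linear form. It is only an observation about the numbers in this report, not a confirmed diagnosis of the bug.

```python
def linear(seg, ofs):
    """Real-mode linear address of seg:ofs (20-bit wraparound ignored)."""
    return (seg << 4) + (ofs & 0xFFFF)

expected = linear(0x6FFE, 0x0000)  # where the call should go
reported = linear(0x7E7C, 0x181F)  # where Ghidra says it goes

print("{:05X} {:05X} {:04X}".format(expected, reported, reported - expected))
# prints 6FFE0 7FFDF FFFF
```

The two addresses differ by exactly 0xFFFF, so the reported target is the same linear location as 6ffe:ffff.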

@NancyAurum

OK. I had nearly a hundred such near calls across several functions, so I wrote two scripts (with the help of ChatGPT, which somehow has expert knowledge of Ghidra).

One script checks whether such calls are present, and the other fixes them. Please be sure to back up your project before running any such scripts.

#### This script checks if Ghidra misgenerated near call references ####

def ubytes(bs):
  # Ghidra returns signed Java bytes; convert them to an indexable list of unsigned ints
  return [b & 0xff for b in bs]

def check_near_calls():
  instructions = currentProgram.getListing().getInstructions(True)
  while instructions.hasNext():
    instruction = instructions.next()
    if instruction.getMnemonicString() != "CALL":
      continue
    if not instruction.getDefaultOperandRepresentation(0).startswith("0x"):
      continue
    ibs = ubytes(instruction.getBytes())
    # Only handle the 3-byte 16-bit form: E8 followed by a little-endian rel16
    if len(ibs) != 3 or ibs[0] != 0xE8:
      continue
    call_address = instruction.getAddress()
    cseg = call_address.getSegment()
    cofs = call_address.getSegmentOffset()
    disp = ibs[2]*0x100 + ibs[1]
    # Target = offset of the next instruction + displacement, wrapped to 16 bits
    proper_ofs = (cofs+3 + disp)&0xFFFF
    for ref in getReferencesFrom(call_address):
      if ref.getReferenceType().toString() == "UNCONDITIONAL_CALL":
        adr = ref.getToAddress()
        seg = adr.getSegment()
        ofs = adr.getSegmentOffset()
        if ofs != proper_ofs:
          print("Call at {:04X}:{:04X}".format(cseg,cofs))
          print("  target is {:04X}:{:04X} but should be {:04X}:{:04X}"
            .format(seg,ofs,cseg,proper_ofs))

check_near_calls()

#### This script fixes misgenerated near call references ####
from ghidra.program.model.symbol import RefType, SourceType

# Required to add references
reference_manager = currentProgram.getReferenceManager()

def ubytes(bs):
  # Ghidra returns signed Java bytes; convert them to an indexable list of unsigned ints
  return [b & 0xff for b in bs]

def fix_near_calls():
  instructions = currentProgram.getListing().getInstructions(True)
  while instructions.hasNext():
    instruction = instructions.next()
    if instruction.getMnemonicString() != "CALL":
      continue
    if not instruction.getDefaultOperandRepresentation(0).startswith("0x"):
      continue
    ibs = ubytes(instruction.getBytes())
    # Only handle the 3-byte 16-bit form: E8 followed by a little-endian rel16
    if len(ibs) != 3 or ibs[0] != 0xE8:
      continue
    call_address = instruction.getAddress()
    cseg = call_address.getSegment()
    cofs = call_address.getSegmentOffset()
    disp = ibs[2]*0x100 + ibs[1]
    # Target = offset of the next instruction + displacement, wrapped to 16 bits
    proper_ofs = (cofs+3 + disp)&0xFFFF
    refs = getReferencesFrom(call_address)
    needs_fix = False
    for ref in refs:
      if ref.getReferenceType().toString() == "UNCONDITIONAL_CALL":
        adr = ref.getToAddress()
        if adr.getSegment() != cseg or adr.getSegmentOffset() != proper_ofs:
          needs_fix = True
    if needs_fix:
      # Likely all references are invalid, so remove every reference from this call
      for ref in refs:
        removeReference(ref)
      # Create the correct reference (same segment, recomputed offset)
      proper_adr = toAddr((cseg << 4) + proper_ofs)
      reference_manager.addMemoryReference(call_address, proper_adr, RefType.UNCONDITIONAL_CALL, SourceType.USER_DEFINED, 0)

fix_near_calls()
