opaque pointer support (type inference) for c #1323

Merged
merged 77 commits into from
Jan 25, 2024
Changes from 1 commit
Commits
f3761ea
find type assert
jumormt Jan 6, 2024
18970cb
add assertion checking num field
jumormt Jan 6, 2024
40b8def
fix indirect call does not have called func
jumormt Jan 6, 2024
19fc337
add type check option
jumormt Jan 8, 2024
f57fc37
add type check option && first version of collect types
jumormt Jan 8, 2024
f38e36a
add "store (value operand) <- gep" rule
jumormt Jan 8, 2024
9c8d789
add inter-procedural rule
jumormt Jan 8, 2024
8e13066
add some comments
jumormt Jan 8, 2024
5144f86
fix call base bug
jumormt Jan 8, 2024
fae7e57
add callsite to callee inference
jumormt Jan 8, 2024
8934f21
do not propagate to declare func
jumormt Jan 8, 2024
4fcfafe
fix vararg
jumormt Jan 8, 2024
14efeb5
refactor
jumormt Jan 9, 2024
27fb0ea
use hash map to cache results
jumormt Jan 9, 2024
2e6d12d
refine
jumormt Jan 9, 2024
6f62afb
iterative memorizing dfs
jumormt Jan 9, 2024
7279bee
iterative memorizing dfs
jumormt Jan 9, 2024
5edb5f2
fix bug
jumormt Jan 9, 2024
3138be2
add comments and fix bug for c++20
jumormt Jan 9, 2024
7cad38a
fix deadloop
jumormt Jan 9, 2024
cc5f2aa
fix deadloop
jumormt Jan 9, 2024
28b941e
refine coding
jumormt Jan 9, 2024
f1cab1a
refactor inferTypeOfHeapObjOrStaticObj
jumormt Jan 9, 2024
8c838bf
refactor inferTypeOfHeapObjOrStaticObj
jumormt Jan 9, 2024
87b671c
add a new typeinference class
jumormt Jan 10, 2024
d38f373
refactor
jumormt Jan 10, 2024
076ca38
fix bug
jumormt Jan 10, 2024
8d15f3f
add debug support
jumormt Jan 10, 2024
cf89135
add default type
jumormt Jan 10, 2024
4712b6b
move getptrelementtype to test
jumormt Jan 10, 2024
550ebb1
move getClassNameOfThisPtr to chg builder and add diff test
jumormt Jan 10, 2024
3063af8
remove getelementptr in SVFExt and llvmmodule
jumormt Jan 10, 2024
376d729
return max field for ptr type
jumormt Jan 11, 2024
fd0e4d4
default for heap is i8
jumormt Jan 11, 2024
9a89761
for param of ext api, first find source and then derive type
jumormt Jan 11, 2024
868e023
refactor getOrInferLLVMObjType
jumormt Jan 11, 2024
9f45f32
skip global function value -> callsite when forward infer type
jumormt Jan 11, 2024
578a379
update comment
jumormt Jan 11, 2024
ef078d7
fix passing a function as a param
jumormt Jan 11, 2024
8889fd0
rename typesizedifftest, remove unnecessary getfirstcast
jumormt Jan 12, 2024
ef8ccd1
remove getptrelement and getpointerto in svf/*
jumormt Jan 12, 2024
058f626
ptr in svf main
jumormt Jan 12, 2024
b13b37f
thisptr class name prepare
jumormt Jan 12, 2024
8639716
infer type based on c++ constructor
jumormt Jan 13, 2024
ec19b79
add comments
jumormt Jan 13, 2024
1d33ec4
getOrInferThisPtrClassName update
jumormt Jan 13, 2024
d17948b
refine type diff test
jumormt Jan 14, 2024
5f35694
refine type diff test
jumormt Jan 14, 2024
1d52b55
fix indirect call passing
jumormt Jan 14, 2024
f854a53
separate cpp source and allocation
jumormt Jan 15, 2024
1857464
refactor iscpp constructor
jumormt Jan 15, 2024
a243d10
update
jumormt Jan 15, 2024
8efc471
for c++ fw type inference, consider constructor for now
jumormt Jan 15, 2024
04d89a8
update
jumormt Jan 15, 2024
f3a4f8e
delete c++
jumormt Jan 15, 2024
6cf0f73
delete c++
jumormt Jan 15, 2024
146fbd8
update based on comments
jumormt Jan 16, 2024
ab8963d
rename
jumormt Jan 16, 2024
ab04330
refactor
jumormt Jan 17, 2024
0aeeacb
reformat
jumormt Jan 17, 2024
76cdfc3
refactor
jumormt Jan 17, 2024
22dd059
move typeinference to LLVMModuleSet
jumormt Jan 21, 2024
12b3431
release llvm modouleset immediately after svfir is built
jumormt Jan 21, 2024
7352b98
remove static method
jumormt Jan 21, 2024
c363ad3
fix ander diff test
jumormt Jan 21, 2024
6ffb132
fix saber
jumormt Jan 21, 2024
3cdc19e
bytesize default 1
jumormt Jan 24, 2024
37f43f2
fix wpa release llvm module
jumormt Jan 24, 2024
8118f53
fix mac CI
jumormt Jan 24, 2024
eb34d0a
delete cpp
jumormt Jan 24, 2024
a176ca5
move release llvmmoduleset to the end of main
jumormt Jan 24, 2024
63aa46e
merge SVF master into opaque c
jumormt Jan 24, 2024
4d2f565
rename typeinference to objtypeinference
jumormt Jan 25, 2024
82f2e31
enable type check by default
jumormt Jan 25, 2024
9e61dd0
refactor
jumormt Jan 25, 2024
b0980a0
rename cpp and add ctest type infer CI
jumormt Jan 25, 2024
d620c42
add some assertions for npd
jumormt Jan 25, 2024
return max field for ptr type
jumormt committed Jan 24, 2024
commit 376d729400c7ff04a9012b46caed1f09a88e2259
2 changes: 1 addition & 1 deletion svf-llvm/include/SVF-LLVM/SymbolTableBuilder.h
@@ -88,7 +88,7 @@ class SymbolTableBuilder
std::unique_ptr<TypeInference> & getTypeInference();
Review comment (Collaborator): manual release


/// Forward collect all possible infer sites starting from a value
const Type* getOrInferLLVMObjType(const Value *startValue);
const Type* fwGetOrInferLLVMObjType(const Value *startValue);

/// Get the reference type of heap/static object from an allocation site.
//@{
23 changes: 18 additions & 5 deletions svf-llvm/include/SVF-LLVM/TypeInference.h
@@ -36,13 +36,19 @@ namespace SVF {
class TypeInference {

public:
typedef Map<const Value *, Set<const Value *>> ValueToInferSites;
typedef Set<const Value *> ValueSet;
typedef Map<const Value *, ValueSet> ValueToValueSet;
typedef ValueToValueSet ValueToInferSites;
typedef ValueToValueSet ValueToSources;
typedef Map<const Value *, const Type *> ValueToType;
typedef std::pair<const Value *, bool> ValueBoolPair;


private:
static std::unique_ptr<TypeInference> _typeInference;
Review comment (Collaborator): remove unique_ptr

ValueToInferSites _valueToInferSites; // value inference site cache
ValueToType _valueToType; // value type cache
ValueToSources _valueToSources; // value sources cache

explicit TypeInference() = default;

@@ -59,7 +65,10 @@ class TypeInference {
}

/// Forward collect all possible infer sites starting from a value
const Type *getOrInferLLVMObjType(const Value *startValue);
const Type *fwGetOrInferLLVMObjType(const Value *startValue);

/// Backward collect all possible sources starting from a value
Set<const Value*> bwGetOrfindSourceVals(const Value * startValue);

/// Validate type inference
void validateTypeCheck(const CallBase *cs);
@@ -68,12 +77,16 @@ class TypeInference {

void typeDiffTest(const PointerType *oPTy, const Type *iTy, const Value *val);

/// Default type
const Type *defaultTy(const Value *val);

protected:
private:
static const Type *infersiteToType(const Value *val);

/// Default type
const Type *defaultTy(const Value *val);
inline bool isSourceVal(const Value* val) const {
return LLVMUtil::isObject(val);
}

};
}
#endif //SVF_TYPEINFERENCE_H
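
The two review comments on this header ("manual release" and "remove unique_ptr") suggest replacing the static `std::unique_ptr` holder with a plain static pointer that is released explicitly. A minimal sketch of that pattern, assuming a hypothetical stand-in class `InferenceCache` with string keys (none of these names are SVF's actual API):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the inference class; not SVF's real interface.
class InferenceCache {
public:
    // Lazily create the single instance on first use.
    static InferenceCache* get() {
        if (instance == nullptr) instance = new InferenceCache();
        return instance;
    }

    // Manual release: the caller decides when the caches are destroyed.
    static void release() {
        delete instance;
        instance = nullptr;
    }

    // Record an inferred type for a value (both modelled as strings here).
    void remember(const std::string& value, const std::string& type) {
        valueToType[value] = type;
    }

    // Report whether a type has been cached for the value.
    bool knows(const std::string& value) const {
        return valueToType.count(value) != 0;
    }

private:
    InferenceCache() = default;
    InferenceCache(const InferenceCache&) = delete;
    InferenceCache& operator=(const InferenceCache&) = delete;

    static InferenceCache* instance;
    std::map<std::string, std::string> valueToType;
};

InferenceCache* InferenceCache::instance = nullptr;
```

With this shape, calling `InferenceCache::release()` drops every cached result at a point the caller chooses, and the next `get()` starts from an empty cache. Whether the PR ended up with exactly this form is not shown in this commit.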
2 changes: 1 addition & 1 deletion svf-llvm/lib/CHGBuilder.cpp
@@ -679,7 +679,7 @@ std::string CHGBuilder::getClassNameOfThisPtr(const CallBase* inst)
{
const Value* thisPtr = LLVMUtil::getVCallThisPtr(inst);
if (const PointerType *ptrTy = SVFUtil::dyn_cast<PointerType>(thisPtr->getType())) {
const Type *objTy = TypeInference::getTypeInference()->getOrInferLLVMObjType(thisPtr);
const Type *objTy = TypeInference::getTypeInference()->fwGetOrInferLLVMObjType(thisPtr);
TypeInference::getTypeInference()->typeDiffTest(ptrTy, objTy, thisPtr);
// TODO: getPtrElementType need type inference
if (const StructType *st = SVFUtil::dyn_cast<StructType>(getPtrElementType(ptrTy))) {
4 changes: 3 additions & 1 deletion svf-llvm/lib/SVFIRExtAPI.cpp
@@ -44,7 +44,9 @@ const Type* SVFIRBuilder::getBaseTypeAndFlattenedFields(const Value* V, std::vec
{
assert(V);
const Value* value = getBaseValueForExtArg(V);
const Type *objType = TypeInference::getTypeInference()->getOrInferLLVMObjType(value);
// Set<const Value *> sources = TypeInference::getTypeInference()->bwGetOrfindSourceVals(value);

const Type *objType = TypeInference::getTypeInference()->fwGetOrInferLLVMObjType(value);
u32_t numOfElems = pag->getSymbolInfo()->getNumOfFlattenElements(LLVMModuleSet::getLLVMModuleSet()->getSVFType(objType));
/// use user-specified size for this copy operation if the size is a constaint int
if(szValue && SVFUtil::isa<ConstantInt>(szValue))
39 changes: 16 additions & 23 deletions svf-llvm/lib/SymbolTableBuilder.cpp
@@ -575,8 +575,8 @@ std::unique_ptr<TypeInference> & SymbolTableBuilder::getTypeInference() {
}


const Type* SymbolTableBuilder::getOrInferLLVMObjType(const Value *startValue) {
return getTypeInference()->getOrInferLLVMObjType(startValue);
const Type* SymbolTableBuilder::fwGetOrInferLLVMObjType(const Value *startValue) {
return getTypeInference()->fwGetOrInferLLVMObjType(startValue);
}

/*!
@@ -597,15 +597,15 @@ const Type* SymbolTableBuilder::inferTypeOfHeapObjOrStaticObj(const Instruction
originalPType = newTy;
}
}
inferedType = getOrInferLLVMObjType(startValue);
inferedType = fwGetOrInferLLVMObjType(startValue);
}
else if(SVFUtil::isHeapAllocExtCallViaArg(svfinst))
{
const CallBase* cs = LLVMUtil::getLLVMCallSite(inst);
int arg_pos = SVFUtil::getHeapAllocHoldingArgPosition(SVFUtil::getSVFCallSite(svfinst));
const Value* arg = cs->getArgOperand(arg_pos);
originalPType = SVFUtil::dyn_cast<PointerType>(arg->getType());
inferedType = getOrInferLLVMObjType(startValue = arg);
inferedType = fwGetOrInferLLVMObjType(startValue = arg);
}
else
{
Expand Down Expand Up @@ -802,26 +802,19 @@ u32_t SymbolTableBuilder::analyzeHeapAllocByteSize(const Value* val)
*/
u32_t SymbolTableBuilder::analyzeHeapObjType(ObjTypeInfo* typeinfo, const Value* val)
{
if(const Value* castUse = getFirstUseViaCastInst(val))
{
typeinfo->setFlag(ObjTypeInfo::HEAP_OBJ);
analyzeObjType(typeinfo,castUse);
const Type* objTy = LLVMModuleSet::getLLVMModuleSet()->getLLVMType(typeinfo->getType());
if(SVFUtil::isa<ArrayType>(objTy))
typeinfo->setFlag(ObjTypeInfo::HEAP_OBJ);
analyzeObjType(typeinfo, val);
const Type* objTy = LLVMModuleSet::getLLVMModuleSet()->getLLVMType(typeinfo->getType());
if(SVFUtil::isa<ArrayType>(objTy))
return getNumOfElements(objTy);
else if(const StructType* st = SVFUtil::dyn_cast<StructType>(objTy))
{
/// For an C++ class, it can have variant elements depending on the vtable size,
/// Hence we only handle non-cpp-class object, the type of the cpp class is treated as default PointerType
if(classTyHasVTable(st))
typeinfo->resetTypeForHeapStaticObj(LLVMModuleSet::getLLVMModuleSet()->getSVFType(TypeInference::getTypeInference()->defaultTy(val)));
else
return getNumOfElements(objTy);
else if(const StructType* st = SVFUtil::dyn_cast<StructType>(objTy))
{
/// For an C++ class, it can have variant elements depending on the vtable size,
/// Hence we only handle non-cpp-class object, the type of the cpp class is treated as PointerType at the cast site
if(classTyHasVTable(st))
typeinfo->resetTypeForHeapStaticObj(LLVMModuleSet::getLLVMModuleSet()->getSVFType(castUse->getType()));
else
return getNumOfElements(objTy);
}
}
else
{
typeinfo->setFlag(ObjTypeInfo::HEAP_OBJ);
}
return typeinfo->getMaxFieldOffsetLimit();
}
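
After this change, `analyzeHeapObjType` no longer depends on finding a cast use. It always flags the object as a heap object, analyzes the inferred type, and then picks a field count. A rough sketch of that decision, using hypothetical toy types (`ToyType`, `ToyTypeInfo`) in place of the LLVM and SVF classes:

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the type kinds the function distinguishes (assumption).
enum class Kind { Array, Struct, Other };

struct ToyType {
    Kind kind;
    std::uint32_t numElements; // flattened element count
    bool hasVTable;            // only meaningful for Kind::Struct
};

struct ToyTypeInfo {
    ToyType type;
    std::uint32_t maxFieldLimit; // stands in for getMaxFieldOffsetLimit()
    bool resetToDefault;         // set when the type falls back to the default
};

// Mirrors the branch structure of the new analyzeHeapObjType body.
std::uint32_t analyzeHeapObjFields(ToyTypeInfo& info) {
    const ToyType& ty = info.type;
    if (ty.kind == Kind::Array)
        return ty.numElements;
    if (ty.kind == Kind::Struct) {
        // A class with a vtable can vary in element count, so its type is
        // reset to the default instead of trusting the struct layout.
        if (!ty.hasVTable)
            return ty.numElements;
        info.resetToDefault = true;
    }
    return info.maxFieldLimit;
}
```

Only arrays and structs without a vtable return their own element count. Every other case, including a struct whose type was just reset, falls through to the maximum field limit.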
97 changes: 89 additions & 8 deletions svf-llvm/lib/TypeInference.cpp
@@ -100,19 +100,101 @@ const Type *TypeInference::infersiteToType(const Value *val) {
}
}

Set<const Value*> TypeInference::bwGetOrfindSourceVals(const Value* startValue) {

// consult cache
auto tIt = _valueToSources.find(startValue);
if (tIt != _valueToSources.end()) {
WARN_IFNOT(!tIt->second.empty(), "empty type:" + VALUE_WITH_DBGINFO(startValue));
return !tIt->second.empty() ? tIt->second : Set<const Value*>({startValue});
}

// simulate the call stack, the second element indicates whether we should update valueTypes for current value
FILOWorkList<ValueBoolPair> workList;
Set<ValueBoolPair> visited;
workList.push({startValue, false});
while (!workList.empty()) {
auto curPair = workList.pop();
if (visited.count(curPair)) continue;
visited.insert(curPair);
const Value *curValue = curPair.first;
bool canUpdate = curPair.second;

Set<const Value*> sources;
auto insertSource = [&sources, &canUpdate](const Value *source) {
if (canUpdate) sources.insert(source);
};
auto insertSourcesOrPushWorklist = [this, &sources, &workList, &canUpdate](const auto &pUser) {
auto vIt = _valueToSources.find(pUser);
if (canUpdate) {
if (vIt != _valueToSources.end()) {
sources.insert(vIt->second.begin(), vIt->second.end());
}
} else {
if (vIt == _valueToSources.end()) workList.push({pUser, false});
}
};

if (!canUpdate && !_valueToSources.count(curValue)) {
workList.push({curValue, true});
}

if(isSourceVal(curValue)) {
insertSource(curValue);
} else if (const BitCastInst *bitCastInst = SVFUtil::dyn_cast<BitCastInst>(curValue)) {
Value *prevVal = bitCastInst->getOperand(0); // a bitcast has exactly one operand, at index 0
insertSourcesOrPushWorklist(prevVal);
} else if (const PHINode *phiNode = SVFUtil::dyn_cast<PHINode>(curValue)) {
for (u32_t i = 0; i < phiNode->getNumOperands(); ++i) { // visit every incoming value, starting at 0
insertSourcesOrPushWorklist(phiNode->getOperand(i));
}
} else if (const LoadInst *loadInst = SVFUtil::dyn_cast<LoadInst>(curValue)) {
for (const auto &use: loadInst->getPointerOperand()->uses()) {
if (const StoreInst *storeInst = SVFUtil::dyn_cast<StoreInst>(use.getUser())) {
if (storeInst->getPointerOperand() == loadInst->getPointerOperand()) { // store writes to the address this load reads
insertSourcesOrPushWorklist(storeInst->getValueOperand());
}
}
}
} else if (const Argument *argument = SVFUtil::dyn_cast<Argument>(curValue)) {
for (const auto &use: argument->getParent()->uses()) {
if (const CallBase *callBase = SVFUtil::dyn_cast<CallBase>(use.getUser())) {
insertSourcesOrPushWorklist(callBase->getArgOperand(argument->getArgNo()));
}
}
} else if (const CallBase *callBase = SVFUtil::dyn_cast<CallBase>(curValue)) {
ABORT_IFNOT(!callBase->doesNotReturn(), "callbase does not return:" + VALUE_WITH_DBGINFO(callBase));
if (Function *callee = callBase->getCalledFunction()) {
if (!callee->isDeclaration()) {
const SVFFunction *svfFunc = LLVMModuleSet::getLLVMModuleSet()->getSVFFunction(callee);
const Value *pValue = LLVMModuleSet::getLLVMModuleSet()->getLLVMValue(svfFunc->getExitBB()->back());
const ReturnInst *retInst = SVFUtil::dyn_cast<ReturnInst>(pValue);
ABORT_IFNOT(retInst && retInst->getReturnValue(), "not return inst?");
insertSourcesOrPushWorklist(retInst->getReturnValue());
}
}
}
if (canUpdate) {
_valueToSources[curValue] = SVFUtil::move(sources);
}
}
Set<const Value*> srcs = _valueToSources[startValue];
if (srcs.empty()) {
srcs = {startValue};
WARN_MSG("Using default type, trace ID is " + std::to_string(traceId) + ":" + VALUE_WITH_DBGINFO(startValue));
}
return srcs;
}

const Type *TypeInference::defaultTy(const Value *val) {
ABORT_IFNOT(val, "val cannot be null");
// heap has a default type of 8-bit integer type
if(SVFUtil::isa<Instruction>(val) && SVFUtil::isHeapAllocExtCallViaRet(LLVMModuleSet::getLLVMModuleSet()->getSVFInstruction(SVFUtil::cast<Instruction>(val))))
return Type::getInt8Ty(LLVMModuleSet::getLLVMModuleSet()->getContext());
// otherwise we return a pointer type in the default address space
return PointerType::getUnqual(LLVMModuleSet::getLLVMModuleSet()->getContext());
}
/*!
* Forward collect all possible infer sites starting from a value
* @param startValue
*/
const Type *TypeInference::getOrInferLLVMObjType(const Value *startValue) {
const Type *TypeInference::fwGetOrInferLLVMObjType(const Value *startValue) {
// consult cache
auto tIt = _valueToType.find(startValue);
if (tIt != _valueToType.end()) {
@@ -123,7 +205,6 @@ const Type *TypeInference::getOrInferLLVMObjType(const Value *startValue) {
INC_TRACE();

// simulate the call stack, the second element indicates whether we should update valueTypes for current value
typedef std::pair<const Value *, bool> ValueBoolPair;
FILOWorkList<ValueBoolPair> workList;
Set<ValueBoolPair> visited;
workList.push({startValue, false});
@@ -259,7 +340,7 @@ const Type *TypeInference::getOrInferLLVMObjType(const Value *startValue) {
const Type* type = _valueToType[startValue];
if (type == nullptr) {
type = getTypeInference()->defaultTy(startValue);
WARN_MSG("empty type, trace ID is " + std::to_string(traceId) + ":" + VALUE_WITH_DBGINFO(startValue));
WARN_MSG("Using default type, trace ID is " + std::to_string(traceId) + ":" + VALUE_WITH_DBGINFO(startValue));
}
return type;
}
@@ -272,7 +353,7 @@ const Type *TypeInference::getOrInferLLVMObjType(const Value *startValue) {
void TypeInference::validateTypeCheck(const CallBase *cs) {
if (const Function *func = cs->getCalledFunction()) {
if (func->getName().find(TYPEMALLOC) != std::string::npos) {
const Type *objType = getOrInferLLVMObjType(cs);
const Type *objType = fwGetOrInferLLVMObjType(cs);
ConstantInt *pInt =
SVFUtil::dyn_cast<llvm::ConstantInt>(cs->getOperand(1));
assert(pInt && "the second argument is a integer");
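
The worklist in `bwGetOrfindSourceVals` simulates a recursive, memoizing DFS. Each value is visited twice: once with the flag `false` to schedule its predecessors, and once with `true` to merge their cached results. The sketch below reproduces that scheme on a toy graph of integer nodes; `ToyGraph`, `findSources`, and the `preds` map are illustrative assumptions, not part of SVF.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

using Node = int;
using NodeSet = std::set<Node>;

struct ToyGraph {
    std::map<Node, std::vector<Node>> preds; // node -> values it is derived from
    NodeSet sources;                         // nodes that count as objects
};

NodeSet findSources(const ToyGraph& g, Node start, std::map<Node, NodeSet>& cache) {
    // Consult the cache; an empty entry falls back to the start node itself.
    auto hit = cache.find(start);
    if (hit != cache.end())
        return hit->second.empty() ? NodeSet{start} : hit->second;

    // The bool says whether this visit may write the cache entry.
    std::vector<std::pair<Node, bool>> stack;
    std::set<std::pair<Node, bool>> visited;
    stack.push_back(std::make_pair(start, false));
    while (!stack.empty()) {
        const Node cur = stack.back().first;
        const bool canUpdate = stack.back().second;
        stack.pop_back();
        if (!visited.insert(std::make_pair(cur, canUpdate)).second) continue;

        // First visit: schedule the finalizing visit beneath the predecessors.
        if (!canUpdate && !cache.count(cur))
            stack.push_back(std::make_pair(cur, true));

        NodeSet found;
        if (g.sources.count(cur)) {
            if (canUpdate) found.insert(cur);
        } else {
            auto it = g.preds.find(cur);
            if (it != g.preds.end()) {
                for (Node p : it->second) {
                    auto c = cache.find(p);
                    if (canUpdate) {
                        // Second visit: merge whatever the predecessor resolved to.
                        if (c != cache.end())
                            found.insert(c->second.begin(), c->second.end());
                    } else if (c == cache.end()) {
                        stack.push_back(std::make_pair(p, false));
                    }
                }
            }
        }
        if (canUpdate) cache[cur] = std::move(found);
    }
    const NodeSet& result = cache[start];
    return result.empty() ? NodeSet{start} : result;
}
```

Because `visited` is keyed on the (node, flag) pair, a cycle such as two phi nodes feeding each other terminates: the repeated first visit is skipped, and the finalizing visit merges only what is already cached at that point.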