Extrapolating knowledge graphs from unstructured text using GPT-3 🕵️‍♂️

HUIXIN-TW/GraphGPT

 
 

GraphGPT

Natural Language → Knowledge Graph

If you want to use knowledge graphs in your project, check out GPT Index.

GraphGPT converts unstructured natural language into a knowledge graph. Pass in the synopsis of your favorite movie, a passage from a confusing Wikipedia page, or a transcript from a video to generate a graph visualization of entities and their relationships.

Successive queries can update the existing state of the graph or create an entirely new structure. For example, updating the current state could involve injecting new information through nodes and edges or changing the color of certain nodes.
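
The update behaviour described above can be sketched as a pure function over the graph state. This is a minimal sketch, not GraphGPT's actual code: `applyUpdate` and the `{ nodes, edges, colors }` update shape are hypothetical, and the graph is assumed to use the `{ nodes: [{ id, label, color }], edges: [{ from, to, label }] }` layout common to vis-network style renderers.

```javascript
// Hypothetical update format: { nodes: [...], edges: [...], colors: { nodeId: color } }.
// None of these names come from the GraphGPT source. Node ids are assumed to be strings.
function applyUpdate(graph, update) {
  // Index copies of the existing nodes by id so repeated ids are merged, not duplicated.
  const nodesById = new Map(graph.nodes.map((n) => [n.id, { ...n }]));
  for (const node of update.nodes || []) {
    nodesById.set(node.id, { ...nodesById.get(node.id), ...node });
  }

  // Recolor nodes that exist after the merge; unknown ids are ignored.
  for (const [id, color] of Object.entries(update.colors || {})) {
    if (nodesById.has(id)) nodesById.get(id).color = color;
  }

  // Append only edges that are not already present.
  const edgeKey = (e) => `${e.from}|${e.label}|${e.to}`;
  const seen = new Set(graph.edges.map(edgeKey));
  const edges = [...graph.edges];
  for (const edge of update.edges || []) {
    if (!seen.has(edgeKey(edge))) {
      seen.add(edgeKey(edge));
      edges.push(edge);
    }
  }

  return { nodes: [...nodesById.values()], edges };
}

const current = { nodes: [{ id: "Alice", label: "Alice" }], edges: [] };
const next = applyUpdate(current, {
  nodes: [{ id: "Bob", label: "Bob" }],
  edges: [{ from: "Alice", to: "Bob", label: "knows" }],
  colors: { Alice: "#ff0000" },
});
console.log(JSON.stringify(next, null, 2));
```

Returning a new object instead of mutating `current` keeps the previous state available, which suits a React component that compares old and new state.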

The current few-shot prompt guides GPT-3 toward the JSON format that GraphGPT requires for rendering. You can see the entire prompt in public/prompts/xxx.prompt. The main issue at the moment is latency: a response from the OpenAI API can take up to 20 seconds.
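
As a rough illustration of the prompt-and-JSON round trip, the sketch below fills a prompt template and validates the text that comes back. Everything in it is an assumption: the template text, the `$prompt` and `$state` placeholders, and the helper names `fillPrompt` and `parseGraphResponse` are hypothetical, and the real prompt files may look different. No API call is made here.

```javascript
// Hypothetical template; the real prompts live in public/prompts and may differ.
const template = [
  'Convert the text into a JSON graph with "nodes" and "edges".',
  "Current state: $state",
  "Text: $prompt",
  "JSON:",
].join("\n");

// Substitute both placeholders in a single pass, so text inserted for one
// placeholder is never rescanned for the other.
function fillPrompt(tpl, text, state) {
  const values = { state: JSON.stringify(state), prompt: text };
  return tpl.replace(/\$(state|prompt)/g, (match, name) => values[name]);
}

// Return the parsed graph, or null when the completion is not the expected JSON.
function parseGraphResponse(raw) {
  let parsed;
  try {
    parsed = JSON.parse(raw.trim());
  } catch (err) {
    return null;
  }
  if (parsed === null || typeof parsed !== "object") return null;
  if (!Array.isArray(parsed.nodes) || !Array.isArray(parsed.edges)) return null;
  return parsed;
}

console.log(fillPrompt(template, "Alice knows Bob", { nodes: [], edges: [] }));
console.log(parseGraphResponse("not json")); // null
```

Treating a malformed completion as `null` rather than throwing lets the caller keep the previous graph on screen when a response cannot be rendered.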

Expected Functions

  • Take CSV files as input
  • Generate a graph from the CSV files
  • Allow users to query the graph
  • Allow users to update the graph
  • Allow users to change the color of nodes
  • Allow users to change the color of edges
  • Allow users to export the graph as a triple class
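
The export item could look roughly like the sketch below, which turns a graph into subject, predicate, object triples. This is only one reading of "triple" in the list above: `graphToTriples` is a hypothetical name, plain arrays stand in for whatever triple class the project ends up with, and the graph layout (`nodes` with `id` and `label`, `edges` with `from`, `to`, and `label`) is assumed.

```javascript
// Hypothetical helper: one [subject, predicate, object] array per edge.
function graphToTriples(graph) {
  // Resolve node ids to their display labels, falling back to the id itself.
  const labelById = new Map(
    graph.nodes.map((n) => [n.id, n.label ?? String(n.id)])
  );
  return graph.edges.map((e) => [
    labelById.get(e.from) ?? String(e.from),
    e.label ?? "",
    labelById.get(e.to) ?? String(e.to),
  ]);
}

// Illustrative graph; the edge label is made up.
const exampleGraph = {
  nodes: [
    { id: 1, label: "CITS4009" },
    { id: 2, label: "62510" },
  ],
  edges: [{ from: 1, to: 2, label: "courseCode" }],
};
console.log(graphToTriples(exampleGraph));
```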

Possible Issues

  • When the input is empty and the user clicks "Generate", a default or random graph is displayed.
  • The graph zone is not scrollable.
  • The graph does not respond to the size of the window.
  • Stateless and stateful components require different formatting because of the stop symbol.

How to install GraphGPT

Download or Fork

Run web app

  1. Run npm install to download required dependencies (currently just react-graph-vis).
  2. Make sure you have an OpenAI API key. You will enter this into the web app when running queries.
  3. Run npm run start. GraphGPT should open up in a new browser tab.

Find an example file

Prompts are located in the public/prompts folder.

Test the prompt example

  1. text input

  2. change the color of the node

  3. remove duplicate nodes
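
However the app handles the third step, the expected result can be checked against a client-side sketch like the one below. It is an assumption-laden illustration, not the project's method: `removeDuplicateNodes` is a hypothetical helper, and it treats two nodes as duplicates when their labels match after trimming and lowercasing.

```javascript
// Hypothetical helper: keep the first node for each normalized label and
// point every edge at the node that was kept.
function removeDuplicateNodes(graph) {
  const canonical = new Map(); // normalized label -> kept node
  const remap = new Map(); // original id -> kept id
  for (const node of graph.nodes) {
    const key = String(node.label ?? node.id).trim().toLowerCase();
    if (!canonical.has(key)) canonical.set(key, node);
    remap.set(node.id, canonical.get(key).id);
  }

  // Rewire edges, then drop edges that became identical after rewiring.
  const seenEdges = new Set();
  const edges = [];
  for (const e of graph.edges) {
    const edge = {
      ...e,
      from: remap.get(e.from) ?? e.from,
      to: remap.get(e.to) ?? e.to,
    };
    const key = JSON.stringify([edge.from, edge.label, edge.to]);
    if (!seenEdges.has(key)) {
      seenEdges.add(key);
      edges.push(edge);
    }
  }
  return { nodes: [...canonical.values()], edges };
}

const deduped = removeDuplicateNodes({
  nodes: [
    { id: 1, label: "CITS4009" },
    { id: 2, label: "cits4009 " },
    { id: 3, label: "62510" },
  ],
  edges: [
    { from: 1, to: 3, label: "linked" },
    { from: 2, to: 3, label: "linked" },
  ],
});
console.log(deduped.nodes.length, deduped.edges.length);
```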

Modify the prompt to make GraphGPT work better!

Data

Obtain all CSV files from AccreditationExplorer.

Each entry below gives the file name, its content, and its column structure.

activity.csv: Unit Activities
  • unitCode: string, uniquely identifies a unit
  • unit: string, name of the unit
  • activity: string, matches an activity to the unit

advisable_prior_study.csv: Unit Prior Studies Prerequisites
  • unitCode: string, uniquely identifies the unit one wants to study
  • apUnitCode: string, uniquely identifies a unit that is advised to have been studied before

AQF.csv: Unit Outcomes and AQF ID
  • unitCode: string, uniquely identifies a unit
  • aqfId: string, identifies the corresponding AQF Id
  • outcomeId: integer, corresponds to the unit outcome Id

cbok_end.csv: End Level CBoK (e.g. abstraction)
  • End: End levels CBoK
  • Sub: Sub levels CBoK, mapped to End levels

cbok_sub.csv: Sub Level CBoK (e.g. problem solving)
  • Sub: Sub levels CBoK
  • Top: Top levels CBoK, mapped to Sub levels

cbok_top.csv: Top Level CBoK (e.g. General, Essential)
  • Two-line string containing the CBoK top level knowledge areas

course.csv: Program Course Information
  • courseCode: string, uniquely identifies a course
  • title: string
  • abbreviation: string, abbreviation of the course title
  • CRICOSCode: string
  • degree: string, either PG or UG

MIT_AQF_Outcome_cat.csv: AQF categories for MIT
  • Program: string, abbreviated program name (MIT)
  • ID: integer, identifies the AQF category
  • Category: string, AQF category name
  • Description: string, full description

MIT_AQF_Outcomes .csv: AQF outcomes for MIT
  • aqfId: string, identifies the AQF outcome
  • Description: string, full outcome description
  • Area_ key: integer, links to the AQF category

incompatibilities.csv: Units Incompatibilities
  • unitCode: string, uniquely identifies a unit
  • iUnitCode: string, uniquely identifies a unit that is incompatible with unitCode

outcome.csv: Unit Outcomes
  • unitCode: string, uniquely identifies a unit
  • outcome: string, matches an outcome to the unit
  • outcomeId: integer, serial number of the outcome within the unit
  • level: string
  • describe: string, explains what the level means

prer_course.csv: Course Prerequisites for Units
  • unitCode: string, uniquely identifies a unit
  • courseCode: string, uniquely identifies a course

prer_program.csv: Programming Units Prerequisites
  • unitCode: string, uniquely identifies a unit
  • programmingPoint: integer, points of programming-based units required by the unit

prer_unit.csv: Unit to Unit Prerequisites
  • unitCode: string, uniquely identifies a unit
  • preUnitCode: string, uniquely identifies a unit studied

role.csv: Units Role in Program Course
  • unitCode: string, uniquely identifies a unit
  • courseCode: string, uniquely identifies a course
  • role: string, Conversion, Option or Conversion

unit.csv: Units Information
  • unitCode: string, unit code
  • title: string
  • credit: integer, each unit has 6 points
  • programmingBased: integer, 0 or 1, 0 for non-programming-based, 1 for programming-based
  • availabilities: string

unit_activity_cbok.csv: Unit-Activities-CBoK Mappings
  • unitCode: uniquely identifies a unit
  • unit: string, name of the unit
  • activity: string, activity title of a unit
  • knowledge: string, CBoK knowledge area
  • taxonomy: string, Bloom’s taxonomy
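
Several of the files described above are two-column relations between codes, which map naturally onto graph edges. The sketch below shows one way that mapping might be written. It assumes rows have already been parsed into objects keyed by column name; the column names are taken from the file descriptions, while `RELATIONS`, `rowsToGraph`, and all edge labels are hypothetical.

```javascript
// Hypothetical mapping: file name -> [source column, target column, edge label].
// Column names follow the file descriptions; the edge labels are made up.
const RELATIONS = {
  "prer_unit.csv": ["unitCode", "preUnitCode", "requiresUnit"],
  "advisable_prior_study.csv": ["unitCode", "apUnitCode", "advisedPriorStudy"],
  "incompatibilities.csv": ["unitCode", "iUnitCode", "incompatibleWith"],
  "prer_course.csv": ["unitCode", "courseCode", "requiresCourse"],
};

// Build { nodes, edges } from parsed rows of one relation file.
function rowsToGraph(fileName, rows) {
  const relation = RELATIONS[fileName];
  if (!relation) throw new Error(`No relation mapping for ${fileName}`);
  const [fromCol, toCol, label] = relation;

  const ids = new Set();
  const edges = [];
  for (const row of rows) {
    const from = row[fromCol];
    const to = row[toCol];
    if (!from || !to) continue; // skip incomplete rows
    ids.add(from);
    ids.add(to);
    edges.push({ from, to, label });
  }
  return { nodes: [...ids].map((id) => ({ id, label: id })), edges };
}

const courseGraph = rowsToGraph("prer_course.csv", [
  { unitCode: "CITS4009", courseCode: "62510" },
  { unitCode: "CITS4009", courseCode: "62530" },
]);
console.log(JSON.stringify(courseGraph, null, 2));
```

Files with extra attribute columns, such as unit.csv or course.csv, would more plausibly become node properties than edges, so they are left out of this mapping.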

Demo

First prompt

unitCode,courseCode\nCITS4009,62510\nCITS4009,62530

Second prompt

unitCode,title,credit,programmingBased,availabilities\nCITS4009,"Computational Data Analysis",6,1,"Semester 2 2021, Crawley (Face to face); Semester 2 2021, Crawley (Online-TT) [Contact hours: n/a];"

Third prompt

courseCode,title,abbreviation,CRICOSCode,degree\n62510,Master of Information Technology,MIT,083866G,PG\n62530,Master of Data Science,MDSc,093310E,PG
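
The demo prompts above are CSV text in which a literal backslash followed by `n` separates rows, and some fields are quoted because they contain commas. Should the app need to read such input itself rather than pass it straight to the model, a small parser along these lines could work. `parseCsvPrompt` and `splitCsvLine` are hypothetical helpers, and quoted fields that span several rows are not supported.

```javascript
// Parse a demo-style CSV prompt into an array of row objects keyed by header name.
function parseCsvPrompt(text) {
  // Turn the literal two-character sequence backslash + "n" into real newlines.
  const lines = text
    .replace(/\\n/g, "\n")
    .split("\n")
    .filter((line) => line.trim() !== "");
  const [header, ...body] = lines.map(splitCsvLine);
  return body.map((cells) =>
    Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ""]))
  );
}

// Split one CSV line on commas, honoring double-quoted fields and "" escapes.
function splitCsvLine(line) {
  const cells = [];
  let cell = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        cell += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        cell += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      cells.push(cell);
      cell = "";
    } else {
      cell += ch;
    }
  }
  cells.push(cell);
  return cells;
}

// The second demo prompt; String.raw keeps the backslash-n sequence literal.
const secondPrompt = String.raw`unitCode,title,credit,programmingBased,availabilities\nCITS4009,"Computational Data Analysis",6,1,"Semester 2 2021, Crawley (Face to face); Semester 2 2021, Crawley (Online-TT) [Contact hours: n/a];"`;
console.log(parseCsvPrompt(secondPrompt));
```

All values come back as strings; converting `credit` or `programmingBased` to numbers is left to the caller.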
