Surfer: Export your personal data in one click

Table of Contents
  1. How it works
  2. Getting Started
  3. Roadmap
  4. License
  5. Contact
  6. Acknowledgements

Surfer is a digital footprint exporter, designed to aggregate all your personal data from various online platforms into a single folder.

Currently, your personal data is scattered across hundreds of platforms, and the companies operating these platforms have no incentive to give this data back to you. Surfer solves this problem by navigating to each website and scraping your data from it.

We believe that personal data aggregation is the key to enabling truly useful, universal personal assistants.

Currently Supported Platforms

  • Twitter Posts
  • Twitter Bookmarks
  • LinkedIn Profile
  • GitHub Repositories
  • YouTube
  • Notion
  • ChatGPT History
  • Gmail
  • iMessages (coming soon!)
  • Reddit (coming soon!)

How it works

  1. Click on "Export" to initiate the data extraction process.
  2. The app waits for the target page to load completely.
  3. The app checks whether you are signed in to the platform being scraped.
  4. If you are not signed in, you are prompted to sign in.
  5. If you are signed in, the process continues.
  6. Once you are signed in, the app interacts with the platform's user interface.
  7. The app then scrapes the user's data from the platform.
  8. Finally, the extracted data is exported and saved to your local storage.
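
The decision logic in the steps above can be sketched as a small pure function. This is only an illustration: `PageState`, `NextAction`, and `nextAction` are hypothetical names, not taken from Surfer's source, and the real app may structure this flow differently.

```typescript
// Hypothetical model of the export flow; these names are not from Surfer's code.
interface PageState {
  loaded: boolean;   // has the target page finished loading?
  signedIn: boolean; // is the user signed in to the platform?
}

type NextAction = "wait_for_load" | "prompt_sign_in" | "scrape_and_export";

// Decide what the app should do next for a given page state.
function nextAction(page: PageState): NextAction {
  if (!page.loaded) return "wait_for_load";    // step 2
  if (!page.signedIn) return "prompt_sign_in"; // steps 3-4
  return "scrape_and_export";                  // steps 5-8
}

console.log(nextAction({ loaded: true, signedIn: false }));
```

Keeping the decision separate from the browser automation makes the sign-in branch easy to test without opening a real page.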

Sample Exported Data

{
  "platform_name": "X Corp",
  "name": "Twitter",
  "runID": "twitter-001-1724267514217",
  "timestamp": 1724267623318,
  "content": [
    "Twitter Post 1",
    "Twitter Post 2",
    "Twitter Post 3",
    ...
  ]
}
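
If you load an exported file in your own code, a type and a runtime check along these lines may help. The field names come from the sample above; `ExportRecord` and `isExportRecord` are hypothetical names, and the element type of `content` is left open on the assumption that other platforms may export richer items than plain strings.

```typescript
// Shape inferred from the sample export above; not an official schema.
interface ExportRecord {
  platform_name: string;
  name: string;
  runID: string;
  timestamp: number;  // appears to be a Unix time in milliseconds
  content: unknown[]; // the sample shows strings; other shapes are an assumption
}

// Runtime check for data parsed from an exported JSON file.
function isExportRecord(value: unknown): value is ExportRecord {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["platform_name"] === "string" &&
    typeof record["name"] === "string" &&
    typeof record["runID"] === "string" &&
    typeof record["timestamp"] === "number" &&
    Array.isArray(record["content"])
  );
}

const parsed: unknown = JSON.parse(
  '{"platform_name":"X Corp","name":"Twitter","runID":"twitter-001-1724267514217",' +
    '"timestamp":1724267623318,"content":["Twitter Post 1","Twitter Post 2"]}'
);

if (isExportRecord(parsed)) {
  console.log(`${parsed.name}: ${parsed.content.length} items`);
}
```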

Getting Started

To download the app, head over to https://surfsup.ai or go to the releases page.

For instructions on setting up the app locally and contributing to the project, please refer to the Contributing Guidelines, Helper Functions Documentation, and Guide to Adding New Platforms.

See the open issues for a full list of proposed features (and known issues).

Roadmap

Short-Term

  • Keep exported data maintained and updated every day
  • Scheduled exports
  • Obtain a code signing certificate for Windows
  • Replace setTimeout with await for script execution to ensure elements exist before scraping
  • Implement robust error handling for the scraping process
  • Add support for more online platforms
  • Add verbosity to runs
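
As a rough illustration of the "replace setTimeout with await" item, the sketch below polls a condition and resolves as soon as it holds, instead of sleeping for a fixed time and hoping the element has appeared. `waitFor`, `WaitOptions`, and the `probe` callback are hypothetical names; in the real app the probe would presumably query the page, and an event-based approach such as a `MutationObserver` may be a better fit.

```typescript
interface WaitOptions {
  timeoutMs: number;  // give up after this long
  intervalMs: number; // delay between probes
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Resolve with the first non-null probe result, or reject after the timeout.
async function waitFor<T>(
  probe: () => T | null | undefined,
  options: WaitOptions = { timeoutMs: 10000, intervalMs: 100 }
): Promise<T> {
  const deadline = Date.now() + options.timeoutMs;
  while (true) {
    const found = probe();
    if (found !== null && found !== undefined) return found;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${options.timeoutMs} ms`);
    }
    await sleep(options.intervalMs);
  }
}

// Usage: the probe here is a stand-in that succeeds on its third call.
let attempts = 0;
waitFor(() => (++attempts >= 3 ? "element" : null), { timeoutMs: 1000, intervalMs: 10 })
  .then((el) => console.log(`found ${el} after ${attempts} attempts`));
```

Rejecting on timeout also gives the error-handling item above a concrete failure to catch and report.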

Medium to Long-Term

  • Implement concurrent scraping to allow for multiple scraping jobs to run simultaneously
  • Add knowledge graphs, chatting with your data, visualizations, etc.
  • Add sub-tasks within platforms (e.g. Twitter Bookmarks, LinkedIn Connections Data, etc.)
  • Integrate with other agentic frameworks like LangChain for advanced personal AI assistants
  • Explore integration with wearable devices for enhanced personal data tracking and acknowledgment
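
One possible shape for the concurrent scraping item is a small worker pool that runs several jobs at once while capping how many are active. This is a sketch under assumptions, not a description of planned code: `runWithLimit` and `JobResult` are hypothetical names, and each job stands in for one platform's scrape.

```typescript
type JobResult<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Run all jobs with at most `limit` active at once; one failure does not stop the rest.
async function runWithLimit<T>(
  jobs: Array<() => Promise<T>>,
  limit: number
): Promise<JobResult<T>[]> {
  const results: JobResult<T>[] = new Array(jobs.length);
  const queue = jobs.map((job, index) => ({ job, index }));

  // Each worker pulls the next queued job until the queue is empty.
  async function worker(): Promise<void> {
    for (let item = queue.shift(); item !== undefined; item = queue.shift()) {
      try {
        results[item.index] = { ok: true, value: await item.job() };
      } catch (error) {
        results[item.index] = { ok: false, error };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, jobs.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Usage with placeholder jobs; real jobs would scrape one platform each.
const demoJobs = ["twitter", "github", "notion"].map(
  (name) => async () => `${name} done`
);
runWithLimit(demoJobs, 2).then((results) => console.log(results));
```

Results keep the order of the input jobs, so a failed platform can be reported by position without aborting the others.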

License

Distributed under the MIT License. See LICENSE for more information.

Built With

  • Electron
  • React
  • Tailwind
  • Shadcn UI

Contact

Surfer Discord Server - @SahilLalani0 - @JackBlair87 - @T0M_3D

Project Link: https://github.com/CEREBRUS-MAXIMUS/Surfer-Data

Acknowledgements