- Sun Yat-sen University
- Guangzhou, Guangdong, China
Stars
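PDF Guru Anki is a PDF-centered, multi-purpose toolbox for office work and study. It has four main modules: a PDF utility toolbox, an Anki card maker, an Anki assistant, and a video note-taking tool. The software offers many powerful features, and using it well can greatly improve work and study efficiency. Life is short, I use Guru!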
The Paper List of Large Multi-Modality Model, Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insight.
🦜🔗 Build context-aware reasoning applications
Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.
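A visual no-code/code-free web crawler/spider. 易采集 (EasySpider): a visual tool for browser automation testing, data collection, and web crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.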
Open-source RAG Framework for building GenAI Second Brains 🧠 Build productivity assistant (RAG) ⚡️🤖 Chat with your docs (PDF, CSV, ...) & apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic…
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
An implementation of the Prompt-to-Prompt paper for the SDXL architecture
HQ-Edit: A High-Quality and High-Coverage Dataset for General Image Editing
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
CLIP4IDC: CLIP for Image Difference Captioning (AACL 2022)
Data of ACL 2019 Paper "Expressing Visual Relationships via Language".
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
Situation With Groundings (SWiG) dataset and Joint Situation Localizer (JSL)
EMNLP 2018. Learning to Describe Differences Between Pairs of Similar Images. Harsh Jhamtani, Taylor Berg-Kirkpatrick.
Code and dataset release for Park et al., Robust Change Captioning (ICCV 2019)
Image Hosting solution, Flickr/imgur alternative, make it easy for users to share their images. Using Cloudflare Pages and Telegraph.
awesome grounding: A curated list of research papers in visual grounding
GitHub repo for my ICCV 2017 paper: "Localizing Moments in Video with Natural Language"
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
Transparent Image Layer Diffusion using Latent Transparency
[CVPR 2024] LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".