CS Notes

readlist

March 20, 2026 · 1 min read

A Dive into Vision-Language Models (Hugging Face)

Vision-Language Pre-training: Basics, Recent Advances, and Future Trends

MM-LLMs: Recent Advances in MultiModal Large Language Models

Vision Transformers Need Registers

Concepts

ViT (Vision Transformer)

SSL (Self-Supervised Learning)

Contrastive Learning: CLIP (Contrastive Language-Image Pre-Training) https://github.com/OpenAI/CLIP https://github.com/mlfoundations/open_clip

COCO (Common Objects in Context)

PEFT (Parameter-Efficient Fine-Tuning)
