voidism/README.md

👋 You've reached the GitHub profile of Yung-Sung!

  • 👋 Hi, I’m Yung-Sung (@voidism). I am a final-year PhD student in Electrical Engineering and Computer Science at CSAIL, Massachusetts Institute of Technology, where I work with Jim Glass.
  • My research focuses on large language models: hallucinations, factuality, and retrieval-augmented generation. In addition, I worked on pre-training MetaCLIP 2, a multilingual vision-language model pre-trained on worldwide web-scale data, during my internship at Meta FAIR.
  • My research has introduced several approaches for improving LLM factuality. DoLa enhances factuality through layer-wise knowledge contrasting during decoding. Lookback Lens detects and mitigates hallucinations by analyzing attention patterns under RAG settings. Most recently, SelfCite enables LLMs to generate accurate citations without external supervision.
  • I previously worked on retrieval methods, developing DiffCSE for better sentence embeddings and Query Reranking for more accurate passage retrieval.

[Google Scholar] [CV] [Twitter] [GitHub] [DBLP] [Blog] [LinkedIn] [Instagram]

Pinned

  1. DoLa Public

    Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"

    Python · 532 stars · 67 forks

  2. DiffCSE Public

    Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"

    Python · 296 stars · 30 forks

  3. huggingface/transformers Public

    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

    Python · 154k stars · 31.6k forks

  4. s3prl/s3prl Public

    Self-Supervised Speech Pre-training and Representation Learning Toolkit

    Python · 2.5k stars · 523 forks

  5. facebookresearch/MetaCLIP Public

    NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024

    Python · 1.8k stars · 75 forks

  6. facebookresearch/SelfCite Public

    Code for the ICML 2025 paper "SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models"

    Python · 21 stars · 1 fork