1 vote
2 answers
153 views

Given:
$ pip install transformers mistral-common -Uq
$ pip show transformers mistral-common torch  # to verify
Name: transformers
Version: 4.57.3
Summary: State-of-the-art Machine Learning for JAX, ...
farid • 1,641
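The same verification can be done from Python instead of the shell; a minimal sketch using only the standard library (package names taken from the excerpt):

from importlib.metadata import version, PackageNotFoundError

# Print the installed version of each package mentioned above.
for pkg in ("transformers", "mistral-common", "torch"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")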
0 votes
0 answers
84 views

I am new to reinforcement learning, so as an educational exercise I am implementing GRPO from scratch with PyTorch. My goal is to mimic how TRL works, but boil it down to just the loss function and ...
csnate • 1,661
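For reference, a minimal sketch of the group-relative clipped loss such an exercise boils down to; all tensor names and shapes here are assumptions, and the KL penalty that TRL adds on top is omitted:

import torch

def grpo_loss(logprobs, old_logprobs, rewards, mask, eps=0.2):
    # logprobs, old_logprobs: (group, seq) per-token log-probs of the sampled
    # completions under the current policy and the sampling policy.
    # rewards: (group,) scalar reward per completion; mask: (group, seq).
    # Group-relative advantage: normalize rewards within the sampled group.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    adv = adv.unsqueeze(-1)  # broadcast the advantage over tokens
    # PPO-style clipped importance-weighted objective, per token.
    ratio = torch.exp(logprobs - old_logprobs)
    per_token = -torch.min(ratio * adv,
                           torch.clamp(ratio, 1 - eps, 1 + eps) * adv)
    # Average over valid completion tokens only.
    return (per_token * mask).sum() / mask.sum()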
0 votes
0 answers
57 views

I am trying to run an example from Z-Image-Turbo (https://huggingface.co/Tongyi-MAI/Z-Image-Turbo):
import os
os.environ["HF_HOME"] = "E:/MyCustomHFCache"
os.environ["...
Recently_Created_User
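One detail worth checking in this setup: HF_HOME only takes effect if it is set before any Hugging Face library is imported. A minimal sketch of the cache-redirection step (the download call is an assumption; the excerpt does not show how the pipeline itself is loaded):

import os
os.environ["HF_HOME"] = "E:/MyCustomHFCache"  # must precede HF imports

from huggingface_hub import snapshot_download

# Fetch the checkpoint into the custom cache; model id from the excerpt.
print(snapshot_download("Tongyi-MAI/Z-Image-Turbo"))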
Best practices
0 votes
2 replies
61 views

I would like to use an LLM encoder model to create vector embeddings for certain texts in my dataset. The texts are written as technical problem descriptions by experts who are trying to repair a ...
Alles Klar
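A minimal sketch of one common approach: encode with an encoder checkpoint and mean-pool over non-padding tokens. The model id here is only an example, not a recommendation for this domain:

import torch
from transformers import AutoTokenizer, AutoModel

name = "sentence-transformers/all-MiniLM-L6-v2"  # example checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

texts = ["Pump bearing overheats under load.", "Seal leaks after restart."]
batch = tok(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq, hidden)

# Mean-pool over real tokens only to get one vector per text.
m = batch["attention_mask"].unsqueeze(-1)
embeddings = (hidden * m).sum(1) / m.sum(1)
print(embeddings.shape)  # (2, 384)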
-1 votes
2 answers
103 views

I have used the following code to do SFT:
base_model = "google/gemma-3-270m"
it_model = "google/gemma-3-270m-it"
checkpoint_dir = "checkpoint"
learning_rate = 5e-5 #@...
keramat • 4,611
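For context, a minimal TRL-style SFT sketch using the model id, checkpoint directory, and learning rate from the excerpt; the dataset is a placeholder, since the excerpt does not show which one is used:

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # placeholder data

trainer = SFTTrainer(
    model="google/gemma-3-270m",
    train_dataset=dataset,
    args=SFTConfig(output_dir="checkpoint", learning_rate=5e-5),
)
trainer.train()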
1 vote
0 answers
153 views

I am trying to run Flux2 inference on 2 GPUs as follows:
import torch
from diffusers import Flux2Pipeline
from accelerate import PartialState
import argparse
from pathlib import Path
def main(): parser ...
Franck Dernoncourt
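One way to use both GPUs with PartialState is data parallelism: one pipeline copy per process, prompts split between them. A sketch under the assumption that the goal is throughput rather than sharding one model across the two GPUs (the checkpoint id is an assumption):

import torch
from accelerate import PartialState
from diffusers import Flux2Pipeline  # as in the excerpt

# Launch with: accelerate launch --num_processes=2 script.py
state = PartialState()
pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev", torch_dtype=torch.bfloat16
).to(state.device)

prompts = ["a red fox in snow", "a lighthouse at dusk"]
with state.split_between_processes(prompts) as mine:
    for i, prompt in enumerate(mine):
        # Each process renders its share of the prompts on its own GPU.
        pipe(prompt).images[0].save(f"out_{state.process_index}_{i}.png")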
0 votes
0 answers
46 views

I am now trying to use FSDP with the Hugging Face transformers Trainer. The training script is something like:
train_dataset = Mydataset(...)
args = TrainingArguments(...)
model = LlamaForCausalLM....
xuehao-049
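With the Trainer, FSDP is usually enabled through the fsdp fields on TrainingArguments plus a distributed launcher; a minimal sketch (the wrap class matches the LlamaForCausalLM from the excerpt):

from transformers import TrainingArguments

# Launch with: torchrun --nproc_per_node=<num_gpus> train.py
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    fsdp="full_shard auto_wrap",
    fsdp_config={"transformer_layer_cls_to_wrap": ["LlamaDecoderLayer"]},
)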
2 votes
1 answer
74 views

Question: I'm experiencing an issue with the transformers library, specifically with the pipeline initialization. When I access the base_model attribute of a LlamaForCausalLM model, it seems to ...
Hank Wang
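In transformers, base_model is a property that resolves getattr(self, self.base_model_prefix), and for Llama that prefix is "model", so it returns the bare backbone without the LM head. A small sketch (the checkpoint id is a placeholder):

from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

print(model.base_model_prefix)          # "model"
print(model.base_model is model.model)  # True: the backbone, no lm_head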
0 votes
0 answers
88 views

I am currently experimenting with modifying the KV cache of the LLaVA model in order to perform controlled interventions during generation (similar to cache-steering methods in recent research). The ...
Pulkit Mittal
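A minimal sketch of the prefill-edit-resume pattern, demonstrated on a small text-only model since the same cache mechanism drives LLaVA's language model; the key_cache/value_cache attribute names vary across transformers versions, so treat them as assumptions:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "HuggingFaceTB/SmolLM2-135M"  # small stand-in model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

input_ids = tok("The quick brown fox", return_tensors="pt").input_ids

# Prefill all but the last token to build the KV cache.
with torch.no_grad():
    cache = model(input_ids[:, :-1], use_cache=True).past_key_values

# Intervene on one layer's keys and values (assumed attribute names).
layer = 4
cache.key_cache[layer] = cache.key_cache[layer] * 0.9
cache.value_cache[layer] = cache.value_cache[layer] * 0.9

# Resume generation from the edited cache.
out = model.generate(input_ids, past_key_values=cache, max_new_tokens=20)
print(tok.decode(out[0]))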
1 vote
0 answers
207 views

I need to run a series of pre-trained, fine-tuned models from Hugging Face in a Jupyter notebook. I have updated to the latest versions of both PyTorch and Transformers, but when I run the code from ...
Alex Colville
1 vote
1 answer
94 views

I'm trying to implement Speech-to-Text transcription in my Swift app using Hugging Face's swift-transformers package to run Whisper models locally. I've added the package to my Xcode project, but when ...
Zaid • 513
0 votes
1 answer
83 views

I am trying to run Mistral-7B-Instruct-v0.2. Each run is PROMPT + details[i]. PROMPT has instructions on how to generate JSON based on the details. Since the prefix part of each input is the same, kind of like a ...
acdhemtos
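This is the prefix-caching pattern: prefill the shared PROMPT once, then deep-copy that cache for every details[i] so each run skips recomputing the prefix. A minimal sketch with placeholder prompt and data (variable names follow the excerpt):

import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-Instruct-v0.2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)

PROMPT = "Generate JSON from the following details.\n"  # placeholder text
prompt_ids = tok(PROMPT, return_tensors="pt").input_ids.to(model.device)

# Prefill the shared prefix once.
with torch.no_grad():
    prefix_cache = model(prompt_ids, use_cache=True).past_key_values

details = ["detail one", "detail two"]  # placeholder data
for d in details:
    ids = tok(PROMPT + d, return_tensors="pt").input_ids.to(model.device)
    cache = copy.deepcopy(prefix_cache)  # fresh copy of the prefix per run
    out = model.generate(ids, past_key_values=cache, max_new_tokens=128)
    print(tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True))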
0 votes
0 answers
137 views

I have Python 3.12.3 on an Ubuntu server. I tried to install transformers, tokenizers, datasets, and accelerate to use the Seq2SeqTrainer from transformers. I used a virtual environment for the ...
Raptor • 54.4k
0 votes
0 answers
43 views

I'm fine-tuning T5-small using PyTorch Lightning and encountering a strange issue during validation and test steps. The Problem: During validation_step and test_step, model.generate() consistently ...
GeraniumCat
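For comparison, a minimal validation_step that calls generate() in the standard way; the class and batch field names are assumptions, since the excerpt does not show the actual module:

import pytorch_lightning as pl

class T5Module(pl.LightningModule):
    def __init__(self, model, tokenizer):
        super().__init__()
        self.model = model
        self.tokenizer = tokenizer

    def validation_step(self, batch, batch_idx):
        # Lightning runs this hook in eval mode with gradients disabled.
        out = self.model.generate(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            max_new_tokens=64,
        )
        return self.tokenizer.batch_decode(out, skip_special_tokens=True)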
3 votes
0 answers
129 views

I have encountered a particular problem while executing a function from Hugging Face's transformers library on the Intel GPU wheel of torch. Since I am doing something I normally shouldn't be ...
Logarithmnepnep
