How Prompt Tuning Works
Prompt tuning is a technique for adapting pre-trained language models to downstream tasks without modifying the entire model. Instead of fine-tuning all of the model's parameters, prompt tuning optimizes a small set of learnable tokens. In this article we will look at how it works, the mathematics behind it and a simple implementation.
How Does Prompt Tuning Work?
Let's understand it step by step:
Step 1: Pre-trained Language Model
The foundation of prompt tuning is a pre-trained language model. These models are trained on vast amounts of text data and encode general linguistic knowledge. Examples include GPT-3, BERT and T5.
Step 2: Soft Prompts
- Instead of directly feeding raw input text into the model, prompt tuning introduces a set of learnable embeddings called soft prompts. These embeddings are initialized randomly and are optimized during training to guide the model toward the desired task.
- For example, if the task is sentiment classification, the soft prompt might encode information about the sentiment labels like positive, negative, neutral.
- The rest of the model remains frozen, preserving the general knowledge it acquired during pre-training (see the sketch below).
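A minimal sketch of this setup, using TensorFlow and made-up shapes (the tiny network is a stand-in for a real pre-trained LLM):
import tensorflow as tf
# A tiny stand-in for a pre-trained model; in practice this is a real LLM.
frozen_model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(2),
])
frozen_model.trainable = False  # all pre-trained weights stay fixed
# Soft prompt: a small matrix of learnable embeddings (3 tokens of dim 4 here),
# randomly initialized; these are the only parameters that will be trained.
num_prompt_tokens, embed_dim = 3, 4
soft_prompt = tf.Variable(tf.random.normal([num_prompt_tokens, embed_dim]), trainable=True)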
Step 3: Concatenation with Input
The soft prompts are concatenated with the actual input text before being passed to the model. This creates a composite input sequence where the soft prompts serve as a task-specific prefix. For example:
- Soft Prompt: [P1, P2, P3]
- Input Text: "This movie was fantastic!"
- Composite Input: [P1, P2, P3, "This", "movie", "was", "fantastic", "!"]
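In code, this concatenation happens in embedding space rather than on raw strings. A minimal sketch, with made-up token IDs, vocabulary size and dimensions:
import tensorflow as tf
vocab_size, embed_dim = 1000, 4
embedding = tf.keras.layers.Embedding(vocab_size, embed_dim)  # frozen in a real setup
token_ids = tf.constant([[12, 47, 9, 305, 8]])  # stands in for "This movie was fantastic !"
token_embeds = embedding(token_ids)             # shape: (1, 5, embed_dim)
soft_prompt = tf.Variable(tf.random.normal([1, 3, embed_dim]))  # [P1, P2, P3]
composite = tf.concat([soft_prompt, token_embeds], axis=1)      # shape: (1, 8, embed_dim)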
Step 4: Optimization
During training, the model's output is compared to the ground truth using a loss function such as cross-entropy for classification tasks. Gradients are then backpropagated only through the soft prompts, leaving the rest of the model's parameters unchanged.
Step 5: Inference
Once the soft prompts are optimized, they can be reused for inference on new inputs for the same task. The frozen model generates predictions based on the learned soft prompts, effectively adapting to the task without requiring full fine-tuning.
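Because the task-specific knowledge lives entirely in the prompt, deployment can be as simple as saving and reloading that small tensor. A sketch continuing the Step 3 example above (the file name is arbitrary):
import numpy as np
np.save("sentiment_prompt.npy", soft_prompt.numpy())  # persist only the tiny learned prompt
# Later, at inference time: reload the prompt and prepend it to any new input.
loaded_prompt = tf.constant(np.load("sentiment_prompt.npy"))
new_embeds = embedding(tf.constant([[77, 301, 4]]))  # embeddings of a new sentence
prediction_input = tf.concat([loaded_prompt, new_embeds], axis=1)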
Mathematical Explanation of Prompt Tuning
To better understand prompt tuning, let's break it down mathematically.
1. Pre-trained Language Model
Consider a pre-trained language model f(x; θ), where:
- x is the input text (as a sequence of token embeddings).
- θ represents the fixed parameters (weights and biases) of the large language model.
- f(x; θ) outputs a probability distribution over the next word or generates a continuation of the input text.
To understand the benefits of prompt tuning, it helps to compare it with fine-tuning: fine-tuning updates the parameters θ themselves, while prompt tuning keeps θ frozen and learns only the small prompt p introduced below.
2. Learnable Prompts
Instead of directly feeding the input text x into the model, we prepend a learnable prompt p = [p1, p2, ..., pk]. These embeddings are initialized randomly and optimized during training to guide the model toward the desired task. The final input x′ to the model becomes:
x′ = [p; x]
Here [p; x] denotes concatenation, k is the number of prompt tokens and each pi has the same dimensionality as the model's token embeddings.
3. Model Operation
The LLM processes the concatenated input x′ to produce an output:
ŷ = f(x′; θ) = f([p; x]; θ)
Note that θ stays fixed throughout; only p will be updated.
4. Loss Function
The model's output ŷ is compared against the ground-truth label y using a loss function, typically cross-entropy for classification:
L(p) = −Σ y log ŷ (summed over the output classes)
For example, if y = 1 (positive sentiment) and the model assigns probability ŷ = 0.9 to the positive class, the loss is −log(0.9) ≈ 0.105; a less confident prediction such as ŷ = 0.5 gives a larger loss of −log(0.5) ≈ 0.693.
5. Gradient Descent and Updating the Prompt
The optimization process uses gradient descent to adjust the learnable prompt p. We update the prompt embeddings based on the gradient of the loss function with respect to p:
p ← p − η ∇pL
Where:
- η is the learning rate (a small step size that controls how much we adjust the prompt).
- ∇pL is the gradient of the loss with respect to the prompt p.
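As a quick worked example with made-up numbers: suppose a one-vector prompt p = [0.10, 0.50], gradient ∇pL = [0.20, −0.40] and learning rate η = 0.1. The update gives p ← [0.10 − 0.1·0.20, 0.50 − 0.1·(−0.40)] = [0.08, 0.54], nudging the prompt in the direction that lowers the loss.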
6. Convergence
After enough iterations, the prompt p converges to a set of embeddings that guide the model to classify the sentiment of the input text more accurately.
Implementation of Prompt Tuning
Let's use a simple Python example where we optimize a learnable prompt p to guide a model for sentiment classification. The goal is to classify whether a sentence has a positive or negative sentiment.
Step 1: Import Necessary Libraries
First, we import the necessary libraries: numpy, tensorflow and matplotlib.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
Step 2: Define the Simple Model
Next, we define a simple neural network model using TensorFlow's Keras API. This model will mimic the behavior of a large language model (LLM) for our example.
- The model has two layers: one hidden layer with ReLU activation and an output layer that predicts the probability of two classes (positive or negative).
- For simplicity, this is a small neural network, but it represents the core idea of how an LLM processes inputs.
class SimpleModel(tf.keras.Model):
    def __init__(self, input_size, hidden_size):
        super(SimpleModel, self).__init__()
        self.fc1 = layers.Dense(hidden_size, activation='relu', input_shape=(input_size,))
        self.fc2 = layers.Dense(2)  # Binary classification: positive or negative

    def call(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        return x
Step 3: Prepare Input Data and Learnable Prompt
Now, we prepare the input data and define the learnable prompt (p). The prompt will be optimized during training to guide the model.
- sentence_embedding: Represents the input text as numerical vectors (embeddings).
- prompt_embedding: A set of learnable tokens that will be adjusted during training.
- target: The true label for the input text.
# Input data (embedding for the sentence "The food is delicious")
sentence_embedding = tf.constant([[0.2, 0.8], [0.5, 0.4], [0.9, 0.1], [0.6, 0.7]], dtype=tf.float32)
# Learnable prompt embeddings (p), we will optimize this
prompt_embedding = tf.Variable([[0.1, 0.5], [-0.4, 0.9]], dtype=tf.float32, trainable=True)
# Target label (1 for positive sentiment, 0 for negative)
target = tf.constant([1], dtype=tf.int32) # Positive sentiment
Step 4: Set Up the Model and Training Components
We initialize the model, define the loss function and set up the optimizer for training.
- SparseCategoricalCrossentropy: Measures how well the model’s predictions match the true labels.
- Adam optimizer: Updates the learnable prompt (p) based on gradients.
# Model parameters
input_size = 2 # Embedding size
hidden_size = 4
# Initialize the model
model = SimpleModel(input_size, hidden_size)
# Define the loss function
loss_function = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# Define the optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
# Store the loss values for plotting
loss_values = []
Step 5: Train the Model
We train the model by iteratively adjusting the learnable prompt (p) to minimize the loss.
- Concatenation: The learnable prompt (p) is prepended to the input text embeddings.
- Forward Pass: The model processes the combined input and produces an output.
- Loss Calculation: The difference between the model's prediction and the true label is computed.
- Gradient Descent: The prompt is updated to reduce the loss.
# Training loop to optimize the prompt
for epoch in range(5000):  # Run for 5000 epochs
    with tf.GradientTape() as tape:
        # Concatenate prompt with sentence embedding to form the model input
        model_input = tf.concat([prompt_embedding, sentence_embedding], axis=0)
        # Forward pass: simplified by averaging the input embeddings
        output = model(tf.reduce_mean(model_input, axis=0, keepdims=True))
        # Compute the loss
        loss = loss_function(target, output)
    # Backward pass and optimization
    gradients = tape.gradient(loss, [prompt_embedding])  # Compute gradients only for the prompt
    optimizer.apply_gradients(zip(gradients, [prompt_embedding]))  # Update the prompt embeddings
    # Store the loss value
    loss_values.append(loss.numpy())
    # Print progress every 10 epochs
    if epoch % 10 == 0:
        print(f'Epoch {epoch}, Loss: {loss.numpy()}, Prompt: {prompt_embedding.numpy()}')
Step 6: Visualize the Results
After training, we plot the loss values to see how the model improved over time and print the optimized prompt.
# Plot the loss values
plt.plot(loss_values)
plt.title('Loss Function Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.grid(True)
plt.show()
# Print the optimized prompt
print("Optimized Prompt:", prompt_embedding.numpy())
Output: a plot of the training loss decreasing over the epochs, followed by the optimized prompt embeddings printed to the console.
By following these steps you can implement prompt tuning in Python and adapt a pre-trained model to a specific task like sentiment analysis. This approach is lightweight, efficient and preserves the general knowledge of the original model.
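For real LLMs you would not hand-roll this loop; libraries such as Hugging Face PEFT ship a prompt-tuning implementation. A minimal sketch, assuming the transformers and peft packages are installed (argument names may vary between versions, so check the current peft documentation):
from transformers import AutoModelForCausalLM
from peft import PromptTuningConfig, TaskType, get_peft_model
base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM
config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=8,  # number of soft-prompt tokens to learn
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()  # only the prompt embeddings are trainable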
Advantages of Prompt Tuning
Prompt tuning offers several key benefits over traditional fine-tuning:
- Parameter Efficiency: Unlike full fine-tuning, which updates all parameters of the model, prompt tuning modifies only a small subset of parameters. This drastically reduces memory and computational requirements (see the quick calculation after this list).
- Task-Specific Adaptation : Soft prompts can be tailored to specific tasks, enabling the same pre-trained model to handle multiple tasks simultaneously without interference.
- Scalability: Prompt tuning scales well with larger models. As models grow in size, the relative overhead of managing soft prompts remains minimal.
- Preservation of General Knowledge: By keeping the majority of the model frozen, prompt tuning ensures that the general knowledge acquired during pre-training is preserved, reducing the risk of catastrophic forgetting.
- Faster Deployment : Since only the soft prompts need to be stored and distributed, prompt tuning simplifies the deployment of LLMs across different tasks and environments.
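To make the parameter-efficiency point concrete, here is a quick back-of-the-envelope calculation; the model size, prompt length and embedding dimension are illustrative assumptions:
model_params = 11_000_000_000        # e.g. an 11B-parameter LLM (assumed)
prompt_tokens, embed_dim = 20, 4096  # assumed soft-prompt length and embedding size
prompt_params = prompt_tokens * embed_dim
print(prompt_params)                 # 81920 trainable values
print(prompt_params / model_params)  # ~7.4e-06, a tiny fraction of the model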
Limitations of Prompt Tuning
While prompt tuning offers several advantages, it is not without limitations:
- Task Complexity: It may struggle with highly complex tasks that require extensive modifications to the model's behavior. In such cases, full fine-tuning might still be necessary.
- Initialization Sensitivity: The performance of prompt tuning can be sensitive to the initialization of the soft prompts. Poor initialization may lead to suboptimal results.
- Limited Interpretability: Unlike discrete textual prompts, soft prompts are not human-readable, making it difficult to interpret what the model has learned.
As NLP models continue to grow in size and complexity, techniques like prompt tuning play an important role in making these models accessible and practical for real-world applications.