This repository was archived by the owner on Oct 25, 2024. It is now read-only.

[NeuralChat] Add Multi-Socket LLM Inference Example #1073

Open. letonghan wants to merge 3 commits into main.


letonghan
Contributor

@letonghan letonghan commented Dec 25, 2023

Type of Change

Add NeuralChat example
API not changed

Description

Add Multi-Socket LLM inference example for NeuralChat.
Related DeepSpeed PR: deepspeedai/DeepSpeed#4750 (not merged yet)

Expected Behavior & Potential Risk

Customers can run multi-socket LLM inference with DeepSpeed by following this example.

How has this PR been tested?

Tested locally on an SPR server.

Dependency Change?

No.

Signed-off-by: LetongHan <letong.han@intel.com>
mengfei25 pushed a commit to mengfei25/intel-extension-for-transformers that referenced this pull request Dec 27, 2023