Get Started with Intel® Extension for PyTorch* on GPU
If you’re looking to accelerate your PyTorch* workflows on GPU, the Intel® Extension for PyTorch* can help you achieve significant performance improvements. In this article, we’ll walk you through the steps to get started with this extension and take advantage of its capabilities.
Step 1: Install PyTorch*
Before you can use the Intel® Extension for PyTorch*, you’ll need to have PyTorch* installed on your system. You can find installation instructions on the PyTorch* website.
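For example, a basic installation looks like the command below; check the PyTorch* installation selector for the exact command for your platform, since Intel GPU support requires a torch build that matches the extension version installed in Step 2.

python -m pip install torch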
Step 2: Install the Intel® Extension for PyTorch*
Once you have PyTorch* installed, you can install the Intel® Extension for PyTorch* with pip. The package is named intel-extension-for-pytorch:
pip install intel-extension-for-pytorch
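Note that the plain PyPI package targets CPU. The GPU (XPU) wheels are published on Intel's own package index, and the torch and extension versions must match. The command below is only an illustration; take the exact versions and index URL from the official installation guide:

python -m pip install torch intel-extension-for-pytorch --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/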
Step 3: Use the Extension with GPU
Now that you have the extension installed, you can start using it with your Intel GPU. Import the extension in your Python code; importing it registers an "xpu" device with PyTorch*, which you then use when creating tensors or moving models and data:
import torch
import intel_extension_for_pytorch as ipex  # importing registers the "xpu" device
# Create a tensor and move it to the Intel GPU
x = torch.tensor([1, 2, 3]).to('xpu')
# Operations on the tensor now run on the GPU
y = x * 2
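The extension also provides ipex.optimize(), which applies operator and memory-layout optimizations to a whole model. Here is a minimal inference sketch; the linear layer and input shapes are placeholders, not a recommended model:

import torch
import intel_extension_for_pytorch as ipex

# Toy model, used only for illustration
model = torch.nn.Linear(128, 64)
model.eval()
model = model.to('xpu')

# Apply the extension's operator/memory-layout optimizations for inference
model = ipex.optimize(model)

# Run a forward pass on the Intel GPU
data = torch.randn(32, 128).to('xpu')
with torch.no_grad():
    output = model(data)
print(output.shape)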
Step 4: Measure Performance
Once your workload runs on the GPU through the extension, you can measure the speedup against the same workload running without it. Tools such as Intel® VTune™ Profiler can help you analyze where time is spent and guide further optimizations.
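As a quick first check before reaching for a profiler, you can time an operation directly in Python. GPU work is queued asynchronously, so you need to synchronize before reading the clock; the snippet below assumes the extension exposes torch.xpu.synchronize() for this, and the matrix sizes and iteration counts are arbitrary:

import time
import torch
import intel_extension_for_pytorch as ipex  # registers the torch.xpu backend

a = torch.randn(4096, 4096).to('xpu')
b = torch.randn(4096, 4096).to('xpu')

# Warm-up iterations so one-time kernel compilation/caching is not measured
for _ in range(3):
    torch.matmul(a, b)
torch.xpu.synchronize()

start = time.time()
for _ in range(10):
    torch.matmul(a, b)
torch.xpu.synchronize()  # wait for all queued GPU work to finish
print(f"avg matmul time: {(time.time() - start) / 10:.4f} s")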
Conclusion
By following these steps, you can quickly get started with the Intel® Extension for PyTorch* on GPU and unlock its performance benefits for your PyTorch* workflows. Experiment with different models and workloads to see the full potential of this extension.
For more information and detailed documentation, be sure to visit the Intel® Software website.
Can this extension be used on all Intel GPUs, such as Intel UHD Graphics? Where can I find a list of GPUs supported by this extension?
Is there a demo of training with torch on XPU?
Can we train on an Arc GPU using this?
show me da money!
What's the timeframe for supporting Windows and PyTorch 2.0?
Right now the only option that works is relying on OpenVINO for inference, and that seems to be developing faster. Also, why does the cache warm-up have to be done manually?