
PyTorch on Windows (ROCm on AMD Radeon 780M gfx1103)

Current Windows Support Status

ROCm support on Windows

ROCm support on Windows is currently incomplete and under active development. Users should expect potential instability, limited feature sets, and library-specific issues compared to the Linux implementation.

Key Resources:

  • GPU Compatibility: Check SUPPORTED_GPUS.md for the latest list of verified hardware.
  • Component Support Status: Refer to windows_support.md for a detailed breakdown of ROCm components.
  • GFX1103 Progress: Track the latest updates for Radeon 780M and other consumer GPUs in TheRock Issue #1337.

Support Summary:

  • Supported: Core math libraries (rocBLAS, rocRAND, rocFFT, rocSOLVER, rocSPARSE), ML libraries (MIOpen, hipDNN), and the AMD-LLVM compiler toolchain are generally functional.
  • Unsupported/Limited: Profiling tools (rocprofiler-sdk, aqlprofile), communication libraries (RCCL), and media decoding (rocDecode, rocJPEG) are currently unsupported or restricted on Windows. System-level tools like amdsmi and rocr-runtime are also pending full support.

Prerequisites

  1. Install the latest AMD Software: Adrenalin Edition driver.
  2. Read TheRock release guidance (Windows and compatibility notes): https://github.com/ROCm/TheRock/blob/main/RELEASES.md.

ROCm installation paths

There are three ways to install ROCm.

Option 1: Python packages (pip)

uv pip install --index-url https://rocm.nightlies.amd.com/v2/gfx110X-all/ "rocm[libraries,devel]"

Option 2: Prebuilt artifacts

  1. Clone TheRock repository.
  2. Use the artifact installation helper:
    • TheRock\build_tools\install_rocm_from_artifacts.py
  3. Choose a supported channel:
    • dev (recommended)
    • nightly
  4. Optional: target a specific GitHub Actions run ID for reproducible behavior. This lets you install a specific build that has been verified for a particular device model. Example reference run:
    • https://github.com/ROCm/TheRock/actions/runs/23688149485/job/69056949871

See the artifact install docs for more details.

Option 3: Build from source

Source builds need roughly 100 GB of disk space and many hours of build time, and should follow the same code path as the prebuilt artifacts. Build from source only if you need custom build control.
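If you installed ROCm through the pip route, a quick way to confirm that the package landed in the active environment is to query its installed version. The following is a minimal sketch that only uses the Python standard library; it assumes the distribution is registered under the name rocm (taken from the install command above), and the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "rocm" is assumed from the pip command in this guide.
    version = installed_version("rocm")
    if version is None:
        print("rocm is not installed in this environment")
    else:
        print("rocm version:", version)
```

Run it with the same interpreter (or uv environment) that you installed into; a None result usually means a different venv is active.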

Prerequisites and Set up Environment

  • Confirm prerequisites in RELEASES.md
  • Reinstall Git and select the PATH option "Use Git and optional Unix tools from the Windows Command Prompt".
  • Set these Git options:
    • git config --global core.symlinks true
    • git config --global core.longpaths true
    • git config --global core.autocrlf true
  • Clone https://github.com/ROCm/TheRock.
  • Validate environment using:
    • .\TheRock\build_tools\validate_windows_install.ps1
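The Git options listed above can also be checked programmatically before a long fetch. This is a minimal sketch, not part of TheRock: it shells out to git config --global --get and compares the result against the three values from this guide. The helper names read_git_option and mismatches are hypothetical.

```python
import subprocess

# Values taken from the list in this guide.
EXPECTED = {
    "core.symlinks": "true",
    "core.longpaths": "true",
    "core.autocrlf": "true",
}


def read_git_option(name):
    """Return the global value of a Git option, or None if unset or Git is missing."""
    try:
        result = subprocess.run(
            ["git", "config", "--global", "--get", name],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return None
    return result.stdout.strip() or None


def mismatches(actual, expected=EXPECTED):
    """Return the options whose actual value differs from the expected one."""
    return {
        name: actual.get(name)
        for name, wanted in expected.items()
        if (actual.get(name) or "").lower() != wanted
    }


if __name__ == "__main__":
    current = {name: read_git_option(name) for name in EXPECTED}
    problems = mismatches(current)
    print("Git options OK" if not problems else f"Fix these options: {problems}")
```

This does not replace validate_windows_install.ps1; it only covers the three Git settings.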

VS Build Tools environment note

  • The x64 Native Tools Command Prompt for VS 2022 is a cmd.exe shell, so PowerShell scripts may fail when run from it.
  • Developer PowerShell for VS 2022 targets x86 by default, which is not sufficient for TheRock builds.
  • Use scripts\activate_building_tools.ps1 in this repo to import the environment set by VsDevCmd.bat into PowerShell.
  • Usage: open PowerShell, then copy the script content into the terminal and run it. Adjust the VsDevCmd.bat path in the script to match your Visual Studio installation. The script converts the variables set by the batch file into PowerShell environment variables so that the VS Build Tools are recognized.
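To illustrate the idea behind that conversion, here is a rough Python sketch of the same technique: run the batch file in cmd.exe, dump the resulting environment with set, and parse the NAME=VALUE lines. It is not the repo's PowerShell script, and the function names are hypothetical. The -arch=amd64 and -host_arch=amd64 arguments are standard VsDevCmd.bat options; the path you pass in is an assumption that depends on your installation.

```python
import subprocess


def parse_set_output(text):
    """Parse the NAME=VALUE lines printed by the cmd.exe `set` command."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name:  # skip blank lines and lines without a name
            env[name] = value
    return env


def changed_variables(before, after):
    """Return the variables that are new or different in `after`."""
    return {name: value for name, value in after.items() if before.get(name) != value}


def capture_vs_environment(vsdevcmd_path):
    """Run VsDevCmd.bat for an x64 target and return the resulting environment.

    Windows only: shell=True runs the command line through cmd.exe.
    """
    command = f'"{vsdevcmd_path}" -arch=amd64 -host_arch=amd64 >nul && set'
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return parse_set_output(completed.stdout)
```

A typical use would be changed_variables(dict(os.environ), capture_vs_environment(path)) followed by os.environ.update(...) on the result, where path points at your VsDevCmd.bat (for example under ...\2022\BuildTools\Common7\Tools\ on a default Build Tools install).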

Build flow

  • Install Python dependencies: uv pip install -r requirements.txt. Activate the venv first if it is not already active.
  • Fetch sources: uv run python ./build_tools/fetch_sources.py. This takes a long time because it handles many files.
  • Optional: set up ccache with TheRock\build_tools\setup_ccache.py.
  • Configure: cmake -B build -GNinja . -DTHEROCK_AMDGPU_FAMILIES=gfx110X-all
  • Build: cmake --build build
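Because a source build needs on the order of 100 GB, it can be worth checking free disk space before fetching sources. The sketch below uses only shutil.disk_usage from the standard library; the 100 GB threshold is the rough figure from this guide, not a hard limit enforced by TheRock, and the helper names are hypothetical.

```python
import shutil

REQUIRED_GB = 100  # rough figure from this guide, not an exact requirement


def free_gb(path="."):
    """Return the free space, in GiB, on the drive that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_room(path=".", required_gb=REQUIRED_GB):
    """Return True if the drive containing `path` has at least `required_gb` free."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"Free space: {free_gb():.1f} GiB")
    if not has_room():
        print(f"Warning: less than {REQUIRED_GB} GiB free; the build may run out of space")
```

Run it from the directory where you cloned TheRock so that the check applies to the right drive.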

Additional docs:

  • https://github.com/ROCm/TheRock/blob/main/docs/development/README.md
  • https://github.com/ROCm/TheRock/blob/main/docs/development/windows_support.md

PyTorch installation

This step usually goes smoothly. Wheels for Torch 2.9 and 2.10 currently support Python 3.10 through 3.12.
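The Python version range above can be checked before installing. This is a small sketch that assumes the 3.10 to 3.12 range stated in this guide is still current; the constants and the function name is_supported are hypothetical.

```python
import sys

# Range taken from this guide; adjust if the wheel index adds newer versions.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 12)


def is_supported(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    state = "supported" if is_supported() else "outside the supported range"
    print(f"Python {current} is {state}")
```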

Recommended:

uv pip install --index-url https://rocm.nightlies.amd.com/v2/gfx110X-all/ torch torchaudio torchvision

This pulls prebuilt wheels, which usually have the best compatibility.

Other Methods

If your device still has compatibility issues, inspect the Windows PyTorch wheel release workflow for artifact options or source build instructions.

Fixing a known issue: HIP API error 0100 on iGPU mapping

checkHipErrors() HIP API error = 0100 "no ROCm-capable device is detected"

Observed GPU enumeration issue on Windows

TheRock issues confirm that enumeration of an iGPU alongside a discrete GPU on Windows is still under active work.

Workaround (for current session)

To force the HIP runtime to recognize the iGPU, you must explicitly set the HIP_VISIBLE_DEVICES environment variable to 0.

In PowerShell (Current Session):

$env:HIP_VISIBLE_DEVICES = "0"

In cmd.exe (Current Session):

set HIP_VISIBLE_DEVICES=0

To make the setting permanent, add a user environment variable named HIP_VISIBLE_DEVICES with the value 0 in Windows Settings.
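The same variable can be set from inside a Python script instead of the shell. The sketch below assumes, as with other device-visibility variables, that the value has to be in place before the HIP runtime initializes, so the safest place is before torch is imported. The helper name select_hip_device is hypothetical.

```python
import os


def select_hip_device(index=0, override=False):
    """Set HIP_VISIBLE_DEVICES unless the user already set it (or override is True).

    Returns the value that is in effect afterwards.
    """
    if override or "HIP_VISIBLE_DEVICES" not in os.environ:
        os.environ["HIP_VISIBLE_DEVICES"] = str(index)
    return os.environ["HIP_VISIBLE_DEVICES"]


# Typical use at the very top of a script, before importing torch:
#
#     select_hip_device(0)
#     import torch
```

Leaving override at False keeps any value you exported in the shell, so the script does not silently fight the session setting.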

More information can be found in the TheRock GitHub issues.

Final Verification and Testing

import torch
import time

print("ROCm:", torch.version.hip)
print("GPU available:", torch.cuda.is_available())

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

# Create deterministic tensors
tensor_a_cpu = torch.full((1000, 1000), 2.0, device='cpu')
tensor_b_cpu = torch.full((1000, 1000), 3.0, device='cpu')

# CPU computation
start_time = time.perf_counter()  # monotonic, high-resolution timer
result_cpu = tensor_a_cpu + tensor_b_cpu
cpu_time = time.perf_counter() - start_time
print(f"CPU operation took: {cpu_time:.6f} seconds")

# GPU computation
if torch.cuda.is_available():
    tensor_a_gpu = tensor_a_cpu.to('cuda')
    tensor_b_gpu = tensor_b_cpu.to('cuda')

    torch.cuda.synchronize()  # ensure accurate timing
    start_time = time.perf_counter()

    result_gpu = tensor_a_gpu + tensor_b_gpu

    torch.cuda.synchronize()  # wait for GPU to finish
    gpu_time = time.perf_counter() - start_time
    print(f"GPU operation took: {gpu_time:.6f} seconds")

    # Move GPU result back to CPU for comparison
    result_gpu_cpu = result_gpu.to('cpu')

    # Verify correctness
    if torch.allclose(result_cpu, result_gpu_cpu):
        print("CPU and GPU results match!")
    else:
        print("Results differ!")

Sample output

CPU operation took: 0.004181 seconds
GPU operation took: 0.001731 seconds
CPU and GPU results match!
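A single timed run like the one above is noisy, and the first GPU call often includes one-time setup cost. For a steadier comparison, a small helper can do a few warm-up calls and report the median of several runs. This is a generic sketch rather than a PyTorch API: the name median_time and its parameters are hypothetical, and the sync argument is where you would pass torch.cuda.synchronize for GPU work.

```python
import statistics
import time


def median_time(fn, sync=None, warmup=3, repeats=10):
    """Return the median wall-clock time of fn() in seconds.

    `sync` is called after each timed run so that asynchronous work
    (for example, GPU kernels) has finished before the clock stops.
    """
    sync = sync or (lambda: None)
    for _ in range(warmup):  # untimed runs to absorb one-time setup cost
        fn()
    sync()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        sync()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

With the tensors from the script above, the GPU case would look like median_time(lambda: tensor_a_gpu + tensor_b_gpu, sync=torch.cuda.synchronize), and the CPU case like median_time(lambda: tensor_a_cpu + tensor_b_cpu).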