Enable GPU-supported RGF calculation by adding 'rgf_device'#29

Merged
AsymmetryChou merged 13 commits into deepmodeling:main from AsymmetryChou:rgf_acc
Jun 14, 2026


@AsymmetryChou AsymmetryChou commented Jun 14, 2026

Collaborator

This pull request adds explicit device management for the recursive Green's function (RGF) step of the NEGF code. A new rgf_device parameter selects whether the RGF step runs on CPU or CUDA, and it is propagated through the pipeline so that every tensor and operation involved in that step is placed on the selected device. Tensor allocations, conversions, and file I/O are updated accordingly, and the documentation and argument parsing now state which stages run on which device.

The most important changes are:

Device management and propagation:

  • Added an rgf_device parameter (defaulting to CPU) to core classes and functions, including DeviceProperty, NEGF, and argument parsing, to control where the RGF step runs. All relevant tensor allocations and computations in the NEGF pipeline now use this device.
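
As a rough illustration of this kind of propagation, the sketch below passes an rgf_device argument into a class and uses it only for the inversion step. `ToyNEGF` and `resolve_device` are hypothetical names invented for this example, not the project's DeviceProperty or NEGF classes, and the fallback to CPU when CUDA is missing is an assumption rather than behavior confirmed by this PR.

```python
import torch


def resolve_device(name: str = "cpu") -> torch.device:
    """Return the requested device, falling back to CPU if CUDA is unavailable."""
    device = torch.device(name)
    if device.type == "cuda" and not torch.cuda.is_available():
        return torch.device("cpu")
    return device


class ToyNEGF:
    """Hypothetical stand-in used only for illustration."""

    def __init__(self, n: int = 4, rgf_device: str = "cpu"):
        self.rgf_device = resolve_device(rgf_device)
        # The Hamiltonian is built on CPU, mirroring the PR description.
        h = torch.randn(n, n, dtype=torch.complex128)
        self.hamiltonian = 0.5 * (h + h.conj().T)  # make it Hermitian

    def green(self, energy: float, eta: float = 1e-6) -> torch.Tensor:
        # Only the inversion runs on rgf_device; inputs are moved just before it.
        h = self.hamiltonian.to(self.rgf_device)
        eye = torch.eye(h.shape[0], dtype=h.dtype, device=self.rgf_device)
        return torch.linalg.inv((energy + 1j * eta) * eye - h)


negf = ToyNEGF(rgf_device="cuda")  # uses CPU when no GPU is present
g = negf.green(0.1)
```

Keeping the default at "cpu" means existing callers that never pass rgf_device see no change in behavior.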

Consistent tensor placement:

  • Ensured all tensors used in RGF and related calculations are allocated on or moved to the correct device (rgf_device), including Hamiltonian blocks, self-energies, and intermediate matrices in both batched and non-batched scenarios.
  • Updated tensor operations and file I/O to handle device transfers, including moving tensors to CPU before converting them with .numpy() for saving, and ensuring device consistency during block construction and property calculations.
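
The CPU transfer before .numpy() matters because PyTorch cannot convert a CUDA tensor to a NumPy array directly. A minimal sketch of the pattern follows; the helper name `to_numpy` and the in-memory buffer are assumptions made for this example, not code from the PR.

```python
import io

import numpy as np
import torch


def to_numpy(t: torch.Tensor) -> np.ndarray:
    # .numpy() requires a CPU tensor that does not track gradients.
    return t.detach().cpu().numpy()


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
transmission = torch.linspace(0.0, 1.0, 5, device=device)

# Save to an in-memory buffer here; a file path works the same way.
buffer = io.BytesIO()
np.save(buffer, to_numpy(transmission))
buffer.seek(0)
restored = np.load(buffer)
```

Calling .cpu() on a tensor that already lives on the CPU returns it without copying, so the helper is safe regardless of the value of rgf_device.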

Documentation and argument updates:

  • Clarified in comments and argument documentation that Hamiltonian initialization and self-energy calculations always run on CPU, while only the RGF step uses the specified device. Added a new rgf_device argument to the CLI and configuration.
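
For readers unfamiliar with how such an option is typically exposed, here is a minimal sketch using the standard-library argparse module. The project's real argument handling may use a different mechanism; the flag spelling `--rgf_device`, the restriction to "cpu" and "cuda", and the help text are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Toy NEGF runner (hypothetical)")
    parser.add_argument(
        "--rgf_device",
        default="cpu",
        choices=["cpu", "cuda"],
        help=(
            "Device for the RGF step only. Hamiltonian initialization and "
            "self-energy calculations always run on CPU."
        ),
    )
    return parser


args = build_parser().parse_args(["--rgf_device", "cuda"])
default_args = build_parser().parse_args([])
```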

These changes enable GPU acceleration of the RGF step and keep tensor placement consistent across the codebase.

TODO:
For now, the RGF step transfers all Hamiltonian and overlap sub-blocks to the GPU at once, so large systems can run out of memory (OOM). A finer-grained transfer strategy still needs to be implemented.
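
One possible direction, sketched here only as an illustration and not as what this PR or the project implements, is to stream blocks to the device one at a time during the forward sweep. The function `left_connected_green` is hypothetical, assumes an orthogonal basis (overlap equal to the identity), and omits lead self-energies.

```python
import torch


def left_connected_green(h_diag, h_upper, energy, device, eta=1e-6):
    """Forward RGF sweep that moves one block at a time to `device`.

    h_diag[i] is H_{i,i}; h_upper[i] is H_{i,i+1}. All inputs live on CPU.
    """
    z = energy + 1j * eta
    g_prev = None
    results = []
    for i, h_ii in enumerate(h_diag):
        h_ii = h_ii.to(device)  # only the current block is on the device
        a = z * torch.eye(h_ii.shape[0], dtype=h_ii.dtype, device=device) - h_ii
        if g_prev is not None:
            v = h_upper[i - 1].to(device)  # coupling H_{i-1,i}
            a = a - v.conj().T @ g_prev @ v
        g_prev = torch.linalg.inv(a)
        results.append(g_prev.cpu())  # keep finished blocks in host memory
    return results


torch.manual_seed(0)
n_blocks, size = 3, 2
h_diag = []
for _ in range(n_blocks):
    m = torch.randn(size, size, dtype=torch.complex128)
    h_diag.append(0.5 * (m + m.conj().T))  # Hermitian diagonal blocks
h_upper = [
    torch.randn(size, size, dtype=torch.complex128) for _ in range(n_blocks - 1)
]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
g_left = left_connected_green(h_diag, h_upper, energy=0.2, device=device, eta=0.1)
```

With this layout the peak device memory scales with a few blocks rather than with the whole system, at the cost of more host-to-device transfers.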

@AsymmetryChou AsymmetryChou changed the title Add 'rgf_device' support Enable GPU-supported RGF calculation by adding 'rgf_device' Jun 14, 2026
@AsymmetryChou AsymmetryChou merged commit b6663bd into deepmodeling:main Jun 14, 2026
1 check passed