[Advisory] MPI, Debugger and Profiler Behavior After CVE-2026-46333 Mitigation

Dear NSCC Users,

Red Hat published a security advisory (CVE-2026-46333, Red Hat Security Bulletin RHSB-2026-004) describing a local privilege-escalation vulnerability in the Linux kernel. NSCC applied Red Hat’s recommended mitigation to ASPIRE 2A on 17 May 2026.

The same mitigation has also been applied to ASPIRE 2A+ as a precautionary measure, while we await a response and further guidance from NVIDIA.

This is a defense-in-depth measure intended to ensure continued protection through the planned June kernel upgrade, where the underlying conditions may change. Please be assured that no NSCC user data, jobs, or accounts are known to have been affected by this vulnerability.

However, because this mitigation restricts certain kernel-level process tracking and memory access, it will temporarily alter the behavior of development tools and MPI frameworks as detailed below.

 

Debuggers and Profilers
Debuggers, profilers, and tracing tools that rely on ptrace may not function as expected under this mitigation. The most common symptom is “Operation not permitted” when a tool attempts to inspect or attach to a process.

General Guideline: Workflows where a tool starts your program from the beginning are more likely to continue working. Workflows where a tool reaches into or attaches to a process that is already running will likely fail.

If your usual workflow involves attaching to an existing PID (e.g., gdb -p, strace -p, perf -p, nsys attach, ncu --pid, py-spy --pid, gcore), expect it to fail or behave incorrectly.
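
The guideline above can be turned into a quick pre-flight check before submitting a job. The following is a heuristic sketch, not an NSCC-provided tool: classify_workflow is a hypothetical helper that labels an invocation as "attach" or "launch" using only the attach-style options named in this advisory (-p, --pid, attach, gcore). The ./my_app binary is a placeholder.

```shell
# Heuristic sketch: label a tool invocation as "attach" (targets an existing PID)
# or "launch" (the tool starts the program itself).
classify_workflow() {
    cw_tool="${1##*/}"          # tool name without any leading path
    # gcore always dumps an already-running process.
    if [ "$cw_tool" = "gcore" ]; then
        echo "attach"
        return 0
    fi
    for cw_arg in "$@"; do
        case "$cw_arg" in
            --args|--) break ;;                 # the rest belongs to the launched program
            -p|-p[0-9]*|--pid|--pid=*|attach)   # attach-style options named in the advisory
                echo "attach"
                return 0 ;;
        esac
    done
    echo "launch"
}

classify_workflow gdb -p 12345                      # prints: attach
classify_workflow gdb --args ./my_app -p 3          # prints: launch
classify_workflow strace -f -o trace.log ./my_app   # prints: launch
```

A "launch" result only means the workflow is more likely to keep working. It does not guarantee that the tool is unaffected by the mitigation.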

Tools that may not function as expected include, but are not limited to:

  • gdb, cuda-gdb
  • strace, ltrace
  • perf (record, stat, top)
  • nsight-systems (nsys), nsight-compute (ncu)
  • VTune
  • CrayPat (pat_run), perftools-lite
  • valgrind4hpc, heaptrack (in some modes)
  • gcore, py-spy, rr, and other tools that read /proc/<pid>/mem

 

Important Note: Arm Forge (DDT, MAP, PR) is known not to work under this mitigation and should not be used until further notice.

This list is not exhaustive. If you use a tool not mentioned here and observe unexpected behavior, assume it may be related to this change.

MPI and Intra-node Communication
Cray MPICH’s shared-memory single-copy optimizations (XPMEM, cross-memory-attach) rely on kernel mechanisms that this mitigation also gates. Without action, Cray MPICH jobs would be expected to fail or hang on intra-node communication.

We have applied site-wide environment defaults to disable the affected single-copy paths:
export MPICH_CH4_XPMEM_LMT_MSG_SIZE=NONE
export MPICH_SMP_SINGLE_COPY_MODE=NONE

With these in place, Cray MPICH is expected to function normally. The exports are applied in the default module environment, so you do not need to add them to your job scripts unless you have explicitly overridden them. There may be a small performance impact on large intra-node messages, but correctness is preserved.
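
If your job scripts or shell startup files modify MPICH settings, it may be worth confirming that the two defaults are still in effect before launching a Cray MPICH job. The sketch below assumes only the two variable names given above; check_single_copy_disabled is a hypothetical helper name, not part of the NSCC module environment.

```shell
# Sketch: report whether the two site defaults are still set to NONE.
# Returns 0 if both are NONE, 1 otherwise.
check_single_copy_disabled() {
    cs_status=0
    for cs_var in MPICH_CH4_XPMEM_LMT_MSG_SIZE MPICH_SMP_SINGLE_COPY_MODE; do
        # Read the variable named by cs_var (empty if unset).
        eval "cs_value=\${$cs_var-}"
        if [ "$cs_value" = "NONE" ]; then
            echo "$cs_var=NONE (ok)"
        else
            echo "$cs_var='$cs_value' (expected NONE)"
            cs_status=1
        fi
    done
    return "$cs_status"
}

# Example use near the top of a job script:
#   check_single_copy_disabled || echo "warning: single-copy defaults were overridden" >&2
```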

OpenMPI and NCCL are not affected and require no changes.

If you build your own MPI from source, or use a non-default MPI distribution, please apply equivalent flags to disable single-copy / cross-memory-attach mechanisms in your implementation.

Next Steps
We will revisit and roll back this mitigation once patched kernels are available, validated, and deployed in June. Until then, thank you for your patience and cooperation as you adapt your workflows.

If a particular task or business-critical workflow is materially impacted, please reach out to us so we can assist.

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

[Completed] NUS Fire Certification Inspection and Electrical Shutdown Affecting ASPIRE 2A & 2A+ from 15 May 2026, 3PM to 18 May 2026, 10AM

Dear NSCC users,

We are pleased to announce that the activities have been completed. You may proceed to log in to the ASPIRE 2A and 2A+ systems as per normal.

Important Note for ASPIRE 2A Users:

  • There will be temporary limitations to the MPI, debugger and profiler behavior after the CVE-2026-46333 mitigation. Please refer to our subsequent follow-up email for specific details.

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

NUS Fire Certification Inspection and Electrical Shutdown Affecting ASPIRE 2A & 2A+ from 15 May 2026, 3PM to 18 May 2026, 10AM

Dear NSCC users,

We wish to update you now that the schedule for the upcoming Fire Certification Inspection and Electrical Shutdown at the NUS Innovation 4.0 building has been confirmed. All services to the ASPIRE 2A and ASPIRE 2A+ systems will be affected.

Maintenance Details:
  • Start: 15 May 2026 (Friday), 3:00 PM SGT
  • End: 18 May 2026 (Monday), 10:00 AM SGT
Impact During the Maintenance Period:
  • There will be a full shutdown of the ASPIRE 2A and ASPIRE 2A+ systems.
  • All queues will stop dispatching jobs, and any jobs still running will be terminated gracefully before the system shuts down.
  • Users will not be able to access the systems during the maintenance period.

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

[Resolved] Service Disruption for NTU and SUTD Users Accessing ASPIRE 2A & ASPIRE 2A+

Dear NTU and SUTD users,

We are pleased to inform you that the network issue has been resolved. You may proceed to log in to the ASPIRE 2A and 2A+ systems as per normal.

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

Service Disruption for NTU and SUTD Users Accessing ASPIRE 2A & ASPIRE 2A+

Dear NTU and SUTD users,

We wish to inform you that there is a service disruption affecting network access to the ASPIRE 2A and 2A+ systems. Our team is diligently investigating the issue and working towards a swift resolution.

Cause of Disruption:
Issue with network connectivity between NSCC, NTU and SUTD.

Impact of the Disruption:
NTU and SUTD users will not be able to access the ASPIRE 2A and ASPIRE 2A+ systems from their respective institution’s network.

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

[Completed] Urgent Maintenance for ASPIRE 2A & 2A+ System on 8 May 2026, 9AM

Dear NSCC users,

We are pleased to announce that the urgent scheduled system maintenance for ASPIRE 2A & 2A+ has been completed.

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

Urgent Maintenance for ASPIRE 2A & 2A+ System on 8 May 2026, 9AM

Dear NSCC Users,

We wish to inform you that urgent system maintenance is currently being carried out on ASPIRE 2A & 2A+ to implement the mitigation for the Dirty Frag Linux local privilege escalation vulnerability.

Maintenance Details:

  • Start: 8 May 2026 (Friday), 9:00 AM SGT
  • End: 11 May 2026 (Monday), 6:00 PM SGT

Impact During the Maintenance Period:

  • Users will not be able to access the ASPIRE 2A and ASPIRE 2A+ systems during the maintenance period.

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

Security Alert: Compromised Python Package – litellm

Dear NSCC Users,

We wish to inform you that two malicious versions of the Python package litellm (v1.82.7 and v1.82.8) were found on PyPI.

These tampered versions contained hidden code that runs automatically every time Python starts without needing to import the package. The malicious code was heavily obfuscated and designed to steal sensitive data, including environment variables, SSH keys, and cloud credentials, and transmit them to an attacker-controlled server.

Full details from the LiteLLM developer: https://docs.litellm.ai/blog/security-update-march-2026

Am I Affected?

You are likely affected if you performed any of the following actions between 24 March 2026 18:39 SGT and 25 March 2026 00:00 SGT:

  • Manual Install: Installed or upgraded litellm via pip.
  • Unpinned Versions: Ran `pip install litellm` without pinning a specific version, resulting in the download of v1.82.7 or v1.82.8.
  • Docker Builds: Built a Docker image during this window using `pip install litellm`.
  • Transitive Dependency: Used AI frameworks (e.g., CrewAI, LangChain, or MCP servers) that automatically pulled in litellm as a sub-dependency.

Immediate Actions Required:

  • Verify Installed Versions
    • Run the following command in your terminal or environment:
      pip show litellm
      or
      pip list | grep litellm

    • If you have v1.82.7 or v1.82.8 installed, proceed to the next step.
  • Remove or Downgrade
    • Uninstall the compromised version:
      pip uninstall litellm

    • To resume your work safely, downgrade to a safe version:
      pip install litellm==1.82.6

  • Rotate Your Credentials
    • If you were running an affected version, assume all secrets in that environment are compromised.
    • Rotate any environment variables, SSH keys, API keys, and cloud credentials accessible from that environment.
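
The version check above can be scripted so that it is easy to repeat in each environment you use. This is a minimal sketch that assumes pip is on your PATH and inspects only the currently active Python environment; other virtual environments, conda environments, and container images need to be checked separately. The helper names are hypothetical.

```shell
# Succeed only if the given version string is one of the two malicious releases.
is_compromised_litellm() {
    case "$1" in
        1.82.7|1.82.8) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the installed litellm version as reported by `pip show`
# (prints nothing if litellm or pip is not present).
installed_litellm_version() {
    { pip show litellm 2>/dev/null || true; } | awk '/^Version:/ {print $2}'
}

litellm_version="$(installed_litellm_version)"
if is_compromised_litellm "$litellm_version"; then
    echo "litellm $litellm_version is a known-malicious release: uninstall it and rotate credentials"
else
    echo "no known-malicious litellm release found in this environment"
fi
```

A clean result for the current version does not rule out earlier exposure: if an affected version was installed at any point during the window above, the credential rotation step still applies.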

Should you have any questions or need assistance, please contact our Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you.

Warm regards,
The NSCC Team

[Resolved] Network Disruption for NTU Users Accessing ASPIRE 2A & ASPIRE 2A+

Dear NTU Users,

We are pleased to inform you that the network disruption has been resolved. You may proceed to log in to the ASPIRE 2A and ASPIRE 2A+ systems as per normal.

If you have any questions or require assistance, please contact the NSCC Helpdesk via the Service Desk Portal or email us at [email protected].

Thank you for your understanding.

Warm regards,
The NSCC Team

 

Network Disruption for NTU Users Accessing ASPIRE 2A & ASPIRE 2A+

Dear NTU Users,

We would like to inform you that there is currently a network disruption affecting access to the ASPIRE 2A and ASPIRE 2A+ systems. The NTU team is working closely with NSCC to resolve the issue as soon as possible.

Cause of Disruption:
Network connectivity issue between NSCC and NTU.


Impact of the Disruption:
All NTU users are unable to access the ASPIRE 2A and ASPIRE 2A+ systems.


If you have any questions or require assistance, please contact the NTU Helpdesk via [email protected].

Thank you for your understanding.

Warm regards,
The NSCC Team