Center for Adaptive Supercomputing - Multithreaded Architectures

CASS-MT Equipment and Facilities


The Cray XMT supercomputing system is a scalable, massively multithreaded platform with globally shared memory architecture for large-scale data analysis and data mining.

The system is built for parallel applications that are dynamically changing, require random access to shared memory, and, typically, do not run well on conventional systems. Multithreaded technology is ideally suited for tasks such as pattern matching, scenario development, behavioral prediction, anomaly identification, and graph analysis.

Architectural Overview

The Cray XMT system was architected to leverage Cray’s massively parallel processing (MPP) system design to create a scalable, reliable, and economical multithreaded supercomputing platform. The design is based on a Cray MPP compute blade but utilizes AMD Torrenza Innovation Socket technology to populate the AMD Opteron sockets with custom Cray ThreadStorm chips developed for multithreaded processing. A single Cray ThreadStorm processor can sustain 128 simultaneous threads and is connected with up to 8 GB of memory that is globally accessible by any other processor in the system.

Each Cray ThreadStorm processor is directly connected to a dedicated Cray SeaStar2™ interconnect chip, resulting in the high-bandwidth, low-latency network characteristic of all Cray systems. This allows the Cray XMT platform to scale from 16 to more than 8000 processors, providing over one million simultaneous threads and 64 terabytes of shared memory.
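The scaling figures above follow from the per-processor numbers (128 threads and up to 8 GB of memory each). A minimal sketch of that arithmetic follows; the function name `xmt_capacity` and the processor count 8192 (standing in for "more than 8000") are illustrative assumptions, not Cray specifications.

```python
# Per-processor figures taken from the text above.
THREADS_PER_PROCESSOR = 128    # simultaneous threads per ThreadStorm processor
MEMORY_GB_PER_PROCESSOR = 8    # maximum memory attached to each processor


def xmt_capacity(processors: int) -> tuple[int, int]:
    """Return (simultaneous threads, shared memory in GB) for a processor count."""
    threads = processors * THREADS_PER_PROCESSOR
    memory_gb = processors * MEMORY_GB_PER_PROCESSOR
    return threads, memory_gb


if __name__ == "__main__":
    # 8192 is a hypothetical count standing in for "more than 8000".
    for count in (16, 128, 8192):
        threads, memory_gb = xmt_capacity(count)
        print(f"{count:>5} processors: {threads:>9,} threads, "
              f"{memory_gb / 1024:g} TB shared memory")
```

Under these assumptions, a fully populated system lands just over one million threads and at 64 TB of memory, and a 128-processor system such as the CASS-MT configuration works out to 1 TB of shared memory.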

As another technology built on the Cray XT™ infrastructure, the Cray XMT platform includes separate AMD Opteron™-based service blades, which can be configured for I/O, login, network, or system functions and can also provide scalar processing for applications that are best served by a combination of scalar and multithreaded technologies. The Cray XMT system runs an operating system that distributes a multithreaded kernel to the compute blades and standard Linux to the service and I/O blades. This allows the compute nodes to focus on the application without being hampered by system administrative functions.

Systems Software & Programming Environment

Service nodes allow the compute nodes to focus on the application without being hampered by system administrative functions. The multithreaded kernel (MTK) is a monolithic OS that is based on BSD 4.4 with Cray extensions and provides a global shared-memory view of the system API.

CASS-MT Current System Configuration

  • Cray XMT hardware
    • 128 multithreaded ThreadStorm processors at 500 MHz (128 threads each)
    • 16 Service and I/O (SIO) nodes with dual 2.4 GHz Opteron processors
    • SeaStar2 high-speed network
    • Total of 1 TB Global Shared Memory
  • Cray XMT software
    • SIO nodes run a modified SuSE Linux environment
    • C/C++ parallelizing cross-compiler environment hosted on the SIO nodes to target ThreadStorm code generation

Netezza TwinFin

The Netezza TwinFin 6 system is the fourth generation of Netezza appliances. The Netezza TwinFin is a purpose-built, standards-based data appliance that architecturally integrates database, server and storage into a single system. The Netezza TwinFin appliance is designed for rapid analysis of data volumes scaling into the petabytes.

Architectural Overview

The Netezza TwinFin 6 system’s performance advantage over other analytic options comes from its asymmetric massively parallel processing (AMPP) architecture, which combines open, blade-based servers and commodity disk storage with Netezza’s patented data filtering using Field Programmable Gate Arrays (FPGAs). This combination delivers fast query performance on highly complex mixed workloads supporting tens of thousands of users, sophisticated analytics, and modular scalability to petabytes of data. Each Snippet Blade, or S-Blade, combines a standard blade server with a special card called the Netezza Database Accelerator.

Current Configuration

  • 6 S-Blades
  • 16 TB of uncompressed storage for user data
  • OS – Red Hat Linux Advanced Server 5.3
  • Supported APIs: SQL, OLE DB, ODBC 3.5, JDBC V3.0 Type 4
  • Database Portability: from IBM DB2, Informix, Microsoft SQL Server, MySQL, Oracle, Red Brick, Sybase IQ, Teradata


The CASS-MT Niagara2 system uses UltraSPARC T2 processors. The UltraSPARC T2 processor was the industry’s first “system on a chip,” packing the most cores and threads of any general-purpose processor available and integrating all the key functions of the server on a single chip: computing, networking, security, and I/O, plus tight integration with the Solaris operating system.

Current Configuration

  • Sun SPARC Enterprise T5240 server
  • Solaris 10 OS
  • 128 virtual CPUs
  • 8 cores at 1.6 GHz
  • Up to 64 threads per CPU (2 CPUs)
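The figures in the list above are related by simple multiplication. The sketch below checks that they agree, under the assumption that the 8 cores are counted per CPU rather than per server; the function names are hypothetical and used only for illustration.

```python
CPUS = 2                # from the list: "(2 CPUs)"
THREADS_PER_CPU = 64    # from the list: "Up to 64 threads per CPU"
CORES_PER_CPU = 8       # assumption: the 8 cores are counted per CPU


def virtual_cpus(cpus: int, threads_per_cpu: int) -> int:
    """Total hardware threads, which Solaris presents as virtual CPUs."""
    return cpus * threads_per_cpu


def threads_per_core(threads_per_cpu: int, cores_per_cpu: int) -> int:
    """Hardware threads multiplexed onto each core."""
    return threads_per_cpu // cores_per_cpu


if __name__ == "__main__":
    print("virtual CPUs:", virtual_cpus(CPUS, THREADS_PER_CPU))
    print("threads per core:", threads_per_core(THREADS_PER_CPU, CORES_PER_CPU))
```

With these inputs the total matches the 128 virtual CPUs listed, which supports reading the core count as per CPU.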

