Adaptive Computing – Torque

Torque is an industry-standard resource manager that focuses on:

Fault Tolerance

  • Additional failure conditions checked and handled
  • Node health check script support

Scheduling Interface

  • Extended query interface providing the scheduler with additional and more accurate information
  • Extended control interface allowing the scheduler increased control over job behavior and attributes
  • Collection of statistics for completed jobs

Scalability

  • Significantly improved server-to-MOM communication model
  • Ability to handle larger clusters with tens of thousands of nodes and jobs
  • Ability to handle larger jobs that span hundreds of thousands of processors
  • High responsiveness and reliability with multi-threading and TCP-based communication

Usability

  • Extensive logging additions
  • More human-readable logging (no more ‘error 15038 on command 42’)