Using Performance Co-Pilot to Monitor SNMP Devices

  • Origins of Performance Co-Pilot (PCP): Frustration with SNMP.
  • Started by SGI.
  • Top does not scale.
  • Open source toolkit for system-level performance analysis: live and historical data, extensible, distributed.
  • Many agents talking to a master daemon; many clients then talk to the daemon.
  • One common protocol for real-time and historical data.
  • Real-time conditional learning via a little language.
  • Log analysis: some nice tools that will, e.g., let you compare points in time.
  • GUIs. Everyone loves GUIs.
  • Alerting functionality.

PCP compared to SNMP

  • SNMP is extensible. To a fault. PCP is extensible, but in a more limited fashion.
  • Naming is likewise marginally more limited than in SNMP.
  • SNMP MIBs have the metadata; in PCP the metadata is in the agent. There’s a little more effort up-front when building agents.
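
The point about metadata living in the agent can be illustrated with a small sketch. This is not PCP's actual agent API; `MetricDesc`, `ToyAgent`, and the metric name `example.net.in_bytes` are hypothetical names made up for illustration. The only idea taken from the notes is that the agent itself carries the description (type, semantics, units) of each metric it exports, so that description has to be written when the agent is built.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class MetricDesc:
    """Hypothetical descriptor: the metadata an agent holds for one metric."""
    name: str        # dotted metric name
    value_type: str  # e.g. "u64"
    semantics: str   # e.g. "counter" or "instant"
    units: str       # e.g. "byte"


class ToyAgent:
    """Toy agent that stores metadata alongside the code that fetches values."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Tuple[MetricDesc, Callable[[], int]]] = {}

    def register(self, desc: MetricDesc, fetch: Callable[[], int]) -> None:
        # The up-front effort: every metric needs a descriptor at build time.
        self._metrics[desc.name] = (desc, fetch)

    def describe(self, name: str) -> MetricDesc:
        return self._metrics[name][0]

    def fetch(self, name: str) -> int:
        return self._metrics[name][1]()


agent = ToyAgent()
agent.register(
    MetricDesc("example.net.in_bytes", "u64", "counter", "byte"),
    lambda: 1234,  # stand-in for reading a real counter
)
```

A client can then ask the agent both what a metric means (`agent.describe(...)`) and what its current value is (`agent.fetch(...)`), without consulting a separate MIB-style file.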

Why Build a Bridge?

  • They use PCP extensively.
  • But you can’t use it with non-general-purpose devices.
  • PCP does a great job of monitoring, alerting, and trending.
  • But not for, e.g., switches, SANs, server LOMs, and so on.
  • The rest of the world uses SNMP.
  • There’s no general-purpose SNMP gateway.

How it Went

  • Side-by-side logging and use of the PCP tools.
  • A dynamic, ad-hoc interface that doesn’t need rewriting and reconfiguring ahead of time wasn’t possible, unfortunately.
  • There’s a solid TODO, mostly in fixing the PCP Perl libraries.
  • Add dynamic mappings.
  • Use the MIBs for number/name mappings.
  • Not in production yet.
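
The TODO item about number/name mappings can be sketched roughly as follows. This is an illustration, not the bridge's actual code: the `MIB_NAMES` table is hand-written here (a real version would presumably be generated from parsed MIB files), and the `snmp.` prefix and the `oid_to_metric` helper are assumptions. The three OIDs are the standard MIB-II objects for sysUpTime, ifInOctets, and ifOutOctets.

```python
from typing import Optional, Tuple

# Hand-written stand-in for what a MIB parser would provide:
# numeric object identifier -> symbolic object name.
MIB_NAMES = {
    "1.3.6.1.2.1.1.3": "sysUpTime",
    "1.3.6.1.2.1.2.2.1.10": "ifInOctets",
    "1.3.6.1.2.1.2.2.1.16": "ifOutOctets",
}


def oid_to_metric(oid: str, prefix: str = "snmp") -> Optional[Tuple[str, str]]:
    """Map a numeric OID to (metric_name, instance) via longest-prefix match.

    Whatever trails the matched object (e.g. an interface index) is
    returned as the instance part. Returns None for unknown OIDs.
    """
    parts = oid.strip(".").split(".")
    # Try the longest candidate prefix first, then progressively shorter ones.
    for cut in range(len(parts), 0, -1):
        base = ".".join(parts[:cut])
        if base in MIB_NAMES:
            instance = ".".join(parts[cut:])
            return f"{prefix}.{MIB_NAMES[base]}", instance
    return None


print(oid_to_metric("1.3.6.1.2.1.2.2.1.10.3"))  # ('snmp.ifInOctets', '3')
print(oid_to_metric("1.3.6.1.2.1.1.3.0"))       # ('snmp.sysUpTime', '0')
```

Keeping the trailing index separate from the name means one metric name can cover every interface on a device, rather than needing one configured name per port.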