Part 3: Migrating pySnmp

Friday, 23 September 2016

Source image: pySNMP

In my previous posts I introduced you to callback hell and SNMP. Let's go deeper down the rabbit hole.

Intro


Even before I joined TT there was already quite a bit of pySnmp code around. Back then you would find it in Nagios checks written by a hardcore Linux admin. Well, to be honest, I hadn't met any of those back then, so I'd have been impressed by any Linux usage at all, but I digress. Of all the libraries out there, pySnmp was the only one I could find that had async support in Python. But finding out how to get it to work wasn't as straightforward as it is now. The old documentation is still there (as of September 2016). Going async with pySnmp had some hidden gotchas that weren't clear to me from the beginning. Apparently...

If the callback function returns nothing, it will stop. Only if the callback function returns 1 (or something that evaluates to True) the next value will be requested.
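
A toy model of that rule might look like the sketch below. It is not pysnmp code: drive_walk and both callbacks are made-up names, and the "pages" are plain lists. It only illustrates how a forgotten return value silently ends a walk after the first response.

```python
def drive_walk(pages, callback):
    """Toy driver: feed pages to callback until it returns a falsy value."""
    delivered = 0
    for page in pages:
        delivered += 1
        if not callback(page):
            break  # a falsy return (including None) ends the walk
    return delivered


def print_only(page):
    print(page)  # returns None implicitly, so the walk stops after one page


def print_and_continue(page):
    print(page)
    return True  # truthy: ask the driver for the next page


pages = [["sysDescr"], ["sysObjectID"], ["sysUpTime"]]
stopped_after = drive_walk(pages, print_only)
walked_all = drive_walk(pages, print_and_continue)
```

The first callback sees one page, the second sees all three, even though both do the same printing.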

Getting data

SNMP has three basic requests:

  • get
  • getNext (useful for iterating)
  • getBulk (useful for iterating in bulk)

In practice you'll most likely be doing one of two things:

  • get a single value
  • get a subtree

In almost all cases I want to get a certain subtree. Getting it value by value is far too time-consuming and inefficient. In Oversight we're using getBulk in bulk ;)
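
To put a rough number on "inefficient", here is a small simulation of walking a subtree with one value per request versus many. Everything in it is an assumption for illustration: the MIB is a hard-coded sorted list, and get_bulk and walk are hypothetical helpers, not pysnmp calls.

```python
import bisect

# Hypothetical agent data: a sorted list of (oid_tuple, value) pairs.
MIB = sorted(
    [((1, 3, 6, 1, 2, 1, 1, i, 0), "value-%d" % i) for i in range(1, 8)]
    + [((1, 3, 6, 1, 2, 1, 2, 1, 0), "ifNumber")]
)
OIDS = [oid for oid, _ in MIB]


def get_bulk(start_oid, max_repetitions):
    """Return up to max_repetitions entries that sort after start_oid."""
    first = bisect.bisect_right(OIDS, start_oid)
    return MIB[first:first + max_repetitions]


def walk(root, max_repetitions):
    """Walk the subtree under root; return (rows, number_of_requests)."""
    rows, requests, cursor = [], 0, root
    while True:
        requests += 1
        page = get_bulk(cursor, max_repetitions)
        inside = [(oid, val) for oid, val in page if oid[:len(root)] == root]
        rows.extend(inside)
        # Stop at the end of the MIB or once a row falls outside the subtree.
        if not page or len(inside) < len(page):
            return rows, requests
        cursor = page[-1][0]


SYSTEM = (1, 3, 6, 1, 2, 1, 1)
rows_next, requests_next = walk(SYSTEM, 1)    # behaves like getNext
rows_bulk, requests_bulk = walk(SYSTEM, 25)   # behaves like getBulk
```

Both walks return the same seven rows, but the one-at-a-time walk needs eight round trips where the bulk walk needs one. On a real network every round trip adds latency, which is why the difference matters.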

The old way

The example (compatible with v4.2.4) for getBulk is here. Luckily it contains this line:

return df # This also indicates that we wish to continue walking

Don't get scared of the following code right away. It took me a while to understand it fully as well. I'll walk you through it.

from twisted.internet import reactor, defer
from pysnmp.entity import engine, config
from pysnmp.entity.rfc3413.twisted import cmdgen
from pysnmp.proto import rfc1905
from pysnmp.carrier.twisted import dispatch
from pysnmp.carrier.twisted.dgram import udp

# Create SNMP engine instance
snmpEngine = engine.SnmpEngine()

# Instantiate and register Twisted dispatcher at SNMP engine
snmpEngine.registerTransportDispatcher(dispatch.TwistedDispatcher())

#
# SNMPv3/USM setup
#

# user: usr-md5-des, auth: MD5, priv DES
config.addV3User(
    snmpEngine, 'usr-md5-des',
        config.usmHMACMD5AuthProtocol, 'authkey1',
        config.usmDESPrivProtocol, 'privkey1'
)
config.addTargetParams(snmpEngine, 'my-creds', 'usr-md5-des', 'authPriv')

#
# Setup transport endpoint and bind it with security settings yielding
# a target name
#

# UDP/IPv4
config.addSocketTransport(
    snmpEngine,
    udp.domainName,
    udp.UdpTwistedTransport().openClientMode()
)
config.addTargetAddr(
    snmpEngine, 'my-router',
    udp.domainName, ('195.218.195.228', 161),
    'my-creds'
)

# Error/response receiver
def cbFun(cbCtx):
    (errorIndication, errorStatus, errorIndex, varBindTable) = cbCtx
    if errorIndication:
        print(errorIndication)
    elif errorStatus:
        print('%s at %s' % (
                errorStatus.prettyPrint(),
                errorIndex and varBindTable[-1][int(errorIndex)-1][0] or '?'
            )
        )
    else:
        for varBindRow in varBindTable:
            for oid, val in varBindRow:
                print('%s = %s' % (oid.prettyPrint(), val.prettyPrint()))

        # Stop reactor when we are done walking (optional)
        for oid, val in varBindRow:
            if not val.isSameTypeWith(rfc1905.endOfMibView):
                break
        else:
            reactor.stop()
            return

        # Re-create deferred for next GETBULK iteration
        df = defer.Deferred()
        df.addCallback(cbFun)
        return df  # This also indicates that we wish to continue walking

    # Stop reactor on SNMP error (optional)
    reactor.stop()

# Prepare request to be sent yielding Twisted deferred object
df = cmdgen.BulkCommandGenerator().sendReq(
    snmpEngine,
    'my-router',
    0, 25,   # non-repeaters, max-repetitions
    ( ((1,3,6,1,2,1,1), None), ((1,3,6,1,4,1,1), None) )
)

# Register error/response receiver function at deferred
df.addCallback(cbFun)

# Run Twisted main loop
reactor.run()

I don't remember the SNMPv3/USM setup part being there when I first started, so the example seems more complete now. Let's ignore SNMPv3 for now and take it step by step.

Setting up the engine

Pysnmp has multiple engines; we're using the Twisted one, which is async. Pysnmp also has a few configuration functions in the config module. We'll tell it to use UDP.

from pysnmp.entity import engine, config
from pysnmp.carrier.twisted import dispatch
from pysnmp.carrier.twisted.dgram import udp
from pysnmp.entity.rfc3413.twisted import cmdgen

# Create SNMP engine instance
snmpEngine = engine.SnmpEngine()
# Instantiate and register Twisted dispatcher at SNMP engine
snmpEngine.registerTransportDispatcher(dispatch.TwistedDispatcher())

# Setup transport endpoint and bind it with security settings yielding
# a target name

# UDP/IPv4
config.addSocketTransport(
    snmpEngine,
    udp.domainName,
    udp.UdpTwistedTransport().openClientMode()
)

Configuration to reach our host

  • Community string settings.
  • Transport target: the host we want to reach.

# let's keep it basic
community = 'public'

# register our communitystring
config.addV1System(
    snmpEngine,
    communityIndex=community,
    communityName=community
)

# under the hood each snmp version has an integer equivalent
PYSNMP_VERSION_LOOKUP = {
    '1': 0,
    '2c': 1,
    '3': 3
}

config.addTargetParams(
    snmpEngine,
    community + '-creds',  # the name we're going to reference our community by
    community,
    'noAuthNoPriv', # publicly walk
    PYSNMP_VERSION_LOOKUP['2c'] # v2c
)

config.addTargetAddr(
    snmpEngine,
    'my-router',
    udp.domainName,
    ('195.218.195.228', 161),
    community + '-creds' # this tells pysnmp to use the "credentials" for this community
)

Our request

df = cmdgen.BulkCommandGenerator().sendReq(
    snmpEngine,
    'my-router',
    0, 25,   # non-repeaters, max-repetitions
    ( ((1,3,6,1,2,1,1), None), ((1,3,6,1,4,1,1), None) )
)

Here you can see that we're requesting two OIDs.

Let's make sure the request actually runs

Until the reactor is running, no I/O will be done.

df.addCallback(cbFun)

# Run Twisted main loop
reactor.run()

I know we haven't declared the callback function yet. Let's do that now.

The callback

Let's start with the old callback:

# Error/response receiver
def cbFun(cbCtx):
    (errorIndication, errorStatus, errorIndex, varBindTable) = cbCtx
    if errorIndication:
        print(errorIndication)
    elif errorStatus:
        print('%s at %s' % (
                errorStatus.prettyPrint(),
                errorIndex and varBindTable[-1][int(errorIndex)-1][0] or '?'
            )
        )
    else:
        for varBindRow in varBindTable:
            for oid, val in varBindRow:
                print('%s = %s' % (oid.prettyPrint(), val.prettyPrint()))

        # Stop reactor when we are done walking (optional)
        for oid, val in varBindRow:
            if not val.isSameTypeWith(rfc1905.endOfMibView):
                break
        else:
            reactor.stop()
            return

        # Re-create deferred for next GETBULK iteration
        df = defer.Deferred()
        df.addCallback(cbFun)
        return df  # This also indicates that we wish to continue walking

    # Stop reactor on SNMP error (optional)
    reactor.stop()

So there are three different error values. Here's an explanation:

Non-empty errorIndication string indicates SNMP engine-level error.

The pair of errorStatus and errorIndex variables determines SNMP PDU-level error. These are instances of pyasn1 Integer class. If errorStatus evaluates to true, this indicates SNMP PDU error caused by Managed Object at position errorIndex-1 in varBinds. Doing errorStatus.prettyPrint() would return an explanatory text error message.
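
As a plain-Python illustration of how these three values relate, consider the sketch below. It uses ordinary strings and ints instead of pysnmp and pyasn1 objects, and describe_result is a hypothetical helper, so treat it as a reading aid rather than library behaviour.

```python
def describe_result(error_indication, error_status, error_index, var_binds):
    """Turn the three error values into one readable string (or None)."""
    if error_indication:
        # Engine-level problem, e.g. a timeout: there is no response PDU.
        return "engine error: %s" % error_indication
    if error_status:
        # PDU-level problem: error_index is 1-based, 0 means "not specified".
        culprit = var_binds[error_index - 1][0] if error_index else "?"
        return "PDU error %s at %s" % (error_status, culprit)
    return None


var_binds = [("1.3.6.1.2.1.1.1.0", None), ("1.3.6.1.2.1.1.9.0", None)]
timeout = describe_result("requestTimedOut", 0, 0, [])
pdu = describe_result(None, 2, 2, var_binds)  # status 2 is noSuchName in SNMPv1
ok = describe_result(None, 0, 0, var_binds)
```

The point to notice is the off-by-one: an errorIndex of 2 refers to the second varbind, which sits at position 1 in the list.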

Personally, I prefer exceptions to be raised. One could argue that this way you have the offending data as well, if any.

Implementation of getChildren / snmpWalk

Since we're interested in getting a subtree, let's look at the code I had for that. Remember, we're still looking at pysnmp 4.2.4, and the following code could use some improvements.

# This assumes you have performed similar configuration steps as the requests above.
from twisted.internet.defer import inlineCallbacks, returnValue
from pysnmp.proto import rfc1905

class SnmpQueryError(Exception):
    pass


def oidStartsWith(oid1, startStr):
    prefix = ".".join(map(str, oid1._value[:-1]))
    return startStr.startswith(prefix)


def testErr(errorIndication, errorStatus, errorIndex, varBindTable):
    if errorIndication:
        raise SnmpQueryError('%s' % errorIndication)
    elif errorStatus:
        raise SnmpQueryError(
            '%s at %s' % (
                errorStatus.prettyPrint(),
                errorIndex and varBindTable[int(errorIndex) - 1] or '?'
            )
        )



@inlineCallbacks
def getChildren(hostName, startString, varBind, bulkGen):
    '''
    Note that this starts a new bulk request every time.
    This is because we're inside a generator. We can't simply return True or a Deferred.
    In the next example I'll show how I improved on it.

    returns None if no response received, empty list if no values received
    '''
    varBinds = [varBind]
    initialVarbinds = (varBind,)
    values = None
    while varBinds:
        errorIndication, errorStatus, errorIndex, varBindTable = yield bulkGen.sendReq(TwistedSnmpEngineHandler.snmpEngine, hostName, 0, 15, varBinds)
        if values is None:
            values = []
        testErr(errorIndication, errorStatus, errorIndex, varBindTable)
        for varBindRow in varBindTable:
            for oid, val in varBindRow:
                if val.isSameTypeWith(rfc1905.endOfMibView):
                    returnValue(values)
                elif not oidStartsWith(oid, startString):
                    # getBulk keeps iterating. We want to stop if we are outside of our subtree.
                    returnValue(values)
                else:
                    values.append((oid, val))

        # we want to tell the bulk generator where we were
        errorIndication, varBinds = getNextVarBinds(initialVarbinds, varBindRow)
    returnValue(values)
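
One of the improvements worth considering is the subtree check itself. Comparing dotted strings can match across sub-identifier boundaries, so a sibling such as 1.10 looks like it lives under 1.1. A minimal sketch of the difference, using hypothetical helper names and plain tuples instead of pysnmp objects:

```python
def in_subtree_str(oid, root):
    """Naive check on dotted strings: wrong when a sibling shares the digits."""
    return oid.startswith(root)


def in_subtree(oid, root):
    """Compare whole sub-identifiers by working on tuples of ints."""
    return oid[:len(root)] == root


def to_tuple(dotted):
    return tuple(int(part) for part in dotted.split("."))


root = "1.3.6.1.2.1.1"
child = "1.3.6.1.2.1.1.5.0"      # inside the subtree
sibling = "1.3.6.1.2.1.10.7.0"   # a different subtree (1.10, not 1.1)

naive = (in_subtree_str(child, root), in_subtree_str(sibling, root))
exact = (in_subtree(to_tuple(child), to_tuple(root)),
         in_subtree(to_tuple(sibling), to_tuple(root)))
```

The string version accepts both OIDs, while the tuple version only accepts the real child. The 4.3.2 code further down compares tuples for this reason.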

pySnmp v4.3.2

I hadn't noticed that a new version of pySnmp was released until I did a new installation on my laptop. The new version has a new architecture. It abstracts away the differences between engines and is easier to use than the previous example.

This is the code of bulkCmd in pysnmp.

from pysnmp.hlapi.lcd import *
from pysnmp.hlapi.varbinds import *
from pysnmp.entity.rfc3413 import cmdgen
from twisted.internet.defer import Deferred
from twisted.python.failure import Failure

def bulkCmd(snmpEngine, authData, transportTarget, contextData,
            nonRepeaters, maxRepetitions, *varBinds, **options):

    """
    ... comments were too large to leave in ...
    """
    def __cbFun(snmpEngine, sendRequestHandle,
                errorIndication, errorStatus, errorIndex,
                varBindTable, cbCtx):
        lookupMib, deferred = cbCtx
        if errorIndication:
            deferred.errback(Failure(errorIndication))
        else:
            deferred.callback(
                (errorStatus, errorIndex,
                 [vbProcessor.unmakeVarBinds(snmpEngine, varBindTableRow, lookupMib) for varBindTableRow in varBindTable])
            )

    addrName, paramsName = lcd.configure(snmpEngine, authData, transportTarget)

    deferred = Deferred()

    cmdgen.BulkCommandGenerator().sendVarBinds(
        snmpEngine, addrName, contextData.contextEngineId,
        contextData.contextName, nonRepeaters, maxRepetitions,
        vbProcessor.makeVarBinds(snmpEngine, varBinds),
        __cbFun,
        (options.get('lookupMib', True), deferred)
    )
    return deferred

As you might remember from my previous article, I'm not a fan of nested callbacks. What I basically wanted was a getBulkSubtree function.

getBulkSubtree

I'm going to make some optimizations. The first two are:

  • In the example above, vbProcessor.makeVarBinds gets called for every lookup I do. I want to call it once per OID and reuse the result.
  • Getting the OID tuple out of an OID object takes quite some work, because it has to go through multiple levels of nested getters. Since I keep the OID object in memory anyway and request its subtree every 5 minutes, I can save some function calls by caching the tuple.

from pysnmp.smi.rfc1902 import ObjectType

class OidVarbindCache(object):
    '''
    This will add an extra property to the oid
    '''

    
    @staticmethod
    def get(engine, oid):
        assert isinstance(oid, ObjectType)
        cache = getattr(engine, 'oidVarbindCache', None)
        if cache is None:
            cache = {}
            # keep the varbind cache on the engine so later calls can reuse it
            setattr(engine, 'oidVarbindCache', cache)

        varbinds = None
        oidTuple = getattr(oid, 'oidTupleCache', None)
        if oidTuple is None:
            if not oid.isFullyResolved():
                varbinds = vbProcessor.makeVarBinds(engine, (oid,))
            oidTuple = oid[0][:]._value
            # remember the tuple on the oid itself for the next call
            oid.oidTupleCache = oidTuple


        if oidTuple in cache:
            varbinds = cache[oidTuple]
        else:
            varbinds = varbinds or vbProcessor.makeVarBinds(engine, (oid,))
            cache[oidTuple] = varbinds

        return varbinds, oidTuple
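
Stripped of the pysnmp specifics, the caching idea is roughly the following. FakeOid, slow_to_tuple and TupleCache are stand-ins invented for this sketch; the real class above works on ObjectType instances and an SnmpEngine.

```python
class TupleCache(object):
    """Cache the result of an expensive conversion on the object itself."""

    def __init__(self, convert):
        self.convert = convert
        self.misses = 0

    def get(self, obj):
        cached = getattr(obj, "_tuple_cache", None)
        if cached is None:
            self.misses += 1
            cached = self.convert(obj)
            obj._tuple_cache = cached  # remember it for the next poll
        return cached


class FakeOid(object):
    """Hypothetical stand-in for an OID object with a costly accessor."""

    def __init__(self, dotted):
        self.dotted = dotted


def slow_to_tuple(oid):
    return tuple(int(part) for part in oid.dotted.split("."))


cache = TupleCache(slow_to_tuple)
oid = FakeOid("1.3.6.1.2.1.1")
first = cache.get(oid)
second = cache.get(oid)
```

The second lookup returns the very same tuple without running the conversion again, which is all the optimization amounts to when the same OID is polled every few minutes.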

In my call I'm not interested in MIB lookups, so I left them out. sendVarBinds lets you pass along some "callback context". Since there is a cbCtx, we'll use that instead of the nested function.

def bulkSubTreeCmd(
        snmpEngine,
        authData,
        transportTarget,
        contextData,
        nonRepeaters,
        maxRepetitions,
        oid,
        logValues=False,
        lookupMib=False):

    deferred = Deferred()

    varbinds, oidTuple = OidVarbindCache.get(snmpEngine, oid)
    oidTupleLen = len(oidTuple)
    addrName, paramsName = lcd.configure(snmpEngine, authData, transportTarget)

    cmdgen.BulkCommandGenerator().sendVarBinds(
        snmpEngine,
        addrName,
        contextData.contextEngineId,
        contextData.contextName,
        nonRepeaters,
        maxRepetitions,
        varbinds,
        __bulkSubTreeCmdcbFun,
        (lookupMib, deferred, [], oidTuple, oidTupleLen, logValues)
    )
    return deferred

The callback is a bit different as well. In Twisted you wrap your exception in a Failure when you errback; inlineCallbacks expects this. The default version of the callback isn't usable for that: if the error is not a proper Exception, it will not be caught and thrown into the generator.

def testErrInTable(errorIndication, errorStatus, errorIndex, varBindTable):
    if errorIndication:
        raise SnmpQueryError('%s' % errorIndication)
    elif errorStatus:
        raise SnmpQueryError(
            '%s at %s' % (
                errorStatus.prettyPrint(),
                errorIndex and varBindTable[int(errorIndex) - 1] or '?'
            )
        )

def __bulkSubTreeCmdcbFun(
        snmpEngine,
        sendRequestHandle,
        errorIndication,
        errorStatus,
        errorIndex,
        varBindTable,
        cbCtx):
    lookupMib, deferred, result, oidTuple, oidTupleLen, logValues = cbCtx

    try:
        testErrInTable(errorIndication, errorStatus, errorIndex, varBindTable)
    except Exception as ex:
        deferred.errback(Failure(ex))
    else:
        if varBindTable:
            moreLeft = True
            # [0][0] is the oid of the row's only varbind, [0][1] is its value
            while varBindTable and (
                    oidTuple != varBindTable[-1][0][0]._value[:oidTupleLen] or
                    varBindTable[-1][0][1].isSameTypeWith(endOfMibView)):
                moreLeft = False
                varBindTable.pop()

            if logValues:
                for item in varBindTable:
                    print('{} = {}'.format(item[0][0].prettyPrint(), item[0][1].prettyPrint()))

            result.extend(varBindTable)
        else:
            moreLeft = False
        if not moreLeft:
            deferred.callback(result)
        return moreLeft

Between this code and the pysnmp 4.2.4 code there's a nice optimization. In the old version we appended values to the output list one at a time. Now we do a single extend. In pseudocode you'd read:

moreLeft = True
while results and (not oidIsSubTupleOf(results[-1], oidTuple) or isEndofMib(results[-1])):
    moreLeft = False
    results.pop()

output.extend(results)

That way not every single value has to be checked, only the last few. When getting hundreds of values in one walk (often more), this saves quite a few function calls.
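
The pseudocode can be turned into a runnable sketch with plain tuples. It leans on the assumption that a page comes back in ascending OID order, so rows outside the subtree can only sit at the tail. trim_tail is a hypothetical name, and the endOfMibView check is left out to keep it short.

```python
def trim_tail(rows, root):
    """Drop trailing rows outside root; return True if the walk should go on."""
    more_left = True
    while rows and rows[-1][0][:len(root)] != root:
        more_left = False  # we ran past the subtree, so this is the last page
        rows.pop()
    return more_left and bool(rows)


ROOT = (1, 3, 6, 1, 2, 1, 1)
page = [
    ((1, 3, 6, 1, 2, 1, 1, 6, 0), "Unknown"),
    ((1, 3, 6, 1, 2, 1, 1, 7, 0), 76),
    ((1, 3, 6, 1, 2, 1, 2, 1, 0), 4),        # first row of the next subtree
    ((1, 3, 6, 1, 2, 1, 2, 2, 1, 1, 1), 1),
]
output = []
more_left = trim_tail(page, ROOT)
output.extend(page)

# A page that lies entirely inside the subtree asks for another request.
full_page_continues = trim_tail([((1, 3, 6, 1, 2, 1, 1, 1, 0), "x")], ROOT)
```

Only the two trailing rows are inspected and popped here. The rows in front of them are copied over with one extend and never compared.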

Final code

from twisted.internet import reactor
from twisted.python.util import println
from pysnmp.hlapi.varbinds import CommandGeneratorVarBinds
from pysnmp.hlapi.twisted import UdpTransportTarget
from pysnmp.hlapi.twisted import SnmpEngine
from pysnmp.hlapi.twisted import CommunityData
from pysnmp.hlapi.twisted import ContextData
from pysnmp.hlapi.lcd import CommandGeneratorLcdConfigurator
from pysnmp.smi.rfc1902 import ObjectIdentity
from pysnmp.smi.rfc1902 import ObjectType
from pysnmp.entity.rfc3413 import cmdgen
from pysnmp.proto.rfc1905 import endOfMibView
from twisted.internet.defer import Deferred
from twisted.python.failure import Failure

vbProcessor = CommandGeneratorVarBinds()
lcd = CommandGeneratorLcdConfigurator()

class SnmpQueryError(Exception):
    pass


def printList(lst):
    print('\n'.join(map(str, lst)))


class OidVarbindCache(object):
    '''
    This will add an extra property to the oid
    '''

    
    @staticmethod
    def get(engine, oid):
        assert isinstance(oid, ObjectType)
        cache = getattr(engine, 'oidVarbindCache', None)
        if cache is None:
            cache = {}
            # keep the varbind cache on the engine so later calls can reuse it
            setattr(engine, 'oidVarbindCache', cache)

        varbinds = None
        oidTuple = getattr(oid, 'oidTupleCache', None)
        if oidTuple is None:
            if not oid.isFullyResolved():
                varbinds = vbProcessor.makeVarBinds(engine, (oid,))
            oidTuple = oid[0][:]._value
            # remember the tuple on the oid itself for the next call
            oid.oidTupleCache = oidTuple


        if oidTuple in cache:
            varbinds = cache[oidTuple]
        else:
            varbinds = varbinds or vbProcessor.makeVarBinds(engine, (oid,))
            cache[oidTuple] = varbinds

        return varbinds, oidTuple


def testErrInTable(errorIndication, errorStatus, errorIndex, varBindTable):
    if errorIndication:
        raise SnmpQueryError('%s' % errorIndication)
    elif errorStatus:
        raise SnmpQueryError(
            '%s at %s' % (
                errorStatus.prettyPrint(),
                errorIndex and varBindTable[int(errorIndex) - 1] or '?'
            )
        )

def __bulkSubTreeCmdcbFun(
        snmpEngine,
        sendRequestHandle,
        errorIndication,
        errorStatus,
        errorIndex,
        varBindTable,
        cbCtx):
    lookupMib, deferred, result, oidTuple, oidTupleLen, logValues = cbCtx

    try:
        testErrInTable(errorIndication, errorStatus, errorIndex, varBindTable)
    except Exception as ex:
        deferred.errback(Failure(ex))
    else:
        if varBindTable:
            moreLeft = True
            # [0][0] is the oid of the row's only varbind, [0][1] is its value
            while varBindTable and (
                    oidTuple != varBindTable[-1][0][0]._value[:oidTupleLen] or
                    varBindTable[-1][0][1].isSameTypeWith(endOfMibView)):
                moreLeft = False
                varBindTable.pop()

            if logValues:
                for item in varBindTable:
                    print('{} = {}'.format(item[0][0].prettyPrint(), item[0][1].prettyPrint()))

            result.extend(varBindTable)
        else:
            moreLeft = False
        if not moreLeft:
            deferred.callback(result)
        return moreLeft


def bulkSubTreeCmd(
        snmpEngine,
        authData,
        transportTarget,
        contextData,
        nonRepeaters,
        maxRepetitions,
        oid,
        logValues=False,
        lookupMib=False):

    deferred = Deferred()

    varbinds, oidTuple = OidVarbindCache.get(snmpEngine, oid)
    oidTupleLen = len(oidTuple)
    addrName, paramsName = lcd.configure(snmpEngine, authData, transportTarget)

    cmdgen.BulkCommandGenerator().sendVarBinds(
        snmpEngine,
        addrName,
        contextData.contextEngineId,
        contextData.contextName,
        nonRepeaters,
        maxRepetitions,
        varbinds,
        __bulkSubTreeCmdcbFun,
        (lookupMib, deferred, [], oidTuple, oidTupleLen, logValues)
    )
    return deferred

engine = SnmpEngine()
transportTarget = UdpTransportTarget(transportAddr=('1.2.3.4', 161))
blankContextData = ContextData()

dfd = bulkSubTreeCmd(
    engine,
    CommunityData('public'),
    transportTarget,
    blankContextData,
    0, 15,
    ObjectType(ObjectIdentity('1.3.6.1.2.1'))
)
dfd.addCallback(printList)
dfd.addErrback(println)
dfd.addCallback(lambda _: reactor.stop())

reactor.run()

'''
[(ObjectName('1.3.6.1.2.1.1.1.0'), OctetString('Nimble Storage...'))]
[(ObjectName('1.3.6.1.2.1.1.2.0'), ObjectIdentifier('1.3.6.1.4.1.37447.3.1'))]
[(ObjectName('1.3.6.1.2.1.1.3.0'), TimeTicks(523565506))]
[(ObjectName('1.3.6.1.2.1.1.4.0'), OctetString('nowhere'))]
[(ObjectName('1.3.6.1.2.1.1.5.0'), OctetString('nimblexx'))]
[(ObjectName('1.3.6.1.2.1.1.6.0'), OctetString('Unknown'))]
[(ObjectName('1.3.6.1.2.1.1.7.0'), Integer(76))]
'''

Notice that there is drastically less pre-config code than in the previous version. They really did a good job re-architecting pySnmp. Without pre-configuring the engine I could use:

  • udp transport
  • hosts
  • community strings

Sure, I had to instantiate them, but you can now easily swap things out.

Conclusion

Of course this is just one way of implementing an SNMP walk. It is a bit opinionated because I wanted:

  • to prevent nested functions and callback hell
  • to use @inlineCallbacks for readability
  • exceptions instead of strange error flags

I hope you enjoyed it.