Tracking alarms by ID

I'm working on something to do external alarm notifications.  When I dispatch an alarm, it has to have some kind of unique ID about the occurrence of the alarm.  So for instance if I had an alarm tag named BADALARM and this occurs three times, I want an ID for each individual occurrence.  The reason I want this is that I want to be able to acknowledge the alarm event later.

When you call AlarmFirstTagRec you get back an ID:

Return Value
The alarm record identifier or -1 if no match is found.

Is that ID kind of like an auto-number of the occurrence of the alarm: an ID that keeps increasing forever, one for each alarm activation?

-C

Parents
  • AlarmFirstTagRec is part of the legacy alarm browsing functions. It returns a unique identifier for an alarm defined in the alarm object or table. For instance, "ALARM_1" is defined in the Example project and its record identifier is 26 (this might change if the project runs on a different machine). The identified record contains all the info about "ALARM_1", such as Alarm Tag, Alarm Name, Category, Comment, etc. Use AlarmGetFieldRec to retrieve them.

    These legacy alarm browse functions have been superseded by the newer ones starting with "AlmBrowse", which have better filtering capability (see Help for details).

    I don't think that you can acknowledge an old occurrence of an alarm. However, you could use SOEEventAdd to insert an event into the Event Journal to represent your ack action on an alarm tag.

    There is a unique record ID for each event in the SOE data table, represented by VarChar(33).

    Hope this helps.

  • Should I use something that's named "Legacy"?

  • For a greenfield project, I would start with the new browse functions. Of course, AlarmFirstTagRec and the related Cicode functions still work, to maintain backward compatibility.

  • OK.  I think what you are saying is that I first get the list of alarm tags for a particular type of alarm (like 'digital' or 'analog').  Then once I have those tags, I would have to poll each one individually using an AlmQuery and that would retrieve for me an interpolated history of each individual alarm, one at a time.

    This almost answers my question but not quite.  I was assuming that you acknowledge each individual alarm record that came back from AlmQuery, so I was looking for an ID for each of those records.  I am starting to think that's not the case, though, and that you are really ACKing the alarm tag, not the individual occurrence.

    That's probably OK for what I'm doing.  I think the main issue is it's a race condition.  If I get an alarm, then I acknowledge it on my phone, and somewhere in between the alarm fired again, I might be acknowledging both the original occurrence and the new one.

    I guess that's unavoidable but the risk is relatively small, it's only an ack and that ack is only for the purpose of telling people the alarm has been seen by somebody.

  • I don't see a way to get a unique ID for each alarm occurrence. It's a little confusing since there are functions that deal with alarm screens, summary screens, alarm filtering, alarm definitions, and the alarm summary table (history). Each type of function has different numbers for identifying alarms or alarm instances. And, the Browse functions don't even use ID numbers since you're just browsing the results of a query.

    As you found, the Record ID from AlarmFirstTagRec and related functions is a unique ID for each alarm tag definition. There is also an alarm version which you combine with the RecID to get a unique identifier for each alarm instance. However, I don't think you can read the version directly; it just comes from the alarm screen filtering feature. The alarm screen functions also use an alarm instance record number which I believe is only unique within the current list displayed on that screen. The AlarmSum functions use an index number to refer to unique instances of each alarm, but the index is only valid until you call another AlarmSum function that returns an index. So, I believe the index is not truly unique, just unique within the current search.

    So, I believe to uniquely identify each alarm instance, you would have to use either:

    Tagname and OnTimeDate

    or

    RecID and OnTimeDate

    You could append the On MS to the timestamp, but I don't believe you can normally have 2 instances of an alarm within the same second. For example the AlarmSumFind() function looks up the index for an alarm instance, using just the RecID and OnTime (UTC time in seconds). It doesn't need or allow you to specify the MS.

    Here's some code where I found the RecID for an alarm tag, then browsed the alarm summary for that alarm and printed the RecID, ONUTC time, and Index for each instance to the Kernel.

    FUNCTION AlarmTrace(STRING sTag = "ALARM_1")
    
    	INT		nRecID;
    	INT		nOnUTC;
    	INT		hSummary;
    	INT		nError;
    	INT		nIndex;
    	
    	// Record ID of the alarm tag definition (-1 if the tag is not found).
    	nRecID = AlarmFirstTagRec(sTag, "", "");
    
    	// Browse the alarm summary for this tag, returning only the ONUTC field.
    	hSummary = AlmSummaryOpen("TAG=" + sTag, "ONUTC");
    
    	IF hSummary = -1 THEN
    		TraceMsg("Failed to open alarm summary");
    		RETURN;
    	END
    	
    	// AlmSummaryFirst/AlmSummaryNext return 0 while a record is available,
    	// so their return value drives the loop.
    	nError = AlmSummaryFirst(hSummary);
    	
    	WHILE nError = 0 DO
    
    		// Browse fields come back as strings, so convert explicitly.
    		nOnUTC = StrToInt(AlmSummaryGetField(hSummary, "ONUTC"));
    		nIndex = AlarmSumFind(nRecID, nOnUTC);
    		
    		TraceMsg("Tag: " + sTag + " RecID: " + nRecID:# + " ONUTC: " + nOnUTC:# + " Index: " + nIndex:#);
    		
    		nError = AlmSummaryNext(hSummary);
    	END
    
    	AlmSummaryClose(hSummary);
    		
    	TraceMsg("End of records for " + sTag);
    END
    

    However, you may not even need a unique ID for each instance. As you found, you can only acknowledge the latest instance of an alarm. For example I had several instances of an alarm on the summary page in the Example project. When I right-clicked the oldest instance and chose Acknowledge, it ack'd the newest instance. If you're really concerned about an operator ack'ing the wrong instance of the alarm, you could make your own Alarm Ack Cicode function something like this pseudocode:

    function AlarmAckInstance(alarmTag, onTime)

      currentOntime = Look up Alarm latest Ontime

      if onTime = currentOnTime then Ack alarmTag else return an error

    end
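
    For what it's worth, the pseudocode above could be fleshed out roughly like this. It is only an untested sketch: it assumes the alarm summary already holds an entry for the occurrence that is currently active, that the "TAG=" filter and the ONUTC field behave as in the AlarmTrace example, and that acknowledging through AlarmAckRec is acceptable. The parameter name nExpectedOnUTC is made up for illustration.

    ```cicode
    // Sketch only: acknowledge sTag only if its newest occurrence still has
    // the on-time (UTC seconds) that the caller was notified about.
    // Returns 0 if acknowledged, -1 if the occurrence is stale or not found,
    // otherwise the error code from AlarmAckRec.
    INT FUNCTION AlarmAckInstance(STRING sTag, INT nExpectedOnUTC)

    	INT		hSummary;
    	INT		nStatus;
    	INT		nOnUTC;
    	INT		nLatestOnUTC;
    	INT		nRecID;

    	nLatestOnUTC = -1;

    	hSummary = AlmSummaryOpen("TAG=" + sTag, "ONUTC");
    	IF hSummary = -1 THEN
    		RETURN -1;
    	END

    	// Scan every summary row for this tag and keep the largest on-time,
    	// so the result does not depend on the order the rows come back in.
    	nStatus = AlmSummaryFirst(hSummary);
    	WHILE nStatus = 0 DO
    		nOnUTC = StrToInt(AlmSummaryGetField(hSummary, "ONUTC"));
    		IF nOnUTC > nLatestOnUTC THEN
    			nLatestOnUTC = nOnUTC;
    		END
    		nStatus = AlmSummaryNext(hSummary);
    	END
    	AlmSummaryClose(hSummary);

    	// A different latest on-time means the alarm fired again (or no
    	// occurrence was found), so refuse to acknowledge.
    	IF nLatestOnUTC <> nExpectedOnUTC THEN
    		RETURN -1;
    	END

    	nRecID = AlarmFirstTagRec(sTag, "", "");
    	IF nRecID = -1 THEN
    		RETURN -1;
    	END

    	RETURN AlarmAckRec(nRecID);
    END
    ```

    Note there is still a small window between the on-time check and the AlarmAckRec call in which the alarm could re-trigger, so this narrows the race rather than removing it.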

    Personally, I think it would just annoy operators if an alarm is going in and out of alarm state and they can't ack it because it re-triggers too quickly. I guess it could be an issue if the operator normally sees the alarm, fixes the problem, and then acks the alarm. They could have failed to fix it properly and not realize it re-occurred. However, they would see the alarm is still active in the alarm screen, and if they're receiving notifications, they could be notified every x minutes as long as the alarm is active. However, if they Ack the alarm first and then fix the problem, I don't see an issue if it re-occurs before they ack it.

Children
  • Right.  What I'm doing here is setting up a Squadcast.com interface so that alarms get dispatched to a real dispatching system.  Squadcast supports all the nice stuff like shifts, knowing if someone is on vacation, routing to groups/teams, and automatic escalation.

    So I like your thought about the on time.  It makes me wonder if I even need the record ID at all.  (Now I'm not even sure why I would care about it).  I can probably just use the tag.  I can dispatch the alarm externally using a key of TAG+ONTIME, then if an ack comes back the other way I can see if the current ONTIME matches the one that came back, and if they don't match, then don't do an ack because I would be acking the wrong thing.  Maybe this is what you were saying: since you can only ACK the newest instance anyway, I just make sure it's the same instance that I'm ACKing.  If ONTIME has changed, it's not, so do nothing.
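
    As a rough, untested sketch of that TAG+ONTIME key: the helper names below and the "|" separator are my own inventions, and it assumes "|" never appears in an alarm tag and that ONTIME is carried as UTC seconds. The on-time parsed back out of a returning ack would then be compared against the alarm's current on-time before acking.

    ```cicode
    // Hypothetical helpers for an external dispatch key of the form
    // "<tag>|<on-time in UTC seconds>", e.g. "BADALARM|1700000000".

    // Builds the key that is sent out with the notification.
    STRING FUNCTION DispatchKeyBuild(STRING sTag, INT nOnUTC)
    	RETURN sTag + "|" + IntToStr(nOnUTC);
    END

    // Returns the tag part of a key, or "" if the key is malformed.
    STRING FUNCTION DispatchKeyTag(STRING sKey)

    	INT		nSep;

    	nSep = StrSearch(0, sKey, "|");
    	IF nSep < 1 THEN
    		RETURN "";
    	END

    	RETURN StrLeft(sKey, nSep);
    END

    // Returns the on-time part of a key, or -1 if the key is malformed.
    INT FUNCTION DispatchKeyOnUTC(STRING sKey)

    	INT		nSep;

    	nSep = StrSearch(0, sKey, "|");
    	IF nSep < 1 THEN
    		RETURN -1;
    	END

    	RETURN StrToInt(StrMid(sKey, nSep + 1, StrLength(sKey) - nSep - 1));
    END
    ```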

    Thanks for the time and thought you put into your response - very appreciated.