Monitoring HP JetDirect, graphs don't look right

BryanRagon

09-06-2004 12:11:41

I'm trying to monitor the page output of our HP printers. I set up two new SNMP tests:
HP - Page Count .1.3.6.1.2.1.43.10.2.1.4.1.1
HP - Pages Count (reboot) .1.3.6.1.2.1.43.10.2.1.5.1.1

I then created a new device and a new group, and added two monitors for those two SNMP tests, each of type Counter with a minimum of undefined and a maximum of 200000.

I then let it run for half an hour or so. The last value field (shown on the groups->Devicename->subdevice page, which lists the monitors) shows the proper number, in my case 196.96k pages printed. However, the graph doesn't look right at all. The Y axis is labeled 1m, 2m, and 3m, and the graph goes up to 3.2M, but only when a page has printed; between prints it goes back to zero.

Using 'rrdtool dump mon_##.rrd' I get the following points of interest:

[code121784ba43c]
<version> 0001 </version>
<step> 300 </step> <!-- Seconds -->
<lastupdate> 1086796801 </lastupdate> <!-- 2004-06-09 12:00:01 EDT -->

<ds>
<name> mon_19 </name>
<type> COUNTER </type>
<minimal_heartbeat> 600 </minimal_heartbeat>
<min> NaN </min>
<max> 2.0000000000e+05 </max>

<!-- PDP Status -->
<last_ds> 196967 </last_ds>
<value> 0.0000000000e+00 </value>
<unknown_sec> 0 </unknown_sec>
</ds>

SNIP


<!-- 2004-06-09 11:35:00 EDT / 1086795300 --> <row><v> NaN </v></row>
<!-- 2004-06-09 11:40:00 EDT / 1086795600 --> <row><v> 0.0000000000e+00 </v></row>
<!-- 2004-06-09 11:45:00 EDT / 1086795900 --> <row><v> 0.0000000000e+00 </v></row>
<!-- 2004-06-09 11:50:00 EDT / 1086796200 --> <row><v> 3.3333333333e-03 </v></row>
<!-- 2004-06-09 11:55:00 EDT / 1086796500 --> <row><v> 0.0000000000e+00 </v></row>
<!-- 2004-06-09 12:00:00 EDT / 1086796800 --> <row><v> 0.0000000000e+00 </v></row>
[/code121784ba43c]

Any thoughts?

silfreed

09-06-2004 13:45:39

My best guess (without reading the MIB) is that the 'page count' is a gauge, not a counter. The spikes on your graphs sound like counter rollover events. If this is the case, you'll have to delete your monitor files to get the change from counter to gauge to take effect.

-Doug

BryanRagon

09-06-2004 13:54:59

[code12052d6246d]
[root@ranger rrd]# snmpwalk -c pubstring -v 1 ptr-01-hp01.ze.zapeng.com .1.3.6.1.2.1.43.10.2.1.4.1.1
SNMPv2-SMI::mib-2.43.10.2.1.4.1.1 = Counter32: 196969
[/code12052d6246d]

From my understanding, that means the MIB has defined the OID as a counter, not a gauge... or is that wrong?

Thanks for the fast reply!!!!

Bryan

silfreed

09-06-2004 14:05:40

Yes, you're correct. I think I know what's going on here.

With counters, the values gathered are averaged out over the gathering period and stored as a per-second rate. For example, if you're displaying graphs on a 5-minute interval and you only print one page in those five minutes, you're going to see a value of 1/300, or about 0.0033 pages per second (the "m" on your Y axis is the SI prefix for milli).
What we've done in the past here is increase the multiplier in our template/custom graph for that monitor to something higher, usually 60 (to get a per-minute rate). You might want to try changing it to 60, and if that's not high enough to see your data (i.e., you don't print 1 page per minute), change it to 300.
Hopefully that will help you see your data here. I think the problem is just that the data isn't changing very often.
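
A rough sketch of that arithmetic, using the numbers from the dump above (a 300-second step and one page printed in the interval). The `rate` helper is made up for illustration and is not part of rrdtool or NetMRG:

```shell
# Hypothetical helper: the per-second rate a COUNTER data source would store
# for a given counter increase over one step, scaled by a graph multiplier.
rate() {
    # $1 = counter increase, $2 = step in seconds, $3 = multiplier
    awk -v d="$1" -v s="$2" -v m="$3" 'BEGIN { printf "%.4f\n", d / s * m }'
}

rate 1 300 1     # pages per second:      0.0033
rate 1 300 60    # pages per minute:      0.2000
rate 1 300 300   # pages per 5-min step:  1.0000
```

The first result matches the 3.3333333333e-03 row in the dump; the other two show what a multiplier of 60 or 300 would put on the graph for the same data.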

-Doug

BryanRagon

09-06-2004 14:39:54

So what you're saying is that it's trying to graph pages printed per minute? I was hoping for just a graph of how many pages have been printed, so that the graph is constantly increasing as time goes on.

Average pages per minute (or hour) would possibly be another graph I'd like to do at some point also.

silfreed

09-06-2004 14:49:43

Correct; that's how counters work. If you want to see an ever-increasing graph of pages printed, convert it to a gauge. Again, delete your monitor files first.

-Doug

silfreed

09-06-2004 14:50:54

Er, let me qualify 'ever-increasing'. It will probably work on your HP printer since they keep their page count internally; most other devices have counters that 'roll over', that is, they go back to zero when they get too large (usually on reaching 2^32).
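
To illustrate the rollover case, here is a minimal sketch of how the increase between two 32-bit counter readings can be worked out when the counter has wrapped once. The `counter_delta` name and the second pair of readings are made up for the example; the first pair is the page count from this thread:

```shell
# Hypothetical helper: increase between two Counter32 readings,
# allowing for a single wrap past 2^32 (4294967296).
counter_delta() {
    # $1 = previous reading, $2 = current reading
    if [ "$2" -ge "$1" ]; then
        echo $(( $2 - $1 ))
    else
        echo $(( $2 - $1 + 4294967296 ))
    fi
}

counter_delta 196967 196969      # no wrap: 2
counter_delta 4294967290 4       # wrapped once: 10
```

A gauge stores the raw reading instead, so on a device whose counter does wrap, the graph would simply drop back toward zero at that point.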

-Doug

BryanRagon

09-06-2004 15:17:34

Ah, gotcha... I was thinking of counters/gauges the other way around.

I was thinking a counter would tell me what the current count is.

Gauge made me think of a tachometer or speedometer: pages per minute, that kind of thing.

I'll change the monitor, blow out the mon_XX.rrd, give it some time, and let you know.

Again, thanks!

Bryan

BryanRagon

09-06-2004 15:33:41

Thanks, works like a charm.

Now to keep setting up the rest of the printers in the office.

We have some historical data from when we used MRTG. Is there any way to import that data using rrdtool from the command line?

Thanks!

balleman

09-06-2004 15:58:34

It should be possible to import this data; however, we have never written out a procedure for it.

You will want to first create monitors for the files, so that you have a monitor number to reference. You can then rename the rrd file, and use "rrdtool tune" to change the DS name to mon_1234 (or whatever).
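
For an old RRD file that already has a single DS, the rename step could look roughly like this. It is a dry-run sketch that only prints the commands rather than running them; the monitor number 1234, the old DS name ds0, and the helper name `import_cmds` are placeholders, and /var/lib/netmrg is assumed to be the directory where NetMRG keeps its RRD files:

```shell
# Hypothetical dry-run helper: print the commands, do not execute them.
import_cmds() {
    # $1 = old RRD file, $2 = NetMRG monitor number, $3 = old DS name
    # Copy rather than move, so the original file is kept as a backup.
    echo "cp $1 /var/lib/netmrg/mon_$2.rrd"
    # rrdtool tune can rename a data source with --data-source-rename old:new
    echo "rrdtool tune /var/lib/netmrg/mon_$2.rrd --data-source-rename $3:mon_$2"
}

import_cmds old_data.rrd 1234 ds0
```

Once the printed commands look right for your setup, they can be run by hand.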

There are a lot of potential issues. NetMRG only supports one DS per RRD file, but other programs (and probably MRTG) use more than one per file. I am not sure if these can be split easily... you might need to "rrdtool dump", split the DSes out of the XML, and then "rrdtool restore" them into separate files. Another issue is that other products will have a different geometry for their RRAs, but rrdtool should be pretty good at hiding that in most cases.

In any case, it is possible, but not trivial. Good luck, and let us know how it comes out if you decide to attempt it.

BryanRagon

09-06-2004 16:51:59

Sure enough, I did.

[code1c709b54ef7]
rrdtool dump old_data.rrd > old_data.xml
[/code1c709b54ef7]

As expected, it recorded two values (in my example, page count and page count since reboot). I needed to split them into two separate files, so I did page count first.

There were two <ds> sections, so I removed the second. I also renamed the first one from <name> ds0 </name> to <name> mon_19 </name>.

Then, down in each <rra><cdp_prep>, there were two <ds> tags. I removed the second one in each <cdp_prep> in the XML file (there were only 8, so I did it manually).

Then in each of the <database> tags (8 database tags) there were hundreds of rows of the form:

[code1c709b54ef7]
<!-- 2004-06-06 21:35:00 EDT / 1086572100 --> <row><v> NaN </v><v> NaN </v></row>
[/code1c709b54ef7]

I needed to remove the second <v> tags, so I ran it through the following perl one-liner:

[code1c709b54ef7]

perl -p -e 's/<row><v>(.*)<\/v><v>(.*)<\/v><\/row>/<row><v>$1<\/v><\/row>/g' old_data.xml > old_data_next.xml
[/code1c709b54ef7]

Then all I had to do was remove the "original" mon_19.rrd created by NetMRG and run:

[code1c709b54ef7]
rrdtool restore old_data_next.xml /var/lib/netmrg/mon_19.rrd
[/code1c709b54ef7]
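
For the other data source (page count since reboot), the same kind of row filter can keep the second <v> instead of the first. A sketch using sed; the function name `keep_second_v` and the sample values 1.0 and 2.0 are made up for illustration, and the <ds> and <cdp_prep> sections would still need the first entry removed by hand rather than the second:

```shell
# Hypothetical filter: reduce two-value rows from "rrdtool dump" output
# to the second value only. Reads XML on stdin, writes XML on stdout.
keep_second_v() {
    sed 's|<row><v>\(.*\)</v><v>\(.*\)</v></row>|<row><v>\2</v></row>|'
}

# Sample row with invented values, to show the effect:
echo '<row><v> 1.0 </v><v> 2.0 </v></row>' | keep_second_v
```

In practice it would be run as `keep_second_v < old_data.xml > some_new_file.xml` against a copy of the dump whose <ds> and <cdp_prep> sections have been edited accordingly.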

Voilà.

The only thing I noticed was that NetMRG had several <cf> tags, four each of AVERAGE, LAST, and MAX. My RRD from the old program had only AVERAGE and MAX. Everything appears to be working, but is it possible to use rrdtool tune to add the LAST tags? netmrg_gatherer has run several times and has not "auto-added" them like I thought it might.