ND - Network Integrity and Diagnostics

The values in parentheses are the default values provided by RCT.

Many tests begin by searching for the DUT address. This requires two settings to be correct.

  • Identifying DGN. Pick a status DGN that the device transmits frequently automatically. The value is in hexadecimal (e.g. WATERHEATER_STATUS -> 1FFF7).
  • Identifying Instance. Used in combination with the Identifying DGN to find the DUT. If the DGN does not have an Instance field, enter FF.

Bus Traffic Evaluation

Test NDBE is intended to answer the basic question - is this RV-C network prone to "traffic jams"? It doesn't just detect bursts as they occur but measures how messages are spread out over time, which tells you how efficiently the network is being utilized and therefore how much more traffic it can bear.

This is not an official RV-C compatibility test. It is intended for use on a full RV network in actual use, not just one device on the bench.

The test requires 30 seconds, during which time the network should be exercised as it might normally be used. While it is sampling the network, perform routine operations such as turning lights on and off, raising and lowering shades, starting the generator, etc. Do not attempt any diagnostic operations, firmware downloads, or other activities beyond the normal operation of the RV.

To analyze the bus activity, the NDBE test creates five data sets. Each data set breaks up the timeline into windows of a particular length. At the conclusion of the test, it calculates the probable number of messages that a random window might contain. The most critical data set looks at 5ms windows. Note that the RV-C bus can reliably accommodate no more than 8 messages in that time frame, so if the test registers a non-zero probability that 7 or more messages exist in a random window, it reports a Critical Burst.

Similarly, if the data set tracking 10ms windows sees 12 or more messages in one window, it reports a Serious Burst. If the 25ms tracker sees 26 or more messages in one window, it reports an Extreme Load. If the 50ms tracker sees 41 or more, it reports High Load.
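The window analysis and thresholds above can be sketched in Python. This is only an illustration under stated assumptions: it uses consecutive, non-overlapping windows and message timestamps in milliseconds measured from the start of the sample, and the function names are hypothetical. RCT's actual windowing method is not documented here.

```python
from collections import Counter

# Thresholds quoted in the text: window length (ms) -> (message count, label).
THRESHOLDS = {
    5: (7, "Critical Burst"),
    10: (12, "Serious Burst"),
    25: (26, "Extreme Load"),
    50: (41, "High Load"),
}


def window_counts(timestamps_ms, window_ms, duration_ms):
    """Count messages in each consecutive, non-overlapping window (assumption)."""
    n_windows = int(duration_ms // window_ms)
    counts = [0] * n_windows
    for t in timestamps_ms:
        index = int(t // window_ms)
        if 0 <= index < n_windows:
            counts[index] += 1
    return counts


def count_probabilities(counts):
    """Return a list p where p[n] is the fraction of windows holding n messages."""
    if not counts:
        return []
    histogram = Counter(counts)
    return [histogram.get(n, 0) / len(counts) for n in range(max(counts) + 1)]


def burst_flags(timestamps_ms, duration_ms):
    """Return the labels whose threshold was reached in at least one window."""
    flags = []
    for window_ms, (limit, label) in THRESHOLDS.items():
        probabilities = count_probabilities(
            window_counts(timestamps_ms, window_ms, duration_ms)
        )
        # A non-zero probability at or above the limit triggers the report.
        if any(p > 0 for p in probabilities[limit:]):
            flags.append(label)
    return flags
```

For a real 30 second sample, duration_ms would be 30000 and the timestamps would come from a bus capture.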

The test also examines how well spread the messages are by comparing the data statistically with what would be expected on a fully randomized network. Note that this is a very low bar - almost any attempt at all to avoid bursts and smooth out transmissions over time will beat random luck. If the distribution of messages within 5ms windows is worse than random, the test reports Poor Gap Management. If the distribution over the 100ms window is worse than random, the test reports Poor Load Variation Management.
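The statistic RCT uses for this comparison is not given here. One common way to compare traffic against a random schedule is the index of dispersion (the variance-to-mean ratio of the per-window message counts), sketched below as a stand-in rather than as RCT's method. The function names are hypothetical.

```python
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio of per-window message counts.

    About 1.0 for randomly (Poisson) timed messages, below 1.0 for evenly
    spaced traffic, above 1.0 for traffic that clumps into bursts.
    """
    average = mean(counts)
    if average == 0:
        return 0.0
    return pvariance(counts) / average


def worse_than_random(counts):
    """True when the counts are more clumped than a random schedule would be."""
    return dispersion_index(counts) > 1.0
```

Here counts is a list holding the number of messages seen in each window of a given length.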

Whether or not any of these messages will translate to actual malfunction depends on the capacity of each network device to buffer data and tolerate the additional latency. This capacity has little to do with the power of the microprocessor or the amount of RAM it has, but rather with the details of its CAN implementation. In a burst, a device might drop incoming messages, outgoing transmissions, or both. Occasional losses are usually harmless, but sometimes lead to a meaningful malfunction - and they are always difficult to diagnose. High load conditions suggest that, even if bursts are not occurring in normal operation, they are likely to occur during diagnostics or in special circumstances. Poor Management indicates that the network may have problems scaling up - the practical capacity of the network falls short of what it theoretically should be able to handle.

The data collected during the test is included in the log and can be cut-and-pasted into a spreadsheet for more detailed analysis. The data is comma-delimited, and can be saved as a CSV file or directly pasted in.
Each line of data includes the statistical summary and an array of n+1 numbers indicating the probabilities of a window of the particular size containing n messages.

DGN Request tests

Tests ND-10 through ND-30 require the following settings.

  • Unsupported DGN 1
  • Unsupported DGN 2
  • Unsupported DGN 3 - These three DGNs should be valid status DGNs, all different, and all not supported by the DUT. The default DGNs are LOCK_STATUS, FLOOR_HEAT_STATUS, and VEHICLE_SEAT_STATUS.
  • Supported DGN 1
  • Supported DGN 2
  • Supported DGN 3 - These three DGNs should be items supported by the DUT. If the device does not support three distinct DGNs, it is acceptable to use the same DGN more than once.

RCT cannot distinguish between messages sent in response to a request and messages sent on their normal operating schedule. Therefore:

  • When selecting DGNs, avoid DGNs that are broadcast frequently.
  • Certain tests require manual analysis of the test report. RCT cannot reliably determine when a "wrong DGN" is broadcast or when multiple responses are required.

Note that the official tests do not use the Instance field. Support for the Instance field is not specifically required - the field was added to the RV-C specification relatively late.
The unofficial Instanced test uses Supported DGN 1. It monitors the DUT briefly to determine which instances to use in the test.

Product ID

Test ND-40 does not require any particular settings. Note that the test does not specifically check that the contents of the Product ID are useful. Failing to put useful data in the fundamental identification DGN requires a level of stupidity that the authors of these tests do not wish to contemplate could exist.

DM_RV Tests

Tests ND-50 through ND-70 require

  • DSAs Reported. This is a string of codes, per the table below.
  • Statically Addressed. Set to Yes if the device uses its DSA as its SA.

Note that RCT does not check whether the DSA list is actually correct. The "DSA does not match device function" failure must be checked by the user.

For the five second and one second timing intervals, a 20% tolerance is allowed.

To determine whether the broadcasts are appropriately staggered, RCT uses the same time-window analysis as for test NDBE (Bus Traffic Evaluation) above. It uses a window, comparing the device reports against a random schedule. Note that this is a very low bar - almost any attempt to stagger the transmissions will beat purely random broadcasts.

Test ND-120 automatically finds the DUT on the network and waits for it to send a fault. The test only checks for one fault - if the device displays multiple faults the results might not be valid.

ND-60, ND-70

Development on these two tests is postponed until certain issues are clarified in the RVIA document.

Operational Fundamentals

These tests are among the most difficult to write and implement, as they each attempt to address a fundamental principle rather than a specific glitch. RCT can't always try every possible combination of values that might trigger a failure, so success is rarely absolute.

Device Traffic Evaluation

This is not an official test but a preliminary evaluation that preceded the Sample of Network Traffic test. It samples device transmissions over a 30 second interval and provides guidance regarding whether the product is ready for formal testing and what global message gap is appropriate for the device. Ideally the test should be conducted on a functioning real-world network, with the device being used in its normal manner. Motion devices should not be moved - motion status DGNs are allowed to override the general gap requirements that govern ordinary activity.

The procedure evaluates device activity over various time windows from 10ms to 250ms. Multiple messages within 10ms windows are flagged - RV-C allows such bursts when the extra messages are in response to requests or commands, but not for normal status broadcasts.

The procedure also evaluates the overall bus traffic. In general, devices are expected to transmit their status DGNs with a minimum of a 50 ms gap between messages. Some headroom, though, is appropriate to accommodate extra DM_RV messages and on-change status DGNs. The RV-C document does not specify a particular amount of headroom - RCT suggests that devices reserve 50% of their bandwidth for these extra messages. Devices that support a large number of on-change status messages may require more. Devices that generate a lot of high-frequency messages and few others (e.g. transfer switch) need less.

If a device is too busy to provide a 50ms gap between messages and still have headroom (i.e. more than 10 messages per second in normal operation), the gaps for specific status DGNs should be adjusted upwards, to values no higher than their listed maximum message gaps. Only after the schedule has been adjusted as much as possible should the general gap be reduced.
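The threshold of 10 messages per second follows from the figures above: a 50ms gap allows at most 20 messages per second, and reserving the suggested 50% leaves 10. A minimal Python sketch of that arithmetic; the function name and parameters are illustrative and are not RCT settings.

```python
def message_budget(gap_ms=50.0, reserved_fraction=0.5):
    """Messages per second left for scheduled status broadcasts."""
    max_rate = 1000.0 / gap_ms  # a 50 ms gap allows 20 messages per second
    return max_rate * (1.0 - reserved_fraction)
```

With the defaults this returns 10.0. A device that needs more headroom for on-change messages would use a larger reserved_fraction and so get a smaller budget.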

After all gap adjustments have been made, if the recommended gap is still less than 50ms, you can change the Overall Transmission Gap setting. RCT will then use that parameter when evaluating device traffic.

Proprietary Messages

In this test, RCT sends 2500 random proprietary messages. For each message, if it receives any sort of response within 10ms, it considers that a candidate failure. Each candidate failure gets retested up to four more times, and if a response is noted each time, a failure is reported. False positives are possible for very busy devices.
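The retest rule can be sketched as follows. This is an illustration only: got_response is a hypothetical callback that sends the message and returns True if any response arrives within 10ms. It is not an RCT function.

```python
def confirmed_failure(message, got_response, retests=4):
    """Return True only if the DUT responds to the first send and to every retest."""
    if not got_response(message):
        return False  # no response: not even a candidate failure
    for _ in range(retests):
        if not got_response(message):
            return False  # one quiet retest clears the candidate
    return True
```

Stopping at the first quiet retest is why a candidate is retested "up to" four times. Requiring five responses in a row reduces, but does not eliminate, false positives from a device whose scheduled broadcasts happen to land inside the 10ms window.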

Sample of Network Traffic

This is the most general test of RV-C compatibility. It answers three basic questions:

  • Are packets properly constructed?
  • Is the data within the packets properly encoded?
  • Are the packets properly distributed over time?

The tests for packet construction are straightforward. The test for data encoding uses the same "sniffing" dictionary as Omniscope's standalone RV-C Sniffer. During the test, the sniffer parses every incoming message and displays the values. The "sniff box" updates continuously and can be resized and sorted on the fly. The sniffer can parse every message but it can't verify that the values correspond to reality. It's up to the humans in the system to watch the sniff box and verify that the values are correct. At the end of the test RCT asks for the result.

The test checks both the overall message gap timing and the gap times for specific DGNs. It uses the Overall Transmission Gap, with some additional leeway to account for variation in bus access time. The leeway is calculated proportionately using the 50MS Timing Tolerance setting. For example, if 50MS Timing Tolerance=10ms and Overall Transmission Gap=30ms, RCT will provide 6ms of leeway in the test.
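The proportional calculation in that example can be written out as a short Python sketch. The function and parameter names are illustrative; only the arithmetic comes from the example above.

```python
def gap_leeway_ms(timing_tolerance_50ms, overall_gap_ms):
    """Scale the 50MS Timing Tolerance in proportion to the configured gap."""
    return timing_tolerance_50ms * overall_gap_ms / 50.0
```

For the values in the example, gap_leeway_ms(10, 30) gives 6.0. At the standard 50ms gap the leeway equals the tolerance setting itself.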

There are two ways to fail the general gap test. Occasional violations are tolerated, as responses to commands and requests are allowed to be immediate. RCT does not track all possible message triggers - it merely tracks the number of these single-extra message violations and if they exceed 10% of the total traffic RCT flags it as a failure. It immediately flags as a failure any time three or more messages appear within the gap time.
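A minimal Python sketch of the two failure rules, under stated assumptions: timestamps are in milliseconds, effective_gap_ms is the gap the caller wants enforced after any leeway has been applied, and a message counts as being within the gap time if it arrives less than that long before the current message. The function name and return strings are hypothetical, and RCT's exact bookkeeping may differ.

```python
def general_gap_result(timestamps_ms, effective_gap_ms, max_violation_ratio=0.10):
    """Apply the two general gap rules to one device's message timestamps."""
    times = sorted(timestamps_ms)
    single_extra = 0
    for i, t in enumerate(times):
        # Messages, including this one, inside the gap window ending at t.
        # (Quadratic scan: fine for a sketch, slow for long captures.)
        in_window = sum(1 for earlier in times[: i + 1] if t - earlier < effective_gap_ms)
        if in_window >= 3:
            return "DNP: three or more messages within the gap time"
        if in_window == 2:
            single_extra += 1
    if times and single_extra / len(times) > max_violation_ratio:
        return "DNP: too many single-extra violations"
    return "PASS"
```

A single early message in an otherwise regular schedule passes, which matches the allowance for immediate responses to commands and requests.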

There are two ways to fail the DGN-specific gap test. Occasional violations of the minimum gap are tolerated to account for commands and requests, but any more than one violation within the gap window is flagged as a failure. Any violation of the maximum gap requirement is also flagged as a failure.

The device should be tested in as close to a real-world environment as possible - ideally, in a working RV. The device should be operated as it would be routinely used in the RV. For example, if the DUT is an air conditioner, the test should include cycling through the possible operating modes, adjusting the thermostat, and watching the unit start and stop. But some operations should be avoided as they will likely trigger a false positive.

  • Moving devices (e.g. slide, awning) should not be moved.
  • Control panels (e.g. keypads, touchscreens) should not be used to control movement.
  • Test, diagnostic, and configuration operations should not be performed.
  • Do not add new devices to the network during the test. Some devices poll the network for data upon startup, which can cause false positives on the gap tests.

Ignore Source Address, Empty Commands, Incorrect Instance

Each of these tests requires devising two different commands that the device supports.

Test                          Command A            Command B            Notes
ND-100 Ignore Source Address  Supported Command 1  Supported Command 2  Uses all data bytes. Best if command B "undoes" command A.
ND-110 Empty Commands         Supported Command 1  Supported Command 3  Uses only the instance bytes. The DGNs must be different, if possible.
ND-140 Incorrect Instance     Supported Command 1  Supported Command 3  Uses all data bytes but instance. The DGNs must be different, if possible.

Each command is entered in a hexadecimal string, with the DGN followed by the eight data bytes, separated by spaces. For example, 1FEF9 01 F3 FF FF FF FF FF FF represents THERMOSTAT_COMMAND_1, with Instance 1, Operating Mode 3, and all other fields ignored.
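To check a command string before entering it, the format can be parsed with a few lines of Python. This is a sketch of the format as described above, not RCT's own parser, and parse_command is a hypothetical name.

```python
def parse_command(text):
    """Split 'DGN b0 b1 ... b7' into (dgn, data), where data holds eight ints."""
    fields = text.split()
    if len(fields) != 9:
        raise ValueError("expected a DGN followed by eight data bytes")
    dgn = int(fields[0], 16)
    data = [int(field, 16) for field in fields[1:]]
    if any(not 0 <= byte <= 0xFF for byte in data):
        raise ValueError("each data byte must be between 00 and FF")
    return dgn, data


dgn, data = parse_command("1FEF9 01 F3 FF FF FF FF FF FF")
instance = data[0]  # byte 0 carries the standard instance
```

For the example string, dgn is 0x1FEF9 and instance is 1.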

For the responses (Expected Response 1, etc.), only the DGN is required. RCT only checks the instances of the responses, not the data fields.

Test ND-140 only randomizes the standard instance byte (byte 0). It does not randomize the secondary instances for DGNs such as THERMOSTAT_SCHEDULE_COMMAND_1.

NAK of Unsupported Commands

Test ND-130 requires five commands to be broadcast with invalid values. The DUT is expected to respond with a NAK for each. The default message ("Unsupported Command X") is the DATE_TIME_COMMAND, set to 13-31-99 at 25:61:61 - an example of a legitimate command being broadcast with invalid values. Substitute commands applicable to the DUT. If there are not five different sets of command values that apply, you may duplicate messages - RCT does not check for duplication.

There are three principal ways in which a command may trigger a NAK.

  • The command may have impossible values (such as setting the month to 13).
  • The command may have values that are meaningful but are not supported by the device (e.g. setting a furnace heat source to engine heat, when only combustion and electric are supported.)
  • The current conditions do not allow the command to be processed. (e.g. extending the moving device when a safety lock prevents it.)

In the first two cases there is some discretion allowed. Rather than fail with a NAK, a device is allowed to "fix" the errant value when safe and appropriate. For example, the device may replace an out-of-bounds value with the closest acceptable value. No such discretion is allowed in the third case - feedback is often crucial.

The following cases do not require a NAK.

  • The command has an incorrect instance. (This allows other devices to possibly process the command.)
  • The problematic data field is not supported by the device at all.

This test only checks for the NAK. It does not monitor any status DGNs.

DSA Abbreviations

Code DSA Description
AAS 126 Active Air Susp
AC 103 Air Conditioner
ACF 78 AC Fault Monitor
ACL 137 AC Load Control
ACM 77 AC Load Monitor
ACS 140 Generic AC Source
AFZ 109 Aux Freezer
AGS 65 AutoGenStart
AHT 97 Aux Heat
ALM 144 Alarm
ALV 83 Air Leveler System
ARF 108 Aux Refrigerator
ATS 79 Auto XFer Switch
AUD 112 Audio Entertainment
AWN 130 Awning
BAT 70 Battery
BKR 149 AC Breaker/Panel
BRD 253 Network Bridge
CBT 71 Chassis Battery
CHB 252 Chassis Bridge
CHG 76 Charge Controller
CLK 250 Clock
CON 74 Converter
DCL 146 DC Load Control
DIM 131 DC Dimmer
DMP 129 Waste Dump
DSC 139 DC Disconnect
EXT 143 External Interface
FAN 142 Vent Fan
FIL 128 Tank AutoFill
FUR 94 Furnace
GAS 120 Gas Detector
GEN 64 Generator
GPS 136 GPS
HYD 100 Hydronic Furnace
ICE 110 Icemaker
INV 66 Inverter
KPD 132 DC Input/Keypad
LCK 135 Door Lock
LOG 251 Data Logger
LPG 73 LPG System
LVL 81 Leveler System
MTR 138 DC Motor
PAN 68 Control Panel
PMP 127 Water Pump
REF 107 Refrigerator
SHD 134 Window Shade
SLD 84 Slide Room
SOC 69 StateOfCharge Monitor
SOL 141 Solar Controller
STL 249 Service Tool
STO 111 Stove
STP 147 Step Controller
THM 88 Thermostat
TNK 72 Tank System
TPM 133 Tire Monitor
TVL 118 TV Lift
VID 115 Video Entertainment
VLV 148 Plumbing Valve
VST 150 Vehicle Seat
WDW 145 Window Controller
WEA 80 Weather Station
WHT 101 Water Heater

How RCT Evaluates Burstiness

"Burstiness" refers to the tendency for poorly programmed nodes to send out their messages in bursts, which may seem harmless unless one realizes that other devices may also be as "bursty", and their bursts may just happen to coincide.

To measure burstiness, we consider the probability that a second device attempting to access the network will encounter excessive traffic. For each situation, we define a "burst" in terms of how many messages are tolerated within a particular time window. For example, the definition in test ND-90 is no more than one message in each 50ms window. (RCT adds some tolerance through the 50MS Timing Tolerance setting.)

In addition to the PASS/DNP result, RCT reports the results of its evaluation with an array of numbers suitable for cut-and-pasting into a spreadsheet. Here is a typical array.

45,"ms",16.123,"sec",3,"max",0.823,0.170,0.004,0.002
  • The time window used in the test. 45 ms.
  • The duration of the test. 16.123 sec.
  • The largest "burst" seen. 3 max
  • The probability that when a second device attempts to access the network, in the 45ms beforehand there were 0 messages. 0.823 (82.3%)
  • The probability that when a second device attempts to access the network, in the 45ms beforehand there was 1 message. 0.170 (17.0%)
  • The probability that when a second device attempts to access the network, in the 45ms beforehand there were 2 messages. 0.004 (0.4%)
  • The probability that when a second device attempts to access the network, in the 45ms beforehand there were 3 messages. 0.002 (0.2%)
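If a spreadsheet is not convenient, a line in this format can also be read with a few lines of Python. The sketch below assumes the field order shown in the sample line; the function name and dictionary keys are hypothetical.

```python
import csv


def parse_burst_line(line):
    """Parse one result line: window, duration, max burst, then probabilities."""
    row = next(csv.reader([line]))
    return {
        "window_ms": float(row[0]),
        "duration_sec": float(row[2]),
        "max_burst": int(row[4]),
        # probabilities[n] is the chance of n messages in the preceding window.
        "probabilities": [float(value) for value in row[6:]],
    }


result = parse_burst_line('45,"ms",16.123,"sec",3,"max",0.823,0.170,0.004,0.002')
```

The probabilities list has max_burst + 1 entries, one for each message count from 0 up to the largest burst seen.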

The RVIA Test Procedures document does not spell out specific degrees of leniency, though it states that due to the nature of the CAN bus arbitration process, some leniency is required. RCT allows some leniency through the 50MS Timing Tolerance and 250MS Timing Tolerance settings. Additional leniency may be provided in the tests as documented in these web pages.