ASN.1 PER Implementation Guidelines (Q115)

In early 2004, news reports implicated Abstract Syntax Notation One (ASN.1) as the source of flaws. However, the flaws were due to poor implementations, not to any inherent weakness in ASN.1. To improve current and future implementations, we put together the following set of cautionary guidelines for implementing the run-time component of the basic aligned variant of the Packed Encoding Rules (PER). We don't discuss offline aspects such as compiling an ASN.1 syntax tree into a run-time representation.

We chose PER and not, for example, BER, because of PER's use by H.323, a popular packet-based multimedia protocol. However, many of the things mentioned here are applicable to other encoding rules and even non-ASN.1 encoding schemes. The guidelines are further limited to just the aspects of ASN.1 PER that are used by H.323.

Notice that the guidelines focus on decoding (input) as opposed to encoding (output). That is typical for I/O subsystems because systems in general have much less control over input than output, so input processing requires much greater care.

The reader must have some basic understanding of ASN.1 PER--this is not a tutorial. Finally, if you can think of something we missed, please let us know.

Decode

Check before allocating memory. When allocating dynamic memory from the heap, a buffer pool, etc., ALWAYS check whether the attempt succeeded before using the memory. Dynamic memory is often used for variable-length data such as character, octet, and bit strings and object identifiers. It may also be prudent to implement a size sanity check on top of the dynamic memory manager: for example, reject any request to allocate a buffer larger than 50 kilobytes, even if there are sufficient resources to fulfill it.
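
The allocation guideline might be sketched in C as follows. The names per_alloc, per_copy_octets, and PER_MAX_ALLOC are hypothetical, and the 50-kilobyte cap simply mirrors the example figure above; this is an illustration under those assumptions, not a prescribed design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cap; 50 KB mirrors the example figure in the text. */
#define PER_MAX_ALLOC ((size_t)50 * 1024)

/* Returns a zeroed buffer of `size` bytes, or NULL if the request is
 * empty, exceeds the cap, or the underlying allocator fails. */
static void *per_alloc(size_t size)
{
    if (size == 0 || size > PER_MAX_ALLOC)
        return NULL;            /* refuse before touching the allocator */
    return calloc(1, size);     /* may still be NULL: caller must check */
}

/* Example caller: copies `len` decoded octets into a fresh buffer.
 * Returns 0 on success, -1 on failure; *out is NULL on failure. */
static int per_copy_octets(const unsigned char *src, size_t len,
                           unsigned char **out)
{
    assert(src != NULL && out != NULL);  /* programmer error, not input */
    *out = per_alloc(len);
    if (*out == NULL)
        return -1;              /* never use the buffer without this check */
    for (size_t i = 0; i < len; i++)
        (*out)[i] = src[i];
    return 0;
}
```

Routing every decoder allocation through one wrapper keeps the size policy in a single place, so the cap can be tuned without auditing each call site.
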
Low-level decoding. Localize the low-level reading of bits and octets to just a few functions and be careful about reading past the end of the encoded data. Take special care about reads that require octet alignment.
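
One possible way to localize bit-level reads is shown below. The per_reader structure and its functions are hypothetical names chosen for illustration. The sketch assumes bits are consumed most significant first and that the input buffer is no larger than SIZE_MAX / 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state: a read-only buffer plus a bit cursor. */
typedef struct {
    const uint8_t *data;
    size_t len_bits;   /* total bits available */
    size_t pos_bits;   /* next bit to read; always <= len_bits */
} per_reader;

static void per_reader_init(per_reader *r, const uint8_t *data, size_t len)
{
    assert(r != NULL && (data != NULL || len == 0));
    r->data = data;
    r->len_bits = len * 8;   /* assumes len <= SIZE_MAX / 8 */
    r->pos_bits = 0;
}

/* Reads n (0..32) bits, most significant first. Returns 0 on success,
 * -1 if n is out of range or fewer than n bits remain. */
static int per_read_bits(per_reader *r, unsigned n, uint32_t *out)
{
    if (n > 32 || n > r->len_bits - r->pos_bits)
        return -1;           /* would read past the end: refuse */
    uint32_t v = 0;
    for (unsigned i = 0; i < n; i++) {
        size_t p = r->pos_bits++;
        v = (v << 1) | ((r->data[p / 8] >> (7 - p % 8)) & 1u);
    }
    *out = v;
    return 0;
}

/* Skips padding bits up to the next octet boundary. */
static int per_align(per_reader *r)
{
    size_t aligned = (r->pos_bits + 7) / 8 * 8;
    if (aligned > r->len_bits)
        return -1;
    r->pos_bits = aligned;
    return 0;
}
```

If every higher-level decoding routine goes through these two functions, the end-of-data check lives in exactly one place.
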
Sanity checks. Various high-level sanity checks are redundant with the low-level checks, which are often performed iteratively. However, they are probably worth doing anyway for increased performance and robustness. Once you have determined from the length determinant how much iterative data to read, adjust for octet alignment, convert to bytes, and then compare the result with the number of bytes remaining in the encoded data. If there is insufficient data, fail immediately; there is no point in looping through the encoded data until you run off its end. This technique can be used when reading, for example, character, octet, and bit strings and object identifiers.

When calculating how much data you need to read, be careful about how "wide" each datum is. For example, unconstrained BMP characters are two octets wide, and characters with a permitted alphabet are encoded as bit fields whose width depends on the size of the alphabet.
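
The up-front check described above could look something like this. The function per_have_room and its parameters are hypothetical. For simplicity the sketch compares in bits rather than bytes, which is equivalent, and it divides rather than multiplies so that a hostile item count cannot overflow the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if `count` items of `bits_each` bits fit in the data that
 * remains after bit position `pos_bits` (optionally after first aligning
 * to an octet boundary), else 0. `total_bytes` is the encoding's size. */
static int per_have_room(size_t total_bytes, size_t pos_bits,
                         size_t count, unsigned bits_each, int align)
{
    assert(bits_each > 0);   /* item width comes from the type, not input */
    if (total_bytes > SIZE_MAX / 8 || pos_bits > total_bytes * 8)
        return 0;
    if (align)
        pos_bits = (pos_bits + 7) / 8 * 8;
    size_t remaining_bits = total_bytes * 8 - pos_bits;
    /* Divide instead of multiplying so count * bits_each cannot overflow. */
    return count <= remaining_bits / bits_each;
}
```

For example, five unconstrained BMP characters (16 bits each) starting at bit 1 of a 10-octet encoding do not fit, so per_have_room(10, 1, 5, 16, 0) returns 0 and the decoder can fail before reading a single character.
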
Empty open type isn't. Remember that an empty open type is encoded as a single octet whose value is 0. Sometimes implementers assume that no data at all is encoded for an empty open type.
Constrained character strings. For a fixed-width character string type (UTF8String, for example, is not fixed-width) with a permitted alphabet (e.g., IA5String (FROM("0123456789#*,"))), make sure that the encoded value of each character is valid for the alphabet. When characters are encoded as zero-based indexes into the alphabet, as in this example, each index must be less than the size of the alphabet. If it is not, the value was encoded incorrectly.
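
A minimal sketch of the decode-side alphabet check follows. The function per_index_to_char is a hypothetical name. The sketch assumes the encoded value is a zero-based index into the alphabet, and that the alphabet table is already stored in the ascending character-value order that PER uses to assign indexes (so the example alphabet is stored as "#*,0123456789").

```c
#include <assert.h>
#include <stddef.h>

/* Maps a decoded character index back to its character.
 * `alphabet` is assumed to be sorted by ascending character value.
 * Returns 0 on success, -1 if the index is outside the alphabet. */
static int per_index_to_char(const char *alphabet, size_t alphabet_len,
                             unsigned long index, char *out)
{
    assert(alphabet != NULL && out != NULL);
    if (index >= alphabet_len)
        return -1;              /* malformed encoding: reject */
    *out = alphabet[index];
    return 0;
}
```

With the 13-character example alphabet, each character occupies a 4-bit field, so the values 13, 14, and 15 can appear on the wire but must be rejected.
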
Integral precision. Be careful about decoding integer values larger than your variables can hold (this assumes you do not support arbitrary-precision integers). For example, if you use 32-bit integers, the (zero-based) length determinant for an unconstrained integer must be less than four. Because the length determinant is itself variable length, also make sure that you do not overflow the internal variable holding it.
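
The width check might be sketched as below for a 32-bit target. The function per_octets_to_i32 is a hypothetical name, and `n` is assumed to be the actual octet count of the value, after any offset implied by the length determinant has been applied. The sketch assumes a big-endian two's-complement encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts `n` big-endian two's-complement octets to int32_t.
 * Returns 0 on success, -1 if n is zero or the value needs more
 * than 32 bits. */
static int per_octets_to_i32(const uint8_t *p, size_t n, int32_t *out)
{
    assert(p != NULL && out != NULL);
    if (n == 0 || n > sizeof(int32_t))
        return -1;                        /* too wide for the target type */
    uint32_t v = (p[0] & 0x80) ? UINT32_MAX : 0;   /* sign-extend */
    for (size_t i = 0; i < n; i++)
        v = (v << 8) | p[i];
    /* Portable unsigned-to-signed conversion (avoids an
     * implementation-defined cast for values above INT32_MAX). */
    *out = (v <= (uint32_t)INT32_MAX)
               ? (int32_t)v
               : -(int32_t)(UINT32_MAX - v) - 1;
    return 0;
}
```

The length is checked before any octet is accumulated, so an oversized value is rejected rather than silently truncated.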

Encode

Low-level encoding. As with reading, localize the low-level writing of bits and octets to just a few functions and be careful about writing past the end of the encoded-data store. Take special care about writes that require octet alignment.
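
A bounds-checked bit writer could be sketched along these lines. The per_writer structure and its functions are hypothetical names. The sketch assumes a fixed-size output buffer that is zeroed up front, so that alignment padding is written as zero bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writer state: a fixed output buffer plus a bit cursor. */
typedef struct {
    uint8_t *data;
    size_t cap_bits;   /* capacity in bits */
    size_t pos_bits;   /* next bit to write; always <= cap_bits */
} per_writer;

static void per_writer_init(per_writer *w, uint8_t *buf, size_t cap)
{
    assert(w != NULL && buf != NULL);
    for (size_t i = 0; i < cap; i++)
        buf[i] = 0;          /* unwritten and padding bits read as 0 */
    w->data = buf;
    w->cap_bits = cap * 8;   /* assumes cap <= SIZE_MAX / 8 */
    w->pos_bits = 0;
}

/* Appends the low n (0..32) bits of v, most significant first.
 * Returns 0 on success, -1 if the bits would not fit. */
static int per_write_bits(per_writer *w, uint32_t v, unsigned n)
{
    if (n > 32 || n > w->cap_bits - w->pos_bits)
        return -1;           /* would write past the end: refuse */
    for (unsigned i = n; i-- > 0; ) {
        size_t p = w->pos_bits++;
        if ((v >> i) & 1u)
            w->data[p / 8] |= (uint8_t)(0x80u >> (p % 8));
    }
    return 0;
}

/* Advances to the next octet boundary; skipped bits stay 0. */
static int per_write_align(per_writer *w)
{
    size_t aligned = (w->pos_bits + 7) / 8 * 8;
    if (aligned > w->cap_bits)
        return -1;
    w->pos_bits = aligned;
    return 0;
}
```
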
Constrained character strings. For a character string with a permitted alphabet, make sure the character to be encoded is actually in the alphabet--explicitly test for this. For example, "G" cannot be encoded for IA5String (FROM("ABC")).
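
The explicit membership test might look like the following sketch. The function per_char_to_index is a hypothetical name, and the alphabet is assumed to be held as a plain array of characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the zero-based position of `c` in `alphabet`, or -1 if the
 * character is not permitted and therefore must not be encoded. */
static long per_char_to_index(const char *alphabet, size_t alphabet_len,
                              char c)
{
    assert(alphabet != NULL);
    const char *hit = memchr(alphabet, c, alphabet_len);
    return hit ? (long)(hit - alphabet) : -1;
}
```

With the alphabet "ABC", looking up 'G' yields -1, so the encoder can report an error instead of emitting an undefined bit pattern.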