String Array Zoo

Let's look at the different ways to store constant character data in arrays, and at what changes when those arrays live inside different objects.

A 1D Array

In C/C++, a one-dimensional character string can be passed by pointer:

struct S
   {
   const char * astring;
   };

int main(void)
   {
   S p;
   p.astring = "one";
   }
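A small sketch of what storing the string by pointer implies: the struct holds only the address of the characters, so copying the struct copies the pointer and not the text. The helper names same_storage and copy_shares_pointer are made up for illustration.

```cpp
#include <cstring>

struct S
   {
   const char * astring;
   };

// Hypothetical helper: true when both structs point at the same characters.
bool same_storage(const S & a, const S & b)
   {
   return a.astring == b.astring;
   }

// Builds one struct from a literal, copies it, and checks the copy.
bool copy_shares_pointer()
   {
   S p;
   p.astring = "one";
   S q = p;              // copies the pointer only, not the characters
   return same_storage(p, q) && std::strcmp(q.astring, "one") == 0;
   }
```

This is cheap to pass around, but the struct depends on the pointed-to characters staying alive, which is always true for a string literal.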

A more common way to store string data in C++ is to pre-allocate a fixed-length buffer. Note that C/C++ does not allow one array to be assigned to another directly, so you must work around it:

  1. Place the destination array into a struct and make it the only member.
  2. Place the source string into braces (either one or two pairs), creating an initializer list.
  3. Assign the list to the struct directly.

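The characters of the string are copied into the array inside the struct, so the struct owns its own deep copy. When this is written as an initialization (S p = {"one"};), it is aggregate initialization and no temporary is involved. When it is written as an assignment to an existing struct, a temporary struct is built from the list and copied in by the implicit copy-assignment operator.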

One quirk: double braces produce the same result as single braces. The inner pair belongs to the array member, and C++ lets you leave it out (brace elision).

#include <iostream>

struct S
   {
   char astring[4];
   };

int main(void)
   {
   S p = {"one"};      // inner braces omitted
   S r = {{"ONE"}};    // inner braces written out
   }
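The sketch below checks two things about the fixed-buffer struct above: that a braced list can also be assigned after the declaration, and that copying the struct copies the characters rather than a pointer. The function names are made up for illustration, and it assumes a C++11 or later compiler.

```cpp
#include <cstring>

struct S
   {
   char astring[4];      // room for three characters plus '\0'
   };

// True if assigning a braced list after declaration replaces the contents.
bool assign_from_braces()
   {
   S p = {"one"};        // initialization
   p = {"ONE"};          // a temporary S is built from the list, then assigned
   return std::strcmp(p.astring, "ONE") == 0;
   }

// True if a copied struct owns its own characters.
bool copy_is_deep()
   {
   S p = {"one"};
   S q = p;              // the whole array is copied along with the struct
   q.astring[0] = 'O';   // changes q only
   return std::strcmp(p.astring, "one") == 0
       && std::strcmp(q.astring, "One") == 0;
   }
```

Wrapping the array in a struct is what makes both operations legal; a bare char[4] cannot be assigned to another char[4].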

A 2D Array

#include <iostream>

struct S
   {
   char astring[2][4];
   };

int main(void)
   {
   S p = {"one", "TWO"};
   }

Quirk: either one or two pairs of braces around the initializer work here too.
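As a rough check of that quirk, the sketch below builds the same 2D struct with both brace styles and compares the raw bytes. The function names are hypothetical, and some compilers may warn about the omitted braces.

```cpp
#include <cstring>

struct S
   {
   char astring[2][4];
   };

// True if both brace styles leave identical bytes in the struct.
bool braces_equivalent()
   {
   S p = {"one", "TWO"};     // inner braces omitted
   S r = {{"one", "TWO"}};   // inner braces written out
   return std::memcmp(&p, &r, sizeof(S)) == 0;
   }

// True if each row holds its own null-terminated string.
bool rows_are_strings()
   {
   S p = {"one", "TWO"};
   return std::strcmp(p.astring[0], "one") == 0
       && std::strcmp(p.astring[1], "TWO") == 0;
   }
```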

Limitation: above one dimension, there is no practical way to assign directly from an initializer to pointer-based storage (**, *** and so on), especially when the destination lives in another object.

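You can initialize a local pointer array directly, as in the following, but the same braced list cannot be handed to a pointer-to-pointer (**) member of another object, because a braced list is not itself an array that a pointer can refer to. Also notice how the declaration syntax is getting harder to read: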

#include <iostream>

int main(void)
   {
   char const * astring[] = {"one", "two"};
   }

A pointer-to-pointer (**) will not take an assignment from an initializer list. It is a single pointer value rather than an array, so there is nothing for the list elements to fill, and each string literal in the list has its own array type (const char[4] for "one", for example) that depends on its length. Because of this variability, the destination must be a real array whose string dimension is at least as large as the longest input string, including its null terminator.
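A pointer-to-pointer member can still refer to a set of strings if it is pointed at a named pointer array instead of at a braced list. The following is a minimal sketch, assuming the array has static storage so that it outlives the struct; the names table, make_s and contains are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

struct S
   {
   const char * const * strings;   // address of the first pointer in some array
   std::size_t count;              // number of pointers in that array
   };

// The pointer array is a named object with static storage, so it
// outlives any S that refers to it.
static const char * const table[] = {"one", "two"};

S make_s()
   {
   S s;
   s.strings = table;                            // array decays to a pointer
   s.count = sizeof(table) / sizeof(table[0]);
   return s;
   }

// Hypothetical helper: true if the text appears among the strings.
bool contains(const S & s, const char * text)
   {
   for (std::size_t i = 0; i < s.count; ++i)
      if (std::strcmp(s.strings[i], text) == 0)
         return true;
   return false;
   }
```

The count travels with the pointer here because the ** itself carries no dimension information.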

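A braced initializer list cannot be walked by hand, because it is not an object in its own right. In general, you can only assign one to a standard array type, which pads every short string with nulls up to the full row width and can waste a lot of space.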

A good strategy to save memory is to allocate a standard array of nulls on the heap, take input from an initializer to this heap array, copy the contents from the standard array into a pointer array, then delete the standard array.
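The strategy above might be sketched as follows. This is one possible reading, not a definitive implementation: the names Wide, ROWS, WIDTH, compact, free_strings and load are invented for the example, the sizes are arbitrary, and it assumes C++11 or later.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t ROWS = 3;
const std::size_t WIDTH = 16;

struct Wide
   {
   char text[ROWS][WIDTH];   // fixed-width rows, padded with '\0'
   };

// Copies each row into a right-sized heap string and returns a
// nullptr-terminated pointer array. Release it with free_strings().
char ** compact(const Wide & w)
   {
   char ** out = new char *[ROWS + 1];
   for (std::size_t i = 0; i < ROWS; ++i)
      {
      std::size_t len = std::strlen(w.text[i]);
      out[i] = new char[len + 1];
      std::memcpy(out[i], w.text[i], len + 1);   // includes the '\0'
      }
   out[ROWS] = nullptr;
   return out;
   }

// Frees every string, then the pointer array itself.
void free_strings(char ** strings)
   {
   for (char ** p = strings; *p != nullptr; ++p)
      delete[] *p;
   delete[] strings;
   }

char ** load()
   {
   // Temporary standard array on the heap, filled from the initializer.
   Wide * tmp = new Wide{{"one", "two", "three"}};
   char ** out = compact(*tmp);
   delete tmp;               // the padded copy is no longer needed
   return out;
   }
```

With these sizes the temporary occupies 48 bytes of character storage, while the compacted strings need 14 bytes plus the four pointers.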

A 3D Array

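Here is a 3D array in C++. To use the initializer with braced inner groups, you must wrap everything in two pairs of braces. Unlike the 1D and 2D cases, the second pair is not optional: without it, the first inner group is taken as the initializer for the whole array member, and the second group is left over as an error. Working from the outermost to the innermost braces: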

  1. The outer pair corresponds to the struct being assigned to.
  2. The next pair corresponds to the destination array member inside the struct.
  3. The number of braced groups at the next level corresponds to the first dimension of the destination array.
  4. The maximum count of strings in any braced group corresponds to the second dimension.
  5. The maximum count of characters in any string (including the null terminator) corresponds to the last dimension.

#include <iostream>

struct S
   {
   char astring[2][2][4];
   };

int main(void)
   {
   S p = {{ {"one", "ONE"},{"two","TWO"} }};
   }
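To see how the brace levels map onto the three dimensions, the sketch below fills the same struct and walks it with plain indexing. The function names make_example and total_length are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

struct S
   {
   char astring[2][2][4];
   };

S make_example()
   {
   S p = {{ {"one", "ONE"}, {"two", "TWO"} }};
   return p;
   }

// Adds up the length of every stored string, using indexes
// rather than pointer arithmetic.
std::size_t total_length(const S & s)
   {
   std::size_t total = 0;
   for (std::size_t i = 0; i < 2; ++i)        // braced groups
      for (std::size_t j = 0; j < 2; ++j)     // strings in a group
         total += std::strlen(s.astring[i][j]);
   return total;
   }
```

Here astring[1][0] is the first string of the second braced group, "two".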

In summary, C++ multidimensional array assignment across objects is tricky.

  1. Standard dimension-specified arrays (linear, rectangular, cubic, etc.) are supported by the language. There are bonuses for using them, like free deep-copy from initializer lists.
  2. Standard arrays need to have their dimensions declared before reading, writing, passing or storage. Pointer arrays don't necessarily need known dimension information if all dimensions are bounded by nulls.
  3. Pointer arrays are best used with persistent memory, and can be passed and stored easily by pointer (*, **, ***, etc.).
  4. Standard arrays require dimension information to be known at compile-time if used on the stack.
  5. Standard arrays should be parsed with indexing rather than pointer arithmetic.
  6. When assigning strings to standard arrays, you need to include space for a null terminator for each string.
  7. Pointer arrays need an extra slot in each dimension to place null terminators after the last element (a '\0' character for the strings, nullptr for the 1st and 2nd dimensions). These nulls make the array possible to parse without knowing dimension information.
  8. Standard arrays tend to be wasteful for permanent string storage. Pointer arrays are much more economical in memory. It's generally better to treat standard arrays as temporary because of this.
  9. As a rule of thumb with C/C++ arrays it is generally best to consider assignment, passing and storage as three different operations that need to be aligned before you write a lot of code. Write small test code first in a separate file to understand what you are doing.
  10. Pointer arrays are best thought of as composites of 1D arrays. They require some support code to parse, but you don't need to know their dimensions.
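The support code mentioned in points 7 and 10 can be quite small. The sketch below counts the strings in a two-level pointer array using only the nullptr terminators; the names count_strings, first, second, table and count_example are invented for the example.

```cpp
#include <cstddef>

// Counts the strings in a nullptr-terminated array of
// nullptr-terminated string arrays, with no dimension information.
std::size_t count_strings(const char * const * const * groups)
   {
   std::size_t count = 0;
   for (; *groups != nullptr; ++groups)
      for (const char * const * s = *groups; *s != nullptr; ++s)
         ++count;
   return count;
   }

// Example data: each level carries one extra slot for its terminator,
// and the groups do not need to be the same length.
static const char * const first[]  = {"one", "ONE", nullptr};
static const char * const second[] = {"two", nullptr};
static const char * const * const table[] = {first, second, nullptr};

std::size_t count_example()
   {
   return count_strings(table);
   }
```

Because each group is its own 1D array, the rows can be ragged, which a standard rectangular array cannot express without padding.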