
Is using flexible array members in C bad practice?

I recently read that using flexible array members in C was poor software engineering practice. However, that statement was not backed by any argument. Is this an accepted fact?

(Flexible array members are a C feature introduced in C99 whereby one can declare the last member of a struct to be an array of unspecified size. For example:)

struct header {
    size_t len;
    unsigned char data[];
};

Manos Nikolaidis

It is an accepted "fact" that using goto is poor software engineering practice. That doesn't make it true. There are times when goto is useful, particularly when handling cleanup and when porting from assembler.
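For what it's worth, the cleanup case usually looks something like the sketch below. The function name read_prefix and its signature are made up for illustration; the point is only that every exit path runs through the labels at the bottom, so each resource is released exactly once.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads up to cap bytes from path into a fresh buffer.
 * Returns the number of bytes read, or -1 on failure. On success the caller
 * owns *out and must free() it. */
long read_prefix(const char *path, size_t cap, unsigned char **out)
{
    long result = -1;
    unsigned char *buf = NULL;
    size_t got;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        goto done;                 /* nothing to release yet */

    buf = malloc(cap ? cap : 1);   /* avoid a zero-sized request */
    if (buf == NULL)
        goto close_file;

    got = fread(buf, 1, cap, fp);
    if (ferror(fp))
        goto free_buf;

    *out = buf;
    buf = NULL;                    /* ownership moved to the caller */
    result = (long)got;

free_buf:
    free(buf);                     /* free(NULL) is a no-op on success */
close_file:
    fclose(fp);
done:
    return result;
}
```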

Flexible array members strike me as having one main use, off the top of my head, which is mapping legacy data formats like window template formats on RiscOS. They would have been supremely useful for this about 15 years ago, and I'm sure there are still people out there dealing with such things who would find them useful.

If using flexible array members is bad practice, then I suggest that we all go tell the authors of the C99 spec this. I suspect they might have a different answer.


goto is also useful when we want to turn a recursive algorithm into a non-recursive implementation, in cases where recursion would add overhead.
@pranavk You should probably be using while, then.
Network programming is another: you have the header as a struct, and the packet (or whatever it is called in the layer you are in) as the flexible array. When calling the next layer, you strip off the header and pass the packet along. Do this for each layer in the network stack. (You cast the data received from the lower layer to the struct for the layer you are in.)
@pranavk goto is not for loops.
"There are times when goto is useful" See, this is why I sometimes shudder while thinking some kid who's just learning to program will resort to StackOverflow for learning best practices.
Remo.D

PLEASE READ CAREFULLY THE COMMENTS BELOW THIS ANSWER

As C standardization moves forward, there is no reason to use [1] anymore.

The reason I would give for not doing it is that it's not worth it to tie your code to C99 just to use this feature.

The point is that you can always use the following idiom:

struct header {
  size_t len;
  unsigned char data[1];
};

That is fully portable. Then you can take the 1 into account when allocating the memory for n elements in the array data :

ptr = malloc(sizeof(struct header) + (n-1));

If you already have C99 as a requirement to build your code for any other reason, or you are targeting a specific compiler, I see no harm.
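As a rough sketch of the allocation arithmetic for the [1] idiom: the struct is renamed header1 here, and the two helper names are invented for illustration. The first helper computes the size as written above; the second uses offsetof, which does not count any trailing padding in the struct.

```c
#include <stddef.h>

/* The [1] idiom from this answer, renamed so it can sit next to other variants. */
struct header1 {
    size_t len;
    unsigned char data[1];
};

/* Size as computed above: the whole struct, minus the one element
 * that sizeof has already counted. Only meaningful for n >= 1. */
size_t size_via_sizeof(size_t n)
{
    return sizeof(struct header1) + (n - 1);
}

/* Size computed from the offset of data, which excludes trailing padding. */
size_t size_via_offsetof(size_t n)
{
    return offsetof(struct header1, data) + n;
}
```

On a typical machine the sizeof version simply asks for a few more bytes than strictly needed. This only concerns the size computation; whether indexing past data[0] is well-defined is a separate question.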


Thanks. I left the n-1 since it might not be used as a string.
The 'following idiom' is not fully portable, which is why flexible array members were added to the C99 standard.
@Remo.D: minor point: the n-1 does not accurately account for the extra allocation, because of alignment: on most 32-bit machines, sizeof(struct header) will be 8 (to remain a multiple of 4, since it has a 32-bit field which prefers/requires such alignment). The "better" version is: malloc(offsetof(struct header, data) + n)
In C99 using unsigned char data[1] isn't portable because ((header*)ptr)->data + 2 -- even if enough space was allocated -- creates a pointer that points outside the length-1 array object (and not the sentinel one past the end). But per C99 6.5.6p8, "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined" (emphasis added). Flexible arrays (6.7.2.1p16) act like an array filling the allocated space to not hit UB here.
WARNING: Using [1] has been shown to cause GCC to generate incorrect code: lkml.org/lkml/2015/2/18/407
maxschlepzig

No, using flexible array members in C is not bad practice.

This language feature was first standardized in ISO C99, 6.7.2.1 (16). In the following revision, ISO C11, it is specified in Section 6.7.2.1 (18).

You can use them like this:

struct Header {
    size_t n;
    long v[];
};
typedef struct Header Header;
size_t n = 123; // can dynamically change during program execution
// ...
Header *h = malloc(sizeof(Header) + sizeof(long[n]));
h->n = n;

Alternatively, you can allocate like this:

Header *h = malloc(sizeof *h + n * sizeof h->v[0]);

Note that sizeof(Header) includes any padding bytes, so the following allocation is incorrect and may yield a buffer overflow:

Header *h = malloc(sizeof(size_t) + sizeof(long[n])); // invalid!

A struct with a flexible array member halves the number of allocations it needs, i.e. instead of 2 allocations for one struct object you need just 1. That means less effort and less memory occupied by the memory allocator's bookkeeping overhead. Furthermore, you save the storage for one additional pointer. Thus, if you have to allocate a large number of such struct instances, you measurably improve the runtime and memory usage of your program (by a constant factor).
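To make the comparison concrete, here is a minimal sketch of both layouts. The type and function names (fam_vec, ptr_vec and their constructors) are invented for this illustration and are not part of any library.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Flexible array member: header and payload live in one block. */
struct fam_vec {
    size_t n;
    long v[];
};

/* Pointer member: header and payload are two separate blocks. */
struct ptr_vec {
    size_t n;
    long *v;
};

/* One malloc; release with a single free(). */
struct fam_vec *fam_vec_new(size_t n)
{
    struct fam_vec *p;

    /* Reject sizes whose byte count would overflow size_t. */
    if (n > (SIZE_MAX - sizeof(struct fam_vec)) / sizeof(long))
        return NULL;
    p = malloc(sizeof *p + n * sizeof p->v[0]);
    if (p == NULL)
        return NULL;
    p->n = n;
    memset(p->v, 0, n * sizeof p->v[0]);
    return p;
}

/* Two allocations; both must be released, in the right order. */
struct ptr_vec *ptr_vec_new(size_t n)
{
    struct ptr_vec *p = malloc(sizeof *p);        /* allocation 1 */
    if (p == NULL)
        return NULL;
    p->v = calloc(n ? n : 1, sizeof p->v[0]);     /* allocation 2 */
    if (p->v == NULL) {
        free(p);
        return NULL;
    }
    p->n = n;
    return p;
}

void ptr_vec_free(struct ptr_vec *p)
{
    if (p != NULL) {
        free(p->v);
        free(p);
    }
}
```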

In contrast, emulating flexible array members with non-standard constructs that yield undefined behavior (e.g. long v[0]; or long v[1];) obviously is bad practice and, like any undefined behavior, should be avoided.

Since ISO C99 was released in 1999, more than 20 years ago, striving for ISO C89 compatibility is a weak argument.


Roddy

You meant...

struct header
{
 size_t len;
 unsigned char data[];
}; 

In C, that's a common idiom. I think many compilers also accept:

  unsigned char data[0];

Yes, it's dangerous, but then again, it's really no more dangerous than normal C arrays - i.e., VERY dangerous ;-) . Use it with care and only in circumstances where you truly need an array of unknown size. Make sure you malloc and free the memory correctly, using something like:-

  struct header *foo = malloc(sizeof *foo + N * sizeof foo->data[0]);
  foo->len = N;

An alternative is to make data just be a pointer to the elements. You can then realloc() data to the correct size as required.

  struct header
    {
     size_t len;
     unsigned char *data;
    }; 
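A minimal sketch of that realloc() approach, assuming the pointer-based struct above (renamed header_ptr here so it does not clash with the flexible-array version). The helper header_ptr_resize is hypothetical.

```c
#include <stdlib.h>

/* Pointer-based variant: the struct itself has a fixed size. */
struct header_ptr {
    size_t len;
    unsigned char *data;
};

/* Hypothetical helper: resize the buffer to new_len bytes.
 * Returns 0 on success, -1 on failure (the old buffer stays valid). */
int header_ptr_resize(struct header_ptr *h, size_t new_len)
{
    unsigned char *tmp;

    if (new_len == 0) {
        free(h->data);
        h->data = NULL;
        h->len = 0;
        return 0;
    }
    /* Assign to a temporary so the old pointer is not lost on failure. */
    tmp = realloc(h->data, new_len);
    if (tmp == NULL)
        return -1;
    h->data = tmp;
    h->len = new_len;
    return 0;
}
```

Because only data moves, the struct can live on the stack or inside another object, and pointers to the struct stay valid across a resize.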

Of course, if you were asking about C++, either of these would be bad practice. Then you'd typically use STL vectors instead.


provided that you are coding on a system where STL is supported!
C++ but no STL... That's not a pleasant thought!
Name one compiler that accepts zero-length arrays. (If the answer was GCC, now name another.) It is not sanctioned by the C standard.
I've worked in a C++ but no STL environment - we had our own containers which provided the commonly used functionality without the full generality of the STL iterator system. They were easier to understand and had good performance. However, this was in 2001.
@JonathanLeffler Accepted by GCC and Clang, which covers two out of the three main compilers in use today. (MSVC is the other big one, and that's only really relevant on one — admittedly very common — platform.)
Nyan

I've seen something like this, from C Interfaces and Implementations:

struct header {
    size_t len;
    unsigned char *data;
};

struct header *p;
p = malloc(sizeof(*p) + len + 1);
p->data = (unsigned char *)(p + 1);  // memory after p is mine!

Note: data need not be last member.


Indeed this has the advantage that data need not be the last member, but it also incurs an extra dereference every time data is used. Flexible arrays replace that dereference with a constant offset from the main struct pointer, which is free on some particularly common machines and cheap elsewhere.
@R.. Although, considering the target address is necessarily the byte directly after the pointer, it is approximately 100% guaranteed to already be in L1 cache, giving the entire dereference something like half a cycle of overhead. However, the point stands that flexible arrays are a better idea here.
With unsigned char *, p->data = (unsigned char*) (p + 1 ) is OK. Yet with double complex *, p->data = (double complex *) (p + 1 ) may cause alignment problems.
This answer is technically irrelevant, as it does something different (it lays out the data differently in memory). While the pattern it describes is often useful, that doesn't mean that it can be a replacement for the other.
diapir

As a side note, for C89 compatibility, such a structure should be allocated like this:

struct header *my_header
  = malloc(offsetof(struct header, data) + n * sizeof my_header->data[0]);

Or with macros :

#define FLEXIBLE_SIZE SIZE_MAX /* or whatever maximum length for an array */
#define SIZEOF_FLEXIBLE(type, member, length) \
  ( offsetof(type, member) + (length) * sizeof ((type *)0)->member[0] )

struct header {
  size_t len;
  unsigned char data[FLEXIBLE_SIZE];
};

...

size_t n = 123;
struct header *my_header = malloc(SIZEOF_FLEXIBLE(struct header, data, n));

Setting FLEXIBLE_SIZE to SIZE_MAX almost ensures this will fail :

struct header *my_header = malloc(sizeof *my_header);

Overly complex and there's no benefit over using [1] for C89 compatibility, if it's even needed...
Optimising compilers can correctly assume that an index into an array of length 1 must be zero. Kaboom!
Chef Gladiator

There are some downsides related to how structs are sometimes used, and it can be dangerous if you don't think through the implications.

For your example, if you start a function:

void test(void) {
  struct header header;               /* automatic storage: no room for data */
  unsigned char *p = &header.data[0];

  ...
}

Then the results are undefined (since no storage was ever allocated for data). This is something that you will normally be aware of, but there are cases where C programmers are likely used to being able to use value semantics for structs, which breaks down in various other ways.

For instance, if I define:

struct header2 {
  int len;
  char data[MAXLEN]; /* MAXLEN some appropriately large number */
};

Then I can copy two instances simply by assignment, i.e.:

struct header2 inst1 = inst2;

Or if they are defined as pointers:

*inst1 = *inst2;

This however won't work for flexible array members, since their content is not copied over. What you want is to dynamically malloc the size of the struct and copy over the array with memcpy or equivalent.

struct header3 {
  int len;
  char data[]; /* flexible array member */
};
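A minimal sketch of such a copy, assuming every struct header3 was allocated as sizeof(struct header3) + len bytes. The helper names header3_new and header3_clone are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

struct header3 {
    int len;
    char data[];   /* flexible array member */
};

/* Hypothetical constructor: one block holding the header plus len bytes. */
struct header3 *header3_new(const char *bytes, int len)
{
    struct header3 *h;

    if (len < 0)
        return NULL;
    h = malloc(sizeof *h + (size_t)len);
    if (h == NULL)
        return NULL;
    h->len = len;
    if (len > 0)
        memcpy(h->data, bytes, (size_t)len);
    return h;
}

/* Deep copy: plain assignment (*dst = *src) would copy len but not data. */
struct header3 *header3_clone(const struct header3 *src)
{
    size_t total = sizeof *src + (size_t)src->len;
    struct header3 *dst = malloc(total);

    if (dst == NULL)
        return NULL;
    memcpy(dst, src, total);   /* header and array contents in one go */
    return dst;
}
```

Both results are released with a single free().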

Likewise, writing a function that accepts a struct header3 by value will not work, since arguments in function calls are, again, copied by value, and thus what you will get is only the fixed-size part of the struct; the contents of the flexible array member are not copied.

 void not_good ( struct header3 ) ;

This does not make it a bad idea to use, but you do have to keep in mind to always dynamically allocate these structures and only pass them around as pointers.

 void good ( struct header3 * ) ;