
What happens to global and static variables in a shared library when it is dynamically linked?

I'm trying to understand what happens when modules with globals and static variables are dynamically linked to an application. By modules, I mean each project in a solution (I work a lot with Visual Studio!). Each module is built into a *.lib, a *.dll, or the *.exe itself.

I understand that the binary of an application contains global and static data of all the individual translation units (object files) in the data segment (and read only data segment if const).

What happens when this application uses a module A with load-time dynamic linking? I assume the DLL has a section for its globals and statics. Does the operating system load them? If so, where do they get loaded to?

And what happens when the application uses a module B with run-time dynamic linking?

If I have two modules in my application that both use A and B, are copies of A and B's globals created as mentioned below (if they are different processes)?

Do DLLs A and B get access to the application's globals?

(Please state your reasons as well)

Quoting from MSDN:

Variables that are declared as global in a DLL source code file are treated as global variables by the compiler and linker, but each process that loads a given DLL gets its own copy of that DLL's global variables. The scope of static variables is limited to the block in which the static variables are declared. As a result, each process has its own instance of the DLL global and static variables by default.

and from here:

When dynamically linking modules, it can be unclear whether different libraries have their own instances of globals or whether the globals are shared.

Thanks.

By modules you probably mean libs. There is a proposal to add modules to the C++ standard with a more precise definition of what a module would be and different semantics than regular libraries as of now.
Ah, I should have clarified that. I consider the different projects in a solution (I work a lot with Visual Studio) as modules. These modules are built into *.lib or *.dll files.
@DavidRodríguez-dribeas The term "module" is the correct technical term for standalone (fully linked) executable files, including: executable programs, dynamic-link libraries (.dll) or shared objects (.so). It is perfectly appropriate here, and the meaning is correct and well understood. Until there is a standard feature named "modules", the definition of it remains the traditional one, as I explained.

Mikael Persson

This is a pretty famous difference between Windows and Unix-like systems.

No matter what:

Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).

The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).

So, the key issue here is really visibility.

In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.

Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.

In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).

To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:

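// Define COMPILING_THE_DLL only when building the DLL itself, so that
// the DLL exports the symbol and every client imports it.
#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif

// Declaration for the shared header (extern "C" without braces implies
// extern); the DLL still needs one definition in a source file.
MY_DLL_EXPORT int my_global;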

When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.
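For code that has to build on both families of platforms, the same idea is often wrapped in a single macro. The following is a minimal sketch, not something taken from the answer above: MY_API and next_id are hypothetical names, and COMPILING_THE_DLL is defined inside the file only so that the sketch builds on its own (a real project would pass it from the build system and keep the declarations in a header). The non-Windows branch assumes GCC or Clang.

```cpp
// Hypothetical single-file sketch of a portable export macro.
// COMPILING_THE_DLL is defined here only so the file builds on its own;
// normally the build system defines it when building the library.
#define COMPILING_THE_DLL

#if defined(_WIN32)
  #ifdef COMPILING_THE_DLL
    #define MY_API __declspec(dllexport)   // building the DLL: export
  #else
    #define MY_API __declspec(dllimport)   // using the DLL: import
  #endif
#else
  // GCC/Clang: keep the symbol visible even under -fvisibility=hidden.
  #define MY_API __attribute__((visibility("default")))
#endif

extern "C" {
// Definition of the exported global; it lives in exactly one source file.
MY_API int my_global = 0;

// An exported function that touches the global (hypothetical name).
MY_API int next_id() { return ++my_global; }
}
```

Within the library itself the variable behaves like any other global; the macro only controls whether other modules can link against it.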

In the case of Unix-like environments (like Linux), dynamic libraries, called "shared objects" and carrying the extension .so, export all extern global variables (and functions) by default. In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.

Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
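On the POSIX side, fetching an exported variable at run time might look like the sketch below. The type LoadedInt, the functions load_exported_int and unload, and the library path and symbol name in the usage note are hypothetical; only dlopen(), dlsym() and dlclose() are real API. The handle is kept open because the returned pointer refers to memory inside the loaded library and becomes dangling after dlclose(). On older glibc versions the program has to be linked with -ldl.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose (POSIX)

// Hypothetical helper type: the library handle plus a pointer to one
// exported int inside it. Both are null if loading or lookup failed.
struct LoadedInt {
    void* handle = nullptr;
    int*  value  = nullptr;
};

// Loads the shared object at `path` and looks up the data symbol `name`.
// The symbol is assumed to be an int; dlsym() cannot check that.
LoadedInt load_exported_int(const char* path, const char* name) {
    LoadedInt result;
    result.handle = dlopen(path, RTLD_NOW);
    if (result.handle == nullptr) {
        return result;                   // library missing or not loadable
    }
    result.value = static_cast<int*>(dlsym(result.handle, name));
    if (result.value == nullptr) {
        dlclose(result.handle);          // symbol missing: unload again
        result.handle = nullptr;
    }
    return result;
}

// Unloads the library; the pointer must not be used afterwards.
void unload(LoadedInt& lib) {
    if (lib.handle != nullptr) {
        dlclose(lib.handle);
    }
    lib = LoadedInt{};
}
```

A caller would write something like `LoadedInt g = load_exported_int("./libmylib.so", "my_global");` and, if `g.value` is non-null, read or write `*g.value` until calling `unload(g)`.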

And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).


Great answer, thank you! I have a follow up: Since the DLL is a self-contained piece of code & data, does it have a data segment section similar to executables? I'm trying to understand where and how this data is loaded (to) when the shared library is used.
@Raja Yes, the DLL has a data segment. In fact, in terms of the files themselves, executables and DLLs are virtually identical; the only real difference is a flag that is set in the executable to say that it contains a "main" function. When a process loads a DLL, its data segment is copied somewhere into the process' address space, and the static initialization code (which would initialize non-trivial global variables) is also run within the process' address space. The loading is the same as for the executable, except that the process address space is extended rather than created anew.
What about static variables defined inside an inline member function of a class? For example, define "class A{ void foo() { static int st_var = 0; } }" in a header file and include it in module A and module B: will A and B share the same st_var, or will each have its own copy?
@camino If the class is exported (i.e. defined with __attribute__((visibility("default")))), then A and B will share the same st_var. But if the class is defined with __attribute__((visibility("hidden"))), then module A and module B will each have their own copy, not shared.
@camino __declspec(dllexport)
Deckard 5 Pegasus

The answer by Mikael Persson, although very thorough, contains a severe error (or is at least misleading) regarding global variables, and that needs to be cleared up. The original question asked whether there are separate copies of the global variables or whether the global variables are shared between processes.

The true answer is the following: each process gets its own separate copy of the global variables, and they are not shared between processes. Stating that the One Definition Rule (ODR) applies is therefore also very misleading: the globals used by each process are NOT the same, so in reality there is no "One Definition" between processes.
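The per-process copy can be observed directly on a POSIX system. The sketch below is only an analogy: it uses fork() rather than two independent programs loading the same library, and g_value and parent_value_after_child_write are names made up for the illustration. The child overwrites its copy of the global, and the parent's copy is unaffected.

```cpp
#include <sys/types.h>
#include <sys/wait.h>   // waitpid
#include <unistd.h>     // fork, _exit

int g_value = 1;  // an ordinary global; each process gets its own copy

// Forks a child that overwrites its copy of g_value, waits for it, and
// returns the value the parent still sees. Returns -1 if fork() fails.
int parent_value_after_child_write() {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        g_value = 42;   // changes only the child's address space
        _exit(0);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return g_value;     // the parent's copy is untouched
}
```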

Also, even though global variables may not be "visible" to the process, they are always easily "accessible" to it, because any function could return the value of a global variable to the process, or, for that matter, the process could set the value of a global variable through a function call. Thus the answer is misleading on this point as well.

In reality, yes, processes do have full "access" to the globals, at the very least through function calls to the library. But to reiterate, each process has its own copy of the globals, so they won't be the same globals that another process is using.

Thus the entire discussion of exporting globals with extern is off topic, unnecessary, and unrelated to the original question, because the globals do not need extern to be accessed: they can always be accessed indirectly through function calls to the library.
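This indirect access through library functions could be sketched as follows. The names g_counter, get_counter and set_counter are hypothetical. The variable has internal linkage, so no other module can name it, yet any caller in the same process can still read and modify it through the two functions.

```cpp
// Library side (hypothetical): the global is private to this translation unit.
namespace {
int g_counter = 0;   // internal linkage: never exported from the module
}

// These two functions are what the library would actually export.
int  get_counter()          { return g_counter; }
void set_counter(int value) { g_counter = value; }
```

Every process that loads the library would run these functions against its own g_counter.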

The only part that is shared between processes, of course, is the actual code. The code is loaded in only one place in physical memory (RAM), but that same physical memory is mapped into the "local" virtual address space of each process.

In contrast, a static library's code is copied into each executable (ELF, PE, etc.) that links it, and, like a dynamic library, it has separate globals for each process.


Thank you! I was so confused what ODR and name visibility had to do with anything.
choppe

On Unix systems:

Note that the linker does not complain if two dynamic libraries export the same global variable, but at run time a segfault may occur, depending on how the variable is accessed. A typical symptom is a segmentation fault reported with error 15:

segfault at xxxxxx ip xxxxxx sp xxxxxxx error 15 in a.out