
What do 'statically linked' and 'dynamically linked' mean?

I often hear the terms 'statically linked' and 'dynamically linked', often in reference to code written in C, C++ or C#. What are they, what exactly are they talking about, and what are they linking?


paxdiablo

There are (in most cases, discounting interpreted code) two stages in getting from source code (what you write) to executable code (what you run).

The first is compilation which turns source code into object modules.

The second, linking, is what combines object modules together to form an executable.

The two stages are kept separate for several reasons. One is to allow third-party libraries to be included in your executable without you seeing their source code (such as libraries for database access, network communications and graphical user interfaces). Another is to let you compile code written in different languages (C and assembly code, for example) and then link it all together.

When you statically link a file into an executable, the contents of that file are included at link time. In other words, the contents of the file are physically inserted into the executable that you will run.

When you link dynamically, a pointer to the file being linked in (the file name of the file, for example) is included in the executable and the contents of said file are not included at link time. It's only when you later run the executable that these dynamically linked files are brought in, and they're only brought into the in-memory copy of the executable, not the one on disk.

It's basically a method of deferred linking. There's an even more deferred method (called late binding on some systems) that won't bring in the dynamically linked file until you actually try to call a function within it.
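As a rough illustration of bringing a library in only when it is needed, here is a minimal sketch that uses the POSIX dlopen/dlsym interface to load a library explicitly at the moment a function is called. This is one manual way to get the effect, not necessarily the mechanism the answer has in mind, and the helper name call_on_demand is made up for this example. It assumes a Linux-like system where the math library is available as "libm.so.6".

```c
#include <dlfcn.h>   /* dlopen, dlsym, dlclose, dlerror (POSIX) */
#include <math.h>    /* NAN */
#include <stdio.h>

/* Signature of the function we expect to find in the library. */
typedef double (*unary_math_fn)(double);

/*
 * Hypothetical helper: load `library` only when this function is called,
 * look up `symbol` in it, call it with `x`, then unload the library.
 * Returns NAN if the library or the symbol cannot be found.
 */
double call_on_demand(const char *library, const char *symbol, double x)
{
    void *handle = dlopen(library, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NAN;
    }

    /* Conversion idiom from the POSIX dlsym documentation: store the
     * returned void* through a void** view of the function pointer. */
    unary_math_fn fn;
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return NAN;
    }

    double result = fn(x);   /* call into the freshly loaded library */
    dlclose(handle);
    return result;
}
```

Calling call_on_demand("libm.so.6", "cos", 0.0) from main loads the math library at that point, not at program start. On older glibc versions you may need to add -ldl when linking.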

Statically-linked files are 'locked' to the executable at link time so they never change. A dynamically linked file referenced by an executable can change just by replacing the file on the disk.

This allows updates to functionality without having to re-link the code; the loader re-links every time you run it.

This is both good and bad. On one hand, it allows easier updates and bug fixes. On the other, programs can stop working if the updates are incompatible. This is sometimes responsible for the dreaded "DLL hell" that some people mention: applications can be broken if you replace a dynamically linked library with one that's not compatible (developers who do this should expect to be hunted down and punished severely, by the way).

As an example, let's look at the case of a user compiling their main.c file for static and dynamic linking.

Phase     Static                    Dynamic
--------  ----------------------    ------------------------
          +---------+               +---------+
          | main.c  |               | main.c  |
          +---------+               +---------+
Compile........|.........................|...................
          +---------+ +---------+   +---------+ +--------+
          | main.o  | | crtlib  |   | main.o  | | crtimp |
          +---------+ +---------+   +---------+ +--------+
Link...........|..........|..............|...........|.......
               |          |              +-----------+
               |          |              |
          +---------+     |         +---------+ +--------+
          |  main   |-----+         |  main   | | crtdll |
          +---------+               +---------+ +--------+
Load/Run.......|.........................|..........|........
          +---------+               +---------+     |
          | main in |               | main in |-----+
          | memory  |               | memory  |
          +---------+               +---------+

You can see in the static case that the main program and C runtime library are linked together at link time (by the developers). Since the user typically cannot re-link the executable, they're stuck with the behaviour of the library.

In the dynamic case, the main program is linked with the C runtime import library (something which declares what's in the dynamic library but doesn't actually define it). This allows the linker to link even though the actual code is missing.

Then, at runtime, the operating system loader does a late linking of the main program with the C runtime DLL (dynamic link library or shared library or other nomenclature).

The owner of the C runtime can drop in a new DLL at any time to provide updates or bug fixes. As stated earlier, this has both advantages and disadvantages.


Please correct me if I'm wrong, but on Windows, software tends to include its own libraries with the install, even if they're dynamically linked. On many Linux systems with a package manager, many dynamically linked libraries ("shared objects") are actually shared between software.
@PaulF: things like the Windows common controls, DirectX, .NET and so on ship a lot with the applications whereas on Linux, you tend to use apt or yum or something like that to manage dependencies - so you're right in that sense. Win Apps that ship their own code as DLLs tend not to share them.
There's a special place reserved in the ninth circle of hell for those that update their DLLs and break backward compatibility. Yes, if interfaces disappear or are modified, then the dynamic linking will fall in a heap. That's why it shouldn't be done. By all means add a function2() to your DLL but don't change function() if people are using it. Best way to handle that is to recode function() in such a way that it calls function2(), but don't change the signature of function().
@Paul Fisher, I know this is late but... the library that ships with a Windows DLL isn't the full library, it's just a bunch of stubs that tell the linker what the DLL contains. The linker can then automatically put the information into the .exe for loading the DLL, and the symbols don't show up as undefined.
@Santropedro, you're correct on all counts re the meaning of the lib, import and DLL names. The suffix is convention only so don't read too much into that (for example, the DLL may have a .dll or .so extension) - think of the answer as explaining the concepts rather than being an exact description. And, as per the text, this is an example showing static and dynamic linking for just the C runtime files so, yes, that's what crt indicates in all of them.
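
The advice in the comments above about adding function2() while leaving function() alone can be sketched in C. The names come from that comment; the parameters and the scaling behaviour are invented purely for illustration.

```c
/* Newer entry point added in a later release of the library.
 * Hypothetical behaviour: scale `value` by `factor`. */
int function2(int value, int factor)
{
    return value * factor;
}

/* Original entry point. Its signature and observable behaviour (here,
 * assumed to be "double the value") stay the same, so programs linked
 * against the old library keep working. It now just forwards. */
int function(int value)
{
    return function2(value, 2);
}
```

Existing callers of function() see no difference, while new code can call function2() directly.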
Peter Mortensen

I think a good answer to this question ought to explain what linking is.

When you compile some C code (for instance), it is translated to machine language. Just a sequence of bytes which, when run, causes the processor to add, subtract, compare, "goto", read memory, write memory, that sort of thing. This stuff is stored in object (.o) files.

Now, a long time ago, computer scientists invented this "subroutine" thing. Execute-this-chunk-of-code-and-return-here. It wasn't too long before they realised that the most useful subroutines could be stored in a special place and used by any program that needed them.

Now in the early days programmers would have to punch in the memory address that these subroutines were located at. Something like CALL 0x5A62. This was tedious and problematic should those memory addresses ever need to be changed.

So, the process was automated. You write a program that calls printf(), and the compiler doesn't know the memory address of printf. So the compiler just writes CALL 0x0000, and adds a note to the object file saying "must replace this 0x0000 with the memory location of printf".

Static linkage means that the linker program (the GNU one is called ld) adds printf's machine code directly to your executable file, and changes the 0x0000 to the address of printf. This happens when your executable is created.

Dynamic linkage means that the above step doesn't happen. The executable file still has a note that says "must replace 0x0000 with the memory location of printf". The operating system's loader needs to find the printf code, load it into memory, and correct the CALL address, each time the program is run.
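
The placeholder-and-note idea described above can be mimicked with a toy model. This is only a sketch: the structures, the 16-bit addresses and the function name apply_relocations are invented for illustration, and real object files carry much richer relocation records.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A "note" left by the compiler: patch 2 bytes at `offset` with the
 * address of `symbol`. */
struct relocation {
    size_t      offset;   /* where in the code the placeholder sits */
    const char *symbol;   /* which name the placeholder stands for  */
};

/* One entry in the table of addresses the linker or loader knows. */
struct symbol_address {
    const char *name;
    uint16_t    address;
};

/*
 * Replace every placeholder listed in `relocs` with the address found in
 * `symbols`. Returns the number of notes that could not be resolved
 * (the toy equivalent of an "undefined symbol" error).
 */
int apply_relocations(uint8_t *code, size_t code_len,
                      const struct relocation *relocs, size_t n_relocs,
                      const struct symbol_address *symbols, size_t n_symbols)
{
    int unresolved = 0;
    for (size_t i = 0; i < n_relocs; i++) {
        int found = 0;
        if (relocs[i].offset + 2 > code_len) {   /* note points outside the code */
            unresolved++;
            continue;
        }
        for (size_t j = 0; j < n_symbols; j++) {
            if (strcmp(relocs[i].symbol, symbols[j].name) == 0) {
                /* Store the address high byte first, so 0x5A62 reads 5A 62. */
                code[relocs[i].offset]     = (uint8_t)(symbols[j].address >> 8);
                code[relocs[i].offset + 1] = (uint8_t)(symbols[j].address & 0xFF);
                found = 1;
                break;
            }
        }
        if (!found)
            unresolved++;
    }
    return unresolved;
}
```

In this model a static linker would run apply_relocations once, when the executable is created, and a dynamic loader would run the equivalent step each time the program starts.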

It's common for programs to call some functions which will be statically linked and other functions which are dynamically linked (standard library functions like printf can be linked either way, depending on the toolchain and build options). The static ones "become part" of the executable and the dynamic ones "join in" when the executable is run.

There are advantages and disadvantages to both methods, and there are differences between operating systems. But since you didn't ask, I'll end this here.


I did too, however I only get to choose 1 answer.
Artelius, I am looking for a more in-depth version of your explanation of how these crazy low-level things work. Please reply with the books we should read to get in-depth knowledge of the above. Thank you.
Sorry, I can't suggest any books. You should learn assembly language first. Then Wikipedia can give a decent overview of such topics. You may want to look at the GNU ld documentation.
John D. Cook

Statically linked libraries are linked in at build time, when the executable is created. Dynamically linked libraries are loaded at run time. Static linking bakes the library's bits into your executable. Dynamic linking only bakes in a reference to the library; the bits for the dynamic library exist elsewhere and could be swapped out later.


Anonymous

None of the above posts actually shows how to statically link something and check that you did it correctly, so I will address that here:

A simple C program

#include <stdio.h>

int main(void)
{
    printf("This is a string\n");
    return 0;
}

Dynamically link the C program

gcc simpleprog.c -o simpleprog

And run file on the binary:

file simpleprog 

That will show that it is dynamically linked, with output along the lines of:

"simpleprog: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.26, BuildID[sha1]=0xf715572611a8b04f686809d90d1c0d75c6028f0f, not stripped"

Instead let us statically link the program this time:

gcc simpleprog.c -static -o simpleprog

Running file on this statically linked binary will show:

file simpleprog 

"simpleprog: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.26, BuildID[sha1]=0x8c0b12250801c5a7c7434647b7dc65a644d6132b, not stripped"

And you can see it is happily statically linked. Sadly, however, not all libraries are simple to statically link this way, and some may require extra effort using libtool or linking the object code and C libraries by hand.

Luckily many embedded C libraries like musl offer static linking options for nearly all if not all of their libraries.

Now strace the binary you have created and you can see that there are no libraries accessed before the program begins:

strace ./simpleprog

Now compare with the output of strace on the dynamically linked program and you will see that the statically linked version's strace is much shorter!
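
Besides file and strace, you can look at one of the things that differs between the two binaries yourself. A conventional dynamically linked ELF executable names its dynamic loader in a PT_INTERP program header, which a plain static executable lacks. The following is a rough sketch, assuming Linux's <elf.h>, a 64-bit ELF file, and a file whose byte order matches the machine running the check. The function name is made up, and other kinds of binaries (static PIE, for example) may not fit this simple test.

```c
#include <elf.h>     /* Elf64_Ehdr, Elf64_Phdr, PT_INTERP (Linux) */
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the 64-bit ELF file at `path` has a PT_INTERP program
 * header (it asks for a dynamic loader), 0 if it has none, and -1 if the
 * file cannot be read or is not a 64-bit ELF file.
 */
int elf_requests_interpreter(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    Elf64_Ehdr header;
    int result = -1;

    if (fread(&header, sizeof header, 1, fp) == 1 &&
        memcmp(header.e_ident, ELFMAG, SELFMAG) == 0 &&
        header.e_ident[EI_CLASS] == ELFCLASS64 &&
        header.e_phentsize == sizeof(Elf64_Phdr)) {

        result = 0;
        /* Walk the program header table looking for PT_INTERP. */
        for (unsigned i = 0; i < header.e_phnum; i++) {
            Elf64_Phdr segment;
            long pos = (long)header.e_phoff + (long)i * (long)sizeof segment;
            if (fseek(fp, pos, SEEK_SET) != 0 ||
                fread(&segment, sizeof segment, 1, fp) != 1) {
                result = -1;   /* truncated or unreadable table */
                break;
            }
            if (segment.p_type == PT_INTERP) {
                result = 1;
                break;
            }
        }
    }

    fclose(fp);
    return result;
}
```

Calling elf_requests_interpreter("./simpleprog") from main on each of the two builds above should give different answers for the default build and the -static build on a typical Linux setup.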


artificialidiot

(I don't know C# but it is interesting to have a static linking concept for a VM language)

Dynamic linking means your program holds only a reference to some required functionality and must know how to find it. Your language runtime or OS searches for a piece of code matching the reference on the filesystem, on the network, or in a compiled-code cache, and then takes several steps, such as relocation, to integrate it into your program's in-memory image. All of this happens at runtime. It can be done either manually or by the compiler. You gain the ability to update, at the risk of messing things up (namely, DLL hell).

Static linking is done at compile time: you tell the compiler where all the functional parts are and instruct it to integrate them. There is no searching, no ambiguity, and no ability to update without a recompile. All your dependencies are physically one with your program image.