Shared objects: sonames, real names, and link names
A client had a complaint about an internal library (libbar.so) they were consuming as a dependency of their own product (libfoo.so). Very frequently they would need to relink their library because symbols from bar would be missing at runtime. This happened every time a new version of bar was released and any time their cache was relocated. The actual reason for this was that the soname was not being set for libbar.so when it was built. But what does that actually mean?
I wrote a few notes on this topic because I thought it was interesting and I hope this will be useful for anyone wanting a TL;DR on sonames, real names, and link names.
ELF headers
A good place to start this discussion is with shared libraries. Object files, shared libraries, and programs are all variants of files that conform to the Executable and Linkable Format (ELF). These are also called ELF-compliant files or just ELF files. Here we’re interested in the header structure, so as a simplification we’ll only say for now that ELF files contain a header whose fields provide various pieces of metadata. For example, the ‘Type’ field in the ELF header identifies the file as an object, shared library, or program.
For files that are shared libraries or programs, the ELF header also locates a program header table. The entries of this table provide the runtime linker/loader with the parameters necessary for loading the file into a process.
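To make the ‘Type’ field concrete, here is a minimal sketch in C that reads it straight from the first bytes of a file. The helper names elf_file_type and elf_type_name are made up for this post; the offsets (16 identification bytes followed by a 2-byte type) and the values 1, 2, and 3 come from the ELF specification.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* e_type values defined by the ELF specification. */
enum { ELF_ET_REL = 1, ELF_ET_EXEC = 2, ELF_ET_DYN = 3 };

/*
 * Hypothetical helper: returns the e_type field of the ELF header in `path`,
 * or -1 if the file cannot be read or does not start with the ELF magic.
 */
int elf_file_type(const char *path)
{
    unsigned char hdr[18]; /* e_ident (16 bytes) followed by e_type (2 bytes) */
    FILE *fp;
    size_t got;

    assert(path != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    got = fread(hdr, 1, sizeof hdr, fp);
    fclose(fp);

    /* The magic is 0x7f 'E' 'L' 'F'; the literal is split so that the
     * hex escape does not swallow the 'E'. */
    if (got != sizeof hdr || memcmp(hdr, "\x7f" "ELF", 4) != 0)
        return -1;

    /* e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian. */
    if (hdr[5] == 1)
        return hdr[16] | (hdr[17] << 8);
    if (hdr[5] == 2)
        return (hdr[16] << 8) | hdr[17];
    return -1;
}

/* Human-readable label for the three types discussed in the text. */
const char *elf_type_name(int type)
{
    switch (type) {
    case ELF_ET_REL:  return "REL (relocatable object file)";
    case ELF_ET_EXEC: return "EXEC (executable file)";
    case ELF_ET_DYN:  return "DYN (shared object file)";
    default:          return "other or not an ELF file";
    }
}
```

Calling elf_type_name(elf_file_type(path)) on a shared library should give the DYN label. Note that position-independent executables also report DYN, so the type alone does not prove that a file is a library.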
As an example, here’s the ELF header from /lib/libpcap.so.8 on FreeBSD 13:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 09 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: FreeBSD
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices x86-64
Version: 0x1
Entry point address: 0
Start of program headers: 64 (bytes into file)
Start of section headers: 353816 (bytes into file)
Flags: 0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 10
Size of section headers: 64 (bytes)
Number of section headers: 30
Section header string table index: 29
libpcap is a shared library so it has a DYN type and, importantly for this problem, it has a dynamic section:
Dynamic section at offset 0x559b0 contains 26 entries:
Tag Type Name/Value
0x0000000000000001 NEEDED Shared library: [libibverbs.so.1]
0x0000000000000001 NEEDED Shared library: [libmlx5.so.1]
0x0000000000000001 NEEDED Shared library: [libc.so.7]
0x000000000000000e SONAME Library soname: [libpcap.so.8]
Each NEEDED entry names a shared object that this object depends on. Meanwhile the SONAME entry specifies the logical name of the object itself.
shared library names
Shared libraries essentially have three names:
- soname (logical name)
- real name
- link name
These are all just different names for referring to the same object. Ultimately, the actual object lives at the real name, while the soname and link name provide convenient references to it. The point of the soname is to give the object a logical name, so that dependents can be dynamically linked to it without recording the exact file it currently lives in.
soname
The soname follows this naming scheme: lib<library name>.so.<version number>.
For example, our soname might be libbar.so.1. The fully qualified soname adds the directory that it lives in as a prefix (eg /usr/lib/libbar.so.1). On a working system that fully qualified soname is a symbolic link to the real name. Typically a utility like ldconfig is responsible for maintaining the set of paths that the runtime linker searches. It also examines the libraries it finds and establishes the sonames as symbolic links to the real names.
real name
The real name is the path of the file that contains the library code. Its base filename consists of the soname plus .<minor number>.<release number> (although the .<release number> is optional). If our soname is libbar.so.1, the real name might be something like /path/to/libbar.so.1.1234.
link name
The link name is the soname without any version numbering. So, in this case libbar.so would be that name. This is the name the linker looks for when the library is requested at build time (eg with -lbar). In most cases this is a symlink to the latest soname (eg libbar.so.1) or else to the real name if the soname is not set. These symlinks are not actually set up by the linker though; generally they are set up during library installation.
example libcc1.so
We can see the above in action if we take a look at something like the gcc cc1 library.
$ objdump -p /usr/lib/gcc/x86_64-linux-gnu/10/libcc1.so
/usr/lib/gcc/x86_64-linux-gnu/10/libcc1.so: file format elf64-x86-64
Program Header:
LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**12
filesz 0x00000000000045e0 memsz 0x00000000000045e0 flags r--
LOAD off 0x0000000000005000 vaddr 0x0000000000005000 paddr 0x0000000000005000 align 2**12
filesz 0x00000000000152a1 memsz 0x00000000000152a1 flags r-x
LOAD off 0x000000000001b000 vaddr 0x000000000001b000 paddr 0x000000000001b000 align 2**12
filesz 0x00000000000055fd memsz 0x00000000000055fd flags r--
LOAD off 0x00000000000206d0 vaddr 0x00000000000216d0 paddr 0x00000000000216d0 align 2**12
filesz 0x0000000000000f10 memsz 0x00000000000010b8 flags rw-
DYNAMIC off 0x0000000000020d68 vaddr 0x0000000000021d68 paddr 0x0000000000021d68 align 2**3
filesz 0x0000000000000200 memsz 0x0000000000000200 flags rw-
NOTE off 0x0000000000000238 vaddr 0x0000000000000238 paddr 0x0000000000000238 align 2**2
filesz 0x0000000000000024 memsz 0x0000000000000024 flags r--
EH_FRAME off 0x000000000001c460 vaddr 0x000000000001c460 paddr 0x000000000001c460 align 2**2
filesz 0x0000000000000964 memsz 0x0000000000000964 flags r--
STACK off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
RELRO off 0x00000000000206d0 vaddr 0x00000000000216d0 paddr 0x00000000000216d0 align 2**0
filesz 0x0000000000000930 memsz 0x0000000000000930 flags r--
Dynamic Section:
NEEDED libstdc++.so.6
NEEDED libc.so.6
NEEDED libgcc_s.so.1
SONAME libcc1.so.0
INIT 0x0000000000005000
FINI 0x000000000001a298
INIT_ARRAY 0x00000000000216d0
INIT_ARRAYSZ 0x0000000000000008
FINI_ARRAY 0x00000000000216d8
FINI_ARRAYSZ 0x0000000000000008
GNU_HASH 0x0000000000000260
STRTAB 0x0000000000001158
SYMTAB 0x0000000000000438
STRSZ 0x0000000000000d57
SYMENT 0x0000000000000018
PLTGOT 0x0000000000022000
PLTRELSZ 0x0000000000000810
PLTREL 0x0000000000000007
JMPREL 0x0000000000003dd0
RELA 0x00000000000020a8
RELASZ 0x0000000000001d28
RELAENT 0x0000000000000018
VERNEED 0x0000000000001fc8
VERNEEDNUM 0x0000000000000003
VERSYM 0x0000000000001eb0
RELACOUNT 0x000000000000011a
Version References:
required from libgcc_s.so.1:
0x0b792650 0x00 12 GCC_3.0
required from libc.so.6:
0x06969194 0x00 07 GLIBC_2.14
0x0d696913 0x00 04 GLIBC_2.3
0x09691a75 0x00 03 GLIBC_2.2.5
required from libstdc++.so.6:
0x0297f870 0x00 11 GLIBCXX_3.4.20
0x0bafd178 0x00 10 CXXABI_1.3.8
0x02297f89 0x00 09 GLIBCXX_3.4.9
0x0bafd179 0x00 08 CXXABI_1.3.9
0x056bafd3 0x00 06 CXXABI_1.3
0x0297f871 0x00 05 GLIBCXX_3.4.21
0x08922974 0x00 02 GLIBCXX_3.4
’libcc1.so’ is the link name which is a symlink to the soname in the header (’libcc1.so.0’):
$ ls -la /usr/lib/gcc/x86_64-linux-gnu/10/libcc1.so
lrwxrwxrwx 1 root root 37 Jan 10 2021 /usr/lib/gcc/x86_64-linux-gnu/10/libcc1.so -> ../../../x86_64-linux-gnu/libcc1.so.0
and this is a symlink to the real name:
$ ls -la /usr/lib/x86_64-linux-gnu/libcc1.so.0
lrwxrwxrwx 1 root root 15 Jan 10 2021 /usr/lib/x86_64-linux-gnu/libcc1.so.0 -> libcc1.so.0.0.0
Example
In the client’s case, where the soname was missing, we can examine the dynamic section of the dependent library libfoo.so:
$ objdump -p libfoo.so | grep NEEDED
NEEDED /root/.conan/data/foo/some/path/lib/libbar.so.1.1234
We can see that this is actually a real name with a full path as opposed to a soname (compare this to the NEEDED entries of libcc1.so). If this path changes then foo has to be relinked against bar, which is exactly the problem the soname is meant to solve: the NEEDED entry should be the soname of bar, which we’d expect to be a symlink to the real name.
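The difference between the two kinds of NEEDED entry can be expressed as a one-line check. This is a sketch assuming the behaviour documented for ld.so on Linux, where a dependency string containing a slash is treated as a pathname and loaded from exactly that location instead of being searched for; needed_is_pathname is a made-up name for this post.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a NEEDED string would be treated as a
 * pathname (it contains a slash), or 0 if it is a bare name such as a soname
 * that is looked up along the library search path.
 */
int needed_is_pathname(const char *needed)
{
    assert(needed != NULL);
    return strchr(needed, '/') != NULL;
}
```

The entry recorded in libfoo.so above falls into the first category, which is why moving the cache breaks it, while the entries of libcc1.so all fall into the second.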
In order to use a soname for bar, the SONAME entry in the dynamic section of bar needs to be set. The linker (as a link editor) will copy that value into a NEEDED entry of foo at link time. In this case we’re using ld. Something to note is that there is sometimes confusion about which tool sets this: in many gcc projects the -Wl,-soname,libbar.so option is used to set the soname, leading to the belief that gcc is setting those fields. Actually -Wl only passes arguments through to the linker. In this case that gcc option provides -soname libbar.so to ld.
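The gcc documentation describes -Wl as splitting its argument at the commas and handing each piece to the linker. The following is a minimal sketch of that splitting, not gcc’s actual implementation; split_wl_option is a made-up name for this post.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: splits a gcc option of the form "-Wl,a,b,c" into the
 * separate arguments the linker would receive. The pieces are stored in
 * `buf` and pointed to by `args`. Returns the number of arguments, or -1 if
 * the option does not start with "-Wl," or does not fit.
 */
int split_wl_option(const char *option, char *buf, size_t buf_size,
                    char *args[], int max_args)
{
    int count = 0;
    char *p;
    size_t len;

    assert(option != NULL && buf != NULL && args != NULL);

    if (strncmp(option, "-Wl,", 4) != 0)
        return -1;

    /* Copy everything after "-Wl," so it can be cut up in place. */
    len = strlen(option + 4);
    if (len + 1 > buf_size)
        return -1;
    memcpy(buf, option + 4, len + 1);

    p = buf;
    while (count < max_args) {
        char *comma = strchr(p, ',');

        args[count++] = p;
        if (comma == NULL)
            return count;   /* last piece reached */
        *comma = '\0';      /* terminate this piece */
        p = comma + 1;
    }
    return -1;              /* more pieces than max_args */
}
```

For -Wl,-soname,libbar.so.1 this yields the two arguments -soname and libbar.so.1, which is what ld ends up seeing on its own command line.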
In this case, the bar-team was setting
LD = $(CC) -shared
but LDFLAGS_all and LDFLAGS were not set anywhere in the makefiles themselves. In short, the bar-team was not providing a soname when creating the library. Modifying the LD line was all that was needed:
LD = $(CC) -shared -Wl,-soname,$(ARTIFACT)
where $(ARTIFACT) expands to libbar.so.1. Rebuilding sets the soname in the dynamic section as required. Now we have a fully qualified soname, /usr/lib/libbar.so.1, which is symlinked by ldconfig to the real name. The link name is /usr/lib/libbar.so (often referenced on the link line as something like ‘-lbar’ if it’s on the library search path), which symlinks to the soname /usr/lib/libbar.so.1.
We can demonstrate that operation with a dummy library:
- bar.c
int test(void)
{
    return 1;
}
- bar.h
#ifndef bar_h__
#define bar_h__
extern int test(void);
#endif
Building with the following command will set the soname field:
$ gcc -shared -fPIC -Wl,-soname,libbar.so.1 -o libbar.so.1.0.0 bar.c
$ readelf -a libbar.so.1.0.0 | grep SONAME
0x000000000000000e (SONAME) Library soname: [libbar.so.1]
More references
- https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html#index-Wl
- https://man.freebsd.org/cgi/man.cgi?ld(1)
- https://refspecs.linuxfoundation.org/elf/TIS1.1.pdf