I am studying the Linux kernel and found that on the x86_64 architecture the int 0x80 interrupt doesn't work for making system calls [1].
For the i386 architecture (32-bit x86 user-space), which is preferable, syscall or int 0x80, and why?
I use Linux kernel version 3.4.
Footnote 1: int 0x80 does work in some cases in 64-bit code, but is never recommended. See "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?"
int 0x80? Can you specify some files?
int 0x80 works on the x86-64 kernel directly for backwards compatibility. And the Intel manual says that syscall is invalid in 32-bit mode.
syscall is the default way of entering kernel mode on x86-64. This instruction is not available in 32-bit modes of operation on Intel processors.
sysenter is the instruction most frequently used to invoke system calls in 32-bit modes of operation. It is similar to syscall but a bit more difficult to use; that, however, is the kernel's concern.
int 0x80 is a legacy way to invoke a system call and should be avoided.
The preferred way to invoke a system call is to use the vDSO, a region of memory mapped into each process's address space that allows system calls to be made more efficiently (for example, by not entering kernel mode at all in some cases). The vDSO also takes care of handling the syscall or sysenter instructions, which is more difficult than the legacy int 0x80 way.
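As a minimal sketch of the mapping described above, the following C helpers ask the auxiliary vector where the kernel placed the vDSO in the current process. It assumes Linux with a libc that provides getauxval (glibc 2.16 or later); the function names vdso_base and vdso_looks_like_elf are made up for this illustration.

```c
#include <elf.h>        /* AT_SYSINFO_EHDR */
#include <stddef.h>     /* NULL */
#include <sys/auxv.h>   /* getauxval (glibc >= 2.16) */

/* Address at which the kernel mapped the vDSO into this process,
 * or 0 if the auxiliary vector has no AT_SYSINFO_EHDR entry. */
unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* 1 if the mapping starts with the ELF magic bytes, 0 otherwise.
 * The vDSO is an ordinary ELF shared object, so a present mapping
 * is expected to pass this check. */
int vdso_looks_like_elf(void)
{
    const unsigned char *p = (const unsigned char *)vdso_base();
    if (p == NULL)
        return 0;
    return p[0] == 0x7f && p[1] == 'E' && p[2] == 'L' && p[3] == 'F';
}
```

Normal programs never need to do this by hand: the dynamic loader finds the vDSO the same way and libc routes the relevant calls through it.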
My answer here covers your question.
In practice, recent kernels implement a VDSO, notably to dynamically optimize system calls (the kernel sets the VDSO to the code best suited to the current processor). So you should use the VDSO, and for existing syscalls you had better use the interface provided by the libc.
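To illustrate what "the interface provided by the libc" means in practice, here is a small sketch that fetches the process ID twice: once through the ordinary getpid() wrapper and once through the generic syscall() function with an explicit call number. Neither version names int 0x80, sysenter or syscall; libc picks the entry mechanism. The helper names pid_via_wrapper and pid_via_syscall are invented for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>  /* SYS_getpid */
#include <sys/types.h>    /* pid_t */
#include <unistd.h>       /* getpid, syscall */

/* The usual route: the libc wrapper decides how to reach the kernel. */
pid_t pid_via_wrapper(void)
{
    return getpid();
}

/* The generic route: pass the call number explicitly. libc still
 * chooses the instruction used to enter the kernel. */
pid_t pid_via_syscall(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

Both functions should return the same value in the same process; the second form is mainly useful for system calls that have no dedicated wrapper.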
Notice that, AFAIK, a significant part of the cost of simple syscalls is going from user-space to the kernel and back. Hence, for some syscalls (probably gettimeofday, getpid ...) the VDSO might avoid even that (and technically might avoid doing a real syscall). For most syscalls (like open, read, send, mmap ...) the kernel cost of the syscall is large enough to make any improvement of the user-space to kernel-space transition (e.g. using the SYSENTER or SYSCALL machine instructions instead of INT) insignificant.
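One rough way to check this claim on your own machine is to time the same call through both routes. The sketch below measures clock_gettime via the libc wrapper (which may be served from the VDSO) and via syscall(), which forces a real kernel entry. The function avg_call_ns is hypothetical, the numbers depend heavily on CPU and kernel, and the code assumes that struct timespec matches what SYS_clock_gettime expects (true on x86-64 and on i386 with the default 32-bit time_t).

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/syscall.h>  /* SYS_clock_gettime */
#include <time.h>         /* clock_gettime, CLOCK_MONOTONIC */
#include <unistd.h>       /* syscall */

static uint64_t to_ns(const struct timespec *ts)
{
    return (uint64_t)ts->tv_sec * 1000000000ull + (uint64_t)ts->tv_nsec;
}

/* Average wall-clock cost in nanoseconds of one clock_gettime call.
 * forced == 0: go through the libc wrapper (possibly the VDSO).
 * forced != 0: go through syscall(), i.e. always enter the kernel.
 * Returns -1.0 if iters <= 0 or any call fails. */
double avg_call_ns(int forced, int iters)
{
    struct timespec start, end, scratch;

    if (iters <= 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (int i = 0; i < iters; i++) {
        long rc = forced
            ? syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &scratch)
            : clock_gettime(CLOCK_MONOTONIC, &scratch);
        if (rc != 0)
            return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return (double)(to_ns(&end) - to_ns(&start)) / iters;
}
```

Calling avg_call_ns(0, 1000000) and avg_call_ns(1, 1000000) and comparing the two results gives a feel for the transition cost; repeating the experiment with a heavier call such as read would show how little that transition matters there.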
Beware of this before changing: system call numbers differ between int 0x80 and syscall, e.g. sys_write is 4 with int 0x80 and 1 with syscall.
For 32 bits (int 0x80): http://docs.cs.up.ac.za/programming/asm/derick_tut/syscalls.html
For syscall: http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64
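The difference in numbering can be seen without any assembly, because the SYS_* constants in <sys/syscall.h> are chosen for the ABI you compile for. A small sketch (the helper names write_syscall_number and abi_name are made up here):

```c
#include <sys/syscall.h>  /* SYS_write */

/* Number of the write system call in the ABI this file was built for. */
long write_syscall_number(void)
{
    return SYS_write;
}

/* Human-readable label for the ABI selected at compile time. */
const char *abi_name(void)
{
#if defined(__x86_64__) && !defined(__ILP32__)
    return "x86-64 (syscall instruction)";
#elif defined(__i386__)
    return "i386 (int 0x80 / sysenter)";
#else
    return "other";
#endif
}
```

Built with gcc -m64 on x86, write_syscall_number() returns 1; built with gcc -m32 it returns 4, matching the two tables linked above.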
syscall in 32-bit mode where it invokes the 32-bit ABI (only possible on AMD CPUs, and behaves differently from the 64-bit-userspace version). But yes, the 64-bit syscall ABI is different in call numbers and in calling convention.
int 0x80 is a better term to indicate that it's a system call to the kernel, telling it to do something.
The meaning and interpretation are interchangeable: 'make a syscall' or 'issue int 80h'.
It is no different from the days of DOS:
invoke int 21h to get DOS to do something depending on the AX register and optionally the ES:DX register pair,
int 13h is the BIOS hard disk handler,
int 10h is the EGA/VGA screen,
int 09h is the keyboard handler.
The common theme here is this: when an interrupt/syscall is invoked, the kernel checks the state of the registers to see what type of system call is required. By looking at, for example, the eax register, it determines what to perform, switches into kernel space, carries out the procedure, and switches back to user-space, optionally returning the result of the call, i.e. whether it succeeded or failed.
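The "look at eax, then decide what to do" step can be pictured with a toy user-space model. This is not kernel code: struct toy_regs, toy_dispatch and the two handlers are hypothetical names, and the handlers only pretend to do work. The numbers 1 (exit) and 4 (write) are the real i386 call numbers, and 38 is the Linux value of ENOSYS.

```c
#include <stddef.h>

/* Hypothetical snapshot of the registers at the moment of the call. */
struct toy_regs {
    long eax;            /* system call number */
    long ebx, ecx, edx;  /* first three arguments (i386 convention) */
};

typedef long (*toy_handler)(const struct toy_regs *);

/* Stand-in handlers: they do no real work. */
static long toy_exit(const struct toy_regs *r)  { return r->ebx; }
static long toy_write(const struct toy_regs *r) { return r->edx; } /* pretend every byte was written */

/* Table indexed by call number; unlisted slots stay NULL. */
static const toy_handler toy_table[] = {
    [1] = toy_exit,
    [4] = toy_write,
};

#define TOY_ENOSYS 38

/* Pick a handler from the number in eax, as a kernel entry path would,
 * and return its result (or a negative error code for unknown numbers). */
long toy_dispatch(const struct toy_regs *r)
{
    size_t n = sizeof toy_table / sizeof toy_table[0];

    if (r->eax < 0 || (size_t)r->eax >= n || toy_table[r->eax] == NULL)
        return -TOY_ENOSYS;
    return toy_table[r->eax](r);
}
```

The returned value plays the role of what user-space finds in eax afterwards: a result on success or a negative error code on failure.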
syscall instruction. In this respect, int 0x80 is only used for a system call in a 32-bit Linux environment.
int 0x80 which is not true. The preferable way to make a syscall is to use VDSO.
syscall ABI directly, except for system calls like clock_gettime() and getpid() where a user-space implementation is exported in the VDSO.