New RS and new signal handling for system processes.
UPDATING INFO: 20100317:
/usr/src/etc/system.conf updated to ignore default kernel calls: copy
it (or merge it) to /etc/system.conf.
The hello driver (/dev/hello) added to the distribution:
# cd /usr/src/commands/scripts && make clean install
# cd /dev && MAKEDEV hello
KERNEL CHANGES:
- Generic signal handling support. The kernel no longer assumes PM as a signal
manager for every process. The signal manager of a given process can now be
specified in its privilege slot. When a signal has to be delivered, the kernel
performs the lookup and forwards the signal to the appropriate signal manager.
PM is the default signal manager for user processes, RS is the default signal
manager for system processes. To enable ptrace()ing for system processes, it
is sufficient to change the default signal manager to PM. This will temporarily
disable crash recovery, though.
- sys_exit() is now split into sys_exit() (i.e. exit() for system processes,
which generates a self-termination signal), and sys_clear() (i.e. used by PM
to ask the kernel to clear a process slot when a process exits).
- Added a new kernel call (i.e. sys_update()) to swap two process slots and
implement live update.
PM CHANGES:
- Posix signal handling is no longer allowed for system processes. System
signals are split into two fixed categories: termination and non-termination
signals. When a non-termination signaled is processed, PM transforms the signal
into an IPC message and delivers the message to the system process. When a
termination signal is processed, PM terminates the process.
- PM no longer assumes itself as the signal manager for system processes. It now
makes sure that every system signal goes through the kernel before being
actually processes. The kernel will then dispatch the signal to the appropriate
signal manager which may or may not be PM.
SYSLIB CHANGES:
- Simplified SEF init and LU callbacks.
- Added additional predefined SEF callbacks to debug crash recovery and
live update.
- Fixed a temporary ack in the SEF init protocol. SEF init reply is now
completely synchronous.
- Added SEF signal event type to provide a uniform interface for system
processes to deal with signals. A sef_cb_signal_handler() callback is
available for system processes to handle every received signal. A
sef_cb_signal_manager() callback is used by signal managers to process
system signals on behalf of the kernel.
- Fixed a few bugs with memory mapping and DS.
VM CHANGES:
- Page faults and memory requests coming from the kernel are now implemented
using signals.
- Added a new VM call to swap two process slots and implement live update.
- The call is used by RS at update time and in turn invokes the kernel call
sys_update().
RS CHANGES:
- RS has been reworked with a better functional decomposition.
- Better kernel call masks. com.h now defines the set of very basic kernel calls
every system service is allowed to use. This makes system.conf simpler and
easier to maintain. In addition, this guarantees a higher level of isolation
for system libraries that use one or more kernel calls internally (e.g. printf).
- RS is the default signal manager for system processes. By default, RS
intercepts every signal delivered to every system process. This makes crash
recovery possible before bringing PM and friends in the loop.
- RS now supports fast rollback when something goes wrong while initializing
the new version during a live update.
- Live update is now implemented by keeping the two versions side-by-side and
swapping the process slots when the old version is ready to update.
- Crash recovery is now implemented by keeping the two versions side-by-side
and cleaning up the old version only when the recovery process is complete.
DS CHANGES:
- Fixed a bug when the process doing ds_publish() or ds_delete() is not known
by DS.
- Fixed the completely broken support for strings. String publishing is now
implemented in the system library and simply wraps publishing of memory ranges.
Ideally, we should adopt a similar approach for other data types as well.
- Test suite fixed.
DRIVER CHANGES:
- The hello driver has been added to the Minix distribution to demonstrate basic
live update and crash recovery functionalities.
- Other drivers have been adapted to conform the new SEF interface.
Thomas Veerman [Fri, 12 Mar 2010 15:58:41 +0000 (15:58 +0000)]
- Add support for the ucontext system calls (getcontext, setcontext,
swapcontext, and makecontext).
- Fix VM to not erroneously think the stack segment and data segment have
collided when a user-space thread invokes brk().
- Add test51 to test ucontext functionality.
- Add man pages for ucontext system calls.
Add a set of declarations to math.h. Since we don't actually have
implementations for these functions, we lean on GNU builtin functions
for using them, so these declarations are also conditional on using
a GNU compiler.
Arun Thomas [Tue, 9 Mar 2010 09:41:14 +0000 (09:41 +0000)]
Move archtypes.h, fpu.h, and stackframe.h
Move archtypes.h to include/ dir, since several servers require it. Move
fpu.h and stackframe.h to arch-specific header directory. Make source
files and makefiles aware of the new header locations.
Arun Thomas [Mon, 8 Mar 2010 11:04:59 +0000 (11:04 +0000)]
Include directory reorg and makefile updates.
-Convert the include directory over to using bsdmake
syntax
-Update/add mkfiles
-Modify install(1) so that it can create symlinks
-Update makefiles to use new install(1) options
-Rename /usr/include/ibm to /usr/include/i386
-Create /usr/include/machine symlink to arch header files
-Move vm_i386.h to its new home in the /usr/include/i386
-Update source files to #include the header files at their
new homes.
-Add new gnu-includes target for building GCC headers
Ben Gras [Fri, 5 Mar 2010 15:05:11 +0000 (15:05 +0000)]
panic() cleanup.
this change
- makes panic() variadic, doing full printf() formatting -
no more NO_NUM, and no more separate printf() statements
needed to print extra info (or something in hex) before panicing
- unifies panic() - same panic() name and usage for everyone -
vm, kernel and rest have different names/syntax currently
in order to implement their own luxuries, but no longer
- throws out the 1st argument, to make source less noisy.
the panic() in syslib retrieves the server name from the kernel
so it should be clear enough who is panicing; e.g.
panic("sigaction failed: %d", errno);
looks like:
at_wini(73130): panic: sigaction failed: 0
syslib:panic.c: stacktrace: 0x74dc 0x2025 0x100a
- throws out report() - printf() is more convenient and powerful
- harmonizes/fixes the use of panic() - there were a few places
that used printf-style formatting (didn't work) and newlines
(messes up the formatting) in panic()
- throws out a few per-server panic() functions
- cleans up a tie-in of tty with panic()
merging printf() and panic() statements to be done incrementally.
Move cp_grant_id_t to a more central header file, and uses it more
extensively.
Fix casts that cast the grand id field of some messages to the wrong
type.
Ben Gras [Wed, 3 Mar 2010 15:32:26 +0000 (15:32 +0000)]
New P_BLOCKEDON for kernel - a macro that encodes the "who is this
process waiting for" logic, which is duplicated a few times in the
kernel. (For a new feature for top.)
Introducing it and throwing out ESRCDIED and EDSTDIED (replaced by
EDEADSRCDST - so we don't have to care which part of the blocking is
failing in system.c) simplifies some code in the kernel and callers that
check for E{DEADSRCDST,ESRCDIED,EDSTDIED}, but don't care about the
difference, a fair bit, and more significantly doesn't duplicate the
'blocked-on' logic.
Ben Gras [Mon, 1 Mar 2010 15:53:57 +0000 (15:53 +0000)]
slight tuning of /etc/mk situation when making release.
- Make the bootstrap /etc/mk be populated from the newly checked out source
- Don't chmod 755 all of /etc
- For the 'real' /etc/mk installing, let the /etc/mk ownership and permission
come from the mtree file, delete the contents of /etc/mk, then copy the .mk
files over and set reasonable permissions and ownership. (So that the .mk
get updated from the real usr/src/ copies, and no other junk if anything,
after the bootstrap phase, whatever happened there.)
Tomas Hruby [Wed, 10 Feb 2010 15:36:54 +0000 (15:36 +0000)]
Time accounting based on TSC
- as thre are still KERNEL and IDLE entries, time accounting for
kernel and idle time works the same as for any other process
- everytime we stop accounting for the currently running process,
kernel or idle, we read the TSC counter and increment the p_cycles
entry.
- the process cycles inherently include some of the kernel cycles as
we can stop accounting for the process only after we save its
context and we start accounting just before we restore its context
- this assumes that the system does not scale the CPU frequency which
will be true for ... long time ;-)
Lots of small code cleanup: make symbols local, remove unused symbols,
fixed a typo, removed a now unused header file.
Use #include <..> for header files that represent libraries.
Tomas Hruby [Tue, 9 Feb 2010 15:22:43 +0000 (15:22 +0000)]
No CLOCK task
- no kernel tasks are runnable
- clock initialization moved to the end of main()
- the rest of the body of clock_task() is moved to bsp_timer_int_handler() as
for now we are going to handle this on the bootstrap cpu. A change later is
possible.
Tomas Hruby [Tue, 9 Feb 2010 15:20:09 +0000 (15:20 +0000)]
Removal of the system task
* Userspace change to use the new kernel calls
- _taskcall(SYSTASK...) changed to _kernel_call(...)
- int 32 reused for the kernel calls
- _do_kernel_call() to make the trap to kernel
- kernel_call() to make the actuall kernel call from C using
_do_kernel_call()
- unlike ipc call the kernel call always succeeds as kernel is
always available, however, kernel may return an error
* Kernel side implementation of kernel calls
- the SYSTEm task does not run, only the proc table entry is
preserved
- every data_copy(SYSTEM is no data_copy(KERNEL
- "locking" is an empty operation now as everything runs in
kernel
- sys_task() is replaced by kernel_call() which copies the
message into kernel, dispatches the call to its handler and
finishes by either copying the results back to userspace (if
need be) or by suspending the process because of VM
- suspended processes are later made runnable once the memory
issue is resolved, picked up by the scheduler and only at
this time the call is resumed (in fact restarted) which does
not need to copy the message from userspace as the message
is already saved in the process structure.
- no ned for the vmrestart queue, the scheduler will restart
the system calls
- no special case in do_vmctl(), all requests remove the
RTS_VMREQUEST flag
Tomas Hruby [Tue, 9 Feb 2010 15:15:45 +0000 (15:15 +0000)]
copy_msg_from_user() and copy_msg_to_user()
- copies a mesage from/to userspace without need of translating
addresses
- the assumption is that the address space is installed, i.e. ldt and
cr3 are loaded correctly
- if a pagefault or a general protection occurs while copying from
userland to kernel (or vice versa) and error is returned which gives
the caller a chance to respond in a proper way
- error happens _only_ because of a wrong user pointer if the function
is used correctly
- if the prerequisites of the function do no hold, the function will
most likely fail as the user address becomes random
Tomas Hruby [Tue, 9 Feb 2010 15:13:52 +0000 (15:13 +0000)]
Early address space switch
- switch_address_space() implements a switch of the user address space
for the destination process
- this makes memory of this process easily accessible, e.g. a pointer
valid in the userspace can be used with a little complexity to
access the process's memory
- the switch does not happed only just before we return to userspace,
however, it happens right after we know which process we are going
to schedule. This happens before we start processing the misc flags
of this process so its memory is available
- if the process becomes not runnable while processing the mics flags
we pick a new process and we switch the address space again which
introduces possibly a little bit more overhead, however, it is
hopefully hidden by reducing the overheads when we actually access
the memory
Tomas Hruby [Tue, 9 Feb 2010 15:12:20 +0000 (15:12 +0000)]
System task initialization moved to main()
- the system task initialization code does not really need to be part
of the system task process. An earlier initialization in kernel is
cleaner as it does not only initialize the syscalls but also irq
hooks etc.
Fixes for truncate system calls:
- VFS: check for negative sizes in all truncate calls
- VFS: update file size after truncating with fcntl(F_FREESP)
- VFS: move pos/len checks for F_FREESP with l_len!=0 from FS to VFS
- MFS: do not zero data block for small files when fully truncating
- MFS: do not write out freed indirect blocks after freeing space
- MFS: make truncate work correctly with differing zone/block sizes
- tests: add new test50 for truncate call family
Ben Gras [Wed, 3 Feb 2010 13:29:14 +0000 (13:29 +0000)]
small asmconv cleanups.
- put asmconv in /usr/bin so it can be invoked without absolute path
- make it ignore .end in gnu output mode so that it can be invoked
without '|| true' in the gnu lib makefiles and it doesn't produce the
messy error message
Statistical profiling fixes:
- PM: get rid of umap warning
- sprofalyze.pl: update with recently added servers and drivers
- sprofalyze.pl: properly truncate process names for sample matching
Tomas Hruby [Wed, 3 Feb 2010 09:04:48 +0000 (09:04 +0000)]
This patch removes the global variables who_p and who_e from the
kernel (sys task). The main reason is that these would have to become
cpu local variables on SMP. Once the system task is not a task but a
genuine part of the kernel there is even less reason to have these
extra variables as proc_ptr will already contain all neccessary
information. In addition converting who_e to the process pointer and
back again all the time will be avoided.
Although proc_ptr will contain all important information, accessing it
as a cpu local variable will be fairly expensive, hence the value
would be assigned to some on stack local variable. Therefore it is
better to add the 'caller' argument to the syscall handlers to pass
the value on stack anyway. It also clearly denotes on who's behalf is
the syscall being executed.
This patch also ANSIfies the syscall function headers.
Last but not least, it also fixes a potential bug in virtual_copy_f()
in case the check is disabled. So far the function in case of a
failure could possible reuse an old who_p in case this function had
not been called from the system task.
virtual_copy_f() takes the caller as a parameter too. In case the
checking is disabled, the caller must be NULL and non NULL if it is
enabled as we must be able to suspend the caller.