Ben Gras [Tue, 18 Sep 2012 11:17:52 +0000 (13:17 +0200)]
VM: full munmap
complete the munmap implementation; single-page references make it
possible to write a general munmap() implementation cleanly.
. memory: let the MIOCRAMSIZE ioctl set the imgrd device
size (but only to 0)
. let the ramdisk command set sizes to 0
. use this command to set /dev/imgrd to 0 after mounting /usr
in /etc/rc, so the boot time ramdisk is freed (about 4MB
currently)
Ben Gras [Tue, 18 Sep 2012 11:17:49 +0000 (13:17 +0200)]
VM: only single page chunks
. only reference single pages in process data structures
to simplify page faults, copy-on-write, etc.
. this breaks the secondary cache for objects that are
not one-page-sized; restored in a later commit
Ben Gras [Tue, 18 Sep 2012 11:17:44 +0000 (13:17 +0200)]
libc/libminc malloc reorganization
. rename minix malloc sources to minix-* so Makefile
references aren't ambiguous
. throw out malloc source file copies in libminc
. make libminc use phkmalloc instead of minix malloc (slightly faster)
Thomas Veerman [Tue, 28 Aug 2012 14:06:51 +0000 (14:06 +0000)]
VFS: make all IPC asynchronous
By decoupling synchronous drivers from VFS, we are a big step closer to
supporting driver crashes under all circumstances. That is, VFS can't
become stuck on IPC with a synchronous driver (e.g., INET) and can
recover from crashing block drivers during open/close/ioctl or during
communication with an FS.
In order to maintain serialized communication with a synchronous driver,
the communication is wrapped by a mutex on a per driver basis (not major
numbers as there can be multiple majors with identical endpoints). Majors
that share a driver endpoint point to a single mutex object.
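The sharing of one mutex between majors with the same endpoint can be
sketched as follows. This is a minimal illustration, not the VFS code: the
names dmap_entry, driver_lock, map_driver, lock_for_major, NR_MAJORS and
NO_ENDPOINT are hypothetical, and the integer `locked` field only stands in
for a real mutex.

```c
#include <stddef.h>

#define NR_MAJORS   8      /* hypothetical table size */
#define NO_ENDPOINT (-1)   /* hypothetical "unmapped" marker */

/* Stand-in for a real mutex object; one exists per driver endpoint. */
struct driver_lock {
    int endpoint;   /* endpoint this lock serializes, NO_ENDPOINT if free */
    int locked;     /* placeholder for real mutex state */
};

struct dmap_entry {
    int endpoint;               /* driver endpoint serving this major */
    struct driver_lock *lock;   /* shared by all majors on that endpoint */
};

static struct driver_lock locks[NR_MAJORS];
static struct dmap_entry dmap[NR_MAJORS];

/* Reset both tables so that every major is unmapped. */
void dmap_init(void)
{
    for (int i = 0; i < NR_MAJORS; i++) {
        locks[i].endpoint = NO_ENDPOINT;
        locks[i].locked = 0;
        dmap[i].endpoint = NO_ENDPOINT;
        dmap[i].lock = NULL;
    }
}

/* Map a major to an endpoint, reusing that endpoint's lock if one exists. */
int map_driver(int major, int endpoint)
{
    struct driver_lock *free_slot = NULL;

    if (major < 0 || major >= NR_MAJORS || endpoint == NO_ENDPOINT)
        return -1;

    for (int i = 0; i < NR_MAJORS; i++) {
        if (locks[i].endpoint == endpoint) {
            /* Another major already uses this endpoint: share its lock. */
            dmap[major].endpoint = endpoint;
            dmap[major].lock = &locks[i];
            return 0;
        }
        if (free_slot == NULL && locks[i].endpoint == NO_ENDPOINT)
            free_slot = &locks[i];
    }
    if (free_slot == NULL)
        return -1;

    /* First major on this endpoint: claim a fresh lock object. */
    free_slot->endpoint = endpoint;
    dmap[major].endpoint = endpoint;
    dmap[major].lock = free_slot;
    return 0;
}

/* Return the lock that serializes communication for a major, or NULL. */
struct driver_lock *lock_for_major(int major)
{
    if (major < 0 || major >= NR_MAJORS)
        return NULL;
    return dmap[major].lock;
}
```

Keying the lock on the endpoint rather than the major means that two majors
served by the same driver process can never talk to it concurrently.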
In order to support crashes from block drivers, the file reopen tactic
had to be changed; first reopen files associated with the crashed
driver, then send the new driver endpoint to FSes. This solves a
deadlock between the FS and the block driver:
- VFS would send REQ_NEW_DRIVER to an FS, but the FS only receives it
after retrying the current request to the newly started driver.
- The block driver would refuse the retried request until all files
had been reopened.
- VFS would reopen files only after getting a reply from the initial
REQ_NEW_DRIVER.
When a character special driver crashes, all associated files have to
be marked invalid and closed (or reopened if flagged as such). However,
they can only be closed if a thread holds exclusive access to them. To
obtain exclusive access, the worker thread (which handles the new driver
endpoint event from DS) schedules a new job to garbage collect invalid
files. This way, we can signal the worker thread that was talking to the
crashed driver, so that it releases exclusive access to the file
associated with the crashed driver and the garbage collecting worker
thread does not deadlock on that file.
Also, when a character special driver crashes, RS will unmap the driver
and remap it upon restart. During unmapping, associated files are marked
invalid instead of waiting for an endpoint up event from DS, as that
event might come later than new read/write/select requests and thus
cause confusion in the freshly started driver.
When locking a filp, the usage counters are no longer checked. The usage
counter can legally go down to zero during filp invalidation while there
are locks pending.
DS events are handled by a separate worker thread instead of the main
thread as reopening files could lead to another crash and a stuck thread.
An additional worker thread is then necessary to unlock it.
Finally, with everything asynchronous, a race condition in do_select
surfaced. A select entry was only marked in use after successfully sending
the initial select requests to drivers and having to wait. When multiple
select() calls were handled, these entries could be overwritten. As a
result, some select results were ignored (and select() remained blocked
instead of returning) or do_select tried to access filps that were no
longer present (because they had been thrown away by a secondary
select()). This bug manifested itself with sendrecs, but was very hard to
reproduce. However, it became very easy to trigger with asynsends only.
Instead of using a loop to find a matching ipc (inter process
communication) system call type, the offset in the call table can be
simply calculated in constant time.
Also, when the interprocess communication server receives an ipc
system call from a process, ipc should tell VM to watch the process
only once. This patch fixes that also.
(Patch and commit message slightly edited by committer.)
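The constant-time lookup described above can be sketched as follows. This
is only an illustration under assumptions: IPC_BASE, the handler stubs and
ipc_lookup are hypothetical names, and the real call numbers and table
layout of the ipc server are not shown in this log.

```c
#include <stddef.h>

#define IPC_BASE 0x100   /* hypothetical number of the first ipc call */

typedef int (*ipc_handler_t)(void);

/* Hypothetical handler stubs; each returns a distinct value. */
static int handle_call0(void) { return 1; }
static int handle_call1(void) { return 2; }
static int handle_call2(void) { return 3; }
static int handle_call3(void) { return 4; }

/* Handlers are stored in call-number order, starting at IPC_BASE. */
static const ipc_handler_t ipc_calls[] = {
    handle_call0, handle_call1, handle_call2, handle_call3,
};

#define NR_IPC_CALLS (sizeof(ipc_calls) / sizeof(ipc_calls[0]))

/* Compute the table offset directly instead of scanning for a match. */
ipc_handler_t ipc_lookup(int call_type)
{
    size_t idx;

    if (call_type < IPC_BASE)
        return NULL;
    idx = (size_t)(call_type - IPC_BASE);
    if (idx >= NR_IPC_CALLS)
        return NULL;   /* out of range: not an ipc call */
    return ipc_calls[idx];
}
```

This only works when the call numbers are contiguous and the table is kept
in the same order; the bounds checks replace the "not found" case of a loop.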
- use one single library instead of loose library files
- we don't have ftime() anymore
- shmat(non-NULL) is currently broken, fix shmt test set to bypass this
- some other small issues
Coverity was flagging a recursive include between kernel.h and
cpulocals.h. As cpulocals.h also included proc.h, we can move that
include statement into kernel.h, and clean up the source files'
include statements accordingly.
Ben Gras [Fri, 10 Aug 2012 16:27:23 +0000 (18:27 +0200)]
libexec: add load_offset feature, used for ld.so
. ld.so is linked at 0 but it can relocate itself; we
wish to load ld.so higher though to trap NULL dereferences.
if we know we have to execute ld.so, vfs tells libexec to put it
higher.
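The effect of a load offset can be sketched as follows. This is not the
libexec interface: struct segment, place_segment and maps_address are
hypothetical names used only to show how shifting an image linked at 0
leaves address 0 unmapped, so that NULL dereferences fault.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one loadable segment. */
struct segment {
    uintptr_t vaddr;   /* address the segment was linked at */
    size_t size;       /* size of the segment in bytes */
};

/* Address at which the segment ends up once the offset is applied. */
uintptr_t place_segment(const struct segment *seg, uintptr_t load_offset)
{
    return seg->vaddr + load_offset;
}

/* Return 1 if the placed segment covers addr, 0 otherwise. */
int maps_address(const struct segment *seg, uintptr_t load_offset,
                 uintptr_t addr)
{
    uintptr_t start = place_segment(seg, load_offset);

    return addr >= start && addr - start < seg->size;
}
```

With an offset of 0 a segment linked at 0 covers address 0; with any
nonzero offset of at least one page it no longer does.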
This patch adds the sprofdiff tool, which compares two sets of profiling
output files. It sorts processes and symbols by difference in average
number of samples, placing those that took more time on the left first
and those that took more time on the right last. If multiple runs are
combined, a standard deviation is computed and this is used to compute
the significance level, which gives an indication of which differences
are likely to be due to chance.
This tool is run not on the raw profiling files, but on the output of
sprofalyze -d (a new option). Though having to use two tools and an
intermediate file seems a bit awkward, the advantage is that the
original source tree is not needed to resolve the symbols. For
comparisons, this is very useful. Also, the intermediate file is in a
text format that can easily be processed by scripts, which may be useful
for other purposes as well.
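How a standard deviation over multiple runs can feed into a significance
estimate is sketched below. The statistic sprofdiff actually uses is not
stated here; this example assumes Welch's t statistic (returned squared, to
avoid a square root) as one common choice, and all function names are
hypothetical.

```c
#include <stddef.h>

/* Average number of samples over n runs. */
double sample_mean(const double *x, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Unbiased sample variance (the square of the standard deviation). */
double sample_variance(const double *x, size_t n)
{
    double m = sample_mean(x, n);
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += (x[i] - m) * (x[i] - m);
    return sum / (double)(n - 1);
}

/*
 * Squared Welch t statistic for the difference between two sets of runs.
 * Larger values mean the difference is less likely to be due to chance.
 * Returns -1.0 if either set has fewer than two runs or if there is no
 * variation at all, since the statistic is undefined in those cases.
 */
double welch_t_squared(const double *a, size_t na,
                       const double *b, size_t nb)
{
    double diff, se2;

    if (na < 2 || nb < 2)
        return -1.0;
    diff = sample_mean(b, nb) - sample_mean(a, na);
    se2 = sample_variance(a, na) / (double)na
        + sample_variance(b, nb) / (double)nb;
    if (se2 <= 0.0)
        return -1.0;
    return diff * diff / se2;
}
```

Turning the statistic into an actual significance level additionally
requires the t distribution with the appropriate degrees of freedom, which
is left out of this sketch.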
Ben Gras [Wed, 8 Aug 2012 13:47:45 +0000 (15:47 +0200)]
vm: ignore RS pin (pre-allocate) requests for now
. done by RS to reduce/remove dependency on VM for recovery
. RS has the default stack size of 64MB since the nosegments
change, using a huge amount of unused memory to pre-allocate
. ignore these requests until actually required (i.e. being able
to survive VM crashes)
Thanks to pikpik for investigating why RS was so huge.