- POLLRDBAND is reported by select(2) as errorfd, not readfd;
- POLLERR is not the same as errorfd of select(2);
- flags that are not requested should not be returned.
The callback, which was dropped in commit git-842c4ed, allows drivers
to fetch the interrupt status once and save it locally for subsequent
calls to drv_int().
This patch adds strace-like support for a -t command line option,
which causes a timestamp to be printed at the beginning of each line.
If the option is given more than once, the output will also include
microseconds.
Antoine Leca [Fri, 4 Nov 2016 18:19:14 +0000 (19:19 +0100)]
Fix the process for GNU tools on MINIX
This is a fix over commit a150b26ee803b20080
On a MINIX station, the tools are not usually built and
on a first-time building of the tree, the fetching script
of texinfo was not triggered in some cases. Let force it.
Reported on minix3 googlegroup by Chris Card.
As of change git-87c599d, when processing CLOCK notifications, PM no
longer set the current process pointer 'mp'. That pointer is however
used when delivering signals through check_sig(), to see whether the
current process may deliver a signal to the target process. As a
result, delivering SIGALARM signals used a previous pointer in these
checks, causing alarm signals not to be delivered in some cases.
This patch ensures that alarm signals are again delivered with PM as
current process.
With this patch, it is now possible to generate coverage information
for MINIX3 system services with LLVM. In particular, the system can
be built with MKCOVERAGE=yes, either with a native "make build" or
with crosscompilation. Either way, MKCOVERAGE=yes will build the
MINIX3 system services with coverage profiling support, generating a
.gcno file for each source module. After a reboot it is possible to
obtain runtime coverage data (.gcda files) for individual system
services using gcov-pull(8). The combination of the .gcno and .gcda
files can then be inspected with llvm-cov(1).
For reasons documented in minix.gcov.mk, only system service program
modules are supported for now; system service libraries (libsys etc.)
are not included. Userland programs are not affected by MKCOVERAGE.
The heart of this patch is the libsys code that writes data generated
by the LLVM coverage hooks into a serialized format using the routines
we already had for GCC GCOV. Unfortunately, the new llvm_gcov.c code
is LLVM ABI dependent, and may therefore have to be updated later when
we upgrade LLVM. The current implementation should support all LLVM
versions 3.x with x >= 4.
The rest of this patch is mostly a light cleanup of our existing GCOV
infrastructure, with as most visible change that gcov-pull(8) now
takes a service label string rather than a PID number.
Antoine Leca [Fri, 15 Jul 2016 15:27:36 +0000 (17:27 +0200)]
Adjust .gitignore for MINIX file system
Some files in LLVM have more than Minix-maximum of 60 characters.
Also drop some obsolete stuff, and add obj which are symlinks added
to every directory when using /usr/obj as OBJDIR (hinted in wiki.)
The way these options work is by creating files that contain debugging
symbols and stashing them in a dedicated set. The minix-debug set has
been created for this purpose, but it will probably have to be refined
since it has been tested only with the default options with an i386
cross-build.
LSC: Amended to support many combination of MKDEBUG, MKDEBUGLIB, with
and without X11, for both intel and arm.
Antoine Leca [Mon, 29 Aug 2016 11:48:31 +0000 (13:48 +0200)]
Improve the process for GNU tools
Split the process to fetch GNU tools (until now embedded
within tools/Makefile.gnuhost) into a new Makefile.fetchgnu,
MINIX-specific hence relocated, which is to be also used
to fetch sources even when not building the tools.
Use it for binutils too.
Improve documentation.
Also do not run configure on each run when MKUPDATE=yes
The .WAIT serialization instruction between fetching and other
configure sources was raising a new run of configure at each
compilation. Avoid it by using two rules.
The warnings in test47 seem to be a symptom of a larger problem,
i.e., not an issue with the test set code but rather with the GCC
configuration. Hopefully the switch to LLVM will resolve those.
Ever since a VM allocation strategy change, this test is fully
dysfunctional. It should be repaired and added to the regular
test set, but that will require some work.
libmthread: resolve memory leaks on exception path
If libmthread runs into a memory allocation failure while attempting
to enlarge its thread pool, it does not free up any preliminary
allocations made so far.
All functions prefixed with bdev_ are moved into bdev.c, and those
prefixed with cdev_ are now in cdev.c. The code in both files are
converted to KNF. The little (IOCTL-related) code left in device.c
is also cleaned up but should probably be moved into other existing
source files. This is left to a future patch. In general, VFS is
long overdue for a source code rebalancing, and the patch here is
only a step in the right direction.
Previously, VFS would use various subsets of a number of fproc
structure fields to store state when the process is blocked
(suspended) for various reasons. As a result, there was a fair
amount of abuse of fields, hidden state, and confusion as to
which fields were used with which suspension states.
Instead, the suspension state is now split into per-state
structures, which are then stored in a union. Each of the union's
structures should be accessed only right before, during, and right
after the fp_blocked_on field is set to the corresponding blocking
type. As a result, it is now very clear which fields are in use
at which times, and we even save a bit of memory as a side effect.
Any attempt to use open(2) to open a socket file now fails with
EOPNOTSUPP, as is common and in the process of being standardized.
The behavior and error code is now tested in test56.
Any attempt to open a file of which the type is not known to VFS
(e.g., as a result of bogus file system contents) now fails with EIO.
For now, this is a safety feature, to prevent VFS tripping over such
types in unchecked cases. In the future, a proper VFS code audit
should determine whether we can lift this restriction again, although
it does not seem particularly useful to be able to open files of
unknown types anyway. Another error code may be assigned to this case
later, too.
By now it has become clear that the VFS select code has an unusually
high concentration of bugs, and there is no indication that any form
of convergence to a bug-free state is in sight. Thus, for now, it
may be helpful to be able to dump the contents of the select tables
in order to track down any bugs in the future. Hopefully that will
allow the next bugs to be resolved slightly after than before.
The debug dump can be triggered with "svrctl vfs get print_select".
- it was querying a character or socket device that, at the start of
the select query, was not known to be ready for the requested
operations;
- this device could not be checked immediately, due to another ongoing
query to the same character or socket driver;
- the select query had a timer that triggered before the device could
be checked, thereby changing the select query to non-blocking.
In this situation, a missing flag check would cause the select code to
conclude erroneously that the operations which it flagged for later,
were satisfied. At the same time, the same flag remained set, so that
the select query would continue to wait for that device. This
resulted in a deadlock. The same bug could most likely be triggered
through other scenarios that were even less likely to occur.
This patch fixes the race condition and puts in a hopefully slightly
more informative comment for the affected block of code.
In practice, the bug could be triggered fairly reliably by generating
lots of output in tmux.
Now that clock_t is an unsigned value, we can also allow the system
uptime to wrap. Essentially, instead of using (a <= b) to see if time
a occurs no later than time b, we use (b - a <= CLOCK_MAX / 2). The
latter value does not exist, so instead we add TMRDIFF_MAX for that
purpose.
We must therefore also avoid using values like 0 and LONG_MAX as
special values for absolute times. This patch extends the libtimers
interface so that it no longer uses 0 to indicate "no timeout".
Similarly, TMR_NEVER is now used as special value only when
otherwise a relative time difference would be used. A minix_timer
structure is now considered in use when it has a watchdog function set,
rather than when the absolute expiry time is not TMR_NEVER. A few new
macros in <minix/timers.h> help with timer comparison and obtaining
properties from a minix_timer structure.
This patch also eliminates the union of timer arguments, instead using
the only union element that is only used (the integer). This prevents
potential problems with e.g. live update. The watchdog function
prototype is changed to pass in the argument value rather than a
pointer to the timer structure, since obtaining the argument value was
the only current use of the timer structure anyway. The result is a
somewhat friendlier timers API.
The VFS select code required a few more invasive changes to restrict
the timer value to the new maximum, effectively matching the timer
code in PM. As a side effect, select(2) has been changed to reject
invalid timeout values. That required a change to the test set, which
relied on the previous, erroneous behavior.
Finally, while we're rewriting significant chunks of the timer code
anyway, also covert it to KNF and add a few more explanatory comments.
Antoine Leca [Tue, 2 Aug 2016 21:07:15 +0000 (23:07 +0200)]
Adapt MINIX-specific part of tools/installboot
This is necessary to enable correct compilation of the tools version
of installboot_nbsd(8)when cross-compiling on a system close enough
to MINIX, like NetBSD 7.0.1 for example.
At a point not too far in the future, we will be switching from the
hardcoded MINIX3 implementation of the getifaddrs(3) libc routine to
the proper NetBSD implementation. The latter uses the
net.route.rtable sysctl functionality to obtain its information. In
order make the transition as painless as possible, this patch adds
basic support for that net.route.rtable functionality to INET and
LWIP, using the remote MIB (RMIB) facility.
With this patch, the IPC service is changed to use the new RMIB
facility to register and handle the "kern.ipc" sysctl subtree itself.
The subtree was previously handled by the MIB service directly. This
change improves locality of handling: especially the
kern.ipc.sysvipc_info node has some peculiarities specific to the IPC
service and is therefore better handled there. Also, since the IPC
service is essentially optional to the system, this rearrangement
yields a cleaner situation when the IPC service is not running: in
that case, the MIB service will expose a few basic kern.ipc nodes
indicating that no SysV IPC facilities are present. Those nodes will
be overridden through RMIB when the IPC service is running.
It should be easier to add the remaining (from NetBSD) kern.ipc nodes
as well now.
Test88 is extended with a new subtest that verifies that sysctl-based
information retrieval for semaphore sets works as expected.
MIB/libsys: support for remote MIB (RMIB) subtrees
Most of the nodes in the general sysctl tree will be managed directly
by the MIB service, which obtains the necessary information as needed.
However, in certain cases, it makes more sense to let another service
manage a part of the sysctl tree itself, in order to avoid replicating
part of that other service in the MIB service. This patch adds the
basic support for such delegation: remote services may now register
their own subtrees within the full sysctl tree with the MIB service,
which will then forward any sysctl(2) requests on such subtrees to the
remote services.
The system works much like mounting a file system, but in addition to
support for shadowing an existing node, the MIB service also supports
creating temporary mount point nodes. Each have their own use cases.
A remote "kern.ipc" would use the former, because even when such a
subtree were not mounted, userland would still expect some of its
children to exist and return default values. A remote "net.inet"
would use the latter, as there is no reason to precreate nodes for all
possible supported networking protocols in the MIB "net" subtree.
A standard remote MIB (RMIB) implementation is provided for services
that wish to make use of this functionality. It is essentially a
simplified and somewhat more lightweight version of the MIB service's
internals, and works more or less the same from a programmer's point
of view. The most important difference is the "rmib" prefix instead
of the "mib" prefix. Documentation will hopefully follow later.
Overall, the RMIB functionality should not be used lightly, for
several reasons. First, despite being more lightweight than the MIB
service, the RMIB module still adds substantially to the code
footprint of the containing service. Second, the RMIB protocol not
only adds extra IPC for sysctl(2), but has also not been optimized for
performance in other ways. Third, and most importantly, the RMIB
implementation also several limitations. The main limitation is that
remote MIB subtrees must be fully static. Not only may the user not
create or destroy nodes, the service itself may not either, as this
would clash with the simplified remote node versioning system and
the cached subtree root node child counts. Other limitations exist,
such as the fact that the root of a remote subtree may only be a
node-type node, and a stricter limit on the highest node identifier
of any child in this subtree root (currently 4095).
The current implementation was born out of necessity, and therefore
it leaves several improvements to future work. Most importantly,
support for exit and crash notification is missing, primarily in the
MIB service. This means that remote subtrees may not be cleaned up
immediately, but instead only when the MIB service attempts to talk
to the dead remote service. In addition, if the MIB service itself
crashes, re-registration of remote subtrees is currently left up to
the individual RMIB users. Finally, the MIB service uses synchronous
(sendrec-based) calls to the remote services, which while convenient
may cause cascading service hangs. The underlying protocol is ready
for conversion to an asynchronous implementation already, though.
A new test set, testrmib.sh, tests the basic RMIB functionality. To
this end it uses a test service, rmibtest, and also reuses part of
the existing test87 MIB service test.
Instead, filter it in libc for old networking implementations, as
those do not support sending SIGPIPE to user processes anyway. This
change allows newer socket drivers to implement the flag as per the
specification.
We do not support any PF functionality, nor do we intend to. However,
some NetBSD utilities rely on the presence of these files. Not all of
the files are installed. The NetBSD source seems rather inconsistent
in where from to include these files. We simply follow what NetBSD
does, though.
While still a small subset of the NetBSD headers, this set should
allow various additional utilities to be compiled without too many
MINIX3-specific changes (even if those utilities will not yet work).
While MFS failing to do I/O on a block is generally fatal, reading
the superblock at mount time is an exception: this case may occur
when the given partition is too small to contain the superblock.
Therefore, MFS should not crash or even report anything in this
case, but rather refuse to mount cleanly.
SEF: identity transfer only after controlled crash
Transparent (endpoint-preserving) restarts with identity transfer
are meant to exercise the crash recovery system only. After *real*
crashes, such restarts are useless at best and dangerous at worst,
because no state integrity can be guaranteed afterwards. Thus,
except after a controlled crash, it is best not to perform such
restarts at all. This patch changes SEF such that identity transfer
is successful only if the old process was the subject of a crash
induced through "service fi". As a result, testrelpol.sh should
continue to be able to use identity transfers for testing purposes,
but any real crash will be handled more appropriately.
The new asserts from git-29e004d exposed an issue in how VFS handles
aborting file system (FS) requests that are queued for a FS (as
opposed to sent to it) when that FS crashes. In that scenario, the
queued worker has its w_task set to NONE, because there is no ongoing
communication. However, worker_stop() is called on it regardless,
which used to abort the request only if w_task was not set to NONE,
leading to an improperly aborted request, a warning, and a VFS crash a
bit later. This patch changes worker_stop() so that w_task need not
be set to a valid endpoint for FS requests to be properly aborted.
Scripts for generating boot-to-ramdisk images are now available. These
can be used for example to boot from PXE or from a USB stick, as the
ramdisk are self-contained and do not rely on any block devices after
being loaded into RAM.
The image generation framework has also been slightly cleaned up in
order to better accomodate tarball sets bundling in images.
Some functions in lib/libc/net were disabled on MINIX3 only, but with
a few added header files they build just fine, even though some of
them rely on system functionality that has not yet been implemented.
Since the functionality is unlikely to be used in practice (because
it typically requires the use of protocol families that themselves are
not yet supported, such as IPv6), already enabling it right now helps
in building packages that rely on the functionality being present at
compile time, while not posing any practical risk of breaking the same
packages at run time.
This patch aims to synchronize the basic process user and group ID
management, as well as the set[ug]id(2) and sete[ug]id(2) behavior,
with NetBSD. As it turns out, the main issue was missing support for
saved user and group IDs. This support is now added.
Since NetBSD's userland, which we are importing, may rely on NetBSD
specifics when it comes to security, we choose not to deviate from
NetBSD's behavior in any way here. A new test, test89, verifies the
correct behavior - it has been confirmed to pass on NetBSD as is.
Commit git-c38dbb9 inadvertently broke local MINIX3-on-MINIX3 builds,
since its libc changes relied on VFS being upgraded already as well.
As a result, after installing the new libc, networking ceased to work,
leading to curl(1) failing later on in the build process. This patch
introduces transitional code that is necessary for the build process
to complete, after which it is obsolete again.
For a reason currently unknown to us, the qemu-linaro emulator
sometimes produces a Prefetch Abort exception with a fault location
(IFAR) rather different from the location of the instruction being
executed (LR corrected by 4). So far it has been observed in the
__udivmodsi4 routine of various processes, where the fault address is
for the first byte of the next page after the current instruction,
which itself is 44-64 bytes away from the start of that next page.
The affected instruction does not perform any sort of memory access.
Short of debugging qemu-linaro itself, we have no choice but to
disable the assert that previously went off in case the IFAR and
corrected LR are not equal. Since we have not yet observed this case
on actual hardware, the kernel prints a warning when detecting such a
mismatch for the first time. For the qemu-linaro case, the kernel's
actual page fault handling logic already handles this strange case
just fine.
- fix the reinstallation (preserve-/home) option;
- remove support for just reinstalling the bootloader, as the main
purpose of this option (allowing an upgrade from the old MINIX
boot monitor to the NetBSD bootloader) is no longer needed and was
already broken;
- do not try to copy over /etc/motd.install: it no longer exists.
Currently, the BSD socket API is implemented in libc, translating the
API calls to character driver operations underneath. This approach
has several issues:
- it is inefficient, as most character driver operations are specific
to the socket type, thus requiring that each operation start by
bruteforcing the socket protocol family and type of the given file
descriptor using several system calls;
- it requires that libc itself be changed every time system support
for a new protocol is added;
- various parts of the libc implementations violate the asynchronous
signal safety POSIX requirements.
In order to resolve all these issues at once, the plan is to turn the
BSD socket calls into system calls, thus making the BSD socket API the
"native" ABI, removing the complexity from libc and instead letting
VFS deal with the socket calls.
The overall change is going to break all networking functionality. In
order to smoothen the transition, this patch introduces the fifteen
new BSD socket system calls, and makes libc try these first before
falling back on the old behavior. For now, the VFS implementations of
the new calls fail such that libc will always use the fallback cases.
Later on, when we introduce the actual implementation of the native
BSD socket calls, all statically linked programs will automatically
use the new ABI, thus limiting actual application breakage.
In other words: by itself, this patch does nothing, except add a bit
of transitional overhead that will disappear in the future. The
largest part of the patch is concerned with adding full support for
the new BSD socket system calls to trace(1) - this early addition has
the advantage of making system call tracing output of several socket
calls much more readable already.
Both the system call interfaces and the trace(1) support have already
been tested using code that will be committed later on.
There is no reason to use a single message for nonoverlapping requests
and replies combined, and in fact splitting them out allows reuse of
messages and avoids various problems with field layouts. Since the
upcoming socketpair(2) system call will be using the same reply as
pipe2(2), split up the single message used for the latter. In order
to keep the used parts of messages at the front, start a transitional
phase to move the pipe(2) flags field to the front of its request.
Previously, the libc sendto(3) and recvfrom(3) implementations would
blindly assume that any unrecognized socket is a raw-IP socket. This
is not only inconsistent but also messes with returned error codes.
Lionel Sambuc [Sun, 7 Feb 2016 21:13:15 +0000 (22:13 +0100)]
Cross-compilation fixes:
- GCC >4.9:
Newer toolchains trigger more warnings, which by default are
treated as errors. Disable this while building the tools as this
will be a recurring problem in the future.
close #105
- FreeBSD: Fix some fetch.sh scripts which fails as FreeBSD's patch
fails to patch two .info files. We ignore this for the time being.
Lionel Sambuc [Sat, 6 Feb 2016 11:38:52 +0000 (12:38 +0100)]
Fix usage of parenthesis in Makefiles
While BSD make support both $() and ${} around variables, the NetBSD
source tree uses only ${} by convention.
Imported software is left as is, and sometimes $() is used when the
containing Makefile/Makefile fragment is used both by GNU make and BSD
make, as it can happen for the tools, and other parts as well which are
compiled using the host make tool.