Product Requirements Document

TODO: This document declares the names of features and system calls which aren't yet written. As such, these names are merely suggestions. After implementation, please remember to update this document with their official names.

Background

Under emulation, Mantle (which is currently and typically part of the emulator) is expected to provide a number of services on behalf of a running application that would normally be handled by real device drivers in a physical machine.

Mantle services are invoked using the processor's ecall instruction. (This is why most Mantle service labels start with ec; e.g., ecQuitVM or ecEndPaint.) These services are frequently synchronous in nature. After loading register a7 with the desired operation and executing the ecall instruction, the results of the operation are typically available to the next instruction. The time taken to execute the operation is variable, but execution is always synchronous unless documented otherwise.

This relationship is not expected to hold indefinitely, however. In the future, Mantle's role is expected to evolve into more of a hypervisor role, and less of an OS kernel. Tripos itself will, as the stack evolves, increasingly assume kernel responsibilities.

Therefore, we need a device driver interface that scales from single-tasking, purpose-built kernels to kernels capable of supporting the time-sharing user experience that Tripos is known for.

Packets

Tripos uses message passing and a client/server relationship to support just about all filesystem and I/O transactions. A packet is a specially crafted message intended to be used for inter-process communications. A packet nearly always has the following layout:

+------------+
| pkt_next   |
+------------+
| pkt_id     |
+------------+
| pkt_op     |
+------------+
| pkt_flags  |
+------------+
| pkt_res1   |
+------------+
| pkt_res2   |
+------------+
| pkt_arg1   |
+------------+
| pkt_arg2   |
+------------+
| pkt_arg3   |
+------------+
| pkt_arg4   |
+------------+
| pkt_arg5   |
+------------+

The precise meaning of these fields is device and/or task specific, and so cannot be elaborated upon here. In classical Tripos implementations, filing systems and device drivers have unique command sets. However, some common patterns do exist.
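As a rough illustration, the layout above could be declared in C as follows. This is only a sketch: the document does not state field widths, so it assumes every field is one machine word, and both the struct name and the packet_init helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C view of the packet layout.
 * Assumption: every field is exactly one machine word wide. */
struct packet {
    uintptr_t pkt_next;   /* reserved; historically the queue link */
    uintptr_t pkt_id;     /* recipient before qpkt, sender afterwards */
    intptr_t  pkt_op;     /* requested operation (signed) */
    uintptr_t pkt_flags;  /* reserved; must be zero */
    uintptr_t pkt_res1;   /* result / error information */
    uintptr_t pkt_res2;
    uintptr_t pkt_arg1;   /* arguments; meaning is device or task specific */
    uintptr_t pkt_arg2;
    uintptr_t pkt_arg3;
    uintptr_t pkt_arg4;
    uintptr_t pkt_arg5;
};

/* Hypothetical helper: zero every field (including the reserved ones),
 * then fill in the destination and the operation. */
static void packet_init(struct packet *p, uintptr_t dest, intptr_t op)
{
    memset(p, 0, sizeof *p);
    p->pkt_id = dest;
    p->pkt_op = op;
}
```

Zeroing the whole structure first keeps pkt_next and pkt_flags clear, which is what the rest of this document asks of callers.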

One point of difference between classic Tripos and the Kestrel's implementation lies in the pkt_next field. Historically, this field was used to maintain a linked list for a queue. Today, it is completely ignored. Instead of linking packets through this field, qpkt copies the packet address into the target's ring buffer, usually a dynamically managed vector of pointers with separate head and tail indicators.
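A ring buffer of packet addresses might look like the sketch below. It is not the actual Kestrel queue: the capacity is fixed for brevity (the text describes a dynamically managed vector), and the names pkt_ring, ring_push, and ring_pop are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in packet type; only packet addresses are stored in the ring. */
struct packet { int placeholder; };

#define RING_CAPACITY 8  /* assumed fixed size, a power of two */

struct pkt_ring {
    struct packet *slots[RING_CAPACITY];
    size_t head;  /* count of packets popped so far */
    size_t tail;  /* count of packets pushed so far */
};

/* Append a packet address. Returns 0 on success, -1 if the ring is full. */
static int ring_push(struct pkt_ring *r, struct packet *p)
{
    if (r->tail - r->head == RING_CAPACITY)
        return -1;
    r->slots[r->tail % RING_CAPACITY] = p;
    r->tail++;
    return 0;
}

/* Remove and return the oldest packet address, or NULL if the ring is empty. */
static struct packet *ring_pop(struct pkt_ring *r)
{
    struct packet *p;

    if (r->head == r->tail)
        return NULL;
    p = r->slots[r->head % RING_CAPACITY];
    r->head++;
    return p;
}
```

Because head and tail are free-running counters, the ring is empty when they are equal and full when they differ by the capacity, so no slot has to be sacrificed to tell the two states apart.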

Why? I ran some tests by modifying this queue performance test to use an 80-byte data structure, roughly approximating the size of the packet structure above. I ran the test for 16 million iterations on an Intel Core i7-9750H processor at 2.6GHz. The linked list was always slower than the ring buffer. In the best case, linked lists were only marginally slower. In the worst case, linked lists were about twice as slow. Thus, at least on modern CPUs with complicated caches, the results suggest that it's actually cheaper to copy values of up to about 80 bytes from a holding buffer into a working structure (and then back out) than it is to maintain pointer linkage for a linked list of heap-allocated records. However, the field remains reserved for those platforms where this assumption does not hold.

The pkt_id field is used to indicate who will receive the packet. Both tasks and devices have work queues. When you invoke the qpkt system call, the provided packet is placed onto the work queue of its recipient, and the pkt_id field is updated with the ID of the sender. This allows the receiver to reply to the message upon completion of its task using the same qpkt system call.

The pkt_op field indicates the desired operation from the server task. This is where similarities with classic Tripos end, however. In Kestrel's interpretation of Tripos, positive operation codes represent requests, while negative codes indicate replies to previous requests. Unless documented otherwise, a request for operation N is answered with a reply whose operation is -N. Use pkt_res1 and pkt_res2 to provide error information.
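The sign convention can be sketched in a few lines of C. The struct below is a cut-down stand-in for the packet, and the helper names pkt_is_request, pkt_is_reply, and pkt_make_reply are hypothetical; only the rule itself (request N, reply -N) comes from this document.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-in for the packet; only the fields used here. */
struct packet {
    intptr_t pkt_op;
    intptr_t pkt_res1;
    intptr_t pkt_res2;
};

/* Positive operation codes are requests. */
static int pkt_is_request(const struct packet *p)
{
    return p->pkt_op > 0;
}

/* Negative operation codes are replies to earlier requests. */
static int pkt_is_reply(const struct packet *p)
{
    return p->pkt_op < 0;
}

/* Turn a request for operation N into its reply, -N, and record the
 * result values that accompany it. */
static void pkt_make_reply(struct packet *p, intptr_t res1, intptr_t res2)
{
    p->pkt_op = -p->pkt_op;
    p->pkt_res1 = res1;
    p->pkt_res2 = res2;
}
```

A task that acts as both client and server could then branch on pkt_is_reply for each packet it gets back from taskwait, without keeping a side table of outstanding packets.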

Why? In conventional Tripos systems, a packet has only a single, dual-purpose pkt_id field, and the pkt_op field has no specialized bits. Prior to sending a packet, pkt_id refers to the recipient of the message, while pkt_op requests the desired operation. After the message is processed, the pkt_id field always refers to the original sender. (qpkt takes care of making the swap on behalf of the driver.) But, since there is no additional channel to distinguish a response from a command, tasks which are both clients and servers need a solution to this problem. For such a task, all messages returned from taskwait will have a pkt_id field that always refers to the task itself, which rules out checking the pkt_id field to see where a message was originally addressed. Typically, a complex server task will use a convoluted key/value store associating coroutines with packets issued. Upon receiving a packet, if the packet's address is not in the key/value store, then you know it is a new request. In handling that request, you may need to invoke services from other tasks or devices. You would register the original out-bound packet in a key/value store. (On Cintpos versions of Tripos, this also means you often "hot-patch" system calls like read or findinput so you can intercept the calls to qpkt these routines make on your behalf!) Then, when handling the reply via taskwait, you'd find the packet is in the key/value store, and you would then switch to the coroutine responsible for handling that reply. It is a clever solution; however, it is a significant complexity that could have been easily avoided for all but the most complicated of server tasks. For this reason, Kestrel's implementation of Tripos uses a more ergonomic convention for the use of the pkt_op field.

Influence: The use of a designated bit to distinguish command from response comes from two inspirations: the HDLC network frame layout and the 9P protocol used in Plan 9. Both of these sources rely on bit 0 to distinguish command from response, however.

The pkt_flags field is new with Kestrel's implementation of Tripos, and is currently reserved; it must be set to zero for future compatibility.

Device Control Blocks

A Device Control Block (DCB) serves as the rendezvous point where a thread and a device driver can synchronize.

A DCB typically has the following layout:

+----------------+
| dcb_name       |
+----------------+
| dcb_nitb       |
+----------------+
| fn dcb_init    |
+----------------+
| fn dcb_expunge |
+----------------+
| fn dcb_startio |
+----------------+
| fn dcb_abortio |
+----------------+

Making I/O Requests

One of the first things one might notice is that the DCB lacks a built-in work queue. Instead, device drivers are expected to maintain their own local queues as they see fit.

Why? In classical Tripos, all DCBs have a built-in work queue to keep track of requests. The qpkt system call would hang a packet onto this queue before invoking any of the DCB's procedures. However, this typically leads to situations where the packet would be queued, and the driver would then dequeue the packet right after to place it onto an internally managed queue. All this queueing and dequeueing is just a waste of effort. It's also completely superfluous to devices that are entirely synchronous in nature.

When a task submits an I/O request via a packet, qpkt first checks whether the packet is addressed to a device (versus a task). If so, it notifies the driver of the event by invoking its dcb_startio function. It is the responsibility of this function to enqueue the packet however it sees fit, if it is even necessary to do so. The dcb_startio function is also responsible for starting any I/O operations if the driver was previously idle.

Note: If the I/O request is simple enough, the dcb_startio function may actually complete the I/O request on its own. In this case, the driver may recursively invoke qpkt to place the packet back onto the calling thread's work queue. (qpkt will have already updated the pkt_id field by now.) This facilitates operations that are inherently quick and synchronous in nature, such as status queries and calls into Mantle.
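The synchronous case described in the note can be modeled in plain C. Everything concrete here is an assumption made for the sake of a small, self-contained sketch: the callback signatures, the single device and single task with fixed IDs, the one-slot inbox standing in for a work queue, and the status value returned by the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct packet {                 /* cut-down packet; only fields used here */
    intptr_t pkt_id;
    intptr_t pkt_op;
    intptr_t pkt_res1;
};

struct dcb {                    /* callback signatures are assumptions */
    const unsigned char *dcb_name;
    intptr_t dcb_nitb;          /* reserved; zero for now */
    int (*dcb_init)(struct dcb *);
    int (*dcb_expunge)(struct dcb *);
    int (*dcb_startio)(struct dcb *, struct packet *);
    int (*dcb_abortio)(struct dcb *, struct packet *);
};

#define TASK_ID   1             /* hypothetical: the only task */
#define DEVICE_ID (-1)          /* hypothetical: the only device */

static struct dcb *the_device;          /* registered device, if any */
static struct packet *task_inbox;       /* one-slot stand-in for a work queue */
static intptr_t current_id = TASK_ID;   /* who is calling qpkt right now */

/* Simplified qpkt: swap pkt_id to the sender, then route the packet. */
static int qpkt(struct packet *p)
{
    intptr_t dest = p->pkt_id;

    if (dest == DEVICE_ID) {
        if (the_device == NULL)
            return -1;                  /* invalid destination */
        p->pkt_id = current_id;
        return the_device->dcb_startio(the_device, p);
    }
    if (dest == TASK_ID) {
        p->pkt_id = current_id;
        task_inbox = p;
        return 0;
    }
    return -1;                          /* invalid destination */
}

/* A driver that finishes the request inside dcb_startio and replies by
 * calling qpkt again; pkt_id already names the original caller. */
static int status_startio(struct dcb *d, struct packet *p)
{
    intptr_t saved = current_id;
    int rc;

    (void)d;
    p->pkt_res1 = 42;                   /* made-up status value */
    p->pkt_op = -p->pkt_op;             /* request N becomes reply -N */
    current_id = DEVICE_ID;             /* the driver is the sender now */
    rc = qpkt(p);
    current_id = saved;
    return rc;
}
```

In this model the reply is already sitting in the caller's inbox by the time the outer qpkt returns, so no driver-side queue is involved at all.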

Aborting I/O Requests

A queued I/O request can be aborted by invoking the dqpkt system call. This function will attempt to remove the packet from whichever queue currently holds it.

If the packet is queued up onto the calling task's work queue, it is removed with no further processing, and a success result is returned.

If the packet belongs to another task's work queue, it is removed with no further processing, and a success result is returned.

If the packet belongs to a device's work queue, however, then the dcb_abortio function is invoked. This gives the driver the chance to make whatever hardware changes are necessary to cancel the operation, assuming it has already started working on the request. This includes the possibility of the driver removing the packet from whatever queues it has placed the request on.

In all cases, it is possible for dqpkt to return a failure. When that happens, the calling task has no choice but to wait for the reply to the packet to arrive before it can reuse the packet.

Registering a Device Driver

When adddevice is invoked, the provided DCB is assigned a unique device ID. In addition, the dcb_init function is invoked to complete any device-specific initialization tasks. The dcb_init call may fail, in which case the whole adddevice call will also fail.

If a device driver needs to be unloaded, the remdevice system call is invoked. Eventually, this translates to a call to dcb_expunge. This procedure is responsible for cleaning up any resources acquired during a call to dcb_init. This call must never fail.

The device driver need not have a name to be successfully assigned a device ID. However, names are strongly encouraged for the benefit of Tripos MOUNT operations, as the DEVS:MountList file will refer to device drivers by name. The dcb_name field points to a combined BCPL/C name field in memory.

+-----+-----+-----+-----+-----+-----+-----+-----+
|  6  | 'm' | 'a' | 'n' | 't' | 'l' | 'e' |  0  |
+-----+-----+-----+-----+-----+-----+-----+-----+
   |   \_________________________________/   |
   |                    |                    |

BCPL String         Name Bytes          Terminating
  Length              (UTF-8)               NULL

Why? The combined BCPL/C naming convention is intended to ensure maximum compatibility between a variety of different languages: BCPL, C, Forth, BASIC, Pascal/Modula-family languages, and assembly language.
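Building such a name is straightforward; the following C sketch shows one way. The function names are hypothetical, and the 255-byte limit is an assumption that follows from the single length byte shown in the diagram.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build a combined BCPL/C name in out[]: one length byte, the name
 * bytes, then a terminating NUL. Returns the number of bytes written,
 * or 0 if the name is too long or the buffer is too small. */
static size_t make_dcb_name(unsigned char *out, size_t out_size,
                            const char *name)
{
    size_t len = strlen(name);

    if (len > 255 || out_size < len + 2)
        return 0;
    out[0] = (unsigned char)len;   /* BCPL string length */
    memcpy(out + 1, name, len);    /* name bytes (UTF-8) */
    out[1 + len] = 0;              /* C terminator */
    return len + 2;
}

/* C-side view of the same buffer: skip the length byte. */
static const char *dcb_name_cstr(const unsigned char *name)
{
    return (const char *)(name + 1);
}
```

A BCPL-style consumer reads the first byte as the length, while a C consumer starts one byte later and relies on the terminator; both see the same name bytes.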

Interrupt Driven I/O

At the moment, in the emulation environment, Tripos devices run entirely in user-mode under the Mantle environment. Therefore, there is no mechanism for handling interrupts at this time.

However, future real-hardware Kestrel-2/EX implementations may permit drivers to access interrupts since access to real hardware will then be available. This facility is not yet specified formally; I anticipate a solution similar to Metacomco's Interrupt Transfer Blocks (ITBs) will be used.

For now, set dcb_nitb to zero for future compatibility. This field may find use when interrupt support is completed.

Tripos Kernel Calls

Several function calls are required to be implemented: qpkt, dqpkt, taskwait, findtask, srchwk, adddevice, remdevice, and finddevice.

In addition, functions for manipulating queues would be here. They would be included by reference in a future PRD (possibly PRD-0003).

The following sections provide the minimum viable behaviors, and assume a single-tasking subset of Tripos with a compatible calling interface.

The behavior illustrated below is expressed in a kind of program design language; however, I'm using the method less rigorously than proper PDL use would otherwise suggest. Maybe, someday, I'll fix this.

qpkt

TO queue a packet (p) DO
	Confirm pkt_next is properly cleared.
	IF not THEN
		RETURN failure code for packet link error.
	END
	
	IF packet is addressing a device THEN
		Locate device control block.
		IF dcb is valid THEN
			Switch pkt_id of packet.
			Call device's dcb_startio callback.
			RETURN results from upcall.
		END
		RETURN failure code for invalid destination.
	END

	-- p must be addressing a task, possibly the current task.
	
	IF the packet is not addressing a valid task THEN
		RETURN failure code for invalid destination.
	END
	
	Switch pkt_id of packet.
	Place packet onto tail of task's work queue.
	IF task is currently waiting for a message THEN
		Mark task as ready to run.
	END
	
	IF task has higher priority than current task THEN
		Switch to the addressed task.
	END
	
	RETURN result for success.
END

dqpkt

TO dequeue a packet (p) from (id) DO
	IF we're dequeueing from a device THEN
		IF device ID is valid THEN
			Call device's dcb_abortio callback.
			IF successful THEN
				Clear pkt_next.
			ELSE
				Recursively dequeue a packet p from current task.
			END
			RETURN results either way.
		ELSE
			RETURN failure code for invalid destination.
		END
	END
	
	-- we must be dequeueing from a task's work queue.
	
	IF task ID is valid THEN
		Locate task control block.
		Try to remove packet from task's work queue.
		RETURN result of the removal attempt.
	END
	
	RETURN result for invalid destination.
END

taskwait

TO wait for a message DO
	Locate current task control block.
	IF task has at least one message pending THEN
		Pop the head of the work queue.
		Clear the packet's pkt_next link.
		RETURN the retrieved packet.
	END
	
	Enter state 'waiting for message.'
	Search for work starting with current task.
	
	RETURN but take care not to damage register A0.
END

It should be noted that, yes, a call to taskwait can block indefinitely. This is perhaps a design concern; however, historically, this has not presented a problem.

findtask

TO find the current task DO
	RETURN current task ID.
END

Design issue: historically, FindTask() in AmigaDOS answers with the current task control block. findtask() in cintpos does similarly. However, it seems like finding the current id is more meaningful, since most APIs use the ID of the task, not the address of its TCB. Will changing this semantic be a concern going forward?

srchwk

This system call seems to require a privileged mode to function, since register manipulation at this level requires visibility into the current task's running state.

TO search for work starting with task (tcb) DO
	Preserve register set for currently running task.

	Start with provided task control block.
	WHILE task reference remains valid DO
		IF currently selected task is waiting for a message
		AND has a pending message THEN
			Change task state to 'ready to run'.
			Set task register A0 to packet at head of work queue.
			Pop the work queue.
			-- fall through to next IF clause intended.
		END
		IF currently selected task is ready to run THEN
			Restore all registers, including stack and PC.
			-- implicitly returns back into userspace.
		END
		
		Advance to the next lower-priority task.
	END
	-- Idle task is ALWAYS runnable, so impossible for loop to fail.
END

adddevice

TO add a new device (dcb) DO
	Locate a free device ID.
	IF none found THEN
		RETURN results for out of resources.
	END
	
	Call dcb's initialization procedure.
	IF not successful THEN
		RETURN results.
	END
	
	Bind provided dcb with the device ID.
	RETURN new device ID.
END

remdevice

TO remove a device (id) DO
	IF device ID is invalid THEN
		RETURN results for invalid device ID.
	END
	
	IF device ID is already marked unused THEN
		RETURN results for success.
	END
	
	Locate corresponding dcb.
	Call device's expunge function.
	IF successful THEN
		Mark device ID as unused.
	END
	RETURN results.
END

finddevice

TO find a device driver named (name) DO
	FOR EACH valid dcb DO
		IF dcb's name matches that provided THEN
			RETURN ID of corresponding dcb.
		END
	END
	RETURN results for failure to find named resource.
END
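The three device-management calls above can be approximated with a small fixed-size table. This is a sketch only: the table size, the error codes, the use of array indices as device IDs, the convention that callbacks return 0 on success, and the plain C string in dcb_name are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVICES      4      /* hypothetical table size */
#define ERR_NO_RESOURCES (-1)   /* hypothetical error codes */
#define ERR_BAD_DEVICE   (-2)
#define ERR_NOT_FOUND    (-3)

struct dcb {
    const char *dcb_name;               /* plain C string, for brevity */
    int (*dcb_init)(struct dcb *);      /* assumed: 0 on success */
    int (*dcb_expunge)(struct dcb *);   /* assumed: 0 on success */
};

static struct dcb *device_table[MAX_DEVICES];   /* NULL marks an unused ID */

static int adddevice(struct dcb *d)
{
    for (int id = 0; id < MAX_DEVICES; id++) {
        if (device_table[id] == NULL) {
            int rc = d->dcb_init(d);
            if (rc != 0)
                return rc;              /* init failed; nothing is bound */
            device_table[id] = d;
            return id;
        }
    }
    return ERR_NO_RESOURCES;
}

static int remdevice(int id)
{
    int rc;

    if (id < 0 || id >= MAX_DEVICES)
        return ERR_BAD_DEVICE;
    if (device_table[id] == NULL)
        return 0;                       /* already unused */
    rc = device_table[id]->dcb_expunge(device_table[id]);
    if (rc == 0)
        device_table[id] = NULL;
    return rc;
}

static int finddevice(const char *name)
{
    for (int id = 0; id < MAX_DEVICES; id++) {
        struct dcb *d = device_table[id];
        if (d != NULL && d->dcb_name != NULL && strcmp(d->dcb_name, name) == 0)
            return id;
    }
    return ERR_NOT_FOUND;
}

/* Trivial callbacks for demonstration. */
static int ok_init(struct dcb *d)    { (void)d; return 0; }
static int fail_init(struct dcb *d)  { (void)d; return ERR_NO_RESOURCES; }
static int ok_expunge(struct dcb *d) { (void)d; return 0; }
```

Note that a driver whose dcb_init fails never occupies a slot, so finddevice cannot return it, and removing an already-unused ID is treated as success, as in the pseudocode.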

Idle Task

In addition to the above system calls, we need an idle task which sits in a tight loop. Its purpose is to "return" to Mantle when there's nothing to do, and dispatch events to various system-recognized "device drivers".

TASK idle DO
	Initialize idle task environment.
	
	FOREVER DO
		Invoke Mantle function ecNextEvent.
		Dispatch event appropriately.
	END
END

Note that dispatching events implies the possibility of calling srchwk with a task that has higher priority than the idle task (directly or indirectly). For example, and I'm just hand-waving here, let's suppose the idle task gets an mtKeyDown event that some other task is waiting for. The control flow for this might look something like this:

-- hypothetical code!!  Beware!

	IF event_type = mtKeyDown THEN
		-- we're properly part of the "rawkey" device driver here.
		IF rawkeydevice.has_pending_read_request() THEN
			read_request := pop_rawkey_queue()
			ReplyToPacket(read_request, ...other stuff here...);
			qpkt(read_request);  -- remember qpkt() already switched pkt_id for us!!
			-- because qpkt() wakes up a higher priority task,
			-- we switch tasks right here and now.
		END
	END