US7360221B2 - Task swap out in a multithreaded environment - Google Patents


Info

Publication number
US7360221B2
Authority
US
United States
Prior art keywords
stream
task
operating system
team
master
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime, expires
Application number
US10/659,407
Other versions
US20040088711A1
Inventor
Gail A. Alverson
Charles David Callahan II
Susan L. Coatney
Brian D. Koblenz
Richard D. Korry
Burton J. Smith
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cray Inc
Original Assignee
Cray Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cray Inc
Priority to US10/659,407
Publication of US20040088711A1
Application granted
Publication of US7360221B2
Assigned to TERA COMPUTER COMPANY. Assignors: SMITH, BURTON J.; ALVERSON, GAIL A.; CALLAHAN, CHARLES DAVID, II; COATNEY, SUSAN L.; KOBLENZ, BRIAN D.; KORRY, RICHARD D.
Assigned to CRAY INC. (change of name from TERA COMPUTER COMPANY)
Adjusted expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/461 - Saving or restoring of program or task context
    • G06F 9/48 - Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 - Task transfer initiation or dispatching
    • G06F 9/4812 - Task transfer initiation or dispatching by interrupt, e.g. masked
    • G06F 9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/485 - Task life-cycle, e.g. stopping, restarting, resuming execution
    • G06F 9/52 - Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F 2209/00 - Indexing scheme relating to G06F 9/00
    • G06F 2209/48 - Indexing scheme relating to G06F 9/48
    • G06F 2209/481 - Exception handling

Definitions

  • the present invention relates to an interface between a user program and an operating system and, more particularly, to such an interface in a multithreaded environment.
  • Parallel computer architectures generally provide multiple processors that can each be executing different tasks simultaneously.
  • One such parallel computer architecture is referred to as a multithreaded architecture (MTA).
  • the MTA supports not only multiple processors but also multiple streams executing simultaneously in each processor.
  • the processors of an MTA computer are interconnected via an interconnection network. Each processor can communicate with every other processor through the interconnection network.
  • FIG. 1 provides a high-level overview of an MTA computer.
  • Each processor 101 is connected to the interconnection network and memory 102 .
  • Each processor contains a complete set of registers 101 a for each stream.
  • each processor also supports multiple protection domains 101 b so that multiple user programs can be executing simultaneously within that processor.
  • Each MTA processor can execute multiple threads of execution simultaneously. Each thread of execution executes on one of the 128 streams supported by an MTA processor. Every clock time period, the processor selects a stream that is ready to execute and allows it to issue its next instruction. Instruction interpretation is pipelined by the processor, the network, and the memory. Thus, a new instruction from a different stream may be issued in each time period without interfering with other instructions that are in the pipeline. When an instruction finishes, the stream to which it belongs becomes ready to execute the next instruction. Each instruction may contain up to three operations (i.e., a memory reference operation, an arithmetic operation, and a control operation) that are executed simultaneously.
  • the state of a stream includes one 64-bit Stream Status Word (“SSW”), 32 64-bit General Registers (“R0-R31”), and eight 32-bit Target Registers (“T0-T7”).
  • the MTA uses program addresses that are 32 bits long.
  • the lower half of an SSW contains the program counter (“PC”) for the stream.
  • the upper half of the SSW contains various mode flags (e.g., floating point rounding, lookahead disable), a trap disable mask (e.g., data alignment and floating point overflow), and the four most recently generated condition codes.
  • the 32 general registers are available for general-purpose computations. Register R0 is special, however, in that it always contains a 0. The loading of register R0 has no effect on its contents.
  • the instruction set of the MTA processor uses the eight target registers as branch targets. However, most control transfer operations only use the low 32 bits to determine a new program counter.
  • one target register (T0) points to the trap handler, which may be an unprivileged program.
  • when a trap occurs, the trapping stream starts executing instructions at the program location indicated by register T0.
  • Trap handling is lightweight and independent of the operating system and other streams.
  • a user program can install trap handlers for each thread to achieve specific trap capabilities and priorities without loss of efficiency.
  • Each MTA processor supports as many as 16 active protection domains that define the program memory, data memory, and number of streams allocated to the computations using that processor.
  • Each executing stream is assigned to a protection domain, but which domain (or which processor, for that matter) need not be known by the user program.
  • the MTA divides memory into program memory, which contains the instructions that form the program, and data memory, which contains the data of the program.
  • the MTA uses a program mapping system and a data mapping system to map addresses used by the program to physical addresses in memory.
  • the mapping systems use a program page map and a data segment map.
  • the entries of the data segment map and program page map specify the location of the segment in physical memory along with the level of privilege needed to access the segment.
  • the number of streams available to a program is regulated by three quantities: the stream limit (slim), the current number of streams (scur), and the number of reserved streams (sres) associated with each protection domain.
  • the current number of streams executing in the protection domain is indicated by scur; it is incremented when a stream is created and decremented when a stream quits.
  • a create can only succeed when the incremented scur does not exceed sres, the number of streams reserved in the protection domain.
  • the operations for creating, quitting, and reserving streams are unprivileged. Several streams can be reserved simultaneously.
  • the stream limit slim is an operating system limit on the number of streams the protection domain can reserve.
  • a stream executes a CREATE operation to create a new stream.
  • the CREATE operation increments scur, initializes the SSW for the new stream based on the SSW of the creating stream and an offset in the CREATE operation, loads register T0, and loads three registers of the new stream from general-purpose registers of the creating stream.
  • the MTA processor can then start executing the newly created stream.
  • a QUIT operation terminates the stream that executes it and decrements both sres and scur.
  • a QUIT_PRESERVE operation only decrements scur, which gives up a stream without surrendering its reservation.
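The slim/scur/sres accounting described above can be sketched as a small C model. The `domain_t` structure and function names are illustrative inventions for this sketch; on the MTA these are hardware operations, not C functions.

```c
#include <stdbool.h>

/* Illustrative model of per-protection-domain stream accounting.
 * slim: OS-imposed reservation limit; sres: streams reserved;
 * scur: streams currently executing. */
typedef struct {
    int slim, sres, scur;
} domain_t;

/* STREAM_RESERVE: may not push reservations past the OS limit. */
bool stream_reserve(domain_t *d, int n) {
    if (d->sres + n > d->slim) return false;
    d->sres += n;
    return true;
}

/* CREATE: succeeds only while scur stays within the reservation. */
bool stream_create(domain_t *d) {
    if (d->scur + 1 > d->sres) return false;
    d->scur++;
    return true;
}

/* QUIT: terminates the stream, giving up both the stream and its
 * reservation. */
void stream_quit(domain_t *d) { d->scur--; d->sres--; }

/* QUIT_PRESERVE: gives up the stream without surrendering its
 * reservation. */
void stream_quit_preserve(domain_t *d) { d->scur--; }
```

Note how QUIT_PRESERVE leaves sres untouched, so the domain can later CREATE another stream without re-reserving.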
  • the MTA supports four levels of privilege: user, supervisor, kernel, and IPL, in increasing order of privilege. The IPL level is the highest privilege level. All levels use the program page and data segment maps for address translation.
  • the data segment map entries define the minimum levels needed to read and write each segment, and the program page map entries define the exact level needed to execute from each page. Each stream in a protection domain may be executing at a different privilege level.
  • a “LEVEL_ENTER lev” operation sets the current privilege level to the program page map level if the current level is equal to lev.
  • the LEVEL_ENTER operation is located at every entry point that can accept a call from a different privilege level.
  • a trap occurs if the current level is not equal to lev.
  • the “LEVEL_RETURN lev” operation is used to return to the original privilege level.
  • a trap occurs if lev is greater than the current privilege level.
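The LEVEL_ENTER and LEVEL_RETURN checks can be sketched as follows; the trap is modeled as a return code, and the structure names are assumptions of the sketch, not the MTA instruction encoding.

```c
/* Illustrative sketch of the privilege-level checks for LEVEL_ENTER
 * and LEVEL_RETURN; a "trap" is modeled as returning -1. */
enum level { USER = 0, SUPERVISOR = 1, KERNEL = 2, IPL = 3 };

typedef struct { enum level current; } stream_t;

/* LEVEL_ENTER lev: traps unless the current level equals lev; on
 * success the level becomes the program page map level of the entry
 * point. */
int level_enter(stream_t *s, enum level lev, enum level page_map_level) {
    if (s->current != lev) return -1;     /* trap */
    s->current = page_map_level;
    return 0;
}

/* LEVEL_RETURN lev: traps if lev is greater than the current level. */
int level_return(stream_t *s, enum level lev) {
    if (lev > s->current) return -1;      /* trap */
    s->current = lev;
    return 0;
}
```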
  • An exception is an unexpected condition raised by an event that occurs in a user program, the operating system, or the hardware. These unexpected conditions include various floating point conditions (e.g., divide by zero), the execution of a privileged operation by a non-privileged stream, and the failure of a stream create operation.
  • Each stream has an exception register. When an exception is detected, then a bit in the exception register corresponding to that exception is set. If a trap for that exception is enabled, then control is transferred to the trap handler whose address is stored in register T0. If the trap is currently disabled, then control is transferred to the trap handler when the trap is eventually enabled, assuming that the bit is still set in the exception register.
  • the operating system can execute an operation to raise a domain_signal exception in all streams of a protection domain. If the trap for the domain_signal is enabled, then each stream will transfer control to its trap handler.
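The deferred-trap behavior of the exception register can be sketched with two bitmasks; the type and function names here are hypothetical, chosen only to illustrate that a disabled trap is delivered later if the exception bit is still set when the trap is re-enabled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative model of a per-stream exception register plus a trap
 * enable mask. */
typedef struct {
    uint64_t exceptions;   /* one bit per exception condition */
    uint64_t enabled;      /* trap enable mask */
} exc_state_t;

void raise_exception(exc_state_t *s, int bit) { s->exceptions |= 1ULL << bit; }

/* True when an enabled exception bit is pending, i.e. control should
 * transfer to the trap handler addressed by register T0. */
bool trap_pending(const exc_state_t *s) {
    return (s->exceptions & s->enabled) != 0;
}

void enable_trap(exc_state_t *s, int bit)  { s->enabled |= 1ULL << bit; }
void disable_trap(exc_state_t *s, int bit) { s->enabled &= ~(1ULL << bit); }
```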
  • Each memory location in an MTA computer has four access state bits in addition to a 64-bit value.
  • These access state bits allow the hardware to implement several useful modifications to the usual semantics of memory reference.
  • These access state bits are two data trap bits, one full/empty bit, and one forward bit.
  • the two data trap bits allow for application-specific lightweight traps, the forward bit implements invisible indirect addressing, and the full/empty bit is used for lightweight synchronization.
  • the behavior of these access state bits can be overridden by a corresponding set of bits in the pointer value used to access the memory.
  • the two data trap bits in the access state are independent of each other and are available for use, for example, by a language implementer. If a trap bit is set in a memory location and the corresponding trap bit is not disabled in the pointer, then an exception is raised and a trap occurs whenever that location is accessed.
  • the forward bit implements a kind of “invisible indirection.” Unlike normal indirection, forwarding is controlled by both the pointer and the location pointed to. If the forward bit is set in the memory location and forwarding is not disabled in the pointer, the value found in the location is interpreted as a pointer to the target of the memory reference rather than the target itself. Dereferencing continues until either the pointer found in the memory location disables forwarding or the addressed location has its forward bit cleared.
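Forward-bit chasing can be sketched as a loop over modeled memory words. The `word_t` layout below is an invention of the sketch (real MTA words pack the forward bit as an access state bit beside a 64-bit value); it shows only the termination rule: dereferencing stops when the pointer disables forwarding or the forward bit is clear.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative model of a memory word with a forward access-state
 * bit. When forwarding applies, the word's contents are interpreted
 * as a pointer to the real target. */
typedef struct word {
    bool forward;          /* forward access-state bit */
    long long value;       /* data, when not forwarding */
    struct word *target;   /* modeled indirection target */
} word_t;

/* Follow forwarding until either the pointer disables it or a word
 * with the forward bit cleared is reached; return that word. */
word_t *resolve(word_t *loc, bool forwarding_disabled_in_pointer) {
    while (loc->forward && !forwarding_disabled_in_pointer)
        loc = loc->target;
    return loc;
}
```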
  • the full/empty bit supports synchronization behavior of memory references.
  • the synchronization behavior can be controlled by the full/empty control bits of a pointer or of a load or store operation.
  • the four values for the full/empty control bits are shown below:

        VALUE  MODE      LOAD                          STORE
        0      normal    read regardless               write regardless and set full
        1      reserved  reserved                      reserved
        2      future    wait for full and leave full  wait for full and leave full
        3      sync      wait for full and set empty   wait for empty and set full
  • in future mode, loads and stores wait for the full/empty bit of the memory location to be accessed to be set to full before the memory location can be accessed.
  • in sync mode, loads are treated as “consume” operations and stores are treated as “produce” operations: a load waits for the full/empty bit to be set to full and then sets the full/empty bit to empty as it reads, and a store waits for the full/empty bit to be set to empty and then sets the full/empty bit to full as it writes.
  • a memory reference that encounters a forwarded location (i.e., its forward bit is set), where forwarding is not disabled (i.e., by the access control of a pointer) and the location is empty (i.e., its full/empty bit is set to empty), blocks until the location becomes full.
  • the full/empty bit may be used to implement arbitrary indivisible memory operations.
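Sync-mode semantics can be sketched as a single-threaded model. Where a real MTA access would block or retry in hardware, the sketch simply returns false; the names are hypothetical.

```c
#include <stdbool.h>

/* Illustrative model of sync-mode loads and stores on a word with a
 * full/empty bit. */
typedef struct {
    bool full;
    long long value;
} sync_word_t;

/* sync load: consumes only a full word, leaving it empty. */
bool sync_load(sync_word_t *w, long long *out) {
    if (!w->full) return false;   /* would block waiting for full */
    *out = w->value;
    w->full = false;
    return true;
}

/* sync store: produces only into an empty word, leaving it full. */
bool sync_store(sync_word_t *w, long long v) {
    if (w->full) return false;    /* would block waiting for empty */
    w->value = v;
    w->full = true;
    return true;
}
```

A producer that stores into a full word and a consumer that loads from an empty word both wait, which is the lightweight synchronization the text describes.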
  • the MTA also provides a single operation that supports extremely brief mutual exclusion during “integer add to memory.”
  • the FETCH_ADD operation loads the value from a memory location and stores the sum of that value and another value back into the memory location.
  • Each protection domain has a retry limit that specifies how many times a memory access can fail in testing full/empty bit before a data blocked exception is raised. If the trap for the data blocked exception is enabled, then a trap occurs. The trap handler can determine whether to continue to retry the memory access or to perform some other action. If the trap is not enabled, then the next instruction after the instruction that caused the data blocked exception is executed.
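The FETCH_ADD operation and the retry limit can be combined into one illustrative sketch. Coupling FETCH_ADD to the full/empty test is a simplification for illustration, and the data blocked exception is modeled as a false return; none of these names are the MTA's actual interface.

```c
#include <stdbool.h>

/* Illustrative model: FETCH_ADD loads the value from a location and
 * stores back the sum, retrying the full/empty test up to the
 * domain's retry limit before "raising" a data blocked exception
 * (modeled as returning false). */
typedef struct { bool full; long long value; } mem_word_t;

bool fetch_add(mem_word_t *w, long long addend, long long *old,
               int retry_limit) {
    for (int tries = 0; tries <= retry_limit; tries++) {
        if (w->full) {
            *old = w->value;          /* load the old value ... */
            w->value += addend;       /* ... and store the sum back,
                                         as one indivisible operation */
            return true;
        }
        /* full/empty test failed; retry */
    }
    return false;                     /* data blocked exception */
}
```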
  • the appendix contains the “Principles of Operation” of the MTA, which provides a more detailed description of the MTA.
  • Embodiments of the present invention provide a method and system for placing a task with multiple threads in a known state, such as a quiescent state.
  • each thread of the task is notified that it should enter the known state.
  • each of the threads then enters the known state.
  • the known state of the task may be the execution of idle instructions by each of the threads or by stopping the execution of instructions by the threads (e.g., quitting the streams).
  • the notification may be by raising a domain signal for the protection domain in which the task is executing.
  • the notification may also be initiated by the task itself by, for example, sending a request to the operating system.
  • prior to entering the known state, the threads may save their state information so that when the known state is exited the threads can restore their saved state and continue execution.
  • in response to receiving the notification, the task may also notify the operating system that the task is blocked from further productive use of the processor until an event occurs. In this way, rather than having the task continue to execute idle instructions (e.g., instructions that loop checking for an event to occur), the operating system may assign the processor to another task. The operating system may also defer re-assigning the processor to the task until an event occurs that is directed to that task.
  • various actions can be performed relative to the task. For example, the operating system may assign the processor resources used by that task to another task.
  • a debugger which may be executing as one of the threads of the task, can access the state information saved by the other threads of the task.
  • a designated thread of the task may also process operating system signals when the other threads of the task are in the known state. After the signals are processed by the thread, the other threads can be allowed to exit the known state. More generally, after the actions to be performed while the task is in the known state, then the threads of the task can exit the known state.
  • a task that has entered a known state may exit the known state by receiving a notification to exit the known state. Upon receiving the notification, each thread exits the known state by executing instructions that were to be executed prior to entering the known state or more generally continuing with productive work (e.g., non-idle instructions).
  • one thread may be designated as a master thread for causing the other threads to exit their known state (e.g., creating streams). The master thread may also perform signal processing prior to allowing the other threads to exit their known state.
  • One embodiment of the present invention provides a method in a multithreaded computer for preparing a task to be “swapped out” from processor utilization by placing the task in a known state.
  • the computer has a processor with multiple streams for executing threads of the task.
  • the task designates one stream that is executing a thread to be a master stream.
  • the task then saves the state of each stream that is executing a thread.
  • after saving each stream's state, the task quits that stream.
  • under control of the master stream, the task notifies the operating system that the task is ready to be swapped out.
  • the operating system can then swap the task out from processor utilization.
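The swap-out protocol above can be sketched as a sequential simulation: each non-master stream saves its thread's state and quits, and the master notifies the operating system once it is the last stream left. All names here are hypothetical, and the real protocol's busy-waiting is collapsed into a simple check.

```c
#include <stdbool.h>

/* Illustrative, sequential model of the swap-out handshake. */
typedef struct {
    int live_streams;      /* streams still executing in the domain */
    int saved_states;      /* count of saved thread states */
    bool os_notified;      /* master told the OS the task is ready */
} task_t;

/* A non-master stream: save this thread's state, then QUIT. */
void slave_swap_out(task_t *t) {
    t->saved_states++;
    t->live_streams--;
}

/* The master stream: save its state, wait for every other stream to
 * quit, then notify the operating system. */
void master_swap_out(task_t *t) {
    t->saved_states++;
    if (t->live_streams == 1)   /* only the master remains */
        t->os_notified = true;
}
```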
  • the method prepares a task that is executing on a computer with multiple processors.
  • the task has one or more “teams” of threads where each team represents threads executing on a single processor.
  • the task designates, for each team, one stream that is executing a thread to be a team master stream.
  • the task then designates one stream that is executing a thread to be a task master stream.
  • For each team master stream, the task notifies the operating system that the team is ready to be swapped out when each other thread of the team has quit its stream. Finally, for the task master stream, the task notifies the operating system that the task is ready to be swapped out when each of the other teams has notified the operating system that the team is ready to be swapped out.
  • the server initially assigns a resource to a client.
  • the server receives notification from the client assigned to the resource that the client is waiting for an occurrence of an event before the resource can be productively used.
  • upon receiving the notification, the server un-assigns the resource from the client and does not reassign that resource to the client until after the event occurs.
  • the server is an operating system
  • the clients are tasks
  • the resource is a processor or protection domain.
  • the server may receive the notification in response to a request that the task save its state information prior to having that resource un-assigned. After that external event occurs, the server can then reassign the resource to the task.
  • Another aspect of the present invention provides a method in a computer system for returning to a task a stream that is executing an operating system call that is blocked.
  • the computer system has a processor with multiple streams.
  • the operating system executing on a stream invokes a function provided by the task.
  • the invoked function then executes instructions on that stream to effect the return of the stream to the task.
  • the operating system then notifies the task when the operating system call is complete.
  • the task can then continue the execution of the thread that invoked the blocking operating system call.
  • the present invention assigns a processor resource to a task after a thread of the task invokes an operating system call that will block waiting for the occurrence of an event.
  • the operating system invokes a routine of the task so that that routine can assign the processor resource to another thread of the task. In this way, the task can continue to execute other threads even though one of its threads may be blocked on an operating system call.
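The stream-return mechanism can be sketched as simple bookkeeping: the upcall parks the blocked thread and hands the stream to another runnable thread, and the completion notification makes the parked thread runnable again. The `runtime_t` structure and function names are inventions of this sketch, loosely echoing the rt_return_* routines named in the figures.

```c
/* Illustrative model of returning a stream to a task when an
 * operating system call blocks. */
typedef struct {
    int runnable_threads;   /* threads waiting for a stream */
    int running_threads;    /* threads currently on streams */
    int blocked_on_os;      /* threads parked in blocked OS calls */
} runtime_t;

/* Upcall invoked by the OS on the blocked thread's stream: park the
 * thread and reuse the stream for another runnable thread, if any. */
void rt_return_stream(runtime_t *rt) {
    rt->running_threads--;
    rt->blocked_on_os++;
    if (rt->runnable_threads > 0) {
        rt->runnable_threads--;
        rt->running_threads++;   /* stream stays busy with new work */
    }
}

/* OS completion notification: the blocked thread becomes runnable
 * and can be rescheduled onto a stream later. */
void os_call_complete(runtime_t *rt) {
    rt->blocked_on_os--;
    rt->runnable_threads++;
}
```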
  • Another aspect of the present invention provides a method in a computer system for performing an inter-thread long jump from a long jump thread to a set jump thread.
  • the long jump thread receives an indication of a set jump location that was set by the set jump thread.
  • the long jump thread determines whether the set jump thread is the same thread that is currently executing.
  • if it is not, the long jump thread sets the state of the set jump thread to next execute a long jump indicating the set jump location.
  • otherwise, an intra-thread long jump is performed.
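The inter-thread decision above can be sketched as follows. The thread representation and the integer "set jump location" are inventions of the sketch; a real implementation would patch the target thread's saved program counter, as the detailed description explains later.

```c
#include <stdbool.h>

/* Illustrative model of the inter-thread long jump decision. */
typedef struct {
    int id;
    int pending_jump;   /* 0 = none; else a set-jump location the
                           thread must long-jump to when it next runs */
} thread_t;

/* Returns true when the jump can be taken directly (intra-thread);
 * otherwise arranges for the set jump thread to perform it next. */
bool do_longjmp(thread_t *current, thread_t *setjmp_thread, int location) {
    if (current->id == setjmp_thread->id)
        return true;                          /* intra-thread case */
    setjmp_thread->pending_jump = location;   /* deferred long jump */
    return false;
}
```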
  • FIG. 1 provides a high-level overview of the MTA.
  • FIG. 2 is a block diagram illustrating components of the operating system and user programs in one embodiment.
  • FIG. 3 is a flow diagram of the primary exception handler routine.
  • FIG. 4 is a flow diagram of the domain_signal_handler routine.
  • FIGS. 5A and 5B are flow diagrams of the last_stream_domain_signal_handler routine.
  • FIG. 6 is a flow diagram of the work_of_final_stream_in_task function.
  • FIG. 7 is a flow diagram of the process_signals function.
  • FIG. 8 is a flow diagram of the swap_restart_stream function.
  • FIG. 9 is a flow diagram of the slave_return_from_swap routine.
  • FIG. 10 is a block diagram of data structures used when swapping a task.
  • FIG. 11 is a flow diagram of the user_entry_stub routine.
  • FIG. 12 is a flow diagram of the rt_return_vp function.
  • FIG. 13 is a flow diagram of the rt_return_thread function.
  • FIG. 14 is a flow diagram of the tera_return_stream operating system call.
  • FIG. 15 is a flow diagram of a trap handler routine for handling data blocked exceptions that are raised when waiting for an operating system call to complete.
  • FIG. 16A is a diagram illustrating the synchronization of the user program and the operating system when the user program invokes an operating system call that blocks.
  • FIG. 16B illustrates the Upcall Transfer (ut) data structure.
  • FIG. 17 is a flow diagram of the basic longjmp routine.
  • FIG. 18 is a flow diagram of the indirect_longjmp routine.
  • FIG. 19 is a flow diagram of the processing performed when the state of the thread is “blocked.”
  • FIG. 20 is a flow diagram of the processing performed when the state of the thread is running.
  • FIG. 21 is a flow diagram of the check_on_blocked_os_call routine.
  • Embodiments of the present invention provide an interface between a user program and an operating system in an MTA computer.
  • the user program cooperates with the operating system in saving the state of the user program when the operating system wants to allocate the protection domain in which the user program is executing to another user program so that the other user program may start executing its instructions.
  • the operating system allows each user program to execute for a certain time slice or quantum before “swapping out” the user program from its protection domain.
  • the operating system notifies the user program when the quantum expires.
  • Each stream that is allocated to that user program receives the notification. Upon receiving the notification, each stream saves its state and quits except for one stream that is designated as a master stream.
  • the master stream saves its state and waits for all the other streams to quit.
  • the master stream then notifies the operating system that the user program is ready to be swapped out of its protection domain.
  • the master stream also notifies the operating system of the number of streams that were created (or alternatively reserved) when the quantum expired.
  • the operating system decides to allow the user program to start executing again (i.e., be “swapped in”), the operating system restarts the thread that was executing in the master stream. That thread then creates the other streams and restarts each of the threads executing where they left off using the saved state.
  • the operating system may defer swapping in the user program until sufficient streams (as indicated by the user program when it was swapped out) are available so that when the user program is swapped in, it can create the same number of streams it quit when swapping out.
  • the operating system returns streams to the user program when the thread that was executing on the stream is blocked on an operating system call.
  • Each user program may be limited to a certain number of streams by the operating system.
  • a user program can create streams up to this limit and start different threads executing in each of the created streams.
  • when a thread invokes an operating system call, the operating system starts executing on the same stream on which the thread was executing.
  • if the operating system call blocks (e.g., waiting for user input), the operating system returns that stream to the user program so that the user program can schedule another thread to execute on that stream.
  • the operating system eventually notifies the user program when the operating system call completes, and the user program can restart the thread that was blocked on that operating system call. In this way, the user program can continue to use all of its created streams even though a thread is blocked on an operating system call.
  • Unix-type set jump and long jump inter-thread behavior is supported.
  • a set jump function stores the current state of the stream in a set jump buffer.
  • the current state includes the return address for that invocation of the set jump function.
  • the long jump function deallocates memory (e.g., stack frames) allocated since the set jump function was invoked, restores the stream state stored in the set jump buffer, and jumps to the return address.
  • when the long jump function is invoked by a thread (“the long jump thread”) different from the thread (“the set jump thread”) that invoked the set jump function, the long jump function first locates the state information for the set jump thread.
  • the long jump function then sets the program counter in that state information to point to an instruction that invokes the long jump function passing the set jump buffer.
  • otherwise, an intra-thread long jump is performed.
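For reference, the intra-thread case is exactly what the standard C interface provides; the sketch below uses plain setjmp/longjmp (the inter-thread case described above cannot be expressed this way and requires the state-patching the text describes).

```c
#include <setjmp.h>

/* Minimal intra-thread set jump / long jump: setjmp stores the stream
 * state in the jump buffer, and longjmp unwinds back to it. */
static jmp_buf env;
static int steps;

static void deep_work(void) {
    steps++;
    longjmp(env, 42);   /* unwind back to the setjmp point */
    steps++;            /* never reached */
}

int run(void) {
    int rc = setjmp(env);   /* returns 0 at first, 42 after longjmp */
    if (rc == 0)
        deep_work();
    return rc;
}
```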
  • FIG. 2 is a block diagram illustrating components of the operating system 210 and user programs 220 in one embodiment.
  • the operating system includes a processor scheduler 211 , a task list 212 , and instructions implementing various operating system calls 213 .
  • the processor scheduler assigns tasks to execute in the protection domains of a processor; such tasks are referred to as active tasks.
  • the term “task” refers to a running user program that may currently be either active or inactive. Periodically (e.g., when a time quantum expires), the processor scheduler determines whether an inactive task should be made active. If all the protection domains are already assigned to active tasks, then the operating system will swap out an active task, making it inactive, and swap in an inactive task, making it active.
  • a task comprises one or more teams, and each team comprises one or more threads of execution.
  • Each user program 220 includes user code 221 and a user runtime 222 .
  • the user code is the application-specific code of the user program, and the user runtime is code provided to assist the user program in managing the scheduling of threads to streams.
  • the user runtime includes virtual processor code 223 and a thread list 224 .
  • the virtual processor code is responsible for deciding which thread to assign to the stream on which the virtual processor code is executing. When a task creates a stream, the virtual processor code is executed to select which thread should be assigned to that stream. When a thread completes, the virtual processor code also is executed to determine the next thread to assign to that stream. If threads are not currently available to assign to the stream, the virtual processor code may quit the stream so that the stream can be assigned to another task.
  • the user runtime also provides standard trap handlers for handling various exceptions with a standard behavior. The user code can override the standard behaviors by providing customized trap handlers for various exceptions.
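The virtual processor's scheduling decision can be sketched as a ready-queue pop: pick the next runnable thread for the stream, or signal that the stream should quit when none is available. The queue and names are inventions of the sketch.

```c
#include <stddef.h>

/* Illustrative model of the virtual processor's thread selection. */
typedef struct thread {
    struct thread *next;
} uthread_t;

typedef struct {
    uthread_t *ready_head;   /* queue of runnable threads */
} vp_t;

/* Returns the thread to run on this stream, or NULL meaning the
 * virtual processor code should QUIT the stream so it can be
 * assigned to another task. */
uthread_t *vp_next_thread(vp_t *vp) {
    uthread_t *t = vp->ready_head;
    if (t != NULL)
        vp->ready_head = t->next;
    return t;
}
```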
  • the processor scheduler of the operating system coordinates the allocation of the processor to the various tasks that are currently ready to be executed. As described above, each processor has 16 protection domains and can thus simultaneously execute up to 15 tasks, with the operating system executing in the remaining domain.
  • the processor scheduler allows each task to execute for a certain time quantum. When the time quantum expires for a task, the processor scheduler raises the domain_signal for the protection domain of that task to initiate a swap out for that task.
  • the swapping in and swapping out of tasks requires cooperation on the part of the task.
  • the operating system asks the task to save its state and quit all but one of its streams. The one remaining stream then notifies the operating system that the state of the task has been saved and that another task can be swapped into that protection domain. If the task ignores the notification, then the operating system can abort the task.
  • the operating system notifies the task of the impending swap out by raising the domain_signal, which causes each stream of that task to trap (assuming the domain_signal trap is enabled) and to start executing its primary trap handler, whose address is stored in register T0.
  • the primary trap handler saves the state of the thread executing on that stream and then invokes a domain_signal_handler routine.
  • the task may be executing on multiple streams and on multiple processors. To ensure that the state of all executing threads are properly saved and that the task quits all its streams in an orderly manner, each team of the task designates one of the streams executing a thread of the task to be a team master stream, and the team master streams designate one of the team master streams to be a task master stream.
  • the team master stream is the stream that first increments a team master variable
  • the task master stream is the team master stream that first notifies (or alternatively that last notifies) the operating system that its team is ready to be swapped out.
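The fetch-and-add election of a team master stream can be sketched as follows. This is a minimal illustration, not the patent's actual code; the counter and function names are assumptions. The stream that observes the old counter value 0 becomes the team master; every other stream learns it is a slave.

```c
/* Sketch: electing a team master stream with a single fetch-and-add.
 * Each stream of a team atomically increments a per-team counter; the
 * stream that sees the old value 0 is the team master, all others are
 * slaves.  Names are illustrative assumptions. */
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int team_master_counter = 0;

/* Returns true for exactly one caller: the first to fetch-and-add. */
bool claim_team_master(void)
{
    return atomic_fetch_add(&team_master_counter, 1) == 0;
}
```

Because fetch-and-add is atomic, exactly one stream wins the election even when every stream of the team runs the trap handler concurrently.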
  • Each team master stream waits for all other streams of the team to quit and then performs some clean-up processing before notifying the operating system that all the other streams of the team have quit and that the team is ready to be swapped out.
  • the task master stream waits until all the team master streams have notified the operating system and performs some clean-up processing for the task before notifying the operating system that the task is ready to be swapped out.
  • the team master streams and the task master stream notify the operating system by invoking an operating system call. The operating system then takes control of the last stream in each team and can start another task executing on that stream as part of swapping in that other task.
  • a task master stream processes any Unix signals that have arrived and then releases all the other team master streams to restore the saved states.
  • Each team master stream creates a stream for each thread that was running when the task was swapped out and sets the state of the created streams to the saved states of the threads.
  • FIGS. 3-11 illustrate the saving and restoring of a task state when the task is swapped out and then swapped in.
  • this saving and restoring is provided by the user runtime, which is started when the domain_signal_handler routine is invoked.
  • the data structures used when the task state is saved and restored are shown in FIG. 10 .
  • FIG. 3 is a flow diagram of the primary exception handler routine.
  • the address of the primary exception handler routine is stored in register T0.
  • the primary exception handler routine determines which exception has been generated and invokes an appropriate secondary exception handler to process the exception.
  • the routine saves the state of the thread in a save_area data structure and disables the domain signal trap.
  • the primary exception handler may save only partial thread state information depending on the type of exception. For example, if the exception is a data blocked exception, then the primary exception handler may save very little state information so that the handling can be lightweight if the secondary handler decides to retry access to the blocked memory location.
  • if a domain_signal exception has been raised, then the routine continues at step 303 , else the routine continues to check for other exceptions.
  • the routine invokes the domain_signal_handler routine to process the exception.
  • the domain_signal_handler routine returns after the task has been swapped out and then swapped in.
  • the routine restores the thread state and returns to the user code.
  • FIG. 4 is a flow diagram of the domain_signal_handler routine.
  • This routine is invoked by the primary trap handler when the raising of the domain_signal caused the trap.
  • the domain_signal of a protection domain is raised by the operating system when the operating system wants to swap out the task executing on that protection domain.
  • the primary trap handler is executed by each stream in the protection domain and saves most of the state of the stream.
  • This routine is passed a save_area data structure that contains that state. Each stream links its save_area data structure onto a linked list so that the state is available when the task is swapped in. If the stream is not a team master stream, that is, it is a slave stream, then the stream quits.
  • the routine then invokes the last_stream_domain_signal_handler routine, which does not return until the team master stream is swapped in.
  • this routine returns to the primary trap handler, which restores, from the save_area data structure, the state of the stream at the time when the domain_signal was raised.
  • step 401 the routine locks the thread.
  • the locking of the thread means that the thread running on the stream will not give up the stream on a blocking call to the operating system or any other event such as a synchronization retry-limit exception.
  • step 402 the routine saves any remaining state that was not saved by the primary trap handler.
  • step 404 the routine invokes the preswap_parallel_work function to perform any necessary work for the running thread prior to swapping out the task.
  • step 405 the routine stores the address of the return point for this thread, upon swap in, in the return_linkage variable of the save_area data structure. In this embodiment, the address of the slave_return_from_swap function is stored as the return point.
  • step 406 the routine fetches and adds to a team master variable.
  • the first stream to fetch and add to the team master variable is the team master stream for the team.
  • step 407 if this stream is the team master stream, then the routine continues at step 408 , else the routine continues at step 415 .
  • the team master stream executes steps 408 - 414 .
  • step 408 the routine waits for all other streams within the team to quit.
  • step 409 the routine links the save_area data structure of the stream to the head of the linked list of save_area data structures.
  • step 410 the routine invokes the last_stream_domain_signal_handler routine. This invoked routine returns only after this thread starts running again after being swapped in.
  • step 411 the routine restores the remaining state that was saved in step 402 .
  • step 412 the routine invokes the post_swap_parallel_work function to perform any necessary work after the thread is swapped in.
  • step 413 the routine clears the domain_signal flag in the save_area data structure, so that the exception is cleared when the primary trap handler restores the state from the save_area data structure.
  • step 414 the routine unlocks the thread and returns to the primary trap handler.
  • Steps 415 and 416 are executed by the slave streams.
  • step 415 the routine links the save_area data structure to the linked list.
  • step 416 the routine quits the stream, which means that the stream is available to be allocated to another task, such as the task to be swapped in.
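The linking of save_area data structures described in steps 409 and 415 can be sketched as a concurrent list push. This is an illustrative sketch under assumed names and structure layout, not the patent's implementation; a compare-and-swap loop keeps the push safe when many slave streams link their areas at the same time.

```c
/* Sketch: linking save_area data structures onto a shared linked list
 * before a stream quits.  The save_area layout and the list head are
 * illustrative assumptions; only the push discipline is the point. */
#include <stdatomic.h>
#include <stddef.h>

typedef struct save_area {
    struct save_area *next;
    void *return_linkage;   /* e.g., address of slave_return_from_swap */
    /* ... saved registers, stream status word, etc. ... */
} save_area;

static _Atomic(save_area *) save_list_head = NULL;

/* Push one save_area onto the list; safe under concurrent callers. */
void link_save_area(save_area *sa)
{
    save_area *old = atomic_load(&save_list_head);
    do {
        sa->next = old;                 /* point at the current head */
    } while (!atomic_compare_exchange_weak(&save_list_head, &old, sa));
}
```

The team master stream can perform the same push last so that, as in step 409, its save_area ends up at the head of the list.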
  • FIGS. 5A and 5B are flow diagrams of the last_stream_domain_signal_handler routine.
  • This routine is invoked by the team master stream of each team. This routine increments a number of teams variable, which is then used for barrier synchronization when the task is swapped in.
  • This routine then invokes an operating system call to notify the operating system that the team has completed saving its state and quitting the other streams. That operating system call does not return until the task is swapped back in, except for the call by the task master stream, which returns immediately.
  • the task master stream is the last stream that makes this operating system call.
  • the task master then performs an operating system call to notify the operating system that the task has completed saving its state.
  • the first stream that fetches and adds to a signal_wait variable is designated as the task master stream for the swap in.
  • the task master stream creates a stream to process any Unix signals, and all the other team master streams wait until the Unix signal processing is complete.
  • the routine then invokes a routine to restart the slave streams for the team.
  • step 502 the routine fetches and adds to the num_teams variable in the task swap header data structure.
  • step 503 the routine invokes the tera_team_swapsave_complete operating system call passing the num_streams variable of the team swap header. This operating system call returns immediately when the last team master stream invokes it and returns as its return value a value of 1. For all other team master streams, this operating system call does not return until the task is swapped in. The last team master stream to invoke this operating system call is designated as the task master stream.
  • step 504 if this stream is the task master stream, then the routine continues at step 505 , else the routine continues at step 507 .
  • step 505 the routine invokes the work_of_final_stream_in_task function. This invoked function does not return until the task is swapped in.
  • Steps 507 - 521 represent processing that is performed when the task is swapped in.
  • steps 507 - 508 the routine fetches and adds a 1 to the signal_wait variable of the task swap header and waits until that variable equals the num_teams variable in the task swap header. Thus, each team master stream waits until all the other team master streams reach this point in the routine before proceeding.
  • the first stream to increment the signal_wait variable is the task master stream for the swap in. Alternatively, the same stream that was designated as the task master for the swap out can also be the task master for the swap in.
  • the routine enables trapping for the domain_signal so that subsequent raising of the domain_signal will cause a trap.
  • the task master stream then processes the Unix signals. During the processing of Unix signals, another domain_signal may be raised. Thus, another swapout can occur before the states of the streams are completely restored.
  • the trap handler handling the domain_signal can handle nested invocations in that the trap handler can be executed again during execution of the trap handler. Therefore, an array of team and swap header data structures is needed to handle this nesting.
  • the routine enables the trapping of the domain_signal.
  • step 510 if this stream is the task master stream, then the routine continues at step 511 , else the routine continues at step 513 .
  • step 511 the routine invokes the process_signals function to process the Unix signals.
  • the task master stream creates a thread to handle the Unix signals.
  • step 512 the routine sets the signal_wait$ synchronization variable of the task swap header to zero, in order to notify the other team master streams that the processing of the Unix signals is complete.
  • step 513 the routine waits for the notification that the task master stream has processed the Unix signals.
  • step 514 the routine disables the domain_signal to prevent nested handling of domain_signals.
  • the first save_area data structure in the linked list contains the state of the team master stream when the task was swapped out.
  • the routine gets the next save_area data structure from the team swap header.
  • the routine clears the team swap header.
  • the routine fetches and adds a −1 to the num_teams variable in the task swap header and waits until that variable is equal to 0.
  • each team master stream waits until all other team master streams reach this point in the processing.
  • these steps implement a synchronization barrier.
  • One skilled in the art would appreciate that such barriers can be implemented in different ways.
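One such implementation, matching the fetch-and-add pattern of steps 507-508 and 516-517, is sketched below. This is an assumed illustration (the variable names are not the patent's): each team master increments a shared counter and spins until all have arrived, then decrements it and spins until it drains back to zero.

```c
/* Sketch of a two-phase fetch-and-add barrier: an "up" phase in which
 * each team master stream increments a counter and waits for all masters
 * to arrive, and a "down" phase that drains the counter back to zero.
 * A real runtime would back off or block instead of busy-spinning. */
#include <stdatomic.h>

static atomic_int arrived = 0;

/* Up phase: returns once all n_teams masters have incremented. */
void barrier_arrive(int n_teams)
{
    atomic_fetch_add(&arrived, 1);
    while (atomic_load(&arrived) != n_teams)
        ;   /* spin until every team master has reached the barrier */
}

/* Down phase: returns once every master has decremented back to zero. */
void barrier_depart(void)
{
    atomic_fetch_sub(&arrived, 1);
    while (atomic_load(&arrived) != 0)
        ;   /* spin until the counter drains */
}
```

The down phase also reinitializes the counter, so the same variable is ready for the next swap out, as step 520's clearing of the task swap header suggests.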
  • step 519 if this stream is the task master stream, then the routine continues at step 520 , else the routine continues at step 521 .
  • step 520 the routine clears the task swap header, to initialize it for the next swap out.
  • step 523 the routine invokes the swap_restart_streams function to restart the slave streams of the team by creating streams, retrieving the save_area data structures, and initializing the created streams. This routine then returns.
  • FIG. 6 is a flow diagram of the work_of_final_stream_in_task function. This function determines whether the task is blocked and performs an operating system call to notify the operating system that the task has completed its save processing prior to being swapped out. The routine passes to the operating system call the indication of whether the task is blocked. If the task is blocked, the operating system can decide not to schedule this task until an external event occurs that would unblock this task. In this way, the operating system can allocate the resources of the processors to other tasks that are not blocked. A task is blocked when it is waiting only on an external event. In one embodiment, a task is considered blocked when no threads are ready to execute, all the streams of the task are executing the virtual processor code, and no stream is in the process of starting a thread.
  • the virtual processor code can increment a counter when it determines that it is blocked and when that counter equals the number of streams of the task, then the task can be considered to be blocked. More generally, a task can notify the operating system whenever it becomes blocked so that the operating system can decide whether to swap out the task.
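The counter scheme just described can be sketched as follows; the names are illustrative assumptions. Each stream's virtual processor loop bumps a shared counter when it finds no ready thread, and the stream whose increment brings the counter to the stream count knows the whole task has become blocked.

```c
/* Sketch: detecting that a task is blocked.  Each idle stream increments
 * a counter; when the counter equals the number of streams of the task,
 * the task as a whole is blocked and can be reported to the operating
 * system.  Names are illustrative assumptions. */
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int idle_streams = 0;

/* Called by a stream's virtual processor code when it finds no ready
 * thread.  Returns true iff this was the last stream to go idle, i.e.
 * the task is now blocked. */
bool stream_now_idle(int num_streams)
{
    return atomic_fetch_add(&idle_streams, 1) + 1 == num_streams;
}

/* Called when a stream finds work again (e.g., a thread became ready). */
void stream_now_busy(void)
{
    atomic_fetch_sub(&idle_streams, 1);
}
```

Only the stream for which `stream_now_idle` returns true needs to notify the operating system, so the notification happens at most once per blocking episode.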
  • the routine determines whether the task is blocked.
  • the routine invokes the tera_task_saveswap_complete operating system call passing an indication of whether the task is currently blocked. This invocation of the operating system call does not return until the task is swapped in. The routine then returns.
  • FIG. 7 is a flow diagram of the process_signals function. This function loops retrieving and processing each Unix signal.
  • the user program may have registered with the user runtime customized signal handlers for processing the various Unix signals.
  • the function creates a thread control block for a new thread that is to process the Unix signals.
  • the function invokes the tera_get_signal_number operating system call. This operating system call returns the value of the signal number in the sig_num variable. If there are no Unix signals left to be handled, then this operating system call returns a 0.
  • the function saves the stream status word (SSW).
  • steps 704 - 708 the function executing in the new thread loops processing each signal.
  • step 704 if the sig_num variable is not equal to zero, then the function continues at step 705 , else the function continues at step 708 .
  • step 705 the function locates the handler for the returned signal number.
  • step 706 the function invokes the located handler.
  • step 707 the function invokes the tera_get_signal_number operating system call to retrieve the next signal number and loops to step 704 .
  • step 708 the function restores the saved SSW and returns.
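The retrieve-and-dispatch loop of FIG. 7 can be sketched as below. The `tera_get_signal_number` operating system call is simulated here with a small in-memory queue; everything other than the loop shape (the queue, the handler table, and all names except `tera_get_signal_number` itself) is an assumption for illustration.

```c
/* Sketch of the process_signals loop: repeatedly fetch a signal number
 * (0 meaning none left) and dispatch it through a table of registered
 * handlers.  get_signal_number stands in for the real operating system
 * call; the queue and table are illustrative. */
#include <stddef.h>

#define MAX_SIG 32
typedef void (*sig_handler)(int);
static sig_handler handler_table[MAX_SIG];   /* customized handlers */

/* Stand-in for the tera_get_signal_number operating system call. */
static int pending[8];
static int pending_head, pending_tail;
static int get_signal_number(void)
{
    if (pending_head == pending_tail)
        return 0;                /* no Unix signals left to handle */
    return pending[pending_head++];
}

void queue_signal(int sig) { pending[pending_tail++] = sig; }

/* Loop retrieving and processing each signal; returns the count. */
int process_signals(void)
{
    int handled = 0;
    for (int sig = get_signal_number(); sig != 0; sig = get_signal_number()) {
        if (sig > 0 && sig < MAX_SIG && handler_table[sig])
            handler_table[sig](sig);   /* invoke the located handler */
        ++handled;
    }
    return handled;
}
```

As in steps 704-707, the loop terminates when the retrieved signal number is 0, i.e., when no Unix signals remain.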
  • FIG. 8 is a flow diagram of the swap_restart_stream function.
  • This function creates a stream for each of the threads that were executing when the task was swapped out and restarts the thread executing in that stream.
  • the function retrieves and discards the first save_area data structure in the linked list.
  • the first save_area data structure is the data structure for the team master stream, which uses the stream provided by the operating system upon return from the tera_team_swapsave_complete operating system call of the team master stream.
  • steps 802 - 806 the function loops creating a stream for each save_area data structure in the link list.
  • the function retrieves the next save_area data structure in the linked list.
  • step 803 if all the save_area data structures have already been retrieved, then the function returns, else the function continues at step 804 .
  • step 804 the function creates a stream. The function loops to step 802 to retrieve the next save_area data structure. The newly created stream initializes the thread based on the retrieved save_area data structure and executes at the slave_return_from_swap address that was stored in the save_area data structure before the task was swapped out.
  • FIG. 9 is a flow diagram of the slave_return_from_swap routine.
  • This routine is invoked when the slave stream is created when the task is swapped in. This routine returns to the primary trap handler at a point after the invocation of the domain_signal_handler routine.
  • the routine restores the remaining state that was stored during the saving before the swap out.
  • the routine invokes the post_swap_parallel_work_routine to perform any application-dependent work upon swap in.
  • the routine unlocks the thread and returns to the routine that called the domain_signal_handler routine.
  • FIG. 10 is a block diagram of data structures used when swapping a task.
  • Each thread has a thread control block 1001 that contains information describing the current state of the thread and points to a team control block 1002 of the team of which the thread is a member.
  • the team control block contains information describing the team and points to a task control block 1005 of the task of which the team is a member.
  • the task control block contains information describing the task.
  • the team control block contains a pointer to a team swap header 1003 that contains information relating to the swapping of the team.
  • the team swap header contains a pointer to a linked list of save_area data structures 1004 that are used to restart the threads when the team is swapped in.
  • the task control block contains a pointer to a task swap header 1006 .
  • the task swap header contains information relating to the swapping of the task.
  • the operating system implements operating system calls that are provided to the user programs. When an operating system call is invoked, it begins executing on the same stream on which the invoking thread was executing. Certain operating system calls may be of indefinite duration. For example, an operating system call to return user input will not return until the user eventually inputs data. While the operating system call is waiting for user input, the user program can continue executing its other threads on its other streams. However, the user program effectively has one less stream on which to execute threads, because one of the streams is blocked on the operating system call.
  • the operating system and the user runtime implement an upcall protocol to return the stream to the user program while the operating system call is blocked.
  • An “upcall” occurs when the operating system invokes a function of the user program.
  • the user program, typically the user runtime of the application program, can register special purpose functions with the operating system, so that the operating system knows which functions to invoke when it makes an upcall to the user program.
  • the user runtime registers a “rt_return_vp” function and a “rt_return_thread” function with the operating system.
  • When an operating system call that will block is invoked, the operating system (executing on the stream that invoked the operating system call) invokes the rt_return_vp function of the user program. This invocation returns the stream to the user program. The virtual processor code of the user program can then select another thread to execute on that stream while the operating system call is blocked. Eventually, the operating system call will become unblocked (e.g., the user has finally input data). When the operating system call becomes unblocked, the operating system (executing on one of its own streams) invokes the rt_return_thread function of the user program to notify the user program that the operating system call has now completed.
  • the rt_return_thread function performs the necessary processing to restart (or at least schedule) the thread that was blocked on the operating system call.
  • the rt_return_thread function then invokes the tera_return_stream operating system call to return the stream to the operating system.
  • a malicious user program could decide not to return the stream to the operating system and instead start one of its threads executing on that stream.
  • a user program could increase the number of streams allocated to it to an amount greater than the slim value set by the operating system.
  • the operating system can mitigate the effects of such a malicious user program by not returning any more streams or, alternatively, killing the task when it detects that the user program has failed to return a certain number of the operating system streams.
  • FIGS. 11-16 illustrate the returning of a stream to a user program when an operating system call blocks. In one embodiment, this processing is performed by the user runtime.
  • FIG. 11 is a flow diagram of the user_entry_stub routine. This routine is a wrapper routine of an operating system call. This routine allocates a thread control block and then invokes the operating system call passing that thread control block. A new thread control block is needed because the rt_return_vp function and the rt_return_thread function may be executing at the same time on different streams. In particular, the rt_return_vp function may be executing in the stream returned by the operating system, and the rt_return_thread function may be executing in the operating system stream.
  • the rt_return_vp function is bound to this newly allocated thread control block.
  • this routine waits until the operating system stream is returned to the operating system and then deallocates the thread control block and returns.
  • the routine allocates a spare thread control block.
  • the routine sets the spare_thread_control_block variable in the upcall transfer (“ut”) data structure to point to this spare thread control block.
  • the ut data structure contains information and synchronization variables that support the return of a stream to the user programs.
  • the routine sets the os_call variable of the original thread control block (not the spare one) to point to the address of the ut data structure.
  • step 1104 the routine enters the operating system passing the os_call variable to invoke the operating system call.
  • step 1105 upon return, if the operating system call was blocked, as indicated by the was_blocked variable of the ut data structure, then the routine continues at step 1106 , else the routine continues at step 1107 .
  • step 1106 the routine reads from the notify_done$ synchronization variable of the ut data structure. The full/empty bit of this synchronization variable is initially set to empty. The routine waits on this synchronization variable until the operating system call writes to it so that its full/empty bit is set to full, indicating that the operating system stream has been returned.
  • step 1107 the routine then deallocates the spare thread control block.
  • step 1108 the routine writes a 0 into the os_call variable of the thread control block and returns.
  • FIG. 12 is a flow diagram of the rt_return_vp function.
  • This function is invoked by the operating system to return a stream to the user program that invoked a blocking operating system call.
  • This function is passed the identification of the thread that invoked the blocking operating system call and its stream status word (SSW).
  • the function receives the thread control block for this thread.
  • the function increments the os_outstanding_threads variable of the team control block for this thread. This variable is used to keep track of the number of threads that are blocked in operating system calls.
  • the function sets the ut pointer to the value in the os_call variable of the thread control block, which was set in the user_entry_stub routine.
  • step 1204 the function writes the passed identification of the thread into the call_id$ synchronization variable of the ut data structure. This sets the full/empty bit of the synchronization variable to full, possibly after blocking.
  • the call_id$ synchronization variable is used by the thread executing on the stream.
  • the thread will spin in step 1206 , attempting to write to the call_id$ synchronization variable. This spinning will wait until the full/empty bit of the synchronization variable is set to empty.
  • a data blocked exception is raised. The trap handler for that exception determines whether the stream is locked.
  • When a stream is locked by a thread, no other thread can execute on the stream. If the stream is locked, the trap handler returns to retry writing to the call_id$ synchronization variable. Thus, if the stream is locked, this thread will spin, waiting until the full/empty bit of this synchronization variable is set to empty when the operating system call completes. If, however, the stream is not locked, the trap handler places this thread on a blocked list and invokes the virtual processor code to schedule another thread to execute on this stream. In step 1205 , the function sets the was_blocked flag of the ut data structure so that the user_entry_stub routine will know whether to wait for the operating system stream to be returned to the operating system before the spare thread control block can be released.
  • step 1206 the routine writes a value of 0 into the call_id$ synchronization variable of the ut data structure. Since the full/empty bit of this synchronization variable was set to full in step 1204 , step 1206 retries the write until the full/empty bit is empty or a data blocked exception is raised as described above.
  • step 1207 the function returns to the user_entry_stub at the return point from the operating system call.
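The call_id$ handshake above relies on the hardware's full/empty bits: a synchronization write blocks until the word is empty and sets it full, and a synchronization read blocks until the word is full and sets it empty. On conventional hardware the same semantics can be modeled with a mutex and condition variable; the `sync_var` type and operation names below are assumptions for illustration, not the machine's actual primitives.

```c
/* Sketch: modeling a full/empty-bit synchronization variable with a
 * mutex and condition variable.  sync_write blocks while the variable
 * is full, then stores and sets full; sync_read blocks while it is
 * empty, then loads and sets empty. */
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             full;   /* models the full/empty bit */
    long            value;
} sync_var;

#define SYNC_VAR_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

/* Write when empty, then set full. */
void sync_write(sync_var *v, long x)
{
    pthread_mutex_lock(&v->lock);
    while (v->full)
        pthread_cond_wait(&v->cond, &v->lock);
    v->value = x;
    v->full = 1;
    pthread_cond_broadcast(&v->cond);
    pthread_mutex_unlock(&v->lock);
}

/* Read when full, then set empty. */
long sync_read(sync_var *v)
{
    pthread_mutex_lock(&v->lock);
    while (!v->full)
        pthread_cond_wait(&v->cond, &v->lock);
    long x = v->value;
    v->full = 0;
    pthread_cond_broadcast(&v->cond);
    pthread_mutex_unlock(&v->lock);
    return x;
}
```

In these terms, step 1204 is a `sync_write` of the thread identification, the blocked write of step 1206 waits for the `sync_read` that rt_return_thread performs in step 1303, and notify_done$ follows the same write/read pattern.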
  • FIG. 13 is a flow diagram of the rt_return_thread function.
  • This function is invoked by the operating system to notify a user program that a thread that was blocked on an operating system call is now unblocked.
  • This function is passed the thread control block of the blocked thread and a return value of the operating system call.
  • the function sets the ut pointer to the value in the os_call variable of the thread control block.
  • the function sets the return_value variable of the ut data structure to the passed return value.
  • the function reads the call_id$ synchronization variable, which sets the full/empty bit of the synchronization variable to empty and allows the write in step 1206 to proceed.
  • step 1304 the function fetches and adds a −1 to the os_outstanding_threads variable of the team control block for the thread. This allows the team to keep track of the number of threads that are blocked on an operating system call. A team will not be swapped out while an operating system call from a stream on that team is blocked.
  • step 1305 the function invokes the tera_return_stream operating system call to return this stream to the operating system.
  • FIG. 14 is a flow diagram of the tera_return_stream operating system call. This routine is invoked to return the operating system stream that was used to notify the user program of the completion of an operating system call.
  • This operating system call is passed a thread control block.
  • the operating system call sets the ut pointer to the value in the os_call variable of the thread control block.
  • the operating system call disables trapping of the domain_signal exception.
  • the operating system call writes a value of 0 to the notify_done$ synchronization variable of the ut data structure, which notifies the user_entry_stub routine that the operating system stream has been returned.
  • the operating system invokes an operating system call to effect the returning of the stream to the operating system.
  • FIG. 15 is a flow diagram of a trap handler routine for handling data blocked exceptions that are raised when waiting for an operating system call to complete.
  • the exception is raised by step 1206 of the rt_return_vp function.
  • step 1501 if the stream is locked, then the routine returns, else the routine continues at step 1502 .
  • step 1502 the routine adds the thread to a list of blocked threads.
  • step 1503 the routine starts the virtual processor code for this stream so that another thread can start executing.
  • FIG. 16A is a diagram illustrating the synchronization of the user program and the operating system when the user program invokes an operating system call that blocks.
  • the diagram illustrates the processing performed by the user stream 1601 and the processing performed by an operating system stream 1602 .
  • the solid lines with arrows indicate flow of control from one routine within a stream to another routine within the same stream.
  • the dashed lines indicate the interaction of the synchronization variables.
  • the ellipses indicate omitted steps of the functions.
  • the user program invokes an operating system call by invoking the user_entry_stub routine 1100 . That routine in step 1104 invokes the operating system call. As indicated by the solid line between steps 1104 and 1603 , the user stream starts executing the operating system call.
  • the operating system call 1603 invokes the rt_return_vp function in step 1604 .
  • the rt_return_vp function 1200 stores a value in the call_id$ synchronization variable in step 1204 , which sets the full/empty bit of the synchronization variable to full.
  • the rt_return_vp function then writes a value into the call_id$ synchronization variable in step 1206 . Since the call_id$ synchronization variable just had a value stored in it, its full/empty bit is set to full. This write cannot succeed until the full/empty bit is set to empty. Thus, step 1206 will cause a data blocked exception to be raised and the trap handler routine 1500 will be invoked.
  • step 1501 if the stream is locked, then the trap handler returns to the blocking synchronization write in step 1206 .
  • the process of raising a data blocked exception and returning for a locked thread will continue until the full/empty bit of the call_id$ synchronization variable is set to empty when the operating system call completes.
  • the trap handler routine places the thread on the blocked pool and executes the virtual processor code to select another thread to execute on that stream.
  • When the operating system call 1605 completes, the operating system in step 1606 invokes the rt_return_thread function 1300 of the user program. This invocation is within a stream allocated to the operating system.
  • the rt_return_thread function 1300 reads the call_id$ synchronization variable in step 1303 , which sets its full/empty bit to empty. As indicated by the dashed line, the writing of that synchronization variable in step 1206 then succeeds.
  • the rt_return_vp function then completes the execution of step 1206 and continues to step 1207 .
  • step 1207 the function returns to the location of the user_entry_stub routine immediately after the invocation of the operating system call.
  • the user_entry_stub routine in step 1106 reads the notify_done$ synchronization variable. Since the full/empty bit of this synchronization variable is initially empty, this read blocks.
  • the rt_return_thread routine in step 1305 invokes the tera_return_stream operating system call 1400 to return the stream to the operating system.
  • the tera_return_stream operating system call writes a value of 0 to the notify_done$ synchronization variable, which sets its full/empty bit to full. This releases the blocked read in step 1106 and the user_entry_stub routine returns to the user code.
  • FIG. 16B illustrates the Upcall Transfer (Ut) data structure 1650 .
  • the ut data structure 1650 is passed to the operating system when a blocking operating system call is invoked.
  • the ut data structure 1650 contains information for synchronizing the return of the stream to the user program.
  • the was_blocked flag 1655 is set to indicate whether the operating system call was blocked so that the user program can wait until the operating system stream is returned to the operating system and so that the function knows when return values need to be retrieved from the ut data structure 1650 .
  • the call_id$ synchronization variable 1660 is used to notify the thread that invoked the operating system call and that has locked the thread, that the operating system call is complete.
  • the notify_done$ synchronization variable 1665 is used to notify the thread that the operating system stream has been returned.
  • the spare_tcb pointer 1670 points to the spare thread control block that is used when the operating system notifies the user program that the operating system call is complete.
  • the return_value variable 1675 contains the return value of the operating system call.
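The fields just listed suggest a structure roughly like the following sketch. The field types and defaults are assumptions; the patent does not give a concrete layout.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class UpcallTransfer:
    """Sketch of the ut data structure 1650 (field types are assumptions)."""
    was_blocked: bool = False         # was the operating system call blocked?
    call_id: Any = None               # call_id$ synchronization variable
    notify_done: Any = None           # notify_done$ synchronization variable
    spare_tcb: Optional[object] = None  # spare thread control block pointer
    return_value: Any = None          # return value of the operating system call

ut = UpcallTransfer()
ut.was_blocked = True                 # the call blocked; return value must be
ut.return_value = 0                   # retrieved from the ut data structure
```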
  • the Unix operating system supports the concept of a “long jump.”
  • a long jump transfers control from a certain point in a program to an arbitrary return point in the program that was previously identified.
  • a program can identify the return point by invoking a setjmp routine.
  • the setjmp routine sets the return point to the return address of the setjmp routine invocation.
  • when the setjmp routine first returns, it returns a certain value to indicate that the setjmp routine has just returned.
  • after a long jump to the return point, the return value is different. In this way, the code at the return point can determine whether the setjmp routine has just returned or whether a long jump has just occurred.
  • the setjmp routine also returns information describing the return point. To effect a long jump, a program invokes a longjmp routine passing the information returned by the setjmp routine.
  • a long jump is useful for immediately jumping to a known location when the user inputs a certain command. For example, if a user has completely traversed a menu hierarchy and is viewing the lowest level menu items, a certain command (e.g., “control-c”) can be used to signify that the user wants to immediately return to the highest level menu without having to exit each of the intermediate level menus manually.
  • the user program can invoke the setjmp routine at the point where the highest level menu is displayed and processed.
  • the user program can invoke the longjmp routine to effect the immediate jump to the return point of the invocation of the setjmp routine.
  • the longjmp routine may be invoked by a function that is invoked by other functions to an arbitrary level of nesting. To effect the long jump, the longjmp routine uses well-known techniques to undo the stack frames resulting from the nested invocation and to release any memory that was allocated by the functions whose invocations are represented by the stack frames.
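Python has no setjmp/longjmp, but the menu example above can be approximated with exception unwinding, which likewise discards the intermediate stack frames. The LongJump class and menu functions below are illustrative stand-ins, not part of the patent.

```python
class LongJump(Exception):
    """Stands in for longjmp: exception unwinding models undoing the
    nested stack frames (an analogy, since Python has no setjmp)."""

def run_menus(commands):
    # The try statement marks the "set jump" point at the top-level menu.
    try:
        enter_menu(1, commands)
        return 'exited normally'
    except LongJump:
        return 'back at top menu'   # the long jump lands here

def enter_menu(depth, commands):
    if 'control-c' in commands:
        raise LongJump()            # immediately return to the top menu
    if depth < 3:
        enter_menu(depth + 1, commands)  # descend one menu level

print(run_menus(['control-c']))  # -> back at top menu
print(run_menus([]))             # -> exited normally
```

As in the patent's description, the intermediate menu levels are abandoned without being exited one at a time.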
  • FIGS. 17-22 illustrate the processing of a long jump in a MTA computer.
  • one thread of execution may want to effect a long jump to a set jump location (i.e., return point) that was set in another thread of execution (i.e., an inter-thread long jump).
  • the longjmp routine first locates the control block for the set jump thread. The longjmp routine then determines the current state of that set jump thread. Based on the current state, the longjmp routine causes the set jump thread to start executing at the set jump location. If the set jump thread is blocked on an operating system call, then the longjmp routine notifies the operating system to abort that operating system call.
  • the longjmp routine then can set the program counter of the set jump thread to a function that performs a standard (i.e., intra-thread) long jump.
  • the longjmp routine may be invoked by a signal handler routine. For example, in a Unix environment, a program is notified of a “control-c” command by a Unix signal. Since, as described above, a new thread is created to handle Unix signals, each long jump in such a signal handler routine is an inter-thread long jump.
  • when a Unix signal is received, the operating system notifies the user program whether any blocked operating system calls will automatically return or automatically be restarted. If the blocked operating system calls are restarted, then the longjmp routine directs the operating system to abort the operating system call on which the thread is blocked, if the thread is blocked on one.
  • FIG. 17 is a flow diagram of the basic longjmp routine. This routine is invoked whenever a long jump is to be performed. This routine determines whether the long jump is inter- or intra-thread and performs the appropriate behavior. This routine is passed a set jump buffer that was generated and returned by the setjmp routine. The set jump buffer contains the thread identifier of the thread that invoked the setjmp routine along with the set jump location information describing the state of the thread when the setjmp routine was invoked. In step 1701 , if the currently executing thread is not the thread that invoked the setjmp routine, then the routine continues at step 1703 , else the routine continues at step 1702 .
  • in step 1702, the routine unwinds the stack frames, restores the state from the set jump buffer, and returns to the set jump location.
  • in step 1703, the routine invokes the indirect_longjmp routine to effect an inter-thread long jump. The routine then returns.
  • FIG. 18 is a flow diagram of the indirect_longjmp routine.
  • This routine implements inter-thread long jumps.
  • the routine determines the state of the set jump thread and, based on that state, modifies the state information (e.g., program counter) of the set jump thread to effect an inter-thread long jump.
  • the routine retrieves the thread identifier from the set jump buffer.
  • the routine locates the save_area data structure for the set jump thread.
  • the routine retrieves the thread control block from the save_area data structure.
  • in step 1804, the routine jumps to steps 1805 , 1807 , 1806 , or 1807 , depending on whether the state of the thread is “blocked,” “resumable,” “running,” or “transition,” respectively.
  • a “blocked” thread is one that is blocked on any synchronization timeout. The processing for a blocked thread is shown in FIG. 19 .
  • a “running” thread is one that is currently executing on a stream. The processing for a running thread is shown in FIG. 20 .
  • a “resumable” thread is one that is ready and waiting to be allocated a stream. No special processing is performed for a resumable thread.
  • a “transition” thread is one that is in the process of being allocated a stream.
  • in step 1807, if the state of the thread is “running,” then the routine returns, else the routine continues at step 1808.
  • the routine sets the program counter in the thread control block data structure to the address of the longjmp routine.
  • the routine puts the thread control block on a list of unblocked threads. In this way, when the thread starts running, it will invoke the longjmp routine.
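The state dispatch of FIG. 18 might be sketched as follows. The dictionaries stand in for the thread control block and the blocked/unblocked lists, and the check_blocked_on_os_call step is omitted for brevity; all names are illustrative.

```python
LONGJMP_PC = 'longjmp'   # stands in for the address of the longjmp routine

def indirect_longjmp(thread, blocked_list, unblocked_list):
    """Dispatch on the set jump thread's state, as in FIG. 18."""
    state = thread['state']
    if state == 'blocked':
        blocked_list.remove(thread)    # FIG. 19: remove from the blocked list
        thread['state'] = 'resumable'  # and mark the thread resumable
        # (check_blocked_on_os_call would also be invoked here)
    if state == 'running':
        return                         # FIG. 20 handles the running case
    # blocked, resumable, and transition threads all fall through to here
    thread['pc'] = LONGJMP_PC          # next instruction invokes longjmp
    unblocked_list.append(thread)      # so the thread runs the long jump

blocked, unblocked = [], []
t1 = {'state': 'blocked', 'pc': 'user-code'}
blocked.append(t1)
indirect_longjmp(t1, blocked, unblocked)
```

When the patched thread is later given a stream, its first action is the long jump.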
  • FIG. 19 is a flow diagram of the processing performed when the state of the thread is “blocked.”
  • the routine removes the thread from the blocked list.
  • the routine sets the state of the thread to “resumable.”
  • the routine invokes the check_blocked_on_os_call routine to abort the operating system call if it will be restarted. The routine then returns.
  • FIG. 20 is a flow diagram of the processing performed when the state of the thread is “running.”
  • the routine invokes the check_blocked_on_os_call routine to abort the operating system call if it will be restarted.
  • the routine saves any additional state information that was not saved by the data blocked trap handler.
  • the data blocked trap handler saves minimal state information in case the thread decides to immediately redo the operation that caused the exception.
  • the routine creates and initializes a save_area data structure.
  • the routine sets the program counter in the save_area data structure to the address of the longjmp routine and then returns.
  • FIG. 21 is a flow diagram of the check_blocked_on_os_call routine.
  • the routine retrieves the ut data structure from the os_call variable of the thread control block. If the pointer to the ut data structure is null, the routine returns.
  • if the blocked operating system call is being restarted, then the routine continues at step 2103 , else the routine continues at step 2104 .
  • the routine requests the operating system to abort the restarted operating system call.
  • the routine reads the notify_done$ synchronization variable of the ut data structure. This read will cause the longjmp routine to wait until the abort is complete.
  • the routine deallocates the spare thread control block that was used to notify the user program that the operating system call has completed, and returns.
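A rough sketch of the check routine described above, with the operating-system interface reduced to two callbacks. The os_abort and wait_done callbacks are stand-ins (the latter models the blocking read of notify_done$), and the dictionary stands in for the thread control block.

```python
def check_blocked_on_os_call(tcb, os_abort, wait_done):
    """Sketch of FIG. 21: abort a restarted OS call and wait for it."""
    ut = tcb.get('os_call')       # fetch the ut data structure pointer
    if ut is None:
        return                    # thread is not blocked on an OS call
    if ut['restarted']:
        os_abort(ut)              # request that the OS abort the restarted call
    wait_done(ut)                 # read notify_done$: wait until complete
    tcb['spare_tcb'] = None       # deallocate the spare thread control block

events = []
tcb = {'os_call': {'restarted': True}, 'spare_tcb': object()}
check_blocked_on_os_call(tcb,
                         lambda ut: events.append('abort'),
                         lambda ut: events.append('done'))
```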

Abstract

A method and system for preparing a task, executing on a computer with multiple processors that each support multiple streams, to be swapped out from processor utilization. The task has one or more teams of threads, where each team represents the threads executing on a single processor. The task designates, for each team, one stream that is executing a thread as the team master stream, and designates one stream as the task master stream. For each team master stream, the task notifies the operating system that the team is ready to be swapped out when each other thread of the team has saved its state and has quit its stream. Finally, for the task master stream, the task notifies the operating system that the task is ready to be swapped out when it has saved its state and each other team has notified the operating system that it is ready to be swapped out.

Description

CROSS-REFERENCE TO RELATED APPLICATION
This is a divisional of, and claims priority to, U.S. patent application Ser. No. 09/192,205, filed on Nov. 13, 1998, now U.S. Pat. No. 6,952,827, which is hereby incorporated by reference.
TECHNICAL FIELD
The present invention relates to an interface between a user program and an operating system and, more particularly, to such an interface in a multithreaded environment.
BACKGROUND
Parallel computer architectures generally provide multiple processors that can each be executing different tasks simultaneously. One such parallel computer architecture is referred to as a multithreaded architecture (MTA). The MTA supports not only multiple processors but also multiple streams executing simultaneously in each processor. The processors of an MTA computer are interconnected via an interconnection network. Each processor can communicate with every other processor through the interconnection network. FIG. 1 provides a high-level overview of an MTA computer. Each processor 101 is connected to the interconnection network and memory 102. Each processor contains a complete set of registers 101 a for each stream. In addition, each processor also supports multiple protection domains 101 b so that multiple user programs can be executing simultaneously within that processor.
Each MTA processor can execute multiple threads of execution simultaneously. Each thread of execution executes on one of the 128 streams supported by an MTA processor. Every clock time period, the processor selects a stream that is ready to execute and allows it to issue its next instruction. Instruction interpretation is pipelined by the processor, the network, and the memory. Thus, a new instruction from a different stream may be issued in each time period without interfering with other instructions that are in the pipeline. When an instruction finishes, the stream to which it belongs becomes ready to execute the next instruction. Each instruction may contain up to three operations (i.e., a memory reference operation, an arithmetic operation, and a control operation) that are executed simultaneously.
The state of a stream includes one 64-bit Stream Status Word (“SSW”), 32 64-bit General Registers (“R0-R31”), and eight 32-bit Target Registers (“T0-T7”). Each MTA processor has 128 sets of SSWs, of general registers, and of target registers. Thus, the state of each stream is immediately accessible by the processor without the need to reload registers when an instruction of a stream is to be executed.
The MTA uses program addresses that are 32 bits long. The lower half of an SSW contains the program counter (“PC”) for the stream. The upper half of the SSW contains various mode flags (e.g., floating point rounding, lookahead disable), a trap disable mask (e.g., data alignment and floating point overflow), and the four most recently generated condition codes. The 32 general registers are available for general-purpose computations. Register R0 is special, however, in that it always contains a 0. The loading of register R0 has no effect on its contents. The instruction set of the MTA processor uses the eight target registers as branch targets. However, most control transfer operations only use the low 32 bits to determine a new program counter. One target register (T0) points to the trap handler, which may be an unprivileged program. When a trap occurs, the trapping stream starts executing instructions at the program location indicated by register T0. Trap handling is lightweight and independent of the operating system and other streams. A user program can install trap handlers for each thread to achieve specific trap capabilities and priorities without loss of efficiency.
Each MTA processor supports as many as 16 active protection domains that define the program memory, data memory, and number of streams allocated to the computations using that processor. Each executing stream is assigned to a protection domain, but which domain (or which processor, for that matter) need not be known by the user program.
The MTA divides memory into program memory, which contains the instructions that form the program, and data memory, which contains the data of the program. The MTA uses a program mapping system and a data mapping system to map addresses used by the program to physical addresses in memory. The mapping systems use a program page map and a data segment map. The entries of the data segment map and program page map specify the location of the segment in physical memory along with the level of privilege needed to access the segment.
The number of streams available to a program is regulated by three quantities: the stream limit (slim), the current number of streams (scur), and the number of reserved streams (sres) associated with each protection domain. The current number of streams executing in the protection domain is indicated by scur; it is incremented when a stream is created and decremented when a stream quits. A create can only succeed when the incremented scur does not exceed sres, the number of streams reserved in the protection domain. The operations for creating, quitting, and reserving streams are unprivileged. Several streams can be reserved simultaneously. The stream limit slim is an operating system limit on the number of streams the protection domain can reserve.
When a stream executes a CREATE operation to create a new stream, the operation increments scur, initializes the SSW for the new stream based on the SSW of the creating stream and an offset in the CREATE operation, loads register (T0), and loads three registers of the new stream from general purpose registers of the creating stream. The MTA processor can then start executing the newly created stream. A QUIT operation terminates the stream that executes it and decrements both sres and scur. A QUIT_PRESERVE operation only decrements scur, which gives up a stream without surrendering its reservation.
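The slim/sres/scur accounting can be illustrated as follows. This is a sketch of the rules described above, not the hardware operations themselves; the ProtectionDomain class and its method names are illustrative.

```python
class ProtectionDomain:
    """Sketch of the slim/sres/scur stream accounting described above."""
    def __init__(self, slim):
        self.slim = slim   # operating system limit on reservable streams
        self.sres = 0      # streams currently reserved
        self.scur = 0      # streams currently executing

    def reserve(self, n=1):
        if self.sres + n > self.slim:
            raise RuntimeError('stream limit (slim) exceeded')
        self.sres += n

    def create(self):
        # CREATE succeeds only if the incremented scur stays within sres
        if self.scur + 1 > self.sres:
            return False
        self.scur += 1
        return True

    def quit(self):
        # QUIT terminates the stream and decrements both sres and scur
        self.scur -= 1
        self.sres -= 1

    def quit_preserve(self):
        # QUIT_PRESERVE gives up the stream but keeps its reservation
        self.scur -= 1

pd = ProtectionDomain(slim=4)
pd.reserve(2)
assert pd.create() and pd.create()
assert not pd.create()     # would exceed the two reserved streams
pd.quit_preserve()         # scur drops to 1; sres stays at 2
assert pd.create()         # the preserved reservation is still usable
```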
The MTA supports four levels of privilege: user, supervisor, kernel, and IPL, in increasing order of privilege; the IPL level is the highest privilege level. All levels use the program page and data segment maps for address translation. The data segment map entries define the minimum levels needed to read and write each segment, and the program page map entries define the exact level needed to execute from each page. Each stream in a protection domain may be executing at a different privilege level.
Two operations are provided to allow an executing stream to change its privilege level. A “LEVEL_ENTER lev” operation sets the current privilege level to the program page map level if the current level is equal to lev. The LEVEL_ENTER operation is located at every entry point that can accept a call from a different privilege level. A trap occurs if the current level is not equal to lev. The “LEVEL_RETURN lev” operation is used to return to the original privilege level. A trap occurs if lev is greater than the current privilege level.
An exception is an unexpected condition raised by an event that occurs in a user program, the operating system, or the hardware. These unexpected conditions include various floating point conditions (e.g., divide by zero), the execution of a privileged operation by a non-privileged stream, and the failure of a stream create operation. Each stream has an exception register. When an exception is detected, then a bit in the exception register corresponding to that exception is set. If a trap for that exception is enabled, then control is transferred to the trap handler whose address is stored in register T0. If the trap is currently disabled, then control is transferred to the trap handler when the trap is eventually enabled assuming that the bit is still set in the exception register. The operating system can execute an operation to raise a domain_signal exception in all streams of a protection domain. If the trap for the domain_signal is enabled, then each stream will transfer control to its trap handler.
Each memory location in an MTA computer has four access state bits in addition to a 64-bit value. These access state bits allow the hardware to implement several useful modifications to the usual semantics of memory reference. These access state bits are two data trap bits, one full/empty bit, and one forward bit. The two data trap bits allow for application-specific lightweight traps, the forward bit implements invisible indirect addressing, and the full/empty bit is used for lightweight synchronization. The behavior of these access state bits can be overridden by a corresponding set of bits in the pointer value used to access the memory. The two data trap bits in the access state are independent of each other and are available for use, for example, by a language implementer. If a trap bit is set in a memory location, and the corresponding trap bit is not disabled in the pointer, then a trap will occur whenever that location is accessed.
The forward bit implements a kind of “invisible indirection.” Unlike normal indirection, forwarding is controlled by both the pointer and the location pointed to. If the forward bit is set in the memory location and forwarding is not disabled in the pointer, the value found in the location is interpreted as a pointer to the target of the memory reference rather than the target itself. Dereferencing continues until either the pointer found in the memory location disables forwarding or the addressed location has its forward bit cleared.
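Forward-bit dereferencing can be modeled as a loop that follows the stored value as a pointer until the forward bit is clear or the pointer disables forwarding. The memory dictionary mapping addresses to (value, forward bit) pairs is an illustrative simplification of the hardware.

```python
def load(memory, addr, disable_forward=False):
    """Sketch of invisible indirection: while the addressed word has its
    forward bit set (and the pointer does not disable forwarding), the
    stored value is followed as a pointer instead of being returned."""
    value, forward = memory[addr]
    while forward and not disable_forward:
        addr = value                   # invisible indirection
        value, forward = memory[addr]
    return value

# memory maps address -> (value, forward_bit)
memory = {0: (1, True), 1: (2, True), 2: (99, False)}
normal = load(memory, 0)                        # follows 0 -> 1 -> 2
direct = load(memory, 0, disable_forward=True)  # reads location 0 itself
```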
The full/empty bit supports synchronization behavior of memory references. The synchronization behavior can be controlled by the full/empty control bits of a pointer or of a load or store operation. The four values for the full/empty control bits are shown below.
VALUE  MODE      LOAD                          STORE
0      normal    read regardless               write regardless and set full
1      reserved  reserved                      reserved
2      future    wait for full and leave full  wait for full and leave full
3      sync      wait for full and set empty   wait for empty and set full
When the access control mode (i.e., synchronization mode) is future, loads and stores wait for the full/empty bit of the memory location being accessed to be set to full before the access can proceed. When the access control mode is sync, loads are treated as “consume” operations and stores are treated as “produce” operations. A load waits for the full/empty bit to be set to full and then sets the full/empty bit to empty as it reads, and a store waits for the full/empty bit to be set to empty and then sets the full/empty bit to full as it writes. A forwarded location (i.e., its forward bit is set) that is not disabled (i.e., by the access control of a pointer) and that is empty (i.e., its full/empty bit is set to empty) is treated as “unavailable” until its full/empty bit is set to full, irrespective of access control.
The full/empty bit may be used to implement arbitrary indivisible memory operations. The MTA also provides a single operation that supports extremely brief mutual exclusion during “integer add to memory.” The FETCH_ADD operation loads the value from a memory location and stores the sum of that value and another value back into the memory location.
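The normal and sync access modes from the table, together with FETCH_ADD, can be sketched as follows. This single-threaded model raises instead of waiting and omits the future mode; the Word class and its method names are illustrative.

```python
EMPTY, FULL = 0, 1

class Word:
    """Sketch of a memory word with a full/empty bit. A real stream
    would retry (and eventually trap) rather than raise immediately."""
    def __init__(self, value=0, state=EMPTY):
        self.value, self.state = value, state

    def load(self, mode='normal'):
        if mode == 'sync':
            if self.state != FULL:
                raise BlockingIOError('sync load would wait for full')
            self.state = EMPTY        # sync load: wait for full, set empty
        return self.value             # normal load: read regardless

    def store(self, v, mode='normal'):
        if mode == 'sync' and self.state != EMPTY:
            raise BlockingIOError('sync store would wait for empty')
        self.value, self.state = v, FULL  # both modes set full on store

    def fetch_add(self, n):
        # FETCH_ADD: indivisibly load the old value and store the sum
        old, self.value = self.value, self.value + n
        return old

w = Word()
w.store(5, mode='sync')           # word was empty, so the store succeeds
assert w.load(mode='sync') == 5   # sync load consumes: word is empty again
assert w.fetch_add(3) == 5        # returns the old value, stores 8
```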
Each protection domain has a retry limit that specifies how many times a memory access can fail in testing the full/empty bit before a data blocked exception is raised. If the trap for the data blocked exception is enabled, then a trap occurs. The trap handler can determine whether to continue to retry the memory access or to perform some other action. If the trap is not enabled, then the next instruction after the instruction that caused the data blocked exception is executed.
The appendix contains the “Principles of Operation” of the MTA, which provides a more detailed description of the MTA.
SUMMARY
Embodiments of the present invention provide a method and system for placing a task with multiple threads in a known state, such as a quiescent state. To effect the placing of the task in the known state, each thread of the task is notified that it should enter the known state. In response to receiving the notification, each of the threads enters the known state. When in the known state, certain actions can be performed safely without corrupting the state of the task. The known state of the task may be the execution of idle instructions by each of the threads or the stopping of the execution of instructions by the threads (e.g., quitting the streams). The notification may be by raising a domain signal for the protection domain in which the task is executing. The notification may also be initiated by the task itself by, for example, sending a request to the operating system. Prior to entering the known state, the threads may save their state information so that when the known state is exited the threads can restore their saved state and continue execution. The task, in response to receiving the notification, may also notify the operating system that the task is blocked from further productive use of the processor until an event occurs. In this way, rather than having the task continue to execute idle instructions (e.g., instructions that loop checking whether an event has occurred), the operating system may assign the processor to another task. The operating system may also defer re-assigning the processor to the task until an event occurs that is directed to that task. Once a task has entered the known state, various actions can be performed relative to the task. For example, the operating system may assign the processor resources used by that task to another task. Also, a debugger, which may be executing as one of the threads of the task, can access the state information saved by the other threads of the task.
A designated thread of the task may also process operating system signals while the other threads of the task are in the known state. After the signals are processed by that thread, the other threads can be allowed to exit the known state. More generally, after the actions to be performed while the task is in the known state have completed, the threads of the task can exit the known state. A task that has entered a known state may exit the known state by receiving a notification to exit the known state. Upon receiving the notification, each thread exits the known state by executing the instructions that were to be executed prior to entering the known state or, more generally, by continuing with productive work (e.g., non-idle instructions). Upon receiving the notification, one thread may be designated as a master thread for causing the other threads to exit their known state (e.g., by creating streams). The master thread may also perform signal processing prior to allowing the other threads to exit their known state.
One embodiment of the present invention provides a method in a multithreaded computer for preparing a task to be “swapped out” from processor utilization by placing the task in a known state. The computer has a processor with multiple streams for executing threads of the task. To prepare for being swapped out, the task designates one stream that is executing a thread to be a master stream. The task then saves the state of each stream that is executing a thread. Under control of each stream that is not the master stream, the task quits the stream. Under control of the master stream, the task notifies the operating system that the task is ready to be swapped out. The operating system can then swap the task out from processor utilization. In another embodiment, the method prepares a task that is executing on a computer with multiple processors. The task has one or more “teams” of threads, where each team represents the threads executing on a single processor. The task designates, for each team, one stream that is executing a thread to be the team master stream. The task then designates one stream that is executing a thread to be the task master stream. For each team master stream, the task notifies the operating system that the team is ready to be swapped out when each other thread of the team has quit its stream. Finally, for the task master stream, the task notifies the operating system that the task is ready to be swapped out when each of the other teams has notified the operating system that that team is ready to be swapped out.
Other aspects of the present invention provide for a server to coordinate the assignment of resources with various clients. The server initially assigns a resource to a client. The server then receives notification from the client assigned the resource that the client is waiting for an occurrence of an event before the resource can be productively used. The server, upon receiving the notification, un-assigns the resource from the client and does not reassign that resource to the client until after the event occurs. In one embodiment, the server is an operating system, the clients are tasks, and the resource is a processor or protection domain. The server may receive the notification in response to a request that the task save its state information prior to having that resource un-assigned. After the event occurs, the server can then reassign the resource to the task.
Another aspect of the present invention provides a method in a computer system for returning to a task a stream that is executing an operating system call that is blocked. The computer system has a processor with multiple streams. To return the stream, the operating system executing on a stream invokes a function provided by the task. The invoked function then executes instructions on that stream to effect the return of the stream to the task. The operating system then notifies the task when the operating system call is complete. Upon receiving the notification, the task can then continue the execution of the thread that invoked the blocking operating system call.
More generally, the present invention assigns a processor resource to a task after a thread of the task invokes an operating system call that will block waiting for the occurrence of an event. To assign the processor resource back to the task, the operating system invokes a routine of the task so that that routine can assign the processor resource to another thread of the task. In this way, the task can continue to execute other threads even though one of its threads may be blocked on an operating system call.
Another aspect of the present invention provides a method in a computer system for performing an inter-thread long jump from a long jump thread to a set jump thread. To effect the inter-thread long jump, the long jump thread receives an indication of a set jump location that was set by the set jump thread. The long jump thread then determines whether the set jump thread is the same thread that is currently executing. When the set jump thread is not the same thread that is currently executing, the long jump thread sets the state of the set jump thread to next execute a long jump indicating the set jump location. When the set jump thread executes its next instructions, an intra-thread long jump is performed.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 provides a high-level overview of the MTA.
FIG. 2 is a block diagram illustrating components of the operating system and user programs in one embodiment.
FIG. 3 is a flow diagram of the primary exception handler routine.
FIG. 4 is a flow diagram of the domain_signal_handler routine.
FIGS. 5A and 5B are flow diagrams of the last_stream_domain_signal_handler routine.
FIG. 6 is a flow diagram of the work_of_final_stream_in_task function.
FIG. 7 is a flow diagram of the process_signals function.
FIG. 8 is a flow diagram of the swap_restart_stream function.
FIG. 9 is a flow diagram of the slave_return_from_swap routine.
FIG. 10 is a block diagram of data structures used when swapping a task.
FIG. 11 is a flow diagram of the user_entry_stub routine.
FIG. 12 is a flow diagram of the rt_return_vp function.
FIG. 13 is a flow diagram of the rt_return_thread function.
FIG. 14 is a flow diagram of the tera_return_stream operating system call.
FIG. 15 is a flow diagram of a trap handler routine for handling data blocked exceptions that are raised when waiting for an operating system call to complete.
FIG. 16A is a diagram illustrating the synchronization of the user program and the operating system when the user program invokes an operating system call that blocks. FIG. 16B illustrates the Upcall Transfer (ut) data structure.
FIG. 17 is a flow diagram of the basic longjmp routine.
FIG. 18 is a flow diagram of the indirect_longjmp routine.
FIG. 19 is a flow diagram of the processing performed when the state of the thread is “blocked.”
FIG. 20 is a flow diagram of the processing performed when the state of the thread is “running.”
FIG. 21 is a flow diagram of the check_blocked_on_os_call routine.
DETAILED DESCRIPTION
Embodiments of the present invention provide an interface between a user program and an operating system in an MTA computer. In one aspect of the present invention, the user program cooperates with the operating system in saving the state of the user program when the operating system wants to allocate the protection domain in which the user program is executing to another user program so that the other user program may start executing its instructions. The operating system allows each user program to execute for a certain time slice or quantum before “swapping out” the user program from its protection domain. The operating system notifies the user program when the quantum expires. Each stream that is allocated to that user program receives the notification. Upon receiving the notification, each stream saves its state and quits except for one stream that is designated as a master stream. The master stream saves its state and waits for all the other streams to quit. The master stream then notifies the operating system that the user program is ready to be swapped out of its protection domain. The master stream also notifies the operating system of the number of streams that were created (or alternatively reserved) when the quantum expired. When the operating system decides to allow the user program to start executing again (i.e., be “swapped in”), the operating system restarts the thread that was executing in the master stream. That thread then creates the other streams and restarts each of the threads executing where they left off using the saved state. The operating system may defer swapping in the user program until sufficient streams (as indicated by the user program when it was swapped out) are available so that when the user program is swapped in, it can create the same number of streams it quit when swapping out.
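The swap-out protocol just described (slave streams save their state and quit; the master stream waits for them and then notifies the operating system, reporting the stream count) can be sketched with ordinary threads. The Task class and notify_os callback are stand-ins for the runtime and operating-system interface, not the patent's actual implementation.

```python
import threading

class Task:
    """Sketch of the swap-out protocol for one team of streams."""
    def __init__(self, n_streams, notify_os):
        self.n = n_streams
        self.notify_os = notify_os
        self.saved = []          # per-stream saved state (stream ids here)
        self.quit_count = 0      # slave streams that have quit
        self.cv = threading.Condition()

    def on_quantum_expired(self, stream_id, is_master):
        with self.cv:
            self.saved.append(stream_id)   # every stream saves its state
            if not is_master:
                self.quit_count += 1       # slave quits its stream
                self.cv.notify()
                return
            while self.quit_count < self.n - 1:
                self.cv.wait()             # master waits for all slaves
        self.notify_os(self.n)             # report stream count to the OS

log = []
task = Task(4, lambda n: log.append(('ready-to-swap', n)))
slaves = [threading.Thread(target=task.on_quantum_expired, args=(i, False))
          for i in range(1, 4)]
for t in slaves:
    t.start()
task.on_quantum_expired(0, True)   # stream 0 is the master stream
for t in slaves:
    t.join()
```

The reported stream count lets the operating system defer swapping the task back in until that many streams are available again.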
In another aspect of the present invention, the operating system returns streams to the user program when the thread that was executing on the stream is blocked on an operating system call. Each user program may be limited to a certain number of streams by the operating system. A user program can create streams up to this limit and start different threads executing in each of the created streams. When a thread makes an operating system call, the operating system starts executing on the same stream on which the thread was executing. When the operating system call blocks (e.g., waiting for user input), the operating system returns that stream to the user program so that the user program can schedule another thread to execute on that stream. The operating system eventually notifies the user program when the operating system call completes, and the user program can restart the thread that was blocked on that operating system call. In this way, the user program can continue to use all of its created streams even though a thread is blocked on an operating system call.
In another aspect of the present invention, Unix-type set jump and long jump inter-thread behavior is supported. When invoked, a set jump function stores the current state of the stream in a set jump buffer. The current state includes the return address for that invocation of the set jump function. When a long jump function is eventually invoked passing the set jump buffer as a parameter, the long jump function deallocates memory (e.g., stack frames) allocated since the set jump function was invoked, restores the stream state stored in the set jump buffer, and jumps to the return address. If the long jump function is invoked by a thread (“the long jump thread”) different from the thread (“the set jump thread”) that invoked the set jump function, the long jump function first locates the state information for the set jump thread. The long jump function then sets the program counter in that state information to point to an instruction that invokes the long jump function passing the set jump buffer. When the set jump thread then executes its next instruction, an intra-thread long jump is performed.
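The intra-thread case described above follows the familiar semantics of the standard C library's setjmp and longjmp, which can serve as a minimal sketch of the set jump buffer mechanism (the standard C jmp_buf stands in for the patent's set jump buffer; the function names here are illustrative):

```c
#include <setjmp.h>

static jmp_buf env;   /* plays the role of the set jump buffer */

static void take_long_jump(void)
{
    /* Unwinds back to the setjmp call site; 42 becomes setjmp's return value. */
    longjmp(env, 42);
}

int setjmp_demo(void)
{
    int rc = setjmp(env);   /* returns 0 on the initial call */
    if (rc == 0) {
        take_long_jump();   /* control never falls through this call */
        return -1;          /* unreachable */
    }
    return rc;              /* setjmp "returns" a second time, with rc == 42 */
}
```

The inter-thread variant described in the text cannot be expressed directly with the standard library; it requires the additional step of rewriting the set jump thread's saved program counter so that the thread performs the intra-thread jump itself.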
FIG. 2 is a block diagram illustrating components of the operating system 210 and user programs 220 in one embodiment. The operating system includes a processor scheduler 211, a task list 212, and instructions implementing various operating system calls 213. The processor scheduler assigns tasks to execute in the protection domains of a processor; such tasks are referred to as active tasks. The term “task” refers to a running user program that may currently be either active or inactive. Periodically (e.g., when a time quantum expires), the processor scheduler determines whether an inactive task should be made active. If all the protection domains are already assigned to active tasks, then the operating system will swap out an active task, making it inactive, and swap in an inactive task, making it active. If an MTA computer has multiple processors, then the operating system may assign multiple protection domains on different processors to the task. In this way, computations of the task can be executed simultaneously, not only on multiple streams on one processor, but also on multiple streams on multiple processors. The threads of execution of a task that are executing on one processor are referred to as a “team” of the task. Thus, a task comprises one or more teams, and each team comprises one or more threads of execution.
Each user program 220 includes user code 221 and a user runtime 222. The user code is the application-specific code of the user program, and the user runtime is code provided to assist the user program in managing the scheduling of threads to streams. The user runtime includes virtual processor code 223 and a thread list 224. The virtual processor code is responsible for deciding which thread to assign to the stream on which the virtual processor code is executing. When a task creates a stream, the virtual processor code is executed to select which thread should be assigned to that stream. When a thread completes, the virtual processor code also is executed to determine the next thread to assign to that stream. If threads are not currently available to assign to the stream, the virtual processor code may quit the stream so that the stream can be assigned to another task. The user runtime also provides standard trap handlers for handling various exceptions with a standard behavior. The user code can override the standard behaviors by providing customized trap handlers for various exceptions.
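The virtual processor code's central decision, selecting the next thread for the stream or quitting the stream when no thread is ready, can be sketched as a pop from a ready list. The names here (ready_list, pick_next_thread) are illustrative, not the patent's:

```c
#include <stddef.h>

struct thread_cb {                     /* simplified thread control block */
    int id;
    struct thread_cb *next;
};

static struct thread_cb *ready_list;   /* stands in for the runtime's thread list */

/* Hypothetical virtual-processor step: pick the next ready thread for this
 * stream; a NULL return means no thread is available, in which case the
 * stream would quit so that it can be assigned to another task. */
struct thread_cb *pick_next_thread(void)
{
    struct thread_cb *t = ready_list;
    if (t)
        ready_list = t->next;
    return t;
}

int vp_demo(void)
{
    static struct thread_cb a = {1, NULL};
    static struct thread_cb b = {2, &a};
    ready_list = &b;
    int ran = 0;
    struct thread_cb *t;
    while ((t = pick_next_thread()) != NULL)   /* schedule until list is empty */
        ran++;
    return ran;    /* two threads were scheduled; the stream would then quit */
}
```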
Task Swap Out
The processor scheduler of the operating system coordinates the allocation of the processor to the various tasks that are currently ready to be executed. As described above, each processor has 16 protection domains and can thus simultaneously execute up to 15 tasks, with the operating system executing in the remaining domain. The processor scheduler allows each task to execute for a certain time quantum. When the time quantum expires for a task, the processor scheduler raises the domain_signal for the protection domain of that task to initiate a swap out for that task. The swapping in and swapping out of tasks requires cooperation on the part of the task. To swap out a task, the operating system asks the task to save its state and quit all but one of its streams. The one remaining stream then notifies the operating system that the state of the task has been saved and that another task can be swapped into that protection domain. If the task ignores the notification, then the operating system can abort the task.
The operating system notifies the task of the impending swap out by raising the domain_signal, which causes each stream of that task to trap (assuming the domain_signal trap is enabled) and to start executing its primary trap handler, whose address is stored in register T0. The primary trap handler saves the state of the thread executing on that stream and then invokes a domain_signal_handler routine. The task may be executing on multiple streams and on multiple processors. To ensure that the states of all executing threads are properly saved and that the task quits all its streams in an orderly manner, each team of the task designates one of the streams executing a thread of the task to be a team master stream, and the team master streams designate one of the team master streams to be a task master stream. In one embodiment, the team master stream is the stream whose thread first increments a team master variable, and the task master stream is the team master stream that first notifies (or alternatively last notifies) the operating system that its team is ready to be swapped out.
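The master-election rule, whichever stream's fetch-and-add observes the initial value wins, can be sketched with C11 atomics and POSIX threads standing in for the MTA's hardware fetch-and-add (the variable and function names here are illustrative):

```c
#include <pthread.h>
#include <stdatomic.h>

#define NSTREAMS 8

static atomic_int team_master;   /* the team master variable */
static atomic_int masters;       /* how many streams believed they won */

static void *stream_trap_handler(void *arg)
{
    (void)arg;
    /* Every stream fetch-and-adds; only the stream that observes the old
     * value 0 becomes the team master. */
    if (atomic_fetch_add(&team_master, 1) == 0)
        atomic_fetch_add(&masters, 1);
    return NULL;
}

int election_demo(void)
{
    pthread_t t[NSTREAMS];
    atomic_init(&team_master, 0);
    atomic_init(&masters, 0);
    for (int i = 0; i < NSTREAMS; i++)
        pthread_create(&t[i], NULL, stream_trap_handler, NULL);
    for (int i = 0; i < NSTREAMS; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&masters);   /* exactly one master, however the race resolves */
}
```

Because fetch-and-add is atomic, exactly one stream can observe the initial value, so the election needs no lock and no coordination beyond the single shared counter.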
Each team master stream waits for all other streams of the team to quit and then performs some clean-up processing before notifying the operating system that all the other streams of the team have quit and that the team is ready to be swapped out. Analogously, the task master stream waits until all the team master streams have notified the operating system and performs some clean-up processing for the task before notifying the operating system that the task is ready to be swapped out. The team master streams and the task master stream notify the operating system by invoking an operating system call. The operating system then takes control of the last stream in each team and can start another task executing on that stream as part of swapping in that other task.
When the operating system eventually decides to swap in the task, the operating system returns from the operating system calls of the team master streams. The task master stream processes any Unix signals that have arrived and then releases all the other team master streams to restore the saved states. Each team master stream creates a stream for each thread that was running when the task was swapped out and sets the state of the created streams to the saved states of the threads.
FIGS. 3-11 illustrate the saving and restoring of a task state when the task is swapped out and then swapped in. In one embodiment, this saving and restoring is provided by the user runtime, which is started when the domain_signal_handler routine is invoked. The data structures used when the task state is saved and restored are shown in FIG. 10. FIG. 3 is a flow diagram of the primary exception handler routine. The address of the primary exception handler routine is stored in register T0. The primary exception handler routine determines which exception has been generated and invokes an appropriate secondary exception handler to process the exception. In step 301, the routine saves the state of the thread in a save_area data structure and disables the domain signal trap. The primary exception handler may save only partial thread state information depending on the type of exception. For example, if the exception is a data blocked exception, then the primary exception handler may save very little state information so that the handling can be lightweight if the secondary handler decides to retry access to the blocked memory location. In step 302, if a domain_signal exception has been raised, then the routine continues at step 303, else the routine continues to check for other exceptions. In step 303, the routine invokes the domain_signal_handler routine to process the exception. The domain_signal_handler routine returns after the task has been swapped out and then swapped in. In step 304, the routine restores the thread state and returns to the user code.
FIG. 4 is a flow diagram of the domain_signal_handler routine. This routine is invoked by the primary trap handler when the raising of the domain_signal caused the trap. The domain_signal of a protection domain is raised by the operating system when the operating system wants to swap out the task executing on that protection domain. The primary trap handler is executed by each stream in the protection domain and saves most of the state of the stream. This routine is passed a save_area data structure that contains that state. Each stream links its save_area data structure onto a linked list so that the state is available when the task is swapped in. If the stream is not a team master stream, that is, it is a slave stream, then the stream quits. If the stream is the team master stream, the routine then invokes the last_stream_domain_signal_handler routine, which does not return until the team master stream is swapped in. When that invoked routine returns, this routine returns to the primary trap handler, which restores, from the save_area data structure, the state of the stream at the time when the domain_signal was raised.
In step 401, the routine locks the thread. The locking of the thread means that the thread running on the stream will not give up the stream on a blocking call to the operating system or any other event such as a synchronization retry-limit exception. In step 402, the routine saves any remaining state that was not saved by the primary trap handler. In step 404, the routine invokes the preswap_parallel_work function to perform any necessary work for the running thread prior to swapping out the task. In step 405, the routine stores the address of the return point for this thread, upon swap in, in the return_linkage variable of the save_area data structure. In this embodiment, the address of the slave_return_from_swap function is stored as the return point. In step 406, the routine fetches and adds to a team master variable. The first stream to fetch and add to the team master variable is the team master stream for the team. In step 407, if this stream is the team master stream, then the routine continues at step 408, else the routine continues at step 415. The team master stream executes steps 408-414. In step 408, the routine waits for all other streams within the team to quit. In step 409, the routine links the save_area data structure of the stream to the head of the linked list of save_area data structures. In step 410, the routine invokes the last_stream_domain_signal_handler routine. This invoked routine returns only after this thread starts running again after being swapped in. In step 411, the routine restores the remaining state that was saved in step 402. In step 412, the routine invokes the postswap_parallel_work function to perform any necessary work after the thread is swapped in. In step 413, the routine clears the domain_signal flag in the save_area data structure, so that the exception is cleared when the primary trap handler restores the state from the save_area data structure.
In step 414, the routine unlocks the thread and returns to the primary trap handler. Steps 415 and 416 are executed by the slave streams. In step 415, the routine links the save_area data structure to the linked list. In step 416, the routine quits the stream, which means that the stream is available to be allocated to another task, such as the task to be swapped in.
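The linked list of save_area data structures built in steps 409 and 415 must tolerate concurrent pushes by the slave streams, with the team master stream linking itself last so that its save_area ends up at the head. A lock-free sketch using a C11 compare-and-exchange loop (the MTA itself would use its own synchronization primitives; the names here are illustrative):

```c
#include <stdatomic.h>
#include <stddef.h>

struct save_area {
    int stream_state;            /* stands in for the saved register state */
    struct save_area *next;
};

static _Atomic(struct save_area *) save_list;

/* Lock-free push: slave streams may link their save areas concurrently; the
 * most recent push always becomes the new head of the list. */
void link_save_area(struct save_area *sa)
{
    struct save_area *old = atomic_load(&save_list);
    do {
        sa->next = old;
    } while (!atomic_compare_exchange_weak(&save_list, &old, sa));
}

int save_list_demo(void)
{
    static struct save_area slave1, slave2, master;
    atomic_store(&save_list, NULL);
    link_save_area(&slave1);
    link_save_area(&slave2);
    link_save_area(&master);     /* team master links itself last, at the head */
    int n = 0;
    for (struct save_area *p = atomic_load(&save_list); p; p = p->next)
        n++;
    /* The head must be the master's save_area, and all three must be linked. */
    return (atomic_load(&save_list) == &master) ? n : -1;
}
```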
FIGS. 5A and 5B are flow diagrams of the last_stream_domain_signal_handler routine. This routine is invoked by the team master stream of each team. This routine increments the num_teams variable, which is then used for barrier synchronization when the task is swapped in. This routine then invokes an operating system call to notify the operating system that the team has completed saving its state and quitting the other streams. That operating system call does not return until the task is swapped back in, except for the call by the task master stream, which returns immediately. The task master stream is the last stream that makes this operating system call. The task master stream then performs an operating system call to notify the operating system that the task has completed saving its state. When swapped in, the first stream that fetches and adds to the signal_wait variable is designated as the task master stream for the swap in. The task master stream creates a stream to process any Unix signals, and all the other team master streams wait until the Unix signal processing is complete. The routine then invokes a routine to restart the slave streams for the team.
In step 502, the routine fetches and adds to the num_teams variable in the task swap header data structure. In step 503, the routine invokes the tera_team_swapsave_complete operating system call passing the num_streams variable of the team swap header. This operating system call returns immediately, with a return value of 1, when the last team master stream invokes it. For all other team master streams, this operating system call does not return until the task is swapped in. The last team master stream to invoke this operating system call is designated as the task master stream. In step 504, if this stream is the task master stream, then the routine continues at step 505, else the routine continues at step 507. In step 505, the routine invokes the work_of_final_stream_in_task function. This invoked function does not return until the task is swapped in. Steps 507-521 represent processing that is performed when the task is swapped in. In steps 507-508, the routine fetches and adds a 1 to the signal_wait variable of the task swap header and waits until that variable equals the num_teams variable in the task swap header. Thus, each team master stream waits until all the other team master streams reach this point in the routine before proceeding. The first stream to increment the signal_wait variable is the task master stream for the swap in. Alternatively, the same stream that was designated as the task master for the swap out can also be the task master for the swap in. In steps 509-514, the routine enables trapping for the domain_signal so that subsequent raising of the domain_signal will cause a trap. The task master stream then processes the Unix signals. During the processing of Unix signals, another domain_signal may be raised. Thus, another swapout can occur before the states of the streams are completely restored.
The trap handler handling the domain_signal can handle nested invocations in that the trap handler can be executed again during execution of the trap handler. Therefore, an array of team and swap header data structures is needed to handle this nesting. In step 509, the routine enables the trapping of the domain_signal. In step 510, if this stream is the task master stream, then the routine continues at step 511, else the routine continues at step 513. In step 511, the routine invokes the process_signals function to process the Unix signals. In one embodiment, the task master stream creates a thread to handle the Unix signals. In step 512, the routine sets the signal_wait$ synchronization variable of the task swap header to zero, in order to notify the other team master streams that the processing of the Unix signals is complete. In step 513, the routine waits for the notification that the task master stream has processed the Unix signals. In step 514, the routine disables the domain_signal to prevent nested handling of domain_signals. The first save_area data structure in the linked list contains the state of the team master stream when the task was swapped out. In step 515, the routine gets the next save_area data structure from the team swap header. In step 516, the routine clears the team swap header. In steps 517 and 518, the routine fetches and adds a −1 to the num_teams variable in the task swap header and waits until that variable is equal to 0. Thus, each team master stream waits until all other team master streams reach this point in the processing. Thus, these steps implement a synchronization barrier. One skilled in the art would appreciate that such barriers can be implemented in different ways. In step 519, if this stream is the task master stream, then the routine continues at step 520, else the routine continues at step 521. In step 520, the routine clears the task swap header, to initialize it for the next swap out.
In step 523, the routine invokes the swap_restart_streams function to restart the slave streams of the team by creating streams, retrieving the save_area data structures, and initializing the created streams. This routine then returns.
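The barrier of steps 517 and 518, fetch-and-add a −1, then wait for the counter to reach zero, can be sketched with C11 atomics and POSIX threads in place of the MTA's hardware fetch-and-add (the busy-wait here is one of the "different ways" such a barrier can be implemented; all names are illustrative):

```c
#include <pthread.h>
#include <stdatomic.h>

#define NTEAMS 4

static atomic_int num_teams;      /* counts down to zero at the barrier */
static atomic_int passed_barrier; /* how many masters made it through */

static void *team_master_body(void *arg)
{
    (void)arg;
    /* Steps 517-518: each team master decrements num_teams and then spins
     * until every master has arrived, i.e. until the variable reaches zero. */
    atomic_fetch_add(&num_teams, -1);
    while (atomic_load(&num_teams) != 0)
        ;                          /* spin */
    atomic_fetch_add(&passed_barrier, 1);
    return NULL;
}

int barrier_demo(void)
{
    pthread_t t[NTEAMS];
    atomic_init(&num_teams, NTEAMS);
    atomic_init(&passed_barrier, 0);
    for (int i = 0; i < NTEAMS; i++)
        pthread_create(&t[i], NULL, team_master_body, NULL);
    for (int i = 0; i < NTEAMS; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&passed_barrier);   /* all NTEAMS masters got past */
}
```

No master can pass the spin loop until every master has decremented the counter, which is exactly the property the routine relies on before clearing the task swap header.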
FIG. 6 is a flow diagram of the work_of_final_stream_in_task function. This function determines whether the task is blocked and performs an operating system call to notify the operating system that the task has completed its save processing prior to being swapped out. The routine passes to the operating system call the indication of whether the task is blocked. If the task is blocked, the operating system can decide not to schedule this task until an external event occurs that would unblock this task. In this way, the operating system can allocate the resources of the processors to other tasks that are not blocked. A task is blocked when it is waiting only on an external event. In one embodiment, a task is considered blocked when all the streams of the task are executing the virtual processor code, no stream is in the process of starting a thread, and no threads are ready to execute. However, other criteria can be used to determine whether a task is blocked. For example, the virtual processor code can increment a counter when it determines that it is blocked, and when that counter equals the number of streams of the task, then the task can be considered to be blocked. More generally, a task can notify the operating system whenever it becomes blocked so that the operating system can decide whether to swap out the task. In step 601, the routine determines whether the task is blocked. In step 602, the routine invokes the tera_task_saveswap_complete operating system call passing an indication of whether the task is currently blocked. This invocation of the operating system call does not return until the task is swapped in. The routine then returns.
FIG. 7 is a flow diagram of the process_signals function. This function loops retrieving and processing each Unix signal. The user program may have registered with the user runtime customized signal handlers for processing the various Unix signals. In step 701, the function creates a thread control block for a new thread that is to process the Unix signals. In step 702, the function invokes the tera_get_signal_number operating system call. This operating system call returns the value of the signal number in the sig_num variable. If there are no Unix signals left to be handled, then this operating system call returns a 0. In step 703, the function saves the stream status word (SSW). In steps 704-708, the function executing in the new thread loops processing each signal. In step 704, if the sig_num variable is not equal to zero, then the function continues at step 705, else the function continues at step 708. In step 705, the function locates the handler for the returned signal number. In step 706, the function invokes the located handler. In step 707, the function invokes the tera_get_signal_number operating system call to retrieve the next signal number and loops to step 704. In step 708, the function restores the saved SSW and returns.
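The drain loop of steps 704-707, repeatedly fetching a signal number and dispatching to the located handler until the call returns 0, can be sketched with a handler table and a stand-in for the tera_get_signal_number operating system call (the queue contents and handler here are invented for illustration):

```c
/* Stand-in for tera_get_signal_number: returns queued signal numbers in
 * order, then 0 when no Unix signals remain to be handled. */
static int pending[] = {2, 5, 0};
static int pending_pos;

static int get_signal_number(void)
{
    return pending[pending_pos++];
}

typedef void (*sig_handler)(int);

static int handled_sum;
static void record_handler(int sig) { handled_sum += sig; }

int process_signals_demo(void)
{
    sig_handler handlers[16];
    for (int i = 0; i < 16; i++)
        handlers[i] = record_handler;   /* per-signal handler table; the user
                                         * program may register customized
                                         * handlers for individual signals */

    /* Steps 704-707: locate and invoke the handler for each returned signal
     * number, looping until the call returns 0. */
    int sig = get_signal_number();
    while (sig != 0) {
        handlers[sig](sig);
        sig = get_signal_number();
    }
    return handled_sum;                 /* signals 2 and 5 were both handled */
}
```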
FIG. 8 is a flow diagram of the swap_restart_streams function. This function creates a stream for each of the threads that were executing when the task was swapped out and restarts the thread executing in that stream. In step 801, the function retrieves and discards the first save_area data structure in the linked list. The first save_area data structure is the data structure for the team master stream, which uses the stream provided by the operating system upon return from the tera_team_swapsave_complete operating system call of the team master stream. In steps 802-806, the function loops creating a stream for each save_area data structure in the linked list. In step 802, the function retrieves the next save_area data structure in the linked list. In step 803, if all the save_area data structures have already been retrieved, then the function returns, else the function continues at step 804. In step 804, the function creates a stream. The function loops to step 802 to retrieve the next save_area data structure. The newly created stream initializes the thread based on the retrieved save_area data structure and executes at the slave_return_from_swap address that was stored in the save_area data structure before the task was swapped out.
FIG. 9 is a flow diagram of the slave_return_from_swap routine. This routine is invoked when the slave stream is created when the task is swapped in. This routine returns to the primary trap handler at a point after the invocation of the domain_signal_handler routine. In step 901, the routine restores the remaining state that was stored during the saving before the swap out. In step 902, the routine invokes the post_swap_parallel_work_routine to perform any application-dependent work upon swap in. In step 903, the routine unlocks the thread and returns to the routine that called the domain_signal_handler routine.
FIG. 10 is a block diagram of data structures used when swapping a task. Each thread has a thread control block 1001 that contains information describing the current state of the thread and points to a team control block 1002 of the team of which the thread is a member. The team control block contains information describing the team and points to a task control block 1005 of the task of which the team is a member. The task control block contains information describing the task. The team control block contains a pointer to a team swap header 1003 that contains information relating to the swapping of the team. The team swap header contains a pointer to a linked list of save_area data structures 1004 that are used to restart the threads when the team is swapped in. The task control block contains a pointer to a task swap header 1006. The task swap header contains information relating to the swapping of the task.
Operating System/Runtime Interface
The operating system implements operating system calls that are provided to the user programs. When an operating system call is invoked, it begins executing on the same stream on which the invoking thread was executing. Certain operating system calls may be of indefinite duration. For example, an operating system call to return user input will not return until the user eventually inputs data. While the operating system call is waiting for user input, the user program can continue executing its other threads on its other streams. However, the user program effectively has one less stream on which to execute threads, because one of the streams is blocked on the operating system call.
To prevent this “taking” of a stream from the user program during a blocking operating system call, the operating system and the user runtime implement an upcall protocol to return the stream to the user program while the operating system call is blocked. An “upcall” occurs when the operating system invokes a function of the user program. The user program, typically the user runtime of the application program, can register special purpose functions with the operating system, so that the operating system knows which functions to invoke when it makes an upcall to the user program. To support the returning of a stream that is blocked in an operating system call, the user runtime registers a “rt_return_vp” function and a “rt_return_thread” function with the operating system.
When an operating system call that will block is invoked, the operating system (executing on the stream that invoked the operating system call) invokes the rt_return_vp function of the user program. This invocation returns the stream to the user program. The virtual processor code of the user program can then select another thread to execute on that stream while the operating system call is blocked. Eventually, the operating system call will become unblocked (e.g., the user has finally input data). When the operating system call becomes unblocked, the operating system (executing on one of its own streams) invokes the rt_return_thread function of the user program to notify the user program that the operating system call has now completed. The rt_return_thread function performs the necessary processing to restart (or at least schedule) the thread that was blocked on the operating system call. The rt_return_thread function then invokes the tera_return_stream operating system call to return the stream to the operating system. A malicious user program could decide not to return the stream to the operating system and instead start one of its threads executing on that stream. Thus, a user program could increase the number of streams allocated to it to an amount greater than the slim value set by the operating system. The operating system can mitigate the effects of such a malicious user program by not returning any more streams or, alternatively, by killing the task when it detects that the user program has failed to return a certain number of the operating system streams.
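The registration side of the upcall protocol, the user runtime handing the operating system pointers to its rt_return_vp and rt_return_thread functions, can be sketched with a function-pointer table. The registration API itself (register_upcalls) is invented here for illustration; only the two upcall names come from the text:

```c
#include <stddef.h>

typedef void (*upcall_fn)(int thread_id);

/* The operating system's record of the functions the user runtime registered. */
static struct {
    upcall_fn return_vp;
    upcall_fn return_thread;
} upcalls;

void register_upcalls(upcall_fn vp, upcall_fn thread)
{
    upcalls.return_vp = vp;
    upcalls.return_thread = thread;
}

static int vp_returned, thread_returned;
static void my_return_vp(int id)     { vp_returned = id; }
static void my_return_thread(int id) { thread_returned = id; }

int upcall_demo(void)
{
    register_upcalls(my_return_vp, my_return_thread);
    /* "OS side": the call made by thread 7 blocks, so the stream is handed
     * back to the user program for the virtual processor code to reuse... */
    upcalls.return_vp(7);
    /* ...and later the call completes, so the blocked thread is handed back. */
    upcalls.return_thread(7);
    return vp_returned + thread_returned;
}
```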
FIGS. 11-16 illustrate the returning of a stream to a user program when an operating system call blocks. In one embodiment, this processing is performed by the user runtime. FIG. 11 is a flow diagram of the user_entry_stub routine. This routine is a wrapper routine of an operating system call. This routine allocates a thread control block and then invokes the operating system call passing that thread control block. A new thread control block is needed because the rt_return_vp function and the rt_return_thread function may be executing at the same time on different streams. In particular, the rt_return_vp function may be executing in the stream returned by the operating system, and the rt_return_thread function may be executing in the operating system stream. Thus, the rt_return_vp function is bound to this newly allocated thread control block. When the operating system call returns, this routine waits until the operating system stream is returned to the operating system and then deallocates the thread control block and returns. In step 1101, the routine allocates a spare thread control block. In step 1102, the routine sets the spare_thread_control_block variable in the upcall transfer (“ut”) data structure to point to this spare_thread_control_block. The ut data structure, described below in detail, contains information and synchronization variables that support the return of a stream to the user programs. In step 1103, the routine sets the os_call variable of the thread control block that is not the spare thread control block to point to the address of the ut data structure. In step 1104, the routine enters the operating system passing the os_call variable to invoke the operating system call. In step 1105, upon return, if the operating system call was blocked, as indicated by the was_blocked variable of the ut data structure, then the routine continues at step 1106, else the routine continues at step 1107. 
In step 1106, the routine reads from the notify_done$ synchronization variable of the ut data structure. The full/empty bit of this synchronization variable is initially set to empty. The routine waits on this synchronization variable until the operating system call writes to it so that its full/empty bit is set to full, indicating that the operating system stream has been returned. In step 1107, the routine then deallocates the spare thread control block. In step 1108, the routine writes a 0 into the os_call variable of the thread control block and returns.
FIG. 12 is a flow diagram of the rt_return_vp function. This function is invoked by the operating system to return a stream to the user program that invoked a blocking operating system call. This function is passed the identification of the thread that invoked the blocking operating system call and its stream status word (SSW). In step 1201, the function receives the thread control block for this thread. In step 1202, the function increments the os_outstanding_threads variable of the team control block for this thread. This variable is used to keep track of the number of threads that are blocked in operating system calls. In step 1203, the function sets the ut pointer to the value in the os_call variable of the thread control block, which was set in the user_entry_stub routine. In step 1204, the function writes the passed identification of the thread into the call_id$ synchronization variable of the ut data structure. This sets the full/empty bit of the synchronization variable to full, possibly after blocking. The call_id$ synchronization variable is used by the thread executing on the stream. The thread will spin in step 1206, attempting to write to the call_id$ synchronization variable. This spinning will wait until the full/empty bit of the synchronization variable is set to empty. When the write to the call_id$ synchronization variable in step 1206 has been retried a predefined number of times, a data blocked exception is raised. The trap handler for that exception determines whether the stream is locked. When a stream is locked by a thread, no other thread can execute on the stream. If the stream is locked, the trap handler returns to retry writing to the call_id$ synchronization variable. Thus, if the stream is locked, this thread will spin, waiting until the full/empty bit of this synchronization variable is set to empty when the operating system call completes.
If, however, the stream is not locked, the trap handler places this thread on a blocked list and invokes the virtual processor code to schedule another thread to execute on this stream. In step 1205, the function sets the was_blocked flag of the ut data structure so that the user_entry_stub routine will know whether to wait for the operating system stream to be returned to the operating system before the spare thread control block can be released. In step 1206, the function writes a value of 0 into the call_id$ synchronization variable of the ut data structure. Since the full/empty bit of this synchronization variable was set to full in step 1204, step 1206 retries the write until the full/empty bit is empty or a data blocked exception is raised as described above. In step 1207, the function returns to the user_entry_stub at the return point from the operating system call.
FIG. 13 is a flow diagram of the rt_return_thread function. This function is invoked by the operating system to notify a user program that a thread that was blocked on an operating system call is now unblocked. This function is passed the thread control block of the blocked thread and a return value of the operating system call. In step 1301, the function sets the ut pointer to the value in the os_call variable of the thread control block. In step 1302, the function sets the return_value variable of the ut data structure to the passed return value. In step 1303, the function reads the call_id$ synchronization variable, which sets the full/empty bit of the synchronization variable to empty and allows the write in step 1206 to proceed. In step 1304, the function fetches and adds a −1 to the os_outstanding_threads variable of the team control block for the thread. This allows the team to keep track of the number of threads that are blocked on an operating system call. A team will not be swapped out while an operating system call from a stream on that team is blocked. In step 1305, the function invokes the tera_return_stream operating system call to return this stream to the operating system.
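The os_outstanding_threads bookkeeping of steps 1202 and 1304 can be sketched in C, with C11 atomics standing in for the MTA's hardware fetch-and-add; the type and function names below are illustrative, not the patent's actual runtime symbols.

```c
#include <stdatomic.h>

/* Hypothetical team control block holding only the counter of
 * threads currently blocked in operating system calls. */
typedef struct team_cb {
    atomic_int os_outstanding_threads;
} team_cb;

/* Step 1202 of rt_return_vp: one more thread is blocked in the OS. */
static int enter_os_call(team_cb *team) {
    return atomic_fetch_add(&team->os_outstanding_threads, 1);
}

/* Step 1304 of rt_return_thread: fetch-and-add of -1 on completion. */
static int leave_os_call(team_cb *team) {
    return atomic_fetch_add(&team->os_outstanding_threads, -1);
}

/* A team with a nonzero count must not be swapped out. */
static int team_swappable(team_cb *team) {
    return atomic_load(&team->os_outstanding_threads) == 0;
}
```

A swap-out path would consult team_swappable before notifying the operating system that the team is ready to be swapped out.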
FIG. 14 is a flow diagram of the tera_return_stream operating system call. This routine is invoked to return the operating system stream that was used to notify the user program of the completion of an operating system call. This operating system call is passed a thread control block. In step 1401, the operating system call sets the ut pointer to the os_call variable in the thread control block. In step 1402, the operating system call disables trapping of the domain_signal exception. In step 1403, the operating system call writes a value of 0 to the notify_done$ synchronization variable of the ut data structure, which notifies the user_entry_stub routine that the operating system stream has been returned. In step 1404, the operating system invokes an operating system call to effect the returning of the stream to the operating system.
FIG. 15 is a flow diagram of a trap handler routine for handling data blocked exceptions that are raised when waiting for an operating system call to complete. The exception is raised by step 1206 of the rt_return_vp function. In step 1501, if the stream is locked, then the routine returns, else the routine continues at step 1502. In step 1502, the routine adds the thread to a list of blocked threads. In step 1503, the routine starts the virtual processor code for this stream so that another thread can start executing.
FIG. 16A is a diagram illustrating the synchronization of the user program and the operating system when the user program invokes an operating system call that blocks. The diagram illustrates the processing performed by the user stream 1601 and the processing performed by an operating system stream 1602. The solid lines with arrows indicate flow of control from one routine within a stream to another routine within the same stream. The dashed lines indicate the interaction of the synchronization variables. The ellipses indicate omitted steps of the functions. The user program invokes an operating system call by invoking the user_entry_stub routine 1100. That routine in step 1104 invokes the operating system call. As indicated by the solid line between steps 1104 and 1603, the user stream starts executing the operating system call. The operating system call 1603 invokes the rt_return_vp function in step 1604. The rt_return_vp function 1200 stores a value in the call_id$ synchronization variable in step 1204, which sets the full/empty bit of the synchronization variable to full. The rt_return_vp function then writes a value into the call_id$ synchronization variable in step 1206. Since the call_id$ synchronization variable just had a value stored in it, its full/empty bit is set to full. This write cannot succeed until the full/empty bit is set to empty. Thus, step 1206 will cause a data blocked exception to be raised and the trap handler routine 1500 will be invoked. In step 1501, if the stream is locked, then the trap handler returns to the blocking synchronization write in step 1206. For a locked stream, this process of raising a data blocked exception and returning will continue until the full/empty bit of the call_id$ synchronization variable is set to empty when the operating system call completes.
If, however, the stream is not locked, then the trap handler routine places the thread on the blocked list and executes the virtual processor code to select another thread to execute on that stream. When the operating system call 1605 completes, the operating system in step 1606 invokes the rt_return_thread function 1300 of the user program. This invocation is within a stream allocated to the operating system. The rt_return_thread function 1300 reads the call_id$ synchronization variable in step 1303, which sets its full/empty bit to empty. As indicated by the dashed line, the writing of that synchronization variable in step 1206 then succeeds. The rt_return_vp function then completes the execution of step 1206 and continues to step 1207. In step 1207, the function returns to the location in the user_entry_stub routine immediately after the invocation of the operating system call. The user_entry_stub routine in step 1106 reads the notify_done$ synchronization variable. Since the full/empty bit of this synchronization variable is initially empty, this read blocks. The rt_return_thread routine in step 1305 invokes the tera_return_stream operating system call 1400 to return the stream to the operating system. In step 1403, the tera_return_stream operating system call writes a value of 0 to the notify_done$ synchronization variable, which sets its full/empty bit to full. This releases the blocked read in step 1106 and the user_entry_stub routine returns to the user code.
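The full/empty handshake above can be sketched in portable C, assuming a software emulation of the MTA's hardware full/empty bits with a mutex and condition variable; the sync_word type and function names are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

/* Emulated full/empty synchronization variable: a write blocks until
 * the word is empty and leaves it full; a read blocks until the word
 * is full and leaves it empty. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    uint64_t        value;
    int             full;   /* 1 = full, 0 = empty */
} sync_word;

static void sync_word_init(sync_word *w) {
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->full = 0;
}

/* Write-when-empty, leave full (models steps 1204 and 1206). */
static void sync_write(sync_word *w, uint64_t v) {
    pthread_mutex_lock(&w->lock);
    while (w->full)                       /* the spin/trap in the patent */
        pthread_cond_wait(&w->cond, &w->lock);
    w->value = v;
    w->full = 1;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Read-when-full, leave empty (models step 1303). */
static uint64_t sync_read(sync_word *w) {
    pthread_mutex_lock(&w->lock);
    while (!w->full)
        pthread_cond_wait(&w->cond, &w->lock);
    uint64_t v = w->value;
    w->full = 0;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
    return v;
}
```

In these terms, step 1204 is a sync_write of the call identifier into an empty word; step 1206 is a second sync_write of 0 that blocks because the word is still full, until the sync_read of step 1303 empties it. Raising a data blocked exception after a retry limit is not modeled in this sketch.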
FIG. 16B illustrates the Upcall Transfer (Ut) data structure 1650. The ut data structure 1650 is passed to the operating system when a blocking operating system call is invoked. The ut data structure 1650 contains information for synchronizing the return of the stream to the user program. The was_blocked flag 1655 is set to indicate whether the operating system call was blocked so that the user program can wait until the operating system stream is returned to the operating system and so that the user_entry_stub routine knows when return values need to be retrieved from the ut data structure 1650. The call_id$ synchronization variable 1660 is used to notify the thread that invoked the operating system call and that has locked the stream, that the operating system call is complete. The notify_done$ synchronization variable 1665 is used to notify the thread that the operating system stream has been returned. The spare_tcb pointer 1670 points to the spare thread control block that is used when the operating system notifies the user program that the operating system call is complete. The return_value variable 1675 contains the return value of the operating system call.
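A minimal C rendering of the ut data structure 1650 might look as follows; the $-suffixed fields are hardware-tagged synchronization variables on the MTA and are modeled here as a value word paired with an explicit full flag (all names are illustrative, not the patent's actual declarations).

```c
#include <stdint.h>

/* Emulated full/empty synchronization variable. */
typedef struct sync_var {
    uint64_t value;
    int      full;          /* 1 = full, 0 = empty */
} sync_var;

/* Hypothetical upcall-transfer structure, following FIG. 16B. */
typedef struct ut {
    int       was_blocked;  /* 1655: did the OS call block? */
    sync_var  call_id;      /* 1660: call_id$, signals call completion */
    sync_var  notify_done;  /* 1665: notify_done$, OS stream returned */
    void     *spare_tcb;    /* 1670: spare thread control block */
    uint64_t  return_value; /* 1675: the OS call's return value */
} ut;
```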
Inter-Thread Long Jumps
The Unix operating system supports the concept of a “long jump.” A long jump transfers control from a certain point in a program to an arbitrary return point in the program that was previously identified. A program can identify the return point by invoking a setjmp routine. The setjmp routine sets the return point to the return address of the setjmp routine invocation. When the setjmp routine returns, it returns a certain value to indicate that the setjmp routine has just returned. When a long jump jumps to the return point, the return value is a different value. In this way, the code at the return point can determine whether the setjmp routine has just returned or whether a long jump has just occurred. The setjmp routine also returns information describing the return point. To effect a long jump, a program invokes a longjmp routine, passing the information returned by the setjmp routine.
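The intra-thread mechanism can be shown with the standard C setjmp and longjmp routines; run_demo and deep_nested are illustrative names, not part of the patent's runtime.

```c
#include <setjmp.h>

static jmp_buf return_point;

/* A routine nested below the set jump point that triggers the long jump. */
static void deep_nested(void) {
    longjmp(return_point, 42);   /* control never returns here */
}

/* Establishes the return point and reports how control got back:
 * setjmp returns 0 when it has just returned directly, and returns
 * the value passed to longjmp when a long jump arrives. */
static int run_demo(void) {
    int rc = setjmp(return_point);
    if (rc == 0) {
        deep_nested();           /* never falls through */
    }
    return rc;                   /* reached only via the long jump */
}
```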
A long jump is useful for immediately jumping to a known location when the user inputs a certain command. For example, if a user has completely traversed a menu hierarchy and is viewing the lowest level menu items, a certain command (e.g., “control-c”) can be used to signify that the user wants to immediately return to the highest level menu without having to exit each of the intermediate level menus manually. To effect this immediate return to the highest level menu, the user program can invoke the setjmp routine at the point where the highest level menu is displayed and processed. Whenever the user program receives an indication that the command has been entered by the user (e.g., in an input data routine), the user program can invoke the longjmp routine to effect the immediate jump to the return point of the invocation of the setjmp routine.
The longjmp routine may be invoked by a function that is invoked by other functions to an arbitrary level of nesting. To effect the long jump, the longjmp routine uses well-known techniques to undo the stack frames resulting from the nested invocation and to release any memory that was allocated by the functions whose invocations are represented by the stack frames.
FIGS. 17-22 illustrate the processing of a long jump in a MTA computer. In an MTA computer, one thread of execution may want to effect a long jump to a set jump location (i.e., return point) that was set in another thread of execution (i.e., an inter-thread long jump). To effect such a long jump, in one embodiment of the present invention, the longjmp routine first locates the control block for the set jump thread. The longjmp routine then determines the current state of that set jump thread. Based on the current state, the longjmp routine causes the set jump thread to start executing at the set jump location. If the set jump thread is blocked on an operating system call, then the longjmp routine notifies the operating system to abort that operating system call. The longjmp routine then can set the program counter of the set jump thread to a function that performs a standard (i.e., intra-thread) long jump. When the set jump thread is eventually restarted, it will first invoke the intra-thread long jump to jump to the set jump location.
The longjmp routine may be invoked by a signal handler routine. For example, in a Unix environment, a program is notified of a “control-c” command by a Unix signal. Since, as described above, a new thread is created to handle Unix signals, each long jump in such a signal handler routine is an inter-thread long jump. When a Unix signal is received, the operating system notifies the user program whether any blocked operating system calls will automatically return or automatically be restarted. If the blocked operating system calls are restarted, then the longjmp routine directs the operating system to abort the operating system call on which the thread is blocked, if the thread is blocked on one.
FIG. 17 is a flow diagram of the basic longjmp routine. This routine is invoked whenever a long jump is to be performed. This routine determines whether the long jump is inter- or intra-thread and performs the appropriate behavior. This routine is passed a set jump buffer that was generated and returned by the setjmp routine. The set jump buffer contains the thread identifier of the thread that invoked the setjmp routine along with the set jump location information describing the state of the thread when the setjmp routine was invoked. In step 1701, if the currently executing thread is not the thread that invoked the setjmp routine, then the routine continues at step 1703, else the routine continues at step 1702. In step 1702, the routine unwinds the stack frames, restores the state from the jump buffer, and returns to the set jump location. In step 1703, the routine invokes the indirect_longjmp routine to effect an inter-thread long jump. The routine then returns.
FIG. 18 is a flow diagram of the indirect_longjmp routine. This routine implements inter-thread long jumps. The routine determines the state of the set jump thread and, based on that state, modifies the state information (e.g., program counter) of the set jump thread to effect an inter-thread long jump. In step 1801, the routine retrieves the thread identifier from the set jump buffer. In step 1802, the routine locates the save_area data structure for the set jump thread. In step 1803, the routine retrieves the thread control block from the save_area data structure. In step 1804, the routine jumps to steps 1805, 1807, 1806, or 1807, depending on whether the state of the thread is “blocked,” “resumable,” “running,” or “transition,” respectively. A “blocked” thread is one that is blocked on any synchronization timeout. The processing for a blocked thread is shown in FIG. 19. A “running” thread is one that is currently executing on a stream. The processing for a running thread is shown in FIG. 20. A “resumable” thread is one that is ready and waiting to be allocated a stream. No special processing is performed for a resumable thread. A “transition” thread is one that is in the process of being allocated a stream. In step 1807, if the state of the thread is “running,” then the routine returns, else the routine continues at step 1808. In step 1808, the routine sets the program counter in the thread control block data structure to the address of the longjmp routine. In step 1809, the routine puts the thread control block on a list of unblocked threads. In this way, when the thread starts running, it will invoke the longjmp routine.
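The four-way dispatch of FIG. 18 can be sketched as a switch over a hypothetical thread-state enumeration; the names and the tcb layout are illustrative, and the FIG. 19 and FIG. 20 details are reduced to comments.

```c
#include <stddef.h>

/* Illustrative thread states, following FIG. 18. */
typedef enum { T_BLOCKED, T_RESUMABLE, T_RUNNING, T_TRANSITION } thread_state;

typedef struct tcb {
    thread_state state;
    void       (*pc)(void);          /* saved program counter */
    struct tcb  *next_unblocked;     /* link for the unblocked list */
} tcb;

static void longjmp_stub(void) { /* would perform the intra-thread long jump */ }

/* Returns 1 if the thread was redirected to the longjmp stub and queued,
 * 0 if it was running (handled separately per FIG. 20). */
static int redirect_to_longjmp(tcb *t, tcb **unblocked_list) {
    switch (t->state) {
    case T_BLOCKED:
        /* FIG. 19: remove from blocked list, abort a restarted OS call... */
        t->state = T_RESUMABLE;
        /* fall through */
    case T_RESUMABLE:
    case T_TRANSITION:
        t->pc = longjmp_stub;                 /* step 1808 */
        t->next_unblocked = *unblocked_list;  /* step 1809 */
        *unblocked_list = t;
        return 1;
    case T_RUNNING:
        return 0;   /* FIG. 20 sets the PC in the save_area instead */
    }
    return 0;
}
```

When a redirected thread is later allocated a stream, it begins at the longjmp routine and performs the standard intra-thread long jump to the set jump location.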
FIG. 19 is a flow diagram of the processing performed when the state of the thread is “blocked.” In step 1901, the routine removes the thread from the blocked list. In step 1902, the routine sets the state of the thread to “resumable.” In step 1903, the routine invokes the check_blocked_on_os_call routine to abort the operating system call if it will be restarted. The routine then returns.
FIG. 20 is a flow diagram of the processing performed when the state of the thread is “running.” In step 2001, the routine invokes the check_blocked_on_os_call routine to abort the operating system call if it will be restarted. In step 2002, if the thread is handling a data blocked exception, then the routine continues at step 2003, else the routine continues at step 2004. In step 2003, the routine saves any additional state information that was not saved by the data blocked trap handler. The data blocked trap handler saves minimal state information in case the thread decides to immediately redo the operation that caused the exception. In step 2004, the routine creates and initializes a save_area data structure. In step 2005, the routine sets the program counter in the save_area data structure to the address of the longjmp routine and then returns.
FIG. 21 is a flow diagram of the check_blocked_on_os_call routine. In step 2101, the routine retrieves the ut data structure from the os_call variable of the thread control block. If the pointer to the ut data structure is null, the routine returns. In step 2102, if the blocked operating system call is being restarted, then the routine continues at step 2103, else the routine continues at step 2104. In step 2103, the routine requests the operating system to abort the restarted operating system call. In step 2104, the routine reads the notify_done$ synchronization variable of the ut data structure. This read will cause the longjmp routine to wait until the abort is complete. In step 2105, the routine deallocates the spare thread control block that was used to notify the user program that the operating system call has completed, and returns.
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. For example, the principles described herein may be practiced in other computer architectures that do not support multiple streams or that support multiple streams either within a single processor or within multiple processors. Accordingly, the invention is not limited except as by the appended claims.

Claims (47)

1. A method in a computer system for preparing a task to be swapped out from processor utilization, the computer system having multiple processors and an operating system, each processor having multiple streams for simultaneously executing threads of the task, the task having one or more teams of threads, each team representing threads executing on a single processor, the method comprising:
raising an exception for each stream of each processor currently executing a thread of the task; and
in response to the raising of the exception, for each stream executing a thread,
saving a state of the stream;
determining whether the stream is a team master stream;
when the stream is not the team master stream, quitting the stream;
when the stream is the team master stream,
waiting for all other streams executing threads in the same team to quit;
determining whether the stream is a task master stream,
when the stream is not the task master stream, notifying the operating system that the team for this processor is ready to be swapped out;
when the stream is the task master stream,
waiting for all other teams to notify the operating system that the team is ready to be swapped out; and
notifying the operating system that the task is ready to be swapped out.
2. The method of claim 1 wherein the team master stream is a stream that increments a team master variable first.
3. The method of claim 1 wherein the task master stream is a stream that first notifies the operating system that its team is ready to be swapped out.
4. The method of claim 1 wherein the task master stream is a stream that last notifies the operating system that its team is ready to be swapped out.
5. The method of claim 1 wherein the state of the stream is stored in a list of stream states.
6. A system for preparing a task to be swapped out from processor utilization, the system having multiple processors and an operating system, each processor having multiple streams for executing threads of the task, the task having one or more teams of threads, each team representing threads executing on a single processor, each thread having a state, the system comprising:
a component that raises an exception for each stream of each processor currently executing a thread of the task; and
an exception handler for each thread that, in response to the raising of an exception for the stream upon which the thread is executing,
saves the state of the thread;
determines whether it is a team master stream;
when the stream is not the team master stream, quits;
when the stream is the team master stream,
waits for all other streams executing threads in the same team to quit;
determines whether it is a task master stream; and
when it is not the task master stream, notifies the operating system that the team for this processor is ready to be swapped out; and
when the stream is the task master stream,
waits for all other teams to notify the operating system that the team is ready to be swapped out; and
notifies the operating system that the task is ready to be swapped out.
7. The system of claim 6 wherein the team master stream is a stream that increments a team master variable first.
8. The system of claim 6 wherein the task master stream is a stream that first notifies the operating system that its team is ready to be swapped out.
9. The system of claim 6 wherein the task master stream is a stream that last notifies the operating system that its team is ready to be swapped out.
10. The system of claim 6 wherein the state of the stream is stored in a list of stream states.
11. A method in a computer system for preparing a task to be swapped out from processor utilization, the computer system having multiple processors and an operating system, each processor having multiple streams for executing threads of the task, the task having one or more teams of threads, each team representing threads executing on a single processor, the method comprising:
for each team, designating one stream that is executing a thread as a team master stream;
designating one stream that is executing a thread as a task master stream for the task;
for each team master stream, notifying the operating system that the team is ready to be swapped out when each other thread of the team has quit its stream; and
for the task master stream, notifying the operating system that the task is ready to be swapped when each of the other teams have notified the operating system that that team is ready to be swapped out.
12. The method of claim 11 wherein the operating system swaps out the task upon receiving the notification that the task is ready to be swapped out.
13. The method of claim 11 wherein each stream stores its own state before quitting the stream or notifying the operating system.
14. The method of claim 11 wherein each stream that is not a team master stream quits its stream.
15. The method of claim 11 wherein the notifying of the operating system by the task master stream includes indicating whether the task is blocked so that the operating system can defer swapping in the task until an event occurs to unblock the task.
16. The method of claim 11 wherein each team master stream notifies the operating system of the number of streams that were executing threads so that the operating system can defer swapping in the task until enough streams are available to execute each of the threads that were executing when the task was swapped out.
17. A system for preparing a task to be swapped out from processor utilization, the system having multiple processors and an operating system, each processor having multiple streams for executing threads of the task, the task having one or more teams of threads, each team representing threads executing on a single processor, the system comprising:
a component that, for each team,
designates one stream that is executing a thread as a team master stream; and
designates one stream that is executing a thread as a task master stream for the task;
a component that, for each team master stream, notifies the operating system that the team is ready to be swapped out when each other thread of the team has quit its stream; and
a component that, for the task master stream, notifies the operating system that the task is ready to be swapped out when each of the other teams have notified the operating system that that team is ready to be swapped out.
18. The system of claim 17 wherein the operating system swaps out the task upon receiving the notification that the task is ready to be swapped out.
19. The system of claim 17 wherein each stream stores its own state before quitting the stream or notifying the operating system.
20. The system of claim 17 wherein each stream that is not a team master stream quits its stream.
21. The system of claim 17 wherein the component that notifies the operating system that the task is ready to be swapped indicates whether the task is blocked so that the operating system can defer swapping in the task until an event occurs to unblock the task.
22. The system of claim 17 wherein each team master stream notifies the operating system of the number of streams that were executing threads so that the operating system can defer swapping in the task until enough streams are available to execute each of the threads that were executing when the task was swapped out.
23. A method in a computer system for preparing a task to be swapped out from processor utilization, the computer system having a processor and an operating system, the method comprising:
saving state information of each stream of the processor that is executing a thread;
under control of each stream that is not a master stream, quitting the stream; and
under control of the master stream,
waiting for each stream that is not a master stream to quit; and
providing a notification that the task is ready to be swapped out.
24. The method of claim 23 wherein the operating system swaps out the task upon receiving the notification that the task is ready to be swapped out.
25. The method of claim 23 wherein each stream saves its own state information.
26. The method of claim 23 including upon being swapped in:
starting execution of a master stream; and
under control of the master stream,
creating a stream corresponding to each stream that quit when the task was swapped out; and
initializing each created stream based on state information saved before the stream quit.
27. The method of claim 23 wherein the notification includes an indication of whether the task is blocked so that swapping in of the task can be deferred until an event occurs to unblock the task.
28. The method of claim 23 wherein the task notifies the operating system of the number of streams that were executing threads so that swapping in of the task can be deferred until enough streams are available to execute each of the threads that were executing when the task was swapped out.
29. A system for preparing a task in a computer system to be swapped out, the computer system having a processor and an operating system, the processor having streams for executing threads of the task, each stream having a state, the system comprising:
means for saving the state of each stream that is executing a thread;
means for quitting each stream that is not a master stream; and
means for, when the stream is the master stream,
waiting for each stream that is not a master stream to quit; and
notifying the operating system that the task is ready to be swapped out.
30. The system of claim 29 wherein the operating system swaps out the task upon receiving the notification that the task is ready to be swapped out.
31. The system of claim 29 wherein each stream saves its own state.
32. The system of claim 29 including upon being swapped in:
means for starting execution of a master stream; and
means for, when the stream is the master stream,
creating a stream corresponding to each stream that quit when the task was swapped out; and
initializing each created stream based on state information saved before the stream quit.
33. The system of claim 29 wherein the means for notifying the operating system by the master stream includes means for indicating whether the task is blocked so that the operating system can defer swapping in the task until an event occurs to unblock the task.
34. The system of claim 29 including means for notifying the operating system of the number of streams that were executing threads so that the operating system can defer swapping in the task until enough streams are available to execute each of the threads that were executing when the task was swapped out.
35. A method in a computer system for restarting execution of a task that has been swapped out of processor utilization, the method comprising:
starting execution of a master stream; and
under control of the master stream,
creating a stream corresponding to each stream that quit when the task was swapped out; and
initializing each created stream based on state information saved when the stream quit.
36. The method of claim 35 wherein the method is performed by user runtime code.
37. The method of claim 35 wherein the starting is done in response to a signal.
38. The method of claim 37 wherein the signal is generated at the end of a time interval.
39. The method of claim 37 wherein the signal is generated when a routine is invoked.
40. The method of claim 35 wherein the restarting is deferred until a sufficient number of streams is available.
41. A method in a computer system for restarting execution of a task that has been swapped out of processor utilization, the method comprising:
starting execution of a task master stream;
initializing the task master stream based on information saved when the task master stream quit;
under control of the task master stream, starting team master streams; and
under control of each team master stream,
creating a stream corresponding to each stream associated with the master stream that quit when the task was swapped out; and
initializing each created stream based on state information saved when the stream quit.
42. The method of claim 41 wherein the streams associated with a team master stream were executing on the same processor when the streams quit.
43. The method of claim 41 wherein each team master stream executes on a different processor.
44. A method in a computer system for preparing a task to be swapped out from utilization of processors, the method comprising:
saving state information for each stream of the task; and
for each processor,
when no stream of the task executing on the processor is a task master stream,
providing notification that the streams of the processor are ready to be swapped out; and
when a stream of the task executing on the processor is a task master stream,
waiting for a stream of each other processor to provide notification; and
providing notification that the task is ready to be swapped.
45. The method of claim 44 wherein the task master stream is a stream that first notifies the operating system that it is ready to be swapped out.
46. The method of claim 44 wherein the task master stream is a stream that last notifies the operating system that it is ready to be swapped out.
47. The method of claim 44 wherein the state of the stream is stored in a list of stream states.
US10/659,407 1998-11-13 2003-09-10 Task swap out in a multithreaded environment Expired - Lifetime US7360221B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/659,407 US7360221B2 (en) 1998-11-13 2003-09-10 Task swap out in a multithreaded environment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/192,205 US6952827B1 (en) 1998-11-13 1998-11-13 User program and operating system interface in a multithreaded environment
US10/659,407 US7360221B2 (en) 1998-11-13 2003-09-10 Task swap out in a multithreaded environment

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/192,205 Division US6952827B1 (en) 1998-11-13 1998-11-13 User program and operating system interface in a multithreaded environment

Publications (2)

Publication Number Publication Date
US20040088711A1 US20040088711A1 (en) 2004-05-06
US7360221B2 true US7360221B2 (en) 2008-04-15

Family

ID=32028847

Family Applications (6)

Application Number Title Priority Date Filing Date
US09/192,205 Expired - Lifetime US6952827B1 (en) 1998-11-13 1998-11-13 User program and operating system interface in a multithreaded environment
US10/659,407 Expired - Lifetime US7360221B2 (en) 1998-11-13 2003-09-10 Task swap out in a multithreaded environment
US10/663,897 Expired - Fee Related US7191444B2 (en) 1998-11-13 2003-09-16 Stream management in a multithreaded environment
US10/663,895 Expired - Lifetime US7536690B2 (en) 1998-11-13 2003-09-16 Deferred task swapping in a multithreaded environment
US10/676,680 Expired - Lifetime US7392525B2 (en) 1998-11-13 2003-10-01 Inter-thread long jumps in a multithreaded environment
US10/683,774 Expired - Lifetime US7426732B2 (en) 1998-11-13 2003-10-10 Placing a task of a multithreaded environment in a known state

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/192,205 Expired - Lifetime US6952827B1 (en) 1998-11-13 1998-11-13 User program and operating system interface in a multithreaded environment

Family Applications After (4)

Application Number Title Priority Date Filing Date
US10/663,897 Expired - Fee Related US7191444B2 (en) 1998-11-13 2003-09-16 Stream management in a multithreaded environment
US10/663,895 Expired - Lifetime US7536690B2 (en) 1998-11-13 2003-09-16 Deferred task swapping in a multithreaded environment
US10/676,680 Expired - Lifetime US7392525B2 (en) 1998-11-13 2003-10-01 Inter-thread long jumps in a multithreaded environment
US10/683,774 Expired - Lifetime US7426732B2 (en) 1998-11-13 2003-10-10 Placing a task of a multithreaded environment in a known state

Country Status (1)

Country Link
US (6) US6952827B1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050132364A1 (en) * 2003-12-16 2005-06-16 Vijay Tewari Method, apparatus and system for optimizing context switching between virtual machines
US20070089111A1 (en) * 2004-12-17 2007-04-19 Robinson Scott H Virtual environment manager
US20100262971A1 (en) * 2008-07-22 2010-10-14 Toyota Jidosha Kabushiki Kaisha Multi core system, vehicular electronic control unit, and task switching method
US9195575B2 (en) 2013-05-17 2015-11-24 Coherent Logix, Incorporated Dynamic reconfiguration of applications on a multi-processor embedded system

Families Citing this family (79)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6952827B1 (en) * 1998-11-13 2005-10-04 Cray Inc. User program and operating system interface in a multithreaded environment
JP2003505753A (en) * 1999-06-10 2003-02-12 ペーアーツェーテー インフォルマツィオーンステヒノロギー ゲゼルシャフト ミット ベシュレンクテル ハフツング Sequence division method in cell structure
US7093260B1 (en) * 2000-05-04 2006-08-15 International Business Machines Corporation Method, system, and program for saving a state of a task and executing the task by a processor in a multiprocessor system
US7086053B2 (en) * 2000-06-12 2006-08-01 Sun Microsystems, Inc. Method and apparatus for enabling threads to reach a consistent state without explicit thread suspension
US7007244B2 (en) * 2001-04-20 2006-02-28 Microsoft Corporation Method and system for displaying categorized information on a user interface
US6785361B1 (en) * 2001-08-30 2004-08-31 Bellsouth Intellectual Property Corporation System and method for performance measurement quality assurance
US7577816B2 (en) * 2003-08-18 2009-08-18 Cray Inc. Remote translation mechanism for a multinode system
DE10242667B4 (en) * 2002-09-13 2004-07-29 Phoenix Contact Gmbh & Co. Kg Real-time control system with a PLC application under a non-real-time operating system
US7653906B2 (en) * 2002-10-23 2010-01-26 Intel Corporation Apparatus and method for reducing power consumption on simultaneous multi-threading systems
US20040128654A1 (en) * 2002-12-30 2004-07-01 Dichter Carl R. Method and apparatus for measuring variation in thread wait time
US7519771B1 (en) 2003-08-18 2009-04-14 Cray Inc. System and method for processing memory instructions using a forced order queue
US7743223B2 (en) 2003-08-18 2010-06-22 Cray Inc. Decoupling of write address from its associated write data in a store to a shared memory in a multiprocessor system
US7421565B1 (en) * 2003-08-18 2008-09-02 Cray Inc. Method and apparatus for indirectly addressed vector load-add-store across multi-processors
US8307194B1 (en) 2003-08-18 2012-11-06 Cray Inc. Relaxed memory consistency model
US7543133B1 (en) 2003-08-18 2009-06-02 Cray Inc. Latency tolerant distributed shared memory multiprocessor computer
US7735088B1 (en) * 2003-08-18 2010-06-08 Cray Inc. Scheduling synchronization of programs running as streams on multiple processors
JP4818919B2 (en) * 2003-08-28 2011-11-16 ミップス テクノロジーズ インコーポレイテッド Integrated mechanism for suspending and deallocating computational threads of execution within a processor
US7836450B2 (en) 2003-08-28 2010-11-16 Mips Technologies, Inc. Symmetric multiprocessor operating system for execution on non-independent lightweight thread contexts
US9032404B2 (en) * 2003-08-28 2015-05-12 Mips Technologies, Inc. Preemptive multitasking employing software emulation of directed exceptions in a multithreading processor
US7401202B1 (en) * 2004-09-14 2008-07-15 Azul Systems, Inc. Memory addressing
US8453157B2 (en) * 2004-11-16 2013-05-28 International Business Machines Corporation Thread synchronization in simultaneous multi-threaded processor machines
US8140678B2 (en) * 2004-12-28 2012-03-20 Sap Ag Failover protection from a failed worker node in a shared memory system
US7933947B2 (en) * 2004-12-28 2011-04-26 Sap Ag Connection manager that supports failover protection
US7937709B2 (en) 2004-12-29 2011-05-03 Intel Corporation Synchronizing multiple threads efficiently
US20060212450A1 (en) * 2005-03-18 2006-09-21 Microsoft Corporation Temporary master thread
US8195922B2 (en) * 2005-03-18 2012-06-05 Marvell World Trade, Ltd. System for dynamically allocating processing time to multiple threads
US20060212853A1 (en) * 2005-03-18 2006-09-21 Marvell World Trade Ltd. Real-time control apparatus having a multi-thread processor
US7529224B2 (en) * 2005-04-18 2009-05-05 International Business Machines Corporation Scheduler, network processor, and methods for weighted best effort scheduling
US7474662B2 (en) * 2005-04-29 2009-01-06 International Business Machines Corporation Systems and methods for rate-limited weighted best effort scheduling
US7996659B2 (en) * 2005-06-06 2011-08-09 Atmel Corporation Microprocessor instruction that allows system routine calls and returns from all contexts
US7546430B1 (en) * 2005-08-15 2009-06-09 Wehnus, Llc Method of address space layout randomization for windows operating systems
US7681197B1 (en) * 2005-09-21 2010-03-16 Sun Microsystems, Inc. Nested monitor handling processes
US8316220B2 (en) * 2005-09-27 2012-11-20 Sony Computer Entertainment Inc. Operating processors over a network
US20070150586A1 (en) * 2005-12-28 2007-06-28 Frank Kilian Withdrawing requests in a shared memory system
US8707323B2 (en) * 2005-12-30 2014-04-22 Sap Ag Load balancing algorithm for servicing client requests
TWI298129B (en) * 2006-01-20 2008-06-21 Hon Hai Prec Ind Co Ltd System and method for processing files distributively
WO2008118613A1 (en) * 2007-03-01 2008-10-02 Microsoft Corporation Executing tasks through multiple processors consistently with dynamic assignments
US8190624B2 (en) * 2007-11-29 2012-05-29 Microsoft Corporation Data parallel production and consumption
US8312254B2 (en) * 2008-03-24 2012-11-13 Nvidia Corporation Indirect function call instructions in a synchronous parallel thread processor
US8151266B2 (en) * 2008-03-31 2012-04-03 Qualcomm Incorporated Operating system fast run command
FR2931274B1 (en) * 2008-05-14 2018-05-04 Airbus Operations METHOD OF MANAGING DATA FOR WORKSHOP ORIENTED COLLABORATIVE SERVICE
US8566830B2 (en) * 2008-05-16 2013-10-22 Microsoft Corporation Local collections of tasks in a scheduler
US8561072B2 (en) * 2008-05-16 2013-10-15 Microsoft Corporation Scheduling collections in a scheduler
US20090288089A1 (en) * 2008-05-16 2009-11-19 International Business Machines Corporation Method for prioritized event processing in an event dispatching system
US8307353B2 (en) * 2008-08-12 2012-11-06 Oracle America, Inc. Cross-domain inlining in a system virtual machine
US8887162B2 (en) * 2008-12-17 2014-11-11 Microsoft Corporation Persistent local storage for processor resources
US8175759B2 (en) * 2009-06-22 2012-05-08 Honeywell International Inc. Systems and methods for validating predetermined events in reconfigurable control systems
US9672132B2 (en) * 2009-11-19 2017-06-06 Qualcomm Incorporated Methods and apparatus for measuring performance of a multi-thread processor
US20110252423A1 (en) 2010-04-07 2011-10-13 Apple Inc. Opportunistic Multitasking
US8464104B2 (en) 2010-09-10 2013-06-11 International Business Machines Corporation Mobility of versioned workload partitions
US9065786B2 (en) 2010-09-24 2015-06-23 Yagi Corp. Context-sensitive auto-responder
EP2629198A4 (en) * 2010-10-14 2014-04-16 Nec Corp Distributed processing device and distributed processing system
WO2012064788A1 (en) * 2010-11-08 2012-05-18 Robert Plotkin Enforced unitasking in multitasking systems
US8850443B2 (en) * 2011-11-22 2014-09-30 Red Hat Israel, Ltd. Asynchronous input/output (I/O) using alternate stack switching in kernel space
US8850450B2 (en) 2012-01-18 2014-09-30 International Business Machines Corporation Warning track interruption facility
US9110878B2 (en) * 2012-01-18 2015-08-18 International Business Machines Corporation Use of a warning track interruption facility by a program
US9104508B2 (en) 2012-01-18 2015-08-11 International Business Machines Corporation Providing by one program to another program access to a warning track facility
US20130247069A1 (en) * 2012-03-15 2013-09-19 International Business Machines Corporation Creating A Checkpoint Of A Parallel Application Executing In A Parallel Computer That Supports Computer Hardware Accelerated Barrier Operations
US9513975B2 (en) 2012-05-02 2016-12-06 Nvidia Corporation Technique for computational nested parallelism
US9928109B2 (en) 2012-05-09 2018-03-27 Nvidia Corporation Method and system for processing nested stream events
KR101984635B1 (en) 2012-07-19 2019-05-31 삼성전자주식회사 Arithmetic processing apparatus and method for high speed processing to application
US10182128B1 (en) 2013-02-07 2019-01-15 Amazon Technologies, Inc. Optimization of production systems
US9336068B2 (en) 2013-06-07 2016-05-10 Apple Inc. Throttling of application access to resources
US10389697B1 (en) * 2014-08-27 2019-08-20 Amazon Technologies, Inc. Software container activation and throttling
US9710315B2 (en) 2014-09-12 2017-07-18 Qualcomm Incorporated Notification of blocking tasks
US9684546B2 (en) * 2014-12-16 2017-06-20 Microsoft Technology Licensing, Llc Job scheduling and monitoring in a distributed computing environment
US10310820B2 (en) * 2016-05-12 2019-06-04 Basal Nuclei Inc Programming model and interpreted runtime environment for high performance services with implicit concurrency control
US10860618B2 (en) 2017-09-25 2020-12-08 Splunk Inc. Low-latency streaming analytics
US10733083B2 (en) * 2017-10-18 2020-08-04 Salesforce.Com, Inc. Concurrency testing
US10997180B2 (en) 2018-01-31 2021-05-04 Splunk Inc. Dynamic query processor for streaming and batch queries
US10936585B1 (en) 2018-10-31 2021-03-02 Splunk Inc. Unified data processing across streaming and indexed data sets
US11238048B1 (en) 2019-07-16 2022-02-01 Splunk Inc. Guided creation interface for streaming data processing pipelines
CN111552574A (en) * 2019-09-25 2020-08-18 华为技术有限公司 Multithreading synchronization method and electronic equipment
US11614923B2 (en) 2020-04-30 2023-03-28 Splunk Inc. Dual textual/graphical programming interfaces for streaming data processing pipelines
CN112612581B (en) * 2020-12-02 2024-02-13 北京和利时系统工程有限公司 Thread active exit method and device
US11650995B2 (en) 2021-01-29 2023-05-16 Splunk Inc. User defined data stream for routing data to a data destination based on a data route
US11687487B1 (en) * 2021-03-11 2023-06-27 Splunk Inc. Text files updates to an active processing pipeline
US11663219B1 (en) 2021-04-23 2023-05-30 Splunk Inc. Determining a set of parameter values for a processing pipeline
US11989592B1 (en) 2021-07-30 2024-05-21 Splunk Inc. Workload coordinator for providing state credentials to processing tasks of a data processing pipeline

Citations (90)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4819234A (en) 1987-05-01 1989-04-04 Prime Computer, Inc. Operating system debugger
US4872167A (en) 1986-04-01 1989-10-03 Hitachi, Ltd. Method for displaying program executing circumstances and an apparatus using the same
EP0422945A2 (en) 1989-10-13 1991-04-17 International Business Machines Corporation Parallel processing trace data manipulation
EP0455966A2 (en) 1990-05-10 1991-11-13 International Business Machines Corporation Compounding preprocessor for cache
US5179702A (en) * 1989-12-29 1993-01-12 Supercomputer Systems Limited Partnership System and method for controlling a highly parallel multiprocessor using an anarchy based scheduler for parallel execution thread scheduling
US5197138A (en) 1989-12-26 1993-03-23 Digital Equipment Corporation Reporting delayed coprocessor exceptions to code threads having caused the exceptions by saving and restoring exception state during code thread switching
EP0537098A2 (en) 1991-10-11 1993-04-14 International Business Machines Corporation Event handling mechanism having a filtering process and an action association process
US5257358A (en) 1989-04-18 1993-10-26 Nec Electronics, Inc. Method for counting the number of program instruction completed by a microprocessor
US5301325A (en) 1991-03-07 1994-04-05 Digital Equipment Corporation Use of stack depth to identify architecture and calling standard dependencies in machine code
US5333280A (en) 1990-04-06 1994-07-26 Nec Corporation Parallel pipelined instruction processing system for very long instruction word
US5450575A (en) 1991-03-07 1995-09-12 Digital Equipment Corporation Use of stack depth to identify machine code mistakes
US5485626A (en) 1992-11-03 1996-01-16 International Business Machines Corporation Architectural enhancements for parallel computer systems utilizing encapsulation of queuing allowing small grain processing
US5504932A (en) 1990-05-04 1996-04-02 International Business Machines Corporation System for executing scalar instructions in parallel based on control bits appended by compounding decoder
US5524250A (en) 1991-08-23 1996-06-04 Silicon Graphics, Inc. Central processing unit for processing a plurality of threads using dedicated general purpose registers and masque register for providing access to the registers
US5526521A (en) 1993-02-24 1996-06-11 International Business Machines Corporation Method and system for process scheduling from within a current context and switching contexts only when the next scheduled context is different
US5533192A (en) 1994-04-21 1996-07-02 Apple Computer, Inc. Computer program debugging system and method
US5557761A (en) 1994-01-25 1996-09-17 Silicon Graphics, Inc. System and method of generating object code using aggregate instruction movement
US5564051A (en) 1989-08-03 1996-10-08 International Business Machines Corporation Automatic update of static and dynamic files at a remote network node in response to calls issued by or for application programs
US5581764A (en) 1993-04-30 1996-12-03 Novadigm, Inc. Distributed computer network including hierarchical resource information structure and related method of distributing resources
US5594864A (en) 1992-04-29 1997-01-14 Sun Microsystems, Inc. Method and apparatus for unobtrusively monitoring processor states and characterizing bottlenecks in a pipelined processor executing grouped instructions
US5598560A (en) 1991-03-07 1997-01-28 Digital Equipment Corporation Tracking condition codes in translation code for different machine architectures
US5621886A (en) 1995-06-19 1997-04-15 Intel Corporation Method and apparatus for providing efficient software debugging
US5632032A (en) 1994-02-07 1997-05-20 International Business Machines Corporation Cross address space thread control in a multithreaded environment
GB2307760A (en) 1995-11-29 1997-06-04 At & T Corp Isochronal updating of data records
US5652889A (en) 1991-03-07 1997-07-29 Digital Equipment Corporation Alternate execution and interpretation of computer program having code at unknown locations due to transfer instructions having computed destination addresses
US5668993A (en) 1994-02-28 1997-09-16 Teleflex Information Systems, Inc. Multithreaded batch processing system
US5712996A (en) 1993-03-15 1998-01-27 Siemens Aktiengesellschaft Process for dividing instructions of a computer program into instruction groups for parallel processing
DE19710252A1 (en) 1996-08-23 1998-02-26 Fujitsu Ltd Displaying results of processing power monitoring and analysis of parallel processing system
US5740413A (en) 1995-06-19 1998-04-14 Intel Corporation Method and apparatus for providing address breakpoints, branch breakpoints, and single stepping
US5754855A (en) 1994-04-21 1998-05-19 International Business Machines Corporation System and method for managing control flow of computer programs executing in a computer system
US5768592A (en) 1994-09-27 1998-06-16 Intel Corporation Method and apparatus for managing profile data
US5768591A (en) 1995-09-08 1998-06-16 Iq Systems Method of de-bugging host-processor software in a distributed processing system having a host processor and at least one object oriented processor
US5774358A (en) 1996-04-01 1998-06-30 Motorola, Inc. Method and apparatus for generating instruction/data streams employed to verify hardware implementations of integrated circuit designs
US5774721A (en) 1995-09-08 1998-06-30 Iq Systems, Inc. Method of communication between processors in a distributed processing system having a host processor and at least one object oriented processor
US5778230A (en) 1995-11-13 1998-07-07 Object Technology Licensing Corp. Goal directed object-oriented debugging system
US5787245A (en) 1995-11-13 1998-07-28 Object Technology Licensing Corporation Portable debugging service utilizing a client debugger object and a server debugger object
EP0855648A2 (en) 1997-01-24 1998-07-29 Texas Instruments Inc. Data processing with parallel or sequential execution of program instructions
US5805892A (en) 1994-09-26 1998-09-08 Nec Corporation Method of and apparatus for debugging multitask programs
EP0864979A2 (en) 1997-03-10 1998-09-16 Digital Equipment Corporation Processor performance counter for sampling the execution frequency of individual instructions
US5812811A (en) * 1995-02-03 1998-09-22 International Business Machines Corporation Executing speculative parallel instructions threads with forking and inter-thread communication
US5826265A (en) 1996-12-06 1998-10-20 International Business Machines Corporation Data management system having shared libraries
US5867643A (en) 1995-11-06 1999-02-02 Apple Computer, Inc. Run-time data type description mechanism for performance information in an extensible computer system
US5877766A (en) 1997-08-15 1999-03-02 International Business Machines Corporation Multi-node user interface component and method thereof for use in accessing a plurality of linked records
US5887166A (en) * 1996-12-16 1999-03-23 International Business Machines Corporation Method and system for constructing a program including a navigation instruction
US5901315A (en) 1997-06-13 1999-05-04 International Business Machines Corporation Method for debugging a Java application having native method dynamic load libraries
US5913925A (en) 1996-12-16 1999-06-22 International Business Machines Corporation Method and system for constructing a program including out-of-order threads and processor and method for executing threads out-of-order
US5953530A (en) 1995-02-07 1999-09-14 Sun Microsystems, Inc. Method and apparatus for run-time memory access checking and memory leak detection of a multi-threaded program
US5961639A (en) 1996-12-16 1999-10-05 International Business Machines Corporation Processor and method for dynamically inserting auxiliary instructions within an instruction stream during execution
US5966539A (en) 1994-03-01 1999-10-12 Digital Equipment Corporation Link time optimization with translation to intermediate program and following optimization techniques including program analysis code motion live variable set generation order analysis, dead code elimination and load invariant analysis
US5978902A (en) 1997-04-08 1999-11-02 Advanced Micro Devices, Inc. Debug interface including operating system access of a serial/parallel debug port
US6002879A (en) 1997-04-01 1999-12-14 Intel Corporation Method for performing common subexpression elimination on a rack-N static single assignment language
US6002872A (en) 1998-03-31 1999-12-14 International Business Machines Corporation Method and apparatus for structured profiling of data processing systems and applications
US6003066A (en) 1997-08-14 1999-12-14 International Business Machines Corporation System for distributing a plurality of threads associated with a process initiating by one data processing station among data processing stations
US6009269A (en) 1997-03-10 1999-12-28 Digital Equipment Corporation Detecting concurrency errors in multi-threaded programs
US6016542A (en) 1997-12-31 2000-01-18 Intel Corporation Detecting long latency pipeline stalls for thread switching
US6029005A (en) 1997-04-01 2000-02-22 Intel Corporation Method for identifying partial redundancies in a new processor architecture
US6049671A (en) 1996-04-18 2000-04-11 Microsoft Corporation Method for identifying and obtaining computer software from a network computer
US6058493A (en) 1997-04-15 2000-05-02 Sun Microsystems, Inc. Logging and reproduction of automated test operations for computing systems
US6059840A (en) * 1997-03-17 2000-05-09 Motorola, Inc. Automatic scheduling of instructions to reduce code size
US6072952A (en) 1998-04-22 2000-06-06 Hewlett-Packard Co. Method and apparatus for coalescing variables
US6088788A (en) 1996-12-27 2000-07-11 International Business Machines Corporation Background completion of instruction and associated fetch request in a multithread processor
US6094716A (en) * 1998-07-14 2000-07-25 Advanced Micro Devices, Inc. Register renaming in which moves are accomplished by swapping rename tags
US6101524A (en) 1997-10-23 2000-08-08 International Business Machines Corporation Deterministic replay of multithreaded applications
US6105051A (en) * 1997-10-23 2000-08-15 International Business Machines Corporation Apparatus and method to guarantee forward progress in execution of threads in a multithreaded processor
US6112293A (en) 1997-11-17 2000-08-29 Advanced Micro Devices, Inc. Processor configured to generate lookahead results from operand collapse unit and for inhibiting receipt/execution of the first instruction based on the lookahead result
US6151704A (en) 1997-04-01 2000-11-21 Intel Corporation Method for optimizing a loop in a computer program by speculatively removing loads from within the loop
US6151701A (en) 1997-09-30 2000-11-21 Ahpah Software, Inc. Method for reconstructing debugging information for a decompiled executable file
US6212544B1 (en) 1997-10-23 2001-04-03 International Business Machines Corporation Altering thread priorities in a multithreaded processor
US6219690B1 (en) 1993-07-19 2001-04-17 International Business Machines Corporation Apparatus and method for achieving reduced overhead mutual exclusion and maintaining coherency in a multiprocessor system utilizing execution history and thread monitoring
US6223202B1 (en) 1998-06-05 2001-04-24 International Business Machines Corp. Virtual machine pooling
US6233599B1 (en) 1997-07-10 2001-05-15 International Business Machines Corporation Apparatus and method for retrofitting multi-threaded operations on a computer by partitioning and overlapping registers
US6272520B1 (en) 1997-12-31 2001-08-07 Intel Corporation Method for detecting thread switch events
US6282638B1 (en) * 1997-08-01 2001-08-28 Micron Technology, Inc. Virtual shadow registers and virtual register windows
US6289446B1 (en) 1998-09-29 2001-09-11 Axis Ab Exception handling utilizing call instruction with context information
US6298370B1 (en) 1997-04-04 2001-10-02 Texas Instruments Incorporated Computer operating process allocating tasks between first and second processors at run time based upon current processor load
US20020103847A1 (en) 2001-02-01 2002-08-01 Hanan Potash Efficient mechanism for inter-thread communication within a multi-threaded computer system
US6466898B1 (en) * 1999-01-12 2002-10-15 Terence Chan Multithreaded, mixed hardware description languages logic simulation on engineering workstations
US6470376B1 (en) * 1997-03-04 2002-10-22 Matsushita Electric Industrial Co., Ltd Processor capable of efficiently executing many asynchronous event tasks
US6487590B1 (en) 1998-10-30 2002-11-26 Lucent Technologies Inc. Method for controlling a network element from a remote workstation
US6505229B1 (en) * 1998-09-25 2003-01-07 Intelect Communications, Inc. Method for allowing multiple processing threads and tasks to execute on one or more processor units for embedded real-time processor systems
US6529958B1 (en) 1998-07-17 2003-03-04 Kabushiki Kaisha Toshiba Label switched path set up scheme with reduced need for label set up retry operation
US6560628B1 (en) 1998-04-27 2003-05-06 Sony Corporation Apparatus, method, and recording medium for scheduling execution using time slot data
US6560626B1 (en) 1998-04-02 2003-05-06 Microsoft Corporation Thread interruption with minimal resource usage using an asynchronous procedure call
US6567839B1 (en) 1997-10-23 2003-05-20 International Business Machines Corporation Thread switch control in a multithreaded processor system
US6584489B1 (en) 1995-12-07 2003-06-24 Microsoft Corporation Method and system for scheduling the use of a computer system resource using a resource planner and a resource provider
US6594698B1 (en) 1998-09-25 2003-07-15 Ncr Corporation Protocol for dynamic binding of shared resources
US6622155B1 (en) 1998-11-24 2003-09-16 Sun Microsystems, Inc. Distributed monitor concurrency control
US6631425B1 (en) 1997-10-28 2003-10-07 Microsoft Corporation Just-in-time activation and as-soon-as-possible deactivation of server application components
US6766515B1 (en) 1997-02-18 2004-07-20 Silicon Graphics, Inc. Distributed scheduling of parallel jobs with no kernel-to-kernel communication
US6785887B2 (en) 2000-12-27 2004-08-31 International Business Machines Corporation Technique for using shared resources on a multi-threaded processor

Family Cites Families (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US533280A (en) * 1895-01-29 Douglas
US4450575A (en) * 1983-01-17 1984-05-22 General Electric Company X-Ray tomography table having a virtual fulcrum arm pivot
US5168566A (en) * 1983-11-25 1992-12-01 Sharp Kabushiki Kaisha Multi-task control device for central processor task execution control provided as a peripheral device and capable of prioritizing and timesharing the tasks
US5497497A (en) * 1989-11-03 1996-03-05 Compaq Computer Corp. Method and apparatus for resetting multiple processors using a common ROM
US5210873A (en) * 1990-05-25 1993-05-11 Csi Control Systems International, Inc. Real-time computer system with multitasking supervisor for building access control or the like
US5088788A (en) * 1991-03-22 1992-02-18 Moulton Lee A Vehicle cover apparatus
ATE187268T1 (en) * 1992-07-06 1999-12-15 Microsoft Corp METHOD FOR NAMING AND BINDING OBJECTS
US5362478A (en) * 1993-03-26 1994-11-08 Vivorx Pharmaceuticals, Inc. Magnetic resonance imaging with fluorocarbons encapsulated in a cross-linked polymeric shell
JPH07129418A (en) * 1993-11-08 1995-05-19 Fanuc Ltd Program control system for multi-task environment
US5490272A (en) * 1994-01-28 1996-02-06 International Business Machines Corporation Method and apparatus for creating multithreaded time slices in a multitasking operating system
US6055559A (en) * 1994-03-07 2000-04-25 Fujitsu Limited Process switch control apparatus and a process control method
US5613114A (en) * 1994-04-15 1997-03-18 Apple Computer, Inc System and method for custom context switching
DE69635409T2 (en) 1995-03-06 2006-07-27 Intel Corp., Santa Clara A COMPUTER SYSTEM WITH UNBEATED ON-REQUEST AVAILABILITY
US5902352A (en) 1995-03-06 1999-05-11 Intel Corporation Method and apparatus for task scheduling across multiple execution sessions
US5768352A (en) 1995-05-10 1998-06-16 Mci Communications Corporation Generalized statistics engine for telephone network
US5812844A (en) * 1995-12-07 1998-09-22 Microsoft Corporation Method and system for scheduling the execution of threads using optional time-specific scheduling constraints
US6282561B1 (en) * 1995-12-07 2001-08-28 Microsoft Corporation Method and system for resource management with independent real-time applications on a common set of machines
US5960212A (en) * 1996-04-03 1999-09-28 Telefonaktiebolaget Lm Ericsson (Publ) Universal input/output controller having a unique coprocessor architecture
US6233544B1 (en) * 1996-06-14 2001-05-15 At&T Corp Method and apparatus for language translation
US5937187A (en) * 1996-07-01 1999-08-10 Sun Microsystems, Inc. Method and apparatus for execution and preemption control of computer process entities
JP3832517B2 (en) * 1996-07-05 2006-10-11 セイコーエプソン株式会社 Robot controller and control method thereof
CA2213371C (en) * 1996-08-28 2003-01-28 Hitachi, Ltd. Process executing method and resource accessing method in computer system
US6401099B1 (en) * 1996-12-06 2002-06-04 Microsoft Corporation Asynchronous binding of named objects
US5958028A (en) * 1997-07-22 1999-09-28 National Instruments Corporation GPIB system and method which allows multiple thread access to global variables
US6697935B1 (en) * 1997-10-23 2004-02-24 International Business Machines Corporation Method and apparatus for selecting thread switch events in a multithreaded processor
US5987492A (en) * 1997-10-31 1999-11-16 Sun Microsystems, Inc. Method and apparatus for processor sharing
US6125447A (en) * 1997-12-11 2000-09-26 Sun Microsystems, Inc. Protection domains to provide security in a computer system
US6018759A (en) * 1997-12-22 2000-01-25 International Business Machines Corporation Thread switch tuning tool for optimal performance in a computer processor
US6430593B1 (en) * 1998-03-10 2002-08-06 Motorola Inc. Method, device and article of manufacture for efficient task scheduling in a multi-tasking preemptive priority-based real-time operating system
US6652889B2 (en) * 1998-06-01 2003-11-25 Albemarle Corporation Concentrated aqueous bromine solutions and their preparation and use
US6952827B1 (en) * 1998-11-13 2005-10-04 Cray Inc. User program and operating system interface in a multithreaded environment

Patent Citations (93)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4872167A (en) 1986-04-01 1989-10-03 Hitachi, Ltd. Method for displaying program executing circumstances and an apparatus using the same
US4819234A (en) 1987-05-01 1989-04-04 Prime Computer, Inc. Operating system debugger
US5257358A (en) 1989-04-18 1993-10-26 Nec Electronics, Inc. Method for counting the number of program instruction completed by a microprocessor
US5564051A (en) 1989-08-03 1996-10-08 International Business Machines Corporation Automatic update of static and dynamic files at a remote network node in response to calls issued by or for application programs
EP0422945A2 (en) 1989-10-13 1991-04-17 International Business Machines Corporation Parallel processing trace data manipulation
US5168554A (en) 1989-10-13 1992-12-01 International Business Machines Corporation Converting trace data from processors executing in parallel into graphical form
US5197138A (en) 1989-12-26 1993-03-23 Digital Equipment Corporation Reporting delayed coprocessor exceptions to code threads having caused the exceptions by saving and restoring exception state during code thread switching
US5179702A (en) * 1989-12-29 1993-01-12 Supercomputer Systems Limited Partnership System and method for controlling a highly parallel multiprocessor using an anarchy based scheduler for parallel execution thread scheduling
US6195676B1 (en) 1989-12-29 2001-02-27 Silicon Graphics, Inc. Method and apparatus for user side scheduling in a multiprocessor operating system program that implements distributive scheduling of processes
US5333280A (en) 1990-04-06 1994-07-26 Nec Corporation Parallel pipelined instruction processing system for very long instruction word
US5504932A (en) 1990-05-04 1996-04-02 International Business Machines Corporation System for executing scalar instructions in parallel based on control bits appended by compounding decoder
EP0455966A2 (en) 1990-05-10 1991-11-13 International Business Machines Corporation Compounding preprocessor for cache
US5301325A (en) 1991-03-07 1994-04-05 Digital Equipment Corporation Use of stack depth to identify architecture and calling standard dependencies in machine code
US5652889A (en) 1991-03-07 1997-07-29 Digital Equipment Corporation Alternate execution and interpretation of computer program having code at unknown locations due to transfer instructions having computed destination addresses
US5450575A (en) 1991-03-07 1995-09-12 Digital Equipment Corporation Use of stack depth to identify machine code mistakes
US5598560A (en) 1991-03-07 1997-01-28 Digital Equipment Corporation Tracking condition codes in translation code for different machine architectures
US5524250A (en) 1991-08-23 1996-06-04 Silicon Graphics, Inc. Central processing unit for processing a plurality of threads using dedicated general purpose registers and masque register for providing access to the registers
EP0537098A2 (en) 1991-10-11 1993-04-14 International Business Machines Corporation Event handling mechanism having a filtering process and an action association process
US5594864A (en) 1992-04-29 1997-01-14 Sun Microsystems, Inc. Method and apparatus for unobtrusively monitoring processor states and characterizing bottlenecks in a pipelined processor executing grouped instructions
US5485626A (en) 1992-11-03 1996-01-16 International Business Machines Corporation Architectural enhancements for parallel computer systems utilizing encapsulation of queuing allowing small grain processing
US5526521A (en) 1993-02-24 1996-06-11 International Business Machines Corporation Method and system for process scheduling from within a current context and switching contexts only when the next scheduled context is different
US5712996A (en) 1993-03-15 1998-01-27 Siemens Aktiengesellschaft Process for dividing instructions of a computer program into instruction groups for parallel processing
US5581764A (en) 1993-04-30 1996-12-03 Novadigm, Inc. Distributed computer network including hierarchical resource information structure and related method of distributing resources
US6219690B1 (en) 1993-07-19 2001-04-17 International Business Machines Corporation Apparatus and method for achieving reduced overhead mutual exclusion and maintaining coherency in a multiprocessor system utilizing execution history and thread monitoring
US5557761A (en) 1994-01-25 1996-09-17 Silicon Graphics, Inc. System and method of generating object code using aggregate instruction movement
US5632032A (en) 1994-02-07 1997-05-20 International Business Machines Corporation Cross address space thread control in a multithreaded environment
US5668993A (en) 1994-02-28 1997-09-16 Teleflex Information Systems, Inc. Multithreaded batch processing system
US5966539A (en) 1994-03-01 1999-10-12 Digital Equipment Corporation Link time optimization with translation to intermediate program and following optimization techniques including program analysis code motion live variable set generation order analysis, dead code elimination and load invariant analysis
US5533192A (en) 1994-04-21 1996-07-02 Apple Computer, Inc. Computer program debugging system and method
US5754855A (en) 1994-04-21 1998-05-19 International Business Machines Corporation System and method for managing control flow of computer programs executing in a computer system
US5805892A (en) 1994-09-26 1998-09-08 Nec Corporation Method of and apparatus for debugging multitask programs
US5768592A (en) 1994-09-27 1998-06-16 Intel Corporation Method and apparatus for managing profile data
US5812811A (en) * 1995-02-03 1998-09-22 International Business Machines Corporation Executing speculative parallel instructions threads with forking and inter-thread communication
US5953530A (en) 1995-02-07 1999-09-14 Sun Microsystems, Inc. Method and apparatus for run-time memory access checking and memory leak detection of a multi-threaded program
US5740413A (en) 1995-06-19 1998-04-14 Intel Corporation Method and apparatus for providing address breakpoints, branch breakpoints, and single stepping
US5621886A (en) 1995-06-19 1997-04-15 Intel Corporation Method and apparatus for providing efficient software debugging
US5768591A (en) 1995-09-08 1998-06-16 Iq Systems Method of de-bugging host-processor software in a distributed processing system having a host processor and at least one object oriented processor
US5774721A (en) 1995-09-08 1998-06-30 Iq Systems, Inc. Method of communication between processors in a distributed processing system having a host processor and at least one object oriented processor
US5867643A (en) 1995-11-06 1999-02-02 Apple Computer, Inc. Run-time data type description mechanism for performance information in an extensible computer system
US5787245A (en) 1995-11-13 1998-07-28 Object Technology Licensing Corporation Portable debugging service utilizing a client debugger object and a server debugger object
US5778230A (en) 1995-11-13 1998-07-07 Object Technology Licensing Corp. Goal directed object-oriented debugging system
GB2307760A (en) 1995-11-29 1997-06-04 At & T Corp Isochronal updating of data records
US6584489B1 (en) 1995-12-07 2003-06-24 Microsoft Corporation Method and system for scheduling the use of a computer system resource using a resource planner and a resource provider
US5774358A (en) 1996-04-01 1998-06-30 Motorola, Inc. Method and apparatus for generating instruction/data streams employed to verify hardware implementations of integrated circuit designs
US6049671A (en) 1996-04-18 2000-04-11 Microsoft Corporation Method for identifying and obtaining computer software from a network computer
DE19710252A1 (en) 1996-08-23 1998-02-26 Fujitsu Ltd Displaying results of processing power monitoring and analysis of parallel processing system
US5903730A (en) 1996-08-23 1999-05-11 Fujitsu Limited Method of visualizing results of performance monitoring and analysis in a parallel computing system
US5826265A (en) 1996-12-06 1998-10-20 International Business Machines Corporation Data management system having shared libraries
US5913925A (en) 1996-12-16 1999-06-22 International Business Machines Corporation Method and system for constructing a program including out-of-order threads and processor and method for executing threads out-of-order
US5887166A (en) * 1996-12-16 1999-03-23 International Business Machines Corporation Method and system for constructing a program including a navigation instruction
US5961639A (en) 1996-12-16 1999-10-05 International Business Machines Corporation Processor and method for dynamically inserting auxiliary instructions within an instruction stream during execution
US6088788A (en) 1996-12-27 2000-07-11 International Business Machines Corporation Background completion of instruction and associated fetch request in a multithread processor
EP0855648A2 (en) 1997-01-24 1998-07-29 Texas Instruments Inc. Data processing with parallel or sequential execution of program instructions
US6766515B1 (en) 1997-02-18 2004-07-20 Silicon Graphics, Inc. Distributed scheduling of parallel jobs with no kernel-to-kernel communication
US6470376B1 (en) * 1997-03-04 2002-10-22 Matsushita Electric Industrial Co., Ltd Processor capable of efficiently executing many asynchronous event tasks
EP0864979A2 (en) 1997-03-10 1998-09-16 Digital Equipment Corporation Processor performance counter for sampling the execution frequency of individual instructions
US6009269A (en) 1997-03-10 1999-12-28 Digital Equipment Corporation Detecting concurrency errors in multi-threaded programs
US6059840A (en) * 1997-03-17 2000-05-09 Motorola, Inc. Automatic scheduling of instructions to reduce code size
US6029005A (en) 1997-04-01 2000-02-22 Intel Corporation Method for identifying partial redundancies in a new processor architecture
US6002879A (en) 1997-04-01 1999-12-14 Intel Corporation Method for performing common subexpression elimination on a rack-N static single assignment language
US6151704A (en) 1997-04-01 2000-11-21 Intel Corporation Method for optimizing a loop in a computer program by speculatively removing loads from within the loop
US6298370B1 (en) 1997-04-04 2001-10-02 Texas Instruments Incorporated Computer operating process allocating tasks between first and second processors at run time based upon current processor load
US5978902A (en) 1997-04-08 1999-11-02 Advanced Micro Devices, Inc. Debug interface including operating system access of a serial/parallel debug port
US6058493A (en) 1997-04-15 2000-05-02 Sun Microsystems, Inc. Logging and reproduction of automated test operations for computing systems
US5901315A (en) 1997-06-13 1999-05-04 International Business Machines Corporation Method for debugging a Java application having native method dynamic load libraries
US6233599B1 (en) 1997-07-10 2001-05-15 International Business Machines Corporation Apparatus and method for retrofitting multi-threaded operations on a computer by partitioning and overlapping registers
US6282638B1 (en) * 1997-08-01 2001-08-28 Micron Technology, Inc. Virtual shadow registers and virtual register windows
US6003066A (en) 1997-08-14 1999-12-14 International Business Machines Corporation System for distributing a plurality of threads associated with a process initiating by one data processing station among data processing stations
US5877766A (en) 1997-08-15 1999-03-02 International Business Machines Corporation Multi-node user interface component and method thereof for use in accessing a plurality of linked records
US6151701A (en) 1997-09-30 2000-11-21 Ahpah Software, Inc. Method for reconstructing debugging information for a decompiled executable file
US6212544B1 (en) 1997-10-23 2001-04-03 International Business Machines Corporation Altering thread priorities in a multithreaded processor
US6567839B1 (en) 1997-10-23 2003-05-20 International Business Machines Corporation Thread switch control in a multithreaded processor system
US6105051A (en) * 1997-10-23 2000-08-15 International Business Machines Corporation Apparatus and method to guarantee forward progress in execution of threads in a multithreaded processor
US6101524A (en) 1997-10-23 2000-08-08 International Business Machines Corporation Deterministic replay of multithreaded applications
US6631425B1 (en) 1997-10-28 2003-10-07 Microsoft Corporation Just-in-time activation and as-soon-as-possible deactivation of server application components
US6112293A (en) 1997-11-17 2000-08-29 Advanced Micro Devices, Inc. Processor configured to generate lookahead results from operand collapse unit and for inhibiting receipt/execution of the first instruction based on the lookahead result
US6272520B1 (en) 1997-12-31 2001-08-07 Intel Corporation Method for detecting thread switch events
US6016542A (en) 1997-12-31 2000-01-18 Intel Corporation Detecting long latency pipeline stalls for thread switching
US6002872A (en) 1998-03-31 1999-12-14 International Business Machines Corporation Method and apparatus for structured profiling of data processing systems and applications
US6560626B1 (en) 1998-04-02 2003-05-06 Microsoft Corporation Thread interruption with minimal resource usage using an asynchronous procedure call
US6072952A (en) 1998-04-22 2000-06-06 Hewlett-Packard Co. Method and apparatus for coalescing variables
US6560628B1 (en) 1998-04-27 2003-05-06 Sony Corporation Apparatus, method, and recording medium for scheduling execution using time slot data
US6223202B1 (en) 1998-06-05 2001-04-24 International Business Machines Corp. Virtual machine pooling
US6094716A (en) * 1998-07-14 2000-07-25 Advanced Micro Devices, Inc. Register renaming in which moves are accomplished by swapping rename tags
US6529958B1 (en) 1998-07-17 2003-03-04 Kabushiki Kaisha Toshiba Label switched path set up scheme with reduced need for label set up retry operation
US6594698B1 (en) 1998-09-25 2003-07-15 Ncr Corporation Protocol for dynamic binding of shared resources
US6505229B1 (en) * 1998-09-25 2003-01-07 Intelect Communications, Inc. Method for allowing multiple processing threads and tasks to execute on one or more processor units for embedded real-time processor systems
US6289446B1 (en) 1998-09-29 2001-09-11 Axis Ab Exception handling utilizing call instruction with context information
US6487590B1 (en) 1998-10-30 2002-11-26 Lucent Technologies Inc. Method for controlling a network element from a remote workstation
US6622155B1 (en) 1998-11-24 2003-09-16 Sun Microsystems, Inc. Distributed monitor concurrency control
US6466898B1 (en) * 1999-01-12 2002-10-15 Terence Chan Multithreaded, mixed hardware description languages logic simulation on engineering workstations
US6785887B2 (en) 2000-12-27 2004-08-31 International Business Machines Corporation Technique for using shared resources on a multi-threaded processor
US20020103847A1 (en) 2001-02-01 2002-08-01 Hanan Potash Efficient mechanism for inter-thread communication within a multi-threaded computer system

Non-Patent Citations (43)

* Cited by examiner, † Cited by third party
Title
"Method of Tracing Events in Multi-Threaded OS/2 Applications," IBM Tech. Disclosure Bulletin, Sep. 1993, pp. 19-22.
Adelberg, Brad et al., "The Strip Rule System for Efficiently Maintaining Derived Data," Sigmod Record, Association for Computing Machinery, New York, vol. 26, No. 2, Jun. 1, 1997.
Agrawal, Gagan et al., "Interprocedural Data Flow Based Optimizations for Compilation of Irregular Problems," Annual Workshop on Language and Compilers for Parallel Computing, 1995.
Agrawal, Hiralal, "Dominators, Super Blocks and Program Coverage," 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Portland, Oregon, Jan. 17-21, 1994.
Alverson, Gail et al., "Processor Management in the Tera MTA System," 1995.
Alverson, Gail et al., "Scheduling on the Tera MTA," Job Scheduling Strategies for Parallel Processing, 1995.
Alverson, Gail et al., "Tera Hardware-Software Cooperation," Proceedings of Supercomputing 1997, San Jose, California, Nov. 1997.
Alverson, Robert et al., "The Tera Computer System," Proceedings of 1990 ACM International Conference on Supercomputing, Jun. 1990.
Anderson, Jennifer, et al., "Continuous Profiling: Where Have All The Cycles Gone?," Operating Systems Review, ACM Headquarters, New York, vol. 31, No. 5, Dec. 1, 1997.
Bailey, D.H. et al., "The NAS Parallel Benchmarks-Summary and Preliminary Results," Numerical Aerodynamic Simulation (NAS) Systems Division, NASA Ames Research Center, California, 1991.
Briggs, Preston et al., "Coloring Register Pairs," ACM Letters on Programming Languages and Systems, vol. 1, No. 1, Mar. 1992.
Briggs, Preston et al., "Effective Partial Redundancy Elimination," ACM SIGPLAN Notices, Association for Computing Machinery, New York, vol. 29, No. 6, Jun. 1, 1994.
Briggs, Preston, et al., "Coloring Heuristics for Register Allocation," Department of Computer Science, Rice University, Houston, Texas, Jun. 1989.
Callahan, David et al., "Improving Register Allocation for Subscripted Variables," Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, White Plains, New York, Jun. 20-22, 1990.
Callahan, David et al., "Register Allocation via Hierarchical Graph Coloring," Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Canada, Jun. 26-28, 1991.
Callahan, David et al., "A Future-Based Parallel Language for a General-Purpose Highly-Parallel Computer," Languages and Compilers for Parallel Computing, MIT Press, 1990.
Callahan, David, "Recognizing and Parallelizing Bounded Recurrences," Aug. 1991.
Chow, Fred C. et al., "The Priority-Based Coloring Approach to Register Allocation," ACM Transactions on Programming Languages and Systems, vol. 12, No. 4, Oct. 1990, pp. 501-536.
Click, Cliff, "Global Code Motion, Global Value Numbering," ACM SIGPLAN Notices, Association for Computing Machinery, New York, vol. 30, No. 6, Jun. 1, 1995.
Cook, Jonathan et al., "Event Based Detection of Concurrency," SIGSOFT '98 ACM, 1998, pp. 34-45.
Davidson, Jack W. et al., "Reducing the Cost of Branches by Using Registers," Proceedings of the 17th Annual Symposium on Computer Architecture, Seattle Washington, May 28-31, 1990.
Galarowicz, Jim et al., "Analyzing Message Passing Programs on the Cray T3E with PAT and VAMPIR," Research Report, "Online!", May 1998.
Goldman, Kenneth, J., "Introduction to Data Structures," 1996, Retrieved from Internet https://www.cs.wustl.edu/{kjg/CS101_SP97/Notes?DataStructures/structures.html.
Hayashi, H. et al., "ALPHA: A High Performance Lisp Machine Equipped with a New Stack Structure and Garbage Collection System," 10th Annual International Symposium on Computer Architecture, 1983.
Knoop, Jens et al., "The Power of Assignment Motion," ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, La Jolla, California, Jun. 18-21, 1995.
Kolte, Priyadarshan et al., "Load/Store Range Analysis for Global Register Allocation," ACM-SIGPLAN, Jun. 1993.
Korry, Richard et al., "Memory Management in the Tera MTA System," 1995.
Lal, George et al., "Iterated Register Coalescing," ACM Transactions on Programming Languages and Systems, vol. 18, No. 3, May 1996, pp. 300-324.
Lang, Tomas et al., "Reduced Register Saving/Restoring in Single-Window Register Files," Computer Architecture News, vol. 14, No. 3, Jun. 1986.
Linton, Mark A., "The Evolution of Dbx," USENIX Summer Conference, Jun. 11-15, 1990.
Major System Characteristics of the TERA MTA, 1995.
Minwen, Ji et al., "Performance Measurements for Multithreaded Programs," SIGMETRICS '98, ACM, 1998, pp. 168-170.
Ram, A. et al., "Parallel Garbage Collection Without Synchronization Overhead," 12th Annual Symposium on Computer Architecture, Jun. 17, 1985.
Shim, SangMin et al., "Split-Path Enhanced Pipeline Scheduling for Loops with Control Flows," IEEE, Dec. 2, 1998.
Silberschatz et al., "Operating System Concepts," Fifth Edition, John Wiley & Sons, Inc., 1998, p. 103.
Smith, Burton, "Opportunities for Growth in High Performance Computing," Nov. 1994.
Smith, Burton, "The End of Architecture," Keynote Address Presented at the 17th Annual Symposium on Computer Architecture, Seattle, Washington, May 29, 1990.
Smith, Burton, "The Quest for General-Purpose Parallel Computing."
Sreedhar, Vugranam C. et al., "Incremental Computation of Dominator Trees," ACM SIGPLAN Notices, Association for Computing Machinery, New York, vol. 30, No. 3, Mar. 1, 1995.
Surajit, Chaudhuri et al., "An Overview of Data Warehousing and OLAP Technology," Sigmod Record, Association for Computing Machinery, New York, vol. 26, No. 1, Mar. 1997.
Tera MTA Principles of Operation, Nov. 18, 1997.
Touzeau, Roy F., "A Fortran Compiler for the FPS-164 Scientific Computer," Proceedings of the ACM SIGPLAN '84 Symposium on Compiler Construction, SIGPLAN Notices 19(6):48-57, Jun. 1984.
Tsai, Jenn-Yuan et al., "Performance Study of a Concurrent Multithreaded Processor," IEEE, 1998, pp. 24-35.

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050132364A1 (en) * 2003-12-16 2005-06-16 Vijay Tewari Method, apparatus and system for optimizing context switching between virtual machines
US20070089111A1 (en) * 2004-12-17 2007-04-19 Robinson Scott H Virtual environment manager
US9606821B2 (en) 2004-12-17 2017-03-28 Intel Corporation Virtual environment manager for creating and managing virtual machine environments
US10019273B2 (en) 2004-12-17 2018-07-10 Intel Corporation Virtual environment manager
US10642634B2 (en) 2004-12-17 2020-05-05 Intel Corporation Method, apparatus and system for transparent unification of virtual machines
US11347530B2 (en) 2004-12-17 2022-05-31 Intel Corporation Method, apparatus and system for transparent unification of virtual machines
US20100262971A1 (en) * 2008-07-22 2010-10-14 Toyota Jidosha Kabushiki Kaisha Multi core system, vehicular electronic control unit, and task switching method
US8856196B2 (en) * 2008-07-22 2014-10-07 Toyota Jidosha Kabushiki Kaisha System and method for transferring tasks in a multi-core processor based on trial execution and core node
US9195575B2 (en) 2013-05-17 2015-11-24 Coherent Logix, Incorporated Dynamic reconfiguration of applications on a multi-processor embedded system
US9990227B2 (en) 2013-05-17 2018-06-05 Coherent Logix, Incorporated Dynamic reconfiguration of applications on a multi-processor embedded system
US11023272B2 (en) 2013-05-17 2021-06-01 Coherent Logix, Incorporated Dynamic reconfiguration of applications on a multi-processor embedded system
US11726812B2 (en) 2013-05-17 2023-08-15 Coherent Logix, Incorporated Dynamic reconfiguration of applications on a multi-processor embedded system

Also Published As

Publication number Publication date
US20040093603A1 (en) 2004-05-13
US20040064818A1 (en) 2004-04-01
US7191444B2 (en) 2007-03-13
US20040088711A1 (en) 2004-05-06
US20040078795A1 (en) 2004-04-22
US7426732B2 (en) 2008-09-16
US20040064816A1 (en) 2004-04-01
US7392525B2 (en) 2008-06-24
US7536690B2 (en) 2009-05-19
US6952827B1 (en) 2005-10-04

Similar Documents

Publication Publication Date Title
US7360221B2 (en) Task swap out in a multithreaded environment
US7904685B1 (en) Synchronization techniques in a multithreaded environment
US7428727B2 (en) Debugging techniques in a multithreaded environment
Anderson et al. Scheduler activations: Effective kernel support for the user-level management of parallelism
US6314471B1 (en) Techniques for an interrupt free operating system
US6799236B1 (en) Methods and apparatus for executing code while avoiding interference
US5305455A (en) Per thread exception management for multitasking multithreaded operating system
US5515538A (en) Apparatus and method for interrupt handling in a multi-threaded operating system kernel
US4685125A (en) Computer system with tasking
JP2866241B2 (en) Computer system and scheduling method
EP0767938B1 (en) Method for enforcing a hierarchical invocation structure in real time asynchronous software applications
US20020161957A1 (en) Methods and systems for handling interrupts
US9274859B2 (en) Multi processor and multi thread safe message queue with hardware assistance
US5666523A (en) Method and system for distributing asynchronous input from a system input queue to reduce context switches
US20160085601A1 (en) Transparent user mode scheduling on traditional threading systems
US20020046230A1 (en) Method for scheduling thread execution on a limited number of operating system threads
EP0531107A2 (en) Program execution manager
US7661115B2 (en) Method, apparatus and program storage device for preserving locked pages in memory when in user mode
US11620215B2 (en) Multi-threaded pause-less replicating garbage collection
US20100138685A1 (en) Real-Time Signal Handling In Guest And Host Operating Systems
US7360213B1 (en) Method for promotion and demotion between system calls and fast kernel calls
EP1197857A2 (en) Method of controlling a computer
US7996848B1 (en) Systems and methods for suspending and resuming threads
JP4006428B2 (en) Computer system
Bhoedjang et al. User-space solutions to thread switching overhead

Legal Events

Date Code Title Description
STCF Information on status: patent grant

Free format text: PATENTED CASE

CC Certificate of correction
FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: CRAY INC., WASHINGTON

Free format text: CHANGE OF NAME;ASSIGNOR:TERA COMPUTER COMPANY;REEL/FRAME:031293/0937

Effective date: 20000403

Owner name: TERA COMPUTER COMPANY, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ALVERSON, GAIL A.;CALLAHAN, II, CHARLES DAVID;COATNEY, SUSAN L.;AND OTHERS;SIGNING DATES FROM 19990204 TO 19990208;REEL/FRAME:031287/0050

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 8

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12