
THE CLI'S EXECUTION ENVIRONMENT

© 2005 by Pearson Education, Inc.


In the previous chapters, you learned about the aspects of the CLI that are most relevant to DVM technology. This final chapter summarizes the environment in which the CLI DVM executes. In the CLR, managed applications generally use classes from the .NET namespace to obtain CLR services; these classes are implemented in Microsoft’s Framework Class Library (FCL). In Rotor, only a subset of the CLR namespace is available to managed applications, and that subset is implemented in the base class library; Rotor interacts with the base class library to manage application programs (see Figure 10.1). The platform adaptation layer (PAL) enables the Rotor CLI to use the system call functions that are exported from the host operating system. In the case of Windows, the PAL is a thin veneer, since the CLR code (from which the CLI is derived) was designed and implemented to run on the Windows NT family of operating systems. In the case of FreeBSD and OS X, the PAL is a substantial software project. The base class library and PAL are reviewed in this chapter.

FIGURE 10.1 The Environs

[Figure: layered block diagram. From top to bottom: Managed Applications; Rotor Base Class Library; the CLI (containing AppDomain, Policy Manager, ClassLoader, MethodTable, AssemblyLoader, EE, and JIT Compiler); Platform Adaptation Layer (PAL); Host OS; Hardware.]

10.1 THE BASE CLASS LIBRARY

The mscoree.dll implements the execution engine for the Rotor CLI. As you learned in Chapters 7–9, the operation of mscoree depends heavily on the behavior of mscorlib.dll, the compiled version of the base class library.

The source code for the base class library is in .../clr/src/bcl/. This directory is organized to reflect the system namespace organization. The directory contains about 150 C# files to define components in the system namespace, ranging from the system.activator class to the system.weakreference class. The directory also contains subdirectories to implement sub namespaces:

• collections
• configuration
    > assemblies
• diagnostics
    > symbolstore
• globalization
• io
    > isolatedstorage
• reflection
    > cache
    > emit
• resources
• runtime
    > compilerservices
    > interopservices
        • expando
    > remoting
    > serialization
        • formatters
            • binary
• security
    > permissions
    > policy
    > principal
    > util
• text
• threading

By writing the hierarchical names in dotted notation, you can infer which namespaces are included in the base class library. (For example, the “collections” directory contains classes in the system.collections namespace. Similarly, since there is an “assemblies” directory under “configuration,” the base class library includes classes from the system.configuration.assemblies namespace.) If you intend to use any of these classes, the MSDN CLR documentation describes all FCL classes, including those in the Rotor base class library.
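The directory-to-namespace rule can be sketched in a few lines of C. This is only an illustration of the naming convention described above; the function name bcl_dir_to_namespace() is hypothetical and does not appear anywhere in the Rotor distribution.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of Rotor): converts a directory path that
 * is relative to .../clr/src/bcl/ into the dotted namespace name that the
 * directory implements, rooted at "system".
 * Returns 0 on success, or -1 if the output buffer is too small. */
int bcl_dir_to_namespace(const char *rel_dir, char *out, size_t out_len)
{
    int n;

    assert(rel_dir != NULL && out != NULL);

    if (rel_dir[0] == '\0')
        n = snprintf(out, out_len, "system");           /* the root itself */
    else
        n = snprintf(out, out_len, "system.%s", rel_dir);

    if (n < 0 || (size_t)n >= out_len)
        return -1;                                      /* would truncate */

    /* Each path separator becomes a namespace separator. */
    for (char *p = strchr(out, '/'); p != NULL; p = strchr(p, '/'))
        *p = '.';

    return 0;
}
```

For instance, passing "configuration/assemblies" yields "system.configuration.assemblies", which matches the example in the text.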

The mscorlib code is managed C# code, but the Rotor CLI is unmanaged C++ code. This means that when an mscorlib function calls a CLI function, it would normally need to use the P/Invoke mechanism. However, many of the C# functions in the mscorlib already have intimate knowledge of the CLI, so the CLR designers incorporated the fcall interface (see Chapter 7) to bypass the normal P/Invoke machinery. As you know, the CLI functions that are called with the fcall mechanism use special macros (and the ecall table), so they are easy to identify. Furthermore, the mscorlib functions are the only code that uses fcalls.
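To make the idea of a name-to-native-function table concrete, here is a minimal sketch in C. It is not the layout of the real ecall table in .../clr/src/vm/ecall.cpp; every name in it (demo_ecall_entry, demo_ecall_lookup(), the "Demo.Math" methods) is invented for illustration, and the real fcall machinery also deals with calling conventions and GC frames that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical native entry point signature for this sketch only. */
typedef int (*demo_native_fn)(int);

static int demo_native_double(int x) { return 2 * x; }
static int demo_native_negate(int x) { return -x; }

/* One table row: a managed method name and the native code that backs it. */
struct demo_ecall_entry {
    const char    *managed_name;
    demo_native_fn impl;
};

static const struct demo_ecall_entry demo_ecall_table[] = {
    { "Demo.Math::Double", demo_native_double },
    { "Demo.Math::Negate", demo_native_negate },
};

/* Resolve a managed "internal call" to its native implementation.
 * Returns NULL when the name is not registered in the table. */
demo_native_fn demo_ecall_lookup(const char *managed_name)
{
    size_t n = sizeof demo_ecall_table / sizeof demo_ecall_table[0];

    assert(managed_name != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(demo_ecall_table[i].managed_name, managed_name) == 0)
            return demo_ecall_table[i].impl;
    }
    return NULL;
}
```

The point of the table is that the binding is resolved by the execution engine itself, so no marshaling layer of the P/Invoke kind sits between the managed caller and the native function.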

10.2 THE PLATFORM ADAPTATION LAYER

The purpose of the Platform Adaptation Layer (PAL) is to adapt the Rotor CLI code so that it can run on a target OS and hardware (see Figure 10.2). In the Rotor implementation, it is interesting to note that all parts of the CLI other than the PAL code are the same for all host OS implementations, a remarkable piece of software design and implementation (even though it is somewhat tangential to DVM technology).

As mentioned earlier, the Rotor CLI source code was derived from Microsoft’s production CLR source code. The first aspect of the challenge was this adaptation, which generally removed functionality, making it easier for people to study an ECMA-335 implementation without needing access to all of Microsoft’s proprietary software (such as Internet Explorer). The second aspect of the challenge is to port the Rotor CLI code so that it can be executed on top of different operating systems and different hardware platforms. The Rotor Version 1.0 distribution has been ported to FreeBSD UNIX executing on an Intel 80386 architecture, and to Apple Macintosh computers that have had OS X adapted to their PowerPC hardware and low-level software environment.

FIGURE 10.2 The Challenge

[Figure: four software stacks. The CLR runs directly on Windows XP on i386. The Rotor CLI runs on a PAL over Windows XP on i386, on a PAL over FreeBSD on i386, and on a PAL (with an adaptor) over OS X and the Apple tools on PowerPC.]




What is involved in such a port of the Rotor CLI to a different OS? Observe that the CLR was designed to execute on top of the Win32 API, a system call interface of over 2,000 functions. This OS interface supports the usual (POSIX-like) process/memory/file/device management functions such as those used to create/destroy processes, allocate/deallocate resources (including the primary memory), and open/close and read/write files, as well as device functions. However, the Win32 API also exports functions to support kernel threads, memory-mapped files, and a diversity of synchronization models, as well as window graphics functions.

There are at least two obvious strategies you could take to adapt the Rotor CLI to a different OS, such as FreeBSD (the first port): (1) The Win32 API function calls in the CLI code could be replaced by FreeBSD system calls, or (2) a software module could be built to map Win32 API calls into the FreeBSD system calls. The problem with the first approach is that it requires changing the implementation of the CLI itself, since features that Win32 provides but FreeBSD does not would have to be added to the CLI code (such as synchronization implementations to allow Win32 API events to be implemented in terms of POSIX thread synchronization primitives). This would add complexity to the CLI code, which runs counter to the goal of releasing the Rotor CLI code under the shared source agreement.

The Rotor team decided to use the second strategy, namely to build a software module that would be inserted between the CLI and the host FreeBSD OS: the PAL. This is a nontrivial software module (you will find it beneficial to read the requirements documentation that comes with the distribution in .../docs/techinfo/pal_guide.html early in your study of the PAL). Due to the difficulty of fully implementing the Win32 API on top of the FreeBSD (POSIX.1) system call interface, the designers note that “The PAL has been specified such that correct implementation will enable the portable subset of the .NET Framework to be run on a variety of non-Microsoft® Windows® operating systems” (from the PAL documentation Overview). This means that there is no assurance that the PAL actually supplies full Win32 API compatibility, but it does implement enough of the functionality to enable users to run the Rotor CLI code on top of the PAL.
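To give a feel for what "mapping a Win32 facility onto POSIX" involves, the following sketch builds a Win32-style manual-reset event from a Pthread mutex and condition variable. It is a simplified illustration only: the demo_event names are hypothetical, and the real PAL implementation is considerably more involved, since it maintains its own wait queues.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical manual-reset event: stays signaled until explicitly reset. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} demo_event;

/* Returns 0 on success, -1 if a Pthread object could not be created. */
int demo_event_init(demo_event *ev, bool initially_signaled)
{
    assert(ev != NULL);
    if (pthread_mutex_init(&ev->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&ev->cond, NULL) != 0) {
        pthread_mutex_destroy(&ev->lock);
        return -1;
    }
    ev->signaled = initially_signaled;
    return 0;
}

/* Analogous to SetEvent(): wake every waiter and stay signaled. */
void demo_event_set(demo_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = true;
    pthread_cond_broadcast(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/* Analogous to ResetEvent(): later waiters will block again. */
void demo_event_reset(demo_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = false;
    pthread_mutex_unlock(&ev->lock);
}

/* Analogous to an infinite wait: block until the event is signaled. */
void demo_event_wait(demo_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->signaled)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&ev->cond, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
}

bool demo_event_is_set(demo_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    bool s = ev->signaled;
    pthread_mutex_unlock(&ev->lock);
    return s;
}

void demo_event_destroy(demo_event *ev)
{
    pthread_cond_destroy(&ev->cond);
    pthread_mutex_destroy(&ev->lock);
}
```

Even this small example needs a state flag, a lock, and a condition variable to imitate one Win32 object, which suggests why the full PAL is a substantial piece of software.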

In the remainder of this section, we will examine some of the PAL functionality for the FreeBSD port. The OS X port is generally the same as the FreeBSD port; there are few enough differences to enable us to focus only on the FreeBSD port in this description.

10.2.1 Thread Management

When you run clix.exe from a UNIX shell, a new single-threaded OS process is created and it execs the clix.exe code. The Rotor CLI is launched, and begins to execute in the single-threaded UNIX process. The default behavior of the CLI is to continue to execute as a single-threaded process, although the application can create new CLI threads using the System.Threading mscorlib API (see the class definition in .../src/bcl/system/threading/thread.cs). Summarizing the class Thread code fragment, this code ultimately makes an fcall to the CLI:



public sealed class Thread
{
    ...
    /*=========================================================================
    ** Creates a new Thread object which will begin execution at
    ** start.ThreadStart on a new thread when the Start method is called.
    ...
    =========================================================================*/
    ...
    public Thread(ThreadStart start) {
        if (start == null) {...}
        SetStart(start);
    }

SetStart(start), in the same file, is defined as follows:

    ...
    [MethodImplAttribute(MethodImplOptions.InternalCall)]
    private extern void SetStart(ThreadStart start);


The fcall linkage in .../clr/src/vm/ecall.cpp causes the following mscoree method (defined in .../clr/src/vm/comsynchronizable.cpp) to be called:

FCIMPL2(void, ThreadNative::SetStart, ThreadBaseObject* pThisUNSAFE,
        Object* pDelegateUNSAFE)
{
    ...
    if (pThis->m_InternalThread == NULL) {
        // if we don't have an internal Thread object associated with this
        // exposed object, now is our first opportunity to create one.
        Thread *unstarted = SetupUnstartedThread();
        pThis->SetInternal(unstarted);
        unstarted->SetExposedObject(pThis);
    }
    ...
}
FCIMPLEND

The SetupUnstartedThread() function is defined in .../clr/src/vm/threads.cpp. Notice that it creates a new CLI Thread object, then adds it to the ThreadStore object:

//-------------------------------------------------------------------------
// Public function: SetupUnstartedThread()
// This sets up a Thread object for an exposed System.Thread that
// has not been started yet. This allows us to properly enumerate all
// threads in the ThreadStore, so we can report on even unstarted threads.
// Clearly there is no physical thread to match, yet.
//
// When there is, complete the setup with Thread::HasStarted()
//-------------------------------------------------------------------------
Thread* SetupUnstartedThread()
{
    _ASSERTE(ThreadInited());
    Thread* pThread = new Thread();

    if (pThread) {
        FastInterlockOr((ULONG *) &pThread->m_State,
                        (Thread::TS_Unstarted | Thread::TS_WeOwn));
        ThreadStore::AddThread(pThread);
    }
    return pThread;
}
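The "create unstarted, register, start later" pattern can be sketched in plain C as follows. This is a single-threaded illustration with hypothetical names (demo_thread, demo_thread_store, and so on); the real ThreadStore uses interlocked operations and locks, and its Thread objects carry far more state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical thread states for this sketch only. */
enum demo_state { DEMO_UNSTARTED = 1, DEMO_RUNNING = 2 };

struct demo_thread {
    enum demo_state     state;
    struct demo_thread *next;       /* link in the store's list */
};

/* The store lets the runtime enumerate every logical thread,
 * including those that have no physical thread yet. */
struct demo_thread_store {
    struct demo_thread *head;
    int                 unstarted_count;
};

/* Create a logical thread in the unstarted state and register it.
 * Returns NULL if the allocation fails. */
struct demo_thread *demo_setup_unstarted(struct demo_thread_store *store)
{
    assert(store != NULL);
    struct demo_thread *t = malloc(sizeof *t);
    if (t != NULL) {
        t->state = DEMO_UNSTARTED;
        t->next = store->head;
        store->head = t;
        store->unstarted_count++;
    }
    return t;
}

/* Transition a registered thread to running. A thread cannot be
 * restarted, so a second call returns -1. */
int demo_mark_started(struct demo_thread_store *store, struct demo_thread *t)
{
    if (t->state != DEMO_UNSTARTED)
        return -1;
    t->state = DEMO_RUNNING;
    store->unstarted_count--;
    return 0;
}

/* Release every thread record held by the store. */
void demo_store_free(struct demo_thread_store *store)
{
    while (store->head != NULL) {
        struct demo_thread *t = store->head;
        store->head = t->next;
        free(t);
    }
    store->unstarted_count = 0;
}
```

Separating registration from starting is what allows the runtime to report on threads that the application has constructed but not yet started.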

The Thread constructor can be found in the .../clr/src/vm/threads.cpp file. Here you can see that a number of CLI-specific fields are initialized to define the CLI thread (in contrast with the platform thread).

Thread::Thread()
{
    m_pFrame = FRAME_TOP;
    m_pUnloadBoundaryFrame = NULL;

    m_fPreemptiveGCDisabled = 0;
    ...
    m_dwLockCount = 0;
    // Initialize lock state
    ...
    m_alloc_context.init();

    m_UserInterrupt = 0;
    m_SafeEvent = m_SuspendEvent = INVALID_HANDLE_VALUE;
    m_EventWait = INVALID_HANDLE_VALUE;
    m_WaitEventLink.m_Next = NULL;
    m_WaitEventLink.m_LinkSB.m_pNext = NULL;
    m_ThreadHandle = INVALID_HANDLE_VALUE;
    m_ThreadHandleForClose = INVALID_HANDLE_VALUE;
    m_ThreadId = 0;
    m_Priority = INVALID_THREAD_PRIORITY;
    m_ExternalRefCount = 1;
    m_State = TS_Unstarted;
    m_StateNC = TSNC_Unknown;
    // It can't be a LongWeakHandle because we zero stuff out
    // of the exposed object as it is finalized. At that point,
    // calls to GetCurrentThread() had better get a new one,!
    m_ExposedObject = CreateGlobalShortWeakHandle(NULL);
    m_StrongHndToExposedObject = CreateGlobalStrongHandle(NULL);

    m_LastThrownObjectHandle = NULL;

    // Zeros out both filter CONTEXT* and the extra state flags.
    m_debuggerWord1 = NULL;
    m_debuggerCantStop = 0;
    ...
    m_PreventAsync = 0;
    m_pDomain = NULL;
    m_Context = NULL;
    m_TraceCallCount = 0;
    m_ThrewControlForThread = 0;
    m_OSContext = NULL;
    m_ThreadTasks = (ThreadTasks)0;

    Thread *pThread = GetThread();
    _ASSERTE(SystemDomain::System()->DefaultDomain()->GetDefaultContext());
    InitContext();
    _ASSERTE(m_Context);
    if (pThread) {
        _ASSERTE(pThread->GetDomain() &&
                 pThread->GetDomain()->GetDefaultContext());
        // Start off the new thread in the default context of
        // the creating thread's appDomain. This could be changed by
        // SetDelegate
        SetKickOffDomain(pThread->GetDomain());
    } else
        SetKickOffDomain(SystemDomain::System()->DefaultDomain());
    // The state and the tasks must be 32-bit aligned for atomicity to
    // be guaranteed.
    _ASSERTE((((size_t) &m_State) & 3) == 0);
    _ASSERTE((((size_t) &m_ThreadTasks) & 3) == 0);

    m_dNumAccessOverrides = 0;
    // Track perf counter for the logical thread object.
    COUNTER_ONLY(GetPrivatePerfCounters().m_LocksAndThreads.
                 cCurrentThreadsLogical++);
    COUNTER_ONLY(GetGlobalPerfCounters().m_LocksAndThreads.
                 cCurrentThreadsLogical++);

    ...
    m_pSharedStaticData = NULL;
    m_pUnsharedStaticData = NULL;
    m_pStaticDataList = NULL;
    m_pDLSHash = NULL;
    m_pCtx = NULL;

    m_fSecurityStackwalk = FALSE;
    m_compressedStack = NULL;
    m_fPLSOptimizationState = TRUE;

    m_pFusionAssembly = NULL;
    m_pAssembly = NULL;
    m_pModuleToken = mdFileNil;
    ...
}

GetThread() is a platform-specific function call (see .../clr/src/vm/thread.cpp).

After the thread has been created, it is started by the call to the System.Threading Start() function (defined in .../clr/src/bcl/system/threading/thread.cs):

public void Start()
{
    // Attach current thread's security principal object to the
    // new thread. Be careful not to bind the current thread to
    // a principal if it's not already bound.
    IPrincipal principal = (IPrincipal) CallContext.SecurityData.Principal;
    StackCrawlMark stackMark = StackCrawlMark.LookForMyCaller;
    StartInternal(principal, ref stackMark);
}

This code finally calls StartInternal(), another fcall, which is also defined in .../clr/src/vm/comsynchronizable.cpp:

FCIMPL3(void, ThreadNative::Start, ThreadBaseObject* pThisUNSAFE, ...)
{
    THROWSCOMPLUSEXCEPTION();

    HELPER_METHOD_FRAME_BEGIN_NOPOLL();

    StartInner(pThisUNSAFE, pPrincipalUNSAFE, pStackMark);

    HELPER_METHOD_FRAME_END_POLL();
}
FCIMPLEND

The essential call here (in tracing the creation of the thread) is the one to StartInner(), which is defined in the comsynchronizable.cpp file. This code makes the thread runnable:

// Start up a thread, which by now should be in the ThreadStore's
// Unstarted list.
void ThreadNative::StartInner(ThreadBaseObject* pThisUNSAFE,
                              Object* pPrincipalUNSAFE,
                              StackCrawlMark* pStackMark)
{
    ...
    Thread *pCurThread = GetThread();
    Thread *pNewThread = gc.pThis->GetInternal();

    ...
    // Is the thread already started? You can't restart a thread.
    ...
    // Carry over the state used by security to the new thread
    ...
    // Generate code-access security stack to carry over to thread.

    ...
    // As soon as we create the new thread, it is eligible for
    // suspension, etc.
    // So it gets transitioned to cooperative mode before this call
    // returns to us. It is our duty to start it running immediately,
    // so that GC isn't blocked.
    h = pNewThread->CreateNewThread(0 /*stackSize override*/,
                                    KickOffThread, share, &newThreadId);
    ...
    // After we have established the thread handle, we can check
    // m_Priority. This ordering is required to eliminate the race
    // condition on setting the priority of a thread just as it starts
    // up.
    ::SetThreadPriority(h, MapToNTPriority(gc.pThis->m_Priority));
    // Before we do the resume, we need to take note of the new
    // ThreadId. This is necessary because -- before the thread starts
    // executing at KickofThread -- it may perform some DllMain
    // DLL_THREAD_ATTACH notifications. These could call into managed
    // code. During the consequent SetupThread, we need to perform
    // the Thread::HasStarted call instead of going through the normal
    // 'new thread' pathway.
    ...
    pNewThread->SetThreadId(newThreadId);
    share = NULL; // we have handed off ownership of the shared struct
    ...
    ::ResumeThread(h);
    ...
}

The PAL thread store is the collection of PAL threads (contrasted with both CLI threads and OS threads). Each CLI thread has an associated Thread object, and appears in the system-wide ThreadStore object. There is not necessarily a one-to-one correspondence between CLI threads and PAL threads. The CLI ThreadpoolMgr object (see .../clr/src/vm/win32threadpool.h for the class specification) manages the binding between a PAL thread and one or more CLI threads. One of the data structures that you see in these code fragments is the ThreadPool.

The host platform thread support provides the idea of thread local storage (TLS), meaning a logical block of storage that is within the encapsulating OS process’s address space, yet which is private to a particular thread. CLI and PAL threads depend on TLS, so the PAL provides a collection of TLS functions in .../pal/unix/thread/localstorage.c. These functions are generally documented in MSDN OS documentation, and are reimplemented for the UNIX ports so that they provide a uniform TLS facility to the CLI threads.
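One plausible way to present a Win32-style TLS interface (allocate a slot index, then set and get a per-thread value by index) on a POSIX host is to back each slot with a Pthread key. The sketch below illustrates that idea only; the demo_tls names and the fixed slot count are assumptions for the example, and it is not the code in localstorage.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_TLS_SLOTS   8      /* arbitrary limit chosen for this sketch */
#define DEMO_TLS_INVALID (-1)

static pthread_key_t   demo_keys[DEMO_TLS_SLOTS];
static int             demo_key_used[DEMO_TLS_SLOTS];
static pthread_mutex_t demo_tls_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate a slot index (in the spirit of TlsAlloc).
 * Returns DEMO_TLS_INVALID when no slot or key is available. */
int demo_tls_alloc(void)
{
    int slot = DEMO_TLS_INVALID;

    pthread_mutex_lock(&demo_tls_lock);
    for (int i = 0; i < DEMO_TLS_SLOTS; ++i) {
        if (!demo_key_used[i]) {
            if (pthread_key_create(&demo_keys[i], NULL) == 0) {
                demo_key_used[i] = 1;
                slot = i;
            }
            break;              /* first free slot decides the outcome */
        }
    }
    pthread_mutex_unlock(&demo_tls_lock);
    assert(slot < DEMO_TLS_SLOTS);
    return slot;
}

/* Simplification: callers must not race set/get against demo_tls_free(). */
static int demo_tls_valid(int slot)
{
    return slot >= 0 && slot < DEMO_TLS_SLOTS && demo_key_used[slot];
}

/* Store a value that only the calling thread will see. */
int demo_tls_set(int slot, void *value)
{
    if (!demo_tls_valid(slot))
        return -1;
    return pthread_setspecific(demo_keys[slot], value) == 0 ? 0 : -1;
}

/* Fetch the calling thread's value; NULL if never set or slot invalid. */
void *demo_tls_get(int slot)
{
    if (!demo_tls_valid(slot))
        return NULL;
    return pthread_getspecific(demo_keys[slot]);
}

/* Release a slot so that it can be allocated again. */
int demo_tls_free(int slot)
{
    int rc = -1;

    pthread_mutex_lock(&demo_tls_lock);
    if (demo_tls_valid(slot) && pthread_key_delete(demo_keys[slot]) == 0) {
        demo_key_used[slot] = 0;
        rc = 0;
    }
    pthread_mutex_unlock(&demo_tls_lock);
    return rc;
}
```

The small integer index is what the caller keeps, which mirrors how Win32 code holds a TLS index rather than a platform-specific key object.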




10.2.2 Synchronization

The CLI code uses Win32 API synchronization functions (see the .../pal/unix/sync/ directory). The PAL implements these synchronization functions well enough to support the CLI, with no assurance that the implementations are robust. These primitives are intended to operate on CLI threads rather than on the host (POSIX) threads. This means that the PAL cannot simply implement the synchronization mechanism directly in terms of POSIX mechanisms; instead, the PAL must provide both its own queues to hold CLI threads that are blocked on a synchronization event and the algorithms to manage competition for those events. The CLI threads use critical sections, events, mutexes, semaphores, and wait primitives.

Some parts of the PAL synchronization implementation do use the POSIX thread synchronization mechanism. For example, the SuspendThread() function (implemented in .../pal/unix/thread/thread.c) uses the POSIX functions pthread_mutex_lock() and pthread_mutex_unlock(), along with the nonstandard pthread_suspend_np() and pthread_resume_np():

/*++
Function:
  SuspendThread

See MSDN doc.
--*/
DWORD
PALAPI
SuspendThread(
    IN HANDLE hThread)
{
    ...
    /* First, we need to retrieve the thread storage for the passed in
       thread handle */
    ...
    /* Next we need to get the thread storage for the current thread. */
    ...
    /* Finally, check the validity of the objects retrieved */
    ...

    if(lpThread == lpCurrentThread)
    {
        /* suspending the current thread */
        ...
        if (lpThread->bCreateSuspended)
        {
            [This code uses the PAL’s implementation of synchronization
             rather than the pthread primitives]
            ...
        }
        ...
        /* Block here */
        while(1)
        {
            pollRet = poll(&fds, 1, INFTIM);

            if(-1 != pollRet)
            {
                break;
            }
            if(EINTR != errno) {...}
            ...
        }
        ...
        [Here is some code that uses pthread primitives]
        /* Check out current suspension count. If we are 0, then we need
           to actually suspend the thread... */
        if (lpThread->dwSuspendCount == 0)
        {
            TRACE ("Suspending thread handle %p with pthread_suspend_np\n",
                   hThread);
#if HAVE_PTHREAD_NP
            /* pthread_suspend_np returns EINVAL for invalid thread
               parameter */
            if ( pthread_suspend_np( (pthread_t)lpThread->dwThreadId)
                 == EINVAL)
#elif HAVE_MACH_THREADS
            if ( thread_suspend(pthread_mach_thread_np((pthread_t)
                 lpThread->dwThreadId)) == KERN_INVALID_ARGUMENT )
#elif HAVE_SOLARIS_THREADS
            if (thr_suspend((thread_t) lpThread->dwThreadId) != 0)
#endif

#if HAVE_PTHREAD_NP || HAVE_MACH_THREADS || HAVE_SOLARIS_THREADS
            {
                ERROR("Unable to suspend the thread\n");
                SetLastError(ERROR_INVALID_HANDLE);
                retval = -1;
                /* release the spinlock */
                lpThread->suspend_spinlock = 0;
                goto SuspendThreadExit;
            }
            if(0 != lpThread->critsec_count)
            {
                /* eep, suspended thread was holding some internal critical
                   sections. we have to let it release them all */
                /* make sure the thread will wait for us to suspend it
                   again */
                pthread_mutex_lock(&lpThread->suspension_mutex);
                pthread_mutex_lock(&lpThread->cond_mutex);
                /* indicate our intention to suspend this thread */
                lpThread->suspend_intent = TRUE;

                /* let the thread run until it releases all its critical
                   sections */
#if HAVE_PTHREAD_NP
                pthread_resume_np((pthread_t)lpThread->dwThreadId);
#elif HAVE_MACH_THREADS
                thread_resume(pthread_mach_thread_np((pthread_t)
                              lpThread->dwThreadId));
#elif HAVE_SOLARIS_THREADS
                thr_continue((thread_t) lpThread->dwThreadId);
#endif
                /* wait for the thread to tell us it has released all its
                   critical sections */
                /* note : this will release the mutex until the condition is
                   signalled, but that's fine, since no one can try to take
                   it in the interval */
                pthread_cond_wait(&lpThread->suspender_cond,
                                  &lpThread->cond_mutex);
                /* all critical sections are released, target is now blocked
                   waiting for the suspension mutex (or about to). it is now
                   safe to suspend it */
#if HAVE_PTHREAD_NP
                pthread_suspend_np((pthread_t)lpThread->dwThreadId);
#elif HAVE_MACH_THREADS
                thread_suspend(pthread_mach_thread_np((pthread_t)
                               lpThread->dwThreadId));
#elif HAVE_SOLARIS_THREADS
                thr_suspend((thread_t) lpThread->dwThreadId);
#endif
                lpThread->suspend_intent = FALSE;
                /* allow target thread to acquire the mutex and continue
                   when it gets resumed */
                pthread_mutex_unlock(&lpThread->cond_mutex);
                pthread_mutex_unlock(&lpThread->suspension_mutex);
            }
#else // HAVE_PTHREAD_NP || HAVE_MACH_THREADS || HAVE_SOLARIS_THREADS
#error "Don't know how to suspend threads on this platform!"
#endif
            ...
        }
        else {...}
        ...
    }

SuspendThreadExit:
    ...
    return (retval);
}



This code fragment demonstrates some of the difficulties in implementing the PAL. There are many conditional compilation blocks in the code, depending on the capability of the underlying platform. Some of these are somewhat surprising, such as the HAVE_SOLARIS_THREADS symbol. Essentially, the implementation presumes the Pthread package for synchronization (which is also the POSIX thread API); the Rotor Version 1 release was written to operate on top of FreeBSD Version 4.5, which does not (transparently) support kernel threads. Thus, the Pthreads and synchronization were all implemented in terms of user space functions. The pthread_suspend_np() and pthread_resume_np() functions are nonstandard Pthread extensions, and are not implemented in various Pthread packages. Therefore, the HAVE_PTHREAD_NP symbol is used to distinguish between cases where the platform supports these calls and where it does not. But see the Lab Exercise at the end of the chapter.

We finish with an interesting piece of code that implements shared memory (see .../pal/unix/shmemory/shmemory.c). There is an extraordinarily great comment at the beginning of this file that is presented here verbatim. The “I” in the comment is Barry Bond, who implemented most of the Rotor PAL. If only all of the code were this well commented:


/*++

Copyright (c) 2002 Microsoft Corporation. All rights reserved.

The use and distribution terms for this software are contained in the file
named license.txt, which can be found in the root of this distribution.
By using this software in any fashion, you are agreeing to be bound by the
terms of this license.

You must not remove this notice, or any other, from this software.

Module Name:

shmemory/shmemory.c

Abstract:

Implementation of shared memory infrastructure for IPC

Issues :

Interprocess synchronization

There doesn't seem to be ANY synchronization mechanism that will work
inter-process AND be pthread-safe. FreeBSD's pthread implementation has no
support for inter-process synchronization (PTHREAD_PROCESS_SHARED);
"traditionnal" inter-process syncronization functions, on the other hand, are
not pthread-aware, and thus will block entire processes instead of only the
calling thread.



From suggestions and information obtained on the freebsd-hackers mailing list, I have come up with 2 possible strategies to ensure serialized access to our shared memory region

Note that the estimates of relative efficiency are wild guesses; my assumptions are that blocking entire processes is least efficient, busy wait somewhat better, and anything that does neither is preferable. However, the overhead of complex solutions is likely to have an important impact on performance

Option 1 : very simple; possibly less efficient. in 2 words : "busy wait"
Basically,

while(InterlockedCompareExchange(spinlock_in_shared_memory, 1, 0)
    sched_yield();

In other words, if a value is 0, set it to 1; otherwise, try again until we
succeed. use shed_yield to give the system a chance to schedule other threads
while we wait. (once a thread succeeds at this, it does its work, then sets
the value back to 0)
One inconvenient : threads will not unblock in the order they are blocked;
once a thread releases the mutex, whichever waiting thread is scheduled next
will be unblocked. This is what is called the "thundering herd" problem, and
in extreme cases, can lead to starvation
Update : we'll set the spinlock to our PID instead of 1, that way we can find
out if the lock is held by a dead process.

Option 2 : possibly more efficient, much more complex, borders on
"over-engineered". I'll explain it in stages, in the same way I deduced it.

Option 2.1 : probably less efficient, reasonably simple. stop at step 2)

1) The minimal, original idea was to use SysV semaphores for synchronization.
This didn't work, because semaphores block the entire process, which can easily
lead to deadlocks (thread 1 takes sem, thread 2 tries to take sem, blocks
process, thread 1 is blocked and never releases sem)

2) (this is option 2.1) Protect the use of the semaphores in critical sections.
Enter the critical section before taking the semaphore, leave the section
after releasing the semaphore. This ensures that 2 threads of the same process
will never try to acquire the semaphore at the same time, which avoids
deadlocks.
However, the entire process still blocks if another process has the semaphore.
Here, unblocking order should match blocking order (assuming the semaphores
work properly); therefore, no risk of starvation.

3) This is where it gets complicated. To avoid blocking whole processes, we
can't use semaphores. One suggestion I got was to use multi-ended FIFOs,
here's how it would work.

-as in option 1, use InterlockedCompareExchange on a value in shared memory.
-if this was not succesful (someone else has locked the shared memory), then :



    -open a special FIFO for reading; try to read 1 byte. This will block
     until someone writes to it, and *should* only block the current thread.
     (note : more than one thread/process can open the same FIFO and block on
     read(), in this case, only one gets woken up when someone writes to it.
     *which* one is, again, not predictable; this may lead to starvation)
    -once we are unblocked, we have the lock.
-once we have the lock (either from Interlocked...() or from read()), we can
 do our work
-once the work is done, we open the FIFO for writing. this will fail if no one
 is listening.
-if no one is listening, release the lock by setting the shared memory value
 back to 0
-if someone is listening, write 1 byte to the FIFO to wake someone, then close
 the FIFO. the value in shared memory will remain nonzero until a thread tries
 to wake the next one and sees no one is listening.

problem with this option : it is possible for a thread to call Interlocked...()
BETWEEN the failed "open for write" attempt and the subsequent restoration of
the SHM value back to zero. In this case, that thread will go to sleep and
will not wake up until *another* thread asks for the lock, takes it and
releases it.

so to fix that, we come to step

4) Instead of using InterlockedCompareExchange, use a SysV semaphore :
-when taking the lock :
    -take the semaphore
    -try to take the lock (check if value is zero, change it to 1 if it is)
    -if we fail : open FIFO for reading, release the semaphore, read() and
     block
    -if we succeed : release the semaphore
-when releasing the lock :
    -take the semaphore
    -open FIFO for write
    -if we succeed, release semaphore, then write value
    -if we fail, reset SHM value to 0, then release semaphore.

Yes, using a SysV semaphore will block the whole process, but for a very short
time (unlike option 2.1)
problem with this : again, we get deadlocks if 2 threads from a single process
try to take the semaphore. So like in option 2.1, we ave to wrap the semaphore
usage in a critical section. (complex enough yet?)

so the locking sequence becomes SYNCEnterCriticalSection - take semaphore -
try to lock - open FIFO - release semaphore - SYNCLeaveCriticalSection - read
and the unlocking sequence becomes EnterCS - take sem - open FIFO - release
sem - LeaveCS - write

Once again, the unblocking order probably won't match the blocking order.
This could be fixed by using multiple FIFOs : waiting thread open their own
personal FIFO, write the ID of their FIFO to another FIFO. The thread that
wants to release the lock reads ID from that FIFO, determines which FIFO to
open for writing and writes a byte to it. This way, whoever wrote its ID to
the FIFO first will be first to awake. How's that for complexity?

So to summarize, the options are
1 - busy wait
2.1 - semaphores + critical sections (whole process blocks)
2 - semaphores + critical sections + FIFOs (minimal process blocking)
2.2 - option 2 with multiple FIFOs (minimal process blocking, order preserved)

Considering the overhead involved in options 2 & 2.2, it is our guess that
option 1 may in fact be more efficient, and this is how we'll implement it for
the moment. Note that other platforms may not present the same difficulties
(i.e. other pthread implementations may support inter-process mutexes), and
may be able to use a simpler, more efficient approach.

B] Reliability.
It is important for the shared memory implementation to be as foolproof as
possible. Since more than one process will be able to modify the shared data,
it becomes possible for one unstable process to destabilize the others. The
simplest example is a process that dies while modifying shared memory : if it
doesn't release its lock, we're in trouble. (this case will be taken care of
by using PIDs in the spinlock; this we we can check if the locking process is
still alive).

--*/
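Option 1 from the comment, a busy-wait lock that stores the holder's PID, can be sketched with C11 atomics standing in for InterlockedCompareExchange. This is an illustration under stated assumptions, not the PAL's code: the demo_spin names are hypothetical, the lock word is assumed to live in memory shared between processes, and atomic_int is assumed to be lock-free on the host (which is what makes it usable across processes).

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* 0 means free; any other value is the PID of the holding process. */
typedef atomic_int demo_spinlock;

/* One attempt: succeeds (returns nonzero) only if the lock was free. */
int demo_spin_trylock(demo_spinlock *lock)
{
    int expected = 0;

    assert(lock != NULL);
    return atomic_compare_exchange_strong(lock, &expected, (int)getpid());
}

/* Busy wait, yielding the processor between attempts. */
void demo_spin_lock(demo_spinlock *lock)
{
    while (!demo_spin_trylock(lock))
        sched_yield();
}

void demo_spin_unlock(demo_spinlock *lock)
{
    atomic_store(lock, 0);
}

/* PID of the current holder, or 0 if the lock is free. */
int demo_spin_holder(demo_spinlock *lock)
{
    return atomic_load(lock);
}

/* Reliability check from the comment: is the lock held by a process
 * that no longer exists? kill() with signal 0 only probes for existence. */
int demo_spin_holder_is_dead(demo_spinlock *lock)
{
    int holder = atomic_load(lock);

    if (holder == 0)
        return 0;
    return kill((pid_t)holder, 0) == -1 && errno == ESRCH;
}
```

Note that the lock is not recursive, and that waiters are released in whatever order the scheduler happens to run them, which is the fairness weakness the comment itself points out.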



10.3 LAB EXERCISE: Porting the CLI to FreeBSD Kernel Threads

Rotor 1.0 on top of FreeBSD 4.5 relies on a user space Pthread implementation, although FreeBSD Version 5 apparently has kernel thread support. The 4.5 kernel provides incremental support for kernel threads with the rfork() system call (see the man page for rfork()). The idea is that there is an option to create a child process that uses the same address space as the parent, which is the essential semantics of a thread. Specifically, rfork_thread() uses rfork(RFMEM), along with some other features, to create a thread environment.

These kernel threads use the normal shared memory semaphores (see the semop(), semget(), and semctl() man pages on your FreeBSD system) for synchronization. The PAL code uses the Pthread API for all its thread support calls. In this exercise, you will reimplement the subset of the Pthread library used by the PAL so that you can rebuild the Rotor distribution to use your kernel thread implementation rather than the user space Pthread library.

This exercise is comprehensive, and it may be excessively time-consuming to complete the whole thing; your instructor may choose completion criteria for a subset of the work.

10.3.1 BACKGROUND

This is a UNIX programming assignment. In principle, you need little background information, other than that included in the body of the chapter, to complete this exercise. If you are not well-versed in UNIX programming, you may find it helpful to consult one of the many advanced UNIX programming books such as [Stevens, 1992].

10.3.2 ATTACKING THE PROBLEM

As an early step in your attack, you will need to determine which Pthread calls are used by the PAL. There are many ways to determine this, ranging from searching the Rotor source code, to removing the reference to the Pthread library and recompiling Rotor.

Once you have determined which functions are called by Rotor, I recommend that you construct a program with stubs for all the functions used by Rotor. You can then incrementally define the behavior of each of these functions using rfork_thread() and the semaphore system calls. In some cases, your implementation will be a simple wrapper, but in others you may have to implement some of the required functionality in your code. For example, here is a skeleton of a few of the stubbed out functions:



/* Translates pthread calls used in the CLI into FreeBSD
 * kernel thread functions. These functions do NOT
 * provide full functionality -- only that needed by
 * the Rotor CLI PAL.
 */

#include "pal/palinternal.h"
#include "pal/handle.h"
#include "pal/file.h"
#include "pal/init.h"
#include "pal/critsect.h"
#include "pal/thread.h"
#include "pal/dbgmsg.h"
#include "pal/virtual.h"
#include "pal/mutex.h"
#include "pal/utils.h"
...

/* global vars (but private to this file) */
static pthread_t *threadList = NULL;   /* List of thread descriptors */
int pmutex;                            /* mutex for pthread functions */
...

/**************************************************/
/************* kthread functions ******************/
/**************************************************/

/** Basic thread management **/
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    ...
    return 0;
}

void pthread_exit(void *value_ptr)
{
    DPR("kthread.c: pthread_exit()\n");
    exit(1);    /* This thread just disappears */
}

pthread_t pthread_self(void)
{
    ...
    return NULL;
}



/*** Mutex operations ****/
int pthread_mutex_init(pthread_mutex_t *mutex,
                       const pthread_mutexattr_t *attr)
{
    ...
    return 0;   /* Successful return */
}
...

/*** Condition variable operations ****/
int pthread_cond_init(pthread_cond_t *cond,
                      const pthread_condattr_t *attr)
{
    ...
    return 0;   /* Successful return */
}
...

/*** rwlock ****/
int pthread_rwlock_init(pthread_rwlock_t *lock,
                        const pthread_rwlockattr_t *attr)
{
    ...
    return 0;   /* Successful return */
}
...
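One way to fill in a stub such as pthread_self() follows from the fact that every rfork-based thread is a separate process with its own process ID. The sketch below keeps a small table of thread descriptors and finds the caller's descriptor by matching getpid(). The struct kthread layout, the fixed-size kt_table, and the kt_* function names are all hypothetical; the skeleton's threadList could play the same role. In a real implementation the table is shared by all threads, so each access would have to be protected by a lock such as the skeleton's pmutex.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define KT_MAX 64                 /* hypothetical limit on live threads */

/* Hypothetical thread descriptor: one per rfork-based thread. */
struct kthread {
    pid_t pid;                    /* process ID of the kernel thread */
    int in_use;                   /* nonzero while the slot is taken */
};

static struct kthread kt_table[KT_MAX];

/* Record a new thread; returns its descriptor, or NULL if the table
 * is full. Unsynchronized: callers must hold a lock. */
struct kthread *kt_register(pid_t pid)
{
    int i;

    for (i = 0; i < KT_MAX; i++) {
        if (!kt_table[i].in_use) {
            kt_table[i].pid = pid;
            kt_table[i].in_use = 1;
            return &kt_table[i];
        }
    }
    return NULL;
}

/* Find the caller's descriptor by its process ID; NULL if unknown. */
struct kthread *kt_self(void)
{
    pid_t me = getpid();
    int i;

    for (i = 0; i < KT_MAX; i++) {
        if (kt_table[i].in_use && kt_table[i].pid == me)
            return &kt_table[i];
    }
    return NULL;
}

/* Release a descriptor when its thread terminates. */
void kt_unregister(struct kthread *t)
{
    assert(t == NULL || (t >= kt_table && t < kt_table + KT_MAX));
    if (t != NULL)
        t->in_use = 0;
}
```

With this arrangement, a thread creation function would call kt_register() with the process ID returned by rfork_thread(), and a thread exit function would call kt_unregister() on the result of kt_self() before terminating.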

It will be difficult to use the Rotor code to test your functions as you develop them. You should consider writing a simple driver program that tests each of your functions for minimal behavior. Once you have determined that these functions are basically correct, you will need to finish debugging them with the Rotor code. This code fragment will give you an idea of how to construct a test/debug infrastructure:

/* Driver program to test kernel thread implementation of pthread
 * API subset (the subset needed by the SSCLI)
 */

...
#include <sched.h>      /* sched_get/setscheduler() */
#include <stdio.h>
#include <errno.h>      /* Need to include in kthread.h */
#include "pal/kthread.h"

#define N_MAX 4

/* Globals to simplify testing */
void *tWork(void *me);
static int N;
static pthread_mutex_t myMutex1;
static pthread_mutex_t myMutex2;
static pthread_cond_t myCond;
static pthread_rwlock_t myLock;
...




int main () {
    int i;
    int status, childTerm;
    int myStackSize;
    pthread_t pid[N_MAX];
    pthread_attr_t attr;
    pthread_key_t myKey;
    void **result;

    N = N_MAX;
    ...
/* Test
 *   pthread_mutex_init(pthread_mutex_t *, const pthread_mutexattr_t *);
 */
/* Setup two mutexes for the threads to use */
    myMutex1 = (pthread_mutex_t) malloc(sizeof(struct pthread_mutex));
    if(pthread_mutex_init(&myMutex1, NULL) != 0) {
        /* mutex initialization failed */
        printf("ParentApp: mutex initialization failed\n");
    }
    printf("ParentApp: created mutex id = %d\n", myMutex1->id);
    myMutex2 = (pthread_mutex_t) malloc(sizeof(struct pthread_mutex));
    if(pthread_mutex_init(&myMutex2, NULL) != 0) {
        /* mutex initialization failed */
        printf("ParentApp: mutex initialization failed\n");
    }
    printf("ParentApp: created mutex id = %d\n", myMutex2->id);

/* Test
 *   pthread_cond_init(pthread_cond_t *, const pthread_condattr_t *);
 */
/* Setup a cond for the threads to use */
    myCond = (pthread_cond_t) malloc(sizeof(struct pthread_cond));
    if(pthread_cond_init(&myCond, NULL) != 0) {
        /* cond initialization failed */
        printf("ParentApp: cond initialization failed\n");
    }
    printf("ParentApp: created cond\n");

/* Test
 *   pthread_rwlock_init(pthread_rwlock_t *, const pthread_rwlockattr_t *);
 */
/* Setup a rwlock for the threads to use */
    myLock = (pthread_rwlock_t) malloc(sizeof(struct pthread_rwlock));
    if(pthread_rwlock_init(&myLock, NULL) != 0) {
        /* lock initialization failed */
        printf("ParentApp: rwlock initialization failed\n");
        exit(-1);
    }
    printf("ParentApp: created rwlock id = %d\n", myLock->id);



/* Test
 *   pthread_create(pthread_t *, const pthread_attr_t *,
 *                  void *(*)(void *), void *)
 */
    for(i=0; i < N; i++) {
        printf("ParentApp: Ready to create child %d\n", i);
        attr->stackaddr_attr = (void *) malloc(myStackSize);
        pid[i] = (struct pthread *) malloc(sizeof(struct pthread));
        if(pthread_create(&(pid[i]), &attr, tWork, (void *) i) != 0) {
            /* Error condition */
            printf("ParentApp: pthread_create() failed\n");
            N = i+1;
            break;
        }
        printf("ParentApp: Created child with pid = %d, stack addr = %x\n",
               pid[i]->id, attr->stackaddr_attr);
        usleep(500);    /* Yield */
    }
    ...
}

void *tWork(void *meNum) {
    int i, j;
    pthread_t me;
    int result;
    int policy;
    struct sched_param param;

/* Test
 *   pthread_self()
 */
/* Check pthread_self() and shared variables */
    me = pthread_self();
    shared += 1;
    printf("Thread[pid=%d]: Alive, shared = %d\n", me->id, shared);
    sleep(1);   /* This should be 0.0025 seconds */
    ...

/* Test
 *   pthread_exit(void *)
 */
/* ... then terminate */
    pthread_exit(&me);
}
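The driver above repeats the same call, test, and printf() pattern for every function it exercises. As one possible way to condense that pattern, the following sketch defines a small checking macro that reports the failing call with its file and line and counts the failures. The names CHECK_ZERO, check_failures, require_all_passed(), and run_checks() are hypothetical, and the two helper functions exist only to demonstrate the macro.

```c
#include <assert.h>
#include <stdio.h>

static int check_failures = 0;    /* number of failed checks so far */

/* Evaluate a call that returns 0 on success, as the Pthread functions
 * do; report and count any nonzero result. */
#define CHECK_ZERO(call)                                          \
    do {                                                          \
        int check_rc_ = (call);                                   \
        if (check_rc_ != 0) {                                     \
            fprintf(stderr, "%s:%d: %s returned %d\n",            \
                    __FILE__, __LINE__, #call, check_rc_);        \
            check_failures++;                                     \
        }                                                         \
    } while (0)

/* Abort the driver if any earlier check failed. */
void require_all_passed(void)
{
    assert(check_failures == 0);
}

/* Stand-ins for functions under test. */
static int always_ok(void)    { return 0; }
static int always_fails(void) { return 22; }

/* Run both stand-ins and return the number of failed checks. */
int run_checks(void)
{
    check_failures = 0;
    CHECK_ZERO(always_ok());
    CHECK_ZERO(always_fails());
    return check_failures;
}
```

In a driver, a line such as CHECK_ZERO(pthread_mutex_init(&myMutex1, NULL)) would replace the explicit if statement and its printf() call.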


