Application Domains
Application domains are to the CLR what processes are to an operating system. It may be surprising to note that the CLR can run multiple .NET applications within a single process, without any contention or security difficulties. Because the CLR has complete control over loading and executing programs, and because of the presence of type information, the CLR guarantees that .NET applications cannot read or write each other's memory, even when running in the same process. Because there is less performance overhead in switching between application domains than in switching between processes, this provides a performance gain. This is especially beneficial to web applications running in Internet Information Services (IIS), where scalability is an issue.
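
As a rough illustration, the following sketch creates and unloads a second application domain inside the running process. It assumes the .NET Framework, since later cross-platform .NET runtimes do not support creating additional domains; the domain name "SecondDomain" is arbitrary.

```vbnet
Imports System

Module Program
    Sub Main()
        ' Every program starts in a default application domain.
        Console.WriteLine("Running in: " & AppDomain.CurrentDomain.FriendlyName)

        ' "SecondDomain" is an arbitrary, illustrative name.
        Dim second As AppDomain = AppDomain.CreateDomain("SecondDomain")
        Console.WriteLine("Created: " & second.FriendlyName)

        ' Unloading the domain releases what was loaded into it
        ' without ending the process.
        AppDomain.Unload(second)
    End Sub
End Module
```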

Common Language Specification (CLS)
The CLI defines a runtime that is capable of supporting most, if not all, of the features found in modern programming languages. It is not intended that all languages that target the CLR will support all CLR features. This could cause problems when components written in different languages attempt to interoperate. The CLI therefore defines a subset of features that are considered compatible across language boundaries. This subset is called the Common Language Specification (CLS).

Vendors creating components for use by others need to ensure that all externally visible constructs (e.g., public types, public and protected methods, parameters on public and protected methods, etc.) are CLS-compliant. This ensures that their components will be usable within a broad array of languages, including Visual Basic .NET. Developers authoring components in Visual Basic .NET have an easy job because all Visual Basic .NET code is CLS-compliant (unless the developer explicitly exposes a public or protected type member or method parameter that is of a non-CLS-compliant type). Programming Visual Basic .NET

Because Visual Basic .NET automatically generates CLS-compliant components, this book does not describe the CLS rules. However, to give you a sense of the kind of thing that the CLS specifies, consider that some languages support a feature called operator overloading. This allows the developer to specify actions that should be taken if the standard operator symbols (+, -, *, /, =, etc.) are used on user-defined classes. Because it is not reasonable to expect that all languages should implement such a feature, the CLS has a rule about it. The rule states that if a CLS-compliant component has public types that provide overloaded operators, those types must provide access to that functionality in another way as well (usually by providing a public method that performs the same operation).
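
A minimal sketch of what that rule asks for might look like the following. It assumes a version of Visual Basic that supports the Operator statement (Visual Basic 2005 or later); the Money type and its members are hypothetical names chosen for illustration.

```vbnet
Imports System

' Hypothetical value type used only for illustration.
Public Structure Money
    Public ReadOnly Amount As Decimal

    Public Sub New(ByVal value As Decimal)
        Amount = value
    End Sub

    ' Overloaded operator, convenient in languages that support the syntax.
    Public Shared Operator +(ByVal left As Money, ByVal right As Money) As Money
        Return Money.Add(left, right)
    End Operator

    ' Ordinary public method that exposes the same operation to
    ' languages without operator overloading.
    Public Shared Function Add(ByVal left As Money, ByVal right As Money) As Money
        Return New Money(left.Amount + right.Amount)
    End Function
End Structure

Module Program
    Sub Main()
        Dim a As New Money(2D)
        Dim b As New Money(3D)
        Console.WriteLine((a + b).Amount)         ' operator syntax
        Console.WriteLine(Money.Add(a, b).Amount) ' method syntax, same result
    End Sub
End Module
```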

Intermediate Language (IL) and Just-In-Time (JIT) Compilation
All compilers that target the CLR compile source code to Intermediate Language (IL), also known as Common Intermediate Language (CIL). IL is a machine language that is not tied to any specific machine. Microsoft designed it from scratch to support the CLI's programming concepts. The CLI specifies that all CLR implementations can compile or interpret IL on the machine on which the CLR is running. If the IL is compiled (versus interpreted), compilation can occur at either of two times: 
  • Immediately prior to a method in the application being executed 
  • At deployment time
In the first case, each method is compiled only when it is actually needed. After the method is compiled, subsequent calls bypass the compilation mechanism and call the compiled code directly. The compiled code is not saved to disk, so if the application is stopped and restarted, the compilation must occur again. This is known as just-in-time (JIT) compilation and is the most common scenario.

In the second case, the application is compiled in its entirety at deployment time.
IL is saved to .exe and .dll files. When such a file containing IL is executed, the CLR knows how to invoke the JIT compiler and execute the resulting code.
Note that on the Microsoft Windows platforms, IL is always compiled—never interpreted.

Metadata
Source code consists of some constructs that are procedural in nature and others that are declarative in nature. An example of a procedural construct is:
CODES:
someObject.SomeMember = 5
This is procedural because it compiles into executable code that performs an action at runtime. Namely, it assigns the value 5 to the SomeMember member of the someObject object.

In contrast, here is a declarative construct:
CODES:
Dim someObject As SomeClass
This is declarative because it doesn't perform an action. It states that the symbol someObject is a variable that holds a reference to an object of type SomeClass.

In the past, declarative information typically was used only by the compiler and did not compile directly into the executable. In the CLR, however, declarative information is everything! The CLR uses type and signature information to ensure that memory is always referenced in a safe way. The JIT compiler uses type and signature information to resolve method calls to the appropriate target code at JIT compile time. The only way for this to work is for this declarative information to be included alongside its associated procedural information. Compilers that target the CLR therefore store both procedural and declarative information in the resulting .exe or .dll file. The procedural information is stored as IL, and the declarative information is stored as metadata. Metadata is just the CLI's name for declarative information.

The CLI has a mechanism that allows programmers to include arbitrary metadata in compiled applications. This mechanism is known as custom attributes and is available in Visual Basic .NET. 
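
As a sketch of how this can look in practice, the following defines a custom attribute, applies it to a class, and reads it back through reflection. AuthorAttribute, its Name property, and the value "Pat" are hypothetical and exist only for this example.

```vbnet
Imports System

' Hypothetical custom attribute; the name and its property are illustrative.
<AttributeUsage(AttributeTargets.Class)> _
Public Class AuthorAttribute
    Inherits Attribute

    Private ReadOnly m_name As String

    Public Sub New(ByVal name As String)
        m_name = name
    End Sub

    Public ReadOnly Property Name() As String
        Get
            Return m_name
        End Get
    End Property
End Class

' The attribute is stored as metadata alongside the class's IL.
<Author("Pat")> _
Public Class SomeClass
End Class

Module Program
    Sub Main()
        ' Reflection reads the custom metadata back at runtime.
        Dim attrs() As Object = _
            GetType(SomeClass).GetCustomAttributes(GetType(AuthorAttribute), False)
        For Each attr As AuthorAttribute In attrs
            Console.WriteLine(attr.Name)
        Next
    End Sub
End Module
```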

Memory Management and Garbage Collection
In any object-oriented programming environment, there arises the need to instantiate and destroy objects. Instantiated objects occupy memory. When objects are no longer in use, the memory they occupy should be reclaimed for use by other objects. Recognizing when objects are no longer being used is called lifetime management, which is not a trivial problem. The solution the CLR uses has implications for the design and use of the components you write, so it is worth understanding.

In the COM world, the client of an object notified the object whenever a new object reference was passed to another client. Conversely, when any client of an object was finished with it, the client notified the object of that fact. The object kept track of how many clients had references to it. When that count dropped to zero, the object was free to delete itself (that is, give its memory back to the memory heap). This method of lifetime management is known as reference counting. Visual Basic programmers were not necessarily aware of this mechanism because the Visual Basic compiler automatically generated the low-level code to perform this housekeeping. C++ developers had no such luxury.

Reference counting has some drawbacks:
  • A method call is required every time an object reference is copied from one variable to another and every time an object reference is overwritten.
  • Difficult-to-track bugs can be introduced if the reference-counting rules are not precisely followed.
  • Care must be taken to ensure that circular references are specially treated (because circular references can result in objects that never go away).
The CLR mechanism for lifetime management is quite different. Reference counting is not used. Instead, the memory manager keeps a pointer to the address at which free memory (known as the heap) starts. To satisfy a memory request, it just hands back a copy of the pointer and then increments the pointer by the size of the request, leaving it in a position to satisfy the next memory request. This makes memory allocation very fast. No action is taken at all when an object is no longer being used. As long as the heap doesn't run out, memory is not reclaimed until the application exits. If the heap is large enough to satisfy all memory requests during program execution, this method of memory allocation is as fast as is theoretically possible, because the only overhead is incrementing the heap pointer on memory allocations.
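
The pointer-bumping allocation described above can be modeled with a toy class. This is only a conceptual sketch: BumpAllocator and its members are hypothetical, offsets stand in for addresses, and the real CLR allocator is internal to the runtime.

```vbnet
Imports System

' Toy model of a heap that allocates by advancing a single pointer.
Public Class BumpAllocator
    Private m_next As Integer = 0          ' offset where free memory starts
    Private ReadOnly m_size As Integer     ' total size of the toy heap

    Public Sub New(ByVal size As Integer)
        m_size = size
    End Sub

    ' Returns the offset of the new block, or -1 if the heap is exhausted
    ' (the point at which a real runtime would start a garbage collection).
    Public Function Allocate(ByVal bytes As Integer) As Integer
        If m_next + bytes > m_size Then Return -1
        Dim result As Integer = m_next     ' hand back a copy of the pointer
        m_next += bytes                    ' then advance it past the block
        Return result
    End Function
End Class

Module Program
    Sub Main()
        Dim heap As New BumpAllocator(64)
        Console.WriteLine(heap.Allocate(16))   ' prints 0
        Console.WriteLine(heap.Allocate(32))   ' prints 16
        Console.WriteLine(heap.Allocate(32))   ' prints -1 (only 16 bytes left)
    End Sub
End Module
```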

If the heap runs out of memory, there is more work to do. To satisfy a memory request when the heap is exhausted, the memory manager looks for any previously allocated memory that can be reclaimed. It does this by examining the application variables that hold object references. The objects that these variables reference (and therefore the associated memory) are considered in use because they can be reached through the program's variables. Furthermore, because the runtime has complete access to the application's type information, the memory manager knows whether the objects contain members that reference other objects, and so on. In this way, the memory manager can find all of the memory that is in use. During this process, it consolidates the contents of all this memory into one contiguous block at the start of the heap, leaving the remainder of the heap free to satisfy new memory requests. This process of freeing up memory is known as garbage collection (GC), a term that also applies to this overall method of lifetime management. The portion of the memory manager that performs garbage collection is called the garbage collector.

The benefits of garbage collection are:
  • No overhead is incurred unless the heap becomes exhausted.
  • Applications cannot leak memory by forgetting to free an object; any object that becomes unreachable is eventually reclaimed.
  • The application need not be careful with circular references.
Although the process of garbage collection is expensive (on the order of a fraction of a second when it occurs), Microsoft claims that the total overhead of garbage collection is on average much less than the total overhead of reference counting (as shown by their benchmarks). This, of course, is highly dependent on the exact pattern of object allocation and deallocation that occurs in any given program.
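
The point about circular references can be explored with a small experiment. In this sketch the Node class is hypothetical, and the collection is forced only for demonstration. The printed result depends on the runtime and build configuration, so no particular output is promised.

```vbnet
Imports System

' Hypothetical node type used only to build a reference cycle.
Public Class Node
    Public Other As Node
End Class

Module Program
    Function MakeCycle() As WeakReference
        Dim a As New Node()
        Dim b As New Node()
        a.Other = b
        b.Other = a   ' a and b now reference each other
        ' A WeakReference tracks an object without keeping it alive.
        Return New WeakReference(a)
    End Function

    Sub Main()
        Dim tracker As WeakReference = MakeCycle()

        ' Normal code rarely needs to force a collection.
        GC.Collect()
        GC.WaitForPendingFinalizers()

        ' Reports whether the cycle survived the collection.
        Console.WriteLine("Cycle still alive: " & tracker.IsAlive.ToString())
    End Sub
End Module
```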

Finalize
Many objects require some sort of cleanup (i.e., finalization) when they are destroyed. An example might be a business object that maintains a connection to a database. When the object is no longer in use, its database connection should be released. The .NET Framework provides a way for objects to be notified when they are about to be released, thus permitting them to release nonmemory resources. (Memory resources held by the object can be ignored because they will be handled automatically by the garbage collector.) Here's how it works: the Object class (defined in the System namespace) has a method called Finalize that can be overridden. Its default implementation does nothing. If it is overridden in a derived class, however, the garbage collector automatically calls it on an instance of that class when that instance is about to be reclaimed. Here's an example of overriding the Finalize method:
CODES:
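A minimal sketch of such an override follows. The class name SomeClass is illustrative, and the comment marks where a real class would release whatever it holds.

```vbnet
Imports System

' Hypothetical class; the name is illustrative.
Public Class SomeClass
    Protected Overrides Sub Finalize()
        ' Release nonmanaged resources here.

        ' Then let the base class release its own resources.
        MyBase.Finalize()
    End Sub
End Class

Module Program
    Sub Main()
        ' The garbage collector, not this code, decides when Finalize runs.
        Dim obj As New SomeClass()
        Console.WriteLine(obj.GetType().Name)
    End Sub
End Module
```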
The Finalize method should release any nonmanaged resources that the object has allocated. Nonmanaged resources are any resources other than memory (for example, database connections, file handles, or other OS handles). In contrast, managed resources are object references. As already mentioned, it is not necessary to release managed resources in a Finalize method—the garbage collector will handle it. After releasing resources allocated by the class, the Finalize method must always call the base class's Finalize implementation so that it can release any resources allocated by base-class code. If the class is derived directly from the Object class, technically this could be omitted (because the Object class's Finalize method doesn't do anything). However, calling it doesn't hurt anything, and it's a good habit to get into.

An object's Finalize method should not be called by application code. The Finalize method has special meaning to the CLR and is intended to be called only by the garbage collector. If you're familiar with destructors in C++, you'll recognize the Finalize method as a similar concept. One difference in how they are written is that C++ destructors automatically call their base class destructors, whereas in Visual Basic .NET, the programmer must remember to put in the call to the base class's Finalize method. (Another difference, the timing of the call, is the subject of the next section.) It is interesting to note that C#, another language on the .NET platform, actually has destructors (as C++ does), but they are automatically compiled into Finalize methods that work as described here.

Dispose
The downside of garbage collection and the Finalize method is the loss of deterministic finalization. With reference counting, finalization occurs as soon as the last reference to an object is released (this is deterministic because object finalization is controlled by program flow). In contrast, an object in a garbage-collected system is not destroyed until garbage collection occurs or until the application exits. This is nondeterministic because the program has no control over when it happens. This is a problem because an object that holds scarce resources (such as a database connection) should free those resources as soon as the object is no longer needed. If this is not done, the program may run out of such resources long before it runs out of memory.

Unfortunately, no one has discovered an elegant solution to this problem. Microsoft does have a recommendation, however. Objects that hold nonmanaged resources should implement the IDisposable interface (defined in the System namespace). The IDisposable interface exposes a single method, called Dispose, which takes no parameters and returns no result. Calling it tells the object that it is no longer needed. The object should respond by releasing all the resources it holds, both managed and nonmanaged, and should call the Dispose method on any subordinate objects that also expose the IDisposable interface. In this way, scarce resources are released as soon as they are no longer needed.


This solution requires that the user of an object keep track of when it is done with the object. This is often trivial, but if there are multiple users of an object, it may be difficult to know which user should call Dispose. At the time of this writing, it is simply up to the programmer to work this out. In a sense, the Dispose method is an alternate destructor to address the issue of nondeterministic finalization when nonmanaged resources are involved. However, the CLR itself never calls the Dispose method. It is up to the client of the object to call the Dispose method at the appropriate time, based on the client's knowledge of when it is done using the object. This implies responsibilities for both the class author and client author. The class author must document the presence of the Dispose method so that the client author knows that it's necessary to call it. The client author must make an effort to determine whether any given class has a Dispose method and, if so, to call it at the appropriate time.
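
On the client side, one common way to meet this responsibility is a Try...Finally block, sketched here with the framework's MemoryStream class standing in for any object that implements IDisposable.

```vbnet
Imports System
Imports System.IO

Module Program
    Sub Main()
        Dim stream As New MemoryStream()
        Try
            stream.WriteByte(42)
            Console.WriteLine(stream.Length)
        Finally
            ' The client, not the CLR, calls Dispose when it is done.
            ' Casting to IDisposable works whether or not the class
            ' exposes Dispose as a public method.
            CType(stream, IDisposable).Dispose()
        End Try
    End Sub
End Module
```

The Finally block runs even if the code in the Try block throws an exception. Later versions of Visual Basic (2005 and up) add a Using block that expresses the same pattern more compactly.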


Even when a class exposes the IDisposable interface, it should still override the Finalize method, just in case the client neglects to call the Dispose method. This ensures that nonmanaged resources are eventually released, even if the client forgets to do it. A simple (but incomplete) technique would be to place a call to the object's Dispose method in its Finalize method, like this:
CODES:
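A sketch of this simple technique, using an illustrative class name, might be:

```vbnet
Imports System

' Hypothetical class; the name is illustrative.
Public Class SomeClass
    Implements IDisposable

    Public Sub Dispose() Implements IDisposable.Dispose
        ' Release managed and nonmanaged resources here.
        ' Written so that calling it more than once is harmless.
    End Sub

    Protected Overrides Sub Finalize()
        ' Safety net in case the client never called Dispose.
        Dispose()
        MyBase.Finalize()
    End Sub
End Class

Module Program
    Sub Main()
        Dim obj As New SomeClass()
        obj.Dispose()   ' the well-behaved client calls Dispose itself
    End Sub
End Module
```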
In this way, if the client of the object neglects to call the Dispose method, the object itself will do so when the garbage collector destroys it. Microsoft recommends that the Dispose method be written so it is not an error to call it more than once. This way, even if the client calls it at the correct time, it's OK for it to be called again in the Finalize method.


If the object holds references to other objects that implement the IDisposable interface, the code just shown may cause a problem. This is because the order of object finalization is not guaranteed. Specifically, if the Finalize method is executing, it means that garbage collection is occurring. If the object holds references to other objects, the garbage collector may have already finalized those other objects. If the object attempts to call the Dispose method on an object that has already been finalized, an error may occur. This situation exists only during the call to Finalize; if the client calls the Dispose method, subordinate objects will still be intact. (They can't have been finalized or reclaimed by the garbage collector because they are reachable from the application's code.)

To resolve this race condition, it is necessary to take slightly different action when finalizing than when disposing. Here is the modified code:
CODES:
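One way to sketch the modified pattern is shown below. The helper names DisposeManagedResources and DisposeUnmanagedResources are hypothetical, chosen only to keep the two cleanup paths apart.

```vbnet
Imports System

' Hypothetical class; the class and helper names are illustrative.
Public Class SomeClass
    Implements IDisposable

    ' Called by the client: subordinate objects are still intact,
    ' so both kinds of resources can be released.
    Public Sub Dispose() Implements IDisposable.Dispose
        DisposeManagedResources()
        DisposeUnmanagedResources()
    End Sub

    ' Called by the garbage collector: touch only nonmanaged resources.
    Protected Overrides Sub Finalize()
        DisposeUnmanagedResources()
        MyBase.Finalize()
    End Sub

    Private Sub DisposeManagedResources()
        ' Call Dispose on subordinate objects here.
    End Sub

    Private Sub DisposeUnmanagedResources()
        ' Release nonmanaged resources here (safe to call more than once).
    End Sub
End Class

Module Program
    Sub Main()
        Dim obj As New SomeClass()
        obj.Dispose()
    End Sub
End Module
```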
Here, the Finalize method only releases unmanaged resources. It doesn't worry about calling the Dispose method on any subordinate objects, assuming that if the subordinate objects are also unreachable, they will be reclaimed by the garbage collector and their finalizers (and hence their Dispose methods) will run.


An optimization can be made to the Dispose method. When the Dispose method is called by the client, there is no longer any reason for the Finalize method to be called when the object is destroyed. Keeping track of and calling objects' Finalize methods imposes overhead on the garbage collector. To remove this overhead for an object whose Dispose method has already been called, the Dispose method should call the shared SuppressFinalize method of the GC class, like this:
CODES:
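A sketch of the optimized Dispose method follows. GC.SuppressFinalize is the framework method named above; the class and helper names are hypothetical.

```vbnet
Imports System

' Hypothetical class; the class and helper names are illustrative.
Public Class SomeClass
    Implements IDisposable

    Public Sub Dispose() Implements IDisposable.Dispose
        DisposeManagedResources()
        DisposeUnmanagedResources()
        ' Cleanup is done, so the garbage collector no longer
        ' needs to call Finalize on this instance.
        GC.SuppressFinalize(Me)
    End Sub

    Protected Overrides Sub Finalize()
        DisposeUnmanagedResources()
        MyBase.Finalize()
    End Sub

    Private Sub DisposeManagedResources()
        ' Call Dispose on subordinate objects here.
    End Sub

    Private Sub DisposeUnmanagedResources()
        ' Release nonmanaged resources here (safe to call more than once).
    End Sub
End Class

Module Program
    Sub Main()
        Dim obj As New SomeClass()
        obj.Dispose()
    End Sub
End Module
```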
The type designer must decide what will occur if the client attempts to use an object after calling its Dispose method. If possible, the object should automatically reacquire its resources. If this is not possible, the object should throw an exception.
CODES:
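For the case in which resources cannot be reacquired, a minimal sketch might track disposal with a flag and throw the framework's ObjectDisposedException. The choice of that exception type, the m_disposed field, and the DoWork method are assumptions made for illustration.

```vbnet
Imports System

' Hypothetical class; the field and method names are illustrative.
Public Class SomeClass
    Implements IDisposable

    Private m_disposed As Boolean = False

    Public Sub Dispose() Implements IDisposable.Dispose
        ' Release resources here, then remember that it happened.
        m_disposed = True
        GC.SuppressFinalize(Me)
    End Sub

    Public Sub DoWork()
        If m_disposed Then
            Throw New ObjectDisposedException(Me.GetType().Name)
        End If
        ' Normal work that needs the object's resources goes here.
    End Sub
End Class

Module Program
    Sub Main()
        Dim obj As New SomeClass()
        obj.DoWork()
        obj.Dispose()
        Try
            obj.DoWork()   ' use after Dispose
        Catch ex As ObjectDisposedException
            Console.WriteLine("Caught: " & ex.GetType().Name)
        End Try
    End Sub
End Module
```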
A Brief Tour of the .NET Framework Namespaces
The .NET Framework provides a huge class library, something on the order of 6,000 types. To help developers navigate through the huge hierarchy of types, Microsoft has divided them into namespaces. However, even the number of namespaces can be daunting. Here are the most common namespaces and an overview of what they contain:

Microsoft.VisualBasic
Runtime support for applications written in Visual Basic .NET. This namespace contains the functions and procedures included in the Visual Basic .NET language.
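
As a small illustration, a few of the string functions in this namespace can be called as follows. The sample string is arbitrary, and the functions are qualified with the Strings module name only to make their origin explicit.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module Program
    Sub Main()
        Dim title As String = "Programming Visual Basic .NET"
        ' Left, Len, and UCase are defined in the Microsoft.VisualBasic namespace.
        Console.WriteLine(Strings.Left(title, 11))
        Console.WriteLine(Strings.Len(title))
        Console.WriteLine(Strings.UCase(title))
    End Sub
End Module
```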


