Test Specific Questions
The following questions about testing are answered below:
Are critical section leaks important?
Can we deal programmatically with stack overflows?
Is it expensive to increase initial stack commits to avoid overflows?
What about reserved stack size?
Are there recommendations on how to pick the right size for LINKER_STACKCOMMITSIZE=?
What are the memory requirements for the heap test?
Why am I receiving an ALL_ACCESS stop?
Why don't I get any logs from the heap and lock tests?
Why is Fault Injection not working?
Why does Leak Verifier not report certain resource leaks?
Are critical section leaks important?
Whenever you leak a critical section you leak the following: an event handle, a small amount of kernel pool and a small heap allocation. These will get cleaned up if the process exits.
If your process is supposed to stay alive for a long time, these leaks can bite you. Since the fix is very easy in 99% of cases (the developer simply forgot to call RtlDeleteCriticalSection), you should address them.
Can we deal programmatically with stack overflows?
Establishing an exception handler in the initial thread function is not guaranteed to catch every stack overflow that might be raised. The code that dispatches exceptions also needs a little stack to execute on top of the current activation record. Because the stack extension has just failed, it is very likely that the dispatcher will step past the end of the committed stack and raise a second exception while trying to dispatch the first one. A double fault exception terminates the process unconditionally.
The LoaderLock test is giving an error about calling DestroyWindow. Why can't I call DestroyWindow in DllMain?
You do not control which thread is going to detach. If it is not the same thread that created the window, you cannot destroy the window. So you leak the window and the next time the window receives a message, you crash because the Wndproc has been unloaded.
You need to destroy the window before you get the process-detach. The danger is not that user32 will be unloaded. The danger is that you are being unloaded. So the next message that the window receives will crash the process because user32 will deliver the message to your Wndproc which does not exist any more.
A window has thread affinity: it can be destroyed only by the thread that created it. Process-detach has no such affinity. The loader lock is not really the big problem; the problem is DllMain. Process-detach is the last time your DLL gets to run code. You must get rid of everything before you return. But since windows have thread affinity, you cannot clean up the window if you are on the wrong thread.
The loader lock enters into the picture if somebody has a global hook installed (e.g., Spy++ is running). In this case, you enter a potential deadlock scenario. Again, the solution is to destroy the window before you get process-detach.
Is it expensive to increase initial stack commits to avoid overflows?
When you commit stack you are just reserving page file space. There is no performance impact. No physical memory is actually used. The only additional cost happens if you actually touch the stack space that you committed. But this will happen anyway even if you do not commit the stack upfront.
Let us see what it would cost to make all services running in svchost.exe bulletproof. On a test machine, I get 9 svchost.exe processes with a total of 139 threads. Rounding that up generously to 200 threads and setting the default stack commit for each thread at 32K, we need roughly 32K x 200 ~ 6.4 MB of page file space to commit all stacks upfront.
This is a pretty small price to pay for reliability.
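The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, and the inputs (139 observed threads, the count of 200 used in the estimate, 32K per stack) are the figures quoted in the text.

```python
def upfront_commit_kb(thread_count, stack_commit_kb):
    """Page file space, in KB, charged when every thread commits its stack upfront."""
    if thread_count < 0 or stack_commit_kb < 0:
        raise ValueError("counts and sizes must be non-negative")
    return thread_count * stack_commit_kb


observed_kb = upfront_commit_kb(139, 32)   # threads actually seen on the test machine
estimate_kb = upfront_commit_kb(200, 32)   # thread count used in the estimate

print("139 threads:", observed_kb, "KB")
print("200 threads:", estimate_kb, "KB")
```

With 200 threads the total is 6400 KB, which is the "roughly 6.4 MB" figure; with the 139 threads actually observed it is 4448 KB.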
What about reserved stack size?
There are interesting items, such as exception dispatching on IA64/AMD64, that require "unexpected" extra stack. There might be some processing happening on RPC worker threads whose stack requirements are beyond reasonable attempts to measure them.
First of all, you should get an idea of all the thread pools living in the process. The NT thread pool, with its alertable wait threads, sometimes needs special attention. For example, a database component from SQL will use alertable sleeps on a thread that is the target of user APCs. This can cause problems with nested calls.
Once you know all the thread pools, get an idea of how to control their stack requirements. For example, RPC reads a registry key for the stack commit. The WDM pump threads get it from the image. For other thread pools, the mileage may vary.
When the picture is clear for all your threads, you can take some action. Not having a huge reserved space helps with address space fragmentation only if threads come and go very often. If you have a stable thread pool that is under your control, then you might gain an advantage by reducing the reserved space as well. It will really help in saving address space for the heaps and address space for users.
Are there recommendations on how to pick the right size for LINKER_STACKCOMMITSIZE=?
The value should be divisible by the page size (4k/8k depending on the CPU). Here are some guidelines for determining the size you need:
1. Convert any recursive functions with potentially unbounded depth (or at least user-inducible high depth) to iterative ones.
2. Reduce alloca usage. Use heap or safealloca.
3. Run Prefast with reduced stack size checking (say 8k). Fix those functions flagged as using too much stack.
4. Set the stack commit to 16k.
5. Run a bunch of tests under a debugger with Application Verifier's "Stacks" check on.
6. When you see a stack overflow, determine the worst offenders and fix them. (See step 5.)
7. When you cannot reduce the stack usage any more, bump the commit by 8k. If you are > 64k there is something wrong; decrease back to 64k and see step 6. Otherwise go to step 5.
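The sizing rules in this procedure (start at 16k, bump by 8k, never exceed 64k, stay divisible by the page size) can be sketched as a small helper. The function names and default arguments below are illustrative assumptions, not part of any tool; the sketch only enumerates candidate values and does not run any tests.

```python
def round_up_to_page(size_bytes, page_size=4096):
    """Round a stack commit size up to the next multiple of the page size."""
    if size_bytes <= 0 or page_size <= 0:
        raise ValueError("sizes must be positive")
    return -(-size_bytes // page_size) * page_size


def commit_size_schedule(start=16 * 1024, step=8 * 1024, cap=64 * 1024, page_size=4096):
    """Yield candidate commit sizes: start at 16k, bump by 8k, stop at the 64k cap."""
    size = round_up_to_page(start, page_size)
    while size <= cap:
        yield size
        size = round_up_to_page(size + step, page_size)


# Candidate values to try, in KB, for 4k pages.
print([size // 1024 for size in commit_size_schedule()])
```

Each value from the schedule would be tried in turn: run the tests with the "Stacks" check on, and move to the next value only after the worst offenders have been fixed and the overflow persists.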
What are the memory requirements for the heap test?
For full heap tests, you'll need 256MB of RAM and at least a 1GB page file. For normal heap tests, you'll need at least 128MB of RAM. There are no specific processor or disk requirements.
Why am I receiving an ALL_ACCESS stop?
Any application that uses _ALL_ACCESS renders the object it is accessing unauditable, because the audit log will not reflect what you have actually done with the object, only what you asked to do with the object.
This condition creates camouflage for a more devious attack. An administrator scanning for attack activity in progress will see nothing wrong with a person requesting ALL_ACCESS on key X, because a particular application always does that. The administrator will think "the person is probably just running Word". The administrator cannot tell that a hacker has penetrated the account and is now probing the system to determine what access the account has, which the hacker can then exploit. The possibilities are endless.
The ACL issue with ALL_ACCESS is that you must always be granted it. If we wanted to someday deny you DELETE access to a certain key, we would not be able to. Even though you were not actually deleting the key, we would be breaking your application because you would request delete access.
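The effect of denying DELETE can be illustrated with access-mask arithmetic. The constants below are the winnt.h values for registry keys; the check itself is a deliberately simplified assumption (every requested bit must be allowed) and is not how the system actually evaluates an ACL.

```python
# Access-mask values as defined in winnt.h.
DELETE = 0x00010000
KEY_READ = 0x00020019
KEY_ALL_ACCESS = 0x000F003F


def access_granted(requested, allowed):
    """Simplified check: the open succeeds only if every requested bit is allowed."""
    return (requested & ~allowed) == 0


# Hypothetical policy: everything in KEY_ALL_ACCESS except DELETE.
allowed_without_delete = KEY_ALL_ACCESS & ~DELETE

print("KEY_READ request:", access_granted(KEY_READ, allowed_without_delete))
print("KEY_ALL_ACCESS request:", access_granted(KEY_ALL_ACCESS, allowed_without_delete))
```

In this model an application that asks only for KEY_READ keeps working once DELETE is denied, while one that asks for KEY_ALL_ACCESS fails to open the key even if it never intended to delete anything.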
Why don't I get any logs from the heap and lock tests?
Those tests are verification layers built into the operating system (and not into the package), and they report errors in a debugger. If you run an application with those tests enabled and have no crashes, then they are not reporting any problems.
If you do run into crashes, it will be necessary to run under a debugger, or pass the application to a developer to test more closely.
Why is fault injection not working?
Based on customer feedback, the fault injection probability was changed to parts per million in AppVerifier builds released after February 2007. So a probability of 0n20000 is 2%, 0n500000 is 50%, and so on (0n is the debugger's prefix for a decimal number).
The !avrf -flt debugger extension can be used to change the probability on the fly in the debugger. However, the Low Resource Simulation check for the process should be turned on for this to work.
The !avrf debugger extension is part of exts.dll, which ships with the debugger package. The changes in !avrf that support the probability change are in the latest debugger package. If you are experiencing problems with fault injection, please update your debuggers and the AppVerifier package.
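Because the probability is expressed in parts per million, a value that was meant as a percentage will be far smaller than intended. The conversion can be sketched as follows; the helper names are made up for illustration.

```python
PPM_SCALE = 1_000_000


def ppm_to_percent(ppm):
    """Convert a parts-per-million fault probability to a percentage."""
    if not 0 <= ppm <= PPM_SCALE:
        raise ValueError("probability must be between 0 and 1,000,000 ppm")
    return ppm * 100 / PPM_SCALE


def percent_to_ppm(percent):
    """Convert a percentage to the parts-per-million value to enter."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return round(percent * PPM_SCALE / 100)


print(ppm_to_percent(20000), ppm_to_percent(500000))
print(percent_to_ppm(2), percent_to_ppm(50))
```

For example, entering 5 while intending 5% requests a probability of 5 in a million, so faults would almost never be injected; the value for 5% is 50000.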
Why does Leak Verifier not report certain resource leaks?
Leak Verifier does not report any resource leaks while a DLL or EXE module is loaded. When a module is unloaded, Leak Verifier issues a stop if any of the resources that were allocated by the module have not been released.
To inspect resources allocated by a loaded DLL or EXE, use the !avrf -leak debugger extension.
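The reporting rule described here can be pictured with a toy model. This is not how Leak Verifier is implemented; the class, its methods, and the module and resource names are all hypothetical and exist only to show why a leak from a module that is still loaded produces no stop.

```python
from collections import defaultdict


class LeakTrackerModel:
    """Toy model: outstanding resources are grouped by the module that allocated them."""

    def __init__(self):
        self._outstanding = defaultdict(set)

    def allocate(self, module, resource):
        self._outstanding[module].add(resource)

    def release(self, module, resource):
        self._outstanding[module].discard(resource)

    def inspect(self, module):
        """Look at a loaded module's outstanding resources without reporting a leak."""
        return sorted(self._outstanding.get(module, set()))

    def unload(self, module):
        """Leaks are reported only here, when the module is unloaded."""
        return sorted(self._outstanding.pop(module, set()))


tracker = LeakTrackerModel()
tracker.allocate("plugin.dll", "handle-1")
tracker.allocate("plugin.dll", "handle-2")
tracker.release("plugin.dll", "handle-1")

print("while loaded:", tracker.inspect("plugin.dll"))
print("at unload:", tracker.unload("plugin.dll"))
```

In the model, `inspect` plays the role that the !avrf -leak extension plays for a loaded module, and `unload` corresponds to the point at which a stop would be issued.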