One of the little-noticed improvements in Windows 10 is parallel library loading support in ntdll.dll. This feature decreases process startup time by using multiple threads to load libraries from disk into memory.
How Windows 10 Implements Parallel Loading
Windows 10 implements parallel loading by creating a pool of worker threads when the process initializes. The parent process sets the number of worker threads in the PEB->ProcessParameters->LoaderThreads (ULONG) field. ntdll!LdrpInitializeExecutionOptions can then override the LoaderThreads field by querying the Image File Execution Options (IFEO) registry value HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\<image.exe>\MaxLoaderThreads.
Interestingly, Windows 10 contains a default entry for chrome.exe with MaxLoaderThreads set to 1 to disable parallel loading.
Figure 1: Querying the IFEO registry key for MaxLoaderThreads
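The override order described above can be sketched in a few lines of Python. This is only a model of the precedence, not ntdll's code: the function names are hypothetical, the registry is not actually read, and the handling of values the text does not discuss (such as zero or an upper limit) is deliberately left out.

```python
from typing import Optional

# Registry location of per-image IFEO settings, as named in the text.
IFEO_ROOT = r"HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options"


def ifeo_value_path(image_name: str, value_name: str = "MaxLoaderThreads") -> str:
    """Build the registry path of an IFEO value for one image (hypothetical helper)."""
    return f"{IFEO_ROOT}\\{image_name}\\{value_name}"


def resolve_loader_threads(parent_loader_threads: int,
                           ifeo_max_loader_threads: Optional[int]) -> int:
    """Return the worker-thread count after applying the IFEO override.

    parent_loader_threads models PEB->ProcessParameters->LoaderThreads.
    ifeo_max_loader_threads models MaxLoaderThreads, or None when the
    registry value is absent. Assumption: a present value always wins.
    """
    if ifeo_max_loader_threads is not None:
        return ifeo_max_loader_threads
    return parent_loader_threads


def disables_parallel_loading(loader_threads: int) -> bool:
    """Per the text, a thread count of 1 turns parallel loading off."""
    return loader_threads == 1
```

With the default chrome.exe entry, `resolve_loader_threads(4, 1)` yields 1 regardless of what the parent requested, and `disables_parallel_loading` reports that parallel loading is off. The parent value of 4 here is an arbitrary illustration.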
The initial thread in the process executing ntdll!LdrInitializeThunk will be referred to as the master thread. Threads created by the master thread in the thread pool will be referred to as worker threads.
ntdll!LdrpInitParallelLoadingSupport and ntdll!LdrpCreateLoaderEvents are called to initialize the following structures:
• ntdll!LdrpWorkQueue (LIST_ENTRY)
• ntdll!LdrpWorkQueueTail (LIST_ENTRY)
• ntdll!LdrpWorkQueueLock (CRITICAL_SECTION)
• ntdll!LdrpRetryQueue (LIST_ENTRY)
• ntdll!LdrpRetryQueueTail (LIST_ENTRY)
• ntdll!LdrpLoadCompleteEvent (HANDLE)
• ntdll!LdrpWorkCompleteEvent (HANDLE)
Figure 2: Initializing the work queue structures
Figure 3: Creating the synchronization events
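To make the role of these structures more concrete, here is a toy Python model of a master thread handing work to a pool of worker threads. The attribute names mirror the ntdll symbols listed above, but everything else is an assumption: the meaning given to the completion event, the way workers drain the queue, and the `load_in_parallel` helper are illustrative and do not reproduce the real loader. A `deque` stands in for each LIST_ENTRY head and tail pair.

```python
import threading
from collections import deque


class LoaderQueues:
    """Toy model of the loader's shared state (not the real ntdll layout)."""

    def __init__(self) -> None:
        self.work_queue = deque()                 # LdrpWorkQueue / LdrpWorkQueueTail
        self.retry_queue = deque()                # LdrpRetryQueue / LdrpRetryQueueTail (unused here)
        self.work_queue_lock = threading.Lock()   # LdrpWorkQueueLock
        self.load_complete = threading.Event()    # LdrpLoadCompleteEvent (unused here)
        self.work_complete = threading.Event()    # LdrpWorkCompleteEvent
        self.loaded = []                          # results, kept only for illustration


def worker(queues: LoaderQueues) -> None:
    """Stand-in for a worker thread: take items until the queue is empty."""
    while True:
        with queues.work_queue_lock:
            if not queues.work_queue:
                return
            name = queues.work_queue.popleft()
        mapped = f"mapped:{name}"  # placeholder for mapping the library from disk
        with queues.work_queue_lock:
            queues.loaded.append(mapped)


def load_in_parallel(names, worker_count: int) -> LoaderQueues:
    """Stand-in for the master thread: queue the work, run workers, signal completion."""
    queues = LoaderQueues()
    queues.work_queue.extend(names)
    threads = [threading.Thread(target=worker, args=(queues,))
               for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    queues.work_complete.set()  # assumed meaning: all queued work has been processed
    return queues
```

Calling `load_in_parallel(["a.dll", "b.dll", "c.dll"], 2)` processes every queued name exactly once, although the order in which the results arrive depends on thread scheduling.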
After ntdll loads kernel32.dll and kernelbase.dll, ntdll!LdrpEnableParallelLoading is called to set up the necessary events and worker pool. One interesting thing to note is that kernel32.dll and kernelbase.dll are loaded even if the process does not require them.
How Windows 10 Mitigates Parallel Loading Hazards
Parallel loading introduces a number of hazards when it is combined with code hooking. To avoid memory corruption and compatibility issues, Windows checks whether a process is hooked before enabling parallel loading.
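The text here does not say how the check is performed. As a rough illustration only, one generic way to spot an inline hook is to compare the leading bytes of a routine with a known-good copy and to refuse parallel loading if any differ. The sketch below assumes that approach; the function names, the 16-byte prefix length, and the sample routine name are all hypothetical and are not taken from ntdll.

```python
from typing import List, Mapping


def detect_modified_functions(known_good: Mapping[str, bytes],
                              current: Mapping[str, bytes],
                              prefix_len: int = 16) -> List[str]:
    """Return names whose leading bytes differ from the known-good copy.

    prefix_len is an arbitrary illustrative choice. A routine missing
    from `current` is treated as modified.
    """
    return sorted(
        name for name, expected in known_good.items()
        if current.get(name, b"")[:prefix_len] != expected[:prefix_len]
    )


def may_enable_parallel_loading(known_good: Mapping[str, bytes],
                                current: Mapping[str, bytes]) -> bool:
    """Assumed policy: allow parallel loading only if nothing looks patched."""
    return not detect_modified_functions(known_good, current)


# Sample data: an x64 prologue (mov r10, rcx; mov eax, imm32) and the same
# routine after its first bytes were overwritten with a jmp rel32 (0xE9).
clean = {"ExampleRoutine": bytes.fromhex("4c8bd1b855000000")}
hooked = {"ExampleRoutine": bytes.fromhex("e9fbffff0f000000")}
```

With this sample data, `may_enable_parallel_loading(clean, clean)` allows parallel loading, while comparing `clean` against `hooked` flags `ExampleRoutine` and blocks it.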
This is a Security Bloggers Network syndicated blog post authored by Jeffrey Tang. Read the original post at: Cylance Blog