reactos/reactos/ntoskrnl/cc/view.c

1393 lines
35 KiB
C
Raw Normal View History

/*
* COPYRIGHT: See COPYING in the top level directory
* PROJECT: ReactOS kernel
* FILE: ntoskrnl/cc/view.c
* PURPOSE: Cache manager
*
* PROGRAMMERS: David Welch (welch@mcmail.com)
*/
/* NOTES **********************************************************************
*
* This is not the NT implementation of a file cache nor anything much like
* it.
*
* The general procedure for a filesystem to implement a read or write
* dispatch routine is as follows
*
* (1) If caching for the FCB hasn't been initiated then so do by calling
* CcInitializeFileCache.
*
* (2) For each 4k region which is being read or written obtain a cache page
* by calling CcRequestCachePage.
*
* (3) If either the page is being read or not completely written, and it is
* not up to date then read its data from the underlying medium. If the read
* fails then call CcReleaseCachePage with VALID as FALSE and return a error.
*
* (4) Copy the data into or out of the page as necessary.
*
* (5) Release the cache page
*/
/* INCLUDES ******************************************************************/
#include <ntoskrnl.h>
#define NDEBUG
#include <debug.h>
#if defined (ALLOC_PRAGMA)
#pragma alloc_text(INIT, CcInitView)
#endif
/* GLOBALS *******************************************************************/
/*
* If CACHE_BITMAP is defined, the cache manager uses one large memory region
* within the kernel address space and allocate/deallocate space from this block
* over a bitmap. If CACHE_BITMAP is used, the size of the mdl mapping region
* must be reduced (ntoskrnl\mm\mdl.c, MI_MDLMAPPING_REGION_SIZE).
*/
//#define CACHE_BITMAP
static LIST_ENTRY DirtySegmentListHead;
static LIST_ENTRY CacheSegmentListHead;
static LIST_ENTRY CacheSegmentLRUListHead;
static LIST_ENTRY ClosedListHead;
ULONG DirtyPageCount=0;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KGUARDED_MUTEX ViewLock;
#ifdef CACHE_BITMAP
#define CI_CACHESEG_MAPPING_REGION_SIZE (128*1024*1024)
static PVOID CiCacheSegMappingRegionBase = NULL;
static RTL_BITMAP CiCacheSegMappingRegionAllocMap;
static ULONG CiCacheSegMappingRegionHint;
static KSPIN_LOCK CiCacheSegMappingRegionLock;
#endif
NPAGED_LOOKASIDE_LIST iBcbLookasideList;
static NPAGED_LOOKASIDE_LIST BcbLookasideList;
static NPAGED_LOOKASIDE_LIST CacheSegLookasideList;
#if DBG
static void CcRosCacheSegmentIncRefCount_ ( PCACHE_SEGMENT cs, const char* file, int line )
{
++cs->ReferenceCount;
if ( cs->Bcb->Trace )
{
DbgPrint("(%s:%i) CacheSegment %p ++RefCount=%d, Dirty %d, PageOut %d\n",
file, line, cs, cs->ReferenceCount, cs->Dirty, cs->PageOut );
}
}
static void CcRosCacheSegmentDecRefCount_ ( PCACHE_SEGMENT cs, const char* file, int line )
{
--cs->ReferenceCount;
if ( cs->Bcb->Trace )
{
DbgPrint("(%s:%i) CacheSegment %p --RefCount=%d, Dirty %d, PageOut %d\n",
file, line, cs, cs->ReferenceCount, cs->Dirty, cs->PageOut );
}
}
#define CcRosCacheSegmentIncRefCount(cs) CcRosCacheSegmentIncRefCount_(cs,__FILE__,__LINE__)
#define CcRosCacheSegmentDecRefCount(cs) CcRosCacheSegmentDecRefCount_(cs,__FILE__,__LINE__)
#else
#define CcRosCacheSegmentIncRefCount(cs) (++((cs)->ReferenceCount))
#define CcRosCacheSegmentDecRefCount(cs) (--((cs)->ReferenceCount))
#endif
NTSTATUS
CcRosInternalFreeCacheSegment(PCACHE_SEGMENT CacheSeg);
/* FUNCTIONS *****************************************************************/
VOID
NTAPI
CcRosTraceCacheMap (
PBCB Bcb,
BOOLEAN Trace )
{
#if DBG
KIRQL oldirql;
PLIST_ENTRY current_entry;
PCACHE_SEGMENT current;
if ( !Bcb )
return;
Bcb->Trace = Trace;
if ( Trace )
{
DPRINT1("Enabling Tracing for CacheMap 0x%p:\n", Bcb );
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
KeAcquireSpinLock(&Bcb->BcbLock, &oldirql);
current_entry = Bcb->BcbSegmentListHead.Flink;
while (current_entry != &Bcb->BcbSegmentListHead)
{
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT, BcbSegmentListEntry);
current_entry = current_entry->Flink;
DPRINT1(" CacheSegment 0x%p enabled, RefCount %d, Dirty %d, PageOut %d\n",
current, current->ReferenceCount, current->Dirty, current->PageOut );
}
KeReleaseSpinLock(&Bcb->BcbLock, oldirql);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
}
else
{
DPRINT1("Disabling Tracing for CacheMap 0x%p:\n", Bcb );
}
#else
Bcb = Bcb;
Trace = Trace;
#endif
}
NTSTATUS
NTAPI
CcRosFlushCacheSegment(PCACHE_SEGMENT CacheSegment)
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
NTSTATUS Status;
KIRQL oldIrql;
Status = WriteCacheSegment(CacheSegment);
if (NT_SUCCESS(Status))
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
KeAcquireSpinLock(&CacheSegment->Bcb->BcbLock, &oldIrql);
CacheSegment->Dirty = FALSE;
RemoveEntryList(&CacheSegment->DirtySegmentListEntry);
DirtyPageCount -= CacheSegment->Bcb->CacheSegmentSize / PAGE_SIZE;
CcRosCacheSegmentDecRefCount ( CacheSegment );
KeReleaseSpinLock(&CacheSegment->Bcb->BcbLock, oldIrql);
KeReleaseGuardedMutex(&ViewLock);
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
return(Status);
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
}
NTSTATUS
NTAPI
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
CcRosFlushDirtyPages(ULONG Target, PULONG Count)
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
PLIST_ENTRY current_entry;
PCACHE_SEGMENT current;
ULONG PagesPerSegment;
BOOLEAN Locked;
NTSTATUS Status;
static ULONG WriteCount[4] = {0, 0, 0, 0};
ULONG NewTarget;
DPRINT("CcRosFlushDirtyPages(Target %d)\n", Target);
(*Count) = 0;
KeEnterCriticalRegion();
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
WriteCount[0] = WriteCount[1];
WriteCount[1] = WriteCount[2];
WriteCount[2] = WriteCount[3];
WriteCount[3] = 0;
NewTarget = WriteCount[0] + WriteCount[1] + WriteCount[2];
if (NewTarget < DirtyPageCount)
{
NewTarget = (DirtyPageCount - NewTarget + 3) / 4;
WriteCount[0] += NewTarget;
WriteCount[1] += NewTarget;
WriteCount[2] += NewTarget;
WriteCount[3] += NewTarget;
}
NewTarget = WriteCount[0];
Target = max(NewTarget, Target);
current_entry = DirtySegmentListHead.Flink;
if (current_entry == &DirtySegmentListHead)
{
DPRINT("No Dirty pages\n");
}
while (current_entry != &DirtySegmentListHead && Target > 0)
{
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT,
DirtySegmentListEntry);
current_entry = current_entry->Flink;
Locked = current->Bcb->Callbacks->AcquireForLazyWrite(
current->Bcb->LazyWriteContext, FALSE);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
if (!Locked)
{
continue;
}
Locked = ExTryToAcquirePushLockExclusive(&current->Lock);
if (!Locked)
{
current->Bcb->Callbacks->ReleaseFromLazyWrite(
current->Bcb->LazyWriteContext);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
continue;
}
ASSERT(current->Dirty);
if (current->ReferenceCount > 1)
{
ExReleasePushLock(&current->Lock);
current->Bcb->Callbacks->ReleaseFromLazyWrite(
current->Bcb->LazyWriteContext);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
continue;
}
PagesPerSegment = current->Bcb->CacheSegmentSize / PAGE_SIZE;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
Status = CcRosFlushCacheSegment(current);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
ExReleasePushLock(&current->Lock);
current->Bcb->Callbacks->ReleaseFromLazyWrite(
current->Bcb->LazyWriteContext);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
if (!NT_SUCCESS(Status) && (Status != STATUS_END_OF_FILE))
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
DPRINT1("CC: Failed to flush cache segment.\n");
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
else
{
(*Count) += PagesPerSegment;
Target -= PagesPerSegment;
}
KeAcquireGuardedMutex(&ViewLock);
current_entry = DirtySegmentListHead.Flink;
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
if (*Count < NewTarget)
{
WriteCount[1] += (NewTarget - *Count);
}
KeReleaseGuardedMutex(&ViewLock);
KeLeaveCriticalRegion();
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
DPRINT("CcRosFlushDirtyPages() finished\n");
return(STATUS_SUCCESS);
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
}
NTSTATUS
CcRosTrimCache(ULONG Target, ULONG Priority, PULONG NrFreed)
/*
* FUNCTION: Try to free some memory from the file cache.
* ARGUMENTS:
* Target - The number of pages to be freed.
* Priority - The priority of free (currently unused).
* NrFreed - Points to a variable where the number of pages
* actually freed is returned.
*/
{
PLIST_ENTRY current_entry;
PCACHE_SEGMENT current;
ULONG PagesPerSegment;
ULONG PagesFreed;
KIRQL oldIrql;
LIST_ENTRY FreeList;
PFN_NUMBER Page;
ULONG i;
DPRINT("CcRosTrimCache(Target %d)\n", Target);
InitializeListHead(&FreeList);
/* Flush dirty pages to disk */
CcRosFlushDirtyPages(Target, NrFreed);
if ((*NrFreed) != 0) DPRINT1("Flushed %d dirty cache pages to disk\n", (*NrFreed));
*NrFreed = 0;
KeAcquireGuardedMutex(&ViewLock);
current_entry = CacheSegmentLRUListHead.Flink;
while (current_entry != &CacheSegmentLRUListHead)
{
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT,
CacheSegmentLRUListEntry);
current_entry = current_entry->Flink;
KeAcquireSpinLock(&current->Bcb->BcbLock, &oldIrql);
/* Reference the cache segment */
CcRosCacheSegmentIncRefCount(current);
/* Check if it's mapped and not dirty */
if (current->MappedCount > 0 && !current->Dirty)
{
/* We have to break these locks because Cc sucks */
KeReleaseSpinLock(&current->Bcb->BcbLock, oldIrql);
KeReleaseGuardedMutex(&ViewLock);
/* Page out the segment */
for (i = 0; i < current->Bcb->CacheSegmentSize / PAGE_SIZE; i++)
{
Page = (PFN_NUMBER)(MmGetPhysicalAddress((PUCHAR)current->BaseAddress + (i * PAGE_SIZE)).QuadPart >> PAGE_SHIFT);
MmPageOutPhysicalAddress(Page);
}
/* Reacquire the locks */
KeAcquireGuardedMutex(&ViewLock);
KeAcquireSpinLock(&current->Bcb->BcbLock, &oldIrql);
}
/* Dereference the cache segment */
CcRosCacheSegmentDecRefCount(current);
/* Check if we can free this entry now */
if (current->ReferenceCount == 0)
{
ASSERT(!current->Dirty);
ASSERT(!current->MappedCount);
RemoveEntryList(&current->BcbSegmentListEntry);
RemoveEntryList(&current->CacheSegmentListEntry);
RemoveEntryList(&current->CacheSegmentLRUListEntry);
InsertHeadList(&FreeList, &current->BcbSegmentListEntry);
/* Calculate how many pages we freed for Mm */
PagesPerSegment = current->Bcb->CacheSegmentSize / PAGE_SIZE;
PagesFreed = min(PagesPerSegment, Target);
Target -= PagesFreed;
(*NrFreed) += PagesFreed;
}
KeReleaseSpinLock(&current->Bcb->BcbLock, oldIrql);
}
KeReleaseGuardedMutex(&ViewLock);
while (!IsListEmpty(&FreeList))
{
current_entry = RemoveHeadList(&FreeList);
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT,
BcbSegmentListEntry);
CcRosInternalFreeCacheSegment(current);
}
DPRINT1("Evicted %d cache pages\n", (*NrFreed));
return(STATUS_SUCCESS);
}
NTSTATUS
NTAPI
CcRosReleaseCacheSegment(PBCB Bcb,
PCACHE_SEGMENT CacheSeg,
BOOLEAN Valid,
BOOLEAN Dirty,
BOOLEAN Mapped)
{
BOOLEAN WasDirty = CacheSeg->Dirty;
KIRQL oldIrql;
ASSERT(Bcb);
DPRINT("CcReleaseCacheSegment(Bcb 0x%p, CacheSeg 0x%p, Valid %d)\n",
Bcb, CacheSeg, Valid);
CacheSeg->Valid = Valid;
CacheSeg->Dirty = CacheSeg->Dirty || Dirty;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
if (!WasDirty && CacheSeg->Dirty)
{
InsertTailList(&DirtySegmentListHead, &CacheSeg->DirtySegmentListEntry);
DirtyPageCount += Bcb->CacheSegmentSize / PAGE_SIZE;
}
RemoveEntryList(&CacheSeg->CacheSegmentLRUListEntry);
InsertTailList(&CacheSegmentLRUListHead, &CacheSeg->CacheSegmentLRUListEntry);
if (Mapped)
{
CacheSeg->MappedCount++;
}
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
CcRosCacheSegmentDecRefCount(CacheSeg);
if (Mapped && CacheSeg->MappedCount == 1)
{
CcRosCacheSegmentIncRefCount(CacheSeg);
}
if (!WasDirty && CacheSeg->Dirty)
{
CcRosCacheSegmentIncRefCount(CacheSeg);
}
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
ExReleasePushLock(&CacheSeg->Lock);
return(STATUS_SUCCESS);
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
/* Returns with Cache Segment Lock Held! */
PCACHE_SEGMENT
NTAPI
CcRosLookupCacheSegment(PBCB Bcb, ULONG FileOffset)
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
PLIST_ENTRY current_entry;
PCACHE_SEGMENT current;
KIRQL oldIrql;
ASSERT(Bcb);
DPRINT("CcRosLookupCacheSegment(Bcb -x%p, FileOffset %d)\n", Bcb, FileOffset);
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
current_entry = Bcb->BcbSegmentListHead.Flink;
while (current_entry != &Bcb->BcbSegmentListHead)
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT,
BcbSegmentListEntry);
if (current->FileOffset <= FileOffset &&
(current->FileOffset + Bcb->CacheSegmentSize) > FileOffset)
{
CcRosCacheSegmentIncRefCount(current);
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
ExAcquirePushLockExclusive(&current->Lock);
return(current);
}
current_entry = current_entry->Flink;
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
return(NULL);
}
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
NTSTATUS
NTAPI
CcRosMarkDirtyCacheSegment(PBCB Bcb, ULONG FileOffset)
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
{
PCACHE_SEGMENT CacheSeg;
KIRQL oldIrql;
ASSERT(Bcb);
DPRINT("CcRosMarkDirtyCacheSegment(Bcb 0x%p, FileOffset %d)\n", Bcb, FileOffset);
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
CacheSeg = CcRosLookupCacheSegment(Bcb, FileOffset);
if (CacheSeg == NULL)
{
KeBugCheck(CACHE_MANAGER);
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
}
if (!CacheSeg->Dirty)
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
InsertTailList(&DirtySegmentListHead, &CacheSeg->DirtySegmentListEntry);
DirtyPageCount += Bcb->CacheSegmentSize / PAGE_SIZE;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
}
else
{
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
CcRosCacheSegmentDecRefCount(CacheSeg);
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
}
2002-08-14 David Welch <welch@computer2.darkstar.org> * subsys/smss/init.c (SmPagingFilesQueryRoutine): If possible take the size of the paging file from the registry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section.c (MmCreateDataFileSection): Extend the section if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/pagefile.c (NtCreatePagingFile): Set the file size using the FileAllocationInformation class. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/anonmem.c (MmWritePageVirtualMemory): Implemented function to write anonymous memory pages to the swap file. * ntoskrnl/mm/anonmem.c (MmFreeVirtualMemoryPage): Free any swap page associated with the page. * ntoskrnl/mm/mpw.c (MmWriteDirtyPages): New function to find pages to write to disk. * ntoskrnl/mm/mpw.c (MmMpwThreadMain): Implemented MPW functionality. * ntoskrnl/mm/rmap.c (MmWritePagePhysicalAddress): New function to write a single page back to disk. * ntoskrnl/mm/rmap.c (MmSetCleanAllRmaps, MmSetDirtyAllRmaps, MmIsDirtyPageRmap): New rmap function to support the MPW thread. * ntoskrnl/mm/section.c (MmWritePageSectionView): Implemented function to write back section pages. * ntoskrnl/mm/section.c (MmFreeSectionPage): Free any swap entry associated with the page; mark pages shared with the cache as dirty if necessary. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ldr/loader.c (LdrPEProcessModule): Set name of the module into the module text structure. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/io/rw.c (NtReadFile, NtWriteFile): Use the correct test for whether to wait for the completion of i/o. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cm/ntfunc.c (NtFlushKey): Request synchronous i/o from NtOpenFile. * ntoskrnl/cm/regfile (CmiInitPermanentRegistryHive): Request synchronous i/o from NtCreateFile. * ntoskrnl/dbg/kdb_stabs.c (LdrpLoadModuleSymbols): Request synchronous i/o from NtOpenFile. * ntoskrnl/ldr/sysdll.c (LdrpMapSystemDll): Request synchronous i/o from NtOpenFile. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosSuggestFreeCacheSegment): Maintain the correct reference count. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/view.c (CcRosFlushCacheSegment): New function to write back a modified cache segment. * ntoskrnl/cc/view.c (CcRosFlushDirtyPages): New function to flush some dirty pages from the cache. * ntoskrnl/cc/view.c (CcRosMarkDirtyCacheSegment): New function to mark a cache segment modified while mapped into memory as dirty. 2002-08-14 David Welch <welch@computer2.darkstar.org> * ntoskrnl/cc/pin.c (CcMapData, CcUnpinData, CcSetDirtyPinnedData): Store the dirty status in the BCB; don't write back dirty data immediately. 2002-08-14 David Welch <welch@computer2.darkstar.org> * include/ntos/mm.h: Added SEC_XXXX defines from 'Windows NT/2000 Native API Reference' 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/ea.c (VfatSetExtendedAttributes): Empty placeholder for extended attribute functions. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/finfo.c (VfatSetAllocationSizeInformation): Added function to set allocation size. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/fcb.c (vfatFCBInitializeCache): Renamed to vfatFCBInitializeCacheFromVolume. * drivers/fs/vfat/fcb.c (vfatMakeFCBFromDirEntry): Don't initialise the cache with a file object representing the volume unless the FCB is for a directory. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/create.c (VfatPagingFileCreate): Added a new function for handling paging file only code. * drivers/fs/vfat/create.c (VfatSupersedeFile): Added a new function for doing a file supersede. * drivers/fs/vfat/create.c (VfatCreateFile): Reformatted and adjusted control flow. Set allocation size and extended attributes on create. * drivers/fs/vfat/create.c (VfatCreate): Removed goto. 2002-08-14 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/close.c (VfatCloseFile): Renamed updEntry to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (updEntry): Renamed to VfatUpdateEntry. * drivers/fs/vfat/dirwr.c (addEntry): Renamed to VfatAddEntry. 2002-08-14 David Welch <welch@computer2.darkstar.org> * apps/tests/sectest/sectest.c (main): Fixed formatting. svn path=/trunk/; revision=3331
2002-08-14 20:58:39 +00:00
CacheSeg->Dirty = TRUE;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
ExReleasePushLock(&CacheSeg->Lock);
return(STATUS_SUCCESS);
}
NTSTATUS
NTAPI
CcRosUnmapCacheSegment(PBCB Bcb, ULONG FileOffset, BOOLEAN NowDirty)
{
PCACHE_SEGMENT CacheSeg;
BOOLEAN WasDirty;
KIRQL oldIrql;
ASSERT(Bcb);
DPRINT("CcRosUnmapCacheSegment(Bcb 0x%p, FileOffset %d, NowDirty %d)\n",
Bcb, FileOffset, NowDirty);
CacheSeg = CcRosLookupCacheSegment(Bcb, FileOffset);
if (CacheSeg == NULL)
{
return(STATUS_UNSUCCESSFUL);
}
WasDirty = CacheSeg->Dirty;
CacheSeg->Dirty = CacheSeg->Dirty || NowDirty;
CacheSeg->MappedCount--;
if (!WasDirty && NowDirty)
{
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
InsertTailList(&DirtySegmentListHead, &CacheSeg->DirtySegmentListEntry);
DirtyPageCount += Bcb->CacheSegmentSize / PAGE_SIZE;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
}
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
CcRosCacheSegmentDecRefCount(CacheSeg);
if (!WasDirty && NowDirty)
{
CcRosCacheSegmentIncRefCount(CacheSeg);
}
if (CacheSeg->MappedCount == 0)
{
CcRosCacheSegmentDecRefCount(CacheSeg);
}
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
ExReleasePushLock(&CacheSeg->Lock);
return(STATUS_SUCCESS);
}
static
NTSTATUS
CcRosCreateCacheSegment(PBCB Bcb,
ULONG FileOffset,
PCACHE_SEGMENT* CacheSeg)
{
PCACHE_SEGMENT current;
PCACHE_SEGMENT previous;
PLIST_ENTRY current_entry;
NTSTATUS Status;
KIRQL oldIrql;
#ifdef CACHE_BITMAP
ULONG StartingOffset;
#endif
PHYSICAL_ADDRESS BoundaryAddressMultiple;
ASSERT(Bcb);
DPRINT("CcRosCreateCacheSegment()\n");
BoundaryAddressMultiple.QuadPart = 0;
if (FileOffset >= Bcb->FileSize.u.LowPart)
{
CacheSeg = NULL;
return STATUS_INVALID_PARAMETER;
}
current = ExAllocateFromNPagedLookasideList(&CacheSegLookasideList);
current->Valid = FALSE;
current->Dirty = FALSE;
current->PageOut = FALSE;
current->FileOffset = ROUND_DOWN(FileOffset, Bcb->CacheSegmentSize);
current->Bcb = Bcb;
#if DBG
if ( Bcb->Trace )
{
DPRINT1("CacheMap 0x%p: new Cache Segment: 0x%p\n", Bcb, current );
}
#endif
current->MappedCount = 0;
current->DirtySegmentListEntry.Flink = NULL;
current->DirtySegmentListEntry.Blink = NULL;
current->ReferenceCount = 1;
ExInitializePushLock(&current->Lock);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
ExAcquirePushLockExclusive(&current->Lock);
KeAcquireGuardedMutex(&ViewLock);
*CacheSeg = current;
/* There is window between the call to CcRosLookupCacheSegment
* and CcRosCreateCacheSegment. We must check if a segment on
* the fileoffset exist. If there exist a segment, we release
* our new created segment and return the existing one.
*/
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
current_entry = Bcb->BcbSegmentListHead.Flink;
previous = NULL;
while (current_entry != &Bcb->BcbSegmentListHead)
{
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT,
BcbSegmentListEntry);
if (current->FileOffset <= FileOffset &&
(current->FileOffset + Bcb->CacheSegmentSize) > FileOffset)
{
CcRosCacheSegmentIncRefCount(current);
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
#if DBG
if ( Bcb->Trace )
{
DPRINT1("CacheMap 0x%p: deleting newly created Cache Segment 0x%p ( found existing one 0x%p )\n",
Bcb,
(*CacheSeg),
current );
}
#endif
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
ExReleasePushLock(&(*CacheSeg)->Lock);
KeReleaseGuardedMutex(&ViewLock);
ExFreeToNPagedLookasideList(&CacheSegLookasideList, *CacheSeg);
*CacheSeg = current;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
ExAcquirePushLockExclusive(&current->Lock);
return STATUS_SUCCESS;
}
if (current->FileOffset < FileOffset)
{
if (previous == NULL)
{
previous = current;
}
else
{
if (previous->FileOffset < current->FileOffset)
{
previous = current;
}
}
}
current_entry = current_entry->Flink;
}
/* There was no existing segment. */
current = *CacheSeg;
if (previous)
{
InsertHeadList(&previous->BcbSegmentListEntry, &current->BcbSegmentListEntry);
}
else
{
InsertHeadList(&Bcb->BcbSegmentListHead, &current->BcbSegmentListEntry);
}
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
InsertTailList(&CacheSegmentListHead, &current->CacheSegmentListEntry);
InsertTailList(&CacheSegmentLRUListHead, &current->CacheSegmentLRUListEntry);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
#ifdef CACHE_BITMAP
KeAcquireSpinLock(&CiCacheSegMappingRegionLock, &oldIrql);
StartingOffset = RtlFindClearBitsAndSet(&CiCacheSegMappingRegionAllocMap, Bcb->CacheSegmentSize / PAGE_SIZE, CiCacheSegMappingRegionHint);
if (StartingOffset == 0xffffffff)
{
DPRINT1("Out of CacheSeg mapping space\n");
KeBugCheck(CACHE_MANAGER);
}
current->BaseAddress = CiCacheSegMappingRegionBase + StartingOffset * PAGE_SIZE;
if (CiCacheSegMappingRegionHint == StartingOffset)
{
CiCacheSegMappingRegionHint += Bcb->CacheSegmentSize / PAGE_SIZE;
}
KeReleaseSpinLock(&CiCacheSegMappingRegionLock, oldIrql);
#else
MmLockAddressSpace(MmGetKernelAddressSpace());
current->BaseAddress = NULL;
Status = MmCreateMemoryArea(MmGetKernelAddressSpace(),
0, // nothing checks for cache_segment mareas, so set to 0
&current->BaseAddress,
Bcb->CacheSegmentSize,
PAGE_READWRITE,
(PMEMORY_AREA*)&current->MemoryArea,
FALSE,
0,
BoundaryAddressMultiple);
MmUnlockAddressSpace(MmGetKernelAddressSpace());
if (!NT_SUCCESS(Status))
{
KeBugCheck(CACHE_MANAGER);
}
#endif
/* Create a virtual mapping for this memory area */
MI_SET_USAGE(MI_USAGE_CACHE);
#if MI_TRACE_PFNS
PWCHAR pos = NULL;
ULONG len = 0;
if ((Bcb->FileObject) && (Bcb->FileObject->FileName.Buffer))
{
pos = wcsrchr(Bcb->FileObject->FileName.Buffer, '\\');
len = wcslen(pos) * sizeof(WCHAR);
if (pos) snprintf(MI_PFN_CURRENT_PROCESS_NAME, min(16, len), "%S", pos);
}
#endif
MmMapMemoryArea(current->BaseAddress, Bcb->CacheSegmentSize,
MC_CACHE, PAGE_READWRITE);
return(STATUS_SUCCESS);
}
NTSTATUS
NTAPI
CcRosGetCacheSegmentChain(PBCB Bcb,
ULONG FileOffset,
ULONG Length,
PCACHE_SEGMENT* CacheSeg)
{
PCACHE_SEGMENT current;
ULONG i;
PCACHE_SEGMENT* CacheSegList;
PCACHE_SEGMENT Previous = NULL;
ASSERT(Bcb);
DPRINT("CcRosGetCacheSegmentChain()\n");
Length = ROUND_UP(Length, Bcb->CacheSegmentSize);
CacheSegList = _alloca(sizeof(PCACHE_SEGMENT) *
(Length / Bcb->CacheSegmentSize));
/*
* Look for a cache segment already mapping the same data.
*/
for (i = 0; i < (Length / Bcb->CacheSegmentSize); i++)
{
ULONG CurrentOffset = FileOffset + (i * Bcb->CacheSegmentSize);
current = CcRosLookupCacheSegment(Bcb, CurrentOffset);
if (current != NULL)
{
CacheSegList[i] = current;
}
else
{
CcRosCreateCacheSegment(Bcb, CurrentOffset, &current);
CacheSegList[i] = current;
}
}
for (i = 0; i < (Length / Bcb->CacheSegmentSize); i++)
{
if (i == 0)
{
*CacheSeg = CacheSegList[i];
Previous = CacheSegList[i];
}
else
{
Previous->NextInChain = CacheSegList[i];
Previous = CacheSegList[i];
}
}
ASSERT(Previous);
Previous->NextInChain = NULL;
return(STATUS_SUCCESS);
}
NTSTATUS
NTAPI
CcRosGetCacheSegment(PBCB Bcb,
ULONG FileOffset,
PULONG BaseOffset,
PVOID* BaseAddress,
PBOOLEAN UptoDate,
PCACHE_SEGMENT* CacheSeg)
{
PCACHE_SEGMENT current;
NTSTATUS Status;
ASSERT(Bcb);
DPRINT("CcRosGetCacheSegment()\n");
/*
* Look for a cache segment already mapping the same data.
*/
current = CcRosLookupCacheSegment(Bcb, FileOffset);
if (current == NULL)
{
/*
* Otherwise create a new segment.
*/
Status = CcRosCreateCacheSegment(Bcb, FileOffset, &current);
if (!NT_SUCCESS(Status))
{
return Status;
}
}
/*
* Return information about the segment to the caller.
*/
*UptoDate = current->Valid;
*BaseAddress = current->BaseAddress;
DPRINT("*BaseAddress 0x%.8X\n", *BaseAddress);
*CacheSeg = current;
*BaseOffset = current->FileOffset;
return(STATUS_SUCCESS);
}
NTSTATUS NTAPI
CcRosRequestCacheSegment(PBCB Bcb,
ULONG FileOffset,
PVOID* BaseAddress,
PBOOLEAN UptoDate,
PCACHE_SEGMENT* CacheSeg)
/*
* FUNCTION: Request a page mapping for a BCB
*/
{
ULONG BaseOffset;
ASSERT(Bcb);
if ((FileOffset % Bcb->CacheSegmentSize) != 0)
{
DPRINT1("Bad fileoffset %x should be multiple of %x",
FileOffset, Bcb->CacheSegmentSize);
KeBugCheck(CACHE_MANAGER);
}
return(CcRosGetCacheSegment(Bcb,
FileOffset,
&BaseOffset,
BaseAddress,
UptoDate,
CacheSeg));
}
#ifdef CACHE_BITMAP
#else
static VOID
CcFreeCachePage(PVOID Context, MEMORY_AREA* MemoryArea, PVOID Address,
PFN_NUMBER Page, SWAPENTRY SwapEntry, BOOLEAN Dirty)
{
ASSERT(SwapEntry == 0);
if (Page != 0)
{
MmReleasePageMemoryConsumer(MC_CACHE, Page);
}
}
#endif
NTSTATUS
CcRosInternalFreeCacheSegment(PCACHE_SEGMENT CacheSeg)
/*
* FUNCTION: Releases a cache segment associated with a BCB
*/
{
#ifdef CACHE_BITMAP
ULONG i;
ULONG RegionSize;
ULONG Base;
PFN_NUMBER Page;
KIRQL oldIrql;
#endif
DPRINT("Freeing cache segment 0x%p\n", CacheSeg);
#if DBG
if ( CacheSeg->Bcb->Trace )
{
DPRINT1("CacheMap 0x%p: deleting Cache Segment: 0x%p\n", CacheSeg->Bcb, CacheSeg );
}
#endif
#ifdef CACHE_BITMAP
RegionSize = CacheSeg->Bcb->CacheSegmentSize / PAGE_SIZE;
/* Unmap all the pages. */
for (i = 0; i < RegionSize; i++)
{
MmDeleteVirtualMapping(NULL,
CacheSeg->BaseAddress + (i * PAGE_SIZE),
FALSE,
NULL,
&Page);
MmReleasePageMemoryConsumer(MC_CACHE, Page);
}
KeAcquireSpinLock(&CiCacheSegMappingRegionLock, &oldIrql);
/* Deallocate all the pages used. */
Base = (ULONG)(CacheSeg->BaseAddress - CiCacheSegMappingRegionBase) / PAGE_SIZE;
RtlClearBits(&CiCacheSegMappingRegionAllocMap, Base, RegionSize);
CiCacheSegMappingRegionHint = min (CiCacheSegMappingRegionHint, Base);
KeReleaseSpinLock(&CiCacheSegMappingRegionLock, oldIrql);
#else
MmLockAddressSpace(MmGetKernelAddressSpace());
MmFreeMemoryArea(MmGetKernelAddressSpace(),
CacheSeg->MemoryArea,
CcFreeCachePage,
NULL);
MmUnlockAddressSpace(MmGetKernelAddressSpace());
#endif
ExFreeToNPagedLookasideList(&CacheSegLookasideList, CacheSeg);
return(STATUS_SUCCESS);
}
NTSTATUS
NTAPI
CcRosFreeCacheSegment(PBCB Bcb, PCACHE_SEGMENT CacheSeg)
{
NTSTATUS Status;
KIRQL oldIrql;
ASSERT(Bcb);
DPRINT("CcRosFreeCacheSegment(Bcb 0x%p, CacheSeg 0x%p)\n",
Bcb, CacheSeg);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
RemoveEntryList(&CacheSeg->BcbSegmentListEntry);
RemoveEntryList(&CacheSeg->CacheSegmentListEntry);
RemoveEntryList(&CacheSeg->CacheSegmentLRUListEntry);
if (CacheSeg->Dirty)
{
RemoveEntryList(&CacheSeg->DirtySegmentListEntry);
DirtyPageCount -= Bcb->CacheSegmentSize / PAGE_SIZE;
}
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
Status = CcRosInternalFreeCacheSegment(CacheSeg);
return(Status);
}
/*
* @implemented
*/
VOID NTAPI
CcFlushCache(IN PSECTION_OBJECT_POINTERS SectionObjectPointers,
IN PLARGE_INTEGER FileOffset OPTIONAL,
IN ULONG Length,
OUT PIO_STATUS_BLOCK IoStatus)
{
PBCB Bcb;
LARGE_INTEGER Offset;
PCACHE_SEGMENT current;
NTSTATUS Status;
KIRQL oldIrql;
DPRINT("CcFlushCache(SectionObjectPointers 0x%p, FileOffset 0x%p, Length %d, IoStatus 0x%p)\n",
SectionObjectPointers, FileOffset, Length, IoStatus);
if (SectionObjectPointers && SectionObjectPointers->SharedCacheMap)
{
Bcb = (PBCB)SectionObjectPointers->SharedCacheMap;
ASSERT(Bcb);
if (FileOffset)
{
Offset = *FileOffset;
}
else
{
Offset.QuadPart = (LONGLONG)0;
Length = Bcb->FileSize.u.LowPart;
}
if (IoStatus)
{
IoStatus->Status = STATUS_SUCCESS;
IoStatus->Information = 0;
}
while (Length > 0)
{
current = CcRosLookupCacheSegment (Bcb, Offset.u.LowPart);
if (current != NULL)
{
if (current->Dirty)
{
Status = CcRosFlushCacheSegment(current);
if (!NT_SUCCESS(Status) && IoStatus != NULL)
{
IoStatus->Status = Status;
}
}
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
ExReleasePushLock(&current->Lock);
CcRosCacheSegmentDecRefCount(current);
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
}
Offset.QuadPart += Bcb->CacheSegmentSize;
if (Length > Bcb->CacheSegmentSize)
{
Length -= Bcb->CacheSegmentSize;
}
else
{
Length = 0;
}
}
}
else
{
if (IoStatus)
{
IoStatus->Status = STATUS_INVALID_PARAMETER;
}
}
}
NTSTATUS
NTAPI
CcRosDeleteFileCache(PFILE_OBJECT FileObject, PBCB Bcb)
/*
* FUNCTION: Releases the BCB associated with a file object
*/
{
PLIST_ENTRY current_entry;
PCACHE_SEGMENT current;
LIST_ENTRY FreeList;
KIRQL oldIrql;
ASSERT(Bcb);
Bcb->RefCount++;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
CcFlushCache(FileObject->SectionObjectPointer, NULL, 0, NULL);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
Bcb->RefCount--;
if (Bcb->RefCount == 0)
{
if (Bcb->BcbRemoveListEntry.Flink != NULL)
{
RemoveEntryList(&Bcb->BcbRemoveListEntry);
Bcb->BcbRemoveListEntry.Flink = NULL;
}
FileObject->SectionObjectPointer->SharedCacheMap = NULL;
/*
* Release all cache segments.
*/
InitializeListHead(&FreeList);
KeAcquireSpinLock(&Bcb->BcbLock, &oldIrql);
current_entry = Bcb->BcbSegmentListHead.Flink;
while (!IsListEmpty(&Bcb->BcbSegmentListHead))
{
current_entry = RemoveTailList(&Bcb->BcbSegmentListHead);
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT, BcbSegmentListEntry);
RemoveEntryList(&current->CacheSegmentListEntry);
RemoveEntryList(&current->CacheSegmentLRUListEntry);
if (current->Dirty)
{
RemoveEntryList(&current->DirtySegmentListEntry);
DirtyPageCount -= Bcb->CacheSegmentSize / PAGE_SIZE;
DPRINT1("Freeing dirty segment\n");
}
InsertHeadList(&FreeList, &current->BcbSegmentListEntry);
}
#if DBG
Bcb->Trace = FALSE;
#endif
KeReleaseSpinLock(&Bcb->BcbLock, oldIrql);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
ObDereferenceObject (Bcb->FileObject);
while (!IsListEmpty(&FreeList))
{
current_entry = RemoveTailList(&FreeList);
current = CONTAINING_RECORD(current_entry, CACHE_SEGMENT, BcbSegmentListEntry);
CcRosInternalFreeCacheSegment(current);
}
ExFreeToNPagedLookasideList(&BcbLookasideList, Bcb);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
}
return(STATUS_SUCCESS);
}
VOID
NTAPI
CcRosReferenceCache(PFILE_OBJECT FileObject)
{
PBCB Bcb;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
Bcb = (PBCB)FileObject->SectionObjectPointer->SharedCacheMap;
ASSERT(Bcb);
if (Bcb->RefCount == 0)
{
ASSERT(Bcb->BcbRemoveListEntry.Flink != NULL);
RemoveEntryList(&Bcb->BcbRemoveListEntry);
Bcb->BcbRemoveListEntry.Flink = NULL;
}
else
{
ASSERT(Bcb->BcbRemoveListEntry.Flink == NULL);
}
Bcb->RefCount++;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
}
VOID
NTAPI
CcRosSetRemoveOnClose(PSECTION_OBJECT_POINTERS SectionObjectPointer)
{
PBCB Bcb;
DPRINT("CcRosSetRemoveOnClose()\n");
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
Bcb = (PBCB)SectionObjectPointer->SharedCacheMap;
if (Bcb)
{
Bcb->RemoveOnClose = TRUE;
if (Bcb->RefCount == 0)
{
CcRosDeleteFileCache(Bcb->FileObject, Bcb);
}
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
}
VOID
NTAPI
CcRosDereferenceCache(PFILE_OBJECT FileObject)
{
PBCB Bcb;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
Bcb = (PBCB)FileObject->SectionObjectPointer->SharedCacheMap;
ASSERT(Bcb);
if (Bcb->RefCount > 0)
{
Bcb->RefCount--;
if (Bcb->RefCount == 0)
{
MmFreeSectionSegments(Bcb->FileObject);
CcRosDeleteFileCache(FileObject, Bcb);
}
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
}
NTSTATUS NTAPI
CcRosReleaseFileCache(PFILE_OBJECT FileObject)
2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section (NtQuerySection): Return the right result length. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ke/usertrap.c (print_user_address): Check for a NULL LDR structure in the PEB; copy the LDR pointer in safely. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ke/apc.c (KiDeliverUserApc): Deliver all present APCs; release the APC spinlock while acccessing user memory. 2002-08-08 David Welch <welch@computer2.darkstar.org> * include/internal/ps.h: Adjusted offsets into the ETHREAD structure. * include/internal/ps.h: Removed redundant members from the KTHREAD structure. * ntoskrnl/ke/kthread.c (KeInitializeThread): Removed redundant members from the KTHREAD structure. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/dbg/kdb.c (KdbEnterDebuggerException): New function to enter the debugger on an exception. * ntoskrnl/kd/kdebug.c (KdInitSystem): Initialize the local kernel debugger if enabled. * ntoskrnl/ke/catch.c (KiDispatchException): Enter the local kernel debugger on an exception. 2002-08-08 David Welch <welch@computer2.darkstar.org> * include/ntdll/ldr.h: Added definition for a DLL entrypoint. * lib/kernel32/process/create.c (KlCreateFirstThread): Put the argument to the NtProcessStartup function on the stack. * lib/kernel32/process/create.c (KlInitPeb): Read the base address of the new image from the PEB. * lib/kernel32/process/create.c (CreateProcessW): Start the first thread at the entrypoint of the new image. * lib/ntdll/ldr/startup.c (LdrInitializeThunk): If the function is called after the initial startup then just call the entrypoints for the loaded DLLs with DLL_THREAD_ATTACH. Don't call the entrypoint of the image. * lib/ntdll/rtl/process.c (RtlpCreateFirstThread): Put the argument to the NtProcessStartup function on the stack. * lib/ntdll/rtl/process.c (KlInitPeb): Read the base address of the new image from the PEB. * lib/ntdll/rtl/process.c (RtlCreateUserProcess): Start the first thread at the entrypoint of the new image. * ntoskrnl/ke/i386/bthread.S (PsBeginThreadWithContextInternal): Use the system call path to begin a usermode thread. * ntoskrnl/ke/i386/thread.c (Ke386InitThreadWithContext): Convert the supplied context into a trap frame. * ntoskrnl/ldr/init.c (LdrLoadInitialProcess): Put the PEB argument to the NtProcessStartup function on the new stack; start the first thread at the entrypoint of the image. * ntoskrnl/ps/create.c (NtCreateThread): Create an APC to call LdrInitializeThunk in the context of a new thread before its entrypoint. 2002-08-08 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Uninitialise the cache on file cleanup. * drivers/fs/vfat/fcb.c (vfatReleaseFcb): Don't uninitialise the cache on file close. * ntoskrnl/cc/copy.c: Renamed zero page global variable. * ntoskrnl/cc/view.c: Added cache delete function. svn path=/trunk/; revision=3323
2002-08-08 17:54:16 +00:00
/*
* FUNCTION: Called by the file system when a handle to a file object
* has been closed.
*/
{
PBCB Bcb;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
if (FileObject->SectionObjectPointer->SharedCacheMap != NULL)
{
Bcb = FileObject->SectionObjectPointer->SharedCacheMap;
if (FileObject->PrivateCacheMap != NULL)
{
FileObject->PrivateCacheMap = NULL;
if (Bcb->RefCount > 0)
{
Bcb->RefCount--;
if (Bcb->RefCount == 0)
{
MmFreeSectionSegments(Bcb->FileObject);
CcRosDeleteFileCache(FileObject, Bcb);
}
}
}
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section (NtQuerySection): Return the right result length. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ke/usertrap.c (print_user_address): Check for a NULL LDR structure in the PEB; copy the LDR pointer in safely. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ke/apc.c (KiDeliverUserApc): Deliver all present APCs; release the APC spinlock while acccessing user memory. 2002-08-08 David Welch <welch@computer2.darkstar.org> * include/internal/ps.h: Adjusted offsets into the ETHREAD structure. * include/internal/ps.h: Removed redundant members from the KTHREAD structure. * ntoskrnl/ke/kthread.c (KeInitializeThread): Removed redundant members from the KTHREAD structure. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/dbg/kdb.c (KdbEnterDebuggerException): New function to enter the debugger on an exception. * ntoskrnl/kd/kdebug.c (KdInitSystem): Initialize the local kernel debugger if enabled. * ntoskrnl/ke/catch.c (KiDispatchException): Enter the local kernel debugger on an exception. 2002-08-08 David Welch <welch@computer2.darkstar.org> * include/ntdll/ldr.h: Added definition for a DLL entrypoint. * lib/kernel32/process/create.c (KlCreateFirstThread): Put the argument to the NtProcessStartup function on the stack. * lib/kernel32/process/create.c (KlInitPeb): Read the base address of the new image from the PEB. * lib/kernel32/process/create.c (CreateProcessW): Start the first thread at the entrypoint of the new image. * lib/ntdll/ldr/startup.c (LdrInitializeThunk): If the function is called after the initial startup then just call the entrypoints for the loaded DLLs with DLL_THREAD_ATTACH. Don't call the entrypoint of the image. * lib/ntdll/rtl/process.c (RtlpCreateFirstThread): Put the argument to the NtProcessStartup function on the stack. * lib/ntdll/rtl/process.c (KlInitPeb): Read the base address of the new image from the PEB. * lib/ntdll/rtl/process.c (RtlCreateUserProcess): Start the first thread at the entrypoint of the new image. * ntoskrnl/ke/i386/bthread.S (PsBeginThreadWithContextInternal): Use the system call path to begin a usermode thread. * ntoskrnl/ke/i386/thread.c (Ke386InitThreadWithContext): Convert the supplied context into a trap frame. * ntoskrnl/ldr/init.c (LdrLoadInitialProcess): Put the PEB argument to the NtProcessStartup function on the new stack; start the first thread at the entrypoint of the image. * ntoskrnl/ps/create.c (NtCreateThread): Create an APC to call LdrInitializeThunk in the context of a new thread before its entrypoint. 2002-08-08 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Uninitialise the cache on file cleanup. * drivers/fs/vfat/fcb.c (vfatReleaseFcb): Don't uninitialise the cache on file close. * ntoskrnl/cc/copy.c: Renamed zero page global variable. * ntoskrnl/cc/view.c: Added cache delete function. svn path=/trunk/; revision=3323
2002-08-08 17:54:16 +00:00
return(STATUS_SUCCESS);
}
NTSTATUS
NTAPI
CcTryToInitializeFileCache(PFILE_OBJECT FileObject)
{
PBCB Bcb;
NTSTATUS Status;
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
ASSERT(FileObject->SectionObjectPointer);
Bcb = FileObject->SectionObjectPointer->SharedCacheMap;
if (Bcb == NULL)
{
Status = STATUS_UNSUCCESSFUL;
}
else
{
if (FileObject->PrivateCacheMap == NULL)
{
FileObject->PrivateCacheMap = Bcb;
Bcb->RefCount++;
}
if (Bcb->BcbRemoveListEntry.Flink != NULL)
{
RemoveEntryList(&Bcb->BcbRemoveListEntry);
Bcb->BcbRemoveListEntry.Flink = NULL;
}
Status = STATUS_SUCCESS;
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
return Status;
}
NTSTATUS NTAPI
CcRosInitializeFileCache(PFILE_OBJECT FileObject,
ULONG CacheSegmentSize,
PCACHE_MANAGER_CALLBACKS CallBacks,
PVOID LazyWriterContext)
/*
* FUNCTION: Initializes a BCB for a file object
*/
{
PBCB Bcb;
Bcb = FileObject->SectionObjectPointer->SharedCacheMap;
DPRINT("CcRosInitializeFileCache(FileObject 0x%p, Bcb 0x%p, CacheSegmentSize %d)\n",
FileObject, Bcb, CacheSegmentSize);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeAcquireGuardedMutex(&ViewLock);
if (Bcb == NULL)
{
Bcb = ExAllocateFromNPagedLookasideList(&BcbLookasideList);
if (Bcb == NULL)
{
KeReleaseGuardedMutex(&ViewLock);
return(STATUS_UNSUCCESSFUL);
}
memset(Bcb, 0, sizeof(BCB));
ObReferenceObjectByPointer(FileObject,
FILE_ALL_ACCESS,
NULL,
KernelMode);
Bcb->FileObject = FileObject;
Bcb->CacheSegmentSize = CacheSegmentSize;
Bcb->Callbacks = CallBacks;
Bcb->LazyWriteContext = LazyWriterContext;
if (FileObject->FsContext)
{
Bcb->AllocationSize =
((PFSRTL_COMMON_FCB_HEADER)FileObject->FsContext)->AllocationSize;
Bcb->FileSize =
((PFSRTL_COMMON_FCB_HEADER)FileObject->FsContext)->FileSize;
}
KeInitializeSpinLock(&Bcb->BcbLock);
InitializeListHead(&Bcb->BcbSegmentListHead);
FileObject->SectionObjectPointer->SharedCacheMap = Bcb;
}
if (FileObject->PrivateCacheMap == NULL)
{
FileObject->PrivateCacheMap = Bcb;
Bcb->RefCount++;
}
if (Bcb->BcbRemoveListEntry.Flink != NULL)
{
RemoveEntryList(&Bcb->BcbRemoveListEntry);
Bcb->BcbRemoveListEntry.Flink = NULL;
}
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeReleaseGuardedMutex(&ViewLock);
return(STATUS_SUCCESS);
}
/*
* @implemented
*/
PFILE_OBJECT NTAPI
CcGetFileObjectFromSectionPtrs(IN PSECTION_OBJECT_POINTERS SectionObjectPointers)
{
PBCB Bcb;
if (SectionObjectPointers && SectionObjectPointers->SharedCacheMap)
{
Bcb = (PBCB)SectionObjectPointers->SharedCacheMap;
ASSERT(Bcb);
return Bcb->FileObject;
}
return NULL;
}
VOID
INIT_FUNCTION
NTAPI
CcInitView(VOID)
{
#ifdef CACHE_BITMAP
PMEMORY_AREA marea;
PVOID Buffer;
PHYSICAL_ADDRESS BoundaryAddressMultiple;
#endif
DPRINT("CcInitView()\n");
#ifdef CACHE_BITMAP
BoundaryAddressMultiple.QuadPart = 0;
CiCacheSegMappingRegionHint = 0;
CiCacheSegMappingRegionBase = NULL;
MmLockAddressSpace(MmGetKernelAddressSpace());
Status = MmCreateMemoryArea(MmGetKernelAddressSpace(),
MEMORY_AREA_CACHE_SEGMENT,
&CiCacheSegMappingRegionBase,
CI_CACHESEG_MAPPING_REGION_SIZE,
PAGE_READWRITE,
&marea,
FALSE,
0,
BoundaryAddressMultiple);
MmUnlockAddressSpace(MmGetKernelAddressSpace());
if (!NT_SUCCESS(Status))
{
KeBugCheck(CACHE_MANAGER);
}
Buffer = ExAllocatePool(NonPagedPool, CI_CACHESEG_MAPPING_REGION_SIZE / (PAGE_SIZE * 8));
if (!Buffer)
{
KeBugCheck(CACHE_MANAGER);
}
RtlInitializeBitMap(&CiCacheSegMappingRegionAllocMap, Buffer, CI_CACHESEG_MAPPING_REGION_SIZE / PAGE_SIZE);
RtlClearAllBits(&CiCacheSegMappingRegionAllocMap);
KeInitializeSpinLock(&CiCacheSegMappingRegionLock);
#endif
InitializeListHead(&CacheSegmentListHead);
InitializeListHead(&DirtySegmentListHead);
InitializeListHead(&CacheSegmentLRUListHead);
InitializeListHead(&ClosedListHead);
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00
KeInitializeGuardedMutex(&ViewLock);
ExInitializeNPagedLookasideList (&iBcbLookasideList,
NULL,
NULL,
0,
sizeof(INTERNAL_BCB),
TAG_IBCB,
20);
ExInitializeNPagedLookasideList (&BcbLookasideList,
NULL,
NULL,
0,
sizeof(BCB),
TAG_BCB,
20);
ExInitializeNPagedLookasideList (&CacheSegLookasideList,
NULL,
NULL,
0,
sizeof(CACHE_SEGMENT),
TAG_CSEG,
20);
MmInitializeMemoryConsumer(MC_CACHE, CcRosTrimCache);
2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/mm/section (NtQuerySection): Return the right result length. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ke/usertrap.c (print_user_address): Check for a NULL LDR structure in the PEB; copy the LDR pointer in safely. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/ke/apc.c (KiDeliverUserApc): Deliver all present APCs; release the APC spinlock while acccessing user memory. 2002-08-08 David Welch <welch@computer2.darkstar.org> * include/internal/ps.h: Adjusted offsets into the ETHREAD structure. * include/internal/ps.h: Removed redundant members from the KTHREAD structure. * ntoskrnl/ke/kthread.c (KeInitializeThread): Removed redundant members from the KTHREAD structure. 2002-08-08 David Welch <welch@computer2.darkstar.org> * ntoskrnl/dbg/kdb.c (KdbEnterDebuggerException): New function to enter the debugger on an exception. * ntoskrnl/kd/kdebug.c (KdInitSystem): Initialize the local kernel debugger if enabled. * ntoskrnl/ke/catch.c (KiDispatchException): Enter the local kernel debugger on an exception. 2002-08-08 David Welch <welch@computer2.darkstar.org> * include/ntdll/ldr.h: Added definition for a DLL entrypoint. * lib/kernel32/process/create.c (KlCreateFirstThread): Put the argument to the NtProcessStartup function on the stack. * lib/kernel32/process/create.c (KlInitPeb): Read the base address of the new image from the PEB. * lib/kernel32/process/create.c (CreateProcessW): Start the first thread at the entrypoint of the new image. * lib/ntdll/ldr/startup.c (LdrInitializeThunk): If the function is called after the initial startup then just call the entrypoints for the loaded DLLs with DLL_THREAD_ATTACH. Don't call the entrypoint of the image. * lib/ntdll/rtl/process.c (RtlpCreateFirstThread): Put the argument to the NtProcessStartup function on the stack. * lib/ntdll/rtl/process.c (KlInitPeb): Read the base address of the new image from the PEB. * lib/ntdll/rtl/process.c (RtlCreateUserProcess): Start the first thread at the entrypoint of the new image. * ntoskrnl/ke/i386/bthread.S (PsBeginThreadWithContextInternal): Use the system call path to begin a usermode thread. * ntoskrnl/ke/i386/thread.c (Ke386InitThreadWithContext): Convert the supplied context into a trap frame. * ntoskrnl/ldr/init.c (LdrLoadInitialProcess): Put the PEB argument to the NtProcessStartup function on the new stack; start the first thread at the entrypoint of the image. * ntoskrnl/ps/create.c (NtCreateThread): Create an APC to call LdrInitializeThunk in the context of a new thread before its entrypoint. 2002-08-08 David Welch <welch@computer2.darkstar.org> * drivers/fs/vfat/cleanup.c (VfatCleanupFile): Uninitialise the cache on file cleanup. * drivers/fs/vfat/fcb.c (vfatReleaseFcb): Don't uninitialise the cache on file close. * ntoskrnl/cc/copy.c: Renamed zero page global variable. * ntoskrnl/cc/view.c: Added cache delete function. svn path=/trunk/; revision=3323
2002-08-08 17:54:16 +00:00
CcInitCacheZeroPage();
}
/* EOF */
- Okay so...listen up. First off: When you acquire a lock such as a fast mutex, you should never acquire it recursively. For example, when you handle a page fault in a section, then page fault while handling that page fault (which is perfectly okay), you shouldn't be trying to re-acquire the address space lock that you're already holding. After this fix, this scenario works and countless others. Apps like QTInfo now work and load, and PictureViewer doesn't BSOD the system anymore. I've fixed this by changing the lock to a pushlock. It not only increases speed inside the memory manager significantly (such as during page fault handling), but does allow recursive acquisition without any problems. - Now if that wasn't bad enough, here's a couple more tips. Fast Mutexes actually require APC_LEVEL to be effective. If you're going to be using a Fast Mutex and calling it with the "Unsafe" version, then don't expect anything to work. Also, using functions like "CcTryToAcquireBrokenMutex" where correct code is duplicated then hacked to work isn't a big help either. And that's not all. Fast Mutex disables kernel APCs by setting the KernelApcDisable flag on, and it's expected that the count inside the fast mutex will match the count inside the thread. In other words, LOCK ACQUISITION AND RELEASE MUST BE ORDERED. You can't acquire LOCK A and B, and then release lock A and B, because that leads to deadlocks and other issues. So of course, the Cache Manager acquired a view lock, then acquired a segment lock, then released the view lock, then released the segment lock, then re-acquired the view lock. Uh, no, that won't work. You know what else doesn't work so well? Disabling APCs about 6-9 times to acquire a single lock, and using spinlocks in the same code path as well. Just how paranoid are you about thread safety, but still manage to get it wrong? Okay, so we've got recursion, out-of-order lock acquision and release, made-up "broken" acquire functions, and using a lock that depends on APC_LEVEL at PASSIVE_LEVEL. The best part is when Cc builds an array of cache segments, and locks each of them... then during release, the list gets parsed head-first, so the first acquired locks get released first. So locks a, b, c, d get acquired, then a, b, c, d get released. Great! Sounds about right for ReactOS's Cache Manager design. I've changed the view lock to a guarded mutex -- which actually properly disables APCs and works at PASSIVE_LEVEL, and changed the segment locks to be push locks. First it'll be 10 times faster then acquiring a bazillion fast mutexes, especially since APCs have already been disabled at this point, and it also allows you to do most of the stupid things the Cache Manager does. Out-of-order release is still not going to work well, so eventually on a multi-processor machine the code will completely die -- but at least it'll work on UP for now. In the end, this makes things like the Inkscape installer and Quicktime Installer to work, and probably countless other things that generated ASSERTS in the fast mutex code. -- Alex Ionescu svn path=/trunk/; revision=30401
2007-11-12 19:00:26 +00:00