Kernel stacks that re freed, can be placed on an SLIST for quick reuse. The old code was using a member of the PFN of the last stack page as the SLIST_ENTRY. This relies on the following (non-portable) assumptions:
- A stack always has a PTE associated with it.
- This PTE has a PFN associated with it.
- The PFN has an empty field that can be re-used as an SLIST_ENTRY.
- The PFN has another field that points back to the PTE, which then can be used to get the stack base.
Specifically: On x64 the PFN field is not 16 bytes aligned, so it cannot be used as an SLIST_ENTRY. (In a "usermode kernel" the other assumptions are also invalid).
The new code does what Windows does (and which seems absolutely obvious to do): Place the SLIST_ENTRY directly on the stack.
It's hardly understandable and doesn't really makes sense.
Furthermore, it breaks compatibility with 3rd party FSD that
don't implement such FSCTL.
Obviously, Windows doesn't do this.
For user mode, when probing output buffer, if it's null, length
will also be set to 0.
This avoids user mode applications being able to trigger various
asserts in ReactOS (and thus BSOD when no debugger is plugged ;-)).
NDK: Define PLUGPLAY_CONTROL_PROPERTY_DATA.Properties and PLUGPLAY_CONTROL_DEVICE_RELATIONS_DATA.Relations values.
NTOSKRNL: Map PLUGPLAY_CONTROL_PROPERTY_DATA.Properties values to IoGetDeviceProperty properties and add (dummy) code for unsupported cases.
UMPNPMGR: Use PLUGPLAY_CONTROL_PROPERTY_DATA.Properties values in PNP_GetDeviceRegProp.
- Overhaul SepCreateToken() and SepDuplicateToken() so that they
implement the "variable information area" of the token, where
immutable lists of user & groups and privileges reside, and the
"dynamic information area" (allocated separately in paged pool),
where mutable data such as the token's default DACL is stored.
Perform the necessary adaptations in SepDeleteToken() and in
NtSetInformationToken().
- Actually dereference the token's logon session, when needed, in the
'TokenSessionReference' case in NtSetInformationToken().
- Overhaul SepFindPrimaryGroupAndDefaultOwner() so that it returns
the indices of candidate primary group and default owner within the
token's user & groups array. This allows for fixing the 'TokenOwner'
and 'TokenPrimaryGroup' cases of NtSetInformationToken(), since the
owner or primary group being set *MUST* already exist in the token's
user & groups array (as a by-product, memory corruptions that existed
before due to the broken way of setting these properties disappear too).
- Lock tokens every time operations are performed on them (NOTE: we
still use a global token lock!).
- Touch the ModifiedId LUID member of tokens everytime a write operation
(property change, etc...) is made on them.
- Fix some group attributes in the SYSTEM process token, SepCreateSystemProcessToken().
- Make the SeCreateTokenPrivilege mandatory when calling NtCreateToken().
- Update the token pool tags.
- Explicitly use the Ex*ResourceLite() versions of the locking functions
in the token locking macros.
- Use TRUE/FALSE instead of 1/0 for booleans.
- Use NULL instead of 0 for null pointers.
- Print 0x prefix for hex values in DPRINTs.
- Use new annotations for SepCreateToken() and SepDuplicateToken().
Caught while debugging, in the case the ImpersonationLevel value was
uninitialized, due to the fact it was left untouched on purpose by
PsReferenceEffectiveToken().
kmtest:NtCreateSection calls CcInitializeCacheMap with a
NULL value for SectionObjectPointers. This will cause an exception when
trying to access it, which in Windows can be handled gracefully.
However accessing it while holding ViewLock means the lock will not be
released, leading to an APC_INDEX_MISMATCH bugcheck.
This solves the problem by allocating SharedCacheMap outside the lock,
then freeing it again under lock if another thread has updated SharedCacheMap
in the mean time. This is also What Windows Does(TM).
This avoids a really nasty race condition in our cache controler where
two concurrents could try to initialize cache on the same file.
This had two nasty effects: first shared map was purely leaked and erased
by the second one. And the private cache map, allocated on the first shared
cache map couldn't be freed and was leading to Mm BSOD (free in a middle of
a block).
This was often triggered while building ReactOS on ReactOS (with multi threads).
With that patch, I cannot crash anylonger while building ReactOS.
CORE-14634
Oneliner of the day... This typo just prevented the
whole feature to work properly. Because any allocated
work item would miserably fail to be freed.
This will obviously help real world FSD relying on
StackOverflow worker from FsRtl to work better!
CORE-14611
In the lazy writer run, first post items that are queued for this.
Only then, start executing deferred writes if any.
If there were any, reschedule immediately a lazy writer run, to keep
Cc warm and to make it unqueue write faster in case of high IOs situation.
To make second lazy writer run happen faster, we keep our state active to
use short delay (1s) instead of standard idle (3s).
Recent changes seem to show that it's not
required to be exclusive on VACB to be able
to flush it.
This commit goes with f2c44aa and fixes the
last issues going with copying huge files.
There are no longer BSODs (be it in Mm or Cc).
I could, with 750MB RAM extract a 2GB file from
a 53MB archive and copy a 2,5GB file from a VBox
share to the disk. Note that writes are often
deferred, so if copy works, it's not that fast for now.
Note that it also brings some beloved behavior from
Windows: copy times are totally unreliable now when
writes are deferred. Little remaining times when
actively copying, high remaining times when deferred
writes in action. And goes between both... Sorry! ;-)
https://xkcd.com/612/
CORE-9696
CORE-11175
Same mechanism exists in Windows (even their Cc
is way different from ours...) where when Cc is
out of memory (in their case, out of VACB), we
will start scavenge old & unused VACB to free
some of the memory.
It's useful in case we're operating we big files
operations, we may run out of memory where to map
VACB for them, so start to scavenge VACB to free
some of that memory.
With this, I am able to install Qt 4.8.6 with 2,5GB of RAM,
scavenging acting when needed!
CORE-12081
CORE-14582
Adjusting refcount and enabling lazy-write for pinned
VACB makes it actually more efficient, often purging
data to disk, reducing memory stress for the system.
This is required for defering writes.
This commit unfortunately (?) reverts a previous revert.
CORE-12081
CORE-14582
CORE-14313
When no name is set in the file object, try to read the name
from the FCB. We only support FastFAT (ours) FCB for now.
This is clearly a hack, but for a kdbg command, so ;-)
It seems that on process killing, some VACB may be deleted while
still mapped. With current reference counting, they will actually
not be deleted, but leaked, and an ASSERT will be triggered.
CORE-14578
- For non-PnP devices reported to the PnP manager through the
IoReportDetectedDevice() function, store the corresponding
service/driver name and (non-)legacy information inside their
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\Root\ entries.
- Drivers flagged as "DRVO_BUILTIN_DRIVER" (basically, only those
created via a IoCreateDriver() call) have their "Service" name that
contain "\Driver\", which should be stripped before being used in
building e.g. the corresponding "DETECTEDxxx" PnP compatible IDs.
CORE-14247
- Use explicit REG_OPTION_NON_VOLATILE flag where needed in the
IopCreateDeviceKeyPath() calls.
- Save NULL-terminated REG-SZ string properties in the enumeration tree
for each device enumerated inside \Enum\Root\.
- Always use upcased key name for the "LEGACY_***" elements in \Enum\Root\.
- Add a default "ConfigFlags" value for the legacy elements.
- Simplify few parts of code.
The size is in bytes, not in pages! On x86 we got away with it, since PEB and TEB require only a single page and the 1 passed to MiInsertVadEx() was aligned up to PAGE_SIZE. On x64 this doesn't work, since the size is 2 pages.
This has have several benefits for ReactOS Cc:
- It helps reducing potential deadlocks situations in Cc
- It speeds up ReactOS by reducing locks
- It gets us a bit closer to Windows VACB
CORE-14349
The avoids race conditions where attempts to read from disk to
not fully initialized VACB were performed.
Also, added more debug prints in such situations.
CORE-14349
Doing this is not only wrong because it acquires the same spinlock twice,
it also completely breaks the TLB flushing logic in MiMapPageInHyperSpace.
If the PTE with Offset 1 is still valid when a wrap-around to 0 happens,
the TLB flush on wrap-around will not clear the entry for this previous page.
After another loop around all hyperspace pages, page 1 is re-used but its
TLB entry has not been flushed, which may result into incorrect translation.
This allows being consistent between newly created and looked up
so that VACB can always safely be released.
Should really help with reference issues.
CORE-14481
CORE-14480
CORE-14482
This fixes various bugs linked to VACB counting:
- VACB not released when it should be
- Reference count expectations not being accurate
For the record, VACB should always have at least a reference
count of 1, unless you are to free it and removed it from
any linked list.
This commit also adds a bunch of asserts that should
help triggering invalid reference counting.
It should also fix numerous ASSERT currently triggered and
may help fixing random behaviours in Cc.
CORE-14285
CORE-14401
CORE-14293
Previously, we would keep sampling the CPU frequency until two subsequent
samples differed by at most 1 MHz. This could take several seconds, and would
unnecessarily delay boot.
Instead, if sampling is too unreliable, just give up and calculate the average
frequency from 10 samples. This is no worse than picking the frequency that
just happened to be returned twice in a row.
The fact that this method of sampling fails could indicate that there's a
problem with our performance counter implementation or timer interrupt,
but that's a separate issue...
This avoids locking Cc for too long by trying to read-ahead data which
is already in cache.
We now will only schedule a read ahead if next read should bring us
to a new VACB (perhaps not in cache).
This notably fixes Inkscape setup which was slown down by read-ahead
due to continous 1 byte reads.
Thanks to Thomas for his help on this issue.
CORE-14395
The reserve IRP is an IRP which is allocated on system boot and kept during
the whole system life. Its purpose is to allow page reads in case of
low-memory situations where the system doesn't have enough memory left
to allocate an IRP to read from the page file (would be catastrophic situation).
Standard shared cache map provides space for a private cache map, do the same
and make it available for the first handle. It avoids two allocations in a row.
This halfplements CcScheduleReadAhead() which is responsible for finding the next reads
to perform given last read and previous reads. I made it very basic for now, at least
to test the whole process.
This also introduces the CcExpressWorkQueue in the lazy writer which is responsible
for dealing with read ahead items and which is dealt with before the regular queue.
In CcCopyData(), if read was fine, schedule read ahead so that it can happen in background
without the FSD to notice it! Also, update the read history so that scheduling as a
bit of data.
Implement (à la "old Cc" ;-)) CcPerformReadAhead() which is responsible for performing
the read. It's only to be called by the worker thread.
Side note on the modifications done in CcRosReleaseFileCache(). Private cache map
is tied to a handle. If it goes away, private cache map gets deleted. Read ahead
can run after the handle was closed (and thus, private cache map deleted), so
it is mandatory to always lock the master lock before accessing the structure in
read ahead or before deleting it in CcRosReleaseFileCache(). Otherwise, you'll
just break everything. You've been warned!
This commit also partly reverts f8b5d27.
CORE-14312
before the call to CcRosReleaseFileCache() which expects to have it to properly clean the file.
So, move deletion code to CcRosReleaseFileCache() so that he's the only one to handle private map.
Should hopefully fix all the recent buildbots issues (and the universe perhaps, who knows?)
This reverts BCB being lazy written when marked dirty.
We'll go back to this behavior when this part will have been reworked and stabilized.
CORE-14263
CORE-14279
CORE-14285
We get rid of the old iLazyWriterNotify event in favor of work items
that contain an event that lazy writer will set once its done.
To implement this, we rely on the newly introduced CcPostTickWorkQueue work queue
that will contain work items that are to be queued once lazy writer is done.
Move the CcWaitForCurrentLazyWriterActivity() implementation to the
lazy writer file, and reimplemented it using the new support mechanisms
Instead move to a threading model like the Windows one.
We'll queue several work items to be executed in a system thread (Cc worker)
when there are VACB that have been marked as dirty. Furthermore, some delay
will be observed before action to avoid killing the system with IOs.
This new threading model opens way for read ahead and write behind implementation.
Also, moved the initialization of the lazy writer to CcInitializeCacheManager()
it has nothing to do with views and shouldn't be initialized there.
Also, moved the lazy writer implementation to its own file.
Modified CcDeferWrite() and CcRosMarkDirtyVacb() to take into account the new threading model.
Introduced new functions:
- CcPostWorkQueue(): post an item to be handled by Cc worker and spawn a worker if required
- CcScanDpc(): called after some time (not to have lazy writer always running) to queue a lazy scan
- CcLazyWriteScan(): the lazy writer we used to have
- CcScheduleLazyWriteScan(): function to call when you want to start a lazy writer run. It will make a DPC after some time and queue execution
- CcWorkerThread(): the worker thread that will handle lazy write, read ahead, and so on
This means that MmSystemCacheStart, MmSystemCacheEnd, MmSizeOfSystemCacheInPages
have now a valid value.
System cache is not used atm the moment though. MmMapViewInSystemCache() is to
be implemented, and Cc is to be made aware of this.
CORE-14259
- Change MM_SYSTEM_SPACE_START to 0xFFFFF88000000000
- Move MI_DEBUG_MAPPING to the end of the system PTE range
- Add MI_SYSTEM_CACHE_START and MI_SYSTEM_CACHE_END, which is in the range that Vista uses as dynamic VA space for cache and other allocations
- Wrap x86 specific code that makes now invalid assumptions about the address space layout in #ifdef _M_IX86
- CcUnpinDataForThread() only release VACB when the last BCB reference is gone. This avoids having a valid BCB with an invalid VACB
- CcRosMarkDirtyVacb() will only accept non-dirty VACB now. This avoids a major bug where a an already dirty VACB was over-dereferenced
- Thanks to previous point, simplify CcRosUnmapVacb(), CcRosReleaseVacb() implementation
- And only set VACB dirty once in CcSetDirtyPinnedData()
- Add a few sanity checks
With that I can again install ReactOS with 128MB RAM :-).
CORE-14263
CORE-14268
Namely, implement CcSetDirtyPageThreshold() and add support for it
in CcCanIWrite().
Also added my name in the headers of the few files I touched tonight.
CORE-14235
Namely, implement CcCanIWrite() (very basic, and likely wrong).
And implement CcDeferWrite() which will queue the write operation.
In CciLazyWriter() (which may be renamed CcWorkerThread() ;-)),
handle the queued write operations one by one. This is likely
not to be accurate, but, given we have only on FS supporting
this for now (NFS / RDBSS / Shares), this is OK.
CORE-14235
Experiment and MSDN tend to show that a dirty BCB is queued for lazy write.
This will do the job here!
Also, renamed CcRosMarkDirtyFile() which is more accurate, and added a new
function CcRosMarkDirtyVacb() which just takes a VACB as arg (expected locked)
and marks it dirty (using previous implementation). Make CcRosMarkDirtyFile()
use it.
CORE-14235
CcDirtyPageThreshold is not here to compute when you have to write,
but to know where you have to deny writes.
More commits to come in that direction!
CORE-14235
This removes the "modified page writer" thread in Mm that was regularly blindly
attempting to flush dirty pages to the disk.
Instead, this commit introduces a lazy writer that will monitor dirty pages count
and will flush them to disk when this count is above a threshold. The threshold is
computed on Cc init.
Compared to what was done previously, this lazy writer will only write down files
that are not marked as temporary.
The mechanisms involved in this lazy writer worker are well described in Windows
Internals 4th editions (constants are coming from it ;-)).
Also fixed a bad (and old!) bug in CcRosFlushDirtyPages() where target count could
be overflow and the function would spin forever while holding the VACBs lock. This is
mandatory as now lazy writer will call it with "random" values.
This also allows implementing CcWaitForCurrentLazyWriterActivity() :-).
Also renamed DirtyPageCount to its MS equivalent.
CORE-14235
This would affect reads/writes on large volumes where offset is higher than what a ULONG can hold.
This really nasty bug was hitting CcMapData() but also CcPinRead() (due to the nature of its implementation)
and both were returning garbage data under certain circumstances with Ext2Fsd.
This should (I hope!) help some other FSDs to work better in ROS.
CORE-12456
Note: before we had a BOOLEAN parameter called StoreInstruction, but in reality it was not specifying whether the fault was from a store store instruction, but whether it was an access violation rather than a page-not-present fault. On x86 without PAE there are only 2 kinds of access violations: (1) Access of a kernel mode page from user mode, which is handled early and (2) access of a read-only (or COW) page with a writing instruction. Therefore we could get away with this, even though it relied on the wrong assumption that a fault, which was not a page-not-present-fault, was automatically a write access. This commit only changes one thing: we pass the full fault-code to MmAccessFault and handle the rest from there in exactly the same way as before. More changes are coming to make things clear.
Based on an original patch by Timo Kreuzer, with modifications by me to adapt it to latest HEAD and use a single exit path through the Cleanup label. This reliably frees all allocated handles.
The original code returns STATUS_SUCCESS for many cases. This has been preserved.
In the future, it should be checked though whether returning success is appropriate for all these cases.
CORE-6844
[REACTOS] Misc 64 bit fixes
* [NTOS:MM] Allow MEM_DOS_LIM in NtMapViewOfSection on x64 as well
* [NTOS:MM] Implement x64 version of MmIsDisabledPage
* [HAL] Remove obsolete code
* [NTOS:KE] Fix amd64 version of KeContextToTrapFrame and KeTrapFrameToContext
* [XDK] Fix CONTEXT_XSTATE definition
* [PCNET] Convert physical address types from pointers to PHYSICAL_ADDRESS