+ Improve related comments.
Registry hives are opened in shared read access when NT is loaded in PE
mode (MININT) or from network (the hives residing on a network share).
This is true in particular for the main system hives (SYSTEM, SOFTWARE,
DEFAULT, ...).
However, in PE mode, we can allow other hives, e.g. those loaded by the
user (with NtLoadKey) to be loaded with full read/write access, since we
boot from a local computer.
In particular remove some extra-parentheses around single code tokens,
and replace few "DPRINT1 + while (TRUE);" by UNIMPLEMENTED_DBGBREAK.
+ Improve some comments.
Sometimes repairing a broken hive with a hive log does not always guarantee the hive
in question has fully recovered. In worst cases it could happen the LOG itself is even
corrupt too and that would certainly lead to a total unbootable system. This is most likely
if the victim hive is the SYSTEM hive.
This can be anyhow solved by the help of a mirror hive, or also called an "alternate hive".
Alternate hives serve the purpose as backup hives for primary hives of which there is still
a risk that is not worth taking. For now only the SYSTEM hive is granted the right to have
a backup alternate hive.
=== NOTE ===
Currently the SYSTEM hive can only base upon the alternate SYSTEM.ALT hive, which means the
corresponding LOG file never gets updated. When time comes the existing code must be adapted
to allow the possibility to use .ALT and .LOG hives simultaneously.
=== DOCUMENTATION REMARKS ===
This implements (also enables some parts of code been decayed for years) the transacted writing of the registry. Transacted writing (or writing into registry in a transactional way) is an operation that ensures the successfulness can be achieved by monitoring two main points.
In CMLIB, such points are what we internally call them the primary and secondary sequences. A sequence is a numeric field that is incremented each time a writing operation (namely done with the FileWrite function and such) has successfully completed.
The primary sequence is incremented to suggest that the initial work of syncing the registry is in progress. During this phase, the base block header is written into the primary hive file and registry data is being written to said file in form of blocks. Afterwards the seconady sequence
is increment to report completion of the transactional writing of the registry. This operation occurs in HvpWriteHive function (invoked by HvSyncHive for syncing). If the transactional writing fails or if the lazy flushing of the registry fails, LOG files come into play.
Like HvpWriteHive, LOGs are updated by the HvpWriteLog which writes dirty data (base block header included) to the LOG themselves. These files serve for recovery and emergency purposes in case the primary machine hive has been damaged due to previous forced interruption of writing stuff into
the registry hive. With specific recovery algorithms, the data that's been gathered from a LOG will be applied to the primary hive, salvaging it. But if a LOG file is corrupt as well, then the system will perform resuscitation techniques by reconstructing the base block header to reasonable values,
reset the registry signature and whatnot.
This work is an inspiration from PR #3932 by mrmks04 (aka Max Korostil). I have continued his work by doing some more tweaks and whatnot. In addition to that, the whole transaction writing code is documented.
=== IMPORTANT NOTES ===
HvpWriteLog -- Currently this function lacks the ability to grow the log file size since we pretty much lack the necessary code that deals with hive shrinking and log shrinking/growing as well. This part is not super critical for us so this shall be left as a TODO for future.
HvLoadHive -- Currently there's a hack that prevents us from refactoring this function in a proper way. That is, we should not be reading the whole and prepare the hive storage using HvpInitializeMemoryHive which is strictly used for HINIT_MEMORY but rather we must read the hive file block by block
and deconstruct the read buffer from the file so that we can get the bins that we read from the file. With the hive bins we got the hive storage will be prepared based on such bins. If one of the bins is corrupt, self healing is applied in such scenario.
For this matter, if in any case the hive we'll be reading is corrupt we could potentially read corrupt data and lead the system into failure. So we have to perform header and data recovery as well before reading the whole hive.
because George is having an open Draft PR since July 2022,
which might also touch this file on master in some years.
And it ofc is easier for me to revert my work now, then for him to
go through the great lengths of merging his work then.