Content Distribution Follow-up
This post is inspired by a question submitted on my previous post, Content Distribution: The Myth. The question was about how ConfigMgr determines whether files are different when deciding whether to replicate them to distribution points. Specifically, the requestor wanted to know whether a simple name change to a file would cause ConfigMgr to see the file as a new file and thus re-replicate it to the distribution points.
ConfigMgr uses a hashing function to determine if the files are different. Hashing algorithms are one-way algorithms that produce a fixed-size value for any given input. The likelihood of two different inputs producing the same value is astronomically small (although still possible). The better the algorithm used, the less likely such a collision. Thus, hashes are a great way to compare two files without having to compare the contents of those files byte by byte.
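To make the idea concrete, here is a minimal sketch in Python of comparing two files by digest rather than byte by byte. The function names and the chunk size are my own choices for illustration; this is not how ConfigMgr implements the comparison internally, only the general technique.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Treat two files as identical when their SHA-256 digests match."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Only the file's bytes are fed to the hash here, so two files with different names but identical contents compare as equal.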
Knowing this results in two questions:
1. What hashing function does ConfigMgr use?
The answer to this is documented on TechNet on the page titled Technical Reference for Cryptographic Controls Used in Configuration Manager: SHA-256. The actual answer here isn’t really that important though, as long as we know that a “good” hashing function is being used.
2. What input does ConfigMgr use for the hashing function? Or more pertinent to the question at hand: is the file name used as an input to the function or as a criterion in general to determine if the files are the same or different?
To answer this, I started by getting the actual SHA-256 hash value for the files in the second test package from my first post (using PowerShell of course). I then renamed a couple of the files and retrieved their hash values again.
What this shows is that when the file name is not used as input, which it isn’t for the PowerShell Get-FileHash cmdlet, the hash value remains unaffected because the file itself hasn’t changed at all. The question still remains, though: does ConfigMgr use the file name or not?
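The same rename experiment can be approximated outside PowerShell. The following is a rough sketch in Python, not the commands I ran: the file names and payload are hypothetical, and the helper functions are mine. It hashes a file's bytes with SHA-256, renames the file, and hashes it again.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path):
    """SHA-256 hex digest of the file's bytes; the name is never hashed."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def hash_before_and_after_rename(content, old_name, new_name):
    """Write content to old_name, rename it to new_name, return both hashes."""
    with tempfile.TemporaryDirectory() as folder:
        original = Path(folder) / old_name
        original.write_bytes(content)
        before = file_sha256(original)

        renamed = Path(folder) / new_name
        original.rename(renamed)
        after = file_sha256(renamed)
    return before, after


# Hypothetical file names and payload, used only for the demonstration.
before, after = hash_before_and_after_rename(
    b"example payload", "setup.exe", "setup-renamed.exe"
)
print(before == after)  # True: the rename leaves the digest untouched
```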
For that, a simple update of the distribution point for the test package and an examination of the log files should reveal the answer.
ConfigMgr did, in fact, see the files as being the same and did not re-replicate them. To dig a little further, I opened the Content Library Explorer – a nice new addition to the ConfigMgr toolkit for R2.
Notice that the files which had their name changed are still listed as Shared With the first test package (ONE0004C) from the previous post even though they have different names. Next, also notice how the actual file names listed in the path column are the file hashes we retrieved earlier with PowerShell.
Using the power of hashing and SHA-256, ConfigMgr was able to do the “right” thing: it not only skipped re-replicating the files, but also maintained the integrity of our single instance store even though the files were renamed.
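For readers who want to see why keying content by hash gives this behavior, here is a toy, in-memory sketch in Python of a single instance store. It is purely illustrative: the class, its methods, and the package IDs (PKG-A, PKG-B) are hypothetical, and the real content library's on-disk layout and bookkeeping are more involved.

```python
import hashlib


class ToyContentStore:
    """In-memory stand-in for a single instance store keyed by SHA-256."""

    def __init__(self):
        self.blobs = {}     # digest -> file bytes, stored once
        self.packages = {}  # package ID -> {file name: digest}

    def add_file(self, package_id, file_name, data):
        """Register a file; return True only if its bytes had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.packages.setdefault(package_id, {})[file_name] = digest
        return is_new

    def shared_with(self, digest):
        """List the packages that reference the blob with this digest."""
        return sorted(
            package_id
            for package_id, files in self.packages.items()
            if digest in files.values()
        )


store = ToyContentStore()
first = store.add_file("PKG-A", "installer.msi", b"same bytes")
second = store.add_file("PKG-B", "installer-renamed.msi", b"same bytes")
print(first, second, len(store.blobs))  # True False 1
```

Because the lookup key is the digest of the bytes, the renamed file in the second package maps to the blob that is already stored, so nothing new is copied and both packages share one instance.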
– ConfigMgr 2012 Content Library Overview
– An adventure in the sccmcontentlib – single instance store
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.