Content Distribution Magic

Content Distribution Follow-up

This post is inspired by a question submitted on my previous post: Content Distribution: The Myth. The question was about how ConfigMgr determines whether files are different when deciding whether to replicate them to distribution points. Specifically, the requestor wanted to know whether a simple name change to a file would cause ConfigMgr to see the file as a new file and thus re-replicate it to the distribution points.

Hashing

ConfigMgr uses a hashing function to determine if the files are different. Hashing algorithms are one-way functions that map an input of any size to a fixed-size value. The likelihood of two different inputs producing the same value is astronomically small (although still possible), and the better the algorithm, the lower the chance of such a collision. This makes hashes a great way to compare two files without having to actually compare the contents of those files byte by byte.
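To make the idea concrete, here is a minimal Python sketch of the properties described above. It is only an illustration using Python's standard `hashlib` module; the helper name `sha256_hex` and the sample byte strings are made up for this example and have nothing to do with ConfigMgr's internals.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical sample inputs, not real package content.
original = b"example package payload"
identical = b"example package payload"
modified = b"example package payload!"

print(sha256_hex(original) == sha256_hex(identical))  # True: same bytes, same digest
print(sha256_hex(original) == sha256_hex(modified))   # False: one extra byte changes the digest
print(len(sha256_hex(original)))                      # 64: hex digest length is fixed
```

Comparing two 64-character strings is far cheaper than comparing two large files byte by byte, which is the whole appeal.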

This raises two questions:

1. What hashing function does ConfigMgr use?

The answer to this is documented on TechNet on the page titled Technical Reference for Cryptographic Controls Used in Configuration Manager: SHA-256. The actual answer here isn’t really that important though, as long as we know that a “good” hashing function is being used.

2. What input does ConfigMgr use for the hashing function? Or more pertinent to the question at hand: is the file name used as an input to the function or as a criterion in general to determine if the files are the same or different?

To answer this, I started by getting the actual SHA-256 hash value for the files in the second test package from my first post (using PowerShell of course). I then renamed a couple of the files and retrieved their hash values again.

Renaming two files in the test package

Getting the SHA-256 hash for the package contents

What this shows is that when the file name is not used as input, which it isn’t for the PowerShell Get-FileHash cmdlet, the hash value remains unaffected by a rename because the file contents haven’t changed at all. The question still remains, though: does ConfigMgr use the file name or not?
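The same experiment can be reproduced outside PowerShell. The following Python sketch is an assumption-laden stand-in for the test above, not the commands I actually ran: it writes a temporary file, hashes its contents, renames it, and hashes it again. The function names and file names are hypothetical.

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=65536):
    """Hash a file's contents in chunks; the file name is never fed to the hash."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_before_and_after_rename(content):
    """Write content to a temp file, then return its hash before and after a rename."""
    with tempfile.TemporaryDirectory() as folder:
        old_path = os.path.join(folder, "setup.dat")          # hypothetical name
        new_path = os.path.join(folder, "setup-renamed.dat")  # hypothetical name
        with open(old_path, "wb") as handle:
            handle.write(content)
        before = file_sha256(old_path)
        os.rename(old_path, new_path)
        after = file_sha256(new_path)
    return before, after


before, after = hash_before_and_after_rename(b"hypothetical file contents")
print(before == after)  # True: the rename did not touch the bytes being hashed
```

Because only the bytes read from the file go into the digest, any tool that hashes contents this way will treat a renamed file as unchanged.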

For that, a simple update of the distribution point for the test package and an examination of the log files should reveal the answer.

Output of pkgxfermgr.log

The Result

ConfigMgr did, in fact, see the files as being the same and did not re-replicate them. To dig a little further, I opened the Content Library Explorer – a nice new addition to the ConfigMgr toolkit for R2.

The test package in Content Library Explorer

Notice that the files which had their name changed are still listed as Shared With the first test package (ONE0004C) from the previous post even though they have different names. Next, also notice how the actual file names listed in the path column are the file hashes we retrieved earlier with PowerShell.

Using the power of hashing and SHA-256, ConfigMgr was able to do the “right” thing and not only not re-replicate the files, but also maintain the integrity of our single instance store even though they were renamed.
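For readers who want a mental model of why this works, here is a toy content-addressed store in Python. This is strictly a simplified sketch under my own assumptions: the class `SingleInstanceStore`, its methods, and the package IDs are invented for illustration, and it does not reflect the actual on-disk layout or logic of the ConfigMgr content library.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: one blob per unique SHA-256 digest."""

    def __init__(self):
        self.blobs = {}     # digest -> file bytes, stored only once
        self.packages = {}  # package ID -> {file name: digest}

    def add_file(self, package_id, file_name, data):
        """Register a file; return True only if its bytes had to be stored."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.packages.setdefault(package_id, {})[file_name] = digest
        return is_new

    def shared_with(self, package_id, file_name):
        """List other packages that reference the same content."""
        digest = self.packages[package_id][file_name]
        return sorted(
            other for other, files in self.packages.items()
            if other != package_id and digest in files.values()
        )


store = SingleInstanceStore()
payload = b"hypothetical installer bytes"
print(store.add_file("PKG00001", "installer.bin", payload))          # True: new content
print(store.add_file("PKG00002", "installer-renamed.bin", payload))  # False: already stored
print(store.shared_with("PKG00002", "installer-renamed.bin"))        # ['PKG00001']
print(len(store.blobs))                                              # 1
```

Keying the stored data by digest rather than by name is what lets a renamed file be recognized as content that is already present.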

sccmcontentlibrary

To learn more about the ConfigMgr content library, commonly referred to as sccmcontentlib, check out these two great posts:
ConfigMgr 2012 Content Library Overview
An adventure in the sccmcontentlib – single instance store

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
