Like any story or issue, there are two sides touching the subject. Each pulls the strings toward its own side; both think they are right, and in many cases they are. In Computer Science specifically, it is often possible to find a solution that satisfies both sides of the picture without harming either, while in the real, material world some things cannot be resolved in a similar way.
Microsoft is a huge company that supplies Desktop, Server, Mobile and other IT platform solutions. Their services are global, and besides developing the software they provide security and enhancement updates to their clients. Microsoft is committed to giving their clients fast and reliable update delivery.
Microsoft has spread updates over the Internet for a very long time, but since their software is complex they are sometimes required to distribute huge (in size) and urgent updates. Because the world of IT is as wide as the sea, Microsoft uses CDN suppliers these days to spread their software and updates to their clients. But that is not enough: in local networks, the download of updates from a CDN over the Internet can be reduced by keeping the traffic inside the local network, using either a local centralized updates service or some local peer-to-peer file transfer.
In many networks with more than 10 Desktops there is a local server which can be used to spread Microsoft Windows updates. In many cases it is still the right solution, and it is a weird situation that many admins prefer to try other solutions rather than what they were given to use.
From a network admin's perspective, when Microsoft spreads a huge update to their clients it can cause a torrent (not the P2P one) of high network utilization. Since the network lines need to stay available for more urgent, mandatory things, admins need to somehow police the traffic so that Windows Updates traffic will not harm other clients.
Small office networks with 4 Desktops and 1 Server are in most cases not in real trouble, unless some Web or Cloud service is in use that huge Microsoft Updates might cause issues with.
For small and medium-sized offices with more than 20 Desktops, repeated Microsoft Updates downloads can be prevented or managed so they do not cause issues for regular network traffic, either by the local System Administrator or using network-level QoS. But when there is no System Administrator, or one job is combined across a couple of areas, it can be easier to implement a simpler updates solution at the "network" level rather than at the system level.
From an ISP Network Administrator's point of view, everything is bits. He needs to make sure that the "important" bits get from the client side of the network to its edge. Microsoft Updates can cause a headache if they are repeatedly downloaded over and over again by each and every Microsoft client. The fast solutions are either to slow the updates down or to host a local instance of one of Microsoft's CDN partners. While for medium-sized and larger ISPs it would be simple to host a CDN instance or server, it is not the same for small and big networks. Most big networks either do not care about these updates, since their lines are built to handle lots of traffic, or they already have one of Microsoft's CDN instances hosted in their racks.
For the small and medium-sized ISPs the situation is a bit different, since they can get "stuck" with what can be described as a "Network Clients DOS". The solutions for them are mainly local caching or targeted traffic throttling. Neither of these solutions is very simple; both require knowledge of networking and of the upper layers of the connection, if only for the purpose of debugging issues and deciding on the right approach.
In the case of satellite or long-distance wireless connections, the admin usually has limited resources to spare, and Microsoft Updates might not be as important as some GPS-based navigation software. But since updates are mostly downloaded "automatically" these days, they become an obstacle, and in these cases Microsoft Updates are blocked at several layers, from IP up to the application.
The issue is not related directly to Microsoft and their updates, but they do deserve respect, since they will probably continue to be present in space. Since we are moving continuously toward the future, we can assume that updates will remain important. In the case of distant space ships, it is feasible to assume that if multiple ships are out there, a centralized distribution point (like WSUS) will probably be used to distribute static content. It is also possible that the maintenance of such a system will not be under a single layer of administration, due to the complexity of the task. For these cases a "cache" or a "store" is the choice for distributing identical content to multiple space ships or space stations.
When huge Microsoft Updates are spread around the globe to their clients, Network and System Administrators repeatedly report higher consumption of bandwidth and other system resources.
Since Microsoft products are "generic", their updates are created for many clients and are not customized for a specific one. They are static objects\content, downloaded over and over again, and because of this it is possible to prevent repeated downloads of the same content.
A couple of sides are affected by the same issue. Microsoft off-loads or out-sources the distribution of the static content to other parties, thereby fulfilling their responsibility to spread the updates and allowing their clients faster downloads, while other parties at the network level are left to handle a weird situation. Some of these just want to "survive" a huge update, while others want to earn a couple more bucks on their monthly revenue. Some are more greedy, while others are in need.
Microsoft does not offer ISPs or network admins a caching solution for their updates, since this is not their domain. Microsoft leaves the network admins the option to implement any solution they want, but it requires an expert who knows a thing or two about networking, HTTP and other areas, which are the road and the door into the palace.
In order to implement a caching solution for Microsoft Updates, you first need to know the structure of Microsoft's update protocols and systems. They are not simple at all, and there are a couple of issues inside this box.
Things to consider when handling the issue:
We can approach the target from a couple of angles, but first we need to know: what's out there?
Also note that this is not a junior System Administrator level task, so talk with others about the subject before diving in to implement any solution.
Microsoft offers a solution for managing Windows Updates in Domain environments using a local service: WSUS.
It cannot be used in an ISP or IX environment.
For small networks, Squid can be used to intercept Windows Updates DATA-channel traffic over HTTP and to serve the static content locally. Details can be found at wiki.squid-cache.org .
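The Squid wiki documents a recipe along these lines; a minimal squid.conf sketch (the values here are illustrative, not tuned, and the file-extension list is an assumption about what the update domains serve):

```
# Allow very large update objects to be stored.
maximum_object_size 6 GB

# Treat update payloads from the Microsoft domains as long-lived static
# content, revalidating with If-Modified-Since instead of re-downloading.
refresh_pattern -i windowsupdate\.com/.*\.(cab|exe|msu|esd|psf)(\?.*)?$ 43200 100% 129600 refresh-ims
refresh_pattern -i microsoft\.com/.*\.(cab|exe|msu|esd|psf)(\?.*)?$ 43200 100% 129600 refresh-ims
```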
I have not implemented a caching solution for Windows Updates using Nginx or Varnish, but it seems to me that it is possible; the proof is that we could cache YouTube static videos using them.
In Varnish VCL you can define the cache key of each and every object, which can help when handling multiple domains that serve the same content.
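As a hypothetical sketch of that idea (the hostname pattern is an assumption; adapt it to the domains you actually see), the many CDN hostnames serving identical update files can be normalized to a single cache key, so one stored copy serves them all:

```
# Varnish 4+ VCL sketch: hash on a fixed label plus the URL instead of the
# real Host header, so objects served from several update hostnames share
# one cache entry.
sub vcl_hash {
    if (req.http.host ~ "(?i)windowsupdate\.com$") {
        hash_data("windowsupdate");
        hash_data(req.url);
        return (lookup);
    }
}
```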
Whenever you approach caching, you will need to consider a couple of angles to the subject. Any solution has pros and cons, whether it is open source, free or paid.
Squid is a great piece of software that can help cache static content and control web access. But like any software project it has limited resources, and due to this and the complexity of the software, you will find that caching Windows Updates might not be as simple as it seems.
Like any cache solution, Squid uses the "pulling" clients (those downloading) as the force that populates the cache content. Depending on the design of the cache, other options might be better; if a client never tries to download, the cache will never be populated. In some caches it is possible to predict what should be pre-fetched into the cache, and in these cases the clients might not be the right cache-population force.
Squid handles Range requests in a simple way: if the full object was not fully downloaded (i.e. the cache was not seeded with the object), the request will be served from the remote origin service, and the client will not "drive" the cache to fully download the content object.
While it is possible to force a full fetch based on a range request, Squid currently applies that to the whole service, and not only to a specific domain, URL or response.
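A minimal sketch of that forced full fetch in squid.conf (note again that these directives act broadly, not per-domain):

```
# Fetch the whole object even when a client requests only a byte range,
# so partial downloads still seed the cache. -1 means "no limit".
range_offset_limit -1

# Keep downloading even if the client aborts, so the object lands in cache.
quick_abort_min -1 KB
```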
A cache can be populated the way WSUS does it: it fetches the required objects and then spreads them to the local stations. But since a cache service is not a legal Microsoft client, it cannot access any kind of "list" of update files\objects that could be pre-fetched. This leaves the cache no option to "predict" and pre-fetch the content; it must wait for clients to first initiate contact with the origin service, and only after that can it populate the cache.
Since objects from Microsoft Updates might contain private content, such as registered bundled software, the cache needs to differentiate between private and public objects.
If the cache does not honor object privacy, a client might eventually fetch an object that was licensed to another organization, which is a serious violation and an exploitation of trust and credibility.
We can try to catch a bunch of domains using Varnish, but since it was not designed to operate this way, there are lots of "breakable" parts in such a setup.
Nginx was designed to be a web server and a reverse proxy, and while it can be used fairly easily to mirror Windows Updates in a way similar to what is mentioned in this article, it is not recommended without lots of experience and a deep understanding of the subject.
Since Microsoft has not yet published any specification of their updates infrastructure and software, any solution will always be considered a "hack". Due to this, my recommendation is to have a "terminate" switch available at any point, to avoid causing a DOS from your side.
There are many security threats around caching, so it is recommended to pull in a security expert when implementing any solution. If you do a bad job, your clients might get infected by malicious software, and as the cache admin you could end up paying repair and data-loss fees for your clients.
There are a couple of ways to "overflow" a cache solution, so you will need to take good care in the design of the cache system so it does not crash due to low resources or client content demand.
Keep your cache software safe from both local and remote clients, since they can hit your box pretty hard if you leave it vulnerable and open to the wide world.
CDN networks are built to handle high traffic loads, while a 7.2K RPM HDD may not be able to compete with their SSDs, and the outcome would be slow download speeds for updates. So design your cache to be efficient, at least compared to the origin service.
There are a couple of admins who want their caching service to slow down the network demand. This is considered "harmful" for most networks, but in some cases it is the right solution, when weighing the benefit of critical systems against an update.
Since Microsoft Updates use both an encrypted, secure communication channel and a plain web (HTTP) transfer channel, we can safely cache what is in the plain public channel without touching any of the private data.
The solution will be composed of three parts:
We are aiming to statically store public content, which can be kept on an SSD drive to allow fast download speeds.
In Squid we will use a "store" cache_peer for all the relevant domains, including ".download.windowsupdate.com". The store cache_peer proxy will "log" every request in a way that allows it to be re-used later.
Every request will eventually go out with the outgoing IP of the special "store" cache_peer proxy, so take that into consideration.
Every once in a while you will need to run the "fetcher" software, which uses the logged requests to populate the store objects.
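The actual fetcher is in the GIST linked below; as a hypothetical illustration of the idea only (the store path, log format and helper names here are my assumptions, not the published script), it boils down to replaying logged URLs into a local store:

```shell
#!/bin/sh
# Hypothetical fetcher sketch: read logged request URLs (one per line on
# stdin) and mirror each object into a local store, skipping existing files.
STOREDIR="${STOREDIR:-/var/storedata}"

url_to_path() {
    # Map http://host/some/path/file.cab -> $STOREDIR/host/some/path/file.cab
    echo "$1" | sed -e 's|^[a-z]*://|'"$STOREDIR"'/|'
}

fetch_logged() {
    while read -r url; do
        dest="$(url_to_path "$url")"
        [ -f "$dest" ] && continue          # already stored, skip
        mkdir -p "$(dirname "$dest")"
        wget -q -O "$dest" "$url" || rm -f "$dest"   # drop partial files
    done
}
```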
This solution was designed for small networks, but it can be used with multiple store servers and big storage. Test first to understand the situation in your network.
If you are a large ISP (thousands of clients), this solution was probably not designed for you, but if you have enough on your shoulders you can test it and see how it might fit into your organization.
The service and the fetcher script are at: GIST
The binaries packs contain static binaries for every OS I could build for.
You need to install them at /usr/bin/ or at /usr/local/x/bin ; choose the installation location of the binaries and change the location and file names inside the scripts accordingly.
Add the following to your squid.conf:
acl wu dstdom_regex download\.windowsupdate\.com$
acl wu dstdom_regex download\.microsoft\.com$
acl wu-rejects dstdom_regex stats
acl GET method GET
cache_peer 127.0.0.1 parent 8080 0 proxy-only no-tproxy no-digest no-query no-netdb-exchange name=ms1
cache_peer_access ms1 allow GET wu !wu-rejects
cache_peer_access ms1 deny all
never_direct allow GET wu !wu-rejects
never_direct deny all
If you have a specific volume where you want to store the static updates, change the location "/var/storedata" both in the fetcher script and in the systemd service file.
You can run the fetcher script as a cron job according to your preference, but it is recommended to run it at an interval of at least 6 hours.
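For example, a crontab entry along these lines would run it every 6 hours (the install path and log file are assumptions; adjust them to where you actually placed the script):

```
# /etc/crontab sketch: run the fetcher every 6 hours as root.
0 */6 * * * root /usr/local/bin/fetcher.sh >> /var/log/wu-fetcher.log 2>&1
```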
Feel free to contact me directly via email or other options.