Microsoft Announces Plans to Deprecate NTLM from Windows
Microsoft and Apple tend to develop their desktop operating systems under a fundamentally different paradigm: Apple doesn’t bother to support legacy code or features after just a few years, while Microsoft seems content to let you use the same mechanisms in Windows for decades. It’s one reason why Windows, as a client OS, is so much easier to support at a corporate scale than Mac OS: you don’t have to go back and re-package all your configurations and apps every 3 years. But while this builds convenience from an administrative perspective, it also means supporting legacy security configurations long past their expiration dates.
Remember LM (LanManager) authentication? In the later days of NT 4.0, even before Active Directory, Microsoft was furiously trying to get organizations to move away from LM and even NTLMv1, as those protocols were obsolete and not secure. At that time, there was a hard push to enable NTLMv2, and a number of early domain policy configurations to support that migration.
Active Directory was introduced and designed around Kerberos authentication, but from my position on the ground the messaging was still stronger around eliminating LM than enabling Kerberos.
At a core level, Kerberos uses line-of-sight access to a key distribution center (KDC), usually a Domain Controller, to mutually authenticate the client and the remote service. Kerberos uses tickets with user ID, requested service information, and validation lifetime to create a session key for access to the remote service. No credential pairs are transmitted between the user and the app or file-share, so the actual mechanism for authenticating the user is flexible, as is the encryption mechanism.
NTLM uses a challenge/response mechanism that always includes a credential pair: username & password. That password is hashed, but it is always present and cannot be replaced with modern authentication methods or SSO. In NTLMv1, the password is not even salted. NTLM remains present in most modern environments because it is the default fallback for when Kerberos does not work, which turns out to be pretty regularly: for instance, when a client doesn’t have ongoing line-of-sight to a domain controller or KDC, since offline access cannot leverage Kerberos. NTLM is also hard-coded to use RC4 encryption, which is obsolete.
One other interesting feature of NTLM was that it wasn’t particularly picky about names. With the introduction of AD, Microsoft reiterated their long-standing position that they would support “the way we’ve always done things”, i.e. users browsing to file-shares just like they did in the good ol’ NetBEUI days! And while that meant we could do some neat forgeries in WINS, plenty of admins (myself included) were eager to embrace the power of DNS and started abstracting server names from service names, just like we do on the internet.
Consider that whatever server is actually hosting this content is not named www.synergy-technical.com, but probably some horribly user-unfriendly name like web_134rasdvzaw3rt_svc001. Nobody would expect a user to know that name, and so we hide it behind a friendly name. While public-internet facing machines have to layer in cryptography nowadays, back in 2000 you could easily tell IIS4.0 to ignore the hostname and focus on serving up the directory structure. This came with the magnificent benefit of being able to replace web servers without taking down the website. Hallelujah! Uptime achieved!
We could, and did, replicate this configuration with NetBIOS names in DNS. Push a default DNS search suffix list to your clients, jam a bunch of host names and aliases into your zone files, and suddenly file servers became replaceable. (Side note: DFS did exist at the time but was super flaky and really loved replicating deletions with no recycle-bin support.) My files could be on XENUWARRIOR09.company.local (it was 2000 and .local was what all the cool kids were doing), but SMB clients could easily browse to the NetBIOS name \\finance\share to get their files. It was simple. It was magnificent. It was, apparently, very wrong.
When Microsoft released Windows XP and Server 2003, Something Changed™. Suddenly those replaceable servers started throwing client-side errors. That change was the default reliance on Kerberos, and the symptom was an auth failure for a server-name mismatch. Remember that Kerberos performs mutual client/service authentication, so the client expects the server to be truthful in its response AND match the name in the service request.
A few dozen support calls later, Microsoft’s official suggestion was to deploy a GPO setting a single simple Registry fix: disablestrictnamechecking=1. Boom: done. Client errors zapped to oblivion, users happily browsing away on XENUWARRIOR09 with no cares in the world.
Except that that particular Registry fix was paving over a very serious future security problem for a quick win. That setting explicitly tells the Windows client not to bother with a foundational component of Kerberos: mutual auth. And if Kerberos doesn’t work, we fall back to NTLM. So while the setting doesn’t explicitly force client communications in an AD environment to use *only* NTLM, it prevents Kerberos from happening at all. We’re not forcing NTLM, we’re just breaking Kerberos.
So why do we care about the jury-rigged fixes of 2003 in 2023? Because Microsoft has recently unveiled plans to deprecate NTLM and eventually remove support from Windows.
To do this requires a long journey of change for Windows clients, which again cannot use Kerberos without line-of-sight to a KDC. It means introducing a new mechanism for local Kerberos authentication into every Windows client, and it means cleaning up old configurations and code.
Microsoft broke down global statistics on NTLM use:
- 29% of NTLM is local or offline
- 14% is “misconfigured” infrastructure
  - 50/50 split of hard-coded IP addresses and name mismatches
- 52% of NTLM is apps hard-coding NTLM as the only authentication protocol
  - 60/40 split of MS-developed apps and consumer apps
- 5% “other”
Of these, Microsoft is tackling the lift for about 60% through offline support and reconfiguring their own apps. This will be a Herculean lift tackled by every stratum of their development teams. Windows must first have a mechanism to process local Kerberos, which will be introduced through a local KDC and a protocol called IAKerb in a future version of Windows, but also literally every single app in the MS portfolio will have to be examined to ensure it supports Kerberos authentication. Shockingly, the app work is actually the slightly bigger share of that effort, even in late 2023.
Outside the Microsoft efforts, which will take at least 2 years to implement, and another several to forcibly remove NTLM from the Windows stack, the remaining ~40% falls to us, and aligns over 3 main areas:
- Application reconfiguration (~21%)
- Hard-coded IP addresses (~7%)
- Service-name mismatches (~7%)
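For anyone who wants to check the math, here is a quick sketch of how the quoted percentages roll up into the ~60% and ~40% figures. The variable names are mine, and counting the 5% “other” bucket on our side of the ledger is my own reading of the numbers, not something Microsoft stated.

```python
# Shares of all NTLM traffic, in percent, as quoted in the breakdown above.
local_or_offline = 29
misconfigured = 14
hard_coded_apps = 52
other = 5

# Sub-splits quoted above: 60/40 for apps, 50/50 for misconfigured infrastructure.
ms_apps = hard_coded_apps * 60 / 100
consumer_apps = hard_coded_apps * 40 / 100
hard_coded_ips = misconfigured * 50 / 100
name_mismatches = misconfigured * 50 / 100

# Microsoft's side: offline support plus their own apps.
microsoft_share = local_or_offline + ms_apps

# Our side: everything else (assumption: "other" lands here too).
customer_share = consumer_apps + hard_coded_ips + name_mismatches + other

print(f"Microsoft: {microsoft_share:.1f}%")
print(f"Everyone else: {customer_share:.1f}%")
print(f"  app reconfiguration: {consumer_apps:.1f}%")
print(f"  hard-coded IPs: {hard_coded_ips:.1f}%")
print(f"  name mismatches: {name_mismatches:.1f}%")
```

The three named areas add up to roughly 35%, so the gap to ~40% is the “other” bucket.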
The first and biggest group of changes will fall to app developers. Sysadmins rejoice! But while they have a lot to change, according to the content shown in the webinar, it won’t be a huge lift for most. In most cases it will be simply changing an authentication string from ‘ntlm’ to ‘negotiate’ in their code. 1 line—really just 1 word.
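If you want a rough idea of where that one word lives in your own code or config files, something like the sketch below could be a starting point. The pattern and both function names are hypothetical; real code bases name this setting in many different ways, so treat it as a first-pass search and not an audit tool.

```python
import re

# Hypothetical pattern: a setting whose name starts with "auth" and whose
# value is the quoted literal "ntlm" (any case, single or double quotes).
NTLM_LITERAL = re.compile(r"""(auth\w*\s*[:=]\s*)(['"])ntlm\2""", re.IGNORECASE)


def find_hard_coded_ntlm(source: str) -> list[int]:
    """Return 1-based line numbers that appear to pin authentication to NTLM."""
    return [
        number
        for number, line in enumerate(source.splitlines(), 1)
        if NTLM_LITERAL.search(line)
    ]


def suggest_negotiate(line: str) -> str:
    """Swap the 'ntlm' literal for 'negotiate', keeping the original quote style."""
    return NTLM_LITERAL.sub(
        lambda m: f"{m.group(1)}{m.group(2)}negotiate{m.group(2)}", line
    )


sample = 'auth_scheme = "ntlm"\nauth_scheme = "negotiate"'
print(find_hard_coded_ntlm(sample))
print(suggest_negotiate('auth_scheme = "ntlm"'))
```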
That 1 simple word will be the key going forward. We’ve had ‘negotiate’ as an auth option in Windows since forever-and-a-day ago, and it cycles through the Rolodex of available authentication options from strongest to weakest, rather than using a single hard-coded authentication method with no backup in case of failure. We know it as SPNEGO, and as Microsoft moves through this journey, it will default to these authentication protocols, in order:
Whatever comes after Kerberos will simply be inserted into the stack, because SPNEGO is extensible and flexible where NTLM is not.
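As a toy illustration of why that extensibility matters, here is a minimal sketch of preference-ordered negotiation. It is not the real SPNEGO wire protocol, and the mechanism names (including "future-mech") are placeholders of my own; the only ordering taken from the discussion above is that Kerberos is preferred and NTLM is the fallback.

```python
def negotiate(client_prefs: list[str], server_supported: set[str]) -> str:
    """Pick the first mechanism, in client preference order, the server also supports."""
    for mechanism in client_prefs:
        if mechanism in server_supported:
            return mechanism
    raise ValueError("no common authentication mechanism")


# Today: try Kerberos first, fall back to NTLM if the server cannot do Kerberos.
prefs = ["kerberos", "ntlm"]
print(negotiate(prefs, {"kerberos", "ntlm"}))
print(negotiate(prefs, {"ntlm"}))

# A hypothetical newer mechanism slots into the list without touching the caller.
prefs_later = ["future-mech", "kerberos"]
print(negotiate(prefs_later, {"kerberos"}))
```

An app that hard-codes a single protocol has nothing to fall back to, while an app that asks for 'negotiate' picks up the new ordering for free.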
The other 14% will be a mix of responsibilities, and for this Microsoft is releasing auditing tools to help admins understand what NTLM traffic exists today and how to correct it…mostly.
For the 7% that is hard-coded IP addresses, think of your old Oracle implementations that looked down upon DNS with utter disdain. Those apps and services will have to be torn open to have IP addresses replaced with names. Admins, likewise, who have served up printers or Windows 2012 clustered SMB shares via IP address will have an unpleasant day (not to mention 2012 is end-of-life as of this month).
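Finding those hard-coded addresses is at least easy to script. Below is a minimal sketch, assuming you can export your share paths or port names as plain strings; the helper names are hypothetical and it only looks at UNC-style paths.

```python
import ipaddress


def unc_host(path: str) -> str:
    r"""Extract the host part of a UNC path such as \\server\share."""
    return path.lstrip("\\").split("\\", 1)[0]


def uses_raw_ip(path: str) -> bool:
    """Return True when the host part of a UNC path is an IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(unc_host(path))
        return True
    except ValueError:
        return False


# Example inputs are made up for illustration.
paths = [r"\\10.0.0.12\share", r"\\finance\share"]
flagged = [p for p in paths if uses_raw_ip(p)]
print(flagged)
```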
But back to that GPO and the remaining 7%... I asked very specifically about accessing file servers via CNAME entries in DNS and was told that if the server name does not match the name requested by the client, then the server is “lying” about its identity and this is a Bad Thing and a misconfiguration. I won’t dispute the stance as it does force NTLM where Kerberos should be used, but Microsoft’s own official guidance for those 20-year-old client-side connection errors is still to create CNAME entries for servers in AD DNS, though with the caveat that this should not be done “in the future”.
That same article strongly recommends that, in order to continue using alternate server names in the future, server admins will have to add an SPN to AD with a “netdom computername /ADD” command. This may fix the Kerberos issue, but will re-introduce the challenge of name portability, as that alias name will be officially registered to the hosting server. Replacing that server then becomes a disruptive event with downtime for the alias name.
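To make the trade-off concrete, here is a toy model of the lookup. It is a deliberate simplification: the dictionary stands in for the SPNs registered in AD, the function name is hypothetical, and the "cifs/" service class is my assumption for SMB file shares.

```python
def choose_protocol(requested_host: str, registered_spns: dict[str, str]) -> str:
    """Return 'kerberos' if an SPN exists for the name the client asked for, else 'ntlm'."""
    # The SPN is built from the name the user typed, not the server's real name.
    spn = f"cifs/{requested_host.lower()}"
    return "kerberos" if spn in registered_spns else "ntlm"


# SPN -> owning computer account. Only the real server name is registered at first.
spns = {"cifs/xenuwarrior09.company.local": "XENUWARRIOR09$"}

before = choose_protocol("finance.company.local", spns)

# Registering the alias fixes Kerberos, but ties the alias to this one server.
spns["cifs/finance.company.local"] = "XENUWARRIOR09$"
after = choose_protocol("finance.company.local", spns)

print(before, after)
```

Moving the alias to a replacement server means moving that registration too, which is exactly the portability cost described above.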
I think it’s exciting and long overdue for MS to nuke NTLM from Windows. Seeing this work unfold over the past couple of years, along with the introduction of Kerberos authentication for Azure resources, has been an amazing journey that began with an almost-impossible mission, given the line-of-sight requirements that Kerberos imposes. And I don’t think most organizations will be at risk of the 7% of traffic enabled by that 20-year-old workaround, if only because Active Directory environments born after 2002 will simply not have built these configurations.
But I do caution organizations with ADs that date back to the dawn of Active Directory to check their GPOs. Find any that are imposing ‘disablestrictnamechecking’, or indeed any that lower NTLM requirements for client/server communications, and start working to resolve those challenges now. Check your CNAME records in DNS and ensure you’re pointing users to actual server names or following current best practices on redirections. Look through your printer queues and find the ports that are being served up as raw IP addresses, and have a plan for clustered SMB shares. The odds are strong that if you have any of these configurations, you probably have several challenges ahead.
To aid in this, Active Directory has audit policies to help identify NTLM traffic. Now is the time to put these tools in place and prepare for a post-NTLM world!