Just over three years ago I shared the struggles my team at the Financial Times faced trying to get Webpack 4 to compile files with consistent names across the 20+ separate codebases which serve FT.com. We fought to achieve consistency so that our users could navigate between our different services without needing to download the same things over and over again and to avoid teams busting the cache multiple times a day each time they pushed changes into production. Whilst we were eventually able to achieve our goal of generating output with a high level of consistency across our apps, getting there was very difficult and required developing some complex solutions.
Webpack 5 was released soon after my post in late 2020 and one change to content hashes in particular would have made our struggles almost entirely redundant.
The test case
To demonstrate the changes to content hashes between Webpack 4 and Webpack 5 I’ve chosen to use the Todo MVC Vanilla ES6 app. It has only 7 source files so it’s small but just complex enough to illustrate Webpack’s new and old behaviour clearly.
I’ve visualised the Todo MVC app dependency graph below using Dependency Cruiser and Graphviz. This graphic shows the relationships between all of the app’s source code modules, starting from app.js - the entry point - on the left:
I’ll bundle the app using the configuration below. I’ve set up the split chunks plugin to output each module within its own output file - or chunk - rather than bundling everything into a single file, which makes the differences between versions of Webpack easier to see. I’ll use the same config file for both the Webpack 4 and Webpack 5 tests.
const path = require('path')

// Derive a chunk name from a module's file name, e.g. "src/store.js" becomes "store"
const moduleName = (file) => path.basename(file, path.extname(file))

module.exports = {
  mode: 'production',
  entry: {
    main: './src/app.js',
  },
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: '[name].[contenthash:6].js',
    chunkFilename: '[name].[contenthash:6].js',
  },
  optimization: {
    splitChunks: {
      chunks: 'all',
      cacheGroups: {
        // Put every module into its own chunk, named after its source file
        modulesToChunks: {
          name: (m) => moduleName(m.resource),
          enforce: true
        }
      }
    }
  }
}
Using this configuration Webpack emits 8 files; the 7 source code modules all have an equivalent output chunk generated, and an extra file named main has appeared too which contains the Webpack runtime code needed to stitch all of the separate files together again in the browser. Each file also has a 6 character content hash added to its name which is used to track the changes within.
app.5aa1aa.js
controller.4265cf.js
helpers.26e991.js
item.757ae7.js
main.a24d27.js
store.78b475.js
template.a21d0f.js
view.1115f5.js
Now that the app is being compiled as planned it’s time to make some changes to the source code and observe what happens to the output file names. I’m going to modify the application’s entry point by changing the order of its dependencies, switching the first reference into last place:
-import Controller from './controller';
import { $on } from './helpers';
import Template from './template';
import Store from './store';
import View from './view';
+import Controller from './controller';
After running Webpack again the chunk which contains the module I edited has a new content hash as expected but this is not the only file which has a new name…
Chunk name | Has changes? | Original hash | New hash |
---|---|---|---|
app | Yes | 5aa1aa | ed729d |
controller | No | 4265cf | e502be |
helpers | No | 26e991 | 26e991 |
item | No | 757ae7 | 757ae7 |
main | No | a24d27 | a24d27 |
store | No | 78b475 | 056f2f |
template | No | a21d0f | 89aa7e |
view | No | 1115f5 | 58dbd6 |
Oh dear. I made a tiny change to one file but the result is a cascade of new content hashes, renaming 5 of the 8 output files even though only one of them contains a real change. Shipping a changeset like this would force users to download assets which are identical to the ones they already have and make their experience slower.
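A comparison like the one in the table can be automated. The sketch below uses the file names recorded above; `renamedChunks` and `parse` are hypothetical helper names made up for this example, and it assumes every file follows the `name.hash.js` pattern from the config.

```javascript
// File names emitted before and after the import reorder, copied from the results above
const before = [
  'app.5aa1aa.js', 'controller.4265cf.js', 'helpers.26e991.js', 'item.757ae7.js',
  'main.a24d27.js', 'store.78b475.js', 'template.a21d0f.js', 'view.1115f5.js',
]
const after = [
  'app.ed729d.js', 'controller.e502be.js', 'helpers.26e991.js', 'item.757ae7.js',
  'main.a24d27.js', 'store.056f2f.js', 'template.89aa7e.js', 'view.58dbd6.js',
]

// Split "name.hash.js" into its chunk name and content hash
const parse = (file) => {
  const [name, hash] = file.split('.')
  return { name, hash }
}

// Hypothetical helper: list the chunks whose hash differs between two builds
function renamedChunks(beforeFiles, afterFiles) {
  const previous = new Map(
    beforeFiles.map(parse).map(({ name, hash }) => [name, hash])
  )
  return afterFiles
    .map(parse)
    .filter(({ name, hash }) => previous.get(name) !== hash)
    .map(({ name }) => name)
}

console.log(renamedChunks(before, after))
// → [ 'app', 'controller', 'store', 'template', 'view' ]
```

In a real project the two lists could come from reading the output directory after each build rather than being typed in by hand.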
Why were there so many changes?
The [contenthash] values appended to the output file names are not only based on the code they contain but also on data points which track the relationships between the source files. In Webpack 4 the hashes are constructed (roughly) like this:
Output chunk hash = the chunk ID
                  + the hashes for each module inside the chunk

JS module hash = the module ID
               + a list of module dependency IDs
               + the module source code
               + the names of exported properties
               + the names of exported properties marked as used
Because the change I made caused the source code modules to be discovered in a different order, their incrementally assigned IDs changed too. As the module hashes also include the IDs of their dependencies, not only did the module I changed get a new hash but every other module with dependencies did too. The only chunks which didn’t get a new hash are the Webpack runtime (main) and those containing modules with dependents but no dependencies (helpers and item).
This hashing behaviour is what caused us so many problems at the FT when trying to make 20+ apps all compile identical assets, because the order in which their dependencies were found and exactly how they were used always varied.
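The cascade can be reproduced with a toy model of the rough hashing recipe above. This is a sketch under several assumptions, not Webpack's actual code: the dependency graph is a hand-written approximation of the app, the ID rule (most-imported modules first, ties broken by depth-first discovery order) is invented for the model, and sha256 stands in for whatever hash function Webpack uses. Export names are left out to keep it short.

```javascript
const crypto = require('crypto')

// Short hex digest, mimicking the 6 character hashes in the file names
const shortHash = (parts) =>
  crypto.createHash('sha256').update(parts.join('|')).digest('hex').slice(0, 6)

// Assumed, simplified dependency graph; only app's import order varies
const buildGraph = (appImports) => ({
  app: appImports,
  controller: ['item', 'store', 'view'],
  store: ['item'],
  template: ['item', 'helpers'],
  view: ['item', 'helpers', 'template'],
  helpers: [],
  item: [],
})

// Toy ID assignment: most-imported modules first, ties broken by discovery order
function assignIds(graph, entry) {
  const discovered = []
  const visit = (name) => {
    if (discovered.includes(name)) return
    discovered.push(name)
    graph[name].forEach(visit)
  }
  visit(entry)

  const importers = (name) =>
    Object.values(graph).filter((deps) => deps.includes(name)).length

  const ordered = [...discovered].sort(
    (a, b) =>
      importers(b) - importers(a) || discovered.indexOf(a) - discovered.indexOf(b)
  )
  return new Map(ordered.map((name, id) => [name, id]))
}

// Module hash = own ID + dependency IDs + source code
function hashModules(graph, entry) {
  const ids = assignIds(graph, entry)
  const hashes = {}
  for (const [name, deps] of Object.entries(graph)) {
    // The import list stands in for the entry's source, so only app's "source" changes
    const source = name === entry ? deps.join(',') : `source of ${name}`
    const depIds = deps.map((dep) => ids.get(dep)).join(',')
    hashes[name] = shortHash([ids.get(name), depIds, source])
  }
  return hashes
}

const before = hashModules(
  buildGraph(['controller', 'helpers', 'template', 'store', 'view']), 'app')
const after = hashModules(
  buildGraph(['helpers', 'template', 'store', 'view', 'controller']), 'app')

const changed = Object.keys(before).filter((name) => before[name] !== after[name])
console.log(changed)
// → [ 'app', 'controller', 'store', 'template', 'view' ]
```

In this model helpers and item keep their hashes because they have no dependency IDs to mix in and their own IDs happen to stay put, while every module that imports something picks up a new hash.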
Testing with Webpack v5
I’m going to run through the same steps as before; bundling the original app source code and recording the output file names, then make the change to imports and bundle the app again.
This time however I’m going to use Webpack v5 which uses a new content hash algorithm by default:
$ npm install webpack@^v5
And after completing all of the steps I recorded the following sets of output:
Chunk name | Has changes? | Original hash | Hash after change |
---|---|---|---|
app | Yes | 0a4820 | d97c82 |
controller | No | 42fc14 | 42fc14 |
helpers | No | 658f80 | 658f80 |
item | No | 22b3fd | 22b3fd |
main | No | 8dc1e0 | 8dc1e0 |
store | No | 8ce8c8 | 8ce8c8 |
template | No | b04d02 | b04d02 |
view | No | 3fa1e6 | 3fa1e6 |
This time the change I made did not cause a cascade of name changes to the other files. Instead the new realContentHash feature has generated content hashes based only on the final contents of each chunk, rather than on a combination of the contents and internal build data such as module IDs 🎉
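The idea of hashing only the final contents can be sketched in a few lines. This is an illustration of the principle rather than Webpack's implementation: `contentHashedName` is a hypothetical helper, the source string is a placeholder, and sha256 is an arbitrary choice of hash function.

```javascript
const crypto = require('crypto')

// Name a chunk from its final bytes only; IDs and discovery order play no part
function contentHashedName(chunkName, contents, length = 6) {
  const hash = crypto
    .createHash('sha256')
    .update(contents)
    .digest('hex')
    .slice(0, length)
  return `${chunkName}.${hash}.js`
}

// Placeholder contents standing in for an emitted chunk
const storeSource = 'export default class Store {}'

const firstBuild = contentHashedName('store', storeSource)
const secondBuild = contentHashedName('store', storeSource)
const editedBuild = contentHashedName('store', storeSource + '\n// edited')

console.log(firstBuild === secondBuild) // true
console.log(firstBuild === editedBuild) // false
```

With a scheme like this a file is only renamed when its bytes differ, which is the behaviour the second table shows.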
So is it worth upgrading?
If you’re working on projects which are regularly deployed to production and also have regular repeat visitors then I’d recommend upgrading your build processes to Webpack v5. Its new optimisation options help to avoid unnecessary cache busting - by default - which will make your website faster and improve the experience for your users.