[Feature] Manual chunks #207

Open
intrnl opened this issue Jun 30, 2020 · 28 comments

@intrnl

intrnl commented Jun 30, 2020

Similar to Rollup's manualChunks option, where you can define custom common chunks:

{
  "manualChunks": {
    "common": ["react", "react-dom"]
  }
}
@evanw
Owner

evanw commented Jul 1, 2020

Why would that be a better thing to do than automatically-computed chunks? I'm sure there are reasons, I just want to make sure I know why you would want to do this so that I know what this feature request is about. Is it because aesthetically seeing multiple files called chunk isn't desirable? Is it wanting to improve caching for repeated page loads? Is it about wanting to control the number of chunks? Something else?

@intrnl
Author

intrnl commented Jul 1, 2020

it's primarily about caching. when your application code and its dependencies are bundled into one file, users have to keep downloading everything even though only your app code changed. so i think it would be nice if the dependencies were split into their own chunk.

an alternative proposal would be a toggle that moves everything under node_modules into its own chunk, but i think providing manual control like this would be better, especially if other devs have their own strategy for splitting.
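
To make the node_modules toggle idea concrete, here is a minimal sketch of the classification rule it implies. The helper name `chunkNameFor` and the chunk names "vendor" and "app" are hypothetical; esbuild has no such option, and the module paths are made up for illustration.

```javascript
// Hypothetical helper, not part of esbuild: decides which chunk a module
// would land in if everything under node_modules were split out.
function chunkNameFor(modulePath) {
  // Normalize Windows separators so the check works on both platforms.
  const normalized = modulePath.replace(/\\/g, "/");
  return normalized.split("/").includes("node_modules") ? "vendor" : "app";
}

// Made-up module list for illustration.
const modules = [
  "src/index.js",
  "src/components/button.js",
  "node_modules/react/index.js",
  "node_modules/react-dom/index.js",
];

// Group module paths by their assigned chunk.
const chunks = {};
for (const modulePath of modules) {
  const name = chunkNameFor(modulePath);
  (chunks[name] ??= []).push(modulePath);
}

console.log(chunks);
```

With this rule, a change to anything under src/ only touches the "app" group, which is the caching property being asked for.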

@evanw
Owner

evanw commented Jul 1, 2020

If this is about caching, then should this disable tree shaking? Otherwise when tree shaking is active, the size of even manually-specified chunks can grow and shrink between builds.

Also it's worth noting that esbuild will need other changes to achieve these caching goals. Right now all symbols in the bundle are minified together. Optimal caching will require symbols to be minified independently per chunk.

@intrnl
Author

intrnl commented Jul 2, 2020

Tree shaking might be problematic yeah, which is quite the tradeoff

Would've liked to say that devs should only be specifying manual chunks for things that can't be tree shaken, but I think a lot will miss the point 😅

@visj

visj commented Mar 15, 2021

Closure Compiler allows code splitting into manual chunks. If esbuild were to allow manual chunks in the same way, it could also be used as a very strong developer tool for applications using Closure Compiler, as esbuild is an order of magnitude faster.

If you for instance have 5 entry points [a,b,c,d,e], and some code is shared only between [a,b], you could split that code out as well as code shared by all 5 chunks and make only routes [a,b] depend on the first chunk. I think that's how Google splits out their JS when you get different widgets in their search (such as the calculator or stock graphs).
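
One way to express the [a,b] versus all-five split is to group modules by the set of entry points that reach them. The sketch below does that on a made-up import graph; the module names `sharedAB` and `core` are hypothetical, and this is not a description of how esbuild or Closure Compiler actually compute chunks.

```javascript
// Made-up import graph: each key lists the modules it imports.
const imports = {
  a: ["sharedAB", "core"],
  b: ["sharedAB", "core"],
  c: ["core"],
  d: ["core"],
  e: ["core"],
  sharedAB: [],
  core: [],
};
const entryPoints = ["a", "b", "c", "d", "e"];

// Returns every module reachable from the given entry point.
function reachableFrom(entry, graph) {
  const seen = new Set();
  const stack = [entry];
  while (stack.length > 0) {
    const current = stack.pop();
    if (seen.has(current)) continue;
    seen.add(current);
    stack.push(...(graph[current] ?? []));
  }
  return seen;
}

// For every module, record which entry points can reach it.
const reachedBy = new Map();
for (const entry of entryPoints) {
  for (const moduleName of reachableFrom(entry, imports)) {
    if (!reachedBy.has(moduleName)) reachedBy.set(moduleName, []);
    reachedBy.get(moduleName).push(entry);
  }
}

// Modules reached by the same set of entry points share a chunk.
const chunksByEntrySet = {};
for (const [moduleName, entries] of reachedBy) {
  (chunksByEntrySet[entries.join("+")] ??= []).push(moduleName);
}

console.log(chunksByEntrySet);
```

Here `sharedAB` ends up in a chunk keyed "a+b" and `core` in one keyed "a+b+c+d+e", so only routes a and b would need to load the first chunk.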

@mjackson

If this is about caching, then should this disable tree shaking?

My vote would be yes, it should disable tree shaking in the chunks that are manually defined (or at least have the option to do so) since the content of those chunks could otherwise change between builds.

I am thinking about using this feature to split out dependencies (node_modules) into their own chunks, but I'd still like to have tree shaking in my app code where I would allow esbuild to determine the optimal chunks.

@johnpyp

johnpyp commented Mar 30, 2021

This would be a great feature. We have a monorepo with some large dependencies, as well as monorepo dependencies. The monorepo packages change often, and our dependencies don't.

We'd like to have a few separate bundles where we could load an external dependency bundle, as well as a bundle for monorepo dependencies, and finally our entrypoint.

With manual chunking ideally we'd be able to load these bundles sequentially and have them reference the previously loaded chunks globally almost like an external dependency, though I understand the current splitting implementation only allows esm.

As for tree shaking, each chunk would ideally stay static for caching given the same inputs within that chunk, but would tree shaking per module within that chunk be possible? For example, if you have an index.js entry point in that module, you wouldn't need to include test code that isn't referenced from that entry point. Though I don't know how this could break things like sub-path imports, e.g. import debounce from 'lodash/debounce'
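
On the sub-path import concern: if chunks were keyed by package, a sub-path specifier would first need to be mapped back to its package name. A minimal sketch of that mapping follows; `packageNameOf` is a hypothetical helper, and `@scope/pkg` is a placeholder name.

```javascript
// Hypothetical helper: maps an import specifier to its package name so that
// sub-path imports are grouped with the package they belong to.
function packageNameOf(specifier) {
  // Relative and absolute paths are not package specifiers.
  if (specifier.startsWith(".") || specifier.startsWith("/")) return null;
  const parts = specifier.split("/");
  // Scoped packages use two segments: "@scope/name".
  if (specifier.startsWith("@")) {
    return parts.length >= 2 ? `${parts[0]}/${parts[1]}` : null;
  }
  return parts[0];
}

console.log(packageNameOf("lodash/debounce"));
console.log(packageNameOf("@scope/pkg/sub/path"));
console.log(packageNameOf("./local-file"));
```

Under this rule, 'lodash' and 'lodash/debounce' would be assigned to the same manual chunk.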

@lgarron
Contributor

lgarron commented Apr 10, 2021

By any chance, could there be a simple way to get parts of the codebase to essentially treat each other as external?

e.g. pretend that each folder index entry point essentially corresponds to its own "npm module", and only allow them to import from each other's indexes so that they can be compiled independently. I imagine it's already possible to hack together a build system that uses esbuild like this, but it doesn't seem like a great use of time.

lgarron added a commit to cubing/cubing.js that referenced this issue Apr 12, 2021
This is a hacky implementation of:

- evanw/esbuild#207
- evanw/esbuild#492

If either of those are implemented, most of this should be obsolete.

For now, we will use this build in CI to enforce an import graph among
the `cubing/*` modules.
@mohamedmansour

This feature would really help us in our large code base. We have 30 entry points in Microsoft Edge (Chromium), and overall distribution size is really important to keep to a minimum. Our webpack build ran in 110 seconds, while our esbuild build ran in 1.5 seconds. One of our pages was downloading around 60 chunks on page load, making time to interactive slower than just downloading one big chunk. Since we have no network latency, managing the chunking algorithm is important.

If there was a way to say that everything that is tree-shaken from some NPM package will always be chunked together, it would be shared by all 30 entry points:

{
  combinedChunks: ['@chromium/framework', '@chromium/common']
}

Or if there was a way to set a maximum number of chunks, just an idea. The more chunking you have, the more disk space it will use. There is a balance; perhaps just using a couple of chunks defined in the config?

@MuTsunTsai

If this is about caching, then should this disable tree shaking?

I don't see why it should. Yes, it is true that "the size of even manually-specified chunks can grow and shrink between builds", but it MAY also not, if the parts imported by the app don't change. With tree shaking enabled, the overall download size for the user during an update is at worst the same as before, and in many cases we do save download size by using manual chunks.

@jhirshman

I'm also interested in support for a vendors file.
We currently use webpack and want to switch to esbuild.

However, our site code deploys every hour, which presents a problem for users with bad internet connections who then have to re-download all of the chunks every time we deploy. We solved this with webpack by building a vendors dll that contains all of the big libraries. This file is ~10MB but updates only once every few months. If we were to switch to esbuild, the total chunk size would be much smaller than 10MB due to better tree shaking, but cumulatively over the course of the day, there would still be more data transferred to a user.

We'd be willing to look into other solutions as well. But our ideal would be to list out a number of libraries that would be built externally without tree shaking and then incorporated into the main esbuild run.
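
The cumulative-transfer argument can be checked with a little arithmetic. In the sketch below only the ~10MB vendor size comes from the comment above; the 1MB app-only size and the 4MB tree-shaken bundle size are made-up assumptions, so the break-even point is purely illustrative.

```javascript
// Sizes in megabytes. Only vendorDllSize comes from the comment above;
// the other two numbers are assumptions chosen for illustration.
const vendorDllSize = 10;       // large vendor file, rarely changes
const appOnlySize = 1;          // assumption: app code fetched after each deploy
const treeShakenBundleSize = 4; // assumption: one combined bundle per deploy

// Total transfer for a user who loads the site after `deploys` deploys.
function withVendorSplit(deploys) {
  return vendorDllSize + appOnlySize * deploys;
}

function withSingleBundle(deploys) {
  return treeShakenBundleSize * deploys;
}

for (const deploys of [1, 4, 8]) {
  console.log(deploys, withVendorSplit(deploys), withSingleBundle(deploys));
}
```

Under these assumed numbers the cached vendor file starts paying off from the fourth deploy a user sees (14MB versus 16MB), and the gap widens with every deploy after that.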

@vadistic

vadistic commented Feb 9, 2022

I just wanted to mention that it would be great if this feature somehow fit into the esbuild plugin API.

For example, build.onResolve could optionally return a custom chunk name; then we could write a simple plugin to configure our chunks.

@IdeaHunter

Me: yarn add esbuild
Me: how do i configure vendor bundle
Me: read this feature
Me: yarn remove esbuild

I have a 100k LOC frontend project with a bunch of dependencies and workers, which means users would be forced to download all vendor dependencies at least twice if I can't create common vendor chunks.

@OlegWock

This will benefit browser extensions too. It's common practice to have a few entry points there (a background worker, a popup, and a few content scripts are the bare minimum). To reduce the extension's size in my projects, I use webpack and put all shared code into a separate chunk, which is then loaded before the entry point.

Tree shaking fits perfectly into this model, since this chunk is included in the extension's dist and is downloaded every time the browser installs or updates the extension, so there is no risk of a mismatch between the chunk and the entry points.

@garygreen

+1 showing my interest for this as well. It's the main blocking feature we have for switching to esbuild.

This feature request essentially asks for something similar to webpack's splitChunks: being able to configure conditions on when and how chunking occurs. With splitChunks you can configure it to only chunk when files are larger than e.g. 5KB, or to put files that are part of node_modules in a vendor.js chunk, or to exclude certain files, etc.
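
For readers who have not used it, a webpack splitChunks setup along those lines might look roughly like the sketch below. The 5KB threshold and the "vendor" name are example values taken from the description above, not a recommended configuration.

```javascript
// Sketch of a webpack config using splitChunks; values are examples only.
const webpackConfig = {
  optimization: {
    splitChunks: {
      chunks: "all",
      // Only create a split chunk when it would be at least ~5KB (in bytes).
      minSize: 5 * 1024,
      cacheGroups: {
        vendor: {
          // Match any module that lives under node_modules.
          test: /[\\/]node_modules[\\/]/,
          name: "vendor",
          chunks: "all",
        },
      },
    },
  },
};

// In a real project this object would be exported from webpack.config.js.
console.log(webpackConfig.optimization.splitChunks.cacheGroups.vendor.name);
```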

@jpreynat

jpreynat commented Sep 6, 2022

Our team would also be interested in this kind of feature 👍

The main reason for us is also to allow for better caching, especially of 3rd party modules that are not updated frequently.
Currently, each one of our releases creates some really big chunks that include all this code again, but at scale this leads to lots of bandwidth consumption for the exact same code.

@brunoargolo

brunoargolo commented Nov 15, 2022

I've come up with a workaround to produce a separate chunk per vendor module. It's a bit hacky, but maybe it can help someone.
This might mess with tree shaking, and it produces a larger overall bundle.
I'm using this for Node.js apps, not the browser, so for me the benefits still outweigh the slightly larger total bundle size.

The problem for me was that the sizes of the generated chunks/source maps caused Node.js to use 3 times the amount of memory.
Producing separate, smaller chunks/source maps brought memory utilization back to normal (unsure of the actual underlying reason, but I got the idea from this: nodejs/node#41541).

The goal of the script is to create a separate entrypoint that dynamically imports all your vendor dependencies.
something like the below:
await import('module1'); await import('module2'); ...

This will cause esbuild to split those libs automatically.

Here is a script to automate the creation of the entrypoint:

import { createRequire } from "module";
import _fs from 'fs';
import { build } from 'esbuild';
const fs = _fs.promises;
// createRequire lets this ES module load package.json files synchronously.
const require = createRequire(import.meta.url);

// Returns the dependency names listed in the given package.json.
const getExternalModules = (pkgJsonPath) => {
  const packageJson = require(pkgJsonPath);
  return Object.keys(packageJson.dependencies ?? {});
}
// Writes an entry point that dynamically imports every dependency,
// so that esbuild splits each one into its own chunk.
const createModuleChunkingEntrypoint = async packageJsonPathList => {
  //de-dupe modules if needed
  const externalModules = [...new Set(packageJsonPathList.flatMap(getExternalModules))];
  const chunksEntrypoint = './chunks.js';
  // join('') so that a single string is written rather than an array
  await fs.writeFile(chunksEntrypoint, externalModules.map(m => `await import('${m}'); `).join(''));
  return chunksEntrypoint;
}
const chunksEntrypoint = await createModuleChunkingEntrypoint(['./package.json', '../../libs/shared/package.json']);

build({
  entryPoints: ['./src/index.js', chunksEntrypoint ],
  splitting: true,
  ...
})

If you notice any other pitfalls from this approach let me know

@arobinson

Another reason why this feature is vital is code coverage. When using esbuild to package source for running unit tests, first-party code is mixed in with third-party code. It is problematic, and undesirable, to instrument and measure code coverage of third-party code. The problem is that one cannot instrument the results of the esbuild bundling/splitting and only instrument first-party code.

This requires "hacks" like https://github.com/hyrious/esbuild-split-vendors-example to force esbuild to separate out the code.

It would be of great value to be able to split the 3rd party code from first party code via esbuild plugins at least similar to some of the webpack split chunks functionality.

@jpike88

jpike88 commented Apr 28, 2023

Can there just be an option for the time being to omit vendor source maps from the bundle? They can be huge and don't really offer that much value when debugging. Angular CLI is starting to experiment with esbuild, but it's relatively common for angular projects to have non-trivial vendor bundles. This inability to at the very least split/exclude vendor source map generation is causing performance issues with VSCode/Chrome to the point of it being unusable, meaning I am stuck with the webpack bundler until this is resolved.

angular/angular-cli#25012

@pumano

pumano commented May 8, 2023

I'll just drop my 5 cents here. Why is it important to have a vendor chunk? It's about how the browser consumes chunks. When I use a vendor chunk (webpack), I have ~7 initial JS files (chunks) and a few more that are lazy loaded, but when I use esbuild I get many, many chunks (around 65). Most browsers can download only 6 files in parallel per host over HTTP/1.1; the others are postponed until a connection becomes available. As a result, Core Web Vitals drop significantly when using esbuild instead of webpack's vendor chunk. If you don't trust me, run the experiment yourself (for example via the Lighthouse tab in Chrome). The problem should not exist with HTTP/2.
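
A back-of-the-envelope model of the connection limit: if at most 6 files download at once, the number of sequential "rounds" grows with the chunk count. The function below is a deliberately simplified sketch (it ignores file sizes, latency, and lazy loading), and `downloadRounds` is a hypothetical name.

```javascript
// Simplified model: every chunk takes one "round", and at most
// `maxParallel` chunks can download at the same time.
function downloadRounds(chunkCount, maxParallel = 6) {
  return Math.ceil(chunkCount / maxParallel);
}

console.log(downloadRounds(7));  // 2 rounds for ~7 initial chunks
console.log(downloadRounds(65)); // 11 rounds for ~65 chunks
```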

@jpike88

jpike88 commented Jun 16, 2023

@clydin as the issues were auto-locked, I just want you to be aware (if not already) that this problem prevents me from developing using the esbuild pathway, the source maps are massive and cause VSCode to poop itself when I try to step through code, presumably because of a huge sourcemap that's including everything from the vendor bundles.

@brianjenkins94

brianjenkins94 commented Aug 12, 2023

Anybody have any luck with what @brunoargolo posted?

It would make for a very nice workaround if it could be made compatible with Rollup's manualChunks configuration option, i.e.:

export default defineConfig({
	"entry": [
		"main.ts",
		await manualChunks({ // <--
			"monaco": [
				"monaco-editor/esm/vs/editor/editor.api.js",
				"vscode/dist/extensions.js",
				"vscode/dist/default-extensions"
			]
		})
	],
	// ...
});

I'm working on seeing if this is possible but I don't quite fully understand it yet.

#490 (comment) also seems like it could be useful.

@brianjenkins94

brianjenkins94 commented Aug 13, 2023

I'm not sure if this does the exact same thing but it seems close.

// Chunks

// Imports assumed by the code below (they were not shown in the original
// comment). `resolve` is assumed to come from the import-meta-resolve package
// and `defineConfig` from tsup; adjust these to your own setup.
import { existsSync, promises as fs } from "fs";
import path from "path";
import url from "url";
import { resolve } from "import-meta-resolve";
import { defineConfig } from "tsup";

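async function findParentPackageJson(directory) {
	if (existsSync(path.join(directory, "package.json"))) {
		return path.join(directory, "package.json");
	}

	const parent = path.dirname(directory);

	// Stop at the filesystem root instead of recursing forever.
	if (parent === directory) {
		throw new Error("No package.json found above " + directory);
	}

	return findParentPackageJson(parent);
}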

async function manualChunks(chunkAliases: { [chunkAlias: string]: string[] }) {
	return Promise.all(
		Object.entries(chunkAliases).map(async function([chunkAlias, modules]) {
			const dependencies = [...new Set((await Promise.all(modules.map(async function(module) {
				let modulePath;

				try {
					modulePath = url.fileURLToPath(resolve(module, import.meta.url));
				} catch (error) {
					modulePath = path.join(__dirname, "node_modules", module);

					if (!existsSync(modulePath)) {
						return [];
					}
				}

				const packageJsonPath = await findParentPackageJson(modulePath);

				const packageJson = await fs.readFile(packageJsonPath, { "encoding": "utf8" });

				return Object.keys(JSON.parse(packageJson).dependencies ?? {}).filter(function(module) {
					return existsSync(path.join(__dirname, "node_modules", module));
				});
			}))).flat(Infinity))];

			await fs.writeFile(path.join(__dirname, "chunks", chunkAlias + ".ts"), dependencies.map(function(module) {
				return `import "${module}";\n`;
			}).join("")); // write one string instead of an array of lines

			return path.join("chunks/" + chunkAlias + ".ts");
		})
	);
}

// Main Config

export default defineConfig({
	"entry": [
		"main.ts",
		...await manualChunks({
			"monaco": [
				"monaco-editor/esm/vs/editor/editor.api.js",
				"vscode/dist/extensions.js",
				"vscode/dist/default-extensions"
			]
		})
	],
	// ...
});

@pumano

pumano commented Nov 8, 2023

@evanw any news about this feature? Do you have plans to implement it? There are 57 likes (votes) here. It's also very important to have a vendor chunk when using HTTP/1.1, due to the large number of connections when a project has many chunks. That totally ruins Core Web Vitals.

@apastuhov

Here is one way to set up manual chunks; the downside is a 2-step build.

src/vendor.ts

export * from "lit";

bundle.js

// Build vendor file
await esbuild.build({
  ...commonConfig,
  entryPoints: ["./src/vendor.ts"],
});

// Build your website code
const publicEntry = "/assets/scripts"; // no trailing slash; "/vendor.js" is appended below
await esbuild.build({
  ...commonConfig,
  entryPoints: ["./src/app.ts"],
  alias: {
    lit: `${publicEntry}/vendor.js`, // will map dep to vendor "chunk"
  },
  external: [`${publicEntry}/vendor.js`], // will exclude dep from code
});

Then in HTML simply use ESM:

<!-- 27kb - lit deps -->
<script type="module" src="/assets/scripts/vendor.js"></script>
<!-- 3.4kb - app logic -->
<script type="module" src="/assets/scripts/app.js"></script>

P.S. As mentioned previously, this should be done intentionally, since the vendor file gets no tree shaking and no namespaces. For example, the same code as a single file with tree shaking is 25kb in total. An alternative would be to look up all imports in the source code and export them in vendor.ts, but that is overhead IMHO.

@polRk

polRk commented Nov 16, 2024

Something else?

If I build a GitHub Action, I want all dependencies to be in a separate chunk, because I commit the built files directly to git and the diff turns out to be huge. I would like to see only the diff of the code I wrote; the diff of the dependency chunk would change only when the dependencies change.

@emutime

emutime commented Nov 21, 2024

In Closure Compiler : https://github.com/google/closure-compiler/wiki/Flags-and-Options#code-splitting

In the example below, both chunk page1 and chunk page2 depend on chunk common, so even if page1functions.js and page2function.js reference commonfunctions.js, commonfunctions.js will not be bundled into the page1 and page2 chunks.

--js commonfunctions.js
--chunk common:1
--js page1functions.js
--js page1events.js
--chunk page1:2:common
--js page2function.js
--chunk page2:1:common

In Webpack : https://webpack.js.org/concepts/entry-points/#entrydescription-object

In the entry point config, the dependOn property achieves the same functionality.

module.exports = {
  entry: {
    a2: 'dependingfile.js',
    b2: {
      dependOn: 'a2',
      import: './src/app.js',
    },
  },
};

esbuild is very fast, but the lack of similar functionality prevents us from using it in production environments; we are currently only using it in development environments.

@pumano

pumano commented Nov 21, 2024

Looks like it won't be implemented. I'm just waiting for Rolldown under the hood of Vite (hopefully end of the year or early 2025), where this problem will be solved.
