Contract bytecode changes vastly when independent contracts added to the compiler #14250
Comments
For the record: we did find the cause and we'll fix this in the next release, but I'm not sure how to help with verifying contracts that were already deployed with this problem.
If there is a way to recognize this, we can try to compile with everything provided (instead of just the sources in the metadata) through the tooling, or warn the user about the issue in the UI. In fact, this is what we do for ethereum/sourcify#618, which makes me realize this really is a similar case (as is the Solidity bug #12281). I should've mentioned that, but it didn't cross my mind 🤦 In that case, however, it is not possible to get a "perfect match" including the metadata, because the metadata will be changed.
I'm not sure there's a really easy way to recognize this. It's very similar to the older issues we had where additional source files affected compilation results, but in a different place in the compiler that we missed back then. The main cause of the difference is that the compiler chooses a different permutation of memory offsets for moving variables from stack to memory, depending on some internal IDs that can change if you add more source files... The problem, however, is that the smallest offset we usually use for these variables is So overall you will see constant pushes before jumps (i.e. offsets into bytecode) change slightly, and you'll see constant offsets preceding memory loads and stores change slightly. So the pattern should be
So we can consider this fixed in newer compiler versions by #14311, right? Based on that I'm closing the issue, but feel free to reopen.
Yep, we'll handle this ourselves. Thanks
When a file is parsed, every AST node in it gets an ID. IDs are generated sequentially across all the input files as the compiler goes through them. As long as the files are processed in the same order and have the same content, you get the same IDs. So yeah, in case of bugs like this, where the compiler fails to make code generation completely independent of those IDs, changing the content in files unrelated to a contract may result in different bytecode for that contract. To be more specific, here's what must be avoided if you want to get identical bytecode in presence of such bugs:
Note that:
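The effect of sequential ID assignment described above can be illustrated with a toy model. This is only a sketch and not the compiler's actual implementation: `assign_node_ids`, the file name `Unrelated.sol`, and the node labels are hypothetical, and the processing order is simply the order of the list.

```python
from itertools import count

def assign_node_ids(sources):
    """Toy model: hand out one sequential ID per AST node, across all files.

    `sources` is a list of (filename, [node labels]) in processing order.
    This only mimics the idea of a single global counter; it is not how
    solc is actually implemented.
    """
    next_id = count(1)
    ids = {}
    for filename, nodes in sources:
        for node in nodes:
            ids[(filename, node)] = next(next_id)
    return ids

# The target contract compiled on its own ...
target_only = [("CompoundLens.sol", ["ContractDefinition", "FunctionDefinition"])]
# ... and with a hypothetical unrelated file processed before it.
with_extra = [("Unrelated.sol", ["ContractDefinition"])] + target_only

ids_a = assign_node_ids(target_only)
ids_b = assign_node_ids(with_extra)

# Same file content, different IDs, purely because of the extra file.
print(ids_a[("CompoundLens.sol", "ContractDefinition")],
      ids_b[("CompoundLens.sol", "ContractDefinition")])
```

In this model the nodes of `CompoundLens.sol` receive different IDs as soon as any file processed earlier gains or loses a node, which is the kind of shift that leaks into the bytecode when code generation is not fully independent of the IDs.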
Summary
I came across a contract that could not be verified on Sourcify because the bytecode produced by Sourcify's compilation differs from the author's (compiled with Hardhat). Digging deeper, I found that the difference surfaces when other contracts, unrelated to the compilation target, are added to the compilation.
Specifically, these two standard JSON inputs yield different bytecode for the same contract, `CompoundLens.sol`:
CompoundLens-solc-input-Sourcify-bytecode.json.txt
CompoundLens-solc-input-Hardhat-bytecode.json.txt
The only differences in the input sources are the following, which are unrelated to `CompoundLens`:
You'll notice the bytecodes differ even though the metadata hashes are the same. This is unexpected, as the differing sources listed above are not related to `CompoundLens`. `CompoundLens` already compiles without those contracts as input in `CompoundLens-solc-input-Sourcify-bytecode.json`, but when they are added in `CompoundLens-solc-input-Hardhat-bytecode.json` the contract's bytecode changes.
To reproduce
Using Solidity version `v0.8.19`
Output the bytecodes
Compare the bytecodes:
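One way to do the comparison, as a minimal sketch: assuming each input was compiled with `solc --standard-json` and the output was saved as JSON, the creation bytecode is found under `contracts[<source>][<contract>].evm.bytecode.object`. The helper names, the output file names in the usage comment, and the `source` key are assumptions for illustration; the actual source key depends on how the file is named in the input.

```python
import json

def creation_bytecode(output: dict, source: str, contract: str) -> str:
    """Pull the creation bytecode hex string out of a standard JSON output."""
    return output["contracts"][source][contract]["evm"]["bytecode"]["object"]

def same_bytecode(out_a: dict, out_b: dict, source: str, contract: str) -> bool:
    """True if both compiler outputs contain identical creation bytecode."""
    return (creation_bytecode(out_a, source, contract)
            == creation_bytecode(out_b, source, contract))

# Usage (file names and source key are hypothetical):
# with open("sourcify-output.json") as f:
#     out_sourcify = json.load(f)
# with open("hardhat-output.json") as f:
#     out_hardhat = json.load(f)
# print(same_bytecode(out_sourcify, out_hardhat,
#                     "contracts/CompoundLens.sol", "CompoundLens"))
```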
Background
The contracts are in this GitHub repo (`verify` branch): https://github.com/meterio/sumer-project/tree/verify
To compile (and deploy) the original sources:
The contract is also deployed at: https://goerli.etherscan.io/address/0x46df081108b2e2FDf1bF1E84Eeb2D7ec3AdA0061
The bytecode diff between the Sourcify output and the Hardhat output was not in the metadata hash, nor did it follow any pattern I could recognize:
CompoundLens-hardhat-recompiled-creation.txt
CompoundLens-recompiled-creation.txt
To generate the diff:
git diff --word-diff --word-diff-regex=. CompoundLens-hardhat-recompiled-creation.txt CompoundLens-recompiled-creation.txt
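As an alternative to reading the character-level diff by eye, the differing regions can be listed as byte offsets. This is a minimal sketch that assumes both bytecodes have the same length, which may not hold in general; `differing_byte_ranges` is a hypothetical helper, not part of any tool mentioned here.

```python
def _strip_0x(hex_str: str) -> str:
    """Drop an optional 0x prefix and surrounding whitespace."""
    hex_str = hex_str.strip()
    return hex_str[2:] if hex_str.startswith("0x") else hex_str

def differing_byte_ranges(hex_a: str, hex_b: str):
    """Return [(start, end)] byte ranges (end exclusive) where the inputs differ.

    Assumes equal-length bytecode; for different lengths a real diff
    tool is the better choice.
    """
    a = bytes.fromhex(_strip_0x(hex_a))
    b = bytes.fromhex(_strip_0x(hex_b))
    if len(a) != len(b):
        raise ValueError("bytecode lengths differ")
    ranges = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            if start is None:
                start = i          # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, len(a)))
    return ranges
```

Short, scattered ranges would be consistent with small constants changing, while a single range at the very end would point at the metadata hash instead.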
What I tried to do was to start from the standard JSON input of Sourcify and try to reach the Hardhat output bytecode in iterations:
initial Sourcify JSON input: CompoundLens-solc-input.json.txt
Hardhat JSON input: CompoundLens-hardhat-solc-input.json.txt
Using the same settings as Hardhat in the initial Sourcify input didn't change the bytecode.
Using the `sources` from the Hardhat input in the initial Sourcify input did: with all of the sources from Hardhat, one gets Hardhat's bytecode output.
Next, I iteratively copied sources from the Hardhat input to the Sourcify input to see which added sources cause the change in the bytecode.
Sourcify's initial standard JSON input:
Hardhat's standard JSON input (clipped):
On each step I copied a contract that might be a potential cause of the change, resolved its dependencies by adding them as well, compiled the resulting JSON input, and compared the bytecodes.
Finally, I found a minimal standard JSON diff: the specific sources whose addition changes the bytecode output. These are laid out in the Summary section above.
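A search like this can also be automated. The sketch below is not the manual procedure described above; it swaps in a greedy one-at-a-time reduction that starts from all of the extra Hardhat sources and drops each one whose removal still reproduces the Hardhat bytecode. `minimize_extra_sources` and `compile_fn` are hypothetical: `compile_fn` is assumed to take a `{name: content}` mapping and return the target contract's bytecode, or `None` if compilation fails (for example because a dependency was removed).

```python
from typing import Callable, Dict, Optional

def minimize_extra_sources(
    base: Dict[str, str],
    extras: Dict[str, str],
    compile_fn: Callable[[Dict[str, str]], Optional[str]],
    target_bytecode: str,
) -> Dict[str, str]:
    """Greedily drop extra sources that are not needed to reproduce the bytecode.

    `base` holds the sources that are always present, `extras` the
    candidates; the result is a subset of `extras` that still yields
    `target_bytecode` when compiled together with `base`.
    """
    kept = dict(extras)
    for name in list(extras):
        trial = {k: v for k, v in kept.items() if k != name}
        # Keep the removal only if the target bytecode is still produced.
        if compile_fn({**base, **trial}) == target_bytecode:
            kept = trial
    return kept
```

A single greedy pass does not guarantee the smallest possible set, but it needs only one compilation per candidate source.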
Environment
v0.8.19