Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[move] Move out tag cache (safe speculation) #15676

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

georgemitenkov
Copy link
Contributor

@georgemitenkov georgemitenkov commented Jan 6, 2025

Description

This PR moves type tag cache out of the other type cache because it is conceptually different: type tags can be cached speculatively, same as indexed struct names. As a result:

  1. Type tag cache is implemented with tests and added to RuntimeEnvironment.
  2. V2 loader uses a new TypeTagBuilder API to encapsulate conversions from types to tags. Under the hood, it uses a pseudo gas meter. Tests
  3. V1 loader still uses the old path, so the code is versioned.

How Has This Been Tested?

  • Tests for type tag cache.
  • Tests for tag metering.

Key Areas to Review

Type of Change

  • Refactoring

Which Components or Systems Does This Change Impact?

  • Move/Aptos Virtual Machine

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Jan 6, 2025

⏱️ 48m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-tests 13m 🟩
rust-move-tests 13m 🟩
check-dynamic-deps 8m 🟩🟩🟩🟩
rust-cargo-deny 7m 🟩🟩🟩🟩
rust-move-tests 3m
general-lints 2m 🟩🟩🟩🟩
semgrep/ci 1m 🟩🟩🟩🟩
file_change_determinator 46s 🟩🟩🟩🟩
permission-check 12s 🟩🟩🟩🟩
permission-check 10s 🟩🟩🟩🟩
check-branch-prefix 1s 🟩

settingsfeedbackdocs ⋅ learn more about trunk.io

Copy link
Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link
Contributor

@gelash gelash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits.
Where is the logic for generating ty_tags currently? (for V1?)

@@ -1671,15 +1672,15 @@ pub const VALUE_DEPTH_MAX: u64 = 128;
/// fields for struct types.
const MAX_TYPE_TO_LAYOUT_NODES: u64 = 256;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, random thought but shouldn't such constants be defined in some more centralized place.. there should be something already, maybe @runtian-zhou ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should put them in VM config or even gas schedule. I will moving this code around for lazy loading and V2 loader layout cache, so we can do it then.

pub(crate) struct PseudoGasContext {
pub(crate) max_cost: u64,
pub(crate) cost: u64,
pub(crate) cost_base: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should really better be base_cost, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it aligns with cost_per_byte? So we have cost_... everywhere, but I have no preference here.

let mut gas_context = PseudoGasContext {
cost: 0,
max_cost: loader.vm_config().type_max_cost,
cost_base: loader.vm_config().type_base_cost,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does VM config check max_cost >= base? it would be a nice check to add to some places (here or config or ty_tag_cache)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a constructor for the gas context (cost is 0 initialized, hence, we cannot have < base), and made fields private.

/// Converts a runtime type into a type tag. If the type is too complex (e.g., struct name size
/// too large, or type too deeply nested), an error is returned.
pub(crate) fn ty_to_ty_tag(&self, ty: &Type) -> PartialVMResult<TypeTag> {
let mut gas_context = PseudoGasContext {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should create another constructor for PseudoGasContext (there is one straight from config).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly! Did that

// - maximum allowed cost,
// - base cost for any type to tag conversion,
// - cost for size of a struct tag.
max_cost: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we encapsulate these params and re-use them across here and PseudoGasContext? I.e. pseudogascontext can have a CoW of the params (then when we create context in calls below we can take a reference of params)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a constructor into gas context, now encapsulated.

/// If thread 3 reads the tag of this enum, the read result is always **deterministic** for the
/// fixed type parameters used by thread 3.
pub(crate) struct TypeTagCache {
cache: RwLock<HashMap<StructNameIndex, HashMap<Vec<Type>, PricedStructTag>>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a two-layer cache and not just key (StructNameIndex, Vec)?
Say if we used DashMap it could be for sharding on the first key, but here I don't see a difference other than having to chain calls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dashmap cannot work because if you have (&idx, &ty_args), you cannot construct &(idx, ty_args)? What we can do is use hashbrown::HashMap which for get() requires key Q: Equivalent<K> of an actual K, so we can get by both (&idx, &ty_args) and &(idx, ty_args)?

gas_context.charge(size * gas_context.cost_per_byte)?;

// Cache the struct tag. Record its gas cost as well.
let priced_tag = PricedStructTag {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to let the max cost go to max cost + cur_cost temporarily, so we can cache the struct and not discard work, but then we can still return the error if it went above max cost.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vgao1996 interesting question in general I think.

/// # Speculative execution safety
///
/// A struct name corresponds to a unique [StructNameIndex]. So all non-generic structs with same
/// names have the same struct tags. If structs are generic, the number of type parameters cannot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a missing piece here is that types also can't change, right?
and maybe worth saying what the tag depends on as an abstract function, and that arguing those don't change (or when they change, e.g. enums, below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tags do not change, correct, because they depend on names -> if we have an index and if we have a name, tag is uniquely defined by these two + type arguments, which are types which also cannot changed because those contain indices.


// Calculate and charge the cost for the struct tag, proportionally to its size.
let size =
(struct_tag.address.len() + struct_tag.module.len() + struct_tag.name.len()) as u64;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, we must be paying so much efficiency in resource groups using these tags as keys @igor-aptos 😢

})
}

fn struct_name_to_ty_tag(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically it's struct name idx, so we could either call it struct_to_ty_tag, or struct_name_idx_to_ty_tag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed!

@georgemitenkov
Copy link
Contributor Author

@gelash V1 loader defines it in its huge mod.rs here:

@georgemitenkov georgemitenkov force-pushed the george/struct-tag-cache branch from 5102271 to a4b33f5 Compare January 11, 2025 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants