Identifiability #3

Closed
dmetivie opened this issue Nov 8, 2024 · 2 comments

dmetivie commented Nov 8, 2024

This issue is more theoretical: in your code you require a minimum number of samples, min_samples = 5 * n_params.
Do you have a reference for this? Or is it a community heuristic?

From "Identifiability of parameters in latent structure models with many observed variables" (E. S. Allman, C. Matias, J. A. Rhodes, 2009), Corollary 5 (for binary categories) and the discussion that follows it (for variables with cat categories), the identifiability condition is, using your notation:

n_item ≥ 2 ceil(log(cat, class)) + 1

Note that this result assumes the same number of categories cat for every item. However, I guess that using minimum(cat) should give the worst-case bound.
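
As a quick sanity check of the bound, here is a worked instance. The symbols c (for cat) and r (for class) are shorthand introduced only for this illustration. Because log_c(r) decreases as c grows, the smallest category count gives the largest, and therefore most conservative, requirement.

```latex
\begin{align*}
n_{\text{item}} &\ge 2\,\lceil \log_{c} r \rceil + 1 \\
% binary items, three classes
c = 2,\ r = 3: \quad & 2\,\lceil \log_{2} 3 \rceil + 1 = 2 \cdot 2 + 1 = 5 \\
% items with category counts (2, 3, 4), four classes: use c = \min(2, 3, 4) = 2
c = 2,\ r = 4: \quad & 2\,\lceil \log_{2} 4 \rceil + 1 = 2 \cdot 2 + 1 = 5 \\
% the same four classes evaluated with the largest count instead gives a weaker bound
c = 4,\ r = 4: \quad & 2\,\lceil \log_{4} 4 \rceil + 1 = 2 \cdot 1 + 1 = 3
\end{align*}
```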

Maybe it would be cool to actually enforce this in the package? It is cool when math and code come together!

@yanwenwang24
Owner

Many thanks for raising this issue (and others) ❤️️! I will implement this change in the following days.

Just a quick note here that I will read through the notebook and respond to other issues you raised. But it may take a while. I will reach out if any help is needed. Many thanks for your kind suggestions!

@yanwenwang24
Owner

@dmetivie I have added the identifiability test, which emits a warning if:

n_items < 2 * ceil(log(minimum(n_categories), n_classes)) + 1
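
For readers who want to try the condition outside the package, here is a minimal Python sketch of such a check. It is not the package's implementation: the function names (`ceil_log`, `min_items_required`, `check_identifiability`) and the warning text are hypothetical, and the number of items is assumed to equal the length of `n_categories`.

```python
import warnings


def ceil_log(base: int, x: int) -> int:
    """Return the smallest integer k such that base**k >= x.

    Uses exact integer arithmetic instead of ceil(log(x) / log(base)).
    """
    if base < 2 or x < 1:
        raise ValueError("need base >= 2 and x >= 1")
    k, power = 0, 1
    while power < x:
        power *= base
        k += 1
    return k


def min_items_required(n_categories: list[int], n_classes: int) -> int:
    """Bound 2 * ceil(log(minimum(n_categories), n_classes)) + 1 from the thread."""
    return 2 * ceil_log(min(n_categories), n_classes) + 1


def check_identifiability(n_categories: list[int], n_classes: int) -> bool:
    """Warn and return False if there are fewer items than the bound requires."""
    n_items = len(n_categories)  # assumption: one entry per item
    required = min_items_required(n_categories, n_classes)
    if n_items < required:
        warnings.warn(
            f"{n_items} items, but the condition asks for at least "
            f"{required} with {n_classes} classes.",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True
```

The integer loop is a deliberate choice: a floating-point logarithm can land slightly above an exact integer, in which case `ceil` would overshoot by one. Calling `check_identifiability([2, 2, 2, 2], 3)` would warn, since four binary items fall short of the five the bound asks for with three classes.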

The minimum sample size is still an open question. From what I know, some suggest 300 (Nylund-Gibson & Choi, 2018), but this largely depends on the nature of the data as well. Small samples are also considered acceptable if a simulation shows adequate power (Muthen & Muthen, 2002). So for now, I have removed the sample size test.

Many thanks 🌹️

yanwenwang24 self-assigned this Nov 12, 2024