Efficiency of surrogates for higher dim #106
Are you running
? (which implicitly runs on fsphere, btw). First try a higher initial sigma, such as:
which does the trick for me here, and beats the algorithm without surrogates with the same initial sigma. I guess you have already looked at it, but just in case, see https://github.com/beniz/libcmaes/wiki/Using-Surrogates-for-expensive-objective-functions as there's a dedicated Python script that, in addition to the regular useful values, plots both the train and test errors of the surrogate. Also, look at the additional options with
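On picking that higher initial sigma: a widely quoted CMA-ES rule of thumb (not stated in this thread, so treat it as a general guideline rather than libcmaes advice) is to set the initial step-size to roughly 0.3 times the per-coordinate search range. A minimal sketch:

```python
def initial_sigma(lower, upper, fraction=0.3):
    """Common CMA-ES rule of thumb: choose the initial step-size as a
    sizeable fraction (~0.3) of the per-coordinate search range
    [lower, upper]. The 0.3 factor is a guideline, not a libcmaes API."""
    return fraction * (upper - lower)

# e.g. variables expected to lie in [-5, 5]:
sigma0 = initial_sigma(-5.0, 5.0)
```

A sigma0 chosen this way usually avoids the "step-size far too small" pathology discussed further down in this thread.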
A potentially useful flag is
Finally, to my knowledge, the science of surrogates with CMA-ES is still pretty young. The most informed person I know of is @loshchil. He has studied related techniques that are not yet available in the lib, such as #76.
@nikohansen there's the obvious question here of TPA + surrogates :)
What exactly does
It looks like a problem connected to step-size control though, because we see standard deviations and eigenvalues increase systematically and far above 1. This suggests that the step-size is (far) too small, which should not happen under normal circumstances.
As far as I understand, the problem is in the library, isn't it? If not, is there anything a user can do about controlling the step-size?
Agreed, the user can't really do anything to improve step-size control. What we see is probably a bug in the library, or a (subtle) problem from the coupling of surrogate and CMA.
I wouldn't rule out a bug indeed. It will take me some time to re-run comparisons against the little witness runs I have from publications. For some reason, the cluster I used to rely on is now significantly slower than my laptop, and I am trying to work around it.
Just an update on this issue: I do confirm this is due to several intertwined bugs. I've put up a long fight and they should be fixed now, along with better default settings for a few parameters. FYI, what I would now consider to have been the main bug was messing up the final ranking of candidates before the update. I plan to update the original ticket #57 with new reports from runs vs the published literature, along with commits. The code still has a few rough edges, so I will mention it here once it is ready to use. Here is the fixed run with active-CMA-ES and surrogate in 30-D on fsphere:
Great news! Looking forward to testing the new version.
Fix has been pushed,
now yields the desired behavior. Thanks for reporting the initial problem; it did indeed lead to very necessary bug fixes and improvements.
FYI, not all runs appear to converge equally well, and this is something I am investigating, along with other improvements.
Thanks for the information. I have not tested it yet. Please let me know when you're done, and I will update and run it here.
Well, in fact there was no problem, just an old version of the program in the path on my laptop. On the bright side, this had me check on three different machines and compilers, with many runs at 100% success, so definitely a false alarm!
When I do make install it seems the following files are missing in the $PREFIX: surrogates/rankingsvm.hpp
Try using tests/cma_multiplt_surr.py to plot the results. This is required to plot the surrogate data output.
First, can you confirm that the previous example with fsphere and acmaes does work properly? (Just making sure the proper version of the code is running, to get this out of the way.) My understanding is that you are pioneering surrogates with CMA on 'unknown' (i.e. out-of-benchmark) functions, on top of any potential bugs. From there, and with the graph from the right Python script above, we'll be able to move forward.
somewhat unfortunate user interface ;-)
I have seen this before. Surrogates are no guarantee of an improvement, but the shown results are not (yet) conclusive. Just to cross-check: are you sure the true function values are shown in the surrogate case? Commenting on the results without surrogate:
Thanks for your help. I will redo all the tests in a more systematic way, but maybe solving this installation problem first is a good idea, because it is a potential cause of problems for a user: I had to copy these files manually and could have made a mistake.
This has been fixed. It is recommended you use
in between two versions.
Certainly not, this is due to not using the dedicated script (https://github.com/beniz/libcmaes/wiki/Using-Surrogates-for-expensive-objective-functions). I will work on making a single script instead, but basically the default output format differs depending on whether the surrogate is active or not. For now, running the correct script on the existing output should generate a proper plot.
btw, well noted; part of #91, still a lot to do.
Re annotations: one relatively simple way to get all the goodies from the existing plotting routines would be to convert the output file to a format the existing plotting routines can read (one file for each subfigure). This should be relatively straightforward. The first two rows look like this (the remaining rows contain just more data): File
File
File
File
File
The first two columns are always iteration and evaluations, and the variable-length data starts from the 5th column.
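The conversion described above can be sketched as follows. This is only a hedged outline: it assumes whitespace-separated data lines with iteration and evaluations in the first two columns and variable-length data from the 5th column on, as stated above; the exact file names and the meaning of columns 3-4 are not specified in the thread.

```python
# Sketch: split each data line of the surrogate output into its fixed
# head (columns 1-4: iteration, evaluations, plus two unspecified
# single-value columns) and its variable-length tail (column 5 onward),
# as a first step toward writing one file per plotted quantity.
def split_output(lines):
    """Return (fixed, variable): per-line fixed columns 1-4 and the
    variable-length remainder, skipping lines that are too short."""
    fixed, variable = [], []
    for line in lines:
        cols = line.split()
        if len(cols) < 5:
            continue  # header or malformed line
        fixed.append(cols[:4])
        variable.append(cols[4:])
    return fixed, variable

# Synthetic example rows (not real libcmaes output):
rows = ["1 10 0.5 1.2 0.1 0.2 0.3",
        "2 20 0.4 1.1 0.1 0.2 0.3 0.4"]
fixed, variable = split_output(rows)
```

From `fixed` and `variable` one would then write out one file per subfigure in whatever layout the stock plotting routines expect.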
@agrayver OK, so the last fix does the trick; results are now on par with the literature. Sorry for the inconvenience, this was a nasty and well-hidden one. It should now both run faster and perform better.
@nikohansen good idea, let's work on this as #110
I cannot compile the freshly installed RSVM-related code. From surrcmaes.h these includes do not exist:
changing to
works, however.
@agrayver yes, it does. Down to 1e-8, I get 5565 f-evals without surrogates, and 2815 with surrogate exploitation.
@agrayver below are the results for Rosenbrock (that I still need to add to #57), average fevals over 10 runs with std deviation and number of successful runs:
This uses the tests/surr_test exe, and the default parameters are not the same as in the sample code.
@beniz would you mind please elaborating on which parameters you think need to be tweaked for the sample code to be more efficient?
@agrayver try setting the same x0 on the runs you want to compare, and for Rosenbrock, you can use (EDIT: 70 instead of 40)
Typically, compare (no exploitation)
to (exploitation)
Thank you for the help. Using -l ~70 * sqrt(N) seems to work. I see a significant reduction (1.5-3x) in function calls for different functions and dimensions (I still have not tested my real app). At the moment I am testing the 100D Rosenbrock and it is taking a very long time. It's been running for more than 5 hours on my quite modern laptop and is still far from finishing. I'm hence wondering:
Try lowering the number of iterations of the RSVM algo, 'rsvm_iter', to something in between 150000 and 1M, the latter being the current, very conservative, default.
Roughly speaking, O(niter * l) for the algorithm, with l ~ 70*sqrt(N), and O(N * l^2) for the kernel computation. For thorough details, see https://www.lri.fr/~ilya/phd.html page 81 (section 4.1.1)
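The two cost terms above can be combined into a back-of-the-envelope operation count. This is only an illustrative model of the scaling stated in this thread (constants and lower-order terms ignored; `surrogate_cost` is a hypothetical helper, not a libcmaes function):

```python
import math

def surrogate_cost(N, rsvm_iter):
    """Rough operation-count model from the discussion above:
    RSVM training scales like O(rsvm_iter * l) and the kernel
    computation like O(N * l**2), with training-set size
    l ~ 70*sqrt(N). Constant factors are ignored."""
    l = int(70 * math.sqrt(N))
    return rsvm_iter * l + N * l * l

# Lowering rsvm_iter from the conservative 1M default cuts the
# dominant training term proportionally (here for N = 100, l = 700):
cost_default = surrogate_cost(100, 1_000_000)
cost_reduced = surrogate_cost(100, 200_000)
```

For the 100-D case discussed here, the training term dominates, which is why lowering rsvm_iter is the suggested first lever.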
Thank you, I got reasonable performance with the testing function. Now, switching to my actual problem, I still see that surrogates deteriorate convergence. Here are the results with no exploitation, 40D: Here RSVM was used (with -l 500 and rsvm_iter 200000): I used the same starting guess and seed for both.
At first glance I'd suggest you try increasing rsvm_iter and see if it at least improves convergence.
It looks like the initial step-size is at least a factor of 1000 (in the previous plot, a factor of 1e5) too small, and the mechanism that should prevent a single eigenvalue from massively increasing doesn't work here. The second point should be checked in the lib. Addition: it can be useful to use a small initial step-size to check where locally the best solution will be found. In this case I would ideally expect an initial increase by a factor of ten or so.
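The diagnostic used here, reading how far the step-size climbs above its initial value off the plot, can be stated as a tiny check. A sketch under the assumptions of this thread (a growth factor around ten is fine, orders of magnitude signal a too-small sigma0; `sigma_growth_factor` is an illustrative helper, not part of libcmaes):

```python
def sigma_growth_factor(sigma_history):
    """Ratio of the peak step-size to the initial one over a run.
    A factor near ~10 is expected when starting deliberately small;
    a factor of 1e3-1e5, as in the plots above, hints that sigma0
    was chosen (far) too small."""
    return max(sigma_history) / sigma_history[0]

# e.g. a run whose step-size climbed from 1e-3 up to 1.0:
factor = sigma_growth_factor([1e-3, 1e-2, 0.3, 1.0])
```

Applied to a logged sigma trace, this gives a quick first check before suspecting the surrogate coupling itself.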
This is not entirely clear to me, could you give me more details? Thanks!
Sorry, yes, this is the factor
Closing for no activity after 15 days. Can be reopened as needed.
…alues + rank available in candidate object, ref CMA-ES#57, CMA-ES#106
I took examples/surrogates/sample-code-surrogate-rsvm.cc and set dim=30, alg=acmaes, and a fixed seed=12345. Then, running with set_exploit(true) and set_exploit(false), one sees (figure attached) that exploitation not only does not help, but essentially breaks convergence. For low dimensions, e.g. dim=5-15, it works well, however. Any idea how to make it efficient for higher dim?