way to control how pytest-xdist runs tests in parallel? #175

Closed
pytestbot opened this issue Aug 6, 2012 · 34 comments
Labels
plugin: xdist related to the xdist external plugin type: enhancement new feature or API change, should be merged into features branch

Comments

@pytestbot
Contributor

Originally reported by: Anonymous


Apologies if this has been raised and triaged already. I could not find it in the archives.

The enhancement I request in py.test is the ability to control how tests are executed in parallel. It would be really nice to have a way to run multiple classes/modules in parallel, while having serial execution within a given set of classes/modules.

TestNG (Java) provides similar functionality. Please see:
http://testng.org/doc/documentation-main.html#parallel-running

This was also discussed a while ago on StackOverflow. Please see discussion:
http://stackoverflow.com/questions/4637036/is-there-a-way-to-control-how-pytest-xdist-runs-tests-in-parallel


@pytestbot
Contributor Author

Original comment by Ronny Pfannschmidt (BitBucket: RonnyPfannschmidt, GitHub: RonnyPfannschmidt):


I started work to prepare xdist for such a refactoring.

(the particulars are removing features that are not related to parallel testruns from xdist)

@pytestbot
Contributor Author

Original comment by holger krekel (BitBucket: hpk42, GitHub: hpk42):


To the original poster: you are talking about load-scheduling, to speed up tests, right?

@pytestbot
Contributor Author

Original comment by Denis K. (BitBucket: krya, GitHub: krya):


This would be very useful in the Django plugin as well.
For instance, I need a way to create the database in only one process before the rest are started, and a way to run some tests in a single thread (transaction tests) so they won't fight each other trying to flush the database.

@pytestbot
Contributor Author

Original comment by Robin Dunn (BitBucket: robind42, GitHub: robind42):


Another nice-to-have would be options to set the tests-per-process granularity. In other words, to control when an existing testing process is recycled and when it will be replaced by a new process. I have some tests that should have a new process for each test, or at least a new process per module, just to ensure that the tests are running in a pristine process environment.

@pytestbot
Contributor Author

Original comment by Ronny Pfannschmidt (BitBucket: RonnyPfannschmidt, GitHub: RonnyPfannschmidt):


That's also part of the plans (the idea is to have --boxed use the dist facilities as well; --boxed-mode=module/class is conceivable).

@pytestbot
Contributor Author

Original comment by holger krekel (BitBucket: hpk42, GitHub: hpk42):


@RobinD42 @denis_kasakov if you could try to write documentation on the API and/or command line options you'd like to see, that would be much appreciated. Once we have a clearly documented concept and user stories, the implementation might not be that hard.

@pytestbot
Contributor Author

Original comment by Dmytro Makhno (BitBucket: dmytro_makhno, GitHub: dmytro_makhno):


https://gist.github.com/dmakhno/9970080#file-test_parallel-py

Here is a comparison with nose.

As a first step, it would be nice to have abilities similar to nose's.

@pytest.distribute("split")  # or "shared" or "per_class", see examples below
class MyTest(unittest.TestCase):
    ...

The next step may be to apply the same option to fixtures, taking into account their ability to be distributed.

@pytest.fixture(scope="class", distribute="shared")
def db_class(request): pass

@pytest.fixture(scope="class", distribute="per_class")
def webdriver_class(request): pass

@pytest.fixture(scope="class") # default distribute="split", how it works now
def mockapi_class(request): pass

What do you think?

@pytestbot
Contributor Author

Original comment by Prasanna Santhanam (BitBucket: psanthanam, GitHub: psanthanam):


I've moved away from nose because of its lack of support for this. Was hoping this was not a problem with py.test.

@pytestbot
Contributor Author

Original comment by Dmytro Makhno (BitBucket: dmytro_makhno, GitHub: dmytro_makhno):


@psanthanam,
I'm new to both, but py.test seems more convenient to me because of my other xUnit experience. Comparing them, nose seems to have more options for dealing with this.

To all,

I do need to run setupClass once per class. In my integration tests it is a very heavy and expensive operation, but the rest is designed to be well parallelized.
If anyone can suggest how to achieve this with pytest, I'll be immensely happy.

Maybe there are some cross-multiprocess singleton fixtures? :)

@pytestbot
Contributor Author

Original comment by Ivan Kalinin (BitBucket: PuPSSMaN, GitHub: PuPSSMaN):


So, is there any progress on or consideration of this?

At least, where is the scheduling / distribution logic located in the xdist code so one can have a look at that?

@pytestbot
Contributor Author

Original comment by Ivan Kalinin (BitBucket: PuPSSMaN, GitHub: PuPSSMaN):


Looks like https://bitbucket.org/hpk42/pytest-xdist/src/5ba5bdb77d302b621508bde6fdca37349f7d17d7/xdist/dsession.py?at=default#cl-153 controls the actual distribution. One should certainly be able to subclass LoadScheduling to implement per-module distribution, IMO.

@pytestbot
Contributor Author

Original comment by Bruno Oliveira (BitBucket: nicoddemus, GitHub: nicoddemus):


I've created a PR related to this issue that allows using pytest.mark to mark tests that should be executed serially. Perhaps this covers OP's needs?

@pytestbot
Contributor Author

Original comment by Dmytro Makhno (BitBucket: dmytro_makhno, GitHub: dmytro_makhno):


This is a preliminary result showing how it might look:
https://gist.github.com/polusok/4e71f7e3d3dbf437cc25
It is still very "hacky" and noisy, though.

It covers the case where the shared fixture is a singleton.
Another way might be to keep a singleton fixture and run the tests that depend on it on the same slave.

@pupssman, thanks, I didn't get a chance to dig in. What is the general intention of the *Scheduling classes: how do they distribute tests among nodes, and what are their key activities? (Please treat me as a Python newbie.)

@pytestbot
Contributor Author

Original comment by Pierre-Yves Rofes (BitBucket: piwai, GitHub: piwai):


Hey all,

I think my use case is roughly the same as the original poster's: I have test classes with time-consuming setUp() methods, so to avoid running them on each node I'd like to ensure that every test of the same test class runs on the same node.
Following Ivan's lead to subclass LoadScheduling, I started something here to implement per-testclass distribution: https://bitbucket.org/piwai/pytest-xdist/commits/all

It works, but it's still a proof of concept and needs more work (improvements, documentation, cleanup). Before investing more time, I'd like feedback from the maintainers on whether this has a reasonable chance of getting merged when finished.

Thanks,

@pytestbot
Contributor Author

Original comment by Ivan Kalinin (BitBucket: PuPSSMaN, GitHub: PuPSSMaN):


That's brilliant! Looks like it does the very thing it needs to do, and
does so simply.

BTW, it looks like adding a Scheduling implementation requires a bit of
inner manipulation. Maybe we could, you know, devise a plug-in mechanism
for that?

@pytestbot
Contributor Author

Original comment by Pierre-Yves Rofes (BitBucket: piwai, GitHub: piwai):


OK, I cleaned up the code duplicated from LoadScheduling a bit.
@hpk42, any feedback?

@pytestbot
Contributor Author

Original comment by holger krekel (BitBucket: hpk42, GitHub: hpk42):


@piwai this looks good in principle, but I wonder if specifying "--dist=class" is a good UI, as it would enforce it for all classes. Others might want a per-module split, etc. I suggest we think about how we can enhance the default behaviour, in a "no new option/API" sense, as this brings the benefits to everyone.

The master is currently missing information on which fixtures are in use for a test. With a little work we could probably provide information about the fixtures used for each test id that the master sees. When tests don't share setup/fixtures, there is no need to run them in chunk-per-node. When a session/class/module shared fixture is used by multiple tests, it would prevent splitting them over test nodes. We probably need a way to mark fixtures as irrelevant for this mechanism.

The master would then see lists of maybe (testid, (session-level fixturenames), (module-level fixturenames), (class-level fixturenames)) tuples. The algorithm for using this information can be written in separate functions and thus be nicely unit-tested, as it only needs example inputs and split outputs. But maybe there are some dragons in keeping the scheduling of test items beyond the initial distribution aligned with the fixture/split information.

Hope these thoughts make some sense to you. I am willing to help with sorting this out. Maybe writing some docs and examples first would help to get clarity on the goal.
These changes would certainly be a major enhancement to pytest-xdist.
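
The splitting idea described in this comment could be sketched as a pure function along the following lines. This is only an illustration, not part of pytest-xdist: the function name `split_by_shared_fixtures`, the `ignored` parameter (standing in for "fixtures marked as irrelevant"), and the way the module and class are derived from the test id are all assumptions made for the example.

```python
from collections import defaultdict


def split_by_shared_fixtures(items, ignored=frozenset()):
    """Group test ids so that tests sharing a fixture instance stay together.

    `items` is a list of hypothetical
    (testid, session_fixtures, module_fixtures, class_fixtures) tuples.
    Fixture names listed in `ignored` never tie tests together.
    """
    parent = {}  # union-find forest over test ids

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_user = {}  # fixture key -> first test id seen using it
    for testid, session_fx, module_fx, class_fx in items:
        parent.setdefault(testid, testid)
        # Assumption: ids look like "path.py::Class::test" or "path.py::test".
        module = testid.split("::", 1)[0]
        cls = testid.rsplit("::", 1)[0]
        keys = [("session", name) for name in session_fx]
        keys += [("module", module, name) for name in module_fx]
        keys += [("class", cls, name) for name in class_fx]
        for key in keys:
            if key[-1] in ignored:
                continue
            if key in first_user:
                union(first_user[key], testid)
            else:
                first_user[key] = testid

    chunks = defaultdict(list)
    for testid, *_ in items:
        chunks[find(testid)].append(testid)
    return list(chunks.values())
```

Because the function only maps example inputs to split outputs, it can be unit-tested without running any distributed session. Note that a session-level fixture used by every test collapses everything into one chunk unless it is listed in `ignored`, which is why some opt-out mechanism would be needed.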

@pytestbot
Contributor Author

Original comment by Pierre-Yves Rofes (BitBucket: piwai, GitHub: piwai):


@hpk42 Thanks for the feedback. I agree that "--dist=class" is clearly not the best UI, but it was the simplest for the PoC. What would you think of something like "--dist=load --scheduler=bytest/byclass/bymodule"? "--scheduler=bytest" would be the default behaviour and could be omitted, "byclass" would enable dispatching by class, and so on.

I'm glad that you're willing to help, because I have only looked at dsession.py so far, and I'm still discovering the rest of the code. When you talk about docs and examples, do you mean the 'example' folder in the sources? I can try to come up with some use cases first, before improving the PoC.

@pytestbot
Contributor Author

Original comment by holger krekel (BitBucket: hpk42, GitHub: hpk42):


The problem with a command line option UI is that it applies globally, irrespective of the actual shared fixture structure. This is why I'd rather have pytest-xdist work a little harder to automatically take the fixture structure into account and try to minimize the number of shared setups.

If that turns out to be too hard, then your suggestion makes sense, although the option should maybe be named "--dist-split" or something. Maybe @flub and @bubenkoff can chime in as well, as they have both worked heavily with xdist.

@pytestbot
Contributor Author

Original comment by Anatoly Bubenkov (BitBucket: bubenkoff, GitHub: bubenkoff):


@hpk42
The whole initiative to control how xdist runs the tests seems a bit strange to me: it's more natural for me to concentrate on test isolation than on the test runner, so I'm a bit pessimistic here. But I agree with you that it should be as automatic as possible.

@pytestbot pytestbot added type: enhancement new feature or API change, should be merged into features branch plugin: xdist related to the xdist external plugin labels Jun 15, 2015
@vladlaktionov

It is nice to isolate all of the tests, of course, but it is not always possible. Sometimes you really need some suites, and I think it would be great to have such an option in your pocket. I can't properly use xdist without it...

@pupssman

So, is anything up here?

@nicoddemus
Member

Unfortunately not right now.

I have some ideas and plan to write a proposal to see what people think. I will link it here once it is ready.

@tobixx

tobixx commented Nov 19, 2015

Just want to raise my hand here as well. My use case is a multi-stage system test where I use parametrized fixtures with class scope. In this constellation I'm unable to use xdist at all (because each test needs results from the prior one), but I really need it due to the runtime of these tests.

Btw, the incremental testing example will of course also not work as expected when using xdist.

@RonnyPfannschmidt
Member

This needs a different internal design in xdist. It's in my pipeline, but it won't happen soon.

@nchammas
Contributor

I have a similar use case.

I have a parameterized, module-scoped fixture. I want any given permutation of the fixture to be spun up only once per module, since it is expensive to create in both time and money.

However, I want as many different permutations of the fixture to be spun up in parallel as possible, since that saves the most time and enables the most number of tests to execute simultaneously.

I would welcome any changes to xdist that enable this use case or that offer a viable alternative.

@nchammas
Contributor

nchammas commented Dec 1, 2015

Not sure if I've read the xdist source correctly, but an interim solution that may be easier to offer is some kind of hook that we can use to define how we want to chunk tests up.

Source:

class LoadScheduling:
...
Then the collection gets devided (sic) up in chunks and chunks get submitted to nodes.

Maybe we can get a hook similar to pytest_collection_modifyitems which lets us define a custom chunking of tests to be sent off to test runners.

Does that make sense? Would anyone else find such a hook useful? I'm still getting familiar with pytest's architecture and concepts, but I'm eager to figure out a short-term solution that doesn't get in the way of future changes to xdist.

@RonnyPfannschmidt
Member

Any hook we introduce prematurely has the potential to severely cripple a proper solution.

That, and my current time constraints, mean it's unlikely to happen unless someone tries and can handle maintaining a temporary fork in case we cannot accept their solution for strategic reasons.

@nchammas
Contributor

nchammas commented Dec 1, 2015

No worries, I completely understand. I'll take a look at @heipark's solution here and try to come up with something that works for me. If it seems generally applicable, I'll share it with y'all so you can consider it for inclusion.

The approach I'm looking to take is to offer a hook that people can use to custom define how to chunk their tests up for distribution to test runners. That's it. Options like chunking by class or by module can be implemented on top of this functionality.
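
A minimal sketch of what such key-based chunking could look like, with class and module chunking layered on top. None of these names (`chunk_tests`, `by_class`, `by_module`, `assign_chunks`) exist in pytest or pytest-xdist; they are hypothetical, and the sketch assumes node ids of the form "path.py::Class::test" with no "::" inside parameter ids.

```python
def by_module(nodeid):
    # Everything before the first "::" is the test file path.
    return nodeid.split("::", 1)[0]


def by_class(nodeid):
    # Everything before the last "::": the class for methods,
    # or the module for plain test functions.
    return nodeid.rsplit("::", 1)[0]


def chunk_tests(nodeids, key):
    """Group node ids into chunks; ids with the same key share a chunk."""
    chunks = {}
    for nodeid in nodeids:
        chunks.setdefault(key(nodeid), []).append(nodeid)
    return list(chunks.values())


def assign_chunks(chunks, num_workers):
    """Give each whole chunk to the currently least-loaded worker."""
    workers = [[] for _ in range(num_workers)]
    for chunk in sorted(chunks, key=len, reverse=True):
        min(workers, key=len).extend(chunk)
    return workers


if __name__ == "__main__":
    ids = [
        "a.py::TestA::test_1",
        "a.py::TestA::test_2",
        "a.py::test_f",
        "b.py::TestB::test_1",
    ]
    for worker in assign_chunks(chunk_tests(ids, by_class), 2):
        print(worker)
```

Since a chunk is never split, tests with the same key always land on the same worker, at the cost of less even load balancing when chunks differ greatly in size.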

@RonnyPfannschmidt
Member

There is already a general plan to do this at collection time (so that different nodes collect different files).

The main problem is development time and design time (which are hard to come by for personal reasons).

@nicoddemus
Member

I put up pytest-dev/pytest-xdist#17 with some ideas, would love to get some feedback! 😄

@nicoddemus
Member

Oops, missed @nchammas post! 😅

A hook was an idea discussed before, but it was difficult to design a hook that would cover all the use cases we had at the time. Also, as @RonnyPfannschmidt mentioned, the master node currently only has the items' node ids to work with when deciding how to schedule the tests, so that is a limiting factor too (of course, we could work on that front and change xdist so that slaves provide more information to the master node during collection). But I think it is a good idea and worth pursuing.

The proposal I wrote is meant to allow more fine-grained control over how the tests are run. In my use case, some classes needed all their tests executed in a single slave, while some modules had the same requirement, all in the same test suite.

Anyway, @nchammas feel free to drop a line if you want to discuss this further (here or at pytest-dev/pytest-xdist#17).

@nchammas
Contributor

nchammas commented Dec 1, 2015

No worries @nicoddemus. :) I have changed my position to the one described here. I've commented on your proposal, and I guess we can continue the discussion there.

@RonnyPfannschmidt
Member

This issue is superseded by the already referenced ones in the xdist repo
