Error when running grnboost2 : ValueError: tuple is not allowed for map key #147

Closed

smanne07 opened this issue Mar 5, 2020 · 23 comments
@smanne07

smanne07 commented Mar 5, 2020

Hi,
Thank you for the pySCENIC package (I previously used SCENIC in R).

I am trying to run pySCENIC following the tutorial with the test data (`pySCENIC - Full pipeline.ipynb`).

When I run

```python
adjacencies = grnboost2(expression_data=ex_matrix, tf_names=tf_names, verbose=True)
```

I get the following error:

```
preparing dask client
parsing input
/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/arboreto/algo.py:214: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
  expression_matrix = expression_data.as_matrix()
creating dask graph
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/protocol/core.py", line 108, in loads
    header = msgpack.loads(header, use_list=False, **msgpack_opts)
  File "msgpack/_unpacker.pyx", line 195, in msgpack._cmsgpack.unpackb
ValueError: tuple is not allowed for map key
distributed.core - ERROR - tuple is not allowed for map key
Traceback (most recent call last):
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/core.py", line 347, in handle_comm
    msg = yield comm.read()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/comm/tcp.py", line 218, in read
    frames, deserialize=self.deserialize, deserializers=deserializers
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 209, in wrapper
    yielded = next(result)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/comm/utils.py", line 85, in from_frames
    res = _from_frames()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/comm/utils.py", line 71, in _from_frames
    frames, deserialize=deserialize, deserializers=deserializers
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/protocol/core.py", line 108, in loads
    header = msgpack.loads(header, use_list=False, **msgpack_opts)
  File "msgpack/_unpacker.pyx", line 195, in msgpack._cmsgpack.unpackb
ValueError: tuple is not allowed for map key
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/protocol/core.py", line 108, in loads
    header = msgpack.loads(header, use_list=False, **msgpack_opts)
  File "msgpack/_unpacker.pyx", line 195, in msgpack._cmsgpack.unpackb
ValueError: tuple is not allowed for map key
shutting down client and local cluster
finished
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/arboreto/algo.py", line 41, in grnboost2
    early_stop_window_length=early_stop_window_length, limit=limit, seed=seed, verbose=verbose)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/arboreto/algo.py", line 128, in diy
    seed=seed)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/arboreto/core.py", line 403, in create_graph
    future_tf_matrix = client.scatter(tf_matrix, broadcast=True)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/client.py", line 2071, in scatter
    hash=hash,
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/client.py", line 753, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/utils.py", line 331, in sync
    six.reraise(*error[0])
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/utils.py", line 316, in f
    result[0] = yield future
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/client.py", line 1916, in _scatter
    timeout=timeout,
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/core.py", line 739, in send_recv_from_rpc
    result = yield send_recv(comm=comm, op=key, **kwargs)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/core.py", line 533, in send_recv
    response = yield comm.read(deserializers=deserializers)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/comm/tcp.py", line 218, in read
    frames, deserialize=self.deserialize, deserializers=deserializers
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 735, in run
    value = future.result()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/tornado/gen.py", line 209, in wrapper
    yielded = next(result)
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/comm/utils.py", line 85, in from_frames
    res = _from_frames()
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/comm/utils.py", line 71, in _from_frames
    frames, deserialize=deserialize, deserializers=deserializers
  File "/home/sasim/my_python-3.6.3/lib/python3.6/site-packages/distributed/protocol/core.py", line 108, in loads
    header = msgpack.loads(header, use_list=False, **msgpack_opts)
  File "msgpack/_unpacker.pyx", line 195, in msgpack._cmsgpack.unpackb
ValueError: tuple is not allowed for map key
```
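A quick first step when debugging this kind of deserialization failure is to confirm exactly which versions of the packages involved are installed. A small stdlib-only helper along these lines (illustrative, not part of pySCENIC or Dask; it needs Python 3.8+ for `importlib.metadata`, and on older interpreters `pip freeze` gives the same information) can be run in the affected environment:

```python
from importlib import metadata

def report_versions(packages=("dask", "distributed", "msgpack", "tornado", "pyscenic")):
    """Return a dict mapping each package name to its installed version string."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "not installed"
    return versions

if __name__ == "__main__":
    for pkg, ver in report_versions().items():
        print(f"{pkg}=={ver}")
```

Comparing this output against the version combinations reported later in the thread is usually enough to spot the mismatch.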

Any help is greatly appreciated.

Best regards
Sasi

@smanne07 smanne07 changed the title from "Error" to "Error when running grnboost2 : ValueError: tuple is not allowed for map key" on Mar 5, 2020
@nikostrasan

Hello, I am having exactly the same issue when running pySCENIC in `-m genie3` mode, either locally or on an HPC cluster. I think there is a compatibility issue between the msgpack and dask versions. Could you please have a look at it, or give any advice on how we can troubleshoot this?

Many thanks
Nikos

@smanne07
Author

smanne07 commented Mar 7, 2020

Reinstalling/upgrading the `distributed` package helped.

@smanne07 smanne07 closed this as completed Mar 7, 2020
@ms-balzer

> Reinstalling/Upgrading the distribution package helped.

Unfortunately, this didn't work for me; I am also a Penn HPC user.

@nikostrasan

Unfortunately, not for me either.
I created a new conda environment and installed from there, but I am still having the same issue. Some people suggested upgrading msgpack to the latest version to overcome this ( dask/distributed#3494 ). However, that creates an incompatibility with pySCENIC, which doesn't run with the updated msgpack version. It would be much appreciated if you could have a look. Many thanks

@cflerin
Contributor

cflerin commented Mar 11, 2020

Does downgrading dask/distributed help? I've found this to be most stable when using `dask==1.0.0` and `distributed==1.28.1` (and possibly `tornado==6.0.3`).
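One way to try these pins cleanly is a fresh virtual environment, so the downgrades don't fight packages already installed. A sketch, assuming a POSIX shell and a suitable `python` on the PATH; the environment name is arbitrary:

```shell
python -m venv pyscenic-env           # create an isolated environment
. pyscenic-env/bin/activate           # activate it
pip install pyscenic                  # install pySCENIC first
pip install dask==1.0.0 distributed==1.28.1 tornado==6.0.3   # then pin the Dask stack
```

Testing in isolation like this makes it easier to tell whether the pins themselves fix the error or some other package in the base environment is interfering.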

@cflerin cflerin reopened this Mar 11, 2020
@smanne07
Author

I used the following installation steps:

```
pip install pyscenic
```

Got errors running grnboost2 as above, then did the following installs/upgrades:

```
python -m pip install 'fsspec>=0.3.3'
python -m pip install dask[dataframe] --upgrade
pip install distributed -U
```

which fixed the issue.

Python 3.6.3 on an IBM LSF HPC cluster:

```
dask==2.11.0
pyscenic==0.10.0
distributed==2.11.0
```

@ms-balzer

This worked for me too!

@zehualilab

Yeah! Even though pySCENIC nominally requires dask==1.0.0 and distributed<2.0.0, @smanne07's answer worked for me too!

@JarningGau

This also works for me. Thanks a lot!!

@Landau1994

I am curious why this works, and why the default requirements don't?

@CadenZhao

@smanne07 's answer also works for me.
The solution in the FAQ doesn't work for me.

@huytran38

> Does downgrading Dask/distributed help? I've found this to be most stable when using dask==1.0.0 and distributed==1.28.1 (and possibly tornado==6.0.3)

@cflerin, would you mind providing the commands to downgrade Dask/distributed/tornado?

@cflerin
Contributor

cflerin commented Aug 27, 2020

@huytran38 , you would run:

```
pip install dask==1.0.0 distributed==1.28.1 tornado==6.0.3
```

to downgrade. But see also #163 for other suggestions (including the suggestion to upgrade above).

@huytran38

@cflerin, thank you! Would it make any difference if I ran the command with conda instead?

```
conda install dask==1.0.0 distributed==1.28.1 tornado==6.0.3
```

@huytran38

@cflerin , so I downgraded and still have the error. Below is most of the error message I see on screen. If upgrading doesn't help either, should I just uninstall Anaconda and reinstall?


```
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\protocol\core.py", line 108, in loads
    header = msgpack.loads(header, use_list=False, **msgpack_opts)
  File "msgpack/_unpacker.pyx", line 195, in msgpack._cmsgpack.unpackb
ValueError: tuple is not allowed for map key
distributed.core - ERROR - tuple is not allowed for map key
Traceback (most recent call last):
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\core.py", line 457, in handle_stream
    msgs = yield comm.read()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\comm\tcp.py", line 218, in read
    frames, deserialize=self.deserialize, deserializers=deserializers
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\comm\utils.py", line 85, in from_frames
    res = _from_frames()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\comm\utils.py", line 71, in _from_frames
    frames, deserialize=deserialize, deserializers=deserializers
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\protocol\core.py", line 108, in loads
    header = msgpack.loads(header, use_list=False, **msgpack_opts)
  File "msgpack/_unpacker.pyx", line 195, in msgpack._cmsgpack.unpackb
ValueError: tuple is not allowed for map key
distributed.core - ERROR - tuple is not allowed for map key
Traceback (most recent call last):
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\core.py", line 412, in handle_comm
    result = yield result
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 748, in run
    yielded = self.gen.send(value)
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\scheduler.py", line 2224, in add_client
    yield self.handle_stream(comm=comm, extra={"client": client})
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\core.py", line 457, in handle_stream
    msgs = yield comm.read()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 742, in run
    yielded = self.gen.throw(*exc_info)  # type: ignore
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\comm\tcp.py", line 218, in read
    frames, deserialize=self.deserialize, deserializers=deserializers
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 735, in run
    value = future.result()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\tornado\gen.py", line 209, in wrapper
    yielded = next(result)
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\comm\utils.py", line 85, in from_frames
    res = _from_frames()
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\comm\utils.py", line 71, in _from_frames
    frames, deserialize=deserialize, deserializers=deserializers
  File "C:\Users\ht389\AppData\Local\Continuum\anaconda3\lib\site-packages\distributed\protocol\core.py", line 108, in loads
    header = msgpack.loads(header, use_list=False, **msgpack_opts)
  File "msgpack/_unpacker.pyx", line 195, in msgpack._cmsgpack.unpackb
ValueError: tuple is not allowed for map key
```

@cflerin
Contributor

cflerin commented Aug 28, 2020

@huytran38 , you can use `pip install` within a conda environment, so the original command will work.

But for your underlying problem, I would highly recommend using the multiprocessing script unless you really need Dask to split your run across multiple compute nodes. See here for details.
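For context, the multiprocessing route replaces the in-process Dask client with pySCENIC's command-line interface. An invocation might look roughly like this (the file names are placeholders, and the exact flags should be checked against the pySCENIC documentation for your installed version):

```shell
pyscenic grn expression_matrix.loom tf_names.txt -o adjacencies.csv --num_workers 8
```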

@huytran38

@cflerin , thank you so much. I'm not familiar with pySCENIC. I'm using Dask because of its ability to scale computation for a large dataframe, about 16 GB. Even if I wanted to try the multiprocessing script, the usage instructions on the pySCENIC page are still a bit vague to me.

@saeedfc

saeedfc commented Sep 14, 2020

Hi,
Update: The script has now been running for 5 days without throwing any errors, so I was guessing it is working. However, in my experience, data of the size I am running should only take 48 hours or so, so something is still strange. I will wait a bit more, then interrupt and try running on a smaller subset.

Just adding this here in case it helps anyone!

I first tried the following:

```
pip install pyscenic
pip install dask==1.0.0
pip install distributed==1.28.1
pip install tornado==6.0.3
```

and got the following error:

```
ValueError: tuple is not allowed for map key
```

So I did as @smanne07 said (explicitly pinning pyscenic to 0.10.0):

```
pip install pyscenic==0.10.0
python -m pip install 'fsspec>=0.3.3'
python -m pip install dask[dataframe] --upgrade
pip install distributed -U
```

Now it seems to be working. Below are the versions at the moment:

```
pyscenic==0.10.0
dask==2.11.0
distributed==2.11.0
```

@saeedfc

saeedfc commented Sep 21, 2020

Hi All,

So the larger dataset took an unusually long time (5-6 days) given the resources and my previous experience with pySCENIC, and it still had not finished since I tried the versions mentioned above. So I interrupted it and ran a much smaller dataset, where GRNboost2 seemed to work fine.

So I am not sure whether anything is going wrong with the large dataset, or whether I simply have to wait much longer (I have analysed similarly sized data with pySCENIC before in around 48-72 hours).

Below is the output from the smaller dataset, just to get your feedback on the warnings shown there and to see whether that helps solve the large-dataset issue.

I humbly suggest adding a way to track progress, as James suggested in
#217 (comment)

```python
if __name__ == '__main__':
    DATA_FOLDER = '/mnt/DATA1/Full Scale Analysis/SCENIC/Myeloid Cells'
    RESOURCES_FOLDER ='/mnt/DATA1/Human Integration and Clustering/COVID/Epithelial Cells/SCENIC/RESOURCES_FOLDER'
    DATABASES_GLOB = os.path.join(RESOURCES_FOLDER, "hg38*.mc9nr.feather")
    MOTIF_ANNOTATIONS_FNAME = os.path.join(RESOURCES_FOLDER, "motifs-v9-nr.hgnc-m0.001-o0.0.tbl")
    MM_TFS_FNAME = os.path.join(RESOURCES_FOLDER, 'TFs.txt')
    REGULONS_FNAME = os.path.join(DATA_FOLDER, "Regulons_Myeloid.p")
    MOTIFS_FNAME = os.path.join(DATA_FOLDER, "Regulons_motifs_Myeloid.csv")
    ex_matrix = pd.read_csv("/mnt/DATA1/Full Scale Analysis/SCENIC/Myeloid Cells/myeloid_expression_test.csv", sep = ",", header=0, index_col=0)
    ex_matrix.shape
    tf_names = load_tf_names(MM_TFS_FNAME)
    db_fnames = glob.glob(DATABASES_GLOB)

    def name(fname):
        return os.path.basename(fname).split(".")[0]
    dbs = [RankingDatabase(fname=fname, name=name(fname)) for fname in db_fnames]
    dbs
    adjacencies = grnboost2(ex_matrix, tf_names=tf_names, verbose=True, seed = 777)
    adjacencies.to_csv("/mnt/DATA1/Full Scale Analysis/SCENIC/Myeloid Cells/Adjacencies_Myeloid_test.csv", index = False, sep = '\t')
```
```
preparing dask client
parsing input
creating dask graph
6 partitions
computing dask graph
shutting down client and local cluster
tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOLoop object at 0x7fa117503b00>>, <Task finished coro=<Worker.heartbeat() done, defined at /home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/worker.py:883> exception=CommClosedError('in <closed TCP>: Stream is closed')>)
Traceback (most recent call last):
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/comm/tcp.py", line 190, in read
    lengths = await stream.read_bytes(8 * n_frames)
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/tornado/iostream.py", line 436, in read_bytes
    future = self._start_read()
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/tornado/iostream.py", line 797, in _start_read
    self._check_closed()  # Before reading, check that stream is not closed.
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/tornado/iostream.py", line 1009, in _check_closed
    raise StreamClosedError(real_error=self.error)
tornado.iostream.StreamClosedError: Stream is closed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/tornado/ioloop.py", line 743, in _run_callback
    ret = callback()
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/tornado/ioloop.py", line 767, in _discard_future_result
    future.result()
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/worker.py", line 920, in heartbeat
    raise e
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/worker.py", line 893, in heartbeat
    metrics=await self.get_metrics(),
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/utils_comm.py", line 391, in retry_operation
    operation=operation,
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/utils_comm.py", line 379, in retry
    return await coro()
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/core.py", line 757, in send_recv_from_rpc
    result = await send_recv(comm=comm, op=key, **kwargs)
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/core.py", line 540, in send_recv
    response = await comm.read(deserializers=deserializers)
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/comm/tcp.py", line 208, in read
    convert_stream_closed_error(self, e)
  File "/home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages/distributed/comm/tcp.py", line 123, in convert_stream_closed_error
    raise CommClosedError("in %s: %s" % (obj, exc))
distributed.comm.core.CommClosedError: in <closed TCP>: Stream is closed
finished.
(base) u0119129@gbw-d-l0099:~$ pip show dask
Name: dask
Version: 2.11.0
Summary: Parallel PyData with Task Scheduling
Home-page: https://github.com/dask/dask/
Author: None
Author-email: None
License: BSD
Location: /home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages
Requires:
Required-by: vaex-core, pyscenic, distributed, arboreto
(base) u0119129@gbw-d-l0099:~$ pip show distributed
Name: distributed
Version: 2.11.0
Summary: Distributed scheduler for Dask
Home-page: https://distributed.dask.org
Author: None
Author-email: None
License: BSD
Location: /home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages
Requires: dask, pyyaml, zict, click, psutil, sortedcontainers, setuptools, cloudpickle, msgpack, toolz, tblib, tornado
Required-by: pyscenic, arboreto
(base) u0119129@gbw-d-l0099:~$ pip show pyscenic
Name: pyscenic
Version: 0.10.0
Summary: Python implementation of the SCENIC pipeline for transcription factor inference from single-cell transcriptomics experiments.
Home-page: https://github.com/aertslab/pySCENIC
Author: Bram Van de Sande
Author-email: None
License: GPL-3.0+
Location: /home/luna.kuleuven.be/u0119129/anaconda3/lib/python3.7/site-packages
Requires: multiprocessing-on-dill, setuptools, scipy, pyyaml, cloudpickle, attrs, numba, arboreto, llvmlite, pandas, frozendict, tqdm, loompy, cytoolz, distributed, networkx, pyarrow, interlap, dask, umap-learn, boltons, numpy
Required-by:
```

Thanks and Kind regards,
Saeed

@jberkh

jberkh commented Oct 21, 2020

The "ValueError: tuple is not allowed for map key" issue seems to be due to msgpack version 1.0.0. As @smanne07 said, it can be solved for GRNBoost by upgrading dask/distributed, but in my experience this breaks pySCENIC further down the line.

For me it was solved by installing `dask==1.0.0` and `distributed>=1.21.6,<2.0.0` as @cflerin suggests, but additionally installing `msgpack<1.0.0` as well. Hope it does for others too, cheers!
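The constraint @jberkh arrived at can be written down as a tiny check. This is an illustrative helper (not part of msgpack, Dask, or pySCENIC), assuming the thread's diagnosis that msgpack >= 1.0 is what breaks the older distributed releases:

```python
def parse_version(version):
    """Parse a dotted version string like '0.6.2' into a comparable tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def msgpack_ok_for_old_distributed(msgpack_version):
    # msgpack 1.0 made unpacking stricter; older distributed releases then fail
    # with "ValueError: tuple is not allowed for map key". Per this thread,
    # anything below 1.0 (e.g. 0.6.2) avoids the problem.
    return parse_version(msgpack_version) < (1, 0)
```

A check like this could be run at startup against the installed msgpack version to fail fast with a clear message instead of the opaque deserialization error.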

@blogeman

I was having the same issue and traced it to msgpack as well. It seems that when msgpack released 1.0 it changed its defaults (including use_bin_type=True), as was mentioned by @nikostrasan. By downgrading to msgpack==0.6.2 I can now run the program, as @jberkh suggested.
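Pulling together the two pin sets reported as working in this thread, as requirements-style fragments (illustrative; pick one set or the other, not both):

```
# Option A - downgrade path (per @cflerin, @jberkh, @blogeman)
dask==1.0.0
distributed>=1.21.6,<2.0.0
tornado==6.0.3
msgpack<1.0.0

# Option B - upgrade path (per @smanne07)
pyscenic==0.10.0
fsspec>=0.3.3
dask==2.11.0
distributed==2.11.0
```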

@cflerin
Contributor

cflerin commented Feb 18, 2021

Thanks all for the feedback and testing on this! The new pySCENIC release, 0.11.0, has updated packages and fixes that will (I hope) resolve this issue. So I'll close this for now; hopefully it can stay that way...

@cflerin cflerin closed this as completed Feb 18, 2021
@yingyonghui

Thanks for the suggestion, which helps a lot!
