Mask2FormerImageProcessor support overlapping features #35536

Open
2 of 4 tasks
mherzog01 opened this issue Jan 6, 2025 · 2 comments

@mherzog01

System Info

transformers version: 4.48.0.dev0
Python version: 3.13.1
OS: Linux (AWS CodeSpace) Linux default 5.10.228-219.884.amzn2.x86_64 #1 SMP Wed Oct 23 17:17:00 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Virtual environment: Conda

Output of pip list

Package                  Version
------------------------ -----------
aiohappyeyeballs         2.4.4
aiohttp                  3.11.11
aiosignal                1.3.2
asttokens                3.0.0
attrs                    24.3.0
Brotli                   1.1.0
certifi                  2024.12.14
cffi                     1.17.1
charset-normalizer       3.4.1
colorama                 0.4.6
comm                     0.2.2
datasets                 3.2.0
debugpy                  1.8.11
decorator                5.1.1
dill                     0.3.8
exceptiongroup           1.2.2
executing                2.1.0
filelock                 3.16.1
frozenlist               1.5.0
fsspec                   2024.9.0
h2                       4.1.0
hpack                    4.0.0
huggingface_hub          0.26.5
hyperframe               6.0.1
idna                     3.10
importlib_metadata       8.5.0
ipykernel                6.29.5
ipython                  8.31.0
jedi                     0.19.2
Jinja2                   3.1.5
jupyter_client           8.6.3
jupyter_core             5.7.2
MarkupSafe               3.0.2
matplotlib-inline        0.1.7
mpmath                   1.3.0
multidict                6.1.0
multiprocess             0.70.16
nest_asyncio             1.6.0
networkx                 3.4.2
numpy                    2.2.1
nvidia-cublas-cu12       12.4.5.8
nvidia-cuda-cupti-cu12   12.4.127
nvidia-cuda-nvrtc-cu12   12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12        9.1.0.70
nvidia-cufft-cu12        11.2.1.3
nvidia-curand-cu12       10.3.5.147
nvidia-cusolver-cu12     11.6.1.9
nvidia-cusparse-cu12     12.3.1.170
nvidia-nccl-cu12         2.21.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.4.127
packaging                24.2
pandas                   2.2.3
parso                    0.8.4
pexpect                  4.9.0
pickleshare              0.7.5
pillow                   11.1.0
pip                      24.3.1
platformdirs             4.3.6
prompt_toolkit           3.0.48
propcache                0.2.1
psutil                   6.1.1
ptyprocess               0.7.0
pure_eval                0.2.3
pyarrow                  18.1.0
pycparser                2.22
Pygments                 2.18.0
PySocks                  1.7.1
python-dateutil          2.9.0.post0
pytz                     2024.1
PyYAML                   6.0.2
pyzmq                    26.2.0
regex                    2024.11.6
requests                 2.32.3
safetensors              0.5.0
setuptools               75.7.0
six                      1.17.0
stack_data               0.6.3
sympy                    1.13.1
tokenizers               0.21.0
torch                    2.5.1
tornado                  6.4.2
tqdm                     4.67.1
traitlets                5.14.3
transformers             4.48.0.dev0
typing_extensions        4.12.2
tzdata                   2024.2
urllib3                  2.3.0
wcwidth                  0.2.13
xxhash                   3.5.0
yarl                     1.18.3
zipp                     3.21.0
zstandard                0.23.0

Who can help?

@amyeroberts @qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

From

The code below raises "ValueError: Unable to infer channel dimension format". Other combinations of the ChannelDimension value and the position of the num_features axis give the same or similar errors.

import numpy as np

from transformers.image_utils import ChannelDimension
from transformers import Mask2FormerImageProcessor  # Assumes torchvision is installed

processor = Mask2FormerImageProcessor(do_rescale=False, do_resize=False, do_normalize=False)

num_classes = 2
num_features = 5
height, width = (16, 16)
images = [np.zeros((height, width, 3))]
segmentation_maps = [np.random.randint(0, num_classes, (height, width, num_features))]

batch = processor(images,
                  segmentation_maps=segmentation_maps,
                  return_tensors="pt",
                  input_data_format=ChannelDimension.LAST)

See https://stackoverflow.com/questions/79331752/does-the-huggingface-mask2formerimageprocessor-support-overlapping-features.
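
For context, the error text matches the message raised by the channel-inference helper in transformers.image_utils, which by default treats only 1 or 3 as a plausible channel count. The sketch below is a simplified pure-Python stand-in for that kind of check, not the library's code: the function name `guess_channel_axis` and its string return values are hypothetical. It illustrates why a map with 5 feature channels cannot be classified as channels-first or channels-last, though it does not show where in the processor the check is triggered.

```python
def guess_channel_axis(shape, allowed_channels=(1, 3)):
    """Hypothetical stand-in for a channel-format check on a 3D array shape.

    Returns "first" or "last" depending on which axis looks like a channel
    axis, and raises ValueError when neither does.
    """
    if len(shape) != 3:
        raise ValueError(f"Unsupported number of image dimensions: {len(shape)}")
    if shape[0] in allowed_channels:
        return "first"
    if shape[-1] in allowed_channels:
        return "last"
    raise ValueError("Unable to infer channel dimension format")


# An RGB image in (height, width, channels) layout is recognized.
image_format = guess_channel_axis((16, 16, 3))

# The segmentation map from the reproduction has 5 feature channels,
# so neither axis matches an allowed channel count.
try:
    guess_channel_axis((16, 16, 5))
    seg_map_error = None
except ValueError as exc:
    seg_map_error = str(exc)

print(image_format)
print(seg_map_error)
```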

Expected behavior

Processor supports overlapping masks without error.
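
Until the processor handles this case, one possible workaround is to skip segmentation_maps and build the per-feature targets by hand. The sketch below assumes each of the num_features channels is a binary (0/1) mask for one feature and that the caller supplies one class id per channel; the helper name `stack_to_mask_labels` is hypothetical. It produces arrays in the (num_labels, height, width) and (num_labels,) layouts used by Mask2Former's mask_labels and class_labels inputs.

```python
import numpy as np


def stack_to_mask_labels(stacked_masks, class_ids):
    """Split an (H, W, F) stack of binary masks into per-feature targets.

    Hypothetical helper: assumes channel f is a 0/1 mask for one feature
    and class_ids[f] is that feature's class id. Empty masks are dropped.
    """
    stacked_masks = np.asarray(stacked_masks)
    class_ids = np.asarray(class_ids, dtype=np.int64)
    if stacked_masks.ndim != 3 or stacked_masks.shape[-1] != class_ids.shape[0]:
        raise ValueError("expected an (H, W, F) stack and F class ids")

    # Move the feature axis to the front: (H, W, F) -> (F, H, W).
    masks = np.moveaxis(stacked_masks, -1, 0).astype(np.float32)
    # Keep only features that cover at least one pixel.
    non_empty = masks.reshape(masks.shape[0], -1).any(axis=1)
    return masks[non_empty], class_ids[non_empty]


height, width, num_features = 16, 16, 5
stacked = np.zeros((height, width, num_features), dtype=np.int64)
stacked[0:8, 0:8, 0] = 1    # feature 0
stacked[4:12, 4:12, 1] = 1  # feature 1 overlaps feature 0
# Features 2-4 stay empty and are dropped by the helper.

mask_labels, class_labels = stack_to_mask_labels(stacked, class_ids=[1, 1, 0, 0, 0])
print(mask_labels.shape, class_labels)
```

The resulting arrays could then be converted with torch.from_numpy and passed, one pair per image, as mask_labels and class_labels alongside the processor's pixel_values. Whether the training loss treats overlapping targets as intended would still need to be checked.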

mherzog01 added the bug label Jan 6, 2025
@Rocketknight1
Member

cc @zucchini-nlp !

@qubvel
Member

qubvel commented Jan 6, 2025

I will take a look, it's related to vision 👍
