ARROW-17487: [Python][Packaging][CI] Add support for Python 3.11 #14499
@github-actions crossbow submit cp311 |
Revision: 8e2613f Submitted crossbow builds: ursacomputing/crossbow @ actions-fd1ab80f49 |
@github-actions crossbow submit cp311 |
@github-actions crossbow submit cp311 |
Revision: 5dc9f31 Submitted crossbow builds: ursacomputing/crossbow @ actions-3375acd0f2 |
@github-actions crossbow submit cp311 |
Revision: 936164b Submitted crossbow builds: ursacomputing/crossbow @ actions-2255056a88 |
We are looking forward to this one being merged. In Apache Airflow, Pyarrow is one of the blocking factors for making Airflow work on Py3.11, and I am trying to rally all the OSS projects that we consider friends :) into a concerted effort to make Py3.11 support work, as Py3.11 mainly brings huge performance improvements that our users are eager to start using! We track it in apache/airflow#27264. If any help is needed, happy to help, also by talking to some of your dependencies (which are likely also Airflow dependencies). Good luck with it :)
@raulcd Perhaps try applying this patch?
diff --git a/python/pyproject.toml b/python/pyproject.toml
index edbc4ade6..a799dc761 100644
--- a/python/pyproject.toml
+++ b/python/pyproject.toml
@@ -18,7 +18,7 @@
[build-system]
requires = [
"cython >= 0.29.22",
- "oldest-supported-numpy>=0.14",
+ "oldest-supported-numpy>=2022.8.16",
"setuptools_scm",
"setuptools >= 40.1.0",
"wheel"
diff --git a/python/requirements-build.txt b/python/requirements-build.txt
index 46eb288c5..927c50d73 100644
--- a/python/requirements-build.txt
+++ b/python/requirements-build.txt
@@ -1,4 +1,4 @@
cython>=0.29
-oldest-supported-numpy>=0.14
+oldest-supported-numpy>=2022.8.16
setuptools_scm
setuptools>=38.6.0
diff --git a/python/requirements-wheel-build.txt b/python/requirements-wheel-build.txt
index 856164f09..a48b30d35 100644
--- a/python/requirements-wheel-build.txt
+++ b/python/requirements-wheel-build.txt
@@ -1,5 +1,5 @@
cython>=0.29.11
-oldest-supported-numpy>=0.14
+oldest-supported-numpy>=2022.8.16
setuptools_scm
setuptools>=58
wheel
diff --git a/python/requirements-wheel-test.txt b/python/requirements-wheel-test.txt
index 1644b2f8b..665b2ce77 100644
--- a/python/requirements-wheel-test.txt
+++ b/python/requirements-wheel-test.txt
@@ -2,26 +2,8 @@ cffi
cython
hypothesis
pickle5; platform_system != "Windows" and python_version < "3.8"
+oldest-supported-numpy>=2022.8.16
pytest
pytest-lazy-fixture
pytz
tzdata; sys_platform == 'win32'
-
-numpy==1.19.5; platform_system == "Linux" and platform_machine == "aarch64" and python_version < "3.7"
-numpy==1.21.3; platform_system == "Linux" and platform_machine == "aarch64" and python_version >= "3.7"
-numpy==1.19.5; platform_system == "Linux" and platform_machine != "aarch64" and python_version < "3.9"
-numpy==1.21.3; platform_system == "Linux" and platform_machine != "aarch64" and python_version >= "3.9"
-numpy==1.21.3; platform_system == "Darwin" and platform_machine == "arm64"
-numpy==1.19.5; platform_system == "Darwin" and platform_machine != "arm64" and python_version < "3.9"
-numpy==1.21.3; platform_system == "Darwin" and platform_machine != "arm64" and python_version >= "3.9"
-numpy==1.19.5; platform_system == "Windows" and python_version < "3.9"
-numpy==1.21.3; platform_system == "Windows" and python_version >= "3.9"
-
-pandas<1.1.0; platform_system == "Linux" and platform_machine != "aarch64" and python_version < "3.8"
-pandas; platform_system == "Linux" and platform_machine != "aarch64" and python_version >= "3.8"
-pandas; platform_system == "Linux" and platform_machine == "aarch64"
-pandas<1.1.0; platform_system == "Darwin" and platform_machine != "arm64" and python_version < "3.8"
-pandas; platform_system == "Darwin" and platform_machine != "arm64" and python_version >= "3.8"
-pandas; platform_system == "Darwin" and platform_machine == "arm64"
-pandas<1.1.0; platform_system == "Windows" and python_version < "3.8"
-pandas; platform_system == "Windows" and python_version >= "3.8"
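The numpy pins removed from requirements-wheel-test.txt amount to a small decision table keyed on platform, machine, and Python version. The sketch below restates that table in plain Python, for illustration only. The function name `old_numpy_pin` is hypothetical and not part of pyarrow. Versions are compared as integer tuples because environment markers treat `python_version` as a version rather than a plain string.

```python
def old_numpy_pin(system, machine, python_version):
    """Return the numpy pin the removed rules would select.

    Hypothetical helper: it only mirrors the deleted lines of
    requirements-wheel-test.txt shown in the diff above.
    """
    # Compare versions as integer tuples so that "3.11" sorts after "3.9".
    py = tuple(int(part) for part in python_version.split("."))

    if system == "Linux" and machine == "aarch64":
        return "1.19.5" if py < (3, 7) else "1.21.3"
    if system == "Darwin" and machine == "arm64":
        return "1.21.3"
    if system in ("Linux", "Darwin", "Windows"):
        return "1.19.5" if py < (3, 9) else "1.21.3"
    return None  # platform not covered by the removed rules


print(old_numpy_pin("Linux", "x86_64", "3.11"))
print(old_numpy_pin("Linux", "x86_64", "3.8"))
```

Under those rules every listed platform resolves to numpy 1.21.3 on Python 3.11; the patch replaces the whole table with a single oldest-supported-numpy requirement.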
I tested the patch locally and, while the build of the images is successful, I got a lot of test failures:
This is how I reproduce locally:
# generate wheel
PYTHON=3.11 docker-compose build --no-cache --progress plain python-wheel-manylinux-2014
PYTHON=3.11 docker-compose run --rm python-wheel-manylinux-2014
# test wheel
PYTHON=3.11 docker-compose build --no-cache python-wheel-manylinux-test-unittests
PYTHON=3.11 docker-compose run --rm python-wheel-manylinux-test-unittests |
Wheels are built successfully at the moment. I am going to trigger the job again to validate the macOS ones, but the jobs are failing because 7 tests fail due to the change in behaviour of repr on the FileType enum; see python/cpython#94763
@github-actions crossbow submit cp311 |
This patch should help fix the 3.11 enum issue:
diff --git a/python/pyarrow/_fs.pyx b/python/pyarrow/_fs.pyx
index e7b028a07..557c08149 100644
--- a/python/pyarrow/_fs.pyx
+++ b/python/pyarrow/_fs.pyx
@@ -78,6 +78,12 @@ cdef CFileType _unwrap_file_type(FileType ty) except *:
assert 0
+def _file_type_to_string(ty):
+ # Python 3.11 changed str(IntEnum) to return the string representation
+ # of the integer value: https://github.com/python/cpython/issues/94763
+ return f"{ty.__class__.__name__}.{ty._name_}"
+
+
cdef class FileInfo(_Weakrefable):
"""
FileSystem entry info.
@@ -185,9 +191,10 @@ cdef class FileInfo(_Weakrefable):
except ValueError:
return ''
- s = '<FileInfo for {!r}: type={}'.format(self.path, str(self.type))
+ s = (f'<FileInfo for {self.path!r}: '
+ f'type={_file_type_to_string(self.type)}')
if self.is_file:
- s += ', size={}'.format(self.size)
+ s += f', size={self.size}'
s += '>'
return s
|
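To see why the helper in the patch gives a stable result, here is a minimal standalone sketch. The `FileType` class below is a stand-in for the real pyarrow enum, and its member values are assumptions made for the demo; only the helper body follows the patch.

```python
import enum


class FileType(enum.IntEnum):
    # Stand-in for pyarrow's FileType; member values are assumed for the demo.
    NotFound = 0
    Unknown = 1
    File = 2
    Directory = 3


def file_type_to_string(ty):
    # Build "ClassName.MemberName" explicitly instead of relying on str(),
    # whose result for an IntEnum differs between Python versions.
    return f"{ty.__class__.__name__}.{ty._name_}"


print(str(FileType.File))                  # output depends on the Python version
print(file_type_to_string(FileType.File))  # FileType.File
```

Because the helper never calls `str()` or `format()` on the enum member itself, the FileInfo repr should read the same before and after 3.11.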
Revision: d5adbac Submitted crossbow builds: ursacomputing/crossbow @ actions-f88a7ca39e |
Python 3.11 was released as scheduled on October 25, 2022, and this is the first attempt to see how far Airflow (mostly its dependencies) is from being ready to officially support 3.11. So far we had to exclude the following dependencies:
- [ ] Pyarrow dependency: apache/arrow#14499
- [ ] Google Provider: #27292 and googleapis/python-bigquery#1386
- [ ] Databricks Provider: databricks/databricks-sql-python#59
- [ ] Papermill Provider: nteract/papermill#700
- [ ] Azure Provider: Azure/azure-uamqp-python#334 and Azure/azure-sdk-for-python#27066
- [ ] Apache Beam Provider: apache/beam#23848
- [ ] Snowflake Provider: snowflakedb/snowflake-connector-python#1294
- [ ] JDBC Provider: jpype-project/jpype#1087
- [ ] Hive Provider: cloudera/python-sasl#30
We might decide to release Airflow for 3.11 with those providers disabled in case they end up lagging behind, but for the moment we want to work with all the projects in concert to be able to release all providers (the Google Provider requires quite a lot of work, likely with the Google team stepping up and the community helping with the migration to the latest Google Cloud libraries).
The tests are trying to compile grpcio, can we avoid that? Either install the GCS testbench on a different Python (with binary wheels), or don't test GCS at all on 3.11. |
@github-actions crossbow submit -g verify-rc-wheels |
Ah, it seems that |
This is awesome, thanks @raulcd for adding support for 3.11 👏🏻 |
Benchmark runs are scheduled for baseline = 8e3a1e1 and contender = e21d5aa. e21d5aa is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
['Python', 'R'] benchmarks have high level of regressions. |
When will an official release become available? |
Hi @noamcohen97, wheels are part of the official release of Apache Arrow. I have sent an email to the developers mailing list to ask the rest of the community whether a new minor release of Apache Arrow is required. You can join the developers mailing list or follow the thread here: |
Are the wheels that were built during the test runs here downloadable? Unfortunately this project is almost impossible to build on your own; if it were easier, we wouldn't be sitting here waiting on wheels. |
@joekohlsdorf you can find the wheels in the @github-actions links above. For Linux on amd64 the last build is here. I have not tested it yet, but it is at least possible to install it in the latest python:3.11 Docker container 😃
# pip install pyarrow-11.0.0.dev52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Processing /pyarrow-11.0.0.dev52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Collecting numpy>=1.16.6
Downloading numpy-1.23.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 17.1/17.1 MB 73.6 MB/s eta 0:00:00
Installing collected packages: numpy, pyarrow
Successfully installed numpy-1.23.4 pyarrow-11.0.0.dev52
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip available: 22.3 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
root@98b070b9e41f:/# python
Python 3.11.0 (main, Oct 25 2022, 05:00:36) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> |
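The wheel filename in the transcript above encodes which interpreter and platform it targets. As a rough sketch, the fields can be pulled apart following the wheel filename convention from PEP 427; the function names `parse_wheel_filename` and `running_cpython_tag` are hypothetical, and the interpreter check assumes CPython.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always python tag, ABI tag, platform tag;
    # an optional build tag may sit between the version and the python tag.
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": name,
        "version": version,
        # Each tag field may hold several values joined with ".".
        "python_tags": python_tag.split("."),
        "abi_tags": abi_tag.split("."),
        "platform_tags": platform_tag.split("."),
    }


def running_cpython_tag():
    # Assumes CPython, e.g. "cp311" on CPython 3.11.
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


wheel = "pyarrow-11.0.0.dev52-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
info = parse_wheel_filename(wheel)
print(info["python_tags"], info["platform_tags"])
print("matches this interpreter:", running_cpython_tag() in info["python_tags"])
```

This is only a filename check; pip performs the authoritative compatibility test when it installs the wheel.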
We do publish nightly development versions of pyarrow as seen on the Python development docs: |
…che#14499) This PR adds jobs to build pyarrow wheels for Python 3.11. Authored-by: Raúl Cumplido Signed-off-by: Sutou Kouhei
This PR adds jobs to build pyarrow wheels for Python 3.11.