Commit
fix(hadoop): Remove the schema for hdfs path when reading file (#11963)
Summary:
Although we support JVM libhdfs, Gluten's internal benchmark still uses libhdfs3. We encountered a 'File Not Found' exception when reading the HDFS path with libhdfs3.

```
Reason: Unable to get file path info for file: hdfs://b49691a74b48.jf.intel.com:8020/tpch_sf3000/lineitem/part-00281-3761d71a-87c6-4341-8f1c-db804f904130-c000.snappy.parquet. got error: FileNotFoundException: Path hdfs://b49691a74b48.jf.intel.com:8020/tpch_sf3000/lineitem/part-00281-3761d71a-87c6-4341-8f1c-db804f904130-c000.snappy.parquet does not exist.
Retriable: False
Context: Split [Hive: hdfs://b49691a74b48.jf.intel.com:8020/tpch_sf3000/lineitem/part-00281-3761d71a-87c6-4341-8f1c-db804f904130-c000.snappy.parquet 0 - 1489456566] Task Gluten_Stage_8_TID_842_VTID_27
Additional Context: Operator: TableScan[0] 0
Function: Impl
File: /home/sparkuser/workspace/workspace/Gluten_TPCH_Spark32_test/ep/build-velox/build/velox_ep/velox/connectors/hive/storage_adapters/hdfs/HdfsReadFile.cpp
Line: 79
```

This PR reverts some changes from a previous [PR](#11811) to ensure continued support for libhdfs3 reading in Velox.

Pull Request resolved: #11963

Reviewed By: xiaoxmeng

Differential Revision: D67996555

Pulled By: Yuhta

fbshipit-source-id: 29e8c0070bdb403609f3dee711ea3db8a011f8b3
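The diff itself is not shown here, so the following is only a minimal sketch of the idea named in the commit title: stripping the `hdfs://host:port` prefix so that the reader is handed a bare path such as `/tpch_sf3000/lineitem/...`. The helper name `stripHdfsSchemeAndAuthority` is hypothetical and is not taken from Velox. The sketch also assumes that the file system connection is already bound to the namenode, so that only the path component is needed.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not the Velox API): returns the path component of an
// "hdfs://host:port/path" URI. Inputs without the "hdfs://" prefix are
// returned unchanged.
std::string stripHdfsSchemeAndAuthority(std::string_view uri) {
  constexpr std::string_view kScheme = "hdfs://";
  if (uri.substr(0, kScheme.size()) != kScheme) {
    return std::string(uri); // Already a bare path or a different scheme.
  }
  // The path starts at the first '/' after the authority (host:port).
  const auto pathStart = uri.find('/', kScheme.size());
  if (pathStart == std::string_view::npos) {
    return "/"; // URI names only the namenode; treat it as the root path.
  }
  return std::string(uri.substr(pathStart));
}
```

With this sketch, `hdfs://b49691a74b48.jf.intel.com:8020/tpch_sf3000/lineitem` would be reduced to `/tpch_sf3000/lineitem`, while a path that is already bare is passed through unchanged.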