Add support for QQP dataset with unit tests #1713

vcm2114 · 2022-05-11T16:24:14Z

Summary

Added support for QQP dataset
Added mocked unit tests for QQP dataset

Test

pytest test/datasets/test_qqp.py
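For readers unfamiliar with the mocked-test pattern mentioned in the summary, here is a minimal sketch of the idea using only the standard library. It writes a tiny TSV file under a temporary directory and checks that a reader yields the expected samples. The file name comes from the test under review; the `QQP` subdirectory, the column layout (`id`, `qid1`, `qid2`, `question1`, `question2`, `is_duplicate`), the `(label, question1, question2)` output order, and the helpers `make_mock_qqp` and `read_qqp` are assumptions made for illustration, not the code in this PR.

```python
import csv
import os
import random
import string
import tempfile


def make_mock_qqp(root_dir, num_rows=5, seed=1):
    """Write a small QQP-style TSV file and return the samples a loader should yield.

    Assumed column layout: id, qid1, qid2, question1, question2, is_duplicate.
    """
    rng = random.Random(seed)
    base_dir = os.path.join(root_dir, "QQP")  # assumed directory name
    os.makedirs(base_dir, exist_ok=True)
    path = os.path.join(base_dir, "quora_duplicate_questions.tsv")

    expected = []
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["id", "qid1", "qid2", "question1", "question2", "is_duplicate"])
        for i in range(num_rows):
            # Use different random strings for the two questions in each row.
            q1 = "".join(rng.choices(string.ascii_letters, k=10))
            q2 = "".join(rng.choices(string.ascii_letters, k=10))
            label = i % 2
            writer.writerow([i, 2 * i, 2 * i + 1, q1, q2, label])
            expected.append((label, q1, q2))
    return path, expected


def read_qqp(path):
    """Hypothetical stand-in for the dataset under test: yields (label, q1, q2)."""
    with open(path, encoding="utf-8", newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        next(reader)  # skip the header row
        for row in reader:
            yield int(row[5]), row[3], row[4]


with tempfile.TemporaryDirectory() as tmp:
    mock_path, expected_samples = make_mock_qqp(tmp)
    loaded_samples = list(read_qqp(mock_path))
```

In a real test the stand-in reader would be replaced by the dataset function itself, with the temporary directory passed as its root so that nothing is downloaded.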

Context

See #1710

parmeet

LGTM!

Nayef211

LGTM. Can we also make sure to add the sharding filter to the dataset, as @parmeet mentioned in #1710 (comment)?

By the way, I think we can land this irrespective of the test failures on Linux (they are caused by some recent changes related to torchdata), as long as all the tests pass on macOS.
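As background for the sharding request: a sharding filter keeps multiple data-loading workers from yielding the same samples, and it is usually applied after shuffling. The sketch below illustrates that idea in plain Python. It is not the torchdata API and not the code added in this PR; `shuffle_then_shard` and its parameters are hypothetical names chosen for the example.

```python
import random


def shuffle_then_shard(samples, num_shards, shard_id, seed=0):
    """Shuffle with a shared seed, then keep every num_shards-th sample."""
    order = list(samples)
    # Every worker uses the same seed, so all workers see the same order.
    random.Random(seed).shuffle(order)
    # Worker `shard_id` keeps only the positions assigned to it.
    return [s for i, s in enumerate(order) if i % num_shards == shard_id]


samples = list(range(10))
shards = [shuffle_then_shard(samples, num_shards=3, shard_id=k) for k in range(3)]
```

Because the shuffle happens before the split, each worker receives a random subset, and the subsets together cover every sample exactly once.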

Nayef211 · 2022-05-18T21:11:50Z

test/datasets/test_qqp.py

+    file_name = "quora_duplicate_questions.tsv"
+    txt_file = os.path.join(base_dir, file_name)
+    mocked_data = []
+    print(txt_file)


Can we remove the print statement here?

vcm2114 · May 19, 2022

Solved with #1734

Nayef211 · 2022-05-18T21:12:01Z

test/datasets/test_qqp.py

+    def setUpClass(cls):
+        super().setUpClass()
+        cls.root_dir = cls.get_base_temp_dir()
+        print(cls.root_dir)


Remove print

vcm2114 · May 19, 2022

Solved with #1734

vcm2114 and others added 3 commits May 4, 2022 23:05

Add QQP dataset + unit test

ecf8012

Adjust output + add different strings for tests

c030373

Merge branch 'pytorch:main' into qqp_dataset

269c1cb

facebook-github-bot added the cla signed label May 11, 2022

vcm2114 mentioned this pull request May 11, 2022

Add support for all datasets of the GLUE benchmark #1710

Closed

8 tasks

vcm2114 added 2 commits May 12, 2022 11:03

Remove lambda functions + correct docstring

d35d2f4

Merge branch 'qqp_dataset' of github.com:VirgileHlav/text into qqp_da…

fb7e9de

…taset

parmeet approved these changes May 13, 2022

vcm2114 added 2 commits May 16, 2022 09:58

Fix lint

1950d57

Add dataset documentation

b53de4a

Nayef211 approved these changes May 17, 2022

vcm2114 added 2 commits May 18, 2022 15:46

Add shuffle and sharding

2f0fc52

Merge branch 'main' into qqp_dataset

053ff90

vcm2114 merged commit bd0f765 into pytorch:main May 18, 2022

Nayef211 reviewed May 18, 2022

This was referenced May 19, 2022

Remove prints in test_qqp.py vcm2114/text#1

Closed

Delete prints in test_qqp.py #1734

Merged
