Adding TARGETS file for torchtext benchmarks
Summary:
### Summary
- Enable benchmarking of torcharrow ops within torchtext

### Benchmark Results
- Benchmarking on an fbcode devserver
```
torchtext GPT2BPE tokenizer: 65.811
torchtext vocab: 2.226
torchtext add tokens operation (string): 0.722
torchtext add tokens operation (int): 0.598

torcharrow GPT2BPE tokenizer: 65.739
torcharrow vocab: 1.253
torcharrow add tokens operation (string): 14.335
torcharrow add tokens operation (int): 0.229
```

Benchmarking on an Apple MBP (results can also be found in [text#1801](#1801) and [text#1807](#1807))

```
torchtext GPT2BPE tokenizer: 3.13
torchtext vocab: 0.32
torchtext add tokens operation (string): 0.382
torchtext add tokens operation (int): 0.431

torcharrow GPT2BPE tokenizer: 59.13
torcharrow vocab: 0.03
torcharrow add tokens operation (string): 3.652
torcharrow add tokens operation (int): 0.075
```

### Takeaways
- The GPT2BPE tokenizer for torchtext is significantly faster on the MBP than on the devserver
- AddTokens (str) for torcharrow is still significantly slower than its torchtext counterpart on both the MBP and the devserver
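For reference, below is a minimal, hypothetical sketch (not the script touched in this diff) of how the torchtext GPT2BPE row could be timed over the SST2 dev split. The asset URLs and the use of `time.monotonic` in place of the repository's `Timer` helper are assumptions.

```python
import time

import torchtext.transforms as T
from torchtext.datasets import SST2

# Assumed asset locations for the GPT-2 BPE tokenizer; the benchmark script in
# this diff also imports load_state_dict_from_url from torchtext._download_hooks
# for its asset loading.
ENCODER_JSON_PATH = "https://download.pytorch.org/models/text/gpt2_bpe_encoder.json"
VOCAB_BPE_PATH = "https://download.pytorch.org/models/text/gpt2_bpe_vocab.bpe"


def benchmark_torchtext_gpt2bpe() -> None:
    # Build the eager torchtext GPT2BPE tokenizer transform.
    tokenizer = T.GPT2BPETokenizer(
        encoder_json_path=ENCODER_JSON_PATH, vocab_bpe_path=VOCAB_BPE_PATH
    )

    # Collect the raw text of the SST-2 dev split; each sample is (text, label).
    texts = [text for text, _ in SST2(split="dev")]

    # Time a single pass of the tokenizer over the whole split.
    start = time.monotonic()
    tokenizer(texts)
    elapsed = time.monotonic() - start
    print(f"torchtext GPT2BPE tokenizer: {elapsed:.3f}")


if __name__ == "__main__":
    benchmark_torchtext_gpt2bpe()
```

The vocab and add-tokens rows would follow the same pattern, swapping in transforms such as `T.VocabTransform` and `T.AddToken` on the torchtext side and the corresponding column ops from `torcharrow.functional` on the torcharrow side.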

Reviewed By: parmeet

Differential Revision: D37463862

fbshipit-source-id: 1fb538338367bac2b002c1a4b8f128b0b2847bf5
Nayef211 authored and facebook-github-bot committed Jun 28, 2022
1 parent 4a12f93 commit e5f2d92
Showing 1 changed file with 1 addition and 1 deletion: benchmark/benchmark_torcharrow_ops.py
```diff
@@ -2,7 +2,7 @@

 import torcharrow as ta
 import torchtext.transforms as T
-from benchmark.utils import Timer
+from .utils import Timer
 from torcharrow import functional as ta_F
 from torchtext._download_hooks import load_state_dict_from_url
 from torchtext.datasets import SST2
```
