Temporarily remove some links
shumingma committed Apr 10, 2024
1 parent 56da426 commit a448c38
Showing 2 changed files with 1 addition and 18 deletions.
17 changes: 0 additions & 17 deletions README.md
@@ -184,23 +184,6 @@ We have examples of how to use TorchScale in the following scenarios/tasks:

We plan to provide more examples regarding different tasks (e.g. vision pretraining and speech recognition) and various deep learning toolkits (e.g. [DeepSpeed](https://github.com/microsoft/DeepSpeed) and [Megatron-LM](https://github.com/NVIDIA/Megatron-LM)). Any comments or PRs are welcome!

-## Results
-
-### Stability Evaluation
-
-<p align="center">
-<img src="https://publicmodel.blob.core.windows.net/torchscale/pic/convergence.png?sv=2020-04-08&st=2023-08-11T03%3A09%3A09Z&se=2053-08-12T03%3A09%3A00Z&sr=c&sp=rl&sig=3b6nDda%2Fu0vD6E%2BhoTO%2BHfNSnSlUfgvXFV%2FCNKquWjE%3D" width="800"/>
-</p>
-
-The training curve is smooth by using TorchScale, while the baseline Transformer cannot converge.
-
-### Scaling-up Experiments
-
-<p align="center">
-<img src="https://publicmodel.blob.core.windows.net/torchscale/pic/scaling_curve.png?sv=2020-04-08&st=2023-08-11T03%3A09%3A09Z&se=2053-08-12T03%3A09%3A00Z&sr=c&sp=rl&sig=3b6nDda%2Fu0vD6E%2BhoTO%2BHfNSnSlUfgvXFV%2FCNKquWjE%3D" width="800"/>
-</p>
-
-TorchScale supports arbitrary depths and widths, successfully scaling-up the models without pain.

## Acknowledgments

2 changes: 1 addition & 1 deletion examples/fairseq/README.md
@@ -65,7 +65,7 @@ Also, the JSON file should be in the format like this:
]
```

-You can quickly get started with our processed vocabulary files: [sentencepiece.bpe.model](https://publicmodel.blob.core.windows.net/torchscale/vocab/sentencepiece.bpe.model?sv=2020-04-08&st=2023-08-11T03%3A09%3A09Z&se=2053-08-12T03%3A09%3A00Z&sr=c&sp=rl&sig=3b6nDda%2Fu0vD6E%2BhoTO%2BHfNSnSlUfgvXFV%2FCNKquWjE%3D) and [dict.txt](https://publicmodel.blob.core.windows.net/torchscale/vocab/dict.txt?sv=2020-04-08&st=2023-08-11T03%3A09%3A09Z&se=2053-08-12T03%3A09%3A00Z&sr=c&sp=rl&sig=3b6nDda%2Fu0vD6E%2BhoTO%2BHfNSnSlUfgvXFV%2FCNKquWjE%3D). Note that this vocabulary is English-only with 64K tokens. To train a new `sentencepiece.bpe.model` on your own data, please refer to the [SentencePiece](https://github.com/google/sentencepiece) repo. With the sentecepiece model and the installed `sentencepiece` library, you can extract the `dict.txt` file from it by
+You can quickly get started with our processed vocabulary files: [sentencepiece.bpe.model] and [dict.txt]. Note that this vocabulary is English-only with 64K tokens. To train a new `sentencepiece.bpe.model` on your own data, please refer to the [SentencePiece](https://github.com/google/sentencepiece) repo. With the sentecepiece model and the installed `sentencepiece` library, you can extract the `dict.txt` file from it by
```
spm_export_vocab --model=sentencepiece.bpe.model | sed 's/\t/ /g' | tail -n +4 > dict.txt
```
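
For readers who want to see what the `sed ... | tail -n +4` post-processing in the command above does, here is a minimal Python sketch of the same two steps. It assumes the output of `spm_export_vocab` has already been saved to a text file with one tab-separated `piece<TAB>score` entry per line; the function name `vocab_to_dict_lines` and the file name `vocab.tsv` are hypothetical and do not come from the repository.

```python
from itertools import islice
from typing import Iterable, Iterator


def vocab_to_dict_lines(lines: Iterable[str], skip: int = 3) -> Iterator[str]:
    """Mirror `sed 's/\\t/ /g' | tail -n +4` on exported vocabulary lines.

    `tail -n +4` starts output at the fourth line, i.e. it drops the
    first three entries; `skip` plays that role here.
    """
    for line in islice(lines, skip, None):
        # Replace every tab with a single space, as the sed expression does.
        yield line.replace("\t", " ")


if __name__ == "__main__":
    # Hypothetical usage: `vocab.tsv` is assumed to hold the saved output of
    # `spm_export_vocab --model=sentencepiece.bpe.model`.
    with open("vocab.tsv", encoding="utf-8") as src, \
            open("dict.txt", "w", encoding="utf-8") as dst:
        dst.writelines(vocab_to_dict_lines(src))
```

The sketch only reproduces the text transformation; exporting the vocabulary from the model still relies on the `spm_export_vocab` tool shown above.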