Update Nextflow examples #305

Draft: wants to merge 13 commits into `master`
12 changes: 3 additions & 9 deletions src/components/Menu.astro
@@ -37,16 +37,10 @@ const isHomepage = currentPath === "/" || currentPath === "/index.html";
<a href="#" class="dropdown-toggle" data-toggle="dropdown">Examples <b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="/example1.html">Basic pipeline</a></li>
<li><a href="/example2.html">Mixing scripting languages</a></li>
<li><a href="/example3.html">BLAST pipeline</a></li>
<li><a href="/example4.html">RNA-Seq pipeline</a></li>
<li><a href="/example2.html">Mixed language pipeline</a></li>
<li><a href="/example3.html">RNA-Seq pipeline</a></li>
<li><a href="/example4.html">Variant calling pipeline</a></li>
<li><a href="/example5.html">Machine Learning pipeline</a></li>
<li>
<a href="https://github.com/nextflow-io/rnaseq-nf" target="_blank">
Simple RNAseq pipeline
<i class="fa fa-sm fa-external-link" aria-hidden="true"></i>
</a>
</li>
</ul>
</li>

88 changes: 39 additions & 49 deletions src/pages/example1.md
@@ -7,96 +7,86 @@ layout: "@layouts/MarkdownPage.astro"
<h3>Basic pipeline</h3>

<p class="text-muted" >
This example shows how to write a pipeline with two simple Bash processes, so that the results produced by the first process are consumed by the second process.
This example shows a simple Nextflow pipeline consisting of two Bash processes.
</p>

```groovy
#!/usr/bin/env nextflow

params.in = "$baseDir/data/sample.fa"
/*
* Pipeline parameters
*/

// Primary input
params.greeting = "Hello World!"

/*
* Split a fasta file into multiple files
* Redirect a string to a text file
*/
process splitSequences {
process sayHello {

input:
path 'input.fa'
val x

output:
path 'seq_*'
path 'output.txt'

script:
"""
awk '/^>/{f="seq_"++d} {print > f}' < input.fa
echo '$x' > output.txt
"""
}

/*
* Reverse the sequences
* Convert lowercase letters to uppercase letters
*/
process reverse {
process convertToUpper {

input:
path x
path y

output:
stdout

script:
"""
cat $x | rev
cat $y | tr '[a-z]' '[A-Z]'
"""
}

/*
* Define the workflow
* Workflow definition
*/
workflow {
splitSequences(params.in) \
| reverse \
| view
}
```

</div>

### Synopsis

- **Line 1**: The script starts with a shebang declaration. This allows you to launch your pipeline just like any other Bash script.

- **Line 3**: Declares a pipeline parameter named `params.in` that is initialized with the value `$baseDir/data/sample.fa`. This value can be overridden when launching the pipeline by adding the option `--in <value>` to the script command line.
// Creates channel using the Channel.of() channel factory
greeting_ch = Channel.of(params.greeting)

- **Lines 8-19**: The process that splits the provided file.
// Redirects a string to a text file
sayHello(greeting_ch)

- **Line 10**: Opens the input declaration block. The lines following this clause are interpreted as input definitions.
// Concatenates a text file and transforms lowercase letters to uppercase letters
convertToUpper(sayHello.out)

- **Line 11**: Declares the process input file, which will be named `input.fa` in the process script.

- **Line 13**: Opens the output declaration block. The lines following this clause are interpreted as output declarations.

- **Line 14**: Files whose names match the pattern `seq_*` are declared as the output of this process.

- **Lines 16-18**: The actual script executed by the process to split the input file.

- **Lines 24-35**: The second process, which receives the splits produced by the
previous process and reverses their content.

- **Line 26**: Opens the input declaration block. Lines following this clause are
interpreted as input declarations.
// View convertToUpper output
convertToUpper.out.view()
}
```

- **Line 27**: Defines the process input file.
</div>

- **Line 29**: Opens the output declaration block. Lines following this clause are
interpreted as output declarations.
### Script synopsis

- **Line 30**: The standard output of the executed script is declared as the process
output.
This example shows a simple Nextflow pipeline consisting of two Bash processes. The `sayHello` process takes a string as input and redirects it to an output text file. The `convertToUpper` process takes the output text file from `sayHello` as input, reads its contents, and converts all lowercase letters to uppercase. The output of the `convertToUpper` process is then printed to the screen.

- **Lines 32-34**: The actual script executed by the process to reverse the content of the input files.
### Try it

- **Lines 40-44**: The workflow that connects everything together!
To try this pipeline:

- **Line 41**: First, the input file specified by `params.in` is passed to the `splitSequences` process.
1. Follow the [Nextflow installation guide](https://www.nextflow.io/docs/latest/install.html#install-nextflow) to install Nextflow (if not already available).
2. Copy the script above and save it as `hello-world.nf`.
3. Launch the pipeline:

- **Line 42**: The outputs of `splitSequences` are passed as inputs to the `reverse` process, which processes each split file in parallel.
nextflow run hello-world.nf

- **Line 43**: Finally, each output emitted by `reverse` is printed.
**NOTE**: To run this example with versions of Nextflow older than 22.04.0, you must include the `-dsl2` flag with `nextflow run`.
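
The two steps described in the script synopsis can also be reproduced by hand in a plain shell, which may help when checking what each process is expected to emit. This is only a sketch of the commands inside the `script` blocks, not of how Nextflow stages files or runs tasks; the `greeting`, `workdir`, and `result` variables are illustrative names that do not appear in the pipeline.

```shell
# Stand-in for the pipeline parameter params.greeting (illustrative variable).
greeting='Hello World!'

# Scratch directory playing the role of a task work directory (assumption).
workdir="$(mktemp -d)"

# Same command as the sayHello script block: redirect the string to a file.
echo "$greeting" > "$workdir/output.txt"

# Same command as the convertToUpper script block: uppercase the file contents.
result="$(cat "$workdir/output.txt" | tr '[a-z]' '[A-Z]')"

echo "$result"   # prints: HELLO WORLD!

rm -r "$workdir"
```

If the output here matches what `nextflow run hello-world.nf` prints, the two processes are behaving as described.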
47 changes: 38 additions & 9 deletions src/pages/example2.md
@@ -1,24 +1,33 @@
---
title: Mixing scripting languages
title: Mixed language pipeline
layout: "@layouts/MarkdownPage.astro"
---

<div class="blg-summary example">
<h3>Mixing scripting languages</h3>
<h3>Mixed language pipeline</h3>

<p class="text-muted">
With Nextflow, you are not limited to Bash scripts -- you can use any scripting language! In other words, for each <i>process</i> you can use the language that best fits the specific task or that you simply prefer.
This example shows a simple Nextflow pipeline consisting of two processes written in different languages.
</p>

```groovy
#!/usr/bin/env nextflow

/*
* Pipeline parameters
*/

// Range
params.range = 100

/*
* A trivial Perl script that produces a list of number pairs
*/
process perlTask {

input:
val x

output:
stdout

@@ -29,7 +38,7 @@ process perlTask {
use warnings;

my $count;
my $range = !{params.range};
my $range = !{x};
for ($count = 0; $count < 10; $count++) {
print rand($range) . ', ' . rand($range) . "\n";
}
@@ -40,12 +49,14 @@ process perlTask {
* A Python script which parses the output of the previous script
*/
process pyTask {

input:
stdin

output:
stdout

script:
"""
#!/usr/bin/env python
import sys
@@ -64,17 +75,35 @@ process pyTask {
}

workflow {
perlTask | pyTask | view

// Creates channel using the Channel.of() channel factory
range_ch = Channel.of(params.range)

// A Perl script that produces a list of number pairs
perlTask(range_ch)

// A Python script which parses the output of the previous script
pyTask(perlTask.out)

// View pyTask output
pyTask.out.view()
}
```

</div>

### Synopsis

In the above example we define a simple pipeline with two processes.
This example shows a simple Nextflow pipeline consisting of two processes written in different languages. The `perlTask` process starts with a Perl _shebang_ declaration and executes a Perl script that produces pairs of numbers. Because Perl uses the `$` character for variables, the special `shell` block is used instead of the normal `script` block: inside a `shell` block, Nextflow variables are written as `!{...}`, which keeps them distinct from the Perl variables. Similarly, the `pyTask` process starts with a Python _shebang_ declaration. It takes the output from the Perl script and executes a Python script that averages the number pairs. The output of the `pyTask` process is then printed to the screen.

### Try it

To try this pipeline:

1. Follow the [Nextflow installation guide](https://www.nextflow.io/docs/latest/install.html#install-nextflow) to install Nextflow (if not already available).
2. Copy the script above and save it as `mixed-languages.nf`.
3. Launch the pipeline:

The first process executes a Perl script, because the script block definition starts
with a Perl _shebang_ declaration (line 14). Since Perl uses the `$` character for variables, we use the special `shell` block instead of the normal `script` block to easily distinguish the Perl variables from the Nextflow variables.
nextflow run mixed-languages.nf

In the same way, the second process will execute a Python script, because the script block starts with a Python shebang (line 36).
**NOTE**: To run this example with versions of Nextflow older than 22.04.0, you must include the `-dsl2` flag with `nextflow run`.
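
The difference between the `script` and `shell` blocks mentioned in the synopsis can be illustrated with a rough shell analogy. This is not how Nextflow implements either block; it only assumes that a `script` block behaves like a double-quoted string (every Perl `$` has to be escaped) while a `shell` block behaves like a single-quoted template in which only `!{...}` placeholders are substituted. The variable names and the `sed` substitution are illustrative.

```shell
# Stands in for the Nextflow input `x` (illustrative variable).
range=100

# Analogue of a `script` block: the outer layer expands `$name`, so each
# Perl `$` must be escaped to reach Perl unchanged.
escaped="my \$range = ${range}; print rand(\$range);"

# Analogue of a `shell` block: Perl `$` is left alone and only the
# `!{x}` placeholder is replaced with the outer value.
template='my $range = !{x}; print rand($range);'
substituted="$(printf '%s' "$template" | sed 's/!{x}/'"$range"'/')"

echo "$escaped"
echo "$substituted"
```

Both lines end up as the same Perl source, but the second form needs no escaping, which is why it is convenient for languages that use `$` heavily.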