Commit

Merge branch 'fsi_dgl' of github.com:tzemicheal/Morpheus into david-fsi_dgl-patch
dagardner-nv committed Aug 4, 2023
2 parents 224d5d5 + 9c6c7b3 commit ee8e796
Showing 5 changed files with 17 additions and 19 deletions.
1 change: 0 additions & 1 deletion docker/conda/environments/cuda11.8_examples.yml
@@ -37,7 +37,6 @@ dependencies:
- s3fs>=2023.6
- pip
- wrapt=1.14.1 # ver 1.15 breaks the keras model used by the gnn_fraud_detection_pipeline
- - torchmetrics=0.11.4
- pip:
# tensorflow exists in conda-forge but is tied to CUDA-11.3
- tensorflow==2.12.0
18 changes: 9 additions & 9 deletions examples/gnn_fraud_detection_pipeline/README.md
@@ -69,16 +69,16 @@ python run.py
```
====Registering Pipeline====
====Building Pipeline====
- ====Building Pipeline Complete!====
- Graph construction rate: 0 messages [00:00, ? me====Registering Pipeline Complete!====
- ====Starting Pipeline====s [00:00, ? messages/s]
+ Graph construction rate: 0 messages [00:00, ? me====Building Pipeline Complete!====
+ Inference rate: 0 messages [00:00, ? messages/s]====Registering Pipeline Complete!====
+ ====Starting Pipeline====
====Pipeline Started==== 0 messages [00:00, ? messages/s]
====Building Segment: linear_segment_0====ges/s]
Added source: <from-file-0; FileSourceStage(filename=validation.csv, iterative=False, file_type=FileTypes.Auto, repeat=1, filter_null=False)>
└─> morpheus.MessageMeta
Added stage: <deserialize-1; DeserializeStage(ensure_sliceable_index=True)>
└─ morpheus.MessageMeta -> morpheus.MultiMessage
- Added stage: <fraud-graph-construction-2; FraudGraphConstructionStage(training_file=training.csv, input_file=validation.csv)>
+ Added stage: <fraud-graph-construction-2; FraudGraphConstructionStage(training_file=training.csv)>
└─ morpheus.MultiMessage -> stages.FraudGraphMultiMessage
Added stage: <monitor-3; MonitorStage(description=Graph construction rate, smoothing=0.05, unit=messages, delayed_start=False, determine_count_fn=None, log_level=LogLevels.INFO)>
└─ stages.FraudGraphMultiMessage -> stages.FraudGraphMultiMessage
@@ -94,13 +94,13 @@ Added stage: <serialize-8; SerializeStage(include=[], exclude=['^ID$', '^_ts_'],
└─ morpheus.MultiMessage -> morpheus.MessageMeta
Added stage: <monitor-9; MonitorStage(description=Serialize rate, smoothing=0.05, unit=messages, delayed_start=False, determine_count_fn=None, log_level=LogLevels.INFO)>
└─ morpheus.MessageMeta -> morpheus.MessageMeta
- Added stage: <to-file-10; WriteToFileStage(filename=result.csv, overwrite=True, file_type=FileTypes.Auto, include_index_col=True, flush=False)>
+ Added stage: <to-file-10; WriteToFileStage(filename=output.csv, overwrite=True, file_type=FileTypes.Auto, include_index_col=True, flush=False)>
└─ morpheus.MessageMeta -> morpheus.MessageMeta
====Building Segment Complete!====
- Graph construction rate[Complete]: 265 messages [00:00, 866.07 messages/s]
- Inference rate[Complete]: 265 messages [00:03, 84.62 messages/s]
- Add classification rate[Complete]: 265 messages [00:03, 83.91 messages/s]
- Serialize rate[Complete]: 265 messages [00:03, 83.08 messages/s]
+ Graph construction rate[Complete]: 265 messages [00:00, 1218.88 messages/s]
+ Inference rate[Complete]: 265 messages [00:01, 174.04 messages/s]
+ Add classification rate[Complete]: 265 messages [00:01, 170.69 messages/s]
+ Serialize rate[Complete]: 265 messages [00:01, 166.36 messages/s]
====Pipeline Complete====
```

1 change: 0 additions & 1 deletion examples/gnn_fraud_detection_pipeline/requirements.yml
@@ -23,4 +23,3 @@ dependencies:
- dask>=2023.1.1
- dgl=1.0.2
- distributed>=2023.1.1
- - torchmetrics=0.11.4
8 changes: 4 additions & 4 deletions examples/gnn_fraud_detection_pipeline/stages/model.py
@@ -440,7 +440,7 @@ def build_fsi_graph(train_data, col_drop):
Normalized feature tensor after dropping specified columns.
Notes
-----
- This function takes the training data, represented as a pandas DataFrame,
+ This function takes the training data, represented as a cudf DataFrame,
and constructs a heterogeneous graph (DGLGraph) from the given edgelist
and node index.
@@ -449,8 +449,8 @@ def build_fsi_graph(train_data, col_drop):
Example
-------
- >>> import pandas as pd
- >>> train_data = pd.DataFrame({'node_id': [1, 2, 3],
+ >>> import cudf
+ >>> train_data = cudf.DataFrame({'node_id': [1, 2, 3],
... 'feature1': [0.1, 0.2, 0.3],
... 'feature2': [0.4, 0.5, 0.6]})
>>> col_drop = ['feature2']
@@ -461,7 +461,7 @@ def build_fsi_graph(train_data, col_drop):
feature_tensors = torch.from_dlpack(feature_tensors.toDlpack())
feature_tensors = (feature_tensors - feature_tensors.mean(0, keepdim=True)) / (0.0001 +
feature_tensors.std(0, keepdim=True))

# Create client, merchant, transaction node id tensors & move to torch.tensor
client_tensor, merchant_tensor, transaction_tensor = torch.tensor_split(
torch.from_dlpack(train_data[col_drop].values.toDlpack()).long(), 3, dim=1)
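
For readers without a GPU, the normalization and ID split in the hunk above can be approximated in plain Python. This is only an illustrative sketch, not code from the commit: `standardize_columns` and `split_id_columns` are hypothetical helper names, nested lists stand in for tensors, and the sample standard deviation (n - 1) is used on the assumption that it mirrors the default behaviour of `torch.std`.

```python
from statistics import mean, stdev

# Same small constant the hunk adds to the standard deviation.
EPSILON = 0.0001


def standardize_columns(rows):
    """Return rows with each column rescaled as (x - mean) / (EPSILON + std).

    Hypothetical helper: uses the sample standard deviation (n - 1).
    """
    columns = list(zip(*rows))
    means = [mean(column) for column in columns]
    stds = [stdev(column) for column in columns]
    return [
        [(value - m) / (EPSILON + s) for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


def split_id_columns(id_rows):
    """Split an N x 3 table of IDs into three flat lists.

    Hypothetical helper: the column order (client, merchant, transaction)
    follows the variable names in the hunk.
    """
    client_ids, merchant_ids, transaction_ids = (list(col) for col in zip(*id_rows))
    return client_ids, merchant_ids, transaction_ids


if __name__ == "__main__":
    features = [[0.1, 0.4], [0.2, 0.5], [0.3, 0.6]]
    print(standardize_columns(features))
    print(split_id_columns([[1, 10, 100], [2, 20, 200]]))
```

One difference worth noting: `torch.tensor_split(..., 3, dim=1)` on an N x 3 tensor yields three N x 1 tensors, whereas this sketch returns flat lists.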

@@ -103,7 +103,7 @@ def build_fsi_graph(train_data, col_drop):
"""Build a heterogeneous graph from an edgelist and node index.
Parameters
----------
- train_data : pd.DataFrame
+ train_data : cudf.DataFrame
Training data containing node features.
col_drop : list
List of features to drop from the node features.
@@ -117,7 +117,7 @@ def build_fsi_graph(train_data, col_drop):
Normalized feature tensor after dropping specified columns.
Notes
-----
- This function takes the training data, represented as a pandas DataFrame,
+ This function takes the training data, represented as a cudf DataFrame,
and constructs a heterogeneous graph (DGLGraph) from the given edgelist
and node index.
@@ -126,8 +126,8 @@ def build_fsi_graph(train_data, col_drop):
Example
-------
- >>> import pandas as pd
- >>> train_data = pd.DataFrame({'node_id': [1, 2, 3],
+ >>> import cudf
+ >>> train_data = cudf.DataFrame({'node_id': [1, 2, 3],
... 'feature1': [0.1, 0.2, 0.3],
... 'feature2': [0.4, 0.5, 0.6]})
>>> col_drop = ['feature2']