From 785a2bad92da72f9de33e30db4b25f9313b731cb Mon Sep 17 00:00:00 2001
From: David Gardner
Date: Tue, 5 Dec 2023 08:51:25 -0800
Subject: [PATCH 1/4] Remove dropna entry, this was a hang-over from when this
 document was titled as the quick start guide

---
 docs/source/cloud_deployment_guide.md | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/docs/source/cloud_deployment_guide.md b/docs/source/cloud_deployment_guide.md
index 6f1af0c2a8..7bd0afd6af 100644
--- a/docs/source/cloud_deployment_guide.md
+++ b/docs/source/cloud_deployment_guide.md
@@ -807,17 +807,3 @@ This section lists solutions to problems you might encounter with Morpheus or fr
 - Problem: If the standalone kafka cluster is receiving significant message throughput from the producer, this error may happen.
 - Solution: Reinstall the Morpheus workflow and reduce the Kafka topic's message retention time and message producing rate.
-
-## The dropna stage
-The Drop Null Attributes stage (dropna) requires the specification of a column name. This column will vary from use case (and its input data) to use case. These are the applicable columns for the pre-built pipelines provided by Morpheus.
-
-| Input | Columns |
-| ----- | ------- |
-| Azure DFP | userPrincipalName |
-| Duo DFP | username |
-| DFP Cloudtrail | userIdentitysessionContextsessionIssueruserName |
-| Email | data |
-| GNN | index, client_node, merchant_node |
-| Log Parsing | raw |
-| PCAP | data |
-| Ransomware | PID, Process, snapshot_id, timestamp, source |

From 2891ae44d20a890ec332091827c4a8e50e7911dc Mon Sep 17 00:00:00 2001
From: David Gardner
Date: Tue, 5 Dec 2023 09:38:43 -0800
Subject: [PATCH 2/4] Remove TOC entry for removed dropna stage fix link to
 building a pipeline doc Update flags for hammah and phishing pipelines to
 match updates in PR #1398

---
 docs/source/cloud_deployment_guide.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/source/cloud_deployment_guide.md b/docs/source/cloud_deployment_guide.md
index 7bd0afd6af..727618a243 100644
--- a/docs/source/cloud_deployment_guide.md
+++ b/docs/source/cloud_deployment_guide.md
@@ -47,7 +47,6 @@ limitations under the License.
 - [Additional Documentation](#additional-documentation)
 - [Troubleshooting](#troubleshooting)
 - [Common Problems](#common-problems)
-- [The dropna stage](#the-dropna-stage)
 ## Introduction
@@ -403,7 +402,7 @@ To publish messages to a Kafka topic, we need to copy datasets to locations wher
 kubectl -n $NAMESPACE exec sdk-cli-helper -- cp -R /workspace/examples/data /common
 ```
-Refer to the [Morpheus CLI Overview](https://github.com/nv-morpheus/Morpheus/blob/branch-23.11/docs/source/basics/overview.rst) and [Building a Pipeline](https://github.com/nv-morpheus/Morpheus/blob/branch-23.11/docs/source/basics/building_a_pipeline.rst) documentation for more information regarding the commands.
+Refer to the [Morpheus CLI Overview](https://github.com/nv-morpheus/Morpheus/blob/branch-23.11/docs/source/basics/overview.rst) and [Building a Pipeline](https://github.com/nv-morpheus/Morpheus/blob/branch-23.11/docs/source/basics/building_a_pipeline.md) documentation for more information regarding the commands.
 > **Note**: Before running the example pipelines, ensure the criteria below are met:
 - Ensure models specific to the pipeline are deployed.
@@ -445,6 +444,7 @@ helm install --set ngc.apiKey="$API_KEY" \
  --userid_filter=user123 \
  --feature_scaler=standard \
  --userid_column_name=userIdentitysessionContextsessionIssueruserName \
+ --timestamp_column_name="event_dt" \
  from-cloudtrail --input_glob=/common/models/datasets/validation-data/dfp-cloudtrail-*-input.csv \
  --max_files=200 \
  train-ae --train_data_glob=/common/models/datasets/training-data/dfp-cloudtrail-*.csv \
@@ -495,7 +495,7 @@ helm install --set ngc.apiKey="$API_KEY" \
  monitor --description 'Preprocess Rate' \
  inf-triton --model_name=phishing-bert-onnx --server_url=ai-engine:8000 --force_convert_inputs=True \
  monitor --description 'Inference Rate' --smoothing=0.001 --unit inf \
- add-class --label=pred --threshold=0.7 \
+ add-class --label=is_phishing --threshold=0.7 \
  serialize \
  to-file --filename=/common/data//phishing-bert-onnx-output.jsonlines --overwrite" \
  --namespace $NAMESPACE \
@@ -525,7 +525,7 @@ helm install --set ngc.apiKey="$API_KEY" \
  monitor --description 'Preprocess Rate' \
  inf-triton --force_convert_inputs=True --model_name=phishing-bert-onnx --server_url=ai-engine:8000 \
  monitor --description='Inference Rate' --smoothing=0.001 --unit inf \
- add-class --label=pred --threshold=0.7 \
+ add-class --label=is_phishing --threshold=0.7 \
  serialize --exclude '^ts_' \
  to-kafka --output_topic --bootstrap_servers broker:9092" \
  --namespace $NAMESPACE \

From ff8aac1791bb5886a5dc90566fcda50e8ef18e3c Mon Sep 17 00:00:00 2001
From: David Gardner
Date: Tue, 5 Dec 2023 09:49:34 -0800
Subject: [PATCH 3/4] Fix quoting issue

---
 docs/source/cloud_deployment_guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/cloud_deployment_guide.md b/docs/source/cloud_deployment_guide.md
index 727618a243..8bfbbe0322 100644
--- a/docs/source/cloud_deployment_guide.md
+++ b/docs/source/cloud_deployment_guide.md
@@ -444,7 +444,7 @@ helm install --set ngc.apiKey="$API_KEY" \
  --userid_filter=user123 \
  --feature_scaler=standard \
  --userid_column_name=userIdentitysessionContextsessionIssueruserName \
- --timestamp_column_name="event_dt" \
+ --timestamp_column_name=event_dt \
  from-cloudtrail --input_glob=/common/models/datasets/validation-data/dfp-cloudtrail-*-input.csv \
  --max_files=200 \
  train-ae --train_data_glob=/common/models/datasets/training-data/dfp-cloudtrail-*.csv \

From cb45dc01878ae96deaf30eb30c0ac025def3b118 Mon Sep 17 00:00:00 2001
From: David Gardner
Date: Tue, 5 Dec 2023 09:54:07 -0800
Subject: [PATCH 4/4] Link to the developer guides not the directory

---
 docs/source/cloud_deployment_guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/cloud_deployment_guide.md b/docs/source/cloud_deployment_guide.md
index 8bfbbe0322..f14b133e50 100644
--- a/docs/source/cloud_deployment_guide.md
+++ b/docs/source/cloud_deployment_guide.md
@@ -782,7 +782,7 @@ kubectl -n $NAMESPACE exec deploy/broker -c broker -- kafka-topics.sh \
 ## Additional Documentation
 For more information on how to use the Morpheus Python API to customize and run your own optimized AI pipelines, Refer to below documentation.
-- [Morpheus Developer Guide](https://github.com/nv-morpheus/Morpheus/tree/branch-23.11/docs/source/developer_guide)
+- [Morpheus Developer Guides](https://github.com/nv-morpheus/Morpheus/blob/branch-23.11/docs/source/developer_guide/guides.md)
 - [Morpheus Pipeline Examples](https://github.com/nv-morpheus/Morpheus/tree/branch-23.11/examples)