This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

add mkldnn softmax_output #13699

Merged
merged 17 commits into from
Feb 13, 2019

@rongzha1
Contributor

rongzha1 commented Dec 20, 2018

Description

Add an MKL-DNN implementation for the softmax_output operator.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • [x] Changes are complete (i.e. I finished coding on this PR)
  • [x] All changes have test coverage
  • [x] To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

test_softmax passes.

@pengzhao-intel

@pengzhao-intel
Contributor

@rongzha1 please help fix the lint issue

Warning, treated as error:

/work/mxnet/python/mxnet/ndarray/init.py:docstring of mxnet.ndarray.SoftmaxOutput:30: WARNING: Bullet list ends without a blank line; unexpected unindent.

src/operator/nn/mkldnn/mkldnn_softmax_output.cc (outdated, resolved)
}

auto input_mem = idata.GetMKLDNNData();
auto output_mem = odata.GetMKLDNNData();
Member

This function will create memory with default layout.

Member

Is it possible for softmax_output to have input with an internal layout? If so, what does the output look like?

src/operator/nn/mkldnn/mkldnn_softmax_output.cc (outdated, resolved)
src/operator/softmax_output-inl.h (outdated, resolved)
src/operator/softmax_output-inl.h (resolved)
src/operator/softmax_output-inl.h (outdated, resolved)
@pengzhao-intel
Contributor

@rongzha1 could you post some performance changes by this PR?

@rongzha1
Contributor Author

> @rongzha1 could you post some performance changes by this PR?

For a 1024*256 input, the speedup is about 2.75x.

Env: SKX-8180, 1 socket, 28 cores
Model: wide_and_deep

old (ms)    opt (ms)    speedup
0.1890553   0.0687512   2.7498492
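
The speedup column is the old time divided by the optimized time. A minimal sketch of that arithmetic, where `Speedup` is a hypothetical helper name and not something from the PR:

```cpp
#include <cassert>

// Speedup of an optimized run relative to a baseline: baseline / optimized.
// Both timings must use the same unit (milliseconds in the table above).
// Hypothetical helper for illustration only.
double Speedup(double baseline_ms, double optimized_ms) {
  assert(optimized_ms > 0.0);  // avoid division by zero
  return baseline_ms / optimized_ms;
}
```

Calling `Speedup(0.1890553, 0.0687512)` reproduces the reported figure of roughly 2.75.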

Contributor

@sandeep-krishnamurthy sandeep-krishnamurthy left a comment

Thank you for your contributions.
@azai91 - You may be interested in this PR.

src/operator/nn/mkldnn/mkldnn_softmax_output.cc (outdated, resolved)
src/operator/nn/mkldnn/mkldnn_softmax_output.cc (outdated, resolved)
@pengzhao-intel
Contributor

@rongzha1 please rebase the code and make the CI pass :)

@rongzha1
Contributor Author

rongzha1 commented Jan 3, 2019

Please help review this and merge it into the master branch. @eric-haibin-lin @zheng-da @azai91

Contributor

@pengzhao-intel pengzhao-intel left a comment

WIP to review :)

#include "./mkldnn_ops-inl.h"
#include "./mkldnn_base-inl.h"

#if MXNET_USE_MKLDNN == 1
Contributor

Move this before the includes.

Contributor Author

Done

// softmax_output has no axis parameter, so use it as it original implement.
int axis = data.shape().ndim() - 1;
mkldnn::softmax_forward::desc desc = is_train
? mkldnn::softmax_forward::desc(mkldnn::prop_kind::forward_training,
Contributor

Is training mode supported now?

Contributor Author

This follows what mkldnn_softmax does.

? mkldnn::softmax_forward::desc(mkldnn::prop_kind::forward_training,
data_md, axis)
: mkldnn::softmax_forward::desc(mkldnn::prop_kind::forward_scoring,
data_md, axis);
Member

auto prop = is_train ? mkldnn::prop_kind::forward_training : mkldnn::prop_kind::forward_scoring;
auto desc = mkldnn::softmax_forward::desc(prop, data_md, axis);

Contributor Author

OK
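
For reference, the `ndim() - 1` rule in the snippet under review means softmax is normalized over the last axis. Below is a minimal plain C++ sketch of that behavior without MKL-DNN, assuming a dense row-major buffer. `SoftmaxLastAxis` and its `shape` argument are hypothetical names for illustration, not part of the PR:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax along the last axis of a dense row-major tensor.
// Hypothetical helper: it only illustrates the "axis = ndim - 1" rule.
std::vector<float> SoftmaxLastAxis(const std::vector<float>& data,
                                   const std::vector<std::size_t>& shape) {
  assert(!shape.empty());
  const std::size_t axis = shape.size() - 1;  // same rule as ndim() - 1
  const std::size_t channels = shape[axis];
  assert(channels > 0 && data.size() % channels == 0);

  std::vector<float> out(data.size());
  // Each run of `channels` consecutive elements is normalized independently.
  for (std::size_t start = 0; start < data.size(); start += channels) {
    const float* row = data.data() + start;
    // Subtract the row maximum before exponentiating for numerical stability.
    const float row_max = *std::max_element(row, row + channels);
    float sum = 0.0f;
    for (std::size_t c = 0; c < channels; ++c) {
      out[start + c] = std::exp(row[c] - row_max);
      sum += out[start + c];
    }
    for (std::size_t c = 0; c < channels; ++c) {
      out[start + c] /= sum;
    }
  }
  return out;
}
```

With a `{2, 3}` shape, each row of three values sums to 1 after the call; the MKL-DNN primitive in the PR is expected to produce the same values for the default layout.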


static mkldnn::softmax_forward::primitive_desc GetSoftmaxOutputFwdDescImpl(
const SoftmaxOutputParam& param, bool is_train,
const NDArray &data, const mkldnn::memory &input_mem) {
Member

Why do we need both data and input_mem at the same time?

Contributor Author

OK, remove data

return op;

DMLC_REGISTER_PARAMETER(SoftmaxOutputParam);
struct SoftmaxOutputGrad {
Member

@eric-haibin-lin Please help to review this. Do we have any existing gradient structure to do this?

auto input_mem = idata.GetMKLDNNData();
auto output_mem = odata.GetMKLDNNData();

MKLDNNSoftmaxOutputFwd &fwd = GetSoftmaxOutputForward(param, ctx, idata, *input_mem);
Member

label is not used?

Contributor Author

The label is used in the backward pass.
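
To illustrate why the forward pass can ignore the label, here is a minimal sketch of the usual softmax cross-entropy gradient, `prob - one_hot(label)`, which only needs the label at backward time. It assumes integer class labels and leaves out operator options such as `grad_scale` and `ignore_label`; `SoftmaxOutputGradSketch` is a hypothetical name, not the PR's code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gradient with respect to the logits: prob - one_hot(label).
// `prob` holds the forward softmax outputs, row-major, `num_classes` per row.
// Hypothetical helper; the real operator supports more options.
std::vector<float> SoftmaxOutputGradSketch(const std::vector<float>& prob,
                                           const std::vector<std::size_t>& label,
                                           std::size_t num_classes) {
  assert(num_classes > 0 && prob.size() == label.size() * num_classes);
  std::vector<float> grad(prob);  // start from the forward output
  for (std::size_t i = 0; i < label.size(); ++i) {
    assert(label[i] < num_classes);
    grad[i * num_classes + label[i]] -= 1.0f;  // subtract the one-hot target
  }
  return grad;
}
```

The forward output is the only other input the gradient needs, which matches the reply above.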

gnode->attrs.op = nnvm::Op::Get("_backward_SoftmaxOutput");
gnode->attrs.name = n->attrs.name + "_backward";
std::vector<nnvm::NodeEntry> in_grad(2);
for (uint32_t i = 0; i < 2; ++i) {
Member

If there are only two elements, there is no need for a for-loop.

Contributor Author

OK

src/operator/softmax_output.cc (resolved)
const std::vector<NDArray> &outputs) {
CHECK_EQ(inputs.size(), 2U);
const SoftmaxOutputParam &param = nnvm::get<SoftmaxOutputParam>(attrs.parsed);
// MKLDNN softmaxOutput only works well on the special MKLDNN layout.
Member

What does "special MKLDNN layout" mean here?

Contributor Author

It means ndim 1, 2, and 4 are supported. I will remove this ambiguous comment.
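
A minimal sketch of the dispatch this reply describes, assuming the only condition is the input rank. The actual check in the PR is not shown in this thread, and `ComputePath`, `SupportsRankSketch`, and `ChoosePathSketch` are hypothetical names:

```cpp
#include <cassert>

// Which implementation would handle the call. Hypothetical enum.
enum class ComputePath { kMKLDNN, kFallback };

// Accept only the ranks named in the reply above: 1, 2, and 4.
bool SupportsRankSketch(int ndim) {
  return ndim == 1 || ndim == 2 || ndim == 4;
}

// Pick the MKL-DNN path when the rank is supported; otherwise fall back
// to the default CPU implementation.
ComputePath ChoosePathSketch(int ndim) {
  assert(ndim >= 0);
  return SupportsRankSketch(ndim) ? ComputePath::kMKLDNN
                                  : ComputePath::kFallback;
}
```

For example, a 3-dimensional input would take the fallback branch, while a 2-dimensional one would use the MKL-DNN path.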

Contributor

@pengzhao-intel pengzhao-intel left a comment

Thanks for the contribution.

LGTM


MXNET_REGISTER_OP_PROPERTY(Softmax, DeprecatedSoftmaxProp)
Member

@szha could you help take a look at this change? SoftmaxOutput is rewritten in NNVM style in this PR, and the deprecated Softmax becomes an alias of SoftmaxOutput. We need your confirmation that this doesn't break any API.

Member

It seems they differ only by the label field, which means the note isn't accurate; if that counts as breakage, it already happened in the past. The Softmax op as it stands doesn't look really usable, so I think making it an alias of SoftmaxOutput is actually an improvement.

@TaoLv
Member

TaoLv commented Jan 30, 2019

Ping @szha @eric-haibin-lin for review. Thank you.

@pengzhao-intel
Contributor

@szha could you help take a look at the API change?

@rongzha1
Contributor Author

Hi @szha @eric-haibin-lin, can you help review this PR and merge it into the master branch? Thanks.

@TaoLv
Member

TaoLv commented Feb 12, 2019

@rongzha1 Please re-base code and re-trigger CI. I will merge this PR tomorrow if there is no other comment.

Member

@TaoLv TaoLv left a comment

Nit comments.


auto input_mem = idata.GetMKLDNNData();
auto out_mem = CreateMKLDNNMem(out_data[softmaxout_enum::kOut],
input_mem->get_primitive_desc(), req[softmaxout_enum::kOut]);
Member

Indent.

}
FallBackCompute(SoftmaxOutputCompute<cpu>, attrs, ctx, inputs, req, outputs);
}

Member

remove blank line.

return {"data", "label"};
}


Member

reduce to 1 blank line.

@TaoLv
Member

TaoLv commented Feb 13, 2019

Merging now.

@TaoLv TaoLv merged commit 45978a9 into apache:master Feb 13, 2019
stephenrawls pushed a commit to stephenrawls/incubator-mxnet that referenced this pull request Feb 16, 2019
* add mkldnn softmax_output

* fix gpu OP unittest error

* fix ci/jenkins/mxnet-validation/unix-gpu compiler error

* fix coding style

* fix Tao comments

* remove blank line, fix indentx

* modify according to sandeep's comments

* change get CPU engine method, and pravate variable

* move macro MXNET_USE_MKLDNN to the head

* modify according to Tao's comments

* make output layout as input

* change API of GetSoftmaxOutputForward

* add CommitOutput for mkldnn_softmax_output

* trigger Jenkins re-test

* add alias Softmax symbol for SoftmaxOutput OP

* indent and remove blank line
jessr92 pushed a commit to jessr92/incubator-mxnet that referenced this pull request Feb 19, 2019
drivanov pushed a commit to drivanov/incubator-mxnet that referenced this pull request Mar 4, 2019
vdantu pushed a commit to vdantu/incubator-mxnet that referenced this pull request Mar 31, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019