Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

split_v2 operator #13687

Merged
merged 1 commit into from
Jan 23, 2019
Merged

split_v2 operator #13687

merged 1 commit into from
Jan 23, 2019

Conversation

HyperZealot
Copy link
Contributor

Description

New version of split operator to match the behavior of numpy.split

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • New Split_v2 operator
  • Unit tests

Comments

@HyperZealot HyperZealot requested a review from szha as a code owner December 19, 2018 08:20
@Roshrini
Copy link
Member

@HyperZealot Thanks for working on this.
@mxnet-label-bot Add [pr-awaiting-review, Operator]

@marcoabreu marcoabreu added Operator pr-awaiting-review PR is waiting for code review labels Dec 19, 2018
@rongzha1
Copy link
Contributor

Could you explain the difference between split(sliceChannel) and split_v2 , and maybe show some performance compare between these two OP? Thanks

python/mxnet/ndarray/ndarray.py Show resolved Hide resolved
indices = [0] + list(indices_or_sections)
else:
raise ValueError('indices_or_sections must either int or tuple of ints')
return _internal._split_v2(ary, indices, axis, squeeze_axis, sections)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the symbol implementation and ndarray implementation are different? the ndarray impl always computes indices and here it may pass the number of sections directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not know the shape of a symbol so you cannot pre-compute the indices in symbol mode.

}
}; // struct SplitParam

inline TShape GetSplitIndices(const TShape& ishape, int axis, int sections) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe return Tuple may make more sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TShape is also a Tuple: https://github.com/dmlc/tvm/blob/master/nnvm/include/nnvm/tuple.h#L325. But maybe I can also switch to nnvm::Tuple.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know TShape uses Tuple. I think it's better to use Tuple for the return result. TShape makes the return result look like a shape.

Copy link
Contributor

@apeforest apeforest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain the difference between split(sliceChannel) and split_v2? Is it possible to update UI of split? We should avoid creating v2 of an operator as much as possible.

@HyperZealot
Copy link
Contributor Author

@apeforest Please also avoid v2 of the same question as well, I think ur question was asked by @rongzha1 already.
This version is to support split of an array at arbitrary points along as axis while the previous split only supported equal splits along an axis, to match numpy behavior of the same operator.
On the other hand, there was some API-breaking changes to cpp-package in my first attempt to modify the original split operator here so that's why I'm moving to creation of a new operator.

@Roshrini
Copy link
Member

Roshrini commented Jan 2, 2019

@zheng-da Can you take a look again to see if your comments are addressed? Thanks

}
}; // struct SplitParam

inline TShape GetSplitIndices(const TShape& ishape, int axis, int sections) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know TShape uses Tuple. I think it's better to use Tuple for the return result. TShape makes the return result look like a shape.

src/operator/tensor/matrix_op-inl.h Outdated Show resolved Hide resolved
const size_t section_size = indices[target + 1] - indices[target];
const size_t target_idx =
head_idx * trailing_size * section_size + mid_idx * trailing_size + tail_idx;
target_data[target_idx] = in_data[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned about this kernel. It takes a lot of computation to copy an element from the original array to a destination array. At least we should copy the entire row unless we split in the last dimension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt if memcpy is optimal for all cases actually, and splitting the kernel into 2 versions would also make the code harder to maintain and manage, what do you think if we treat that as a further optimization and do it in a follow-up PR so that the users of DGL can enjoy this new op ASAP?

src/operator/tensor/matrix_op-inl.h Outdated Show resolved Hide resolved
@szha szha merged commit 45d1a1e into apache:master Jan 23, 2019
@HyperZealot HyperZealot deleted the split_v2 branch January 23, 2019 21:10
jessr92 pushed a commit to jessr92/incubator-mxnet that referenced this pull request Jan 27, 2019
stephenrawls pushed a commit to stephenrawls/incubator-mxnet that referenced this pull request Feb 16, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Operator pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants