
Commit 2fc8a20

Author: liqing

beta 0.1.1.3

- fix benchmark script for older version adb
- add FAQ.md
- add environment requirement in Install.md
- add coeff in Eltwise Op
- fix bugs in strassen 1x1 data preparation
- add download failure process in get_model.sh
1 parent ab7a871 commit 2fc8a20

File tree: 13 files changed, +245 −55 lines


benchmark/bench_android.sh

+2-1
@@ -62,7 +62,8 @@ function bench_android() {
     find . -name "*.so" | while read solib; do
         adb push $solib $ANDROID_DIR
     done
-    adb push benchmark.out timeProfile.out $ANDROID_DIR
+    adb push benchmark.out $ANDROID_DIR
+    adb push timeProfile.out $ANDROID_DIR
     adb shell chmod 0777 $ANDROID_DIR/benchmark.out

     if [ "" != "$PUSH_MODEL" ]; then

benchmark/models/vgg16.mnn

-4.14 KB
Binary file not shown.

doc/FAQ.md

+111
@@ -0,0 +1,111 @@
+## Compiling FAQ
+### Environment Requirement
+
+cmake 3.10+
+gcc 4.9+
+protobuf 3.0+
+
+__Remember to run cmake again after upgrading gcc.__
+
+
+### schema/generate.sh Related Errors
+
+``` shell
+*** building flatc ***
+CMake Error: Could not find CMAKE_ROOT !!!
+```
+
+If the script fails with the error above, your CMake was not installed correctly.
+
+Try ```sudo apt install extra-cmake-modules``` or ```export CMAKE_ROOT=/path/to/where_cmake_installed``` to fix it.
+
+__Remember to run schema/generate.sh after editing the schema (*.proto).__
+
+
+### tools/script/get_model.sh Related Errors
+
+``` shell
+Could NOT find Protobuf (missing: Protobuf_INCLUDE_DIR)
+```
+
+``` shell
+Unrecognized syntax identifier "proto3". This parser only recognizes "proto2".
+```
+
+If the script fails with the errors above, your protobuf was not installed correctly. Follow [Protobuf's Installation Instructions](https://github.com/protocolbuffers/protobuf/blob/master/src/README.md) to install it.
+
+If multiple protobufs are installed and conflict with each other, you could try the solutions below:
+
+``` shell
+which protoc
+# comment out the path entry in .bashrc if it does NOT point to the correct protoc.
+source .bashrc
+sudo ldconfig
+```
+
+or
+
+``` shell
+# uninstall
+sudo apt-get remove libprotobuf-dev
+sudo apt-get remove protobuf-compiler
+sudo apt-get remove python-protobuf
+sudo rm -rf /usr/local/bin/protoc
+sudo rm -rf /usr/bin/protoc
+sudo rm -rf /usr/local/include/google
+sudo rm -rf /usr/local/include/protobuf*
+sudo rm -rf /usr/include/google
+sudo rm -rf /usr/include/protobuf*
+
+# install
+sudo apt-get update
+sudo ldconfig
+sudo apt-get install libprotobuf* protobuf-compiler python-protobuf
+```
+
+### Cross-compile on Windows
+
+Cross-compiling on Windows is not supported currently. You may try the Windows Subsystem for Linux, e.g. through https://github.com/microsoft/Terminal.
+
+
+### Quantized Models
+
+We support TensorFlow quantized models for now, and we plan to provide a model quantizing tool based on the MNN model format, which is training-free.
+
+
+### Unsupported Operations
+
+``` shell
+opConverter ==> MNN Converter NOT_SUPPORTED_OP: [ ANY_OP_NAME ]
+```
+
+If the MNNConverter fails with the error above, one or more operations are not supported by MNN. You could submit an issue or leave a comment at the pinned issue. If you want to implement the operation yourself, you can follow [our guide](AddOp_EN.md). Pull requests are always welcome.
+
+
+__The TensorFlow SSD model is not supported -- the TensorFlow Object Detection API produces some unsupported control-logic operations in the post-processing part, and the TensorFlow SSD model is not as efficient as the Caffe one. So it is recommended to use the Caffe version of the SSD model.__
+
+
+## Runtime FAQ
+
+### What is NC4HW4 Format?
+
+The difference between NCHW and NC4HW4 is like the difference between the planar and chunky color-representation methods. Imagine a 2x2 RGBA image: in planar representation (NCHW), its storage would be `RRRRGGGGBBBBAAAA`; in chunky representation (NC4HW4), its storage would be `RGBARGBARGBARGBA`. In MNN, we pack every 4 channels for floats, or every 8 channels for int8s, to gain better performance with SIMD.
+
+You can obtain a tensor's format through ```TensorUtils::getDescribe(tensor)->dimensionFormat```. If it returns `MNN_DATA_FORMAT_NC4HW4`, the channel dim is packed, which may cause the tensor's elementSize to be greater than the product of its dimensions.
+
+### How to Convert Between Formats?
+
+You can convert tensor formats using the code below:
+
+``` c++
+auto srcTensor = Tensor::create({1, 224, 224, 3}, Tensor::TENSORFLOW);
+// ... set srcTensor data
+auto dstTensor = net->getSessionInput(session, NULL);
+dstTensor->copyFromHostTensor(srcTensor);
+```
+
+### Why is copying output tensor data so slow on the GPU backend?
+
+If you do not wait for GPU inference to finish (through runSessionWithCallback with sync), copyToHostTensor has to wait for it before copying data.
+
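A minimal sketch of checking the packed layout described in the FAQ above, using the `TensorUtils::getDescribe(tensor)->dimensionFormat` call it quotes (TensorUtils is an internal MNN header, so the include path here is an assumption):

``` c++
#include <MNN/Tensor.hpp>
#include "TensorUtils.hpp" // internal MNN header; exact path varies by version

using namespace MNN;

// True when the channel dimension is packed (NC4HW4). In that case,
// elementSize() may exceed the plain product of the logical dimensions,
// since channels are rounded up to a multiple of 4.
static bool isChannelPacked(const Tensor* tensor) {
    return TensorUtils::getDescribe(tensor)->dimensionFormat == MNN_DATA_FORMAT_NC4HW4;
}
```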

doc/Install_CN.md

+8-3
@@ -27,7 +27,7 @@
 ## Linux|arm|aarch64|Darwin
 ### Build on Host
 Steps:
-1. Install cmake (version 3.10 or above is recommended)
+1. Install cmake (version 3.10 or above is recommended), protobuf (version 3.0 or above) and gcc (version 4.9 or above)
 2. `cd /path/to/MNN`
 3. `./schema/generate.sh`
 4. `./tools/script/get_model.sh` (optional; models are needed only by the demo projects)
@@ -72,7 +72,7 @@ make -j4
 ## Android

 Steps:
-1. Install cmake (version 3.10 or above is recommended)
+1. Install cmake (version 3.10 or above is recommended), protobuf (version 3.0 or above) and gcc (version 4.9 or above)
 2. Download and install the NDK from `https://developer.android.com/ndk/downloads/`; preferably no later than r17 (otherwise gcc cannot be used to build, and clang has a bug when building the 32-bit .so)
 3. Set the NDK environment variable in .bashrc or .bash_profile, e.g. export ANDROID_NDK=/Users/username/path/to/android-ndk-r14b
 4. `cd /path/to/MNN`
@@ -84,4 +84,9 @@ make -j4

 ## iOS

-On macOS, open project/ios/MNN.xcodeproj with Xcode and click build
+Steps:
+1. Install protobuf (version 3.0 or above)
+2. `cd /path/to/MNN`
+3. `./schema/generate.sh`
+4. `./tools/script/get_model.sh` (optional; models are needed only by the demo projects)
+5. On macOS, open project/ios/MNN.xcodeproj with Xcode and click build

doc/Install_EN.md

+9-5
@@ -25,10 +25,10 @@ Defaults `OFF`, When `ON`, build the Metal backend, apply GPU according to setti
 ## Linux|arm|aarch64|Darwin

 ### Build on Host
-1. Install cmake (cmake version >= 3.10 is recommended)
+1. Install cmake (version >= 3.10 is recommended), protobuf (version >= 3.0 is required) and gcc (version >= 4.9 is required)
 2. `cd /path/to/MNN`
 3. `./schema/generate.sh`
-4. `./tools/script/get_model.sh` (optional, models are required only in demo project)
+4. `./tools/script/get_model.sh` (optional, models are needed only in the demo projects)
 5. `mkdir build && cd build && cmake .. && make -j4`

 Then you will get the MNN library (libMNN.so)
@@ -70,16 +70,20 @@ make -j4

 ## Android

-1. Install cmake (cmake version >= 3.10 is recommended)
+1. Install cmake (version >= 3.10 is recommended), protobuf (version >= 3.0 is required) and gcc (version >= 4.9 is required)
 2. [Download and Install NDK](https://developer.android.com/ndk/downloads/); a version before r17 is strongly recommended (otherwise gcc cannot be used to build, and building armv7 with clang may fail)
 3. Set the ANDROID_NDK path, e.g.: `export ANDROID_NDK=/Users/username/path/to/android-ndk-r14b`
 4. `cd /path/to/MNN`
 5. `./schema/generate.sh`
-6. `./tools/script/get_model.sh` (optional, models are required only in demo project)
+6. `./tools/script/get_model.sh` (optional, models are needed only in the demo projects)
 7. `cd project/android`
 8. Build armv7 library: `mkdir build_32 && cd build_32 && ../build_32.sh`
 9. Build armv8 library: `mkdir build_64 && cd build_64 && ../build_64.sh`

 ## iOS

-open [MNN.xcodeproj](../project/ios/) with Xcode on macOS, then build.
+1. Install protobuf (version >= 3.0 is required)
+2. `cd /path/to/MNN`
+3. `./schema/generate.sh`
+4. `./tools/script/get_model.sh` (optional, models are needed only in the demo projects)
+5. Open [MNN.xcodeproj](../project/ios/) with Xcode on macOS, then build.

schema/default/CaffeOp.fbs

+1
@@ -162,6 +162,7 @@ enum EltwiseType : byte {

 table Eltwise {
     type:EltwiseType;
+    coeff:[float];
 }

 table Flatten {
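A note on why this schema change is backward compatible: flatbuffers allows appending new fields to a table, and readers of an old buffer get a null pointer for the missing field. That is exactly what the new CPUEltwise constructor below checks before copying. A minimal sketch of the pattern (the generated header name is an assumption):

``` c++
#include <vector>
#include "MNN_generated.h" // flatbuffers-generated types; header name may vary

// Models serialized before this commit carry no coeff field; the generated
// accessor then returns nullptr, so guard before dereferencing.
void readCoeffIfPresent(const MNN::Eltwise* param, std::vector<float>& out) {
    if (param->coeff()) {
        out.assign(param->coeff()->begin(), param->coeff()->end());
    }
}
```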

source/backend/cpu/CPUEltwise.cpp

+31-6
@@ -20,14 +20,40 @@

 namespace MNN {

+CPUEltwise::CPUEltwise(Backend *b, const MNN::Op *op) : Execution(b) {
+    auto eltwiseParam = op->main_as_Eltwise();
+    mType = eltwiseParam->type();
+
+    // keep compatible with old model
+    if (eltwiseParam->coeff()) {
+        const int size = eltwiseParam->coeff()->size();
+        mCoeff.resize(size);
+        memcpy(mCoeff.data(), eltwiseParam->coeff()->data(), size * sizeof(float));
+    }
+}
+
 ErrorCode CPUEltwise::onExecute(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs) {
     auto inputTensor = inputs[0];
     const int size = inputTensor->elementSize();
     auto sizeQuad = size / 4;

-    auto outputTensor = outputs[0];
-    auto outputHost = outputTensor->host<float>();
-    auto proc = MNNMatrixProd;
+    auto outputTensor = outputs[0];
+    auto outputHost = outputTensor->host<float>();
+    const auto input0Ptr = inputs[0]->host<float>();
+
+    const int coeffSize = mCoeff.size();
+    bool isIdentity = coeffSize >= 2;
+    if (isIdentity) {
+        // when Eltwise has coeff
+        if (mCoeff[0] == 1.0f && mCoeff[1] == 0.0f) {
+            memcpy(outputHost, input0Ptr, inputs[0]->size());
+            return NO_ERROR;
+        } else {
+            return NOT_SUPPORT;
+        }
+    }
+
+    auto proc = MNNMatrixProd;
     switch (mType) {
         case EltwiseType_PROD:
             proc = MNNMatrixProd;
@@ -44,7 +70,7 @@ ErrorCode CPUEltwise::onExecute(const std::vector<Tensor *> &inputs, const std::
     }

     auto inputT1 = inputs[1];
-    proc(outputHost, inputs[0]->host<float>(), inputT1->host<float>(), sizeQuad, 0, 0, 0, 1);
+    proc(outputHost, input0Ptr, inputT1->host<float>(), sizeQuad, 0, 0, 0, 1);
     for (int i = 2; i < inputs.size(); ++i) {
         proc(outputHost, outputHost, inputs[i]->host<float>(), sizeQuad, 0, 0, 0, 1);
     }
@@ -55,8 +81,7 @@ class CPUEltwiesCreator : public CPUBackend::Creator {
 public:
     virtual Execution *onCreate(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs,
                                 const MNN::Op *op, Backend *backend) const {
-        auto elt = op->main_as_Eltwise();
-        return new CPUEltwise(backend, elt->type());
+        return new CPUEltwise(backend, op);
     }
 };
 REGISTER_CPU_OP_CREATOR(CPUEltwiesCreator, OpType_Eltwise);
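The execution above only fast-paths the identity coefficients `{1.0f, 0.0f}` (a plain copy of the first input) and returns NOT_SUPPORT for anything else. For reference, a hedged sketch of what full Caffe-style coefficient support would compute; the function name is illustrative, not MNN API:

``` c++
#include <cstddef>
#include <vector>

// Caffe's Eltwise SUM with coefficients computes out = sum_k coeff[k] * in[k].
// With coeff = {1.0f, 0.0f} this reduces to out = in[0], which is the memcpy
// fast path taken in the diff above.
void eltwiseSumWithCoeff(float* out, const std::vector<const float*>& ins,
                         const std::vector<float>& coeff, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        float acc = 0.0f;
        for (size_t k = 0; k < ins.size(); ++k) {
            acc += coeff[k] * ins[k][i];
        }
        out[i] = acc;
    }
}
```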

source/backend/cpu/CPUEltwise.hpp

+2-3
@@ -15,14 +15,13 @@
 namespace MNN {
 class CPUEltwise : public Execution {
 public:
-    CPUEltwise(Backend *b, MNN::EltwiseType type) : Execution(b), mType(type) {
-        // nothing to do
-    }
+    CPUEltwise(Backend *b, const MNN::Op *op);
     virtual ~CPUEltwise() = default;
     virtual ErrorCode onExecute(const std::vector<Tensor *> &inputs, const std::vector<Tensor *> &outputs) override;

 private:
     EltwiseType mType;
+    std::vector<float> mCoeff;
 };

 } // namespace MNN

source/backend/cpu/compute/Convolution1x1Strassen.cpp

+2-2
@@ -118,13 +118,13 @@ ErrorCode Convolution1x1Strassen::onResize(const std::vector<Tensor *> &inputs,
     for (oyStart = 0; oyStart * strideY - padY < 0; ++oyStart) {
         // do nothing
     }
-    for (oyEnd = oh - 1; oyEnd * strideY - padY >= ih - 1; --oyEnd) {
+    for (oyEnd = oh - 1; oyEnd * strideY - padY >= ih; --oyEnd) {
         // do nothing
     }
     for (oxStart = 0; oxStart * strideX - padX < 0; ++oxStart) {
         // do nothing
     }
-    for (oxEnd = oh - 1; oxEnd * strideX - padX >= iw - 1; --oxEnd) {
+    for (oxEnd = ow - 1; oxEnd * strideX - padX >= iw; --oxEnd) {
         // do nothing
     }
     int oyCount = oyEnd - oyStart + 1;
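The fix above corrects two bugs in the bound computation: the x loop started from `oh` instead of `ow`, and the end condition excluded outputs that read the very last input row or column. The last valid output index o is the largest one with `o * stride - pad <= in - 1`, so the loop must decrement while `o * stride - pad >= in`, not `>= in - 1`. A small standalone check (illustrative code, not MNN's):

``` c++
#include <cassert>

// Largest output index whose sampled input index (o * stride - pad) still
// falls inside [0, inSize); mirrors the corrected loops in the diff above.
int lastValidOutput(int outSize, int stride, int pad, int inSize) {
    int o = outSize - 1;
    while (o >= 0 && o * stride - pad >= inSize) {
        --o;
    }
    return o;
}

int main() {
    // inSize = 5, stride = 2, pad = 0: output 2 samples input row 4, the last
    // valid row. The corrected bound keeps it; the old `>= inSize - 1` test
    // would have discarded it.
    assert(lastValidOutput(3, 2, 0, 5) == 2);
    return 0;
}
```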

tools/converter/source/MNNDump2Json.cpp

+2
@@ -40,6 +40,8 @@ int main(int argc, const char** argv) {
         } else if (type == MNN::OpParameter::OpParameter_MatMul) {
             opParam->main.AsMatMul()->weight.clear();
             opParam->main.AsMatMul()->bias.clear();
+        } else if (type == MNN::OpParameter::OpParameter_PRelu) {
+            opParam->main.AsPRelu()->slope.clear();
         }
     }
     flatbuffers::FlatBufferBuilder newBuilder(1024);

tools/converter/source/caffe/Eltwise.cpp

+9-2
@@ -7,6 +7,7 @@
 //

 #include "OpConverter.hpp"
+#include "logkit.h"

 class EltWise : public OpConverter {
 public:
@@ -26,8 +27,8 @@ class EltWise : public OpConverter {
 void EltWise::run(MNN::OpT* dstOp, const caffe::LayerParameter& parameters, const caffe::LayerParameter& weight) {
     auto elt = new MNN::EltwiseT;
     dstOp->main.value = elt;
-    auto& c = parameters.eltwise_param();
-    switch (c.operation()) {
+    auto& caffeParam = parameters.eltwise_param();
+    switch (caffeParam.operation()) {
         case caffe::EltwiseParameter_EltwiseOp_MAX:
             elt->type = MNN::EltwiseType_MAXIMUM;
             break;
@@ -41,5 +42,11 @@ void EltWise::run(MNN::OpT* dstOp, const caffe::LayerParameter& parameters, cons
         default:
             break;
     }
+
+    const int coffSize = caffeParam.coeff_size();
+    elt->coeff.resize(coffSize);
+    for (int i = 0; i < coffSize; ++i) {
+        elt->coeff[i] = caffeParam.coeff(i);
+    }
 }
 static OpConverterRegister<EltWise> a("Eltwise");
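For illustration only (not part of the commit), this is what the converter above would produce for a Caffe Eltwise SUM layer with coefficients {1.0, -1.0}, i.e. an element-wise subtraction; the helper is hypothetical:

``` c++
#include "MNN_generated.h" // flatbuffers-generated types; header name may vary

// Hypothetical converter output for a SUM layer with coeff {1.0, -1.0}.
MNN::EltwiseT makeSubtractEltwise() {
    MNN::EltwiseT elt;
    elt.type  = MNN::EltwiseType_SUM;
    elt.coeff = {1.0f, -1.0f};
    return elt;
}
```

Note that, as of this commit, the CPU backend only accepts the identity coefficients {1.0f, 0.0f}; any other coeff makes onExecute return NOT_SUPPORT.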
