microsoft / CNTK

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

Home Page: https://docs.microsoft.com/cognitive-toolkit/


Iteration Plan (August - September 2017)

cha-zhang opened this issue

This plan captures our work from early August to mid-September. We will ship around September 15th. Major work items of this iteration include Volta 16bit support and a C#/.NET API. There are also numerous other improvements, as detailed below.

Endgame

  • September 11: Code freeze for the end game
  • September 15: Release date

Planned items

We plan to ship these items at the end of this iteration.

Legend of annotations:

  • Item not started
  • Item finished
  • 🏃 Work in progress
  • ✋ Blocked
  • 💪 Stretch

Documentation

  • Add HTML version of tutorials and manuals so that they are searchable
  • Add missing evaluation documents

System

  • ✋ 16bit support for training on Volta GPU (limited functionality)
  • Update learner interface to simplify parameter setting and adding new learners (potential breaking change)
  • A preliminary C#/.NET API that enables people to train simple networks such as ConvNet on MNIST
  • R-binding for training and evaluation (will be published in a separate repository)
  • ✋ Improve statistics for distributed evaluation

Examples

  • Faster R-CNN object detection
    • Enable arbitrary input image size via free static axis for convolution (see the sketch after this list)
    • C++ implementation of some Python layers
    • Usability improvements
  • ✋ New example for natural language processing (NLP)
  • Semantic segmentation (stretch goal)
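As an aside on the free static axis item above: a minimal Python sketch, assuming a CNTK build that exposes `C.FreeDimension` on convolution inputs (the layer shapes here are purely illustrative). The spatial dimensions are left unspecified so images of any size can be fed:

```python
import cntk as C

# Hypothetical input: 3 channels, height and width left as free static axes.
x = C.input_variable((3, C.FreeDimension, C.FreeDimension))

# A convolution layer can then run on arbitrarily sized images.
conv = C.layers.Convolution2D((5, 5), num_filters=16, pad=True,
                              activation=C.relu)(x)
```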

Operations

  • Noise contrastive estimation node (a conceptual sketch follows this list)
  • Aggregation on sparse gradient for embedding layer
  • Gradient as an operator (stretch goal)
  • Reduced rank for convolution in C++ to enable convolution on 1D data
  • Dilated convolution
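On the noise contrastive estimation item: below is a conceptual NumPy sketch of the NCE objective, not the planned CNTK node; all names and shapes are illustrative. The idea is to learn to separate the true target word from k sampled noise words instead of computing a full-vocabulary softmax:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(score_true, scores_noise, log_pn_true, log_pn_noise, k):
    """score_true: model score of the observed word;
    scores_noise: scores of k words sampled from the noise distribution;
    log_pn_*: log noise probabilities of those same words."""
    # P(data | w) = sigmoid(score(w) - log(k * P_noise(w)))
    p_true = sigmoid(score_true - np.log(k) - log_pn_true)
    p_noise = sigmoid(scores_noise - np.log(k) - log_pn_noise)
    # Classify the true word as data and the noise samples as noise.
    return -(np.log(p_true) + np.sum(np.log(1.0 - p_noise)))
```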

Performance

  • Asynchronous evaluation API (Python and C#)
  • ✋ Intel MKL update to improve inference speed on CPU by around 2x on AlexNet

Keras and TensorBoard

  • Example on Keras and SKLearn multi-GPU support on CNTK
  • Image feature support with TensorBoard for CNTK

    Others

    Looking forward to new examples for NLP. Thanks.

@meijiesky could you help list the NLP examples that can be done in TensorFlow but still have no counterpart in CNTK? Thank you

@JimSEOW I am interested in seeing examples on using reinforcement learning to generate dialogue responses and GANs to train language models.

There are two reinforcement learning examples, CartPole and FlappyBird. In NLP, there are some differences.

1. The action space is huge because we need to sample one word from the vocabulary.
2. The reward is calculated using networks rather than being given immediately by the environment, making the training process very slow.
3. Sequences are of different lengths, so I can only train one sequence at a time, sampling one token and calculating the loss at each generation step until the end of the sequence.

Besides, I have seen the tutorial on using a GAN to generate images; that model is continuous because the image comes straight out of the generator network. However, for NLP the model is discrete because we need to use a softmax and sample one token at each time step.
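To make points 1 and 3 concrete, here is a small framework-agnostic Python sketch of per-step token sampling; `step_logits_fn` is a hypothetical callback that runs one step of the network, and the vocabulary size is illustrative:

```python
import numpy as np

VOCAB_SIZE, MAX_LEN, EOS_ID = 10000, 20, 0  # illustrative values

def sample_sequence(step_logits_fn):
    """Sample one token per step from the softmax and record its
    log-probability, so a REINFORCE-style loss can be formed later."""
    tokens, logps, state = [], [], None
    for _ in range(MAX_LEN):
        logits, state = step_logits_fn(tokens, state)  # one network step
        p = np.exp(logits - logits.max())
        p /= p.sum()                                   # softmax over vocabulary
        tok = int(np.random.choice(VOCAB_SIZE, p=p))
        tokens.append(tok)
        logps.append(np.log(p[tok]))
        if tok == EOS_ID:                              # variable-length sequences
            break
    return tokens, logps
```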

I really enjoy CNTK and I would truly appreciate it if you could provide examples using RL or GANs for NLP. Thank you.

PS: A beam search decoder is still not available; I hope we can have beam search in Python someday.
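For reference, a decoder along the lines of the missing beam search could look like this minimal framework-agnostic sketch; `step_fn` is a hypothetical function returning log-probabilities over the vocabulary for a given prefix:

```python
import numpy as np

def beam_search(step_fn, beam_width=4, max_len=20, bos=1, eos=0):
    beams = [([bos], 0.0)]                  # (prefix, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == eos:           # finished hypotheses carry over
                candidates.append((prefix, score))
                continue
            logp = step_fn(prefix)          # shape: (vocab_size,)
            for tok in np.argsort(logp)[-beam_width:]:
                candidates.append((prefix + [int(tok)],
                                   score + float(logp[tok])))
        beams = sorted(candidates, key=lambda c: c[1],
                       reverse=True)[:beam_width]
    return beams[0][0]                      # best-scoring hypothesis
```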

    Does point 2 under performance (Intel MKL update to improve inference speed on CPU by around 2x on AlexNet) refer to the MKL-DNN library integration or is the speed-up caused by an update of the MKL library to a newer version? I'm most eager to try the MKL-DNN integration. :)

@meijiesky FYI, I do not work for the CNTK team. A few months back, some CNTK users decided to gather and consolidate our feedback on the need for .NET modeling of CNTK networks beyond simple evaluation. This is FINALLY happening, step by step.

Another important milestone is bringing CNTK to UWP: x64 is done; ARM64 (for both Windows 10 on ARM64 and IoT Core Pro for ARM64) is WIP.

Most important now for CNTK development is to identify:

  • What are the use cases that have been done in TensorFlow BUT are still missing in CNTK?
  • What is limiting this?
  • Is there a challenge in translating a TensorFlow NETWORK design to CNTK?
  • What can we suggest to the CNTK team to make it easier to port a TensorFlow network to a CNTK network?

Please help the community lobby and contribute TO NARROW the GAPs in use cases between TensorFlow and CNTK.

What the CNTK team has shown is that THEY NOW RESPOND in an AGILE way!

@JimSEOW Thanks for your explanation. Personally, I think CNTK processes RNN networks much faster than other DL frameworks, and that's why I chose CNTK for NLP. However, CNTK is a fairly new tool compared to TensorFlow, and there are not as many resources for CNTK as for TensorFlow, so CNTK users sometimes have trouble resolving their problems or cannot find resources to refer to. With better documentation, more tutorials and examples, and stronger performance, sooner or later TensorFlow users will switch to CNTK.

    @e-thereal

    Does point 2 under performance (Intel MKL update to improve inference speed on CPU by around 2x on AlexNet) refer to the MKL-DNN library integration or is the speed-up caused by an update of the MKL library to a newer version? I'm most eager to try the MKL-DNN integration. :)

A link posted at https://gitter.im/Microsoft/CNTK?at=5985a8d81c8697534a8e23d6 (https://www.nextplatform.com/2017/03/21/can-fpgas-beat-gpus-accelerating-next-generation-deep-learning/) also discusses MKL-DNN and might be of interest; the other links there may interest you as well.

@e-thereal Point 2 under performance is an update of the MKL library. This is our collaboration with Intel, and they are not eager to move to MKL-DNN yet.

@cha-zhang Do you know if the MKL update plans on using the NN "primitives" that are available in MKL 2017? I imagine this could result in performance improvements for inference on CNNs even without a full transition to MKL-DNN. Or is the main performance improvement coming from running GEMM with AVX-512 on Skylake-SP? (i.e., no performance improvements expected on older Intel processors.)

    @bencherian This will be a refresh to MKL 2017.

=> Whenever possible, perhaps we could list how recent user requests relate to the upcoming iteration plan. This increases agile feedback and user engagement.

    @JimSEOW

If I can add a few (Mexican) cents :)

Lately I have been very interested in the field of reinforcement learning. If you try to find a recent reproduction of a recent paper, or even the actual code of the paper (e.g., curiosity-driven exploration, UNREAL, A3C, GA3C, PAAC, Predictron, etc.), you will most likely find a TensorFlow implementation. Moreover, OpenAI has started to provide RL algorithms in TensorFlow.

The community and Microsoft would need to step it up and implement these models in CNTK. On top of that, blogs would need to be more active and show how the object-oriented approach of CNTK could make your life easier when implementing, for example, the UNREAL methodology (sharing weights between 3 or 4 networks).

    Happy to help

@pedronahum A Mexican with a banker day job who pushes data science's envelope will always have some interesting views.

=> First, CNTK has to take on challenges on multiple fronts. However, we need to prioritize them while continuing to collect feedback and suggestions, and to communicate that these suggestions are being handled in an "agile" way, not just by the CNTK team but as part of consensus building with the community.

=> THIS IS the state CNTK is in now. THIS is the combination that will keep drawing talented users to come and give feedback to drive CNTK development in the right direction.

=> NOW the main discussion:
How could we communicate MORE effectively [diagrams, short presentations] to new users how the object-oriented approach of CNTK has unique advantages compared to competitors?

    We must NOT limit the discussion to just the technical aspects. This is what the CNTK team is very good at.

We need to mobilize users, not only those like yourself with a financial background, but also people from diverse backgrounds like marketing, sales, and communications, to interpret the unique advantages of CNTK.

=> In other words, CNTK is on good footing; we need to figure out different strategies for "re-branding" CNTK, in an individualized way, for users of different backgrounds.

I can provide an NLP example on answer selection using the dataset from Microsoft Beauty of Programming 2017. The model can achieve 58% MRR (Mean Reciprocal Rank) on the validation set. Would you like it? @cha-zhang

    @Alan-Lee123 of course we would love to have your contribution! :)

@Alan-Lee123 What would be a good way to get your example into our repository? We can chat offline if you prefer. LMK.

    @sayanpa I would love to share ideas with you.

@e-thereal @veikkoeeva @bencherian Apologies for some inaccurate information earlier. For this September iteration, it is a refresh of MKL; that part is accurate. However, we are aiming for MKL-DNN integration in the very near future, maybe in the next iteration or two.

Hi

Please prepare some content that explains how to set up the learning parameters of the network. I have already read this one, but there are some confusing parameters that need more description to understand. I recommend preparing a cookbook for parameter mapping. For example, eta = 0.001 and momentum = 0.9 are very common in other toolkits. A cookbook table mapping such parameters to their CNTK equivalents would be very helpful.
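As one entry such a cookbook could contain, here is a minimal sketch using the CNTK 2.x Python API (the tiny model is purely illustrative) of mapping "eta = 0.001, momentum = 0.9" onto a CNTK learner:

```python
import cntk as C

# A tiny illustrative model.
x = C.input_variable(2)
y = C.input_variable(1)
z = C.layers.Dense(1)(x)
loss = C.squared_error(z, y)

# eta = 0.001, applied per minibatch; momentum = 0.9.
lr = C.learning_rate_schedule(0.001, C.UnitType.minibatch)
mom = C.momentum_schedule(0.9)
learner = C.momentum_sgd(z.parameters, lr, mom)
trainer = C.Trainer(z, (loss, loss), [learner])
```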

    thanks


Hi, is there any plan for CNTK library evaluation with GPU in UWP?
Thanks.

Regarding _"New example for natural language processing (NLP)"_: examples showing how to use Conditional Random Fields (CRF) would be great. I don't know if CNTK has plans to build a CRF layer.
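For reference, the core computation of a linear-chain CRF layer is the sequence partition function; here is a minimal NumPy sketch under illustrative shapes (not an existing CNTK op):

```python
import numpy as np

def crf_log_partition(emissions, transitions):
    """emissions: (T, num_tags) per-step scores;
    transitions: (num_tags, num_tags) tag-to-tag scores.
    Forward algorithm in log space, as a CRF layer would compute it."""
    alpha = emissions[0]
    for t in range(1, len(emissions)):
        # log-sum-exp over previous tags, for each current tag
        scores = alpha[:, None] + transitions + emissions[t][None, :]
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0))
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())
```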

    @nono1981 - please see issue #2243


    @wolfma61 @veikkoeeva thank you all. :)

An update: a few tasks for this iteration are blocked:

  • 16bit support for training on Volta GPU: this depends on CUDA 9, which is not GA yet. We will wait.
  • Improve statistics for distributed evaluation: this is now blocked due to lack of resources for this iteration.
  • Intel MKL update to improve inference speed on CPU by around 2x on AlexNet: this will be delayed due to our migration of the build system to Azure VMs. The MKL update requires many changes in our build system, and we won't be able to do it while simultaneously doing the build migration.
  • New example for natural language processing (NLP): this will very likely be delayed.

Apologies for these issues. The team recently went through a reorg, which impacted our execution efficiency.

Is it too late to get a hello-world logistic regression example that trains and evaluates a model using the new C# API into the next release? I did notice the other examples.

    @whitmark : @liqunfu will give it a try.

Will a NuGet package with the new C# API and dependencies be included in the release (CPU/GPU)?

Yes, there will be a NuGet package for the C# API.

ETA for v2.2?

In the July iteration plan you mentioned a "CNTK object-oriented C API."
Is there any update on this?

We have debated and decided to use SWIG for the C# API, so we are no longer creating a C API at this moment.

@cha-zhang I'm curious what will be used for the R bindings? I hope it will be native support (NOT through python/reticulate).

    Sorry to disappoint you, but the R-binding will be through reticulate.

This is very frustrating (and this can't even be called "bindings"). From Reasons to Switch from TensorFlow to CNTK:

    This not only makes it extremely fast, but also allows it to be used as a C++ API ready to be integrated with any applications. It also makes it very easy to add additional bindings to CNTK, such as Python, R, Java, etc.

R lacks good integration with a comprehensive deep-learning framework (the only natively supported one is MXNet, but it is hard to call its R interface mature). Such an addition of native support could potentially be really useful.

So what will be the reason to use the CNTK R interface if a similar one for TensorFlow (and even CNTK via Keras) exists?

This work is done by a team outside CNTK, and reticulate was chosen to ensure we can have an R-binding as soon as possible. Reticulate seems to cause a ~5% perf drop relative to the CNTK Python API, which should still be much faster than TensorFlow's R-binding.

Hi @cha-zhang,

By the same token, have you run a similar performance test for the C# API? Thanks

If you use the minibatch source, the speeds of C# and Python are the same. If you feed data yourself, we are seeing some 30% slowdown for C#. We are investigating the issue.
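For context, "minibatch source" refers to CNTK's built-in reader pipeline; a minimal Python sketch, where the CTF file name and stream fields are hypothetical:

```python
import cntk as C

# Deserialize a text-format (CTF) file with dense feature/label streams.
source = C.io.MinibatchSource(C.io.CTFDeserializer(
    'train.ctf',
    C.io.StreamDefs(
        features=C.io.StreamDef(field='x', shape=784),
        labels=C.io.StreamDef(field='y', shape=10))))

mb = source.next_minibatch(64)  # reads and batches data off the main thread
```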

    I'm closing this issue since v2.2 was shipped on Sep. 15. We will post a new iteration plan soon.