13/06/2018
One of the products Microsoft introduced at their 2018 Build conference was an open source, cross-platform machine learning framework for .NET developers.
ML.NET will
...allow .NET developers to develop their own models and infuse custom ML into their applications without prior expertise in developing or tuning machine learning models.
Democratization of machine learning (i.e. enabling people without a background in machine learning to build and run models) has become a popular goal. However, I remain skeptical of its viability. Machine learning holds many traps for the unwary. Treating the field as a black box seems reckless.
That said, frameworks like ML.NET do narrow the gap between machine learning specialists and software developers. There can be considerable effort involved in taking models developed in popular machine learning languages, such as R and Python, and integrating them into enterprise applications written in languages such as C#.
By creating a high-quality machine learning framework for .NET, Microsoft have made it easier (and faster) to get machine learning into enterprise (or mobile, via Xamarin) applications. That is a form of democratization: making machine learning more available.
What kinds of problems can be tackled using ML.NET?
Although ML.NET is currently in preview, it is based on internal libraries used by Microsoft. These libraries are apparently used in major Microsoft products such as Windows, Bing and Azure.
Learning models currently supported by the framework include:
- K-Means clustering
- Logistic regression
- Support Vector Machines
- Naive Bayes
- Random forests
- Boosted trees
Additional techniques, such as recommendation engines and anomaly detection, are on the roadmap.
ML.NET will eventually expose interfaces to other popular machine learning libraries such as TensorFlow, for deep learning, and Accord.NET.
Finally, there will be a number of tooling and language enhancements, including scaling-out in Azure and GUI/Visual Studio features.
How can ML.NET be used in applications?
ML.NET is provided as a NuGet package, making it a breeze to install in a new or existing .NET application.
The framework adopts the "pipeline" approach that is used in other machine learning libraries, such as scikit-learn and Apache Spark MLlib. Data is "piped" through a number of stages to produce useful results (e.g. predictions). A typical pipeline may involve:
- Loading the data
- Transforming the data
- Feature extraction/engineering
- Configuring a learning model
- Training the model
- Using the trained model (e.g. to obtain predictions)
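To make the stages above concrete, here is a minimal sketch in plain Python. It is not ML.NET code and does not use the ML.NET API; it only illustrates the general pipeline idea. The toy data, the function names and the `LinearModel` class are all assumptions invented for this example.

```python
# Hypothetical toy rows: (trip distance in km, fare paid). Values are made up.
RAW_ROWS = [(1.0, 5.0), (2.0, 7.0), (-1.0, 0.0), (3.0, 9.0), (4.0, 11.0)]


def load_data():
    """Stage 1: load the data (a real application would read a file or database)."""
    return list(RAW_ROWS)


def clean(rows):
    """Stage 2: transform the data, here by dropping rows with a non-positive distance."""
    return [row for row in rows if row[0] > 0]


def extract_features(rows):
    """Stage 3: split each row into a feature (distance) and a label (fare)."""
    xs = [distance for distance, _ in rows]
    ys = [fare for _, fare in rows]
    return xs, ys


class LinearModel:
    """Stage 4: a one-feature least-squares regression model (illustrative only)."""

    def __init__(self):
        self.slope = 0.0
        self.intercept = 0.0

    def train(self, xs, ys):
        """Stage 5: fit slope and intercept by ordinary least squares."""
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        """Stage 6: use the trained model to obtain a prediction."""
        return self.intercept + self.slope * x


def run_pipeline():
    """Pipe the data through every stage and return the trained model."""
    rows = clean(load_data())
    xs, ys = extract_features(rows)
    return LinearModel().train(xs, ys)


if __name__ == "__main__":
    model = run_pipeline()
    print(model.predict(5.0))
```

Each stage is a small function with a single job, so any one of them can be inspected or replaced without touching the others.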
Pipelines provide a standard API for working with machine learning models. This makes it easier to switch one model for another during testing and experimentation. It also splits the modelling effort up into well-defined steps, making it easier to understand existing code.
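The benefit of a standard API can be sketched as follows, again in plain Python rather than ML.NET. The `Pipeline`, `MeanModel` and `NearestNeighbourModel` classes and the `scale` step are hypothetical names made up for illustration. Because both models expose the same `fit` and `predict` methods, swapping one for the other changes a single argument and nothing else.

```python
class Pipeline:
    """Applies each transform step in order, then delegates to the model."""

    def __init__(self, steps, model):
        self.steps = steps
        self.model = model

    def _transform(self, xs):
        for step in self.steps:
            xs = [step(x) for x in xs]
        return xs

    def fit(self, xs, ys):
        self.model.fit(self._transform(xs), ys)
        return self

    def predict(self, xs):
        return self.model.predict(self._transform(xs))


class MeanModel:
    """Baseline model: always predicts the mean of the training labels."""

    def fit(self, xs, ys):
        self.mean = sum(ys) / len(ys)

    def predict(self, xs):
        return [self.mean for _ in xs]


class NearestNeighbourModel:
    """Predicts the label of the closest training point."""

    def fit(self, xs, ys):
        self.points = list(zip(xs, ys))

    def predict(self, xs):
        return [min(self.points, key=lambda p: abs(p[0] - x))[1] for x in xs]


def scale(x):
    """A hypothetical transform step: rescale the raw feature."""
    return x / 10.0


# Made-up training data for the illustration.
xs = [10.0, 20.0, 30.0]
ys = [1.0, 2.0, 6.0]

# Only the model argument changes between the two experiments.
for model in (MeanModel(), NearestNeighbourModel()):
    pipeline = Pipeline([scale], model).fit(xs, ys)
    print(type(model).__name__, pipeline.predict([28.0]))
```

The transform steps and the calling code stay identical across both runs, which is what makes side-by-side comparison of models cheap during experimentation.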
Microsoft provide a worked example using a regression model to predict taxi fares in New York City.
Summary
ML.NET makes it easier than ever for .NET developers to include machine learning in their applications. This is likely to encourage the adoption of the technology in mainstream applications. Let's hope that .NET developers use this new power responsibly.
Learning Tree training
If you are interested in the topics covered in this blog post, Learning Tree has a number of courses that may help advance your skills in these areas...and avoid any traps.