ML.NET in Jupyter Notebooks

By Alexander Slotte

June 05, 2020

Microsoft recently reached a major milestone in the development of the open-source, cross-platform library ML.NET at MS Ignite 2019 in Orlando. With the release of a long-awaited C# kernel, ML.NET is now fully supported in Jupyter Notebooks.

So, what is Jupyter Notebook, and why is this significant? Jupyter Notebook is an open-source web application that is heavily used by the data science community. A notebook consists of multiple cells, each containing either interactive code or text, with the code being executed on a separate kernel.

What really makes Jupyter Notebooks so appealing is that the code is very easy to share (it’s all contained in a .ipynb file). A notebook can contain interactive plots and charts to describe the data, and each cell can be executed individually.

With the support of Jupyter Notebooks, we’re able to use ML.NET for the full machine learning workflow, and not only to train our model.

Alright, enough talking, let’s see some code! To get started, we need to set up our local environment!

Install Jupyter – There are many ways to install Jupyter, but the easiest way is to download and install Anaconda.

Install dotnet try – The C# kernel is based on the dotnet try tool. To install the dotnet try tool, open either the command prompt or PowerShell and type in the following command: dotnet tool install -g dotnet-try

Install the .NET Jupyter Kernel – to connect Jupyter with the dotnet try tool, execute the following command in a command prompt or PowerShell to install the .NET Kernel: dotnet try jupyter install

To start Jupyter Notebooks, open Anaconda and click on Jupyter Notebook.


It’s always easier to learn something new using examples. To that effect, I’ve created a small GitHub repository with a couple of sample notebooks for you to explore at your convenience.

Let’s walk through a multi-class classification example to get familiar with the structure of a notebook.

Declare package dependencies

Just like in regular C# code, we are able to import and consume third-party NuGet packages.

To do so, we reference the package with the #r directive, e.g. #r "nuget:Microsoft.ML".
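As a minimal sketch, a first notebook cell could look like the one below. It assumes the Microsoft.ML NuGet package is the dependency being pulled in; the version number mentioned in the comment is only an illustration, not a recommendation.

```csharp
#r "nuget:Microsoft.ML"
// Appending a version, e.g. #r "nuget:Microsoft.ML,1.4.0", pins a specific
// release (the version shown here is only an illustrative assumption).

// Once the package is restored, its namespaces can be imported as usual.
using Microsoft.ML;
using Microsoft.ML.Data;
```

Running this cell once makes the package available to every cell executed after it in the same kernel session.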

Declare data types

A cell can contain a class, a method, or plain statements. Once a cell has been executed, the types and members it declares can be referenced and used by other cells in the notebook.
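For a multi-class classification problem, such a cell might declare one input type and one prediction type. The sketch below is an assumption for illustration: the class names, property names, and column indices are hypothetical (iris-style measurements) and are not taken from the sample repository.

```csharp
#r "nuget:Microsoft.ML"

using Microsoft.ML.Data;

// Hypothetical input row. LoadColumn maps each property to the
// zero-based column index it is read from in a text file.
public class FlowerData
{
    [LoadColumn(0)] public float SepalLength { get; set; }
    [LoadColumn(1)] public float SepalWidth { get; set; }
    [LoadColumn(2)] public float PetalLength { get; set; }
    [LoadColumn(3)] public float PetalWidth { get; set; }
    [LoadColumn(4)] public string Label { get; set; }
}

// Hypothetical output row: the predicted class and one score per class.
// This assumes the pipeline converts the predicted key back to its
// original string value before the prediction is read.
public class FlowerPrediction
{
    [ColumnName("PredictedLabel")] public string PredictedLabel { get; set; }
    public float[] Score { get; set; }
}
```

After this cell has run, later cells can use FlowerData as a type argument, for example when loading a file with mlContext.Data.LoadFromTextFile<FlowerData>(...).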

Data Exploration

One of the key benefits of using Jupyter Notebooks with ML.NET is its support for data exploration and plotting. In order to build a robust machine learning model, we need to know our data inside and out. In Jupyter, we can take a look at the data by dumping it to a table.
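One way to dump data to a table is sketched below. The row type and the sample values are hypothetical and exist only to keep the cell self-contained; in a real notebook the IDataView would typically come from a file loader. The display helper is provided by the notebook kernel rather than by ML.NET.

```csharp
#r "nuget:Microsoft.ML"

using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

// Hypothetical row type and values, used only for illustration.
public class SampleRow
{
    public float Feature { get; set; }
    public string Label { get; set; }
}

var mlContext = new MLContext(seed: 0);

var rows = new List<SampleRow>
{
    new SampleRow { Feature = 1.2f, Label = "a" },
    new SampleRow { Feature = 3.4f, Label = "b" },
    new SampleRow { Feature = 5.6f, Label = "c" },
};

// Wrap the in-memory rows in an IDataView, the type ML.NET loaders return.
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// Materialize the first two rows back into objects for inspection.
var firstRows = mlContext.Data
    .CreateEnumerable<SampleRow>(data, reuseRowObject: false)
    .Take(2)
    .ToList();

// display(...) comes from the notebook kernel; for a list of objects
// it renders the items as a table in the cell output.
display(firstRows);
```

Taking only a handful of rows keeps the output readable and avoids pulling a large data set into the notebook.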

We can also use third-party libraries such as XPlot.Plotly to create plots.
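A scatter plot could be sketched as follows. This assumes the XPlot.Plotly 3.x API, where trace types live under Graph; later releases may expose these types differently, and some kernels ship the library preloaded so the #r line is unnecessary. The data values are made up for illustration.

```csharp
#r "nuget:XPlot.Plotly,3.0.1"

using XPlot.Plotly;

// Hypothetical measurements, used only for illustration.
var petalLength = new[] { 1.4f, 4.7f, 6.0f, 1.3f, 4.5f };
var petalWidth = new[] { 0.2f, 1.4f, 2.5f, 0.2f, 1.5f };

// Build a scatter trace that draws one marker per data point.
var chart = Chart.Plot(
    new Graph.Scatter
    {
        x = petalLength,
        y = petalWidth,
        mode = "markers"
    });
chart.WithTitle("Petal length vs. petal width");

// display(...) is supplied by the notebook kernel and renders the chart inline.
display(chart);
```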

Summary

With the introduction of Jupyter support, ML.NET takes a big step in the right direction. There are still plenty of bits and pieces to improve, e.g. support for native C# IntelliSense, but I’m positive that the ML.NET team is on the right track.

I hope you’ve found this post useful. Should you have any questions, or just want to chat ML.NET, you can always find me on Twitter at @alexslotte!

Category: .NET

Tags: Data, Machine Learning, Streaming
