Orange Blog

By: AJDA, Aug 14, 2015

Classifying instances with Orange in Python

Last week we showed you how to create your own data table in Python shell. Now we’re going to take you a step further and show you how to easily classify data with Orange. First we’re going to create a new data table with 10 fruits as our instances. import Orange from Orange.data import * color = DiscreteVariable("color", values=["orange", "green", "yellow"])calories = ContinuousVariable("calories") fiber = ContinuousVariable("fiber") fruit = DiscreteVariable("fruit", values=["orange", "apple", "peach"]) domain = Domain([color, calories, fiber], class_vars=fruit) data=Table(domain, [</span> ["green", 4, 1.


By: AJDA, Jul 24, 2015

Visualizing Misclassifications

In data mining classification is one of the key methods for making predictions and gaining important information from our data. We would, for example, use classification for predicting which patients are likely to have the disease based on a given set of symptoms. In Orange an easy way to classify your data is to select several classification widgets (e.g. Naive Bayes, Classification Tree and Linear Regression), compare the prediction quality of each learner with Test Learners and Confusion Matrix and then use the best performing classifier on a new data set for classification.


By: AJDA, Jul 10, 2015

Learn with Paint Data

Paint Data widget might initially look like a kids’ game, but in combination with other Orange widgets it becomes a very simple and useful tool for conveying statistical concepts, such as k-means, hierarchical clustering and prediction models (like SVM, logistical regression, etc.). The widget enables you to draw your data on a 2-D plane. You can name the x and y axes, select the number of classes (which are represented by different colors) and then position the points on a graph.


By: AJDA, Jul 3, 2015

Support vectors output in SVM widget

Did you know that the widget for support vector machines (SVM) classifier can output support vectors? And that you can visualise these in any other Orange widget? In the context of all other data sets, this could provide some extra insight into how this popular classification algorithm works and what it actually does. Ideally, that is, in the case of linear seperability, support vector machines (SVM) find a **hyperplane with the largest margin **to any data instance.


By: BIOLAB, Apr 30, 2012

Orange GSoC: Multi-Target Learning for Orange

Orange already supports multi-target classification, but the current implementation of clustering trees is written in Python. One of the five projects Orange has chosen at this year’s Google Summer of Code is the implementation of clustering trees in C. The goal of my project is to speed up the building time of clustering trees and lower their spatial complexity, especially when used in random forests. Implementation will be based on Orange’s SimpleTreeLearner and will be integrated with Orange 3.


By: BIOLAB, Jan 9, 2012

Multi-label classification (and Multi-target prediction) in Orange

The last summer, student Wencan Luo participated in Google Summer of Code to implement Multi-label Classification in Orange. He provided a framework, implemented a few algorithms and some prototype widgets. His work has been “hidden” in our repositories for too long; finally, we have merged part of his code into Orange (widgets are not there yet …) and added a more general support for multi-target prediction. You can load multi-label tab-delimited data (e.


By: BIOLAB, Sep 2, 2011

GSoC Review: Multi-label Classification Implementation

Traditional single-label classification is concerned with learning from a set of examples that are associated with a single label l from a set of disjoint labels L, |L| > 1. If |L| = 2, then the learning problem is called a binary classification problem, while if |L| > 2, then it is called a multi-class classification problem (Tsoumakas & Katakis, 2007). Multi-label classification methods are increasingly used by many applications, such as textual data classification, protein function classification, music categorization and semantic scene classification.


By: BIOLAB, Aug 24, 2011

Faster classification and regression trees

SimpleTreeLearner is an implementation of classification and regression trees that sacrifices flexibility for speed. A benchmark on 42 different datasets reveals that SimpleTreeLearner is 11 times faster than the original TreeLearner. The motivation behind developing a new tree induction algorithm from scratch was to speed up the construction of random forests, but you can also use it as a standalone learner. SimpleTreeLearner uses gain ratio for classification and MSE for regression and can handle unknown values.