IBM and Mila Working on Open Source AI Project to Tune Models

By AI Trends Staff 

IBM and the Quebec AI Institute (Mila) are collaborating to accelerate the Orion AI and machine learning open source technology they started working on together in early 2020, to improve a key component known as hyperparameter optimization. 

Hyperparameters are the settings that control the learning process, such as the learning rate. They are chosen before training begins, unlike model parameters such as node weights, whose values are learned during training. The project aims to help researchers improve machine learning model performance and pinpoint, within the “black box” of AI, where their models need work, according to a related IBM press release.
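To make the distinction concrete, here is a minimal sketch of hyperparameter optimization by random search. It does not use Orion's API; the toy model, the function names `train` and `random_search`, and the search range for the learning rate are all assumptions chosen for illustration. The weight `w` is learned during training, while the learning rate is the hyperparameter being tuned from outside.

```python
import random


def train(learning_rate, steps=50):
    """Fit y = w * x by gradient descent and return (w, final loss).

    w is a model parameter (a weight) learned from data;
    learning_rate is a hyperparameter fixed before training starts.
    """
    data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]  # true weight is 3.0
    w = 0.0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= learning_rate * grad
    loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
    return w, loss


def random_search(n_trials=30, seed=0):
    """Try n_trials random learning rates and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        # Sample log-uniformly from [1e-4, 1e-1] (an assumed search space).
        learning_rate = 10 ** rng.uniform(-4, -1)
        weight, loss = train(learning_rate)
        if best is None or loss < best["loss"]:
            best = {"learning_rate": learning_rate, "weight": weight, "loss": loss}
    return best


if __name__ == "__main__":
    print(random_search())
```

Random search is only one of many possible strategies; it is used here because it fits in a few lines, not because the article says Orion relies on it.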

The Orion software originated from Mila and is envisioned as a backend to complement existing machine learning frameworks. (Credit: Mila)

The Orion software (no relation to the recently hacked SolarWinds Orion platform) is envisioned as a backend to complement existing machine learning frameworks, according to an account from Mila.

“The goals of this project are to 1) create a tool well-adapted to researchers’ workflow and with little configuration/manipulation required, 2) establish clear benchmarks to convince researchers of efficiency, and 3) leverage prior knowledge to avoid optimization from scratch,” stated Xavier Bouthillier, lead developer of Orion and a PhD computer science student at the University of Montreal.

Mila and IBM have built a benchmarking module in Orion, with a variety of assessments and tasks to ensure sufficient coverage of most use cases encountered in research. For each task, optimization algorithms can be benchmarked under various assessment scenarios. These include: time to result, average performance, search space dimension size, search space dimension type, parallel execution advantage, and search algorithm parameter sensitivity.
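As a rough illustration of two of those assessments, average performance and time to result, the sketch below compares two simple search algorithms on a toy task over repeated runs. It is not Orion's benchmarking module; the task, the function names, the 64-trial budget, and the 0.05 target are all hypothetical choices made for this example.

```python
import random
import statistics


def objective(x, y):
    """Toy task: a quadratic bowl whose minimum value 0.0 is at x=0.3, y=-0.2."""
    return (x - 0.3) ** 2 + (y + 0.2) ** 2


def random_search(budget, rng):
    """Yield one objective value per trial, sampling uniformly in [-1, 1]^2."""
    for _ in range(budget):
        yield objective(rng.uniform(-1, 1), rng.uniform(-1, 1))


def grid_search(budget, rng):
    """Yield trials from a square grid over [-1, 1]^2, visited in shuffled order."""
    side = int(budget ** 0.5)
    axis = [-1 + 2 * i / (side - 1) for i in range(side)]
    points = [(x, y) for x in axis for y in axis]
    rng.shuffle(points)
    for x, y in points:
        yield objective(x, y)


def benchmark(algorithm, budget=64, repetitions=20, target=0.05):
    """Run an algorithm several times and summarize two assessments.

    average_performance: mean of the best objective found in each run.
    time_to_result: mean number of trials needed to reach the target;
    runs that never reach it are counted as the full budget.
    """
    finals, trials_to_target = [], []
    for seed in range(repetitions):
        rng = random.Random(seed)
        best, hit = float("inf"), None
        for i, value in enumerate(algorithm(budget, rng), start=1):
            best = min(best, value)
            if hit is None and best <= target:
                hit = i
        finals.append(best)
        trials_to_target.append(hit if hit is not None else budget)
    return {
        "average_performance": statistics.mean(finals),
        "time_to_result": statistics.mean(trials_to_target),
    }


if __name__ == "__main__":
    for algorithm in (random_search, grid_search):
        print(algorithm.__name__, benchmark(algorithm))
```

Repeating each run with a different seed matters because a single lucky or unlucky run says little about an algorithm; averaging over repetitions is what makes the comparison meaningful.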

IBM Intends to Integrate Orion Code into Watson Machine Learning Accelerator 

IBM’s Spectrum Computing team based in Markham, Ontario, has contributed to the Orion code base. IBM intends to integrate the open-source Orion code into its Watson Machine Learning Accelerator.  

Yoshua Bengio, Scientific Director at Mila and one of the world’s leading experts in artificial intelligence and deep learning, stated, “A collaboration with leading industry AI experts such as IBM is a great opportunity to accelerate the development of an open-source solution recently initiated at Mila, combining engineering expertise, practical hands-on experience and cutting-edge research in AI.” 

Bengio added, “Hyperparameter optimization plays an important role in the scientific progress of AI, both as an enabler to reach the best performances achievable by new algorithms, and as a foundation for a rigorous measure of progress, providing a principled common ground to compare algorithms. Hyperparameter optimization and its subfield of neural architecture search are additionally a key solution for the deployment of energy-efficient AI technologies, a problem currently posed by the trend of increasing computational cost of deep learning models.” 

Steven Astorino, Vice President of Development for IBM Data & AI and Canada Lab Director, stated, “Collaborating with some top global AI researchers at Mila, we’re improving open-source technology to the benefit of all researchers and data scientists, while advancing the capabilities of IBM Watson Machine Learning Accelerator. This provides even greater value through our end-to-end client solutions and advances IBM’s commitment to both the consumption of and contribution to open-source technology.” 

Area631 Incubator for IBM Employees Launched from Markham 

Astorino developed the first incubator program for IBM employees in Markham in 2018. Called Area631, the three-month program offers a startup-like experience for developing ideas and creating prototypes. Now Area631 is in expansion mode, with plans to launch incubators at eight global IBM software development labs spanning the United States, China, India, Germany, and Poland, according to a recent account in BetaKit.

After Astorino became the Canada Lab director at IBM, responsible for all IBM Lab locations across Canada, he got the idea for Area631. “I was really trying to understand, ‘okay, how can we do this better? How can we collaborate better, or more importantly, how do we innovate better and come up with some great things that we can try and transform and disrupt the market,’” Astorino stated. 

The name Area631 represents six “IBMers” working for three months on one breakthrough. Through the internal incubator, IBM gives employees the opportunity to submit ideas; if chosen, they can work on the idea to create prototypes. The employees are given the three months, full time, to work on the idea with the small team. 

“The whole point was to drive transformation in a large company like IBM, and still innovate as if you were a startup,” Astorino stated. Area631 already has a success story: Watson AIOps (AI for IT operations), which came out of the first project at Markham’s Area631. Today IBM sells the service to IT operations teams to help them respond more quickly to slowdowns and outages.

Watson AIOps is “a significant business opportunity for IBM,” Astorino stated. “We invested an entire business unit in that after the Area631 project was completed. So I would say that was a huge success.” 

IBM Accelerator Speeds Development with Large Model Support 

The IBM Watson Machine Learning Accelerator, previously called IBM PowerAI Enterprise, “targets a rarefied group of developers with large workloads and big infrastructure budgets,” according to a 2019 report from technology analyst firm 451 Research.

The Accelerator is a combined hardware and software package that aggregates a range of prominent open source deep-learning frameworks alongside development and management tools, so that enterprise users can more easily build and scale machine-learning pipelines. 

One example is the large model support (LMS) function in the Accelerator, which directly connects the CPU to the Nvidia GPU, providing a 5.6x improvement in data transfer speeds to system memory. “Users can thus tackle projects where model size or data size are significantly larger than the limited memory available on the GPUs, leading to more accurate models and improving model training time,” stated the report authors.  

Results have been impressive. In one instance, IBM was able to train a model on the Enlarged ImageNet Dataset 3.8x faster than without LMS. 

To accelerate the training process, Watson Machine Learning Accelerator includes SnapML, a distributed machine-learning library for GPU acceleration supporting logistic regression, linear regression and support vector machine models. It was developed by IBM Research in Zurich. 

Read the source articles and information in the IBM press release on the Mila collaboration, in an account from Mila, in BetaKit, and from 451 Research.