Advancements in Machine Learning for ML: Improving Efficiency of ML Workloads

This article summarizes recent advances in applying machine learning to ML systems themselves, and how they can make ML workloads more efficient.


ML Compilers

Modern machine learning models are programmed and trained using ML programming frameworks such as TensorFlow and PyTorch. These frameworks expose high-level operations, so practitioners can focus on their models rather than on optimizing for the underlying hardware.

ML compilers are central to the efficiency of ML workloads. They translate user-written programs into instructions that run on hardware, applying graph-level and kernel-level optimizations along the way.

One important optimization in ML compilers is assigning memory layouts to intermediate tensors. Layout assignment must balance local computation efficiency against the overhead of transforming tensors between layouts: the layout that is fastest for one operation may force a costly conversion before the next.
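The layout trade-off can be made concrete with a toy example. The sketch below is not how any real compiler works; the operation names, layout names, and all cost numbers are hypothetical, chosen only to show that picking the locally fastest layout for each operation can lose to an assignment that avoids layout conversions.

```python
from itertools import product

# Hypothetical compute cost (arbitrary units) of each op under each layout.
COMPUTE_COST = {
    "conv":   {"NHWC": 4.0, "NCHW": 7.0},
    "reduce": {"NHWC": 6.0, "NCHW": 3.0},
    "matmul": {"NHWC": 5.0, "NCHW": 5.5},
}
TRANSFORM_COST = 2.0  # assumed cost of converting a tensor between layouts


def total_cost(ops, layouts):
    """Per-op compute cost plus one transform cost at every layout change."""
    compute = sum(COMPUTE_COST[op][lay] for op, lay in zip(ops, layouts))
    changes = sum(a != b for a, b in zip(layouts, layouts[1:]))
    return compute + changes * TRANSFORM_COST


def best_layouts(ops, choices=("NHWC", "NCHW")):
    """Exhaustive search; only feasible for a tiny chain of ops."""
    return min(product(choices, repeat=len(ops)),
               key=lambda layouts: total_cost(ops, layouts))


ops = ["conv", "reduce", "matmul"]
# Greedy: give every op its individually cheapest layout.
greedy = tuple(min(COMPUTE_COST[op], key=COMPUTE_COST[op].get) for op in ops)
best = best_layouts(ops)

print(greedy, total_cost(ops, greedy))  # locally optimal choice per op
print(best, total_cost(ops, best))      # globally cheaper assignment
```

With these made-up numbers the greedy assignment pays for two conversions, while the best assignment accepts a slightly slower matmul to avoid one of them. Real graphs are far too large for exhaustive search, which is one reason a fast cost model is useful.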

TpuGraphs Dataset

To improve the efficiency of ML models, Google has released the TpuGraphs dataset for training learned cost models, which predict how a program will perform under a given compiler configuration. The dataset contains computational graphs of ML workloads, compilation configurations, and execution times. It focuses on graph-level and tiling optimizations.
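To give a feel for how such data might be used, here is a minimal sketch of one record and one way to score a cost model on it. The `Sample` fields are a hypothetical simplification, not the dataset's actual schema, and the metric (how much slower the model's top picks are than the true best configuration) is just one plausible choice; the predicted scores are made up.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical record: one graph, several configs, measured runtimes."""
    node_opcodes: list  # one opcode id per graph node
    edges: list         # (src, dst) node index pairs
    configs: list       # one feature vector per compilation configuration
    runtimes: list      # measured execution time per configuration


def top_k_slowdown(sample, predicted, k=1):
    """Relative slowdown of the best of the model's top-k picks vs. the true best."""
    ranked = sorted(range(len(predicted)), key=lambda i: predicted[i])[:k]
    best_picked = min(sample.runtimes[i] for i in ranked)
    return best_picked / min(sample.runtimes) - 1.0


sample = Sample(
    node_opcodes=[3, 7, 7, 1],
    edges=[(0, 1), (1, 2), (2, 3)],
    configs=[[0, 1], [1, 0], [1, 1]],
    runtimes=[12.0, 10.0, 15.0],
)
predicted = [0.2, 0.5, 0.9]  # made-up model scores; lower means "faster"

print(top_k_slowdown(sample, predicted, k=1))
print(top_k_slowdown(sample, predicted, k=2))
```

Note that the model only has to rank configurations correctly, not predict absolute runtimes: a compiler that trusts the model simply picks the configurations with the lowest scores.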

The dataset provides 25 times more graphs than the largest previous graph property prediction dataset and features popular model architectures like ResNet and Transformer. Baseline learned cost models based on graph neural networks (GNNs) are provided with the dataset.

The dataset allows for exploring large graph-level prediction tasks, which present challenges in scalability, training efficiency, and model quality.

Graph Segment Training and Kaggle Competition

To scale GNN training to large graphs, a method called Graph Segment Training (GST) is introduced. GST partitions each large graph into smaller segments and, at each step, updates the model using only a random subset of those segments, so the full graph never has to be backpropagated through at once. This speeds up training by a factor of three.
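The idea of partitioning a graph and updating on a random subset of segments can be sketched in a few lines. This is a toy illustration, not the published implementation: the "GNN" is a single scalar weight `w`, the partitioner just chunks node ids, and the `cache` of previously computed segment embeddings (used for segments not sampled in a step) is an assumption of this sketch about how the unsampled segments are accounted for.

```python
import random


def partition(num_nodes, segment_size):
    """Chunk node ids into contiguous segments (stand-in for a graph partitioner)."""
    nodes = list(range(num_nodes))
    return [nodes[i:i + segment_size] for i in range(0, num_nodes, segment_size)]


def embed(segment, features, w):
    """Toy one-parameter 'GNN': w times the sum of the segment's node features."""
    return w * sum(features[n] for n in segment)


def gst_step(segments, features, target, w, cache, k, lr, rng):
    """One update that backpropagates through only k randomly sampled segments."""
    sampled = set(rng.sample(range(len(segments)), k))
    pred, grad_w = 0.0, 0.0
    for i, seg in enumerate(segments):
        if i in sampled:
            cache[i] = embed(seg, features, w)       # recompute and refresh the cache
            grad_w += sum(features[n] for n in seg)  # d(embedding)/dw for this segment
        pred += cache[i]  # unsampled segments contribute a cached, possibly stale value
    loss = (pred - target) ** 2
    new_w = w - lr * 2.0 * (pred - target) * grad_w  # gradient step on squared error
    return new_w, loss


rng = random.Random(0)
features = [1.0] * 12
segments = partition(len(features), segment_size=3)  # 4 segments of 3 nodes
target = 0.5 * sum(features)                         # data generated with w = 0.5
w = 0.0
cache = [embed(seg, features, w) for seg in segments]

for _ in range(300):
    w, loss = gst_step(segments, features, target, w, cache, k=2, lr=0.001, rng=rng)

print(round(w, 4), loss)
```

Each step touches only `k` of the segments with gradient computation, so memory and compute per step depend on the segment size rather than on the size of the whole graph.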

Additionally, a Kaggle competition called 'Fast or Slow? Predict AI Model Runtime' was held using the TpuGraphs dataset. The competition attracted participants from 66 countries and highlighted interesting techniques like graph pruning, feature padding, and cross-configuration attention.

The competition will be debriefed, and the winning solutions previewed, at the ML for Systems workshop at NeurIPS.