
Influential work of 2022

Whether you are a machine learning researcher, a practitioner, or just someone with a general interest in the field, we hope this blog will provide a valuable resource for learning about the latest and most influential work.
Photo by Russ Ward / Unsplash

Machine learning has become an increasingly important field in recent years, with applications ranging from self-driving cars to speech recognition to natural language processing. As a result, there has been a proliferation of research papers published in the field, covering a wide range of topics and approaches.

The following are some of the top machine learning papers published in 2022.

A ConvNet for the 2020s


The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model. A vanilla ViT, on the other hand, faces difficulties when applied to general computer vision tasks such as object detection and semantic segmentation. It is the hierarchical Transformers (e.g., Swin Transformers) that reintroduced several ConvNet priors, making Transformers practically viable as a generic vision backbone and demonstrating remarkable performance on a wide variety of vision tasks. However, the effectiveness of such hybrid approaches is still largely credited to the intrinsic superiority of Transformers, rather than the inherent inductive biases of convolutions...

If you are in the Transformer camp, this paper is for you.


A Generalist Agent

Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same…

This paper from DeepMind introduces Gato, a multi-modal, multi-task, multi-embodiment generalist policy: a single network with the same weights that can play Atari, caption images, chat, and stack blocks with a real robot arm.


Galactica: A Large Language Model for Science

Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize s…

As the name says, this is an LLM for science, trained on scientific data. It comes with a demo that impresses at times and badly disappoints at others. If you have played with ChatGPT and want to compare the two, understand that the comparison is not fair. I tried it with a reference to Chua's circuit and was surprised by the response.


High-Resolution Image Synthesis with Latent Diffusion Models

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond.

Stable Diffusion, which is built on this work, has taken the world by storm and changed the generative image landscape. Text-to-image generation is now making its way into a growing number of products.
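For readers who want a feel for what "a sequential application of denoising autoencoders" means, here is a minimal sketch of the generic noising step used by DDPM-style diffusion models, and of how a noise prediction is turned back into an estimate of the clean signal. It is not the paper's code. The linear schedule values, the function names, and the three-number "latent" are all assumptions made for illustration, and the true noise stands in for the output of a trained denoising network. In latent diffusion, the same arithmetic is applied to the latent of a pretrained autoencoder rather than to raw pixels.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule values are assumed defaults, not taken from the post.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0:
    x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, with eps ~ N(0, 1).
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    xt = [signal_scale * x + noise_scale * e for x, e in zip(x0, eps)]
    return xt, eps


def estimate_x0(xt, eps_pred, t, alpha_bars):
    """Invert the noising step given a noise prediction.

    In a real model, eps_pred comes from the trained denoising network.
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [(x - noise_scale * e) / signal_scale for x, e in zip(xt, eps_pred)]


rng = random.Random(0)
alpha_bars = make_alpha_bars()
latent = [0.5, -1.0, 2.0]  # hypothetical stand-in for an autoencoder latent
noisy, eps = add_noise(latent, 500, alpha_bars, rng)

# With a perfect noise prediction, the clean latent is recovered exactly
# (up to floating-point error).
recovered = estimate_x0(noisy, eps, 500, alpha_bars)
```

A trained model never sees the true noise at sampling time. It predicts the noise from the noisy input and the step index, and sampling repeats a denoising update like this over many steps, which is where the "sequential" in the abstract comes from.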


Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
