Machine learning has become an increasingly important field in recent years, with applications ranging from self-driving cars to speech recognition to natural language processing. As a result, there has been a proliferation of research papers published in the field, covering a wide range of topics and approaches.
The following are some of the top machine learning papers published in 2022.
A ConvNet for the 2020s
The "Roaring 20s" of visual recognition began with the introduction of Vision Transformers (ViTs), which quickly superseded ConvNets as the state-of-the-art image classification model. A vanilla ViT, on the other hand, faces difficulties when applied to general computer vision tasks such as object detection and semantic segmentation. It is the hierarchical Transformers (e.g., Swin Transformers) that reintroduced several ConvNet priors, making Transformers practically viable as a generic vision backbone and demonstrating remarkable performance on a wide variety of vision tasks. However, the effectiveness of such hybrid approaches is still largely credited to the intrinsic superiority of Transformers, rather than the inherent inductive biases of convolutions...
If you are in the Transformer camp, this paper is for you.
A Generalist Agent
This paper from DeepMind describes a multi-modal, multi-task, multi-embodiment generalist agent: a single policy that can handle a remarkably wide range of tasks.
Galactica: A Large Language Model for Science
As the name suggests, this is an LLM for science, trained on scientific data. It comes with a demo that impresses at times but disappoints at others. If you have played with ChatGPT and are tempted to compare the two, understand that the comparison is not fair. I asked it about Chua's circuit and was surprised by the response.
High-Resolution Image Synthesis with Latent Diffusion Models
Taking the world by storm, Stable Diffusion has changed the generative image landscape. Thanks to it, text-to-image generation is becoming part of almost every product.
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents