Improving JEPAs with Spatial Conditioning: Robust and Efficient Representation Learning
This article was accepted into the Self-Supervised Learning Workshop: Theory and Practice (SSLTP) at NeurIPS 2024. The Image-based Joint Embedding ...
In recent years, there has been significant development in the field of large pre-trained models for robot policy learning. The ...
Large language models (LLMs) have demonstrated impressive capabilities in handling knowledge-intensive tasks through their parametric knowledge stored within the model ...
Visual and action data are interconnected in robotic tasks, forming a perception-action loop. Robots rely on control parameters for movement, ...
End-to-end (E2E) neural networks have emerged as flexible and accurate models for multilingual automatic speech recognition (ASR). However, as the ...
This paper was accepted into the IEEE Spoken Language Technology (SLT) 2024 Workshop. In this paper, we propose an algorithm ...
There has been a marked movement in the field of AGI systems toward the use of adaptive, pretrained representations known ...
This paper presents Embedding Pose Graph (EPG), an innovative method that combines the strengths of basic models with a simple ...
In this paper, we present a novel approach to automatically assign entity labels to images from existing noisy image-text pairs. ...
Neural language models (LMs) have become popular, and extensive theoretical work on them has focused mainly on representation ability. A previous ...