Yassine Ouali
youali

AI & ML interests
ML, ∀ subject ∈ adjacent(ML)

Recent Activity
- liked a model 29 days ago: iiiorg/piiranha-v1-detect-personal-information
- liked a dataset 4 months ago: lmms-lab/LLaVA-OneVision-Data
- liked a Space 5 months ago: hzxie/gaussian-city

Organizations
None yet
LLMs
- Ziya2: Data-centric Learning is All LLMs Need
  Paper • 2311.03301 • Published • 20
- Co-training and Co-distillation for Quality Improvement and Compression of Language Models
  Paper • 2311.02849 • Published • 8
- MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning
  Paper • 2311.02303 • Published • 11
- ADaPT: As-Needed Decomposition and Planning with Language Models
  Paper • 2311.05772 • Published • 15
Multimodal/Vision LLMs
- GLaMM: Pixel Grounding Large Multimodal Model
  Paper • 2311.03356 • Published • 36
- CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding
  Paper • 2311.03354 • Published • 8
- CogVLM: Visual Expert for Pretrained Language Models
  Paper • 2311.03079 • Published • 27
- UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework
  Paper • 2311.10125 • Published • 6
Standalone Neural Modules
Transformers
- Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency
  Paper • 2311.02772 • Published • 7
- Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
  Paper • 2311.05698 • Published • 14
- Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems
  Paper • 2311.05884 • Published • 11
- Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers
  Paper • 2311.10642 • Published • 26
Diffusion Models
Graphics/3D
- VR-NeRF: High-Fidelity Virtualized Walkable Spaces
  Paper • 2311.02542 • Published • 19
- Consistent4D: Consistent 360° Dynamic Object Generation from Monocular Video
  Paper • 2311.02848 • Published • 7
- Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model
  Paper • 2311.06214 • Published • 33
RL
Computer Vision
Efficient ML