Collections including paper arxiv:2409.16040

-
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Paper • 2504.13161 • Published • 92
Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks
Paper • 2402.11984 • Published
BlackGoose Rimer: Harnessing RWKV-7 as a Simple yet Superior Replacement for Transformers in Large-Scale Time Series Modeling
Paper • 2503.06121 • Published • 5
Timer: Transformers for Time Series Analysis at Scale
Paper • 2402.02368 • Published • 1
-
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Paper • 2409.16040 • Published • 16
Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts
Paper • 2410.10469 • Published • 1
Unified Training of Universal Time Series Forecasting Transformers
Paper • 2402.02592 • Published • 8
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 63
-
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 72
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 131
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 56
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 90
-
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Paper • 2404.15653 • Published • 30
MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published • 15
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 51
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 34
-
FLAME: Factuality-Aware Alignment for Large Language Models
Paper • 2405.01525 • Published • 29
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Paper • 2405.14333 • Published • 41
Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published • 55
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture
Paper • 2405.18991 • Published • 12