We would like to share two updates with the vLLM community.

Future of vLLM is Open

We are excited to see vLLM becoming the standard for LLM inference and serving. In the recent Meta Llama 3.1 announcement, 8 out of 10 official partners for real-time inference run vLLM as the serving engine for the Llama 3.1 models. We have also heard anecdotally that vLLM is being used in many of the AI features in our daily lives.
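
For readers who have not yet tried it, serving a Llama 3.1 model with vLLM takes only a few lines. The sketch below uses vLLM's offline Python API; the model name and sampling settings are illustrative assumptions, and any compatible checkpoint works the same way.

```python
# A minimal sketch of offline inference with vLLM's Python API.
# The model name and sampling settings below are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3.1-8B-Instruct")
sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=128)

outputs = llm.generate(["What makes an inference engine fast?"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```

For online serving, the same model can also be exposed through vLLM's OpenAI-compatible HTTP server instead of the offline API.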

We believe vLLM’s success comes from the strength of its open source community. vLLM is actively maintained by a consortium of groups such as UC Berkeley, Anyscale, AWS, CentML, Databricks, IBM, Neural Magic, Roblox, Snowflake, and others. To this end, we want to ensure the ownership and governance of the project are open and transparent as well.

We are excited to announce that vLLM has started the incubation process with the LF AI & Data Foundation. This means no single party will have exclusive control over the future of vLLM. The license and trademark will be irrevocably open. You can trust that vLLM is here to stay and will be actively maintained and improved going forward.

Performance is a Top Priority

The vLLM contributors are doubling down to ensure vLLM is the fastest and easiest-to-use LLM inference and serving engine.

To recall our roadmap, we focus vLLM on six objectives: wide model coverage, broad hardware support, top performance, production readiness, a thriving open source community, and an extensible architecture.

On the performance objective in particular, we have made steady progress to date.

We will continue to update the community on vLLM’s progress in closing the performance gap. You can track our overall progress here. Please continue to suggest new ideas and contribute your improvements!
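
If you would like to gauge throughput on your own workloads in the meantime, a rough offline measurement is sketched below. It builds on the Python API shown earlier; the model name, batch size, and gpu_memory_utilization value are illustrative assumptions, and the printed figure is not a benchmark claim.

```python
# A rough, illustrative throughput measurement for offline batch inference.
# Model name, batch size, and engine arguments here are placeholder choices.
import time

from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    gpu_memory_utilization=0.90,  # fraction of GPU memory for weights + KV cache
)
params = SamplingParams(max_tokens=256)
prompts = ["Summarize the benefits of paged attention."] * 64

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

# Count generated tokens across all requests to compute decode throughput.
generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"~{generated / elapsed:.1f} generated tokens/s across {len(prompts)} prompts")
```

Numbers from a sketch like this vary with hardware, model size, and request mix; treat them as directional rather than definitive.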

More Resources

We would like to highlight the following RFCs, which are under active development:

There is a thriving research community building its projects on top of vLLM. We are deeply humbled by this impressive work and would love to collaborate on and integrate it. The list of papers includes, but is not limited to: