You can now run a 70B model on a single 4GB GPU, and it even scales up to the colossal Llama 3.1 405B on just 8GB of VRAM. AirLLM uses “layer-wise inference”: instead of loading the whole model into GPU memory, it loads, computes, and flushes one layer at a time.
→ No quantization needed by default
→ Supports Llama, Qwen, and Mistral
→ Works on Linux, Windows, and macOS
→ 100% open source
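To make the load, compute, flush loop concrete, here is a minimal toy sketch in plain Python. It is not AirLLM's code or API: the helper names (`save_layers`, `apply_layer`, `layerwise_forward`) are hypothetical, the "layers" are tiny weight matrices pickled to disk, and the "GPU" is just ordinary process memory. It only illustrates the idea that peak memory is bounded by the largest single layer rather than by the whole model.

```python
import pickle
import tempfile
from pathlib import Path


def save_layers(layers, directory):
    """Write each layer's weights to its own file (stand-in for a per-layer checkpoint)."""
    paths = []
    for i, weights in enumerate(layers):
        path = Path(directory) / f"layer_{i}.pkl"
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        paths.append(path)
    return paths


def apply_layer(weights, x):
    """Toy layer: matrix-vector product followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def layerwise_forward(layer_paths, x):
    """Forward pass that keeps only one layer's weights in memory at a time."""
    for path in layer_paths:
        with open(path, "rb") as f:
            weights = pickle.load(f)  # load: bring one layer in from disk
        x = apply_layer(weights, x)   # compute: run the activations through it
        del weights                   # flush: drop the layer before loading the next
    return x


with tempfile.TemporaryDirectory() as tmp:
    # Three toy layers: identity, a scaling layer, and a final sum.
    layers = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[2.0, 0.0], [0.0, -1.0]],
        [[1.0, 1.0]],
    ]
    paths = save_layers(layers, tmp)
    output = layerwise_forward(paths, [3.0, 4.0])
    print(output)  # [6.0]
```

The trade-off is speed: every forward pass re-reads each layer from disk, so this approach exchanges throughput for a much smaller memory footprint. As a rough back-of-the-envelope figure, 70B parameters at 16 bits each is about 140GB of weights, and spread over roughly 80 transformer layers that is under 2GB per layer, which is how a single layer can fit on a 4GB card.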
📎 Attached media:
[Image] https://scontent-yyz1-1.cdninstagram.com/v/t51.82787-15/659582339_17957941761081709_6717916988380958419_n.jpg?stp=dst-jpg_e35_tt6&efg=eyJ2ZW5jb2RlX3RhZyI6InRocmVhZHMuQ0FST1VTRUxfSVRFTS5pbWFnZV91cmxnZW4uMTQ0MHgxODAxLnNkci5mODI3ODcuZGVmYXVsdF9pbWFnZS5jMiJ9&_nc_ht=scontent-yyz1-1.cdninstagram.com&_nc_cat=101&_nc_oc=Q6cZ2gFb03Cxn08YHN2BwsPC2737S9SVHF3kKWQWvvTJF1vElcjJM-X24YacR-6tmc-Scyg&_nc_ohc=EvlPqMN8VSwQ7kNvwETXg2z&_nc_gid=ql6pQEyTunm_-7aZe8ZhOg&edm=APs17CUBAAAA&ccb=7-5&ig_cache_key=Mzg2NDc4Njk1MjE4NTAzMzU2Ng%3D%3D.3-ccb7-5&oh=00_Af1mVT0YivcBNYYh8Tix11zZSa_WT1x1HOkBwOL_FBEl8w&oe=69D2B533&_nc_sid=10d13b
[Image] https://scontent-yyz1-1.cdninstagram.com/v/t51.82787-15/656826295_17957941776081709_2725668036447327880_n.jpg?stp=dst-jpg_e35_tt6&efg=eyJ2ZW5jb2RlX3RhZyI6InRocmVhZHMuQ0FST1VTRUxfSVRFTS5pbWFnZV91cmxnZW4uMTQ0MHgxODAxLnNkci5mODI3ODcuZGVmYXVsdF9pbWFnZS5jMiJ9&_nc_ht=scontent-yyz1-1.cdninstagram.com&_nc_cat=101&_nc_oc=Q6cZ2gFb03Cxn08YHN2BwsPC2737S9SVHF3kKWQWvvTJF1vElcjJM-X24YacR-6tmc-Scyg&_nc_ohc=dPBKKqrK8T4Q7kNvwHxFuI_&_nc_gid=ql6pQEyTunm_-7aZe8ZhOg&edm=APs17CUBAAAA&ccb=7-5&ig_cache_key=Mzg2NDc4Njk1MTc5OTE3NDE3MQ%3D%3D.3-ccb7-5&oh=00_Af1GCnajZTmqENbfWTELZa0a_F3u1_Sa0TUYzCi5r0mehA&oe=69D2BA9C&_nc_sid=10d13b
[Image] https://scontent-yyz1-1.cdninstagram.com/v/t51.82787-15/658122054_17957941785081709_8014098124466222014_n.jpg?stp=dst-jpg_e35_tt6&efg=eyJ2ZW5jb2RlX3RhZyI6InRocmVhZHMuQ0FST1VTRUxfSVRFTS5pbWFnZV91cmxnZW4uMTQ0MHgxODAxLnNkci5mODI3ODcuZGVmYXVsdF9pbWFnZS5jMiJ9&_nc_ht=scontent-yyz1-1.cdninstagram.com&_nc_cat=101&_nc_oc=Q6cZ2gFb03Cxn08YHN2BwsPC2737S9SVHF3kKWQWvvTJF1vElcjJM-X24YacR-6tmc-Scyg&_nc_ohc=5s8HyCNQ-R8Q7kNvwGLp5Fz&_nc_gid=ql6pQEyTunm_-7aZe8ZhOg&edm=APs17CUBAAAA&ccb=7-5&ig_cache_key=Mzg2NDc4Njk1MTQ3MTk3NTMxMg%3D%3D.3-ccb7-5&oh=00_Af3HJ-fvxn7wkzsoanNM9V9CGrTMiUv_ysS48bbm-m7A3Q&oe=69D2A573&_nc_sid=10d13b