2024-01-21T18:02:30,189 Created temporary directory: /tmp/pip-build-tracker-3uar7qiv 2024-01-21T18:02:30,190 Initialized build tracking at /tmp/pip-build-tracker-3uar7qiv 2024-01-21T18:02:30,191 Created build tracker: /tmp/pip-build-tracker-3uar7qiv 2024-01-21T18:02:30,191 Entered build tracker: /tmp/pip-build-tracker-3uar7qiv 2024-01-21T18:02:30,192 Created temporary directory: /tmp/pip-wheel-3go2k9k7 2024-01-21T18:02:30,195 Created temporary directory: /tmp/pip-ephem-wheel-cache-tcoubn21 2024-01-21T18:02:30,219 Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple 2024-01-21T18:02:30,223 2 location(s) to search for versions of multi-loras: 2024-01-21T18:02:30,223 * https://pypi.org/simple/multi-loras/ 2024-01-21T18:02:30,223 * https://www.piwheels.org/simple/multi-loras/ 2024-01-21T18:02:30,224 Fetching project page and analyzing links: https://pypi.org/simple/multi-loras/ 2024-01-21T18:02:30,224 Getting page https://pypi.org/simple/multi-loras/ 2024-01-21T18:02:30,226 Found index url https://pypi.org/simple/ 2024-01-21T18:02:30,453 Fetched page https://pypi.org/simple/multi-loras/ as application/vnd.pypi.simple.v1+json 2024-01-21T18:02:30,455 Skipping link: No binaries permitted for multi-loras: https://files.pythonhosted.org/packages/f9/7c/ad7f43fe688303b03f5fc13f48f383e7dcf1f5a154e736bd58894b2b875b/multi_loras-0.1.0-py3-none-any.whl (from https://pypi.org/simple/multi-loras/) (requires-python:>=3.8.0) 2024-01-21T18:02:30,456 Found link https://files.pythonhosted.org/packages/e0/b9/6914be7a810f4cdc6a74bfc25c3cd0b6979452ae52aaa68ccc80fcd3269f/multi_loras-0.1.0.tar.gz (from https://pypi.org/simple/multi-loras/) (requires-python:>=3.8.0), version: 0.1.0 2024-01-21T18:02:30,457 Skipping link: No binaries permitted for multi-loras: https://files.pythonhosted.org/packages/43/b0/65d9ac06cffadc9f28da1bb039e77b1538842dd75dff6814fa5d7fb695d6/multi_loras-0.2.0-py3-none-any.whl (from https://pypi.org/simple/multi-loras/) (requires-python:>=3.8.0) 
2024-01-21T18:02:30,458 Found link https://files.pythonhosted.org/packages/92/a2/0ebc4872978836cb41b7678198724a04230f51543fd361c26112e0692490/multi_loras-0.2.0.tar.gz (from https://pypi.org/simple/multi-loras/) (requires-python:>=3.8.0), version: 0.2.0 2024-01-21T18:02:30,459 Skipping link: No binaries permitted for multi-loras: https://files.pythonhosted.org/packages/01/7d/abc0401e3fcba855543ffe817304df982ce6ce8705367c75e7000750c246/multi_loras-0.3.0-py3-none-any.whl (from https://pypi.org/simple/multi-loras/) (requires-python:>=3.8.0) 2024-01-21T18:02:30,460 Found link https://files.pythonhosted.org/packages/6f/a9/3291b7e932fd87ffbba91f3b41674f70d83e6280f3d35fb7f6163aca1c02/multi_loras-0.3.0.tar.gz (from https://pypi.org/simple/multi-loras/) (requires-python:>=3.8.0), version: 0.3.0 2024-01-21T18:02:30,461 Fetching project page and analyzing links: https://www.piwheels.org/simple/multi-loras/ 2024-01-21T18:02:30,461 Getting page https://www.piwheels.org/simple/multi-loras/ 2024-01-21T18:02:30,463 Found index url https://www.piwheels.org/simple/ 2024-01-21T18:02:30,625 Fetched page https://www.piwheels.org/simple/multi-loras/ as text/html 2024-01-21T18:02:30,627 Skipping link: No binaries permitted for multi-loras: https://www.piwheels.org/simple/multi-loras/multi_loras-0.2.0-py3-none-any.whl#sha256=3e6cace49921c89e7592128cf7376e900982a44acadc97243f43a753288ea1cc (from https://www.piwheels.org/simple/multi-loras/) (requires-python:>=3.8.0) 2024-01-21T18:02:30,627 Skipping link: No binaries permitted for multi-loras: https://www.piwheels.org/simple/multi-loras/multi_loras-0.1.0-py3-none-any.whl#sha256=61640ca439ca6d9847b87c1138b3c1a7a4c06bb9efc9c5bdf513febc784083a8 (from https://www.piwheels.org/simple/multi-loras/) (requires-python:>=3.8.0) 2024-01-21T18:02:30,628 Skipping link: not a file: https://www.piwheels.org/simple/multi-loras/ 2024-01-21T18:02:30,629 Skipping link: not a file: https://pypi.org/simple/multi-loras/ 2024-01-21T18:02:30,647 Given no hashes to 
check 1 links for project 'multi-loras': discarding no candidates 2024-01-21T18:02:30,664 Collecting multi-loras==0.3.0 2024-01-21T18:02:30,667 Created temporary directory: /tmp/pip-unpack-1q58qi97 2024-01-21T18:02:30,881 Downloading multi_loras-0.3.0.tar.gz (103 kB) 2024-01-21T18:02:31,155 Added multi-loras==0.3.0 from https://files.pythonhosted.org/packages/6f/a9/3291b7e932fd87ffbba91f3b41674f70d83e6280f3d35fb7f6163aca1c02/multi_loras-0.3.0.tar.gz to build tracker '/tmp/pip-build-tracker-3uar7qiv' 2024-01-21T18:02:31,157 Running setup.py (path:/tmp/pip-wheel-3go2k9k7/multi-loras_3b1dd4433d8c4ea99200971396ed73d4/setup.py) egg_info for package multi-loras 2024-01-21T18:02:31,158 Created temporary directory: /tmp/pip-pip-egg-info-2n7be8id 2024-01-21T18:02:31,158 Preparing metadata (setup.py): started 2024-01-21T18:02:31,160 Running command python setup.py egg_info 2024-01-21T18:02:32,197 running egg_info 2024-01-21T18:02:32,199 creating /tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info 2024-01-21T18:02:32,223 writing /tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info/PKG-INFO 2024-01-21T18:02:32,227 writing dependency_links to /tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info/dependency_links.txt 2024-01-21T18:02:32,229 writing requirements to /tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info/requires.txt 2024-01-21T18:02:32,230 writing top-level names to /tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info/top_level.txt 2024-01-21T18:02:32,231 writing manifest file '/tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info/SOURCES.txt' 2024-01-21T18:02:32,327 reading manifest file '/tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info/SOURCES.txt' 2024-01-21T18:02:32,329 adding license file 'LICENSE' 2024-01-21T18:02:32,334 writing manifest file '/tmp/pip-pip-egg-info-2n7be8id/multi_loras.egg-info/SOURCES.txt' 2024-01-21T18:02:32,442 Preparing metadata (setup.py): finished with status 'done' 2024-01-21T18:02:32,446 Source in 
/tmp/pip-wheel-3go2k9k7/multi-loras_3b1dd4433d8c4ea99200971396ed73d4 has version 0.3.0, which satisfies requirement multi-loras==0.3.0 from https://files.pythonhosted.org/packages/6f/a9/3291b7e932fd87ffbba91f3b41674f70d83e6280f3d35fb7f6163aca1c02/multi_loras-0.3.0.tar.gz 2024-01-21T18:02:32,447 Removed multi-loras==0.3.0 from https://files.pythonhosted.org/packages/6f/a9/3291b7e932fd87ffbba91f3b41674f70d83e6280f3d35fb7f6163aca1c02/multi_loras-0.3.0.tar.gz from build tracker '/tmp/pip-build-tracker-3uar7qiv' 2024-01-21T18:02:32,454 Created temporary directory: /tmp/pip-unpack-4evxqi4j 2024-01-21T18:02:32,455 Created temporary directory: /tmp/pip-unpack-ay5k8r7m 2024-01-21T18:02:32,462 Building wheels for collected packages: multi-loras 2024-01-21T18:02:32,466 Created temporary directory: /tmp/pip-wheel-q8iyblmh 2024-01-21T18:02:32,467 Building wheel for multi-loras (setup.py): started 2024-01-21T18:02:32,468 Destination directory: /tmp/pip-wheel-q8iyblmh 2024-01-21T18:02:32,469 Running command python setup.py bdist_wheel 2024-01-21T18:02:33,482 running bdist_wheel 2024-01-21T18:02:33,576 running build 2024-01-21T18:02:33,576 running build_py 2024-01-21T18:02:33,603 creating build 2024-01-21T18:02:33,604 creating build/lib 2024-01-21T18:02:33,605 creating build/lib/multi_loras 2024-01-21T18:02:33,606 copying multi_loras/delta_weights.py -> build/lib/multi_loras 2024-01-21T18:02:33,608 copying multi_loras/extract_lora.py -> build/lib/multi_loras 2024-01-21T18:02:33,610 copying multi_loras/__version__.py -> build/lib/multi_loras 2024-01-21T18:02:33,612 copying multi_loras/dare.py -> build/lib/multi_loras 2024-01-21T18:02:33,614 copying multi_loras/merge_peft_adapters.py -> build/lib/multi_loras 2024-01-21T18:02:33,616 copying multi_loras/__init__.py -> build/lib/multi_loras 2024-01-21T18:02:33,617 copying multi_loras/lorahub.py -> build/lib/multi_loras 2024-01-21T18:02:33,620 copying multi_loras/merge_models.py -> build/lib/multi_loras 2024-01-21T18:02:33,622 copying 
multi_loras/__main__.py -> build/lib/multi_loras 2024-01-21T18:02:33,624 copying multi_loras/merging_methods.py -> build/lib/multi_loras 2024-01-21T18:02:33,627 copying multi_loras/orthogonal_component.py -> build/lib/multi_loras 2024-01-21T18:02:33,630 creating build/lib/multi_loras/slora 2024-01-21T18:02:33,631 copying multi_loras/slora/sampling_params.py -> build/lib/multi_loras/slora 2024-01-21T18:02:33,633 copying multi_loras/slora/slora_server.py -> build/lib/multi_loras/slora 2024-01-21T18:02:33,636 copying multi_loras/slora/__init__.py -> build/lib/multi_loras/slora 2024-01-21T18:02:33,637 copying multi_loras/slora/install_slora_kernel.py -> build/lib/multi_loras/slora 2024-01-21T18:02:33,639 copying multi_loras/slora/io_struct.py -> build/lib/multi_loras/slora 2024-01-21T18:02:33,642 creating build/lib/multi_loras/slora/common 2024-01-21T18:02:33,643 copying multi_loras/slora/common/infer_utils.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,645 copying multi_loras/slora/common/gqa_mem_manager.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,647 copying multi_loras/slora/common/build_utils.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,649 copying multi_loras/slora/common/ppl_int8kv_mem_manager.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,651 copying multi_loras/slora/common/__init__.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,653 copying multi_loras/slora/common/mem_manager.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,655 copying multi_loras/slora/common/int8kv_mem_manager.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,657 copying multi_loras/slora/common/mem_allocator.py -> build/lib/multi_loras/slora/common 2024-01-21T18:02:33,660 creating build/lib/multi_loras/slora/models 2024-01-21T18:02:33,661 copying multi_loras/slora/models/__init__.py -> build/lib/multi_loras/slora/models 2024-01-21T18:02:33,663 creating build/lib/multi_loras/slora/router 
2024-01-21T18:02:33,664 copying multi_loras/slora/router/stats.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,666 copying multi_loras/slora/router/req_queue.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,669 copying multi_loras/slora/router/profiler.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,671 copying multi_loras/slora/router/pets_req_queue.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,673 copying multi_loras/slora/router/input_params.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,675 copying multi_loras/slora/router/__init__.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,677 copying multi_loras/slora/router/cluster_req_queue.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,679 copying multi_loras/slora/router/abort_req_queue.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,681 copying multi_loras/slora/router/peft_req_queue.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,684 copying multi_loras/slora/router/manager.py -> build/lib/multi_loras/slora/router 2024-01-21T18:02:33,687 creating build/lib/multi_loras/slora/utils 2024-01-21T18:02:33,688 copying multi_loras/slora/utils/infer_utils.py -> build/lib/multi_loras/slora/utils 2024-01-21T18:02:33,691 copying multi_loras/slora/utils/model_load.py -> build/lib/multi_loras/slora/utils 2024-01-21T18:02:33,693 copying multi_loras/slora/utils/metric.py -> build/lib/multi_loras/slora/utils 2024-01-21T18:02:33,695 copying multi_loras/slora/utils/__init__.py -> build/lib/multi_loras/slora/utils 2024-01-21T18:02:33,697 copying multi_loras/slora/utils/model_utils.py -> build/lib/multi_loras/slora/utils 2024-01-21T18:02:33,699 copying multi_loras/slora/utils/net_utils.py -> build/lib/multi_loras/slora/utils 2024-01-21T18:02:33,701 creating build/lib/multi_loras/slora/common/basemodel 2024-01-21T18:02:33,702 copying multi_loras/slora/common/basemodel/basemodel.py -> 
build/lib/multi_loras/slora/common/basemodel 2024-01-21T18:02:33,705 copying multi_loras/slora/common/basemodel/__init__.py -> build/lib/multi_loras/slora/common/basemodel 2024-01-21T18:02:33,707 copying multi_loras/slora/common/basemodel/infer_struct.py -> build/lib/multi_loras/slora/common/basemodel 2024-01-21T18:02:33,710 creating build/lib/multi_loras/slora/common/configs 2024-01-21T18:02:33,711 copying multi_loras/slora/common/configs/config.py -> build/lib/multi_loras/slora/common/configs 2024-01-21T18:02:33,713 copying multi_loras/slora/common/configs/__init__.py -> build/lib/multi_loras/slora/common/configs 2024-01-21T18:02:33,715 creating build/lib/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:33,716 copying multi_loras/slora/common/basemodel/triton_kernel/dequantize_gemm_int4.py -> build/lib/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:33,719 copying multi_loras/slora/common/basemodel/triton_kernel/destindex_copy_kv.py -> build/lib/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:33,722 copying multi_loras/slora/common/basemodel/triton_kernel/apply_penalty.py -> build/lib/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:33,724 copying multi_loras/slora/common/basemodel/triton_kernel/__init__.py -> build/lib/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:33,726 copying multi_loras/slora/common/basemodel/triton_kernel/dequantize_gemm_int8.py -> build/lib/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:33,728 copying multi_loras/slora/common/basemodel/triton_kernel/quantize_gemm_int8.py -> build/lib/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:33,732 creating build/lib/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:33,733 copying multi_loras/slora/common/basemodel/layer_weights/hf_load_utils.py -> build/lib/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:33,735 copying 
multi_loras/slora/common/basemodel/layer_weights/transformer_layer_weight.py -> build/lib/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:33,737 copying multi_loras/slora/common/basemodel/layer_weights/pre_and_post_layer_weight.py -> build/lib/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:33,739 copying multi_loras/slora/common/basemodel/layer_weights/__init__.py -> build/lib/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:33,741 copying multi_loras/slora/common/basemodel/layer_weights/base_layer_weight.py -> build/lib/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:33,743 creating build/lib/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:33,745 copying multi_loras/slora/common/basemodel/layer_infer/pre_layer_infer.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:33,747 copying multi_loras/slora/common/basemodel/layer_infer/__init__.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:33,749 copying multi_loras/slora/common/basemodel/layer_infer/post_layer_infer.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:33,751 copying multi_loras/slora/common/basemodel/layer_infer/base_layer_infer.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:33,753 copying multi_loras/slora/common/basemodel/layer_infer/transformer_layer_infer.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:33,755 creating build/lib/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:33,756 copying multi_loras/slora/common/basemodel/layer_infer/template/post_layer_infer_template.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:33,758 copying multi_loras/slora/common/basemodel/layer_infer/template/pre_layer_infer_template.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer/template 
2024-01-21T18:02:33,760 copying multi_loras/slora/common/basemodel/layer_infer/template/transformer_layer_infer_template.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:33,763 copying multi_loras/slora/common/basemodel/layer_infer/template/__init__.py -> build/lib/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:33,765 creating build/lib/multi_loras/slora/models/llama2 2024-01-21T18:02:33,767 copying multi_loras/slora/models/llama2/__init__.py -> build/lib/multi_loras/slora/models/llama2 2024-01-21T18:02:33,768 copying multi_loras/slora/models/llama2/model.py -> build/lib/multi_loras/slora/models/llama2 2024-01-21T18:02:33,771 creating build/lib/multi_loras/slora/models/llama 2024-01-21T18:02:33,772 copying multi_loras/slora/models/llama/__init__.py -> build/lib/multi_loras/slora/models/llama 2024-01-21T18:02:33,774 copying multi_loras/slora/models/llama/model.py -> build/lib/multi_loras/slora/models/llama 2024-01-21T18:02:33,776 copying multi_loras/slora/models/llama/infer_struct.py -> build/lib/multi_loras/slora/models/llama 2024-01-21T18:02:33,779 creating build/lib/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:33,780 copying multi_loras/slora/models/llama2/triton_kernel/token_attention_softmax_and_reducev.py -> build/lib/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:33,782 copying multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_reduceV.py -> build/lib/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:33,784 copying multi_loras/slora/models/llama2/triton_kernel/context_flashattention_nopad.py -> build/lib/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:33,786 copying multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_softmax.py -> build/lib/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:33,788 copying multi_loras/slora/models/llama2/triton_kernel/__init__.py -> 
build/lib/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:33,790 copying multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_att1.py -> build/lib/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:33,793 creating build/lib/multi_loras/slora/models/llama2/layer_weights 2024-01-21T18:02:33,794 copying multi_loras/slora/models/llama2/layer_weights/transformer_layer_weight.py -> build/lib/multi_loras/slora/models/llama2/layer_weights 2024-01-21T18:02:33,796 copying multi_loras/slora/models/llama2/layer_weights/__init__.py -> build/lib/multi_loras/slora/models/llama2/layer_weights 2024-01-21T18:02:33,798 creating build/lib/multi_loras/slora/models/llama2/layer_infer 2024-01-21T18:02:33,799 copying multi_loras/slora/models/llama2/layer_infer/__init__.py -> build/lib/multi_loras/slora/models/llama2/layer_infer 2024-01-21T18:02:33,802 copying multi_loras/slora/models/llama2/layer_infer/transformer_layer_infer.py -> build/lib/multi_loras/slora/models/llama2/layer_infer 2024-01-21T18:02:33,805 creating build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,806 copying multi_loras/slora/models/llama/triton_kernel/rotary_emb.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,808 copying multi_loras/slora/models/llama/triton_kernel/token_attention_softmax_and_reducev.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,810 copying multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_reduceV.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,812 copying multi_loras/slora/models/llama/triton_kernel/context_flashattention_nopad.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,815 copying multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_softmax.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,817 copying 
multi_loras/slora/models/llama/triton_kernel/rmsnorm.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,819 copying multi_loras/slora/models/llama/triton_kernel/__init__.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,821 copying multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_att1.py -> build/lib/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:33,824 creating build/lib/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:33,825 copying multi_loras/slora/models/llama/layer_weights/transformer_layer_weight.py -> build/lib/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:33,828 copying multi_loras/slora/models/llama/layer_weights/pre_and_post_layer_weight.py -> build/lib/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:33,830 copying multi_loras/slora/models/llama/layer_weights/__init__.py -> build/lib/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:33,832 creating build/lib/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:33,833 copying multi_loras/slora/models/llama/layer_infer/pre_layer_infer.py -> build/lib/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:33,835 copying multi_loras/slora/models/llama/layer_infer/__init__.py -> build/lib/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:33,836 copying multi_loras/slora/models/llama/layer_infer/post_layer_infer.py -> build/lib/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:33,838 copying multi_loras/slora/models/llama/layer_infer/transformer_layer_infer.py -> build/lib/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:33,841 creating build/lib/multi_loras/slora/router/model_infer 2024-01-21T18:02:33,841 copying multi_loras/slora/router/model_infer/naive_infer_adapter.py -> build/lib/multi_loras/slora/router/model_infer 2024-01-21T18:02:33,844 copying multi_loras/slora/router/model_infer/model_rpc.py -> 
build/lib/multi_loras/slora/router/model_infer
2024-01-21T18:02:33,846 copying multi_loras/slora/router/model_infer/__init__.py -> build/lib/multi_loras/slora/router/model_infer
2024-01-21T18:02:33,847 copying multi_loras/slora/router/model_infer/infer_adapter.py -> build/lib/multi_loras/slora/router/model_infer
2024-01-21T18:02:33,850 copying multi_loras/slora/router/model_infer/post_process.py -> build/lib/multi_loras/slora/router/model_infer
2024-01-21T18:02:33,852 copying multi_loras/slora/router/model_infer/infer_batch.py -> build/lib/multi_loras/slora/router/model_infer
2024-01-21T18:02:33,854 running egg_info
2024-01-21T18:02:33,913 writing multi_loras.egg-info/PKG-INFO
2024-01-21T18:02:33,916 writing dependency_links to multi_loras.egg-info/dependency_links.txt
2024-01-21T18:02:33,918 writing requirements to multi_loras.egg-info/requires.txt
2024-01-21T18:02:33,919 writing top-level names to multi_loras.egg-info/top_level.txt
2024-01-21T18:02:33,962 reading manifest file 'multi_loras.egg-info/SOURCES.txt'
2024-01-21T18:02:33,966 adding license file 'LICENSE'
2024-01-21T18:02:33,972 writing manifest file 'multi_loras.egg-info/SOURCES.txt'
2024-01-21T18:02:34,015 /usr/local/lib/python3.11/dist-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
2024-01-21T18:02:34,016 !!
2024-01-21T18:02:34,017 ********************************************************************************
2024-01-21T18:02:34,017 Please avoid running ``setup.py`` directly.
2024-01-21T18:02:34,018 Instead, use pypa/build, pypa/installer or other
2024-01-21T18:02:34,018 standards-based tools.
2024-01-21T18:02:34,019 See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
2024-01-21T18:02:34,020 ********************************************************************************
2024-01-21T18:02:34,021 !!
2024-01-21T18:02:34,021 self.initialize_options() 2024-01-21T18:02:34,040 installing to build/bdist.linux-armv7l/wheel 2024-01-21T18:02:34,041 running install 2024-01-21T18:02:34,066 running install_lib 2024-01-21T18:02:34,089 creating build/bdist.linux-armv7l 2024-01-21T18:02:34,090 creating build/bdist.linux-armv7l/wheel 2024-01-21T18:02:34,092 creating build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,093 copying build/lib/multi_loras/delta_weights.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,095 copying build/lib/multi_loras/extract_lora.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,098 creating build/bdist.linux-armv7l/wheel/multi_loras/slora 2024-01-21T18:02:34,099 copying build/lib/multi_loras/slora/sampling_params.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora 2024-01-21T18:02:34,100 copying build/lib/multi_loras/slora/slora_server.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora 2024-01-21T18:02:34,103 copying build/lib/multi_loras/slora/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora 2024-01-21T18:02:34,105 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,106 copying build/lib/multi_loras/slora/common/infer_utils.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,108 copying build/lib/multi_loras/slora/common/gqa_mem_manager.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,109 copying build/lib/multi_loras/slora/common/build_utils.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,111 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel 2024-01-21T18:02:34,113 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:34,114 copying build/lib/multi_loras/slora/common/basemodel/triton_kernel/dequantize_gemm_int4.py -> 
build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:34,117 copying build/lib/multi_loras/slora/common/basemodel/triton_kernel/destindex_copy_kv.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:34,119 copying build/lib/multi_loras/slora/common/basemodel/triton_kernel/apply_penalty.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:34,121 copying build/lib/multi_loras/slora/common/basemodel/triton_kernel/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:34,123 copying build/lib/multi_loras/slora/common/basemodel/triton_kernel/dequantize_gemm_int8.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:34,125 copying build/lib/multi_loras/slora/common/basemodel/triton_kernel/quantize_gemm_int8.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/triton_kernel 2024-01-21T18:02:34,128 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:34,129 copying build/lib/multi_loras/slora/common/basemodel/layer_weights/hf_load_utils.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:34,131 copying build/lib/multi_loras/slora/common/basemodel/layer_weights/transformer_layer_weight.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:34,133 copying build/lib/multi_loras/slora/common/basemodel/layer_weights/pre_and_post_layer_weight.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:34,134 copying build/lib/multi_loras/slora/common/basemodel/layer_weights/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:34,136 copying 
build/lib/multi_loras/slora/common/basemodel/layer_weights/base_layer_weight.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_weights 2024-01-21T18:02:34,138 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:34,139 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/pre_layer_infer.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:34,141 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:34,143 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:34,144 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/template/post_layer_infer_template.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:34,146 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/template/pre_layer_infer_template.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:34,148 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/template/transformer_layer_infer_template.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:34,150 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/template/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer/template 2024-01-21T18:02:34,152 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/post_layer_infer.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:34,153 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/base_layer_infer.py -> 
build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:34,155 copying build/lib/multi_loras/slora/common/basemodel/layer_infer/transformer_layer_infer.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel/layer_infer 2024-01-21T18:02:34,157 copying build/lib/multi_loras/slora/common/basemodel/basemodel.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel 2024-01-21T18:02:34,159 copying build/lib/multi_loras/slora/common/basemodel/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel 2024-01-21T18:02:34,161 copying build/lib/multi_loras/slora/common/basemodel/infer_struct.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/basemodel 2024-01-21T18:02:34,163 copying build/lib/multi_loras/slora/common/ppl_int8kv_mem_manager.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,165 copying build/lib/multi_loras/slora/common/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,166 copying build/lib/multi_loras/slora/common/mem_manager.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,168 copying build/lib/multi_loras/slora/common/int8kv_mem_manager.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,170 copying build/lib/multi_loras/slora/common/mem_allocator.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common 2024-01-21T18:02:34,172 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/common/configs 2024-01-21T18:02:34,173 copying build/lib/multi_loras/slora/common/configs/config.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/configs 2024-01-21T18:02:34,175 copying build/lib/multi_loras/slora/common/configs/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/common/configs 2024-01-21T18:02:34,177 copying build/lib/multi_loras/slora/install_slora_kernel.py -> 
build/bdist.linux-armv7l/wheel/multi_loras/slora 2024-01-21T18:02:34,179 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models 2024-01-21T18:02:34,180 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2 2024-01-21T18:02:34,182 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:34,183 copying build/lib/multi_loras/slora/models/llama2/triton_kernel/token_attention_softmax_and_reducev.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:34,186 copying build/lib/multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_reduceV.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:34,188 copying build/lib/multi_loras/slora/models/llama2/triton_kernel/context_flashattention_nopad.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:34,191 copying build/lib/multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_softmax.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:34,193 copying build/lib/multi_loras/slora/models/llama2/triton_kernel/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:34,195 copying build/lib/multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_att1.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/triton_kernel 2024-01-21T18:02:34,198 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/layer_weights 2024-01-21T18:02:34,199 copying build/lib/multi_loras/slora/models/llama2/layer_weights/transformer_layer_weight.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/layer_weights 2024-01-21T18:02:34,201 copying build/lib/multi_loras/slora/models/llama2/layer_weights/__init__.py -> 
build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/layer_weights 2024-01-21T18:02:34,204 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/layer_infer 2024-01-21T18:02:34,205 copying build/lib/multi_loras/slora/models/llama2/layer_infer/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/layer_infer 2024-01-21T18:02:34,207 copying build/lib/multi_loras/slora/models/llama2/layer_infer/transformer_layer_infer.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2/layer_infer 2024-01-21T18:02:34,210 copying build/lib/multi_loras/slora/models/llama2/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2 2024-01-21T18:02:34,212 copying build/lib/multi_loras/slora/models/llama2/model.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama2 2024-01-21T18:02:34,214 copying build/lib/multi_loras/slora/models/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models 2024-01-21T18:02:34,217 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama 2024-01-21T18:02:34,218 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,219 copying build/lib/multi_loras/slora/models/llama/triton_kernel/rotary_emb.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,222 copying build/lib/multi_loras/slora/models/llama/triton_kernel/token_attention_softmax_and_reducev.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,225 copying build/lib/multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_reduceV.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,228 copying build/lib/multi_loras/slora/models/llama/triton_kernel/context_flashattention_nopad.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 
2024-01-21T18:02:34,231 copying build/lib/multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_softmax.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,233 copying build/lib/multi_loras/slora/models/llama/triton_kernel/rmsnorm.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,235 copying build/lib/multi_loras/slora/models/llama/triton_kernel/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,237 copying build/lib/multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_att1.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/triton_kernel 2024-01-21T18:02:34,241 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:34,242 copying build/lib/multi_loras/slora/models/llama/layer_weights/transformer_layer_weight.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:34,245 copying build/lib/multi_loras/slora/models/llama/layer_weights/pre_and_post_layer_weight.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:34,247 copying build/lib/multi_loras/slora/models/llama/layer_weights/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_weights 2024-01-21T18:02:34,249 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:34,250 copying build/lib/multi_loras/slora/models/llama/layer_infer/pre_layer_infer.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:34,253 copying build/lib/multi_loras/slora/models/llama/layer_infer/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:34,254 copying build/lib/multi_loras/slora/models/llama/layer_infer/post_layer_infer.py -> 
build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:34,256 copying build/lib/multi_loras/slora/models/llama/layer_infer/transformer_layer_infer.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama/layer_infer 2024-01-21T18:02:34,259 copying build/lib/multi_loras/slora/models/llama/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama 2024-01-21T18:02:34,261 copying build/lib/multi_loras/slora/models/llama/model.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama 2024-01-21T18:02:34,263 copying build/lib/multi_loras/slora/models/llama/infer_struct.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/models/llama 2024-01-21T18:02:34,266 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,267 copying build/lib/multi_loras/slora/router/stats.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,269 copying build/lib/multi_loras/slora/router/req_queue.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,272 copying build/lib/multi_loras/slora/router/profiler.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,274 copying build/lib/multi_loras/slora/router/pets_req_queue.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,277 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/router/model_infer 2024-01-21T18:02:34,278 copying build/lib/multi_loras/slora/router/model_infer/naive_infer_adapter.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router/model_infer 2024-01-21T18:02:34,281 copying build/lib/multi_loras/slora/router/model_infer/model_rpc.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router/model_infer 2024-01-21T18:02:34,284 copying build/lib/multi_loras/slora/router/model_infer/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router/model_infer 2024-01-21T18:02:34,286 copying 
build/lib/multi_loras/slora/router/model_infer/infer_adapter.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router/model_infer 2024-01-21T18:02:34,289 copying build/lib/multi_loras/slora/router/model_infer/post_process.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router/model_infer 2024-01-21T18:02:34,291 copying build/lib/multi_loras/slora/router/model_infer/infer_batch.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router/model_infer 2024-01-21T18:02:34,294 copying build/lib/multi_loras/slora/router/input_params.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,296 copying build/lib/multi_loras/slora/router/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,298 copying build/lib/multi_loras/slora/router/cluster_req_queue.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,300 copying build/lib/multi_loras/slora/router/abort_req_queue.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,303 copying build/lib/multi_loras/slora/router/peft_req_queue.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,305 copying build/lib/multi_loras/slora/router/manager.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/router 2024-01-21T18:02:34,309 creating build/bdist.linux-armv7l/wheel/multi_loras/slora/utils 2024-01-21T18:02:34,310 copying build/lib/multi_loras/slora/utils/infer_utils.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/utils 2024-01-21T18:02:34,312 copying build/lib/multi_loras/slora/utils/model_load.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/utils 2024-01-21T18:02:34,314 copying build/lib/multi_loras/slora/utils/metric.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/utils 2024-01-21T18:02:34,316 copying build/lib/multi_loras/slora/utils/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/utils 2024-01-21T18:02:34,317 copying 
build/lib/multi_loras/slora/utils/model_utils.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/utils 2024-01-21T18:02:34,319 copying build/lib/multi_loras/slora/utils/net_utils.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora/utils 2024-01-21T18:02:34,322 copying build/lib/multi_loras/slora/io_struct.py -> build/bdist.linux-armv7l/wheel/multi_loras/slora 2024-01-21T18:02:34,324 copying build/lib/multi_loras/__version__.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,326 copying build/lib/multi_loras/dare.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,328 copying build/lib/multi_loras/merge_peft_adapters.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,331 copying build/lib/multi_loras/__init__.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,332 copying build/lib/multi_loras/lorahub.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,335 copying build/lib/multi_loras/merge_models.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,338 copying build/lib/multi_loras/__main__.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,340 copying build/lib/multi_loras/merging_methods.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,344 copying build/lib/multi_loras/orthogonal_component.py -> build/bdist.linux-armv7l/wheel/multi_loras 2024-01-21T18:02:34,347 running install_egg_info 2024-01-21T18:02:34,387 Copying multi_loras.egg-info to build/bdist.linux-armv7l/wheel/multi_loras-0.3.0-py3.11.egg-info 2024-01-21T18:02:34,398 running install_scripts 2024-01-21T18:02:34,412 creating build/bdist.linux-armv7l/wheel/multi_loras-0.3.0.dist-info/WHEEL 2024-01-21T18:02:34,415 creating '/tmp/pip-wheel-q8iyblmh/multi_loras-0.3.0-py3-none-any.whl' and adding 'build/bdist.linux-armv7l/wheel' to it 2024-01-21T18:02:34,417 adding 'multi_loras/__init__.py' 2024-01-21T18:02:34,419 adding 'multi_loras/__main__.py' 
2024-01-21T18:02:34,421 adding 'multi_loras/__version__.py' 2024-01-21T18:02:34,423 adding 'multi_loras/dare.py' 2024-01-21T18:02:34,425 adding 'multi_loras/delta_weights.py' 2024-01-21T18:02:34,427 adding 'multi_loras/extract_lora.py' 2024-01-21T18:02:34,430 adding 'multi_loras/lorahub.py' 2024-01-21T18:02:34,433 adding 'multi_loras/merge_models.py' 2024-01-21T18:02:34,434 adding 'multi_loras/merge_peft_adapters.py' 2024-01-21T18:02:34,440 adding 'multi_loras/merging_methods.py' 2024-01-21T18:02:34,442 adding 'multi_loras/orthogonal_component.py' 2024-01-21T18:02:34,444 adding 'multi_loras/slora/__init__.py' 2024-01-21T18:02:34,446 adding 'multi_loras/slora/install_slora_kernel.py' 2024-01-21T18:02:34,448 adding 'multi_loras/slora/io_struct.py' 2024-01-21T18:02:34,449 adding 'multi_loras/slora/sampling_params.py' 2024-01-21T18:02:34,454 adding 'multi_loras/slora/slora_server.py' 2024-01-21T18:02:34,456 adding 'multi_loras/slora/common/__init__.py' 2024-01-21T18:02:34,457 adding 'multi_loras/slora/common/build_utils.py' 2024-01-21T18:02:34,459 adding 'multi_loras/slora/common/gqa_mem_manager.py' 2024-01-21T18:02:34,461 adding 'multi_loras/slora/common/infer_utils.py' 2024-01-21T18:02:34,462 adding 'multi_loras/slora/common/int8kv_mem_manager.py' 2024-01-21T18:02:34,464 adding 'multi_loras/slora/common/mem_allocator.py' 2024-01-21T18:02:34,466 adding 'multi_loras/slora/common/mem_manager.py' 2024-01-21T18:02:34,468 adding 'multi_loras/slora/common/ppl_int8kv_mem_manager.py' 2024-01-21T18:02:34,470 adding 'multi_loras/slora/common/basemodel/__init__.py' 2024-01-21T18:02:34,472 adding 'multi_loras/slora/common/basemodel/basemodel.py' 2024-01-21T18:02:34,474 adding 'multi_loras/slora/common/basemodel/infer_struct.py' 2024-01-21T18:02:34,476 adding 'multi_loras/slora/common/basemodel/layer_infer/__init__.py' 2024-01-21T18:02:34,477 adding 'multi_loras/slora/common/basemodel/layer_infer/base_layer_infer.py' 2024-01-21T18:02:34,479 adding 
'multi_loras/slora/common/basemodel/layer_infer/post_layer_infer.py' 2024-01-21T18:02:34,480 adding 'multi_loras/slora/common/basemodel/layer_infer/pre_layer_infer.py' 2024-01-21T18:02:34,482 adding 'multi_loras/slora/common/basemodel/layer_infer/transformer_layer_infer.py' 2024-01-21T18:02:34,484 adding 'multi_loras/slora/common/basemodel/layer_infer/template/__init__.py' 2024-01-21T18:02:34,485 adding 'multi_loras/slora/common/basemodel/layer_infer/template/post_layer_infer_template.py' 2024-01-21T18:02:34,487 adding 'multi_loras/slora/common/basemodel/layer_infer/template/pre_layer_infer_template.py' 2024-01-21T18:02:34,488 adding 'multi_loras/slora/common/basemodel/layer_infer/template/transformer_layer_infer_template.py' 2024-01-21T18:02:34,491 adding 'multi_loras/slora/common/basemodel/layer_weights/__init__.py' 2024-01-21T18:02:34,492 adding 'multi_loras/slora/common/basemodel/layer_weights/base_layer_weight.py' 2024-01-21T18:02:34,494 adding 'multi_loras/slora/common/basemodel/layer_weights/hf_load_utils.py' 2024-01-21T18:02:34,495 adding 'multi_loras/slora/common/basemodel/layer_weights/pre_and_post_layer_weight.py' 2024-01-21T18:02:34,497 adding 'multi_loras/slora/common/basemodel/layer_weights/transformer_layer_weight.py' 2024-01-21T18:02:34,499 adding 'multi_loras/slora/common/basemodel/triton_kernel/__init__.py' 2024-01-21T18:02:34,501 adding 'multi_loras/slora/common/basemodel/triton_kernel/apply_penalty.py' 2024-01-21T18:02:34,504 adding 'multi_loras/slora/common/basemodel/triton_kernel/dequantize_gemm_int4.py' 2024-01-21T18:02:34,506 adding 'multi_loras/slora/common/basemodel/triton_kernel/dequantize_gemm_int8.py' 2024-01-21T18:02:34,508 adding 'multi_loras/slora/common/basemodel/triton_kernel/destindex_copy_kv.py' 2024-01-21T18:02:34,511 adding 'multi_loras/slora/common/basemodel/triton_kernel/quantize_gemm_int8.py' 2024-01-21T18:02:34,512 adding 'multi_loras/slora/common/configs/__init__.py' 2024-01-21T18:02:34,513 adding 
'multi_loras/slora/common/configs/config.py' 2024-01-21T18:02:34,515 adding 'multi_loras/slora/models/__init__.py' 2024-01-21T18:02:34,517 adding 'multi_loras/slora/models/llama/__init__.py' 2024-01-21T18:02:34,518 adding 'multi_loras/slora/models/llama/infer_struct.py' 2024-01-21T18:02:34,519 adding 'multi_loras/slora/models/llama/model.py' 2024-01-21T18:02:34,521 adding 'multi_loras/slora/models/llama/layer_infer/__init__.py' 2024-01-21T18:02:34,523 adding 'multi_loras/slora/models/llama/layer_infer/post_layer_infer.py' 2024-01-21T18:02:34,524 adding 'multi_loras/slora/models/llama/layer_infer/pre_layer_infer.py' 2024-01-21T18:02:34,525 adding 'multi_loras/slora/models/llama/layer_infer/transformer_layer_infer.py' 2024-01-21T18:02:34,527 adding 'multi_loras/slora/models/llama/layer_weights/__init__.py' 2024-01-21T18:02:34,528 adding 'multi_loras/slora/models/llama/layer_weights/pre_and_post_layer_weight.py' 2024-01-21T18:02:34,530 adding 'multi_loras/slora/models/llama/layer_weights/transformer_layer_weight.py' 2024-01-21T18:02:34,531 adding 'multi_loras/slora/models/llama/triton_kernel/__init__.py' 2024-01-21T18:02:34,533 adding 'multi_loras/slora/models/llama/triton_kernel/context_flashattention_nopad.py' 2024-01-21T18:02:34,535 adding 'multi_loras/slora/models/llama/triton_kernel/rmsnorm.py' 2024-01-21T18:02:34,536 adding 'multi_loras/slora/models/llama/triton_kernel/rotary_emb.py' 2024-01-21T18:02:34,538 adding 'multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_att1.py' 2024-01-21T18:02:34,539 adding 'multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_reduceV.py' 2024-01-21T18:02:34,541 adding 'multi_loras/slora/models/llama/triton_kernel/token_attention_nopad_softmax.py' 2024-01-21T18:02:34,542 adding 'multi_loras/slora/models/llama/triton_kernel/token_attention_softmax_and_reducev.py' 2024-01-21T18:02:34,544 adding 'multi_loras/slora/models/llama2/__init__.py' 2024-01-21T18:02:34,545 adding 
'multi_loras/slora/models/llama2/model.py' 2024-01-21T18:02:34,546 adding 'multi_loras/slora/models/llama2/layer_infer/__init__.py' 2024-01-21T18:02:34,548 adding 'multi_loras/slora/models/llama2/layer_infer/transformer_layer_infer.py' 2024-01-21T18:02:34,550 adding 'multi_loras/slora/models/llama2/layer_weights/__init__.py' 2024-01-21T18:02:34,551 adding 'multi_loras/slora/models/llama2/layer_weights/transformer_layer_weight.py' 2024-01-21T18:02:34,553 adding 'multi_loras/slora/models/llama2/triton_kernel/__init__.py' 2024-01-21T18:02:34,554 adding 'multi_loras/slora/models/llama2/triton_kernel/context_flashattention_nopad.py' 2024-01-21T18:02:34,556 adding 'multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_att1.py' 2024-01-21T18:02:34,557 adding 'multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_reduceV.py' 2024-01-21T18:02:34,558 adding 'multi_loras/slora/models/llama2/triton_kernel/token_attention_nopad_softmax.py' 2024-01-21T18:02:34,560 adding 'multi_loras/slora/models/llama2/triton_kernel/token_attention_softmax_and_reducev.py' 2024-01-21T18:02:34,561 adding 'multi_loras/slora/router/__init__.py' 2024-01-21T18:02:34,563 adding 'multi_loras/slora/router/abort_req_queue.py' 2024-01-21T18:02:34,564 adding 'multi_loras/slora/router/cluster_req_queue.py' 2024-01-21T18:02:34,565 adding 'multi_loras/slora/router/input_params.py' 2024-01-21T18:02:34,568 adding 'multi_loras/slora/router/manager.py' 2024-01-21T18:02:34,569 adding 'multi_loras/slora/router/peft_req_queue.py' 2024-01-21T18:02:34,571 adding 'multi_loras/slora/router/pets_req_queue.py' 2024-01-21T18:02:34,572 adding 'multi_loras/slora/router/profiler.py' 2024-01-21T18:02:34,573 adding 'multi_loras/slora/router/req_queue.py' 2024-01-21T18:02:34,575 adding 'multi_loras/slora/router/stats.py' 2024-01-21T18:02:34,576 adding 'multi_loras/slora/router/model_infer/__init__.py' 2024-01-21T18:02:34,578 adding 'multi_loras/slora/router/model_infer/infer_adapter.py' 
2024-01-21T18:02:34,580 adding 'multi_loras/slora/router/model_infer/infer_batch.py' 2024-01-21T18:02:34,583 adding 'multi_loras/slora/router/model_infer/model_rpc.py' 2024-01-21T18:02:34,585 adding 'multi_loras/slora/router/model_infer/naive_infer_adapter.py' 2024-01-21T18:02:34,587 adding 'multi_loras/slora/router/model_infer/post_process.py' 2024-01-21T18:02:34,588 adding 'multi_loras/slora/utils/__init__.py' 2024-01-21T18:02:34,590 adding 'multi_loras/slora/utils/infer_utils.py' 2024-01-21T18:02:34,591 adding 'multi_loras/slora/utils/metric.py' 2024-01-21T18:02:34,592 adding 'multi_loras/slora/utils/model_load.py' 2024-01-21T18:02:34,594 adding 'multi_loras/slora/utils/model_utils.py' 2024-01-21T18:02:34,595 adding 'multi_loras/slora/utils/net_utils.py' 2024-01-21T18:02:34,597 adding 'multi_loras-0.3.0.dist-info/LICENSE' 2024-01-21T18:02:34,599 adding 'multi_loras-0.3.0.dist-info/METADATA' 2024-01-21T18:02:34,600 adding 'multi_loras-0.3.0.dist-info/WHEEL' 2024-01-21T18:02:34,601 adding 'multi_loras-0.3.0.dist-info/top_level.txt' 2024-01-21T18:02:34,603 adding 'multi_loras-0.3.0.dist-info/RECORD' 2024-01-21T18:02:34,607 removing build/bdist.linux-armv7l/wheel 2024-01-21T18:02:34,745 Building wheel for multi-loras (setup.py): finished with status 'done' 2024-01-21T18:02:34,749 Created wheel for multi-loras: filename=multi_loras-0.3.0-py3-none-any.whl size=137969 sha256=339a859719ed61d8efe1ade890db9569b4cb5f620f9c58e4e33350c36b77a142 2024-01-21T18:02:34,750 Stored in directory: /tmp/pip-ephem-wheel-cache-tcoubn21/wheels/7d/7e/67/14355d37e8daddcee1921f3170863b732985822a222b941c73 2024-01-21T18:02:34,765 Successfully built multi-loras 2024-01-21T18:02:34,773 Removed build tracker: '/tmp/pip-build-tracker-3uar7qiv'