INFO 04-18 06:30:36 [__init__.py:239] Automatically detected platform cuda.
Namespace(backend='deepspeed-mii', base_url=None, host='127.0.0.1', port=8000, endpoint='/v1/completions', dataset_name='sharegpt', dataset_path='./ShareGPT_V3_unfiltered_cleaned_split.json', max_concurrency=None, model='meta-llama/Meta-Llama-3-8B', tokenizer=None, use_beam_search=False, num_prompts=500, logprobs=None, request_rate=4.0, burstiness=1.0, seed=0, trust_remote_code=False, disable_tqdm=False, profile=False, save_result=False, save_detailed=False, metadata=None, result_dir=None, result_filename=None, ignore_eos=False, percentile_metrics='ttft,tpot,itl', metric_percentiles='99', goodput=None, sonnet_input_len=550, sonnet_output_len=150, sonnet_prefix_len=200, sharegpt_output_len=None, random_input_len=1024, random_output_len=128, random_range_ratio=0.0, random_prefix_len=0, hf_subset=None, hf_split=None, hf_output_len=None, top_p=None, top_k=None, min_p=None, temperature=None, tokenizer_mode='auto', served_model_name=None, lora_modules=None)
Starting initial single prompt test run...
Traceback (most recent call last):
File "/home/ishi/work/deepspeed_client/vllm/benchmarks/benchmark_serving.py", line 1088, in <module>
main(args)
File "/home/ishi/work/deepspeed_client/vllm/benchmarks/benchmark_serving.py", line 684, in main
benchmark_result = asyncio.run(
^^^^^^^^^^^^
File "/home/ishi/.pyenv/versions/3.12.7/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/home/ishi/.pyenv/versions/3.12.7/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ishi/.pyenv/versions/3.12.7/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/ishi/work/deepspeed_client/vllm/benchmarks/benchmark_serving.py", line 297, in benchmark
raise ValueError(
ValueError: Initial test run failed - Please make sure benchmark arguments are correctly specified. Error: Bad Request
Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
server environment:
deepspeed 0.16.3
deepspeed-kernels 0.0.1.dev1698255861
deepspeed-mii 0.3.1+fcd0a5b
server command:
$ python -m mii.entrypoints.openai_api_server --tensor-parallel 1 --model meta-llama/Meta-Llama-3-8B --port 8000
client command:
$ python vllm/benchmarks/benchmark_serving.py --dataset-name sharegpt --dataset-path ./ShareGPT_V3_unfiltered_cleaned_split.json --model meta-llama/Meta-Llama-3-8B --num_prompts 500 --request-rate 4 --port 8000 --backend deepspeed-mii
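As a first isolation step (not part of the original report), the "Bad Request" can be reproduced outside the benchmark script by hand-building the same kind of request. A minimal sketch using only the standard library, assuming the MII server serves /v1/completions on 127.0.0.1:8000 (the model name and fields below mirror the benchmark's payload and are illustrative):

```python
import json
import urllib.error
import urllib.request

def build_completion_payload(model: str, prompt: str, max_tokens: int = 16) -> dict:
    # Fields a /v1/completions request typically carries; adjust to taste.
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stream": True,
    }

def probe(base_url: str = "http://127.0.0.1:8000/v1/completions") -> None:
    payload = build_completion_payload("meta-llama/Meta-Llama-3-8B", "Hello")
    req = urllib.request.Request(
        base_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            print("HTTP", resp.status)
    except urllib.error.HTTPError as e:
        # A 400 here reproduces the benchmark's "Bad Request" directly,
        # and the response body usually says which field the server rejected.
        print("HTTP", e.code, e.read().decode("utf-8", errors="replace"))

if __name__ == "__main__":
    probe()
```

If this probe also returns 400, the server's error body should point at the offending field, independent of benchmark_serving.py.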
error log: (the full traceback is quoted at the top of this report)
No error seems to occur in pull request #15926, but one does occur in my environment. Is there some difference?
In addition, I am currently porting the logic that reads OPENAI_API_KEY and records TTFT from async_request_openai_completions() to async_request_deepspeed_mii() in benchmarks/backend_request_func.py, and the ported code runs without errors in my tests.
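The TTFT side of that port can be sketched independently of any HTTP client. A minimal illustration (hypothetical helper names, not the actual backend_request_func.py code) that builds an Authorization header from OPENAI_API_KEY, in the style of async_request_openai_completions(), and times the first chunk of an async stream:

```python
import asyncio
import os
import time
from typing import AsyncIterator, Dict, Tuple

def auth_headers() -> Dict[str, str]:
    # Bearer-token header from the environment, as the OpenAI backend does;
    # empty dict when the variable is unset.
    api_key = os.environ.get("OPENAI_API_KEY")
    return {"Authorization": f"Bearer {api_key}"} if api_key else {}

async def measure_ttft(chunks: AsyncIterator[bytes]) -> Tuple[float, bytes]:
    # TTFT = elapsed time from request start until the first streamed chunk.
    start = time.perf_counter()
    ttft = 0.0
    body = b""
    async for chunk in chunks:
        if ttft == 0.0:
            ttft = time.perf_counter() - start
        body += chunk
    return ttft, body

async def fake_stream() -> AsyncIterator[bytes]:
    # Stand-in for the server's streamed response, for demonstration only.
    for piece in (b"Hello", b" world"):
        await asyncio.sleep(0.01)
        yield piece

if __name__ == "__main__":
    ttft, body = asyncio.run(measure_ttft(fake_stream()))
    print(f"ttft={ttft:.4f}s body={body!r}")
```

In the real port, the fake stream would be replaced by the chunk iterator of the deepspeed-mii response, with ttft recorded on the first chunk exactly as above.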