💻 Code: Change the default behavior of API key cooldown so that it is off by default. When a channel has only one API key, that key is never cooled down. When API_KEY_COOLDOWN_PERIOD is 0, no cooldown is applied. Once cooldown is enabled, any error with an API key triggers a cooldown.
Files changed:
- README.md +1 -1
- README_CN.md +1 -1
- main.py +4 -2
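The rules in the commit message reduce to a single condition. The following is a minimal sketch of that condition, not the project's actual code; the function name `should_cool_down` and its parameter names are assumptions made for illustration.

```python
def should_cool_down(cooldown_period: int, api_key_count: int) -> bool:
    """Return True only when cooldown is enabled and the channel has a spare key.

    Hypothetical helper: cooldown_period mirrors API_KEY_COOLDOWN_PERIOD
    (0 means disabled), api_key_count is the number of keys in the channel.
    """
    return cooldown_period > 0 and api_key_count > 1


# Walk through the three cases named in the commit message.
cases = [
    (0, 3),   # default period of 0: cooldown disabled
    (60, 1),  # a single key is never cooled down
    (60, 3),  # cooldown enabled and several keys available
]
for period, key_count in cases:
    print(period, key_count, should_cool_down(period, key_count))
```

Keeping the check in one pure function makes the "off by default" and "single key" rules easy to test separately from the request-handling code.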
README.md
@@ -92,7 +92,7 @@ providers:
     preferences:
       API_KEY_RATE_LIMIT: 15/min # Each API Key can request up to 15 times per minute, optional. The default is 999999/min.
       # API_KEY_RATE_LIMIT: 15/min,10/day # Supports multiple frequency constraints
-      API_KEY_COOLDOWN_PERIOD: 60 # Each API Key will be cooled down for 60 seconds after encountering a 429 error. Optional, the default is
+      API_KEY_COOLDOWN_PERIOD: 60 # Each API Key will be cooled down for 60 seconds after encountering a 429 error. Optional, the default is 0 seconds. When set to 0, the cooling mechanism is not enabled.
 
   - provider: vertex
     project_id: gen-lang-client-xxxxxxxxxxxxxx # Description: Your Google Cloud project ID. Format: String, usually composed of lowercase letters, numbers, and hyphens. How to obtain: You can find your project ID in the project selector of the Google Cloud Console.
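To illustrate how an optional preference with a default of 0 might be read from a parsed provider entry, here is a small sketch. The helper `nested_get` is a hypothetical stand-in written for this example; the project's own `safe_get` (seen in the main.py change) may behave differently. The provider dictionaries are made-up sample data.

```python
from typing import Any


def nested_get(data: Any, *keys: str, default: Any = None) -> Any:
    """Walk nested dicts; return `default` if a key is missing or the value is None."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return default if current is None else current


# Sample provider entries as they might look after YAML parsing (assumed shapes).
provider_with_cooldown = {
    "provider": "example",
    "preferences": {"API_KEY_COOLDOWN_PERIOD": 60},
}
provider_without_preferences = {"provider": "example"}

period_a = nested_get(
    provider_with_cooldown, "preferences", "API_KEY_COOLDOWN_PERIOD", default=0
)
period_b = nested_get(
    provider_without_preferences, "preferences", "API_KEY_COOLDOWN_PERIOD", default=0
)
print(period_a, period_b)
```

With a default of 0, a provider that omits the setting falls into the "cooldown disabled" case without any extra branching.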
README_CN.md
@@ -92,7 +92,7 @@ providers:
     preferences:
       API_KEY_RATE_LIMIT: 15/min # Maximum number of requests per minute for each API Key, optional. The default is 999999/min
       # API_KEY_RATE_LIMIT: 15/min,10/day # Supports multiple frequency constraints
-      API_KEY_COOLDOWN_PERIOD: 60 # Cooldown time for each API Key after encountering a 429 error, in seconds, optional. The default is
+      API_KEY_COOLDOWN_PERIOD: 60 # Cooldown time for each API Key after encountering a 429 error, in seconds, optional. The default is 0 seconds. When set to 0 seconds, the cooldown mechanism is not enabled.
 
   - provider: vertex
     project_id: gen-lang-client-xxxxxxxxxxxxxx # Description: Your Google Cloud project ID. Format: String, usually composed of lowercase letters, numbers, and hyphens. How to obtain: You can find your project ID in the project selector of the Google Cloud Console.
main.py
@@ -1013,9 +1013,11 @@ class ModelRequestHandler:
     num_matching_providers = len(matching_providers)
     index = 0
 
-    …
+    cooling_time = safe_get(provider, "preferences", "API_KEY_COOLDOWN_PERIOD", default=0)
+    api_key_count = provider_api_circular_list[channel_id].get_items_count()
+    if cooling_time > 0 and api_key_count > 1:
         current_api = await provider_api_circular_list[channel_id].after_next_current()
-        await provider_api_circular_list[channel_id].set_cooling(current_api, cooling_time=
+        await provider_api_circular_list[channel_id].set_cooling(current_api, cooling_time=cooling_time)
 
     logger.error(f"Error {status_code} with provider {channel_id}: {error_message}")
     if is_debug:
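The effect of this change on key rotation can be sketched with a small, self-contained model. This is not the project's implementation: `CoolingKeyRing` and `report_error` are hypothetical names, the sketch is synchronous whereas the real methods are awaited, and only `get_items_count` and `set_cooling` are borrowed as method names from the diff.

```python
import time
from typing import Callable, Dict, List


class CoolingKeyRing:
    """Hypothetical round-robin key list that skips keys while they cool down."""

    def __init__(self, keys: List[str], clock: Callable[[], float] = time.monotonic):
        self._keys = list(keys)
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._cooling_until: Dict[str, float] = {}
        self._index = 0

    def get_items_count(self) -> int:
        return len(self._keys)

    def set_cooling(self, key: str, cooling_time: float) -> None:
        # Record the time at which the key becomes usable again.
        self._cooling_until[key] = self._clock() + cooling_time

    def next(self) -> str:
        # Try each key at most once, starting from the current position.
        now = self._clock()
        for _ in range(len(self._keys)):
            key = self._keys[self._index]
            self._index = (self._index + 1) % len(self._keys)
            if self._cooling_until.get(key, 0.0) <= now:
                return key
        raise RuntimeError("all API keys are cooling down")


def report_error(ring: CoolingKeyRing, key: str, cooling_time: float) -> bool:
    """Cool the key down only if cooldown is enabled and another key exists."""
    if cooling_time > 0 and ring.get_items_count() > 1:
        ring.set_cooling(key, cooling_time)
        return True
    return False
```

Guarding on the key count means a channel with a single key keeps serving requests after an error instead of being locked out for the whole cooldown period.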