【Posted】: 2019-11-16 19:32:51
【Question】:
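I'm trying to configure Envoy as a load balancer and am currently stuck on a fallback problem. In my playground cluster I have 3 backend servers, with Envoy as the front proxy. I generate traffic against Envoy with siege and watch the responses. While that is running, I stop one of the backends.
What I want: Envoy should resend requests that failed on the stopped backend to a healthy backend, so that I never receive any 5xx responses.
What I get: when I stop a backend, I receive some 503 responses, and then everything goes back to normal.
What am I doing wrong? I thought fallback_policy should provide this behavior, but it doesn't work.
Here is my config file: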
```yaml
node:
  id: LoadBalancer_01
  cluster: HighloadCluster
admin:
  access_log_path: /var/log/envoy/admin_access.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }
static_resources:
  listeners:
  - name: http_listener
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: request_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              require_tls: NONE
              routes:
              - match: { prefix: "/" }
                route:
                  cluster: backend_service
                  timeout: 1.5s
                  retry_policy:
                    retry_on: 5xx
                    num_retries: 3
                    per_try_timeout: 3s
          http_filters:
          - name: envoy.router
            typed_config:
              "@type": type.googleapis.com/envoy.config.filter.http.router.v2.Router
          access_log:
          - name: envoy.file_access_log
            typed_config:
              "@type": type.googleapis.com/envoy.config.accesslog.v2.FileAccessLog
              path: /var/log/envoy/access.log
  clusters:
  - name: backend_service
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    lb_subset_config:
      fallback_policy: ANY_ENDPOINT
    outlier_detection:
      consecutive_5xx: 1
      interval: 10s
    load_assignment:
      cluster_name: backend_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 1.1.1.1
                port_value: 10000
        - endpoint:
            address:
              socket_address:
                address: 2.2.2.2
                port_value: 10000
        - endpoint:
            address:
              socket_address:
                address: 3.3.3.3
                port_value: 10000
    health_checks:
    - http_health_check:
        path: /api/liveness-probe
      timeout: 1s
      interval: 30s
      unhealthy_interval: 10s
      unhealthy_threshold: 2
      healthy_threshold: 1
      always_log_health_check_failures: true
      event_log_path: /var/log/envoy/health_check.log
```
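
For context on the behavior asked about: `fallback_policy` under `lb_subset_config` controls which endpoints are used when no subset matches the route's metadata, so it is not a retry mechanism. Retrying on a different backend is usually expressed in the route's `retry_policy` instead. The fragment below is a minimal, untested sketch of that idea for the v2 API used above. The timeout values are illustrative assumptions, and whether it removes every 503 in this setup has not been verified.

```yaml
# Sketch only: a variation of the route from the config above (Envoy v2 API).
# Timeout values are illustrative assumptions, not tested recommendations.
routes:
- match: { prefix: "/" }
  route:
    cluster: backend_service
    timeout: 3s                # overall budget for the request, retries included
    retry_policy:
      retry_on: 5xx            # per the Envoy docs, includes connect-failure and refused-stream
      num_retries: 3
      per_try_timeout: 0.5s    # kept below the overall timeout so several tries fit in the budget
      # Ask host selection to skip backends already attempted for this request.
      retry_host_predicate:
      - name: envoy.retry_host_predicates.previous_hosts
      host_selection_retry_max_attempts: 3
```

In the original config, `per_try_timeout` (3s) is larger than the route `timeout` (1.5s), so the overall timeout can expire before a single try times out and leave no room for a retry. Without a host predicate, a retry may also be sent to the same stopped backend until health checking or outlier detection removes it.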
【Comments】: