api/v2/instances shows count < replicas after creating an AWX deployment with an external (unmanaged) PostgreSQL
kubectl apply -f awx-deploy.yml
---
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  namespace: awx
spec:
  replicas: 2
  image_version: 19.1.0
  admin_user: admin
  admin_password_secret: awx-admin-password
  ingress_type: ingress
  ingress_annotations: |
    kubernetes.io/ingress.class: nginx
  hostname: awx-demo.example.com
  ingress_tls_secret: awx-ingress-tls
  web_resource_requirements:
    requests:
      cpu: 400m
      memory: 2Gi
    limits:
      cpu: 1000m
      memory: 4Gi
  task_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi
  ee_resource_requirements:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 2Gi
---
apiVersion: v1
kind: Secret
metadata:
  name: awx-postgres-configuration
  namespace: awx
stringData:
  host: XXXX
  port: "XXXX"
  database: XXX
  username: XXX
  password: XXX
  type: unmanaged
type: Opaque
AWX HA configuration with 2 instances
api/v2/ping/
{
"ha": false,
"version": "19.1.0",
"active_node": "awx-5776c59677-h9mrj",
"install_uuid": "ba8b8bc6-1010-4e09-b5b2-08cc06901800",
"instances": [
{
"node": "awx-5776c59677-h9mrj",
"uuid": "5b18352d-24e7-47ce-a18d-e0e4cbd994d5",
"heartbeat": "2021-07-09T09:09:26.165742Z",
"capacity": 0,
"version": "19.1.0"
}
],
"instance_groups": [
{
"name": "tower",
"capacity": 0,
"instances": []
}
]
}
api/v2/instances/
{
"count": 1,
"next": null,
"previous": null,
"results": [
{
"id": 1,
"type": "instance",
"url": "/api/v2/instances/1/",
"related": {
"jobs": "/api/v2/instances/1/jobs/",
"instance_groups": "/api/v2/instances/1/instance_groups/"
},
"uuid": "5b18352d-24e7-47ce-a18d-e0e4cbd994d5",
"hostname": "awx-5776c59677-h9mrj",
"created": "2021-07-09T09:08:31.072893Z",
"modified": "2021-07-09T09:09:26.165742Z",
"capacity_adjustment": "1.00",
"version": "19.1.0",
"capacity": 0,
"consumed_capacity": 0,
"percent_capacity_remaining": 0.0,
"jobs_running": 0,
"jobs_total": 0,
"cpu": 0,
"memory": 0,
"cpu_capacity": 0,
"mem_capacity": 0,
"enabled": true,
"managed_by_policy": true
}
]
}
kubectl get pods -n awx
NAME                   READY   STATUS    RESTARTS   AGE
awx-5776c59677-74964   4/4     Running   0          14m
awx-5776c59677-h9mrj   4/4     Running   0          14m
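The mismatch reported here (two running pods, one registered instance) can be found by comparing pod names with the hostnames returned by api/v2/instances/. The following is a minimal sketch, not part of AWX: the helper name `unregistered_pods` is hypothetical, and the inputs are trimmed copies of the values shown in this report.

```python
def unregistered_pods(pod_names, instances_payload):
    """Return the pod names that have no matching AWX instance hostname."""
    registered = {item["hostname"] for item in instances_payload.get("results", [])}
    return sorted(name for name in pod_names if name not in registered)


# Pod names from `kubectl get pods -n awx` in this report.
pods = ["awx-5776c59677-74964", "awx-5776c59677-h9mrj"]

# Trimmed api/v2/instances/ response from this report (only the fields used here).
instances = {"count": 1, "results": [{"hostname": "awx-5776c59677-h9mrj"}]}

print(unregistered_pods(pods, instances))
```

With the data from this report, the only pod left over is awx-5776c59677-74964, the same pod whose awx-task container logs the RuntimeError.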
kubectl exec pod/awx-5776c59677-74964 -n awx -c awx-web -it -- /bin/bash
bash-4.4$ awx-manage check_db
Database Version: PostgreSQL 12.7 (Ubuntu 12.7-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
kubectl logs -n awx pod/awx-5776c59677-74964 -c awx-task
...
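  File "/var/lib/awx/venv/awx/lib64/python3.8/site-packages/awx/main/managers.py", line 107, in me
    raise RuntimeError("No instance found with the current cluster host id")
RuntimeError: No instance found with the current cluster host id
2021-07-09 09:17:27,304 INFO exited: callback-receiver (exit status 1; not expected)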
...
No such problem was observed with the managed PostgreSQL.
The problem is resolved after running
kubectl rollout restart -n awx deployment/awx
kubectl get pods -n awx
NAME                   READY   STATUS    RESTARTS   AGE
awx-686dd7df69-52kgh   4/4     Running   0          4m26s
awx-686dd7df69-v8w2g   4/4     Running   0          4m23s
api/v2/ping/
{
"ha": true,
"version": "19.1.0",
"active_node": "awx-686dd7df69-52kgh",
"install_uuid": "ba8b8bc6-1010-4e09-b5b2-08cc06901800",
"instances": [
{
"node": "awx-686dd7df69-52kgh",
"uuid": "ea773db2-7007-47a8-9987-16ddc79d6ec3",
"heartbeat": "2021-07-09T09:33:46.787935Z",
"capacity": 0,
"version": "19.1.0"
},
{
"node": "awx-686dd7df69-v8w2g",
"uuid": "acef28b0-3977-4dbe-8c10-e9c4f11adab8",
"heartbeat": "2021-07-09T09:33:52.378214Z",
"capacity": 0,
"version": "19.1.0"
}
],
"instance_groups": [
{
"name": "tower",
"capacity": 0,
"instances": []
}
]
}
api/v2/instances/
{
"count": 2,
"next": null,
"previous": null,
"results": [
{
"id": 3,
"type": "instance",
"url": "/api/v2/instances/3/",
"related": {
"jobs": "/api/v2/instances/3/jobs/",
"instance_groups": "/api/v2/instances/3/instance_groups/"
},
"uuid": "ea773db2-7007-47a8-9987-16ddc79d6ec3",
"hostname": "awx-686dd7df69-52kgh",
"created": "2021-07-09T09:33:46.194006Z",
"modified": "2021-07-09T09:33:46.787935Z",
"capacity_adjustment": "1.00",
"version": "19.1.0",
"capacity": 0,
"consumed_capacity": 0,
"percent_capacity_remaining": 0.0,
"jobs_running": 0,
"jobs_total": 0,
"cpu": 0,
"memory": 0,
"cpu_capacity": 0,
"mem_capacity": 0,
"enabled": true,
"managed_by_policy": true
},
{
"id": 4,
"type": "instance",
"url": "/api/v2/instances/4/",
"related": {
"jobs": "/api/v2/instances/4/jobs/",
"instance_groups": "/api/v2/instances/4/instance_groups/"
},
"uuid": "acef28b0-3977-4dbe-8c10-e9c4f11adab8",
"hostname": "awx-686dd7df69-v8w2g",
"created": "2021-07-09T09:33:51.780698Z",
"modified": "2021-07-09T09:33:52.378214Z",
"capacity_adjustment": "1.00",
"version": "19.1.0",
"capacity": 0,
"consumed_capacity": 0,
"percent_capacity_remaining": 0.0,
"jobs_running": 0,
"jobs_total": 0,
"cpu": 0,
"memory": 0,
"cpu_capacity": 0,
"mem_capacity": 0,
"enabled": true,
"managed_by_policy": true
}
]
}
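Later, scaling to 3 replicas shows the same problem, and it is also resolved by running
kubectl rollout restart -n awx deployment/awx
Operator and container log files
@tchellomello @rooftopcellist if either of you has time next week, could you take a look?
If needed, I can reproduce it and give you the kubeconfig and the external IP of the API for a few hours.
It works for me with an external database, but I'll dig a little more.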
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
X-API-Node: awx-toca-657778f5cb-7pdrp
X-API-Product-Name: AWX
X-API-Product-Version: 19.2.2
X-API-Time: 0.023s
{
"ha": true,
"version": "19.2.2",
"active_node": "awx-toca-657778f5cb-7pdrp",
"install_uuid": "e27ea7cb-c400-45fe-a595-9bb5217c71ac",
"instances": [
{
"node": "awx-toca-657778f5cb-7pdrp",
"uuid": "3a34f8fe-8336-4910-8c47-7193694a9536",
"heartbeat": "2021-07-10T19:35:26.186881Z",
"capacity": 296,
"version": "19.2.2"
},
{
"node": "awx-toca-657778f5cb-lm776",
"uuid": "fab5cf31-ae1d-4ddc-8b55-459618c20845",
"heartbeat": "2021-07-10T19:35:52.645367Z",
"capacity": 293,
"version": "19.2.2"
}
],
"instance_groups": [
{
"name": "tower",
"capacity": 0,
"instances": []
},
{
"name": "controlplane",
"capacity": 589,
"instances": [
"awx-toca-657778f5cb-7pdrp",
"awx-toca-657778f5cb-lm776"
]
},
{
"name": "default",
"capacity": 0,
"instances": []
}
kubectl get pods -w | grep awx-toca 15:35:01
awx-toca-657778f5cb-7pdrp 4/4 Running 0 2d14h
awx-toca-657778f5cb-lm776 0/4 Pending 0 0s
awx-toca-657778f5cb-lm776 0/4 Pending 0 0s
awx-toca-657778f5cb-lm776 0/4 Init:0/1 0 0s
awx-toca-657778f5cb-lm776 0/4 PodInitializing 0 1s
awx-toca-657778f5cb-lm776 4/4 Running 0 23s
[awx-toca-657778f5cb-lm776 awx-toca-web] 2021-07-10 19:36:00,757 INFO [-] awx.main.consumers client 'specific.d10caed53de54b76b34bc914c0ab92b6!290a7011729a4e5b8558e9519d1afd95' joined the broadcast group.
[awx-toca-657778f5cb-lm776 awx-toca-web] 2021-07-10 19:36:00,757 INFO [-] awx.main.consumers client 'specific.d10caed53de54b76b34bc914c0ab92b6!290a7011729a4e5b8558e9519d1afd95' joined the broadcast group.
[awx-toca-657778f5cb-lm776 awx-toca-web] 2021-07-10 19:36:00,757 INFO client 'specific.d10caed53de54b76b34bc914c0ab92b6!290a7011729a4e5b8558e9519d1afd95' joined the broadcast group.
[awx-toca-657778f5cb-lm776 awx-toca-web] RESULT 2
The problem may be specific to image_version=19.1.0.
I tried 19.2.2, and creating 2 replicas succeeded.
But if I then set replicas: 3 (from 2 to 3)
kubectl apply -f awx-deploy.yml
api/v2/ping/
{
"ha": false,
"version": "19.2.2",
"active_node": "awx-848f64cdb4-29pcv",
"install_uuid": "88b63b97-2942-49c5-bc5f-e5006a7b5456",
"instances": [
{
"node": "awx-848f64cdb4-spt82",
"uuid": "27494bf7-6fa2-489e-bdb3-82466edbd49c",
"heartbeat": "2021-07-12T09:48:24.680690Z",
"capacity": 79,
"version": "19.2.2"
}
],
"instance_groups": [
{
"name": "controlplane",
"capacity": 79,
"instances": [
"awx-848f64cdb4-spt82"
]
},
{
"name": "default",
"capacity": 0,
"instances": []
}
]
}
kubectl rollout restart -n awx deployment/awx
api/v2/ping/
{
"ha": true,
"version": "19.2.2",
"active_node": "awx-657cd5b84-t5htk",
"install_uuid": "88b63b97-2942-49c5-bc5f-e5006a7b5456",
"instances": [
{
"node": "awx-657cd5b84-g8kx2",
"uuid": "30e28fc4-8c88-4922-a7e1-0196fe790f2f",
"heartbeat": "2021-07-12T10:00:33.162404Z",
"capacity": 79,
"version": "19.2.2"
},
{
"node": "awx-657cd5b84-rg9v4",
"uuid": "501a0ff7-9043-46f4-baae-4602de3107d2",
"heartbeat": "2021-07-12T10:00:36.591979Z",
"capacity": 79,
"version": "19.2.2"
},
{
"node": "awx-657cd5b84-t5htk",
"uuid": "a3308acc-04e9-4da7-88cf-71048d666ffb",
"heartbeat": "2021-07-12T10:00:38.958448Z",
"capacity": 79,
"version": "19.2.2"
}
],
"instance_groups": [
{
"name": "controlplane",
"capacity": 237,
"instances": [
"awx-657cd5b84-g8kx2",
"awx-657cd5b84-rg9v4",
"awx-657cd5b84-t5htk"
]
},
{
"name": "default",
"capacity": 0,
"instances": []
}
]
}
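One quick way to sanity-check a ping response like the one above is to confirm that each instance group's capacity equals the sum of its members' capacities, which holds for every healthy output in this thread (for example 79 + 79 + 79 = 237). This is a rough sketch under that assumption; the helper name `capacity_mismatches` is hypothetical and the payload is a trimmed copy of the response above.

```python
def capacity_mismatches(ping):
    """Map group name -> (reported, expected) where the capacities disagree."""
    per_node = {inst["node"]: inst["capacity"] for inst in ping["instances"]}
    mismatches = {}
    for group in ping["instance_groups"]:
        # Sum the capacity of every member that is actually registered.
        expected = sum(per_node.get(node, 0) for node in group["instances"])
        if expected != group["capacity"]:
            mismatches[group["name"]] = (group["capacity"], expected)
    return mismatches


# Trimmed api/v2/ping/ response taken after the rollout restart.
ping = {
    "instances": [
        {"node": "awx-657cd5b84-g8kx2", "capacity": 79},
        {"node": "awx-657cd5b84-rg9v4", "capacity": 79},
        {"node": "awx-657cd5b84-t5htk", "capacity": 79},
    ],
    "instance_groups": [
        {
            "name": "controlplane",
            "capacity": 237,
            "instances": [
                "awx-657cd5b84-g8kx2",
                "awx-657cd5b84-rg9v4",
                "awx-657cd5b84-t5htk",
            ],
        },
        {"name": "default", "capacity": 0, "instances": []},
    ],
}

print(capacity_mismatches(ping))
```

An empty result means the groups are consistent with the registered instances. A group that lists a pod missing from "instances" would show up as a mismatch.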
replicas: 2
This is the output of the AWX API:
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
X-API-Node: awx-toca-657778f5cb-lm776
X-API-Product-Name: AWX
X-API-Product-Version: 19.2.2
X-API-Time: 0.015s
{
"ha": true,
"version": "19.2.2",
"active_node": "awx-toca-657778f5cb-lm776",
"install_uuid": "e27ea7cb-c400-45fe-a595-9bb5217c71ac",
"instances": [
{
"node": "awx-toca-657778f5cb-4bzps",
"uuid": "617ccf03-2231-44ef-b512-7b97d3207feb",
"heartbeat": "2021-07-25T03:09:29.263282Z",
"capacity": 293,
"version": "19.2.2"
},
{
"node": "awx-toca-657778f5cb-lm776",
"uuid": "fab5cf31-ae1d-4ddc-8b55-459618c20845",
"heartbeat": "2021-07-25T03:09:49.130909Z",
"capacity": 293,
"version": "19.2.2"
}
],
"instance_groups": [
{
"name": "tower",
"capacity": 0,
"instances": []
},
{
"name": "controlplane",
"capacity": 586,
"instances": [
"awx-toca-657778f5cb-4bzps",
"awx-toca-657778f5cb-lm776"
]
},
{
"name": "default",
"capacity": 0,
"instances": []
}
]
}
Then I modified the AWX spec with kubectl edit awx awx-toca
and set replicas: 3
and got the expected 3:
kubectl get pods -w | grep awx 23:10:10
awx-operator-df789fd9c-rqn2k 1/1 Running 0 32h
awx-toca-657778f5cb-4bzps 4/4 Running 0 32h
awx-toca-657778f5cb-lm776 4/4 Running 78 14d
awx-toca-657778f5cb-28fq9 0/4 Pending 0 0s
awx-toca-657778f5cb-28fq9 0/4 Pending 0 0s
awx-toca-657778f5cb-28fq9 0/4 Init:0/1 0 0s
awx-toca-657778f5cb-28fq9 0/4 PodInitializing 0 2s
awx-toca-657778f5cb-28fq9 4/4 Running 0 4s
Looking at the API, it works as expected:
HTTP 200 OK
Allow: GET, HEAD, OPTIONS
Content-Type: application/json
Vary: Accept
X-API-Node: awx-toca-657778f5cb-4bzps
X-API-Product-Name: AWX
X-API-Product-Version: 19.2.2
X-API-Time: 0.014s
{
"ha": true,
"version": "19.2.2",
"active_node": "awx-toca-657778f5cb-4bzps",
"install_uuid": "e27ea7cb-c400-45fe-a595-9bb5217c71ac",
"instances": [
{
"node": "awx-toca-657778f5cb-28fq9",
"uuid": "7801777c-93de-416f-841e-0eb9a1b721d2",
"heartbeat": "2021-07-25T03:10:55.501238Z",
"capacity": 296,
"version": "19.2.2"
},
{
"node": "awx-toca-657778f5cb-4bzps",
"uuid": "617ccf03-2231-44ef-b512-7b97d3207feb",
"heartbeat": "2021-07-25T03:11:29.447748Z",
"capacity": 293,
"version": "19.2.2"
},
{
"node": "awx-toca-657778f5cb-lm776",
"uuid": "fab5cf31-ae1d-4ddc-8b55-459618c20845",
"heartbeat": "2021-07-25T03:10:49.231003Z",
"capacity": 293,
"version": "19.2.2"
}
],
"instance_groups": [
{
"name": "tower",
"capacity": 0,
"instances": []
},
{
"name": "controlplane",
"capacity": 882,
"instances": [
"awx-toca-657778f5cb-28fq9",
"awx-toca-657778f5cb-4bzps",
"awx-toca-657778f5cb-lm776"
]
},
{
"name": "default",
"capacity": 0,
"instances": []
}
]
}
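Keep in mind that any manually issued kubectl scale --replicas
command will be overridden by the operator. All changes must be made directly in the AWX spec. @tklsnk since I can't reproduce it, can you confirm the steps you followed to scale it up?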
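As an illustration of changing the replica count through the AWX spec rather than through kubectl scale, the sketch below builds a kubectl patch command that applies a JSON merge patch to spec.replicas. It only constructs the argument list and does not run it. The helper name `replicas_patch_command` is hypothetical, and the resource name and namespace (awx, awx) are assumed from the manifest at the top of this issue.

```python
import json


def replicas_patch_command(name, namespace, replicas):
    """Build a kubectl argv that sets spec.replicas on an AWX custom resource."""
    patch = json.dumps({"spec": {"replicas": replicas}})
    return [
        "kubectl", "patch", "awx", name,
        "-n", namespace,
        "--type", "merge",
        "-p", patch,
    ]


command = replicas_patch_command("awx", "awx", 3)
print(command)
```

The list can be passed to subprocess.run(command, check=True) on a machine with cluster access. Editing awx-deploy.yml and re-applying it, or using kubectl edit awx, changes the same field.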
After deploying with replicas=2, I edit awx-deploy.yml (set replicas=3) and run kubectl apply -f awx-deploy.yml
@tklsnk yes, that's what I did here, but I couldn't reproduce the same issue.
OK, I'll try with another k8s cluster.
Thank you.
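Any update on this, @tklsnk?
Sorry, I haven't had a chance to try this yet. I hope to do it this week.
It works as expected with an alternate k8s cluster. It may be an issue with a specific cloud provider's k8s implementation.
Thanks for the feedback, @tklsnk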