Hello,
I'm using Kubernetes to deploy my Python application. Kubernetes provides a livenessProbe and a readinessProbe.
How can I check whether my Celery beat or Celery worker is alive and in a correct state?
The PID is not a solution because it cannot be used to detect a deadlock, for example.
Thanks in advance for your help.
Best regards,
Celery has a monitoring API you can use.
A pod should be considered live if the Celery worker sends a heartbeat.
A pod should be considered ready if the worker has sent the worker-online event.
If you have a specific problem or feature request, please open a separate issue.
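The heartbeat/worker-online rule above can be sketched as a small state tracker. This is only an illustration, not part of Celery: the class name, the `grace_seconds` default of 120, and the way events reach `handle()` are assumptions. It relies only on the `type`, `hostname`, and `timestamp` fields that worker events carry, and the worker has to run with events enabled for them to be sent at all.

```python
import time


class WorkerState:
    """Tracks readiness and liveness of one worker from its event stream.

    Hypothetical helper: Celery does not ship this class.
    """

    def __init__(self, hostname, grace_seconds=120.0):
        self.hostname = hostname
        # Assumed tolerance between heartbeats; tune it for your workers.
        self.grace_seconds = grace_seconds
        self.ready = False
        self.last_heartbeat = None

    def handle(self, event):
        """Update state from one event dict ('type', 'hostname', 'timestamp')."""
        if event.get("hostname") != self.hostname:
            return  # the event belongs to another worker
        kind = event.get("type")
        if kind == "worker-online":
            self.ready = True
        elif kind == "worker-heartbeat":
            self.last_heartbeat = event["timestamp"]
        elif kind == "worker-offline":
            self.ready = False
            self.last_heartbeat = None

    def is_live(self, now=None):
        """Live means a heartbeat arrived within the grace period."""
        if self.last_heartbeat is None:
            return False
        now = time.time() if now is None else now
        return now - self.last_heartbeat <= self.grace_seconds
```

A probe built on this still needs a long-running process that consumes the event stream, which is a different trade-off from the one-shot commands discussed in the rest of the thread.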
Would this work?
readinessProbe:
  exec:
    command:
    - "/bin/sh"
    - "-c"
    - "celery -A path.to.app status | grep -o ': OK'"
  initialDelaySeconds: 30
  periodSeconds: 10
@7wonders You would need to extract the Celery node name first. This readinessProbe fails if any Celery instance fails, which is not what you want.
@thedrow Hmm, it actually looks like it is the actual node.
/bin/sh -c 'exec celery -A path.to.app inspect ping -d celery@$HOSTNAME'
is good enough for a readiness check and verifies only one node.
Note that in some apps this command can take several seconds at full CPU, and the Kubernetes default is to run the probe every 10 seconds.
A high periodSeconds is therefore much safer (ours is set to 300).
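To put rough numbers on that advice, here is a small sketch of the probe's CPU share. The 3-second probe duration is an assumed figure standing in for "several seconds", and `probe_cpu_share` is a hypothetical helper name.

```python
def probe_cpu_share(probe_seconds, period_seconds):
    """Fraction of one CPU core spent on the probe, assuming it saturates the core."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    # The probe cannot use more than the whole core, so cap the share at 1.0.
    return min(1.0, probe_seconds / period_seconds)


# 3.0 seconds per probe is an assumption, not a measured value.
for period in (10, 60, 300):
    share = probe_cpu_share(3.0, period)
    print(f"periodSeconds={period}: {share:.0%} of one core")
```

Under that assumption the probe alone takes about 30% of a core at the 10-second default, 5% at 60 seconds, and 1% at 300 seconds, per worker pod.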
@redbaron Did that command work for you? If it did, what are your liveness and readiness probe settings?
For some reason, this readiness probe is nowhere near satisfactory for us. inspect responds non-deterministically even with no load on our cluster. We run it in this form:
celery inspect ping -b "redis://archii-redis-master:6379" -d celery@archii-task-crawl-integration-7d96d86b9d-jwtq7
Also, at the normal ping interval (10 seconds), our cluster is completely killed by the CPU that Celery requires.
I use this for liveness with a 30-second interval: sh -c celery -A path.to.app status | grep "${HOSTNAME}:.*OK"
I use this for liveness with a 30-second interval: sh -c celery -A path.to.app inspect ping --destination celery@${HOSTNAME}
It doesn't seem to cause any extra load; I run a fleet of well over 100 workers.
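For reference, the same single-node ping can be sketched with Celery's Python control API instead of the CLI. The function names here are hypothetical, and the sketch assumes the worker uses the default node name `celery@<hostname>` and that `app` is your project's Celery application object.

```python
import os


def own_node_name(hostname=None):
    """Default Celery node name for this pod: celery@<hostname>."""
    return "celery@" + (hostname or os.environ.get("HOSTNAME", ""))


def ping_ok(replies, node):
    """True if the ping replies contain {'ok': 'pong'} for the given node."""
    # ping() returns None when no worker answered within the timeout.
    if not replies:
        return False
    return replies.get(node, {}).get("ok") == "pong"


def probe(app, timeout=5.0):
    """Ping only this pod's worker and report whether it answered."""
    node = own_node_name()
    replies = app.control.inspect(destination=[node], timeout=timeout).ping()
    return ping_ok(replies, node)
```

A probe script would call `sys.exit(0 if probe(app) else 1)`. Limiting `destination` to the pod's own node is what keeps one unhealthy worker from failing every other pod's probe.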
Readiness probes aren't necessary; Celery is never used in a service. I just set minReadySeconds: 10, which is good enough to delay worker startup in rolling Deployments. It obviously depends on your project's Celery startup time, so examine the logs and set it accordingly.
Readiness probes are still useful even when they are not used in a service. Specifically, when you deploy workers and want to make sure the deployment was successful, you usually use kubectl rollout status deployment. Without readiness probes, we have deployed bad code that never started Celery and didn't know it.
My solution was:
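readinessProbe:
  exec:
    # The script is passed without extra \"...\" quoting: wrapped in quotes,
    # python -c would evaluate a bare string and always exit 0.
    command:
      [
        "/usr/local/bin/python",
        "-c",
        "import os;from celery.task.control import inspect;from <APP> import celery_app;exit(0 if os.environ['HOSTNAME'] in ','.join(inspect(app=celery_app).stats().keys()) else 1)"
      ]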
since the others don't seem to work 🤷‍♂️
Thanks, @yardensachs!
I spent a lot of time debugging what was wrong with the other solutions, but got nowhere.
It seems the celery inspect ping command doesn't return exit(0) or anything like it.
celery inspect ping does work, but you need bash to replace the environment variable, like this:
livenessProbe:
  exec:
    # bash is needed to replace the environment variable
    command: [
      "bash",
      "-c",
      "celery inspect ping -A apps -d celery@$HOSTNAME"
    ]
  initialDelaySeconds: 30  # startup takes some time
  periodSeconds: 60  # default is quite often and celery uses a lot cpu/ram then.
  timeoutSeconds: 10  # default is too low
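The reason a shell is needed is that an exec probe hands each array element to the program verbatim, so `$HOSTNAME` is only replaced if something like bash does it. The sketch below imitates that substitution with the standard library; `shell_expand` is a hypothetical helper and only a rough approximation of what a shell does. Kubernetes does document its own `$(VAR_NAME)` substitution for command and args, which is a different syntax.

```python
import string


def shell_expand(argument, env):
    """Approximate shell expansion of $NAME and ${NAME} using the given mapping."""
    # safe_substitute leaves unknown variables untouched instead of raising.
    return string.Template(argument).safe_substitute(env)


raw_argument = "celery@$HOSTNAME"
env = {"HOSTNAME": "worker-7d96d86b9d-jwtq7"}  # example value, made up

# Without a shell the worker name is the literal text, so no node matches.
print("exec array, no shell:", raw_argument)
# With bash -c the variable is replaced before celery runs.
print("through bash -c:     ", shell_expand(raw_argument, env))
```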
Good to know.
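We ended up ripping celery inspect ping out of our liveness probes because we found that, under heavier load, the ping would hang for minutes at a time even though jobs were processing fine and there was no backlog. I have a feeling it has something to do with using eventlet, but we are continuing to look into it.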
@WillPlatnick That won't happen with 5.0 because Celery will be async, so there will be capacity reserved for control coroutines.
I'm having trouble with inspect ping spawning defunct/zombie processes:
root 2296 0.0 0.0 0 0 ? Z 16:04 0:00 [python] <defunct>
root 2323 0.0 0.0 0 0 ? Z 16:05 0:00 [python] <defunct>
...
Has anyone else run into this? There is no --pool argument to force single-process execution.
Can I ask what you are using instead of celery inspect ping, @WillPlatnick? We have run into a similar issue with the probe failing under heavy load.
@mcyprian We got rid of the liveness probe. My gut tells me it has something to do with eventlet, but we haven't made it a priority to figure it out.
We are hitting the same CPU problem with a Redis broker.
Has anybody found a solution?
We were also experimenting with scheduling a "debug_task" on a queue named after the container. The problem is that we now have tons of old queues in RabbitMQ.
Please be aware that using
sh -c celery -A path.to.app status | grep "${HOSTNAME}:.*OK"
as suggested in https://github.com/celery/celery/issues/4079#issuecomment-437415370 leads to massive error reports on RabbitMQ; see https://github.com/celery/celery/issues/4355#issuecomment
I think I have found a way to reduce the CPU usage of inspect ping:
celery -b amqp://user:pass@rabitmq:5672/vhost inspect ping
Not loading the Celery config with -A path.to.celery certainly helped with CPU usage.
Could someone verify?
Nice! This is much better than with the app loaded.
But starting a Python process and importing Celery still carries a large overhead. I would still recommend a high period.
Hi,
celery inspect ping -A app.tasks -d celery@$HOSTNAME gives me "Error: Broadcast not supported by transport 'sqs'".
I'm using SQS as the broker, so does this mean the inspect/status commands won't work with SQS?
We have found that, at scale, all the remote control features cause the Redis instance to spike in CPU because of set commands on the kombu.pidbox key. We therefore cannot use ping, status, or inspect as-is, since they all use remote control, and we are trying to disable remote control for our production use case.
A dedicated health-check queue seems like the right way to me, but I'm not sure at all.
Does anyone have another direction for health checks that doesn't involve remote control?
We have been using dedicated health-check queues with RabbitMQ eviction policies (the queues are deleted automatically) successfully for quite a while now, and we are happy with the solution, mainly because this check verifies that the worker actually processes a task and finishes it. Since we introduced it, we have had no more problems with stuck workers.
@bartoszhernas Would you mind sharing some code for that? Do you queue these via beat and then have the workers pick them up?
I'd love to see the code + the liveness probe section.
Hi, the code is really simple:
In Kubernetes, I derive the queue name from POD_NAME and pass it to the live-check script:
livenessProbe:
  initialDelaySeconds: 120
  periodSeconds: 70
  failureThreshold: 1
  exec:
    command:
    - bash
    - "-c"
    - |
      python celery_liveness_probe.py $LIVENESS_QUEUE_NAME
env:
- name: MY_POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
- name: LIVENESS_QUEUE_NAME
  value: queue-$(MY_POD_NAME)
(You need to use bash -c because Kubernetes doesn't expand ENVs when you try to pass them directly as the command.)
Then celery_liveness_probe.py just sets up Django so that it can use Celery and schedules the task on the pod's queue:
# encoding: utf-8
from __future__ import absolute_import, unicode_literals

import os
import sys

if __name__ == "__main__":
    import django

    sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..'))
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "ahoy.archive.settings")
    django.setup()
    from ahoy.archive.apps.eventbus.service import eventbus_service

    exit(0 if eventbus_service.health_check(sys.argv[1] if sys.argv and len(sys.argv) > 1 else None) else 1)
The health check function sends the task and waits for the result:
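# Method of the event bus service class; the module needs:
#   from typing import Optional
#   import celery.exceptions
def health_check(self, queue_name: Optional[str] = None) -> bool:
    # Send the liveness task to this pod's queue and wait for its result.
    event = self.celery.send_task(
        AhoyEventBusTaskName.LIVENESS_PROBE,
        None,
        queue=queue_name or self.origin_queue,
        ignore_result=False,
        acks_late=True,
        retry=False,
        priority=255
    )
    try:
        is_success = event.get(timeout=10)
    except (celery.exceptions.TimeoutError, AttributeError):
        # No result within 10 seconds: treat the worker as unhealthy.
        is_success = False
    return is_success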
So basically: send a task, and if it returns a result, the worker is healthy. If the worker gets stuck (which happened plenty of times), the task never finishes, the pod is restarted, and everything gets back to normal.
The only caveat is that you need to deal with old queues. With RabbitMQ it's easy; we just set an expiry policy on the queue:
https://www.rabbitmq.com/ttl.html#queue-ttl
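As a rough illustration of the queue expiry mentioned above, the sketch below builds a per-pod queue name plus the RabbitMQ `x-expires` argument. The comment above uses a RabbitMQ policy rather than per-queue arguments, so this is a swapped-in variant; `liveness_queue_settings` and the 600-second default are assumptions.

```python
def liveness_queue_settings(pod_name, idle_ttl_seconds=600):
    """Name and broker arguments for a per-pod liveness queue that RabbitMQ expires.

    Hypothetical helper; the 600-second default is an assumption.
    """
    if idle_ttl_seconds <= 0:
        raise ValueError("idle_ttl_seconds must be positive")
    return {
        # Matches the queue-$(MY_POD_NAME) naming used in the probe config.
        "name": f"queue-{pod_name}",
        # RabbitMQ reads x-expires in milliseconds and deletes the queue
        # once it has been unused for that long.
        "queue_arguments": {"x-expires": int(idle_ttl_seconds * 1000)},
    }


settings = liveness_queue_settings("worker-7d96d86b9d-jwtq7")
print(settings["name"], settings["queue_arguments"])
```

These values could be handed to a queue declaration such as `kombu.Queue(settings["name"], queue_arguments=settings["queue_arguments"])`. A queue with an active consumer does not count as unused, so it should only disappear some time after its pod is gone.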
@bartoszhernas Thanks for sharing the code!
Like you said, my queues are dynamic and we are using Redis, so we need to find a way to handle queue-name expiry on Redis.
Yeah, we have a similar issue with BullMQ on Redis. My idea is to write a Kubernetes CronJob that clears the queues every so often.