Hit an NPE last night (running commit 3c95089c). I no longer have that output here, but I retried today with the latest commit from master (f30b9ad9). The output below is from the latest version.
What changed: I migrated to eCryptfs. I stopped kafka-backup, renamed the target dir, then emptied the backup sink config and set chattr +i on it (to prevent Puppet from starting kafka-backup again). Then I deployed the eCryptfs changes, rsynced the data back, removed chattr +i, and re-applied Puppet.
Now the main question: should we try to debug this at all, or should I just wipe it and take a fresh backup? This is QA, so we have time.
[2020-03-17 02:23:47,321] INFO [Consumer clientId=connector-consumer-chrono_qa-backup-sink-0, groupId=connect-chrono_qa-backup-sink] Setting offset for partition [redacted].chrono-billable-datasink-0 to the committed offset FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=kafka5.node:9093 (id: 5 rack: null), epoch=187}} (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:762)
[2020-03-17 02:23:47,697] ERROR WorkerSinkTask{id=chrono_qa-backup-sink-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)
java.lang.NullPointerException
at de.azapps.kafkabackup.sink.BackupSinkTask.close(BackupSinkTask.java:122)
at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:397)
at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:591)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
[2020-03-17 02:23:47,705] ERROR WorkerSinkTask{id=chrono_qa-backup-sink-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:180)
[2020-03-17 02:23:47,705] INFO Stopped BackupSinkTask (de.azapps.kafkabackup.sink.BackupSinkTask:139)
JFYI, I wiped it and started over with a fresh backup. Feel free to close this issue.
Hmm... this is weird... The relevant lines of code are these: https://github.com/itadventurer/kafka-backup/blob/f30b9ad963c8a7d266c8eacd50bd7c5c3ddbbc16/src/main/java/de/azapps/kafkabackup/sink/BackupSinkTask.java#L121-L122
partitionWriters is filled by the open() call at https://github.com/itadventurer/kafka-backup/blob/f30b9ad963c8a7d266c8eacd50bd7c5c3ddbbc16/src/main/java/de/azapps/kafkabackup/sink/BackupSinkTask.java#L107, which is called every time a TopicPartition is opened... I don't understand why this happens.
Do you have any data available for further debugging? It would be interesting to try to reproduce this. We should at least show a meaningful error instead of throwing an NPE...
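One way to get a meaningful error instead of a bare NPE is to check the map lookup before using its result. The sketch below is only an illustration under assumptions: `GuardedCloseSketch` and `Writer` are hypothetical stand-ins, partitions are keyed by a plain string, and it assumes the NPE comes from close() being asked about a partition that has no entry in partitionWriters. It does not reproduce the real BackupSinkTask code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GuardedCloseSketch {
    /** Hypothetical stand-in for a per-partition writer; not the real kafka-backup class. */
    static class Writer {
        boolean closed = false;
        void close() { closed = true; }
    }

    // Filled in open() and emptied in close(); keyed by a "topic-partition" string for simplicity.
    final Map<String, Writer> partitionWriters = new HashMap<>();

    void open(List<String> partitions) {
        for (String tp : partitions) {
            partitionWriters.put(tp, new Writer());
        }
    }

    void close(List<String> partitions) {
        for (String tp : partitions) {
            Writer writer = partitionWriters.remove(tp);
            if (writer == null) {
                // Assumption: this is the case that currently surfaces as a bare NPE.
                throw new IllegalStateException(
                        "close() called for partition " + tp
                                + " which was never opened; open partitions: "
                                + partitionWriters.keySet());
            }
            writer.close();
        }
    }

    public static void main(String[] args) {
        GuardedCloseSketch task = new GuardedCloseSketch();
        task.open(List.of("topic-a-0"));
        task.close(List.of("topic-a-0")); // fine: a writer was registered in open()
        try {
            task.close(List.of("topic-b-0")); // never opened
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The task would still fail in this situation, but the log would name the partition and the set of open partitions, which is the kind of information that is missing from the stack trace above.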
Hi! Yes, I still have the old backup directory. Using it might affect the current cluster's backup state, though, I guess, since I did not change the sink name.
My guess is that this target dir got broken in some way while I was enabling eCryptfs. Maybe some files were changed accidentally, or something like that.
Hmm... Does it contain sensitive information? If not, feel free to upload it to https://send.firefox.com/ and send me the link. I would try to reproduce the error.
Otherwise you could try to reproduce it with a fresh cluster, or we close the issue and hope you are right ;)
Happened today in another cluster too...
I have an Azure backup cronjob that stops kafka-backup, unmounts eCryptfs, runs azcopy sync, then mounts eCryptfs again and starts kafka-backup.
Tonight the umount step failed, so the script failed as well (set -e). I guess that is when the problem occurred, though I need to recheck the timeline carefully. I will update this issue later.
UPD. I just checked the logs. The NPE actually happened earlier. kafka-backup was killed by OOM several times... It seems -Xmx1024M or Docker memory_limit=1152M is not enough for this cluster. Any ideas on how to calculate the heap/RAM size for kafka-backup?
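For scale, the two settings mentioned above leave little room between the Java heap and the container limit. -Xmx only caps the Java heap, and the JVM also needs memory outside it (metaspace, thread stacks, direct buffers, and so on). The sketch below just does that subtraction and prints what the running JVM reports as its maximum heap. It is not a sizing formula for kafka-backup; `HeapHeadroomSketch` and `headroomMiB` are hypothetical names, and it assumes both values are in MiB. The thread also does not say whether the OOM was a Java heap error or the container being killed at its limit.

```java
class HeapHeadroomSketch {
    /** Memory left for everything outside the Java heap, in MiB. */
    static long headroomMiB(long containerLimitMiB, long maxHeapMiB) {
        return containerLimitMiB - maxHeapMiB;
    }

    public static void main(String[] args) {
        // Values from this thread: Docker memory_limit=1152M and -Xmx1024M.
        long headroom = headroomMiB(1152, 1024);
        System.out.println("non-heap headroom: " + headroom + " MiB");

        // What the current JVM actually uses as its heap ceiling (depends on how it was started).
        long jvmMaxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap reported by this JVM: " + jvmMaxHeapMiB + " MiB");
    }
}
```

Comparing the second number against the container limit on the backup host is a quick way to confirm which of the two limits is actually in effect.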
Do you want me to do some debugging on this data? I cannot upload it because it contains sensitive company data...
BTW, can a failed sink cause kafka-connect to exit? It would be nice if the whole standalone Connect process failed when a single sink fails (when no other sinks/connectors exist).
See #46
UPD. I just checked the logs. The NPE actually happened earlier. kafka-backup was killed by OOM several times... It seems -Xmx1024M or Docker memory_limit=1152M is not enough for this cluster. Any ideas on how to calculate the heap/RAM size for kafka-backup?
Sorry, I missed the update. Feel free to just add another comment ;)
I have no idea how to calculate that right now. I opened a new ticket, #47, for that discussion.
Do you want me to do some debugging on this data? I cannot upload it because it contains sensitive company data...
Yes, please! That would be awesome!
Unfortunately, I'm not skilled at Java debugging... but I can run something if you guide me through it.
OK, I will try to think about how to do that over the next few days. Maybe I will find the issue by chance :joy: (I want to write more tests anyway)
It would be really awesome if you could reproduce it with non-company data!
From what I have seen here, I'd suggest killing the kafka-backup process with kill -9 a few times. I think you can get into that state :) I really hope it is not related to eCryptfs...
I saw it today in my test setup too. Currently I am failing to reproduce it. I will try again over the next few days...
Hit this again after a few hours of #88 and one OOM..
Saw this tonight while the kafka-backup service was shutting down, before running the Azure blobstore backup:
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: [2020-05-30 19:19:24,572] INFO WorkerSinkTask{id=chrono_prod-backup-sink-0} Committing offsets synchronously using sequence number 2782: {xxx-4=OffsetAndMetadata{offset=911115, leaderEpoch=null, metadata=''}, yyy-5=OffsetAndMetadata{offset=11850053, leaderEpoch=null, metadata=''}, [...]
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: [2020-05-30 19:19:24,622] ERROR WorkerSinkTask{id=chrono_prod-backup-sink-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: org.apache.kafka.common.errors.WakeupException
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:511)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:275)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:212)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:937)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1473)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1431)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.connect.runtime.WorkerSinkTask.doCommitSync(WorkerSinkTask.java:333)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.connect.runtime.WorkerSinkTask.doCommit(WorkerSinkTask.java:361)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:432)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:591)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: #011at java.base/java.lang.Thread.run(Unknown Source)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: [2020-05-30 19:19:24,634] ERROR WorkerSinkTask{id=chrono_prod-backup-sink-0} Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:180)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: [2020-05-30 19:19:24,733] INFO Stopped BackupSinkTask (de.azapps.kafkabackup.sink.BackupSinkTask:139)
May 30 19:19:24 backupmgrp1 docker/kafka-backup-chrono_prod[16472]: [2020-05-30 19:19:24,771] INFO [Consumer clientId=connector-consumer-chrono_prod-backup-sink-0, groupId=connect-chrono_prod-backup-sink] Revoke previously assigned partitions [...]
:thinking: Maybe I should try to reproduce it by letting Kafka Backup run for a few hours while producing a lot of data... I need to think about how to debug this in the most meaningful way...
I think it would help a bit if we could monitor our Kafka Backup setup. Maybe we would see something useful in the metrics.
I reproduced the same error:
[2020-07-10 11:05:21,755] ERROR WorkerSinkTask{id=backup-sink-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
java.lang.NullPointerException
at de.azapps.kafkabackup.sink.BackupSinkTask.close(BackupSinkTask.java:122)
at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:397)
at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:591)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I am running my setup on k8s and use an Azure File Share filesystem to store my backup. I will try to add some logs at this point.