Gsutil: macOS: crcmod not installed? (gsutil rsync)

Created on 28 Jul 2020  ·  6 Comments  ·  Source: GoogleCloudPlatform/gsutil

When running a gsutil rsync from a cloud storage bucket to a local directory, I got the following warning:

WARNING: gsutil rsync uses hashes when modification time is not available at
both the source and destination. Your crcmod installation isn't using the
module's C extension, so checksumming will run very slowly. If this is your
first rsync since updating gsutil, this rsync can take significantly longer than
usual. For help installing the extension, please see "gsutil help crcmod".

Note that the documentation in gsutil help crcmod indicates that macOS should include this by default, quoting:

gsutil distributes a pre-compiled version of crcmod for macOS, so you shouldn't
need to compile and install it yourself. If for some reason the pre-compiled
version is not being detected, please let the Google Cloud Storage team know

So I'm filing an issue as directed.

macOS 10.15.6
Cloud SDK 303.0.0 (core libraries 2020.07.24)

All 6 comments

Also having this issue. FWIW, manually installing crcmod fixed this.

sudo pip3 install -U crcmod

Note I am using python3 per the documentation here: https://cloud.google.com/sdk/gcloud/reference/topic/startup by adding

export CLOUDSDK_PYTHON=python3

to my .zshrc before installing the sdk.

This actually hasn't worked out. I now get the following error for any cp operation in gsutil:

Copying gs://<path>...
Traceback (most recent call last):
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil", line 21, in <module>
    gsutil.RunMain()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gsutil.py", line 122, in RunMain
    sys.exit(gslib.__main__.main())
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 429, in main
    return _RunNamedCommandAndHandleExceptions(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 767, in _RunNamedCommandAndHandleExceptions
    _HandleUnknownFailure(e)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/__main__.py", line 625, in _RunNamedCommandAndHandleExceptions
    return command_runner.RunNamedCommand(command_name,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command_runner.py", line 411, in RunNamedCommand
    return_code = command_inst.RunCommand()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 1190, in RunCommand
    self.Apply(_CopyFuncWrapper,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1514, in Apply
    self._SequentialApply(func, args_iterator, exception_handler, caller_id,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1586, in _SequentialApply
    worker_thread.PerformTask(task, self)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 2306, in PerformTask
    results = task.func(cls, task.args, thread_state=self.thread_gsutil_api)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 778, in _CopyFuncWrapper
    cls.CopyFunc(args,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/commands/cp.py", line 982, in CopyFunc
    _, bytes_transferred, result_url, md5 = copy_helper.PerformCopy(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3873, in PerformCopy
    return _DownloadObjectToFile(src_url,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 3054, in _DownloadObjectToFile
    crc32c) = (_DoSlicedDownload(src_url,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/utils/copy_helper.py", line 2700, in _DoSlicedDownload
    cp_results = command_obj.Apply(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1499, in Apply
    self._ParallelApply(
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1719, in _ParallelApply
    self._CreateNewConsumerPool(process_count, thread_count,
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/gsutil/gslib/command.py", line 1384, in _CreateNewConsumerPool
    p.start()
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/local/Cellar/python@3.8/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_io.TextIOWrapper' object

@brianbrownton The _io.TextIOWrapper error is a separate issue on Mac + Python 3.8. https://github.com/GoogleCloudPlatform/gsutil/issues/961

My guess is that you won't see this issue if you use Python 3.7.
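For context on why the Python version matters here: the last frames of the traceback (popen_spawn_posix.py, reduction.py) show multiprocessing pickling the process object, which is what the "spawn" start method does, and Python 3.8 made "spawn" the default start method on macOS. The sketch below only demonstrates the underlying limitation that an open text stream cannot be pickled. It is not gsutil's code, and `can_pickle` is a hypothetical helper name.

```python
import io
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False if pickling fails."""
    try:
        pickle.dumps(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


if __name__ == "__main__":
    # A plain string pickles without trouble.
    print("str:", can_pickle("hello"))
    # A text stream wrapper (the same type as an open text file) does not.
    print("TextIOWrapper:", can_pickle(io.TextIOWrapper(io.BytesIO())))
```

Any object handed to a spawned worker that holds such a stream would fail in the same way, which is consistent with the error above.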

I have the same problem. Added env vars pointing to python3 as well, to no avail.

gsutil version: 4.57
checksum: 43b6eb5e813ffed48ec2e541025259cb (OK)
boto version: 2.49.0
python version: 3.9.1 (default, Dec 10 2020, 11:11:14) [Clang 12.0.0 (clang-1200.0.32.27)]
OS: Darwin 20.2.0
multiprocessing available: True
using cloud sdk: True
pass cloud sdk credentials to gsutil: True
config path(s): /Users/XXX/.config/gcloud/legacy_credentials/XXX/.boto
gsutil path: /Users/XXX/Code/google-cloud-sdk/bin/gsutil
compiled crcmod: False
installed via package manager: False
editable install: False

I've run sudo pip3 install -U crcmod.
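The field to watch in the version report above is `compiled crcmod`. As a minimal sketch, assuming the report keeps the `key: value` layout shown in this comment, it can be checked programmatically. `parse_version_report` is a hypothetical helper, and the sample text is copied from the output above.

```python
def parse_version_report(text):
    """Parse 'key: value' lines into a dict, splitting on the first colon."""
    report = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            report[key.strip()] = value.strip()
    return report


# Sample lines taken from the report quoted in this thread.
SAMPLE = """\
gsutil version: 4.57
python version: 3.9.1 (default, Dec 10 2020, 11:11:14) [Clang 12.0.0 (clang-1200.0.32.27)]
compiled crcmod: False
"""

report = parse_version_report(SAMPLE)
uses_c_extension = report.get("compiled crcmod") == "True"
print("compiled crcmod in use:", uses_c_extension)
```

In practice the text would come from running `gsutil version -l` rather than from a hard-coded sample.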

Any updates on this?

@mihar crcmod may not be getting installed for the Python binary that Cloud SDK actually uses.
You can try this:

# Get the path of the Python interpreter that Cloud SDK is using
python_path=$(gcloud info | grep "Python Location" | sed 's/.*\[\(.*\)\]/\1/g')
# Install crcmod for that interpreter (quoted in case the path contains spaces)
"$python_path" -m pip install -U crcmod
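After installing, it can help to confirm that the interpreter in question really loads crcmod's C extension. The sketch below is one way to check, assuming crcmod still exposes the private `_usingExtension` flag in `crcmod.crcmod`. Because that attribute is private, the code treats a missing flag as "unknown". `crcmod_extension_status` is a hypothetical helper name.

```python
import importlib
import sys


def crcmod_extension_status():
    """Return True if crcmod uses its C extension, False if it fell back
    to pure Python, or None if crcmod is missing or the flag is absent."""
    try:
        crcmod_impl = importlib.import_module("crcmod.crcmod")
    except ImportError:
        return None  # crcmod is not installed for this interpreter
    # _usingExtension is a private attribute; do not crash if it is gone.
    return getattr(crcmod_impl, "_usingExtension", None)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("crcmod C extension in use:", crcmod_extension_status())
```

Run it with the same interpreter that Cloud SDK reports, not whichever python3 happens to be first on PATH. The `compiled crcmod` line in the gsutil version report remains the more direct check, since gsutil may resolve crcmod differently from a bare interpreter.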