TensorFlow: How to compile TensorFlow using SSE4.1, SSE4.2, and AVX.

Created on 3 Mar 2017  ·  44 Comments  ·  Source: tensorflow/tensorflow

Just got TensorFlow running. Now I'm running into this error.

Currently using OS X Yosemite; I installed TensorFlow using pip3 through Anaconda, with Python 3.5.

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.

W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.

So since Anaconda has a special set of commands, how do you get TensorFlow to run with SSE4.1, SSE4.2, and AVX via the Anaconda command system? I am really confused about how to go about this.

Most helpful comment

These aren't errors, just warnings saying that if you build TensorFlow from source, it can be faster on your machine.

SO question about this: http://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions
TensorFlow guide to build from source: https://www.tensorflow.org/install/install_sources

All 44 comments


Just as @Carmezim stated, these are simply warning messages.
For each of your programs, you will only see them once.
And just like the warnings say, you should only compile TF with these flags if you need TF to be faster.

You can follow our guide to install TensorFlow from sources to compile TF with support for SIMD instruction sets.

Ok, thanks. I get it.

Is there a way we can silence this?

The only way to silence these warning messages is to build from source, using the --config=opt option.
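
For reference, the typical invocation after running ./configure looks like this (a sketch; the exact target has varied a little across TF versions):

bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package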

A sort of "workaround" (albeit imperfect) that redirects the messages on Unix/Linux/OSX:
python myscript.py 2>/dev/null

@CGTheLegend @ocampesato you can use the TF environment variable TF_CPP_MIN_LOG_LEVEL, which works as follows:

  • It defaults to 0, displaying all logs
  • To filter out INFO logs, set it to 1
  • To additionally filter out WARNING logs, set it to 2
  • To additionally filter out ERROR logs, set it to 3

So you can do the following to silence the warnings:

import os
# Must be set before TensorFlow is imported
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
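
The same variable also works from the shell, so you can set it per run without touching the script (using the myscript.py name from the workaround above):

TF_CPP_MIN_LOG_LEVEL=2 python myscript.py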

@gunan @mrry I've seen many folks interested in silencing the warnings, would there be interest in adding this kind of info to the docs?

I installed from the TensorFlow install guide and also got this warning.

pip3 install --upgrade tensorflow

@jadeydi Instead of compiling from source, pip just installs a prebuilt binary, so you'll still get these warnings.

I just compiled TensorFlow with support for SSE4.1, SSE4.2, AVX, AVX2, and FMA. The build is available here: https://github.com/lakshayg/tensorflow-build. I hope this is useful.

Hi @lakshayg, thanks for sharing. You might want to check https://github.com/yaroslavvb/tensorflow-community-wheels

Approximately how much faster is the build compared to the standard pip install tensorflow-gpu on Ubuntu? Is it only faster for CPU computations, or is there any benefit to GPU computations?

http://www.anandtech.com/show/2362/5

This came up on Google and has some decent technical details.

test is a DivX encode using VirtualDub 1.7.6 and DivX 6.7. SSE4 comes in if you choose to enable a new full search algorithm for motion estimation, which is accelerated by two SSE4 instructions: MPSADBW and PHMINPOSUW. The idea is that motion estimation (figuring out what will happen in subsequent frames of video) requires a lot of computation of sums of absolute differences, as well as finding the minimum values of the results of those computations. The SSE2 instruction PSADBW can compute two sums of differences from a pair of 16B unsigned integers; the SSE4 instruction MPSADBW can do eight.

...

On our QX9650, the full search with SSE4 enabled runs about 45% faster than with SSE2 only

Not sure what functions TensorFlow is using, but it might be worth the effort.

Sorry, but this is a ridiculous thing to have output in all TF scripts by default. Most people probably aren't compiling TF from source, nor do they want to.

@TomAshley303, this is pretty awesome info to get! I don't plan to recompile from source. I don't want to. But the info tells me what to do if my model becomes big and slow and needs a performance boost. It's usually cheaper to recompile with extensions than to buy new hardware, given that good walkthroughs (which we do have) minimize the labour cost of recompiling (CPU time doesn't matter; it can run overnight).

I went through the process... It was straightforward and took no time at all. Not your usual CMake C++ kinda nightmare.

I have a small bash script to compile TF under macOS/Linux. It dynamically detects CPU features and passes them as build parameters. I was thinking of creating a PR but didn't find a folder with scripts (helpers) for local builds, only ci_build. If it makes sense, I will do it.

gist
https://gist.github.com/venik/9ba962c8b301b0e21f99884cbd35082f
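
The core idea of the script, as a simplified Linux-only sketch (the actual gist is more thorough):

# Sketch: map /proc/cpuinfo flags to the matching gcc -m options
COPTS=""
for f in $(grep -m1 '^flags' /proc/cpuinfo); do
  case "$f" in
    sse4_1) COPTS="$COPTS --copt=-msse4.1" ;;
    sse4_2) COPTS="$COPTS --copt=-msse4.2" ;;
    avx)    COPTS="$COPTS --copt=-mavx" ;;
    avx2)   COPTS="$COPTS --copt=-mavx2" ;;
    fma)    COPTS="$COPTS --copt=-mfma" ;;
  esac
done
bazel build $COPTS //tensorflow/tools/pip_package:build_pip_package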

A note to @gunan

I've encountered this issue when I was installing TensorFlow for the first time. Now I am having to figure out how to resolve it again because I'm installing TensorFlow on a new machine. It's a pain in the neck, and the documentation you've provided is not clear at all.

The fact that I have to do it on my end is ridiculous and infuriating. It's no good making something available from pip/pip3 if it then just throws warnings at you all day.

At the very least, you should edit https://www.tensorflow.org/install/install_sources and explicitly explain how to compile it with SSE / AVX

The solution that worked for me: enter "-mavx -msse4.1 -msse4.2" when prompted for optimization flags during the configuration process (when you run ./configure).

Is it that hard to add this to your installation instructions?
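
For context, the ./configure prompt in question looks roughly like this (the wording varies between TF versions); the flags above go at that prompt:

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -mavx -msse4.1 -msse4.2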

Thank you; following @Carmezim's answer, I got the CPU speed-up version based on AVX and SSE. I've tested Faster R-CNN (ResNet-101) on an Intel CPU; it speeds up by about 30%, which is truly useful.

You can silence the warnings.
Just add these lines at the top:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf

As mentioned here: https://stackoverflow.com/a/44984610

You could also easily add a user variable in the system environment variables: TF_CPP_MIN_LOG_LEVEL with value 2. Then restart your IDE.
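
For example, from a Windows command prompt (setx stores a persistent user variable; only newly started processes see it):

setx TF_CPP_MIN_LOG_LEVEL 2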

@mikalyoung Improvements for GPU computations cannot be expected, since those instruction sets are CPU-only; they enable vectorized CPU operations.
So if you compare two programs running (ideally) 100% on GPUs, one on a TensorFlow build compiled with SIMD support and one without, you should get the same results in terms of speed (and hopefully numerically as well).

I C:\tf_jenkinshome\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

As you can see, I get the warning on my system as well, but I don't understand the 'I' at the start of the warning. Can someone help me with that?

"I" there just is a shorthand for "INFO". The other letters you can see there are E for error, or F for fatal.

So I installed using conda. If I now wish to compile from source to take advantage of the speed boost, do I need to do anything to remove my conda install of TensorFlow? Or is it in its own little container, so I can separately compile from source?

I had installed DeepSpeech and also a DeepSpeech server. Went to start the server and got an error message - "2018-01-17 08:21:49.120154: F tensorflow/core/platform/cpu_feature_guard.cc:35] The TensorFlow library was compiled to use AVX2 instructions, but these aren't available on your machine.
Aborted (core dumped)"

Apparently I need to compile TensorFlow on the same computer. Is there a list somewhere to match Kubuntu 17.10.1 and an HP ProBook 4330S, please?

Why are there no Windows compiles? I am having the same issues, but instead of muting the warnings I would like to use my GPU. I do not have an Nvidia graphics card; I have an AMD one. What do I do?

These are not merely warnings, as it kills the process on my test boxes. Since I also use AMD GPUs, I spun up a DigitalOcean TensorFlow box to give this a go, but it seems there is no GPU support there either, and it's failing miserably.

# Job id 0

Loading hparams from /home/science/tf-demo/models/nmt-chatbot/model/hparams

saving hparams to /home/science/tf-demo/models/nmt-chatbot/model/hparams
saving hparams to /home/science/tf-demo/models/nmt-chatbot/model/best_bleu/hparams
attention=scaled_luong
attention_architecture=standard
batch_size=128
beam_width=10
best_bleu=0
best_bleu_dir=/home/science/tf-demo/models/nmt-chatbot/model/best_bleu
check_special_token=True
colocate_gradients_with_ops=True
decay_factor=1.0
decay_steps=10000
dev_prefix=/home/science/tf-demo/models/nmt-chatbot/data/tst2012
dropout=0.2
encoder_type=bi
eos=
epoch_step=0
forget_bias=1.0
infer_batch_size=32
init_op=uniform
init_weight=0.1
learning_rate=0.001
learning_rate_decay_scheme=
length_penalty_weight=1.0
log_device_placement=False
max_gradient_norm=5.0
max_train=0
metrics=['bleu']
num_buckets=5
num_embeddings_partitions=0
num_gpus=1
num_layers=2
num_residual_layers=0
num_train_steps=500000
num_translations_per_input=10
num_units=512
optimizer=adam
out_dir=/home/science/tf-demo/models/nmt-chatbot/model
output_attention=True
override_loaded_hparams=True
pass_hidden_state=True
random_seed=None
residual=False
share_vocab=False
sos=
source_reverse=False
src=from
src_max_len=50
src_max_len_infer=None
src_vocab_file=/home/science/tf-demo/models/nmt-chatbot/data/vocab.from
src_vocab_size=15003
start_decay_step=0
steps_per_external_eval=None
steps_per_stats=100
subword_option=
test_prefix=/home/science/tf-demo/models/nmt-chatbot/data/tst2013
tgt=to
tgt_max_len=50
tgt_max_len_infer=None
tgt_vocab_file=/home/science/tf-demo/models/nmt-chatbot/data/vocab.to
tgt_vocab_size=15003
time_major=True
train_prefix=/home/science/tf-demo/models/nmt-chatbot/data/train
unit_type=lstm
vocab_prefix=/home/science/tf-demo/models/nmt-chatbot/data/vocab
warmup_scheme=t2t
warmup_steps=0

creating train graph ...

num_bi_layers = 1, num_bi_residual_layers=0
cell 0 LSTM, forget_bias=1 DropoutWrapper, dropout=0.2 DeviceWrapper, device=/gpu:0
cell 0 LSTM, forget_bias=1 DropoutWrapper, dropout=0.2 DeviceWrapper, device=/gpu:0
cell 0 LSTM, forget_bias=1 DropoutWrapper, dropout=0.2 DeviceWrapper, device=/gpu:0
cell 1 LSTM, forget_bias=1 DropoutWrapper, dropout=0.2 DeviceWrapper, device=/gpu:0
learning_rate=0.001, warmup_steps=0, warmup_scheme=t2t
decay_scheme=, start_decay_step=0, decay_steps 10000, decay_factor 1

Trainable variables

embeddings/encoder/embedding_encoder:0, (15003, 512),
embeddings/decoder/embedding_decoder:0, (15003, 512),
dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/memory_layer/kernel:0, (1024, 512),
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_0/basic_lstm_cell/kernel:0, (1536, 2048), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_0/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_1/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_1/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/attention/luong_attention/attention_g:0, (), /device:GPU:0
dynamic_seq2seq/decoder/attention/attention_layer/kernel:0, (1536, 512), /device:GPU:0
dynamic_seq2seq/decoder/output_projection/kernel:0, (512, 15003), /device:GPU:0

creating eval graph ...

num_bi_layers = 1, num_bi_residual_layers=0
cell 0 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0
cell 0 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0
cell 0 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0
cell 1 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0

Trainable variables

embeddings/encoder/embedding_encoder:0, (15003, 512),
embeddings/decoder/embedding_decoder:0, (15003, 512),
dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/memory_layer/kernel:0, (1024, 512),
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_0/basic_lstm_cell/kernel:0, (1536, 2048), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_0/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_1/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_1/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/attention/luong_attention/attention_g:0, (), /device:GPU:0
dynamic_seq2seq/decoder/attention/attention_layer/kernel:0, (1536, 512), /device:GPU:0
dynamic_seq2seq/decoder/output_projection/kernel:0, (512, 15003), /device:GPU:0

creating infer graph ...

num_bi_layers = 1, num_bi_residual_layers=0
cell 0 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0
cell 0 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0
cell 0 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0
cell 1 LSTM, forget_bias=1 DeviceWrapper, device=/gpu:0

Trainable variables

embeddings/encoder/embedding_encoder:0, (15003, 512),
embeddings/decoder/embedding_decoder:0, (15003, 512),
dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/memory_layer/kernel:0, (1024, 512),
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_0/basic_lstm_cell/kernel:0, (1536, 2048), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_0/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_1/basic_lstm_cell/kernel:0, (1024, 2048), /device:GPU:0
dynamic_seq2seq/decoder/attention/multi_rnn_cell/cell_1/basic_lstm_cell/bias:0, (2048,), /device:GPU:0
dynamic_seq2seq/decoder/attention/luong_attention/attention_g:0, (), /device:GPU:0
dynamic_seq2seq/decoder/attention/attention_layer/kernel:0, (1536, 512), /device:GPU:0
dynamic_seq2seq/decoder/output_projection/kernel:0, (512, 15003),

log_file=/home/science/tf-demo/models/nmt-chatbot/model/log_1519669184

2018-02-26 18:19:44.862736: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Killed

What command needs to be run, and where and how do I run these commands? Please tell me; I desperately need help.

But does it mean that the system is not using the GPU for the process?

Well, you need to resolve this if you are running TensorFlow in an acceleration environment, such as using k-fold with the KerasClassifier.
To resolve it you will need to build TensorFlow from source, just as everyone recommends.
To build TensorFlow from source you will need the following tools:

  1. Install git on your machine if you haven't done so already; on an Ubuntu machine just type "sudo apt-get install git".
  2. Install bazel. It is highly recommended to use the custom APT repository. Follow the instructions at https://docs.bazel.build/versions/master/install-ubuntu.html to install bazel.
  3. Install the following Python dependencies (numpy, dev, and wheel) using the command below:
    sudo apt-get install python-numpy python-dev python-pip python-wheel
  4. Once you have all the dependencies installed, clone the TensorFlow GitHub repository to your local drive:
    git clone https://github.com/tensorflow/tensorflow
  5. cd into the cloned tensorflow directory and run the configure script:
    cd tensorflow
    ./configure

Just follow the instructions on the screen to complete the TensorFlow installation (the build commands that follow ./configure are sketched below).
I would highly recommend updating your machine once TensorFlow is installed:
sudo apt-get update

Good luck and enjoy...
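
For completeness, the build-and-install commands after ./configure finishes look roughly like this (a sketch; /tmp/tensorflow_pkg is just an arbitrary output directory):

bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-*.whl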

Just chiming in on this thread to say that you shouldn't just silence these warnings: I'm getting about 43% faster training time by building from source, so I think it's worth the effort.

How do I install TensorFlow using this file: "tensorflow-1.6.0-cp36-cp36m-win_amd64.whl"?

@anozele pip3 install --upgrade *path to wheel file*

@gunan --config=opt is not enough; you should also add, e.g., --copt="-msse4.2" when you build TensorFlow from source.
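
Combined, a build invocation would then look something like this (a sketch):

bazel build --config=opt --copt=-msse4.2 //tensorflow/tools/pip_package:build_pip_package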

According to Intel (https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide), if you use the Intel-built TensorFlow you can ignore those warnings, since all available instruction sets are used by the MKL backend. Can anyone from TensorFlow confirm this?
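
For reference, Intel distributes that build as a separate pip package, so trying it is just (per the linked guide):

pip install intel-tensorflow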

These aren't errors, just warnings saying that if you build TensorFlow from source, it can be faster on your machine.

SO question about this: http://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions
TensorFlow guide to build from source: https://www.tensorflow.org/install/install_sources

However, for me it was not faster than building without the -mfma -mavx -msse flags: https://stackoverflow.com/questions/57197854/fma-avx-sse-flags-did-not-bring-me-good-performance

Hi. Sorry if I'm beating a dead horse. Just wondering: why is the default pip wheel not a binary compiled with advanced instructions?

Hi. Sorry if I'm beating a dead horse. Just wondering: why is the default pip wheel not a binary compiled with advanced instructions?

This is because old CPU architectures don't support the advanced instruction sets. See Wikipedia for the detailed list of CPUs supporting AVX, AVX2, or AVX-512. If the default pip binary were compiled with these instruction sets, TensorFlow could not run on older CPUs.
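
If you're unsure what your own CPU supports, here is a quick Linux-only check (a sketch reading /proc/cpuinfo; nothing TensorFlow-specific):

import re

# Flags of the first CPU entry in /proc/cpuinfo
with open('/proc/cpuinfo') as f:
    flags = set(re.search(r'^flags\s*:\s*(.*)$', f.read(), re.M).group(1).split())

# These flag names correspond to SSE4.1, SSE4.2, AVX, AVX2 and FMA
print({name: name in flags for name in ('sse4_1', 'sse4_2', 'avx', 'avx2', 'fma')})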

But does it mean that the system is not using the GPU for the process?

Nope, it shows up even if you are using a GPU. If you haven't silenced the messages, you should also see TensorFlow loading your GPU device in the command prompt.
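
You can also confirm it from Python; in the TF 1.x API that's (tf.config.list_physical_devices('GPU') is the TF 2.x equivalent):

import tensorflow as tf
print(tf.test.is_gpu_available())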

Check this repo:

https://github.com/fo40225/tensorflow-windows-wheel

He has compiled almost all versions of TF with SSE and AVX!

This article was a good tutorial on how to build from source, including the flags:
https://medium.com/@pierreontech/setup-a-high-performance-conda-tensorflow-environment-976995158cb1

Try forcing the inclusion of the appropriate extensions using additional bazel options like --copt=-mavx --copt=-msse4.1 --copt=-msse4.2
