
How to prevent tensorflow from allocating the totality of a GPU memory?

I work in an environment in which computational resources are shared, i.e., we have a few server machines equipped with a few Nvidia Titan X GPUs each.

For small to moderate size models, the 12 GB of the Titan X is usually enough for 2–3 people to run training concurrently on the same GPU. If the models are small enough that a single model does not take full advantage of all the computational units of the GPU, this can actually result in a speedup compared with running one training process after the other. Even in cases where the concurrent access to the GPU does slow down the individual training time, it is still nice to have the flexibility of having multiple users simultaneously train on the GPU.

The problem with TensorFlow is that, by default, it allocates the full amount of available GPU memory when it is launched. Even for a small two-layer neural network, I see that all 12 GB of the GPU memory is used up.

Is there a way to make TensorFlow only allocate, say, 4 GB of GPU memory, if one knows that this is enough for a given model?


mrry

You can set the fraction of GPU memory to be allocated when you construct a tf.Session by passing a tf.GPUOptions as part of the optional config argument:

# Assume that you have 12GB of GPU memory and want to allocate ~4GB:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

The per_process_gpu_memory_fraction acts as a hard upper bound on the amount of GPU memory that will be used by the process on each GPU on the same machine. Currently, this fraction is applied uniformly to all of the GPUs on the same machine; there is no way to set this on a per-GPU basis.
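If you need a per-GPU workaround, a minimal sketch (assuming TF 1.x; the fraction 0.333 and the GPU index "0" are just illustrative values) is to combine the fraction with the visible_device_list field of tf.GPUOptions, so the process only sees, and therefore only caps, one particular GPU:

import tensorflow as tf

# Expose only GPU 0 to this process and cap its usage at roughly one third
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333,
                            visible_device_list="0")
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))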


Thank you very much. This info is quite hidden in the current docs; I would never have found it by myself :-) If you can answer, I would like to ask two additional questions: 1) Does this limit the amount of memory ever used, or just the memory initially allocated (i.e., will it still allocate more memory if the computation graph needs it)? 2) Is there a way to set this on a per-GPU basis?
Related note: setting CUDA_VISIBLE_DEVICES to limit TensorFlow to a single GPU works for me. See acceleware.com/blog/cudavisibledevices-masking-gpus
It seems that the memory allocation goes a bit over the request, e.g. I requested per_process_gpu_memory_fraction=0.0909 on a 24443 MiB GPU and got processes taking 2627 MiB.
I can't seem to get this to work in a MonitoredTrainingSession
@jeremy_rutman I believe this is due to cudnn and cublas context initialization. That is only relevant if you are executing kernels that use those libs though.
Sergey Demyanov
config = tf.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.Session(config=config)

https://github.com/tensorflow/tensorflow/issues/1578


This one is exactly what I want because in a multi-user environment, it is very inconvenient to specify the exact amount of GPU memory to reserve in the code itself.
Also, if you're using Keras with a TF backend, you can apply this by running from keras import backend as K and then K.set_session(sess), so that Keras uses the configured session instead of allocating all the GPU memory.
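For example, a minimal sketch of that Keras setup (assuming the standalone keras package with the TF 1.x backend):

import tensorflow as tf
from keras import backend as K

# Build a session that grows GPU memory on demand, then hand it to Keras
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
K.set_session(tf.Session(config=config))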
Mateen Ulhaq

For TensorFlow 2.0 and 2.1 (docs):

import tensorflow as tf
tf.config.gpu.set_per_process_memory_growth(True)

For TensorFlow 2.2+ (docs):

import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
  tf.config.experimental.set_memory_growth(gpu, True)

The docs also list some more methods:

Set environment variable TF_FORCE_GPU_ALLOW_GROWTH to true.

Use tf.config.experimental.set_virtual_device_configuration to set a hard limit on a Virtual GPU device.


@AkshayLAradhya no this is only for TF 2.0 and above. The other answers here will work fine for 1.13 and earlier.
Not beyond. For TF 2.2 it's 'tf.config.experimental.set_memory_growth'
Since this is a highly upvoted answer, I've updated to the latest version of TF.
@MateenUlhaq here is a link to the Tensorflow documentation you probably used: tensorflow.org/api_docs/python/tf/config/experimental/…
The first part, "For TensorFlow 2.0 and 2.1...", is not accurate. It's not in the referenced documentation source, and when I tested it on TF 2.0 I got an error. The second part, though, works on TF 2.0 as well as TF 2.2+.
user1767754

Here is an excerpt from the book Deep Learning with TensorFlow:

In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as it is needed by the process. TensorFlow provides two configuration options on the session to control this. The first is the allow_growth option, which attempts to allocate only as much GPU memory as needed based on runtime allocations: it starts out allocating very little memory, and as sessions get run and more GPU memory is needed, the GPU memory region used by the TensorFlow process is extended.

1) Allow growth: (more flexible)

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)

The second method is the per_process_gpu_memory_fraction option, which determines the fraction of the overall amount of memory that each visible GPU should be allocated. Note: memory is not released afterwards, since that could even worsen memory fragmentation.

2) Allocate fixed memory:

To allocate only 40% of the total memory of each GPU:

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, ...)

Note: this is only useful, though, if you truly want to bound the amount of GPU memory available to the TensorFlow process.


As far as your question is concerned, option 2 might be useful to you. In general, if you do not have multiple applications running on the GPU and you work with dynamic networks, then it makes sense to use the 'allow growth' option.
CATALUNA84

For TensorFlow versions 2.0 and 2.1, use the following snippet:

import tensorflow as tf
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(gpu_devices[0], True)

For prior versions, the following snippet used to work for me:

import tensorflow as tf
tf_config=tf.ConfigProto()
tf_config.gpu_options.allow_growth=True
sess = tf.Session(config=tf_config)

GPrathap

All the answers above assume execution with a sess.run() call, which is becoming the exception rather than the rule in recent versions of TensorFlow.

When using the tf.estimator framework (TensorFlow 1.4 and above), the way to pass the fraction along to the implicitly created MonitoredTrainingSession is:

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
trainingConfig = tf.estimator.RunConfig(session_config=conf, ...)
tf.estimator.Estimator(model_fn=..., 
                       config=trainingConfig)

Similarly, in eager mode (TensorFlow 1.5 and above):

import tensorflow.contrib.eager as tfe  # the tfe module used below (contrib eager, TF 1.5+)

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
tfe.enable_eager_execution(config=conf)

Edit (11-04-2018): As an example, if you are using tf.contrib.gan.train, then you can use something similar to the following:

tf.contrib.gan.gan_train(........, config=conf)

Mey Khalili

You can use

TF_FORCE_GPU_ALLOW_GROWTH=true

in your environment variables.
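For instance, a minimal sketch of setting it from Python (an assumption here: the variable is read when TensorFlow creates its GPU allocator, so setting it before the import is simply the safest place to do it):

import os

# Must be in the environment before TensorFlow touches the GPU
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = 'true'

import tensorflow as tf  # imported only after the variable is set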

In the TensorFlow source code:

bool GPUBFCAllocator::GetAllowGrowthValue(const GPUOptions& gpu_options) {
  const char* force_allow_growth_string =
      std::getenv("TF_FORCE_GPU_ALLOW_GROWTH");
  if (force_allow_growth_string == nullptr) {
    return gpu_options.allow_growth();
  }
  // ... the rest of the function parses "true"/"false" from the
  // environment variable and overrides gpu_options.allow_growth()
}

maxstrobel

Tensorflow 2.0 Beta and (probably) beyond

The API changed again. It can now be found in:

tf.config.experimental.set_memory_growth(
    device,
    enable
)

Aliases:

tf.compat.v1.config.experimental.set_memory_growth

tf.compat.v2.config.experimental.set_memory_growth

References:

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/config/experimental/set_memory_growth

https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth

See also: Tensorflow - Use a GPU: https://www.tensorflow.org/guide/gpu

For TensorFlow 2.0 Alpha, see this answer.


Timbus Calin

All the answers above refer either to setting the memory to a certain extent in TensorFlow 1.X versions or to allowing memory growth in TensorFlow 2.X.

The method tf.config.experimental.set_memory_growth indeed works for allowing dynamic growth during allocation/preprocessing. Nevertheless, one may want to allocate a specific upper limit of GPU memory from the start.

Another reason for allocating a specific amount of GPU memory is to prevent OOM errors during training sessions. For example, if one trains while keeping open video-memory-consuming Chrome tabs or any other video-consuming process, tf.config.experimental.set_memory_growth(gpu, True) could still result in OOM errors being thrown, hence the need in certain cases to allocate more memory from the start.

The recommended and correct way to allot memory per GPU in TensorFlow 2.X is the following:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

My notebook has a dedicated NVIDIA GPU (GeForce 920M) with 2 GB of RAM. I tried set_memory_growth, but it didn't work. I also tried to limit the max memory to 1024 MB, which didn't work either. Then I tried 1.5 GB and it worked. Thank you!
Lerner Zhang

Shameless plug: if you install GPU-supported TensorFlow, the session will first allocate memory on all GPUs, whether you set it to use only the CPU or the GPU. I may add my tip that even if you set the graph to use the CPU only, you should set the same configuration (as answered above :) ) to prevent unwanted GPU occupation.

And in an interactive interface like IPython or Jupyter, you should also set that configuration; otherwise, it will allocate all the memory and leave almost none for others. This is sometimes hard to notice.
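As a minimal sketch of that CPU-only case (assuming TF 1.x; device_count={'GPU': 0} hides the GPUs from the session so it grabs no GPU memory at all):

import tensorflow as tf

# A CPU-only session: no GPU is visible, so no GPU memory is occupied
config = tf.ConfigProto(device_count={'GPU': 0})
sess = tf.Session(config=config)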


ssp

If you're using TensorFlow 2, try the following:

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)

SunsetQuest

For TensorFlow 2.0, this solution worked for me. (TF-GPU 2.0, Windows 10, GeForce RTX 2070)

physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)

I am using TF-GPU 2.0, Ubuntu 16.04.6, Tesla K80.
@azar - Thanks for sharing. It's interesting that it's the same issue on both Ubuntu and Windows. Somehow, I always think that the issues are different when getting closer to the hardware. Maybe this is becoming less so as time passes - maybe a good thing.
DSBLR
# allocate 60% of GPU memory 
from keras.backend.tensorflow_backend import set_session
import tensorflow as tf 
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
set_session(tf.Session(config=config))

Kamil Marczak

This code has worked for me:

import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.InteractiveSession(config=config)

Imran Ud Din

Well, I am new to TensorFlow. I have a GeForce 740M or similar GPU with 2 GB of RAM. I was running an MNIST-style handwritten-character example for a native language, with training data consisting of 38,700 images and 4,300 test images, and was trying to get precision, recall and F1 using the following code, as sklearn was not giving me precise results. Once I added this to my existing code, I started getting GPU errors.

# Confusion-matrix counts computed as TensorFlow ops
TP = tf.count_nonzero(predicted * actual)              # true positives
TN = tf.count_nonzero((predicted - 1) * (actual - 1))  # true negatives
FP = tf.count_nonzero(predicted * (actual - 1))        # false positives
FN = tf.count_nonzero((predicted - 1) * actual)        # false negatives

prec = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * prec * recall / (prec + recall)

Plus, my model was heavy, I guess; I was getting a memory error after 147 or 148 epochs. Then I thought: why not create functions for these tasks? I don't know whether it works this way in TensorFlow, but I figured that if a local variable goes out of scope it may release memory, so I defined the above elements for training and testing inside functions. I was then able to reach 10,000 epochs without any issues. I hope this helps.


I am amazed at TF's utility but also by its memory use. On the CPU, Python allocated 30 GB or so for a training job on the flowers dataset used in many TF examples. Insane.
Khan

I tried to train U-Net on the VOC dataset, but because of the huge image size the memory ran out. I tried all the above tips, and even tried with batch size == 1, yet with no improvement. Sometimes the TensorFlow version also causes memory issues. Try using:

pip install tensorflow-gpu==1.8.0