More documentation about the pyobj field.

Summary: Pull Request resolved: pytorch#22885 Test Plan: Imported from OSS Differential Revision: D16283076 Pulled By: ezyang fbshipit-source-id: 4f6a87d900c4d430eedc90661de89e0f6916347e
zmmwl · Jul 16, 2019 · 7793ab0 · 7793ab0
1 parent cd11109
commit 7793ab0
Show file tree

Hide file tree

Showing 2 changed files with 26 additions and 1 deletion.
diff --git a/c10/core/TensorImpl.h b/c10/core/TensorImpl.h
@@ -1545,7 +1545,19 @@ struct C10_API TensorImpl : public c10::intrusive_ptr_target {
 
   c10::VariableVersion version_counter_;
 
-  PyObject* pyobj_ = nullptr; // weak reference
+  // This field contains a weak reference to a PyObject representing
+  // this Tensor.  It MUST NOT be a strong reference, as that would
+  // create a reference cycle between Tensor and the PyObject.  If
+  // pyobj is nullptr, when we transfer Tensor to Python, we allocate
+  // a new PyObject for it and set this field.  This is thread safe
+  // because all Python code is protected under the GIL.  This design does
+  // NOT WORK for Tensors which are shared across multiple Python
+  // subinterpreters (introduced in Python 3.8) since you don't have
+  // enough space to store the separate PyObject per subinterpreter.
+  // When a PyObject dies, you are obligated to clear this field
+  // (otherwise, you will try to use-after-free the pyobj); this currently
+  // occurs in THPVariable_clear in torch/csrc/autograd/python_variable.cpp
+  PyObject* pyobj_ = nullptr;
 
   // We could save a word or two by combining the SmallVector structs,
   // since their size is redundant, and if we need to overflow the buffer space

diff --git a/torch/csrc/autograd/python_variable.cpp b/torch/csrc/autograd/python_variable.cpp
@@ -113,6 +113,19 @@ static int THPVariable_clear(THPVariable *self)
     if (auto grad_acc = self->cdata.try_get_grad_accumulator()) {
       grad_acc->pre_hooks().clear();
     }
+    // We must clear the pyobj field in the base C++ Variable, to ensure
+    // that if we attempt to pass the Variable to Python, we don't
+    // attempt to reuse the (now-dead) PyObject.
+    //
+    // One non-obvious consequence of this: if you have a tensor x, you
+    // take its id(), and then you let it become dead in Python, if you
+    // get another reference to the tensor in Python later (because you
+    // passed it from C++ to Python), you'll get a *different* id() the
+    // second time around.  So you better make sure that if you're using
+    // id() to keep track of Tensors, you better make sure their Python
+    // objects stay live, buster!  See
+    // https://github.com/pytorch/pytorch/issues/22884 for an example of
+    // this actually showing up.
     self->cdata.set_pyobj(nullptr);
   }
   self->cdata.reset();