Ipython: How to figure out the path of the current ipynb file from within IPython?

Created on 5 Jan 2017  ·  15 Comments  ·  Source: ipython/ipython

Is there a way to figure out the current ipynb file from within IPython?

Use case: I want to trigger simulations from within IPython. To keep everything documented, I want to copy the IPython notebook into the results folder, ideally from within IPython.

Searching the web for this issue showed that there seems to be a lot of interest in such a feature, but the solutions presented on Stack Overflow all seemed a bit hacky. Or is this already implemented?

All 15 comments

It is not possible, at least not without an unreliable hack (displaying JavaScript that executes Python code).

Here are some reasons why. The kernel (in this case IPython):

  • may not be running from a single file.
  • even if it is one file, the file may not be a notebook.
  • even if it is a notebook, the notebook may not be on a filesystem.
  • even if it is on a filesystem, it may not be on the same machine.
  • even if it is on the same machine, the path to the file may not make sense in the IPython context.
  • even if it makes sense, the Jupyter protocol has not been designed to do so, and we have no plan to change this abstraction in the short or long term.

Though, you _can_ run a notebook without a notebook server via an external script, and copy the notebook at the same time as well. That's a simple matter of jupyter nbconvert --execute --output-dir='results/' followed by the notebook's file name.
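The external-script approach could be sketched roughly as follows. This is only an illustration: the helper names build_nbconvert_command and run_and_archive are made up here, and the "--to notebook" flag is an added assumption so that the executed result stays an .ipynb file.

```python
import shutil
import subprocess
from pathlib import Path


def build_nbconvert_command(notebook, output_dir):
    """Return the nbconvert command line that executes `notebook`."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",        # assumption: keep the executed result as .ipynb
        "--execute",
        "--output-dir", str(output_dir),
        str(notebook),
    ]


def run_and_archive(notebook, output_dir="results"):
    """Execute the notebook, then copy its source next to the output."""
    notebook = Path(notebook)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_nbconvert_command(notebook, output_dir), check=True)
    # The script knows the path only because the caller passed it in.
    archived = output_dir / f"{notebook.stem}.source.ipynb"
    shutil.copy2(notebook, archived)
    return archived
```

Calling run_and_archive("simulation.ipynb") from a plain Python script (the file name is a placeholder) would execute the notebook and leave a copy of the source in results/. The point is that the path comes from the script, not from the kernel.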

Hope that helps.

Maybe the fact that you closed this issue immediately indicates that this topic has been thoroughly discussed somewhere else. Could you give me a link to the discussion, so I can understand that decision better?

Otherwise I am wondering: why can't the IPython environment set a Python variable, e.g. inside the IPython module, as soon as a kernel gets started? This variable could then hold information on how the kernel got started, like the URL of the IPython notebook.

There is no particular place where this is thoroughly discussed; it comes up in many places, but I'll reuse a metaphor I've seen before.

You are a book writer. Your readers regularly ask for one thing: because they identify with the characters, they want the main character to have the same eye color they do. How do you do that? Well, as a book writer you can't. For each single person the answer is obvious, but for the majority of users you can't.

You may print 10 versions with 10 eye colors and ask the reader to choose. But the reader _has_ to do it.

It's the same for the IPython kernel.

The kernel does not know what started it. The thing that started it could _try_ to set an env variable, but it might not even make sense in this context. You may not have a notebook connected. The process you start may not be Python.

You have a thing (your kernel) whose sole purpose is to execute code. It may or may not have access to a file system, it may or may not be Python. It may or may not even be connected to a frontend yet. It may or may not be connected to multiple clients during its life, maybe even simultaneously.

So while in each case you _can likely_ give a definitive answer as to whether there is a document attached to a kernel and what it is, the general answer, and how to get it, is unclear. The question does not make sense in general, or at least we haven't found a way for it to.

So, like the book's reader, you have to make a choice and tell the kernel the filename that _you_ think is the right one.
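"Telling the kernel" can be as simple as the user declaring the path in the first cell or in the environment before launch. A minimal sketch, assuming a made-up variable name MY_NOTEBOOK_PATH and a made-up helper notebook_path; nothing in IPython or Jupyter sets or reads either of them:

```python
import os
from pathlib import Path

# Hypothetical variable name; nothing in IPython or Jupyter sets or reads it.
ENV_VAR = "MY_NOTEBOOK_PATH"


def notebook_path(explicit=None):
    """Return the notebook path the user declared, or None if none was given."""
    # An explicit argument wins over the environment variable.
    candidate = explicit or os.environ.get(ENV_VAR)
    return Path(candidate) if candidate else None
```

In a notebook this might be used as NOTEBOOK = notebook_path("simulation.ipynb"), where the file name is whatever the user knows to be right. If the file is later renamed, the declared value is simply wrong, which is exactly the limitation described above.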

Say that when the notebook server starts, it sets the name of the file linked to it. There are technical challenges in doing so, mainly not coupling components, but assume we can. A couple of questions off the top of my head:

When running your notebook via nbconvert, what name do you set?

  • if the input is stdin?
  • if the input comes over the network?
  • if the output notebook != the input notebook?
  • in the "bookbook" mode that takes multiple notebooks as input?

When attaching a console, what name do you set?

If you attach multiple notebooks, what name do you set?

  • If you execute multiple notebooks in a row, what name do you set?
  • If you execute multiple notebooks in parallel, what name do you set?

When working in an environment without a filesystem (Postgres DB), what name?

Binary or ASCII? Defined encoding?

Notebook name or full path?

What if not on the same machine?

What if execution is purely in-memory because the notebook was generated on the fly?

Even if you have a name and print() it... what if the file gets renamed?

  • renamed while the kernel is off?
  • renamed during kernel execution?

With real-time collaboration and hard links, a file may have multiple names: which one is right?

None of the above questions have clear answers for me. If there is a consensus on how to do it correctly, without painting ourselves into a corner, we'll think about it, and then there will still be all the technical difficulties.

Hope that clarifies things a bit. You can try hackish things like this, but you'll see that they rarely satisfy everyone.

Excuse me, but with this:

# under Windows
!echo %cd%
# under Linux/Mac
!pwd

you get the desired information.
In order to re-use it, just do:

# under Windows
myInfo01 = !echo %cd%
# under Linux/Mac
myInfo02 = !pwd

That won't work reliably: the process's CWD may change, and it may not even be where the notebook is stored.
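A small demonstration of the first objection, using only the standard library: the working directory is process state that any cell or library call can change, so reading it later says nothing about where the notebook lives.

```python
import os
import tempfile

start_dir = os.getcwd()        # what !pwd would report before anything changes it

with tempfile.TemporaryDirectory() as other_dir:
    os.chdir(other_dir)        # any cell or library call can do this
    moved = os.path.realpath(os.getcwd()) != os.path.realpath(start_dir)
    os.chdir(start_dir)        # go back so the temporary directory can be removed

print("cwd changed after chdir:", moved)
```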

Is it at least guaranteed that if you open a notebook in a fresh notebook server and implicitly start a kernel by running some code, its pwd will be the folder the ipynb file is in?

Just because IPython can't magically handle every weird edge case, which I think nobody expected, shouldn't stop it from having a _simple_ rule like that for the _simple_ cases people actually care about (like handing a notebook + data files in the same folder to students).

Is it at least guaranteed that if you open a notebook in a fresh notebook server and implicitly start a kernel by running some code, its pwd will be the folder the ipynb file is in?

No.

It is not guaranteed that the kernel is on the same machine as the ipynb; it is not even guaranteed that the ipynb file exists, will exist, is unique, has a unique path, or even is or will be a file. Example: real-time collaboration on Google Drive.

I think I didn't formulate my question well enough. 200 students will have a Python environment set up, most by installing Anaconda on their own laptops. I will hand them the computer exercise as a notebook and data files in a folder. One of them might store the notebook in a Postgres DB, two might run the kernel on a different machine than the laptop where they have the notebook. Three students will set up a real-time collaboration on Google Drive together. Six students will do something else that you might or might not have mentioned so far. I'm mainly thinking about the 190 students that are going to reasonably follow the instructions, unzip the folder on their own laptop (Windows, OS X or Linux), start a notebook server on the _same_ laptop (either by the notebook server explorer or double clicking the notebook file), and let it implicitly start a new kernel (again on the same laptop) by executing the first cell. The question is whether cwd works for _those_ students. Will ~15 students come to my office because os.getcwd() didn't work, or should I expect closer to 50-100?

I'm mainly thinking about the 190 students that are going to reasonably follow the instructions, unzip the folder on their own laptop (Windows, OS X or Linux), start a notebook server on the same laptop (either by the notebook server explorer or double clicking the notebook file), and let it implicitly start a new kernel (again on the same laptop) by executing the first cell.

Yes, using os.getcwd() or even c = !pwd will work for these users, and I think in your context it's fine to ask them to do that. But as a _general_ use case, that is not guaranteed. We also try to be careful when stating things on this bug tracker, as a statement may pass as an explicit endorsement of this method. And we know people tend not to read in depth.
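For that simple local case, the original use case (copying the notebook into a results folder) might look like the sketch below. It assumes the working directory is still the notebook's folder and that the user supplies the file name, since the kernel cannot discover it; archive_notebook is a made-up helper name.

```python
import os
import shutil
from pathlib import Path


def archive_notebook(notebook_name, results_dir, base_dir=None):
    """Copy `notebook_name` from `base_dir` (default: current directory) into `results_dir`."""
    base = Path(base_dir) if base_dir is not None else Path(os.getcwd())
    source = base / notebook_name
    if not source.is_file():
        # Typical cause: the working directory is not the notebook's folder.
        raise FileNotFoundError(f"{source} not found")
    target_dir = Path(results_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir / source.name))
```

Note that this copies the file as last saved to disk, so unsaved edits in the browser are not included.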

Fair enough, thanks for your concern of precise communication.

The first time the script is run in a workbook, and before anything changes it, os.getcwd() is the notebook directory.
So what I often use in my code is

import os

if not 'workbookDir' in globals():
    workbookDir = os.getcwd()
print('workbookDir: ' + workbookDir)
os.chdir(workbookDir)  # If you changed the current working dir, this will take you back to the workbook dir.

As it seems, most of the users here do not really want to access the "path of the notebook" whatever that might actually mean in a given deployment, but rather to access resources that are associated with that notebook, in such a way that the details of the deployment are abstracted away.

Obviously, distributing notebooks together with associated data is a general and broad use case. Maybe there is a need for an abstract mechanism to access resources from within a kernel? It would then be the responsibility of the deployment (i.e. the notebook server installation) to properly set up that resource access API, possibly with the help of some metadata from the notebook? Then, the local notebook server could default to actually serving these resources from a path relative to the notebook. Other deployments might provide a separate interface (such as an upload method or a URL pointing to resources), or simply not support the interface at all.
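To make the idea concrete, here is one possible shape for the local default, purely as a sketch. No such API exists in IPython or Jupyter; the class name LocalResources and its open method are invented for illustration, and a deployment would be the one to decide what base directory (or other backend) to plug in.

```python
from pathlib import Path


class LocalResources:
    """Hypothetical resource provider that serves files from one base directory."""

    def __init__(self, base_dir):
        self.base_dir = Path(base_dir).resolve()

    def open(self, name, mode="rb"):
        """Open the resource `name`, resolved relative to the base directory."""
        path = (self.base_dir / name).resolve()
        # Refuse names that escape the base directory, e.g. "../secret".
        if self.base_dir not in path.parents:
            raise ValueError(f"{name!r} is outside the resource directory")
        return path.open(mode)
```

Notebook code would then call something like resources.open("data.csv") without caring whether the files sit next to the notebook or somewhere else entirely.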

It might be too late now but it sounds like Colaboratory might help your work here:
https://colab.research.google.com/notebooks/welcome.ipynb

There is a variable named "_dh" inserted into the globals when the notebook starts. It appears this is the directory of the notebook, though I haven't searched for any documentation on this. It is working for me right now, though.
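For context, _dh appears to be the directory history list that IPython keeps for the %cd magic, so its first entry is the directory the kernel started in rather than anything notebook-specific. A hedged sketch of using it with a fallback (startup_dir is a made-up helper name; entries may be strings or Path objects depending on the IPython version):

```python
import os
from pathlib import Path


def startup_dir(namespace):
    """Return the first entry of IPython's `_dh` list, or the cwd if it is absent."""
    history = namespace.get("_dh")
    if history:
        return Path(history[0])
    # Not running under IPython, or `_dh` is empty: fall back to the cwd.
    return Path(os.getcwd())
```

Inside a notebook this would be called as startup_dir(globals()). It is subject to the same caveats as os.getcwd(): the startup directory only matches the notebook's folder in the simple local setup.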

Similar to @SurealCereal's solution of:

if not 'workbookDir' in globals():
    workbookDir = os.getcwd()

I've been using this right after my imports:

try: ipynb_path
except NameError: ipynb_path = os.getcwd()

Something about the word 'error' makes me think twice before messing with its position or existence.

Alternatively:

if 'workbookDir' not in globals():

is a little more readable.
