Environments

Environments play an important role in managing dependencies within programming projects. In this lecture, we will talk about why they are needed, how they are created and how they are defined.

What are environments?

An environment is an isolated workspace containing dependencies (the external packages you are using).

Environments are managed by Julia's package manager Pkg and defined in two files called the Project.toml and the Manifest.toml.

Creating a new virtual environment

To create a new environment, enter Pkg-mode in the Julia REPL by typing ], then type activate followed by the name of the environment you want to create.

Let's create a new environment called MyDeepLearningEnv that includes MLDatasets.jl and Flux.jl:

(@v1.10) pkg> activate MyDeepLearningEnv            # create environment
  Activating new project at `~/MyDeepLearningEnv`

(MyDeepLearningEnv) pkg>                            # environment is active

The printout informs us that this created a new project folder at ~/MyDeepLearningEnv. The exact path depends on the folder in which you launched the Julia REPL. In this case, I opened Julia in my home directory, which is called ~ on Linux and macOS.

Once packages have been added, the project folder MyDeepLearningEnv contains a Project.toml and a Manifest.toml. Adding packages to this environment creates these files if they don't exist yet and updates both of them:

(MyDeepLearningEnv) pkg> add Flux, MLDatasets
   Resolving package versions...
   Installed ProgressLogging ─ v0.1.4
   Installed Functors ──────── v0.4.10
   Installed Optimisers ────── v0.3.3
   Installed OneHotArrays ──── v0.2.5
   Installed Flux ──────────── v0.14.15
    Updating `~/MyDeepLearningEnv/Project.toml`   # <---
  [587475ba] + Flux v0.14.15
  [eb30cadb] + MLDatasets v0.7.14
    Updating `~/MyDeepLearningEnv/Manifest.toml`  # <---
  [621f4979] + AbstractFFTs v1.5.0
  [79e6a3ab] + Adapt v4.0.4
  [dce04be8] + ArgCheck v2.3.0
  [a9b6321e] + Atomix v0.1.0
  [a963bdd2] + AtomsBase v0.3.5
  ...

We can check the packages in our environment by typing status or st in Pkg-mode:

(MyDeepLearningEnv) pkg> status
Status `~/MyDeepLearningEnv/Project.toml`
  [587475ba] Flux v0.14.15
  [eb30cadb] MLDatasets v0.7.14
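
The same workflow can also be scripted through Pkg's functional API instead of Pkg-mode. The following is a minimal sketch, not part of the original session: it assumes a throwaway parent folder (created with mktempdir) in place of the home directory, and it leaves the Pkg.add call commented out because it would download packages.

```julia
using Pkg

# Assumption: a temporary parent folder stands in for the home directory used above
parent_dir = mktempdir()
env_path = joinpath(parent_dir, "MyDeepLearningEnv")

# Same effect as `pkg> activate MyDeepLearningEnv`
Pkg.activate(env_path)

# Path of the Project.toml that belongs to the active environment
project_file = Base.active_project()
println("Active project file: ", project_file)

# Same effect as `pkg> add Flux, MLDatasets`; needs network access, so it is commented out
# Pkg.add(["Flux", "MLDatasets"])
```

Note that activating a new environment only selects the folder. The Project.toml and Manifest.toml are written once the first package is added.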

Structure of a Julia environment

Project.toml

Let's first take a look at the contents of the Project.toml. In a second terminal, move to your project folder using cd (change directory), then print the file contents using the command cat Project.toml (concatenate), or open the file in your favorite editor:

[deps]
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
MLDatasets = "eb30cadb-4394-5ae3-aed4-317e484a6458"

In the case of our environment, the Project.toml just contains the list of installed packages we would expect: Flux and MLDatasets. Each name is followed by a string called a "universally unique identifier" (UUID), which we can ignore for now.

As we will see in the lesson on writing a package, the Project.toml contains more information when used in packages.
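
Since the Project.toml is an ordinary TOML file, it can also be read from Julia. The sketch below uses the TOML standard library to parse the file contents shown above, pasted in as a string so that the example is self-contained. The variable names are chosen for illustration only.

```julia
using TOML

# Contents of the Project.toml shown above
project_text = """
[deps]
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
MLDatasets = "eb30cadb-4394-5ae3-aed4-317e484a6458"
"""

project = TOML.parse(project_text)

# The `deps` table maps each package name to its UUID
deps = project["deps"]
for name in sort(collect(keys(deps)))
    println(rpad(name, 12), deps[name])
end
```

For a file on disk, TOML.parsefile("Project.toml") would return the same kind of dictionary.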

Manifest.toml

Let's look at our Manifest using cat Manifest.toml:

# This file is machine-generated - editing it directly is not advised

julia_version = "1.10.2"
manifest_format = "2.0"
project_hash = "9aea089894f46207e0e51b9ad88b65d90b4230ac"

[[deps.AbstractFFTs]]
deps = ["LinearAlgebra"]
git-tree-sha1 = "d92ad398961a3ed262d8bf04a1a2b8340f915fef"
uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c"
version = "1.5.0"
weakdeps = ["ChainRulesCore", "Test"]

    [deps.AbstractFFTs.extensions]
    AbstractFFTsChainRulesCoreExt = "ChainRulesCore"
    AbstractFFTsTestExt = "Test"

[[deps.Adapt]]
deps = ["LinearAlgebra", "Requires"]
git-tree-sha1 = "6a55b747d1812e699320963ffde36f1ebdda4099"
uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
version = "4.0.4"
weakdeps = ["StaticArrays"]

    [deps.Adapt.extensions]
    AdaptStaticArraysExt = "StaticArrays"

[[deps.ArgCheck]]
git-tree-sha1 = "a3a402a35a2f7e0b87828ccabbd5ebfbebe356b4"
uuid = "dce04be8-c92d-5529-be00-80e4d2c0e197"
version = "2.3.0"
...

The Manifest is a much longer file than the Project.toml. Mine contains 1267 lines, even though we just added two dependencies: Flux and MLDatasets. How is this possible?

This is due to the fact that the Manifest lists all packages in the dependency tree: not only Flux and MLDatasets, but also their dependencies, the dependencies of their dependencies, and so on. For packages that are not part of Julia's standard library, Git tree hashes and versions are specified. The Manifest even includes external binaries (e.g. compiled C, C++ and Fortran programs) that might be required.

This makes our environment fully reproducible!
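
To make the structure of the Manifest more tangible, here is a small sketch that parses a shortened excerpt of the file above with the TOML standard library and prints each package with its pinned version and direct dependencies. The excerpt only keeps two entries and drops the weakdeps and extensions fields. This is an assumption made for brevity, not how the real file looks.

```julia
using TOML

# Shortened excerpt of the Manifest.toml shown above (two entries only)
manifest_text = """
julia_version = "1.10.2"
manifest_format = "2.0"

[[deps.Adapt]]
deps = ["LinearAlgebra", "Requires"]
git-tree-sha1 = "6a55b747d1812e699320963ffde36f1ebdda4099"
uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
version = "4.0.4"

[[deps.ArgCheck]]
git-tree-sha1 = "a3a402a35a2f7e0b87828ccabbd5ebfbebe356b4"
uuid = "dce04be8-c92d-5529-be00-80e4d2c0e197"
version = "2.3.0"
"""

manifest = TOML.parse(manifest_text)

# Each package name maps to a list of entries; here every list has one element
for name in sort(collect(keys(manifest["deps"])))
    entry = only(manifest["deps"][name])
    direct_deps = get(entry, "deps", String[])   # packages without deps omit this field
    println(name, " v", entry["version"], " <- [", join(direct_deps, ", "), "]")
end
```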

Why should I use environments?

Reason 1: Reproducibility

In the sciences, reproducibility is of utmost importance to validate research findings and improve reliability. The environment of a project can be shared with others by providing its Project.toml and Manifest.toml. This ensures that people will use the exact same dependencies as you did. Changes in future releases of a package won't affect your results.
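
On the receiving side, a shared environment is typically restored with Pkg.activate followed by Pkg.instantiate. The sketch below wraps both calls in a helper. The name restore_environment and the example path are hypothetical, and the call itself is left commented out because it downloads packages.

```julia
using Pkg

"""
Activate the project in `project_dir` and install the package versions
recorded in its Manifest.toml. The function name is a hypothetical helper.
"""
function restore_environment(project_dir::AbstractString)
    Pkg.activate(project_dir)   # use the shared Project.toml and Manifest.toml
    Pkg.instantiate()           # download and install the recorded versions
end

# Usage (placeholder path, requires network access):
# restore_environment("path/to/shared/project")
```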

Reason 2: Avoid dependency conflicts

As we will see in our lecture on writing packages, packages can set lower and upper bounds on versions of their dependencies. This is useful since developers don't know whether future releases of their dependencies will be compatible with their code.

Let's imagine a scenario where Flux and MLDatasets both have a common dependency on a third package Foo.jl. When creating an environment, Pkg will look at the acceptable versions of Foo for both Flux and MLDatasets and compute the intersection of acceptable versions. This is called resolving dependencies.
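
The idea of intersecting acceptable versions can be illustrated with a toy example. The version lists below are made up, and the sketch is a strong simplification: Pkg's actual resolver works on the whole dependency graph at once rather than on a single package.

```julia
# Hypothetical versions of Foo accepted by each package (made up for illustration)
foo_ok_for_flux       = [v"1.2.0", v"1.3.0", v"2.0.0"]
foo_ok_for_mldatasets = [v"1.3.0", v"2.0.0", v"2.1.0"]

# Versions of Foo that satisfy both packages
compatible = intersect(foo_ok_for_flux, foo_ok_for_mldatasets)

if isempty(compatible)
    error("No version of Foo satisfies both packages")
end

# Simplifying assumption: pick the newest compatible version
chosen = maximum(compatible)
println("Compatible versions: ", compatible)
println("Chosen version: ", chosen)
```

If the intersection were empty, no valid environment would exist and the packages could not be installed together.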

Reason 3: Avoid polluting your global environment

The more packages are installed in one environment, the harder it gets to resolve dependencies, and the more likely you are to use outdated versions of packages.

For this reason, you should use separate environments for each project instead of installing everything into your global (@v1.10) environment. To show which of your dependencies are outdated, run status --outdated in Pkg-mode.

Reason 4: Simplify troubleshooting

This point is basically the same as Reason 1: When your code contains a bug, troubleshooting is simplified by making the bug reproducible. When opening an issue on GitHub, it is good practice to provide everything needed to reproduce the bug, including the environment in which it occurred.

Reason 5: Environments include binaries

Most Julia code is written in pure Julia, but sometimes it is necessary to call external programs which have been compiled to binaries. Binaries are shipped as regular packages whose names end in _jll by convention, for example OpenBLAS_jll, which contains OpenBLAS binaries.

When installing a Julia package, Pkg automatically downloads and installs all required binaries. Just like any other dependency, they are added to the Manifest and therefore fully reproducible.

Temporary environments

Temporary environments provide a convenient way to experiment with new packages without affecting your existing project environments. When you want to quickly test a package you've discovered on GitHub or elsewhere, the package manager allows you to create an isolated, disposable environment.

In your Julia REPL, enter package mode and type activate --temp. This will create an environment with a randomized name in a temporary folder.

(@v1.10) pkg> activate --temp
  Activating new project at `/var/folders/74/wcz8c9qs5dzc8wgkk7839k5c0000gn/T/jl_9AGcg1`

(jl_9AGcg1) pkg>
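
The same can be done from a script using the temp keyword of Pkg.activate. This is a minimal sketch; the randomized folder name will differ on every run.

```julia
using Pkg

# Same effect as `pkg> activate --temp`
Pkg.activate(temp=true)

# Path of the Project.toml belonging to the temporary environment
temp_project = Base.active_project()
println("Temporary environment: ", dirname(temp_project))
```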

Environments in Pluto

Pluto notebooks also contain reproducible environments. Let's take a look at the source code of a notebook called empty_pluto.jl that just contains a single cell declaring using LinearAlgebra.

### A Pluto.jl notebook ###
# v0.19.25

using Markdown
using InteractiveUtils

# ╔═╡ 9842a4f5-69d1-4566-b605-49d5c6679b4a
using LinearAlgebra # 💡 the only cell we added! 💡

# ╔═╡ 00000000-0000-0000-0000-000000000001
PLUTO_PROJECT_TOML_CONTENTS = """
[deps]
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
"""
 
# ╔═╡ 00000000-0000-0000-0000-000000000002
PLUTO_MANIFEST_TOML_CONTENTS = """
# This file is machine-generated - editing it directly is not advised

julia_version = "1.8.5"
manifest_format = "2.0"
project_hash = "ac1187e548c6ab173ac57d4e72da1620216bce54"

[[deps.Artifacts]]
uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"

[[deps.CompilerSupportLibraries_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
version = "1.0.1+0"

[[deps.Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"

[[deps.LinearAlgebra]]
deps = ["Libdl", "libblastrampoline_jll"]
uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"

[[deps.OpenBLAS_jll]]
deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"]
uuid = "4536629a-c528-5b80-bd46-f80d51c5b363"
version = "0.3.20+0"

[[deps.libblastrampoline_jll]]
deps = ["Artifacts", "Libdl", "OpenBLAS_jll"]
uuid = "8e850b90-86db-534c-a0d3-1478176c7d93"
version = "5.1.1+0"
"""

# ╔═╡ Cell order:
# ╠═9842a4f5-69d1-4566-b605-49d5c6679b4a
# ╟─00000000-0000-0000-0000-000000000001
# ╟─00000000-0000-0000-0000-000000000002

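We can see that the notebook file stores the contents of both a Project.toml and a Manifest.toml as strings, right next to the cell we wrote.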

Pluto notebooks are therefore fully reproducible and also regular Julia files!

Last modified: October 09, 2024.
Website, code and notebooks are under MIT License © Adrian Hill.
Built with Franklin.jl, Pluto.jl and the Julia programming language.