Kubernetes: Image volumes and container volumes

Created on 8 Aug 2014 · 140 Comments · Source: kubernetes/kubernetes

This would map closely to Docker's native volumes support, and allow people to build and version pre-baked data as containers. Maybe read-only? Haven't thought that far...

area/app-lifecycle area/usability kind/feature lifecycle/frozen priority/important-soon sig/service-catalog sig/storage

Most helpful comment

Multistage builds are not the same thing. Multistage builds let you do stuff like build, throw away the build environment, and copy the built artefacts into the final container. In the end, you are left with one container, which in this situation has, for example, nginx and your static files.

In the spirit of k8s composability, though, the desire is to have one container for nginx that can be independently updated from a second container storing your static files. They are combined together at runtime via k8s pod semantics. This is what the issue is about.
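
For concreteness, here is a minimal sketch of that pattern as it can be approximated today: a pod in which a data-only sidecar copies its baked-in files into a shared emptyDir that nginx then serves. The image name and paths are hypothetical, and note the caveat discussed later in this thread: nginx may start before the copy finishes.

apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  volumes:
  - name: static-content
    emptyDir: {}
  containers:
  - name: content
    image: example.com/my-static-files:v2   # hypothetical versioned data image
    # Copy the baked-in files into the shared volume, then idle.
    command: ["sh", "-c", "cp -r /data/. /www/ && while true; do sleep 3600; done"]
    volumeMounts:
    - name: static-content
      mountPath: /www
  - name: nginx
    image: nginx
    volumeMounts:
    - name: static-content
      mountPath: /usr/share/nginx/html
      readOnly: true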

All 140 comments

I guess so? But why not just use a git repo?

More plugins more better? I wanted to put it out there, since we do sort of diverge from Docker's native volumes support. Clearly not urgent :)

More potential uses of this:

  • Deployment of scripts/programs for lifecycle hooks: One of the main points of the lifecycle hooks (#140) is to decouple applications from the execution environment (Kubernetes in this case). If the hook scripts/programs must be deployed as part of the application container, that compromises this objective.
  • Dynamic package composition more generally: This would be more similar to our internal package model, where we can independently manage the base filesystem, language runtime, application, utility programs for debugging, etc.
  • Configuration deployment
  • Input data deployment

A git repo could be used for some of these cases, but for others it would be less than ideal.

If the base image of a Dockerfile (e.g. FROM fedora) is a linux distro, then isn't it going to be annoying to have a bunch of Linux Standard Base type of files in what is really supposed to be a data-only package?

On the other hand, if it is created using tar -c . | docker import - myimage, then what is the advantage of a docker image over a tar file?

How about the VOLUME directives of a container? Any container can declare itself to be exposing any number of volumes. Maybe the functionality to expose is not the whole container, but just the volumes from that container?

On the other hand, people can create data containers "FROM scratch", and who are we to say it's annoying?

@thockin I like the initial idea of having a new volume type to support docker's data volume containers, which matches the common package, shared package, etc. concepts we have internally. I can also see the potential use cases listed by @bgrant0607. But please don't go down the road Docker has taken today: declaring a container as a data volume, in which case we are going to introduce another level of dependency complexity among containers within a pod, or even dependencies between pods if a pod only has a data volume container, etc. I think your initial idea of a volume type which actually refers to a docker volume container or other read-only volume is the better approach in the long run.

The interesting thing about docker volumes is that a container does not have to be RUNNING for the volumes to exist. It's a weird model, but I think it could work.

I don't think we know what people really want in this space yet, though.

It appears the same net effect as this issue can be achieved without introducing a new volume type, using two containers in a pod and some shell wrapped around the underlying container command lines (see https://github.com/GoogleCloudPlatform/kubernetes/issues/1589#issuecomment-58043463).

How to decide between container-as-a-volume and command-line-based sequencing (a sketch of the latter follows the list below)?

  • more portability between Kubernetes and non-Kubernetes docker use cases with container-as-volume.
  • easier for user to discover container-as-volume concept and identify that it is the right solution?
  • having fewer and less general mechanisms for setting up "packages" may lend itself to more tightly integrated build/deploy systems. But, maybe that is not a goal for Kubernetes.
  • either solution can integrate with data durability, I think.
  • liveness checking is more complex with command-line-based sequencing, since the pod to be liveness checked goes through a waiting phase and then a running phase.
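
A rough sketch of the command-line-based sequencing approach, assuming hypothetical data and app images: the data container drops a sentinel file once its copy completes, and the app container polls for it before starting its real process.

apiVersion: v1
kind: Pod
metadata:
  name: sequenced
spec:
  volumes:
  - name: shared
    emptyDir: {}
  containers:
  - name: data
    image: example.com/data:v1    # hypothetical
    # Write the payload, then drop a sentinel file to signal readiness.
    command: ["sh", "-c", "cp -r /payload/. /shared/ && touch /shared/.ready && while true; do sleep 3600; done"]
    volumeMounts:
    - name: shared
      mountPath: /shared
  - name: app
    image: example.com/app:v1     # hypothetical
    # Poll for the sentinel before exec'ing the real process.
    command: ["sh", "-c", "until [ -f /shared/.ready ]; do sleep 1; done; exec /app/start"]
    volumeMounts:
    - name: shared
      mountPath: /shared

This illustrates the liveness-checking point above: the app container spends time in a waiting phase before its running phase begins.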

There are a lot of ways to go about it and a new volume type is in my opinion not needed. I tried to standardize the way we structure data and make different volume providers possible. These range from Host Volumes, Data Volume Containers, Side Containers with additional logic to Volume as a Service, which is where k8s could integrate greatly. The start is already available via git as volume. I think the native Volumes in Docker are enough, but just lack a standard. The more detailed ideas are available at docker/docker#9277.

I think the question is whether the data volume container should be represented as a container or as a volume. I prefer to think of them as volumes and find passive containers to be non-intuitive for users, problematic for management systems to deal with, and the source of weird corner-case behaviors.

@bgrant0607 still they are supported in docker and therefore we should acknowledge they exist. I would love to see more integrated methods in k8s itself, which just expose a specific type of volume. I was hinting at that in my proposal via the VaaS approach. But I would dislike this approach reducing compatibility.

+1 for supporting a container as a volume. I have a scenario where I have a container that has a bunch of data baked into it for use by other containers, helps keep the data "local" to the work being done.

Whatever you decide, I hope you will make it clear in the documentation, to save people the time of searching around for this information. Currently, the documentation for both compute and container engine:

  1. makes no mention that VOLUME/--volumes-from/container-as-volume is not supported
  2. makes no mention of possible work-arounds
  3. makes no mention of the possibility of retrieving things from a git repo

It's important to note that using a git repo isn't the same. It requires the git repo to be securely accessible from Google Cloud (or wherever Kubernetes is being used). Further, it's unclear how non-public repositories would be accessible, unless the username & password are hard-coded into the Kubernetes GitRepo#repository JSON/YAML string. Also, it requires that the desired artifact(s) be checked in to source control. And it decouples the Docker image from the artifact (which may or may not be desirable).
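
For reference, the gitRepo volume discussed here looked roughly like the sketch below (the repository URL is a placeholder). It only performs an anonymous clone, which is exactly the credentials limitation raised above, and it was later deprecated in favor of cloning from a separate container.

apiVersion: v1
kind: Pod
metadata:
  name: git-backed
spec:
  volumes:
  - name: src
    gitRepo:
      repository: "https://github.com/yourname/something.git"
      revision: "master"          # a commit hash also works
  containers:
  - name: app
    image: example.com/app:v1     # hypothetical
    volumeMounts:
    - name: src
      mountPath: /src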

I will be working around this issue by moving the data that's in my container volume into a Dockerfile that layers on top of the container that wanted to use the volume, with ADD. The problem you're running into is that the community at large is encouraging the "container as volume" approach in websites and blog posts, and as a result people will continue to have difficulty. For example, the docker website itself says, "If you have some persistent data that you want to share between containers, or want to use from non-persistent containers, it's best to create a named Data Volume Container, and then to mount the data from it." (emphasis mine).

Also, @erictune a container-only volume can be (and probably should be) written as "FROM scratch". I'd argue that if the user doesn't do it that way, that's their choice.

+1 @rehevkor5

I am disappointed to hear that k8s doesn't support data volume containers.

I am not sure how I am supposed to abstract r/w data away from the host now. I was under the impression that k8s was about abstracting away the host, but a host volume introduces all sorts of host-specifics, like having to share the same username/group for access to the data.

@rehevkor5 I thought the same thing about data volume containers at first (should be written as FROM scratch) until I read this (which may or may not be correct): http://container42.com/2014/11/18/data-only-container-madness/ Your workaround seems to do just this?

There are a few things going on here, most importantly (I think) some confusion.

Kubernetes supports the notion of a writable "empty volume" that is shared across containers within a pod, without resorting to host directories - this is what an emptyDir volume is.

Now the question comes down to "but I don't want an _empty_ directory, I want <some data>". The question I think we need to sort out is what the <some data> is. We currently support pull-from-git, which is just a single instance of a larger pattern - "go fetch some pre-cooked data once before starting my pod".

We could support myriad such "fetch once" volume kinds: git, cvs, svn, docker containers (more below), URLs, in-line base64-encoded tarfiles, stdout of another program, etc. I do NOT think we want to support those all as independent volume plugins - they can almost all be done by an unprivileged container without any help from kubelet. More, you quickly arrive at the followup features like "...and re-pull from git every 10 minutes" - things that stop being "fetch once" and start being active management, but do not require privileges. We make great use of such things internally.

IMO: all of these things that can be run as a side-car container in your pod (writing to a shared emptyDir) SHOULD BE. Git should stop being a first-class volume, and should instead be a container of some sort. This brings a slew of new design questions: Is it just a container like all the other app containers? How do I ensure it runs BEFORE my app containers? What if it experiences a failure? I don't have answers to all of these yet.

Now, let's think about the case of docker data volume containers. What is the semantic that people are really asking for? Is it:

  • "run" a data container and then expose the entire chroot of that run?
  • "run" a data container and then expose all of the VOLUME (from Dockerfile) dirs as kubernetes volumes?
  • "run" a data container and then do the equivalent of --volumes-from into kubernetes containers?

These are all subtly different semantically, especially in the face of a data container that has multiple VOLUME statements. Some operating modes also make it hard to verify input until after a pod has been accepted, scheduled, and attempted on a kubelet (we try to validate as much as we can up front).

ACTION ITEM: I'd very much like for people who use docker data containers to describe what behavior they would expect here.

Back to side-car containers as volumes. I could imagine something like:

Pod {
  spec: {
    volumes: [
      { Name: "git-data",
        Source: FromContainer {
          Name: "awkward",  // what goes here?
          Image: "kubernetes/git-volume",
          EnvVars: [
            { REPO: "http://github.com/yourname/something" },
            { RESYNC: "true" },
            { RESYNC_INTERVAL: 60 }
          ]
        }
      }
    ],
    containers: [ ... use git-data ... ]
  }
}

Sadly, there's not much of the container-ness that you can really hide, so you end up re-using the Container schema, which is at least somewhat redundant and awkward.

Alternately, I could imagine just making them extra containers:

Pod {
  spec: {
    volumes: [
      { Name: "git-data", Source: EmptyDir {} }
    ],
    containers: [
      {
        Name: "git-puller",
        Image: "kubernetes/git-volume",
        EnvVars: [
          { DATADIR: "/vol" },
          { REPO: "http://github.com/yourname/something" },
          { RESYNC: "true" },
          { RESYNC_INTERVAL: 60 }
        ],
        VolumeMounts: [ { Name: "git-data", MountPath: "/vol" } ]
      },
      { ... container that reads git-data ... }
    ]
  }
}

It's still pretty verbose. Can we do better?

In this sort of model, almost any sort of logic can be bundled up as a container and published, and anyone can use it immediately.

Something like this is, I think, the way to go. Details to emerge; input wanted.

To follow up to myself - all of this assumes that any data mutations you make have a lifetime equivalent to the pod. If the pod dies for any reason (the machine goes down, it gets deleted in the API, some non-recoverable failure in kubelet, etc) the data dies with it.

Durable data is a MUCH larger topic :)

I'll try to describe a use case for a data container. I'll have a pod with 3 containers; I'll name them "ingest", "process" and "data".

The ingest container is responsible for getting messages in some fashion and telling the "process" container to do work.

The process container does work, but it requires access to data provided by the "data" container; outside of kubernetes this is done using Docker's "volumes-from". This data can be hundreds of megabytes, but most often is 10-15 gigabytes.

The data container has a process responsible for pulling the data that will be needed by the process container. While the process container is doing work, it's possible that a new set of data becomes available. The data container can fetch the data and use something like symlinks to swap it in, so the next time the process container begins a new job it's using the newly available data.

Hopefully that makes some sense.

Thanks
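
A sketch of that data-plus-process arrangement using a shared emptyDir, assuming hypothetical images and a stand-in fetch-new-data script; the symlink swap keeps readers from ever seeing a half-written dataset:

apiVersion: v1
kind: Pod
metadata:
  name: pipeline
spec:
  volumes:
  - name: dataset
    emptyDir: {}
  containers:
  - name: data
    image: example.com/data-fetcher:v1   # hypothetical fetcher image
    # Pull each new dataset into a fresh dir, then repoint a symlink so
    # readers always see a complete copy ("fetch-new-data" is a stand-in).
    command: ["sh", "-c", "while true; do d=/vol/$(date +%s); fetch-new-data \"$d\" && ln -sfn \"$d\" /vol/current; sleep 600; done"]
    volumeMounts:
    - name: dataset
      mountPath: /vol
  - name: process
    image: example.com/processor:v1      # hypothetical
    # Each new job resolves the dataset through /vol/current.
    volumeMounts:
    - name: dataset
      mountPath: /vol
      readOnly: true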

To me, this does not make a strong argument. Everything you describe is possible if your data container just writes to a shared emptyDir volume.

Now, there IS a gotcha with the initial load of data, but that has to be handled in any similar model. Either the data is immutable, in which case the data container can load it once and go to sleep, or else it is changing over time, in which case you have to wait for it to get current. In the former case, the initial data is ALL that matters. Is that an interesting use? In the latter, does the initial data matter, or only "current" data?

The only other argument is to be exactly docker compatible semantically, but frankly the volumes-from behavior is so semantically rigid, it may not be worth being compatible with.

Using an emptyDir that is available between containers sounds sufficient. The initial data could actually be baked into the docker image, then the process in the "data" container could make sure it updates it when necessary.

After understanding emptyDir better, I agree, the use case I provided would work with what Kubernetes supports today.

Thanks.

I am not trying to dodge supporting a cool feature. In the case of immutable data, emptyDir would incur a copy and 2x disk usage. That is not acceptable. The question I have is whether such a use case is interesting, and if so, what semantic people would think is "obvious".

Then there is the issue of live-updated data. Does the initial payload matter? What sort of API would capture this idea? Is static data just a special case of this?

P2 to at least document this better.

Wouldn't a Docker data-only container be persistent? It would be great to use it with Kubernetes, because I need persistent data without iSCSI, NFS, or host_dir (permission problems). I would use Fedora Atomic as a single-minion instance.

@pwFoo currently there is no mechanism to share data only containers between multiple minions.

I have a single host / minion setup and need local persistent storage. hostDir without a permission fix isn't a solution, because the user inside the container doesn't have permission to write to the volume.

Kubernetes is optimized for the many-node use case. Sometimes, as a consequence, things that seem like they should be easy in a single-node case may not be supported.

Most docker containers today run as root, so you should be able to write to a hostPath mount. If you are running your containers as non-root (good for you!) I would be amenable to some different tweak on the model that gave you what you wanted. For example, something like

"hostPath": { "path": "/tmp/mypod", "uid": 93, "gid": 76, "mode": 0700 }

...with well-defined semantics (e.g. it only applies if the path does not exist, or already exists with the exact same config), or even something higher level that abstracts the real host path away.

But first - can you explain what "persistent" means to you? Under what conditions is the data retained? What is the bounding lifetime?
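
The uid/gid fields sketched above never landed in hostPath as such; a common later workaround is an init container that runs as root and chowns the hostPath directory before the non-root app starts. A minimal sketch, with the uid/gid, image names, and paths purely illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: webapp
spec:
  volumes:
  - name: data
    hostPath:
      path: /srv/appdata          # survives pod restarts on this node
  initContainers:
  - name: fix-perms
    image: busybox
    # Runs once as root, handing the directory to the app's uid/gid.
    command: ["sh", "-c", "chown -R 999:999 /data"]
    volumeMounts:
    - name: data
      mountPath: /data
  containers:
  - name: app
    image: example.com/webapp:v1   # hypothetical non-root app
    securityContext:
      runAsUser: 999
    volumeMounts:
    - name: data
      mountPath: /data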

Hello thockin,

I need reboot-safe volumes to store MySQL databases or webspace content, so the lifetime needed is forever (not dependent on service / pod / host uptime). Changes should be saved directly.

Because I have a single-minion setup, hostDir with correct / adjustable permissions (uid / gid) should be fine at the moment.

Regards

OK, I'll give this some thought. No promises on timeframe, though.

Thanks @thockin :)

I want to be able to mount the busybox binary from a busybox image into a container with an image that doesn't have a shell.
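
A sketch of one way to get that effect: copy the statically linked busybox binary into a shared emptyDir before the app starts (the app image name is hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: debuggable
spec:
  volumes:
  - name: tools
    emptyDir: {}
  initContainers:
  - name: install-busybox
    image: busybox
    # busybox is a single static binary, so one copy is enough.
    command: ["cp", "/bin/busybox", "/tools/busybox"]
    volumeMounts:
    - name: tools
      mountPath: /tools
  containers:
  - name: app
    image: example.com/distroless-app:v1   # hypothetical shell-less image
    volumeMounts:
    - name: tools
      mountPath: /tools

With that in place, kubectl exec debuggable -c app -- /tools/busybox sh gets you a shell inside the otherwise shell-less container.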

I think that's a great goal. I'm not sure we have bandwidth to hit this before 1.0 - especially considering there are a bunch of issues to iron out around data containers and containers as volumes.

No doubt @thockin, just wanted to record it

I think that people will want to separately manage several distinct file sets:

  • base image
  • debugging tools (like sh)
  • general language runtime / environment / prerequisites
  • other app dependencies
  • app
  • app static data
  • app configuration
  • various secrets (SSL, ssh, ...)

+1 to wanting this!

My use case is as follows:

Have a cluster with a "cluster master". Have a set of scripts + libraries that "slave" images need to run to register with a cluster master. Would like users to be able to use images out-of-the-box as slaves, (without making a child image just for this use-case). Solution is to have a data volume image with scripts+libraries that users can mount read-only with "--volumes-from" and specify a command in their pod config running the script.

The current workaround is to make a GCEPersistentDisk which originates from the data volume image and mount that, but there is no reason this couldn't be made cloud-provider agnostic.
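Concretely, that workaround looks roughly like the following sketch - the PD name, image, and script path are hypothetical, and a PD mounted readOnly can be attached to many pods at once:

apiVersion: v1
kind: Pod
metadata:
  name: slave
spec:
  containers:
  - name: worker
    image: stock-image                    # hypothetical out-of-the-box image
    command: ["/opt/scripts/register.sh"] # hypothetical registration script
    volumeMounts:
    - name: scripts
      mountPath: /opt/scripts
      readOnly: true
  volumes:
  - name: scripts
    gcePersistentDisk:
      pdName: slave-scripts               # hypothetical PD pre-populated from the data image
      fsType: ext4
      readOnly: true                      # read-only mounts can be shared by many pods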

@elibixby Interesting - could you just mount those scripts and libraries in an NFS volume? Or is that overly complex for the solution?

Yeah, it's a possible workaround. That becomes a lot more heavyweight if I want a large "library" of scripts or programs to attach to containers. It's much easier to maintain images in an image repository than to maintain a fleet of disks.

ping @thockin, what do you think about supporting volumes-from within the Pod, as I mentioned in #12526?

Good:
It will not change the behavior of Pod
It will spare me from copying things into an empty_dir (I can't cp in a Dockerfile because the destination is a volume dir)
It solves 90% of what people want in this issue and brings a new way to deploy apps

Bad:
It's kind of the Docker way; some owners may not like it ...

Other workaround:
we could make empty_dir not clean the contents of the volume dir, just by backing up and restoring the contents to the host dir (this is just what docker -v does).

cc @dchen1107 @erictune

/cc @dalanlan

@resouer A couple thoughts on volumes-from:

  1. This approach will be racy, since the source container will have to copy data into the volume. The container consuming data from the volume will have to:

    1. Know what to expect in the volume

    2. Monitor the volume and wait for the data to be copied in

  2. As you say, this is a docker-only feature; I'm hesitant to depend on this since ideally all runtimes will be able to have parity as backends to run pods. As an alternative (though I think this would be really tricky), we could try exploding an image's rootfs into a volume directory, but this isn't a great solution.

+1
My humble use case is:

  • configuration of applications would be on a tmpfs-based volume, represented by a data container (Docker support for tmpfs is coming soon). There would be a single pod of this data container per host.
  • application pods use this data pod to read their config. There can be multiple application pods using this single data pod on the same host. Using persistent storage is not OK; the configuration data is read very frequently, so I need the speed of in-memory storage.
  • the benefit of having a single instance is obvious: no need to reserve memory for the same data for each and every application pod.

OK, it is a "bit" different from your use case of having a "container" type volume inside a pod. My use case is about having a "pod" container type, which is accessible by other pods on the same host.
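Within a single pod, the tmpfs piece of this is already expressible with a memory-backed emptyDir; a minimal sketch (per-pod only, so it does not cover the cross-pod sharing described above; the app image is hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: my-app                  # hypothetical application image
    volumeMounts:
    - name: config
      mountPath: /etc/app
  volumes:
  - name: config
    emptyDir:
      medium: Memory               # tmpfs-backed; shared only by containers in this pod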

Another +1 for this idea. A "container" volume type would simplify our current workflow of using a shared emptyDir volume that has to have shared data copied to it, and the consuming containers synchronized somehow.

@saad-ali another for the queue

+1, I have just needed something like this as well.

A couple thoughts / ideas:

  1. We should decide whether this feature would depend on the configuration of the host; #8702 deals with generalizing image representation in the API; I'd prefer not to make a docker-only feature if we can avoid it. I'm not sure that we should go as far as promising to support whatever image format a user wants to use regardless of host configuration, but there is definitely some implicit linkage potential between this feature and what container runtime you're using, and that should at least be made explicit.
  2. I don't like the idea of containers having to synchronize on the data to avoid races. I think if we move forward with this feature the data should be available in the pod when the pod is started. To accomplish this with docker, I think we would need to run a container, docker copy a path out of the container onto the host, and delete the container. Maybe there's more complicated stuff we can do to cache things to avoid doing this each time we start a pod.

@pmorie

  1. yes, I agree with you. Docker-specific features are not good. At least, we need to decide that based on containerRuntime and warn the user
  2. actually, copying data out is what I was originally trying to do in #12526, but for now Pod doesn't support one-off tasks well: a) an RC will continually restart the container, b) restartPolicy is not container-specific.

So, as restartPolicy is Pod-level, that reminds me: if we could allow a specific "task" container to run in a Pod, things would be much better.

What do you think about having a "container manager" (e.g. Docker) specific part in the pod manifest? So, the owner of the pod descriptor could define here container manager specific options, and the task of kubelet would be to deliver those options to the container manager when the containers are created. A kind of transparent tool for delivering container manager specific data through the Kubernetes framework.
It may help in other cases, too. E.g. "incubation" of new container manager features. Once such feature becomes a kind of standard for all container managers, it can be moved from this transparent part to the structured part of the pod descriptor.

This issue is how to create a convenient, extensible way to get data into volumes. Persistence of local data is another issue -- #7562. The case of using --volumes-from just to keep track of a shared volume while it's in use isn't relevant to Kubernetes.

I think there are at least 2 cases we could support here:

  1. Passive containers, where the image is pulled and the file system (the whole thing or parts of it) of the unpacked image could be mounted directly (i.e., not copied), as readonly. Just (ab)using images as a generic data delivery mechanism. I have no idea whether Docker users are doing this or not.
  2. "Setup" containers that are executed prior to executing the other containers of the pod, for the purpose of populating volumes.

Some issues to work out with the latter:

  • Retry behavior -- for instance, distinguishing permanent from retry-able failure
  • Update behavior

    • Do we allow the setup container to be updated?

    • If so, does that kill all running pod containers? Or just ones depending on the volume(s) it writes?

  • What the execution context is, such as allowed security context, secrets available, other volumes visible, what happens to logs, etc.

    • Answers would be simpler if it were just a container with a simple dependency model (i.e., all setup containers concurrently or in an arbitrary order, then all ordinary pod containers if/once all of the setup containers have succeeded).

    • However, not knowing where the data is written (e.g., which volume), whether the container is safe to execute on non-empty volumes, whether it has other side effects, etc. could make retry and update semantics harder, and would decrease isolation between the setup container and the rest of the pod (e.g., potentially exposing secrets).

  • Ideally, any existing storage backend -- EmptyDir (tmpfs or disk), PD, NFS, etc. -- would be usable. For anything other than a fresh EmptyDir, the container would need to be safe to invoke on a non-empty volume.

    • This suggests that the mechanism shouldn't be represented as a new volume type, but possibly as a volume initialization hook.

    • If the whole pod were visible to the container, it would make more sense to represent the container as a pod hook or special container property than as a volume hook.

  • Surfacing errors (and maybe progress?)
  • Provenance, auditing, etc.
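The "setup container" idea sketched above is what later shipped as init containers, which run to completion, in order, before the app containers start. A minimal sketch of populating a fresh emptyDir this way - the asset image and paths are hypothetical:

apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  initContainers:
  - name: populate                 # runs to completion before the app containers start
    image: my-static-assets        # hypothetical image carrying the data (must include cp)
    command: ["cp", "-r", "/src/assets/.", "/data/"]
    volumeMounts:
    - name: assets
      mountPath: /data
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: assets
      mountPath: /usr/share/nginx/html
      readOnly: true
  volumes:
  - name: assets
    emptyDir: {}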

@bgrant0607, #1 is a pretty common use case (at least for us). We build a web application container that provides both a server component as well as static assets. Ideally, we would be able to initialize a read only volume from that same image so that an nginx container can serve the assets.

This is probably similar to one of the use cases for the git container. However, it's handy to use docker images as the only way to distribute artifacts, as well as _not_ check in minified css / js.

Currently, we accomplish this in one of two ways:

  1. Copying files in on start from another container in the pod.
  2. Installing nginx in our image so that the same container can be used either for static assets or for application server.

Cheers!

@kjvalencik How are you achieving your first solution right now? I am trying a similar solution but it doesn't appear as though the container holding the files is starting first, so the container that needs the files is failing to start because it cannot find them.

@scrisenbery A bash start script that waits for the files to exist. Example:

_Container providing files_

# Copy the files to a temporary shared folder
cp -r /src/files /mnt/shared/files_tmp

# Rename the folder so that all files are available atomically
mv /mnt/shared/files_tmp /mnt/shared/files

_Container that requires files_

# Sleep until the shared files exist
until [ -d /mnt/shared/files ]; do sleep 1; done

# Start the application
./start.sh

@kjvalencik Perhaps your experience is different, but I could not get that solution to work reliably. Often the files would get copied over successfully but would never be "seen" by the consuming container.

@kjvalencik Currently we are using the containers entrypoint (it is logstash in this case) and sending it an argument to find the config file in the mounted directory.

Part of the reason we want config files separated from the main images is so we can maintain them separately and utilize the official images available on the Docker Hub. If we have to bake in a script to the main container anyway it defeats the purpose.

@scrisenbery You can still use the default image, but you need to provide your own entrypoint that then calls the default. Untested example:

kind: Pod
apiVersion: v1
metadata:
  name: logstash
spec:
  containers:
  - name: config
    image: my-logstash-config-container
    command:
    - /bin/bash
    - -c
    - cp /src/logstash.conf /mnt/shared/logstash.conf;
      while true; do sleep 10; done
    volumeMounts:
    - name: shared
      mountPath: /mnt/shared
  - name: logstash
    image: logstash
    command:
    - /bin/bash
    - -c
    - until [ -e /mnt/shared/logstash.conf ]; do sleep 1; done &&
      /docker-entrypoint.sh logstash agent -f /mnt/shared/logstash.conf
    volumeMounts:
    - name: shared
      mountPath: /mnt/shared
  volumes:
  - name: shared
    emptyDir: {}

@bkeroackdsc I haven't had any issues doing this. The key is to ensure that the files all become available atomically--hence, the cp to a temporary location and then a mv command.

With that said, I would still like to see this feature implemented in order to reduce the boilerplate and coordination.

Although, one of the advantages of the cp method is that you can restart the container that moves files to the shared directory in order to update the data without removing and re-scheduling the entire pod.

EDIT: @scrisenbery Your use case actually more closely matches secrets. http://kubernetes.io/v1.0/docs/user-guide/secrets.html

@scrisenbery Example using secrets.

Create this file in the same directory as your logstash.conf.
logstash-secret.yaml.sh

#!/bin/bash

DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
CONF=$(base64 $DIR/logstash.conf | tr -d "\n")

cat << EOF
kind: Secret
apiVersion: v1
metadata:
  name: logstash-config
data:
  logstash.conf: $CONF
EOF

Create the secret

./logstash-secret.yaml.sh | kubectl create -f -

Logstash pod

kind: Pod
apiVersion: v1
metadata:
  name: logstash
spec:
  containers:
  - name: logstash
    image: logstash
    args:
    - -f
    - /mnt/config/logstash.conf
    volumeMounts:
    - name: logstash-config
      mountPath: /mnt/config
  volumes:
  - name: logstash-config
    secret:
      secretName: logstash-config

@kjvalencik So instead of having a second container, use a secret that can be mounted and when mounted contains the file I need? It looks promising but how do you update the config file when the time comes?

Secret volumes are created and mounted at pod start. Once the pod is started, they are not tied to the secret, which means it is safe to simply delete the secret and re-create it. Essentially, the workflow looks like this:

  1. Delete the old secret
  2. Create the new secret
  3. Rolling restart of pods that use the secret

cc @kelseyhightower @smarterclayton

@kjvalencik Will the secret work in JSON as opposed to YAML?

Quick note:

We need (at least) 2 things:

  1. mount from another image
  2. run container to populate a volume

I think the config file use case is a good one for this issue though. While a secret does appear to provide the functionality I need, I suspect that isn't really what a secret is meant for.

@scrisenbery: I believe #6477 is meant to cover the config file case well before we hit this issue. I'm actually looking forward to this issue for large blob sidecars: things like .jar files, or Jenkins plugins, etc., that you want to compose with a base image.

Thanks @zmerlynn, I had not come across that one before. It seems like that's still a work-in-progress too, right? So for the moment I'm limited to secrets?

@scrisenbery See my comments about the sidecar mode in my gist: https://gist.github.com/resouer/378bcdaef1d9601ed6aa

And sorry for my late response.

Someone from RH was looking at mount from another image - @rhatdan, who was
that?


We have the ability via atomic mount to mount container images.


When I filed this bug, I REALLY meant that we could "run" containers that have VOLUME definitions and then use those in app containers (as in --volumes-from but able to mount wherever). That was all. Init containers are neat, and we should totally do that, but I filed this as a response to "I have all my data in a container with a VOLUME def in the Dockerfile - why can't I use it". As best I can tell that does not trigger any copies.

Now, that said, it wasn't clear how important that was, and thus it languished and has not had many people asking for it lately.

So data-init-containers, yes, we should.

Do we still need the ability to do volumes-from?

BTW, the ability to mount another image would give us composability similar to what we get from Borg's package manager:
https://www.usenix.org/sites/default/files/conference/protected-files/lisa_2014_talk.pdf

I agree, though nobody is really asking for that, so I am not sure it is
worth much energy at the moment.


I would love it since I have a use case. However, I'm fine with just us
pursuing it for now.


cc @erinboyd

Other use cases:

#16010

https://github.com/flynn/flynn/tree/master/slugrunner, if it didn't already support pulling from URL

BTW, the ability to mount another image would give us composability similar to what we get from Borg's package manager

Joe Beda also talked about this composability in Midas in this lovely blog post.

For our use case, simple layering of static docker images (whether through --volumes-from or Kubernetes volumes APIs) would be fantastic. We often have medium-sized static assets (< 4GB) that get reused by thousands of pods in a read-only fashion. Reusing the docker image cache for these types of assets would be much better than our current system (attaching volumes from snapshots with this data to every worker instance and then mounting that hostDir path into every container).

@philips adding you in as you talked about this in your keynote :) Could you guys commit a resource for implementing this (as a starter) for 1.3?

+1. Need a simple way to use configuration data without having to rebuild images. Using git won't work because it then requires git to be a redundant highly available system. Network and persistent storage is not an option either. I just need to use some prepackaged data that varies between environments and can be updated independently of images. For now it looks like the way to do this is to deploy two images (one storing configuration data) and dump that configuration data into an emptyDir volume for the other image to use.

@markpaychex does ConfigMap not solve that problem for you?

@pmorie It would work for some things but I don't think it would cover all bases (such as xml files from libraries that require initialization from a file and said files are different in different environments).

@markpaychex you can project configmap keys into files, and the API resource is completely agnostic of what you throw into the files - you can store json blobs, yaml blobs, xml, whatever you want.

@markpaychex sorry, meant to link to the docs earlier on ConfigMap, please let me know if this helps clarify:

http://kubernetes.io/docs/user-guide/configmap/

@pmorie Yes that may work I'll have to play around with it. Thanks!

We've also got a need for this, specifically for running nginx & php-fpm. A git repository is not an acceptable solution for us due to permissions on repositories, and we'd rather the actual files be built into an image that can be distributed to save external dependencies on git servers etc.

For now, the copy-on-launch to an emptyDir is acceptable, however not at all ideal as it incurs a start up time penalty as well as disk usage.

Any update on this?

We have a use case for this as well. We have about a dozen images with a shared dependency, and we want to be able to upgrade that dependency without rebuilding all of the images. The dependency consists of many small files in deeply nested folders totaling several hundred MB. ConfigMap would not be ergonomic for our case even if the size restriction were removed. A zip volume or init container would technically work, but any copying or unzipping on startup would add a noticeable amount to startup time.

https://github.com/kubernetes/kubernetes/pull/23666 proposes init containers that can be used to pre-populate volumes. This would eliminate a lot of the need for a static read-only container volume. The only argument I could see for still wanting a container backed volume is that with init containers you'd still need to download once per pod--vs with a container backed volume, it would only be downloaded once per node (since the container is cached by the node)--which would be a nice optimization for very large static data sets.

Alternatively, it's always possible that if we add "host level, reusable
volumes that don't give root access to the machine", the init container can
initialize or reuse those as well (using filesystem locking).

@saad-ali I believe the proposals here are superior to init containers for the purpose of initializing volumes. See https://github.com/kubernetes/kubernetes/pull/23666#discussion_r59081733 and other comments in that PR for more details.

Volume types specifically for images can never be effectively implemented
with init containers (just prototyped), so this is still a valid need.


Additional Use-Case: Our data science team has trained models and other large datasets that need to be bundled with their app. They don't change too frequently. We'd like to be able to snap off a docker image with their static content and make it available to multiple ML/NLP microservices in our cluster as static content rather than manage automation that downloads it at runtime to the filesystem.

PS - @saad-ali Sorry, there's a lot of ancillary discussion and argument here... Is it accurate that there is an intent to implement data containers as volume types in v1.4, as indicated here, or am I misunderstanding?

I don't believe it is on the 1.4 list. Image volumes are hard because they
require support from the underlying volume.


Is it accurate that there is an intent to implement data containers as volume types in v1.4

No, this request is not on the 1.4 list. There are too many unresolved questions. Even a simple read-only container volume, like @smarterclayton mentioned, is non-trivial to implement (it'd require a good amount of thought on how to support different container runtimes, etc).

@alph486 Can I ask a basic question? If you're already snapping off a docker image, why not build it into a pod? E.g.:

apiVersion: v1
kind: Pod
metadata:
  name: ml-app
  labels:
    app: mlapp
spec:
  containers:
    - name: static-model
      image: my-static-model-image
      volumeMounts:
        - name: model-storage
          mountPath: /data/model-storage
    - name: model-processor
      image: my-model-processor-image
      volumeMounts:
        - name: model-storage
          mountPath: /data/model-storage
  volumes:
    - name: model-storage
      emptyDir: {}

Then, when my-static-model-image starts, copy the static content to a volume, and presto, it's shared.

@aronchick Yes, that's the plan. Since there's potentially a lot of data, I was going to have a script write a symlink to whatever data that pod needs, which I think will work just fine. It would just be nice to have the ability to reference a volume another container exposes, is all.

@aronchick @alph486 That pattern seems good to me. It's a novel use of existing primitives, and looks like it does the job.

Would volume containers offer much benefit over @aronchick's pattern?

The idea was to avoid the copying and the manual-ness of it.


While that is copying, it's not (really) manual, is it?

Crazy idea - since containers share memory space, would a symlink be available between containers?

I've been following this issue for over a year now.

In @aronchick's recent example

https://github.com/kubernetes/kubernetes/issues/831#issuecomment-23300722

I read a comment about copying but cannot see any executable code that does the copying, so I don't fully understand how this workaround works. Maybe there is an order of operations in the consecutive mounts, under the covers, that affects copying? Would you please clarify what copying means here? Should the reader assume the model-processor container, or perhaps an init container, manually copies content in its entrypoint?

Thank you.

@aronchick symlinks demand shared mount space, which we do not have.

@ae6rt What David proposed was:

when my-static-model-image starts, copy the static content to a volume, and presto, it's shared.

Yes it works, and init-containers are even more elegant. It's still a little clunky for my taste. The real problem is, as much as I'd like to solve the design, it's significantly complex and only a handful of users have asked for it.

Am I understanding from this that using a sidecar that contains data and shares it by writing a symlink to a directory on its file system will not work when accessed from the "main" (other) container in the pod?


Correct. A symlink will be resolved relative to the reader.


With CRI looming, as much as I think this is a solution that most people need, it complicates the attempt to implement something like this. @thockin / @dchen1107 is there a feature freeze in the kubelet until CRI reaches alpha?

Not a freeze per-se, but every new change needs to address how it interacts
with CRI


This one may be ugly there because the volume manager would need to create
containers that could then be mounted into other volumes, and the container
GC would need to handle their existence.

This is coming up enough from end users on our side now I'm tempted to
increase the priority for 1.6 to try for it.


It will require some acrobatics to do sanely.


Indeed.


With CRI looming, as much as I think this is a solution that most people need, it complicates the attempt to implement something like this.

What is CRI?

CRI is the kubelet-internal API which abstracts container runtimes, and
allows swapping in rkt, docker, OCI, hyper, etc.


cc @verb

Derek:
https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/container-runtime-interface-v1.md


Another use case:

  • Users can log in via SFTP and upload/download media to the website
  • PHP source code is baked into the Apache container via Docker ADD
  • SFTP is in its own container so Apache does not have to run privileged
  • At some later date the "media" volume will be backed by an NFS container, once that is more stable.

Pod

  • Container Apache

    • Mount "media" PD volume

  • Container SFTP (runs privileged to bind-mount into the SFTP user's home dir at startup)

    • Mount "media" PD volume

The above scenario works great for media; however, we now have a new requirement: a script will run on Apache that upgrades PHP source code. Since the source code is not attached as a volume, we need a way to sync this back to our Git repo, for example:

1) Run upgrade script on Apache (we have a vendor that requires this type of upgrade).
2) Login to SFTP and download the changed "source" files to a local git repo.
3) Rebuild Apache container and test.

Workaround:

We can mount an emptyDir in both containers and copy source from Apache into the emptyDir volume after the upgrade, making it available to the SFTP container.

Since we're running the SFTP container "privileged", we might be able to mount into the Apache container's /var/www/html/site folder; however, I'm not sure how to do this.

This is an unusual use case, but it would be very useful when doing development on a remote Kubernetes cluster, as we do: this is a massive site and it's not practical to do dev on a laptop, so we bring up a GKE cluster, do our dev, then tear down the cluster and deploy to production.

Wow, long thread, I tried to read the whole thing, but I may have missed something along the way.

So what is the best way to store persistent data that should never go away? As was said above, I have a database that stores its actual data in a volume container. I need that data available before the DB container starts, and I need to be able to modify it and have it saved back to the original container in real time for snapshots, backups, restores, etc.

@justechn If I understand your requirements, a StatefulSet should do what you want. Depending on the cloud, it will provision a PV that is tied to the database instance. On GCE, that will be a normal PD disk, which you can snapshot, back up, etc.

+1

This would be useful for building FaaS, also:

https://github.com/kubeless/kubeless/issues/148

And for data that's too large for ConfigMap.

From today's community meeting: It would be great for someone to prototype this as a FlexVolume

I'd be interested in working on this. It seems like FlexVolume is more of a generalized API to support third-party volumes? There's still the open question of how the scheduler handles containers used as volumes.

@bgrant0607 or Anyone else interested in a proof-of-concept? please see https://github.com/dims/docker-flexvol

@dims Neat! I assume I could populate that volume using ADD /my-files /data-store?

@bgrant0607 yes of course, updated the sample container and added some traces - https://github.com/dims/docker-flexvol#container-image-with-pre-defined-volume

Very cool @dims

(edit: names :) )

@bgrant0607 @thockin @alph486 - Added support to make a copy of the entire container image available to the pod, per offline discussion

@dims Example use case: #16010. And FaaS, and app/dependencies/runtime separation more generally. And app data too large for ConfigMap.

Also very useful for the "Immutable Configuration" pattern (e.g. see https://github.com/k8spatterns/examples/blob/master/configuration/ImmutableConfiguration/README.adoc how it is currently implemented via init containers).

If we don't want to rely on docker create/export, the other options to explore are

  1. posita/dimgx
  2. containers/image directly or using skopeo CLI
  3. Explore image piece of containerd (either as library or via grpc)

Though No. 2 does not seem to support squashing images

Here is a use case - any thoughts or comments appreciated.

We're making a Continuous Deployment/Integration tool which can run as an application in k8s. It needs to be able to talk to a docker daemon to execute builds and push them to an external repository.

Due to the security concerns of 'dood', we're using 'did'. I could theoretically extend the 'did' image and install the software required to run the application on the one image, however this would mean I need to maintain the extended image.

What would be preferable is to have two images, one for 'did', one for the application, but be able to run docker commands in the 'did' container from the application container.

I could do this using volumes-from and mounting /var/run/docker.sock into the application container from the 'did' container.

I haven't figured out the workaround yet - I will be experimenting with mounting the same emptyDirs; however, it would certainly be easier if I could specify something like volumes-from in a k8s spec.

Update: one workaround I thought of, inspired by applatix, was to run the application from within the DID container and mount the socket and relevant binaries from the DID host into the application. After some experiments with running docker and mounting, I found that the binaries would not work (i.e. "docker: not found"). I've just re-read jpetazzo and noticed the warning about this no longer being reliable, so that aspect of the approach is probably not viable - I'll be looking into the docker API instead, but still mounting the socket.

@renewooller. This feature wouldn't solve your problem because it would only provide files and not dependencies.

Since you have multiple entry points, what you explained is exactly the correct decision. Mount an emptyDir to share the docker socket across containers in the same pod.
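A minimal sketch of that suggestion, assuming hypothetical images dind-image and ci-app, and assuming the Docker daemon in the dind container writes its socket under the shared /var/run mount:

apiVersion: v1
kind: Pod
metadata:
  name: ci
spec:
  containers:
  - name: dind
    image: dind-image              # hypothetical docker-in-docker image
    securityContext:
      privileged: true
    volumeMounts:
    - name: docker-sock
      mountPath: /var/run          # daemon writes docker.sock here
  - name: app
    image: ci-app                  # hypothetical application image
    volumeMounts:
    - name: docker-sock
      mountPath: /var/run          # app talks to the shared docker.sock
  volumes:
  - name: docker-sock
    emptyDir: {}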

I really like the idea of a docker container as a volume from the standpoint that it lets you reuse all the architecture for distributing containers: scalable docker registries, image caching, etc. It also lets you separate concerns. You could do a scalable static website by having a volume container with your web content in it and a main container running nginx. Each pod would then have two containers as a scalable unit, each updatable separately.

@renewooller did you try the flex volume? https://github.com/dims/docker-flexvol

No, I missed that. Thanks for the pointer - it's very interesting. :)

Hi,

I'm adding my use case as an example. docker-flexvol seems to be what I need. Using emptyDir with copy should also work, but it's nasty.

We have a few single page front-end applications: compiled JS+CSS and other resources.
I would like to serve these apps using a single nginx server, replicated. No need to use more.
I would like to consume the apps and benefit from things like easy rollback to a previous version. This use case is not nice to solve by copying files; copying might also take time and use 2x the space.
I would also like to use the exact image that upstream provides (with matching hashes and gpg signatures and such).

The idea is to create a pod with a single nginx container and mount the files from the docker containers as volumes.

I imagine that I am able to deploy a new version for one of my applications by changing the volume somehow and then trigger a pod restart or something.

So if I have an image called my-container-image in my private repo:
- name: test
  flexVolume:
    driver: "dims.io/docker-flexvol"
    options:
      image: "my-container-image:v1"
      name: "/data-store"

and update the pod to contain:

- name: test
  flexVolume:
    driver: "dims.io/docker-flexvol"
    options:
      image: "my-container-image:v2"
      name: "/data-store"

I expect Kubernetes will do a rolling update to the new version of my app with my-container-image:v2
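For concreteness, a full pod sketch of this idea - the driver name and options follow the docker-flexvol README quoted above, while the nginx mount path is an assumption:

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: app
      mountPath: /usr/share/nginx/html   # assumed path nginx serves from
      readOnly: true
  volumes:
  - name: app
    flexVolume:
      driver: "dims.io/docker-flexvol"
      options:
        image: "my-container-image:v1"
        name: "/data-store"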

@dims: Is this use case supported by your plugin? How safe is the plugin, and could we see it upstreamed so it can be installed via kubeadm? I think these are questions that should be on the project issues list.

@ieugen I believe yes, you should be able to do that. It's a pretty simple shell script, so feel free to try it and let me know if you see issues.

As for upstream kubeadm etc. if someone wants to take the initiative, i can help.

Thanks,
Dims

Would be interested in getting this into a helm chart somehow... I saw containerized mount utils merged, but they say it doesn't work with flexVolumes. Maybe something like:
https://github.com/openstack/kolla-kubernetes/blob/master/helm/microservice/ceph-rbd-daemonset/templates/ceph-rbd-daemonset.yaml

with shared mount namespaces?

Alternatively, if we could get a statically linked jq, we might be able to just slide the two files directly onto the host....

Or, I guess we could split the difference and just run jq in a container... docker run -i --rm jq....

It's been a month since I started studying how to use Kubernetes (properly). It didn't take me long to find pretty much all the use cases mentioned in this issue on my own.

In all cases it comes down to wanting to expose static files to more than one process/container.

For us that means:

  • Expose dir from container A (php runtime with static frontend) to container B (generic nginx) to serve only the static (css/js) files.
  • Expose dir from container A (static js frontend) to container B (generic nginx) to serve those files.

As I understand it, the facilities that would allow this functionality are currently only properly supported by containerd(?) in the form of volumes, but it would be very helpful to have...

Some cases could probably be solved by hiding the required copy action and ensuring it is successful.

Any other (simple) solution that would allow packaging static files as a single artifact, and then using it inside of a pod/container without copying it (with postStart commands) each time, would probably also work for most use cases mentioned in this issue. The thing is, with pipelines to build containers and registries to hold/version them all pretty much figured out, containers are an extremely handy vehicle for this (and in my opinion not outside their scope; it helps with the single-responsibility principle).

Anyway, just my 2 cents.

Expose dir from container A (php runtime with static frontend) to container B (generic nginx) to serve only the static (css/js) files.
Expose dir from container A (static js frontend) to container B (generic nginx) to serve those files.

I have found that recent Docker multi-stage builds pretty much allow you to do this at the docker build level.

Multistage builds are not the same thing. Multistage builds let you do stuff like build, throw away the build environment, and copy the built artefacts to the final container. In the end you are left with one container, which in this situation has, for example, nginx and your static files.

In the spirit of k8s composability, though, the desire is to have one container for nginx that can be independently updated from a second container storing your static files. They are combined together at runtime via k8s pod semantics. This is what the issue is about.

Yes, I agree docker images as volumes would be nicer. I'm just leaving a clue, for whoever is reading this bug, on how to work around the missing feature in the meantime.

+1

The recent ephemeral csi volume support along with https://github.com/kubernetes-csi/csi-driver-image-populator should make this possible. :)
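A sketch of that combination - the driver name and the image attribute are taken from the csi-driver-image-populator README, so treat them as assumptions to verify against the project; the asset image is hypothetical:

apiVersion: v1
kind: Pod
metadata:
  name: static-site
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: assets
      mountPath: /usr/share/nginx/html
  volumes:
  - name: assets
    csi:
      driver: image.csi.k8s.io           # per the project README (assumption)
      volumeAttributes:
        image: my-static-assets:v1       # hypothetical image unpacked into the volume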

@kfox1111 Cool, thanks!

BTW, Google's internal composable "package" mechanism is described in this talk:
https://www.usenix.org/sites/default/files/conference/protected-files/lisa_2014_talk.pdf
which is mentioned in the SRE book:
https://landing.google.com/sre/sre-book/chapters/release-engineering/
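For readers arriving later: a native image volume source was eventually proposed upstream as KEP-4639 (alpha in Kubernetes v1.31), which matches the original request in this issue. A rough sketch, with field names per the KEP (verify against current docs before relying on them):

apiVersion: v1
kind: Pod
metadata:
  name: tools
spec:
  containers:
  - name: app
    image: my-app                  # hypothetical image without a shell
    volumeMounts:
    - name: busybox
      mountPath: /opt/busybox
  volumes:
  - name: busybox
    image:
      reference: busybox:latest    # OCI image mounted read-only as a volume
      pullPolicy: IfNotPresent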

+1
