GitLab 'Auto DevOps' Changes Everything - Dev & Ops Harmony - Confessions of a middle child
- Publication date: 2018-07-09
- Publisher: Internet Archive
Special Topics talk given July 2018 to Archive Engineering.

SLIDES
We show a prototype of the just-released Auto DevOps feature from GitLab, an extension of CI/CD that requires only a code repository and a Kubernetes cluster to "do everything" right from a commit & push.
NOTE: the audio is really rough in spots -- hand-corrected captions are available when playing. Find the [cc] button on the control bar and select 'devops'.

Transcript:
Guess it is fullscreen.
No it's fullscreen.
it's on my end
So Tracey's been working quite a bit on
figuring out how to automate a lot of our
deployment and CI pieces. As you guys know,
she made the `[dev]` box for Petabox happen,
which allows people to do development outside
of having to be fully logged in to the cluster,
or have full access to the cluster
and ssh keys. Tracey's also been working to get
private credentials out of the Petabox tree and
moving along how things happen on Continuous Integration
for deployment.
So this is kind of a culmination of many things
and represents a direction that might be, that
we believe may be useful going forward.
Sound good? Yes. OK.
OK so. Sometime
late last year GitLab came up with a new feature
called Auto DevOps, or really started promoting it.
This looks really exciting for us, for a lot
of reasons, which is the whole point of this presentation.
If you want to run these slides remotely, you can
hit this link here.
It's in the normal git.archive.org, under project
'ia', repo 'auto-devops'.
So the motivation here is to
minimize pain points
for both Ops and Devs.
We have a lot of single instance VMs here
that are hard to update, hard to deploy.
Different teams in different groups have different
ways of pushing things out, some better than others.
It's kind of a diaspora out there.
And we're thinking, "were we to use more
containerization with Docker and Kubernetes",
we'd have more nodes in a common cluster,
or sets of Kubernetes clusters.
As opposed to, individual one-off boxes
that are sometimes underutilized.
More motivation: we can take some of these
single instance websites now and make them
sort of instantly 'highly available'. We all know
with the Time Machine, once you have 2 copies,
things get a lot better. So if one node goes down
or reboots, or a data center is out, all of
these things get handled automatically.
We also get
industry standard
Docker registry, code provisioning, Docker provisioning,
pushing out, roll forward and rollback.
There's all sorts of stuff,
and even more coming.
What GitLab's Auto DevOps has now is
only scratching the surface - there's a huge roadmap.
One really attractive feature is that
we're all actually already using
GitLab. So we already have users with login
accounts and we all know how to use it.
That means we can already stick with their
roles and users and whole system
instead of learning another system.
No offense to HR,
but think about every time we have another HR
site we have to get another user/login
and get used to its thing;
this is a single system in one place.
So not to bury the lead,
basically this is the Summary right here.
So, Ops comes up with a Kubernetes cluster.
As Devs, we make some kind of a Repo.
It could be Richard's awesome
WayBack Time Machine thing that was
at a Hackathon
that he eventually fleshed out a little more.
We make a repo,
we commit the changes, and that's kind of it.
We'll think 'baloney! what does that even mean?'
We're gonna go through -- that's the whole
end cycle of what we have to do.
The way they pulled this off is by using
the evolving industry-standard best friends for coders:
Git, Docker, and Kubernetes. If you
want to see a little more background about this
GitLab raised twenty million from Google Ventures
and other partners, and they brought on 2 or 3
more board members. Huge thing, really ambitious.
All just kind of went down.
This is a little hard to read. They started
kind of here, what GitLab has always done.
Dev: Create. Test. Verify.
Now trying to move into: Release. Configure. Monitor.
Back into Planning.
Create. Verify. Package.
So a loop, where things are constantly going.
If we think of CI as Continuous Integration -
Development, pushing and Testing.
and CD as Continuous Deployment - then
'Beyond CI/CD' is what their vision is.
So you can see their slides linked there, if you
want to see their presentation.
We already know how to use Git repos
with access control for readonly, for read/write.
Already use repos. And use branches.
And all you really have to do, if you stick
with the vanilla setup,
kind of just go with what
they're suggesting, all you have to do
is put a Dockerfile in your top repo, or you
can use something called 'buildpacks', which I don't
know a lot about -- that's from Heroku or something
like that. That's another option.
By default they expect - you can change it -
port 5000
as your WebApp port - assuming a webapp - you don't
have to do a webapp.
But those are the assumptions.
In this case, a little nodeJS experiment from
a few years ago. And this is the whole thing.
Starts with an Alpine linux 'node'
which has node, npm, yarn
and a few other things.
It's really a modern version of that.
We copy a few JavaScript files into this
this folder called 'app'. I try to avoid running as
'root', so we're going to switch down to user 'node'.
When the thing starts, it's going to make its
working directory '/app'.
Set up some basic needs like supervisor and a few
other kinds of things, and then say when the
Docker container runs, you want to run 'Main.js'.
It's really that simple. If you literally commit that
and some JavaScript code, everything just happens.
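For reference, here is a minimal sketch of the kind of Dockerfile being described -- the exact file isn't shown, so the file names and the install step are assumptions:

```dockerfile
# Alpine-based 'node' image: a modern node, npm, yarn, etc.
FROM node:alpine
# the working directory will be /app
WORKDIR /app
# copy the JavaScript files into /app
COPY . /app
# set up basic needs like supervisor and other dependencies
RUN npm install --production
# avoid running as root; switch down to user 'node'
USER node
# Auto DevOps assumes the webapp listens on port 5000 by default
EXPOSE 5000
# when the container runs, run main.js
CMD ["node", "main.js"]
```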
So we'll step through this a little bit.
It makes
this commit pipeline kind of thing.
So this is that repo
that is running on GitLab v11.
On a sort of demo site.
And this is the basic JavaScript -
it's just some files.
We don't need these files right now,
but they're there for now.
Put in a really simple JavaScript linting test
to show testing. That's really about it.
And some JavaScript packages.
The repo - well sort of -
we'll show that pipeline in a second.
But that's the whole experiment.
So we search for .. never know what
we're going to get at the Archive.
This is basically showing some basic tiles
and whatnot, nothing too fancy.
And now let's say we want to
go ahead and edit it.
They've got a builtin IDE editor.
This is going to show Kubernetes
setup in the background.
I'm gonna change
the little Search banner
and commit that.
And you can see stuff sort of already
start to happen in background. All of these
white rectangles are things that are firing
off and .. it's the 'guts'
of the pipeline. You're probably familiar
with these pipelines.
Automatically - there's that Search commit.
And it's running the build automatically.
So it's popping in here and running
the Dockerfile automatically and configuring things.
It's fired up a few different pieces to
pull that off - these GitLab Runners
that you're familiar with.
That'll finish in a second.
We'll come back to it.
But the new stuff -
what we normally see now is 'build' and 'test'.
But now there's some extra targets built-in.
If we do GitLab Enterprise, there's some more.
There's this 'Production' stage
and 'Performance'. In 'Production' is where
we'll automatically roll it on out, as long as
all testing and everything else went OK in building.
And then 'Performance' as well.
I think David really likes the CI Runner part,
right? I missed my cue
for being a ringer
in your demo, didn't I?
It's these Runners that run off
using Docker-in-Docker.
Can you put the whole thing,
show the rest of that?
This guy? OK, so it already happened.
Only been up for a couple,
less than a minute.
and they've already launched dynamically,
running in parallel, the tests
and the builds. The first time I
saw that, my jaw hit the floor.
What happens if it fails the tests?
Then it won't do 'Production', and won't do 'Performance'.
You'll get an email on that.
Emailed, so don't have to be monitoring this?
Ya. This is just for the nerds. Right,
I just wanted to make sure there is a simple link or protocol.
Ya. In fact, everything we're going to do today,
we won't ssh in, won't be logged in anywhere.
All gonna be through web browser.
Or just this thing in the background, just for fun.
OK so the job succeeded - the 'Build' job.
The Pipeline .. now it's off running.
These guys have already finished the 'Test'.
One thing that's kind of neat about this testing
schema? Again, if you just stick to the defaults,
it will try to suss things out for you. So it actually
runs something called 'Herokuish'.
Which is an homage / lawsuit waiting to happen.
Kind of kidding.
We're on to Heroku buildpacks, where it tries to figure out
what your repo is based on, then tries to actually
kick off tests for you automatically.
The first time I saw that I was like
'Wait, it already figured out it was NodeJS!'
figured out it was gonna try 'npm test'. It just did that!
So when I hooked in small tests, it just ran it.
And basically ran 'eslint'. The first 5 or 6 times
it actually picked up real things, and actually
bailed because I had bad lint. We can come back to it.
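For reference, the hook Herokuish picks up is just the standard npm 'test' script; a minimal sketch of what that might look like in package.json (the demo repo's actual scripts aren't shown here):

```json
{
  "scripts": {
    "test": "eslint *.js"
  }
}
```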
OK, so that's sort of showing
the Pipeline area thing.
We already saw this, but this just shows
some of the default pieces
that are involved to pull this off.
There's different tech that I'll mention.
There's quite a few things. You might think
that's a lot of things for one webapp.
Which it is, to some degree.
But the entire system can support multiple webapps.
And all that comes prepackaged or did you have to ..?
All this comes prepackaged,
all pretty much out of the can.
There's a few minor growing pains, right now.
They launched possibly
just a couple days early, so we're definitely going to wait
for 11.1. But they've already got a bunch of fixes in.
Including a patch I put in (kind of excited about).
So we're running normal GitLab runners.
We've already seen that - but now running _inside_
the Kubernetes cluster. Don't need our own CI box.
Don't need our own docker (registry) box.
Herokuish thing is really cool. Does code security checking.
Does license checking. So if you had a bunch of
different MIT and GPL and this and that,
and they were incompatible, it would actually test
and figure that out for you - kind of neat.
It's got this auto-deploying
versioning roll back and roll forward.
You can do
a customized staggered roll out, so you can
roll it out in pieces. You can do 'Canary'
and roll it out and see. If the Canaries aren't squawking,
it keeps going. But if they start squawking, then rollback.
All that's going to be automated.
And the tests for failing on the canaries,
is configurable?
Yep. Often it's performance based.
Did it respond to top 10 pages? and fast enough?
And if not, 'Whoa, something really wrong'
Stops rolling out. One of the coolest things
that took me awhile to realize, 'What?'
is this wildcard DNS thing. This is the other
piece of magic glue here. So, what they do is
take your project repo - in our case it's usually
ia/petabox. They're automatically going to make
a name called: http.. ia-petabox..
and then you pick a domain name
So anything in .dev.archive.org
will just auto-resolve
and point to this internal name resolver and
Ingress inside Kubernetes, inside this whole system.
So all names will just resolve there, and then if it
can actually resolve them and hand them out for you,
it will do that. And if it doesn't, you know..
http://el33t-hax0r.dev.archive.org
will just give you a 404 or 500.
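As a rough sketch of that wildcard DNS idea, in zone-file terms it's a single record (the address here is made up) that points every name under the domain at the cluster's Ingress:

```
; anything under .dev.archive.org resolves to the Ingress / name resolver
*.dev.archive.org.   300  IN  A  10.0.0.50
```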
It also allows for auto preview.
This is super cool - super super cool.
So if you make a merge request, and you come up
with a branch and a merge request and submit.
It will automatically make a brand new name based off of
that branch, run all the tests,
auto-deploy it in a container, and send you that name.
And then people can look at, try it, and
you can refine it, make commits, keep going, fix things.
Maybe marketing says this.. certain something.
We don't have marketing, but.. someone in the group decides
this doesn't work, and then you refine it and then once
you merge it, it automatically pulls all that out, removes
all the resources, kills the name, and pushes to production.
Does it stay around the entire time, before the merge?
Yes, so you can keep going, which is huge for us.
So if I don't make any changes, it stays around?
Yep.
Oh, we need staging data. Yah.
We've already seen the pipelines - example of pipeline.
Sets up a big standard thing.
It's got a built-in Docker registry.
Which makes for a lot more transparency.
You can also.. there's a nice little GUI
so you don't have to be on the command line.
Where you can see and manage the space.
So if I click this, then look at experimental -
there's only one repo now.
And there's been a few different deploys and
roll back/roll forward kind of things
so you can actually see them.
Oh, am I running out? Thanks.
There's little delete buttons here, so if you
want to manage space, you can go ahead and do that.
Another really cool thing is remember we
talked about user accounts and roles and groups?
Well, it's already built-in. And what's super cool?
You can just do a git login to whatever proxy/port we're gonna use
which is locked down with https and certs.
You can just log in with your git credentials.
Which is really cool, and if you have two-factor auth
on, then you can just use a token.
You can just log in that way,
and start pushing/pulling to the repo.
Easy peasy!
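A sketch of what that looks like from the command line -- the registry hostname and port here are placeholders, not our real setup:

```sh
# log in to the built-in registry with your GitLab username and password
# (or a personal access token instead of the password if two-factor auth is on)
docker login registry.example.archive.org:4567

# then push and pull project images directly
docker pull registry.example.archive.org:4567/ia/auto-devops:latest
```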
We're about halfway through.
This all happens through an Ingress or load balancer.
Different ways to pull this off. Ingress is probably one
of the best ones. If you want to see them compared
there's a really nice high-level Medium article about
the different ways you can take traffic in.
Basically, Ingress is kind of the nicest one, and that allows
multiple domains, multiple URLs and all sorts of things to go in
and hit this ingress, and the ingress figures out where
to farm you to. This is the load balancer?
Yah, it's like a load balancer on crack.
It makes, Jonah, you've seen this,
I think you're reasonably pleased with this?
Yah, it's good for, especially for this kind
of testing, it's perfect, because the nginx
reverse proxy will ...
Scaling it, to a full situation is more complex, as
usual. But, it's a solid nginx-based ingress controller.
Right now we're using NodePort because there's some..
David and I are trying to figure out what's going on
with v11 and self-installs.
A lot of this is sort of..
since Google Ventures is kicking in a lot of money, you can
imagine they kind of encourage you to just use
Google Compute Engine, GCE. So, people like us,
and there are a lot of people like us who self-host
it's a little less trodden, but it will get there.
Again, you can set up the number of Production Replicas,
or load balancers, if you like. It defaults to 1,
but if you want to do like 10, a farm of 10 webheads, for
your app, you can do that, where you can have auto-scale.
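Per GitLab's Auto DevOps docs that's driven by a CI variable; a minimal sketch (treat the exact variable name as something to double-check against the version we run):

```yaml
# in the project's CI/CD variables or .gitlab-ci.yml
variables:
  PRODUCTION_REPLICAS: "10"   # scale the production deployment to 10 webheads (default is 1)
```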
And again, you have this Canary thing to help
you with auto testing and rollout, rollback
It has a ton of stuff built in. It's got
performance monitoring built-in using Prometheus
and various charting kinds of things. It will
give you by default, not just the top page,
but a bunch of pages off of it. The backEndTime,
the firstPaint time, firstVisualChange time
and it uses this thing called sitespeed.io for that.
It does healthchecking, auto-restarts, charts
and stats, error rates, latency and throughput.
Here's an example of this Prometheus thing.
It's a little hard to see, this is your http error rates,
this is your latency, and this is your
throughput in different ways with different 200s, 400s, 300s,
500s and things like that - error statuses.
You get all that out of the can for free, and they're
just going to keep going like a steamroller, and not stop.
Just thought I'd mention some of the tech, in case you see it.
They seem to be doing a really good job
of picking really good componentry. Some of which I'd never
heard of before - all really good and interesting.
Obviously we all know Kubernetes. Using something called
Helm and Tiller. Helm is something that
runs your Kubernetes fleet -- so you're at the helm.
Prometheus, not sure how that analogy works.
That's for charting and graphing,
and things like that.
Herokuish for testing. SAST for automated security
testing. Sitespeed.io for speed and all that.
And of course, Postgres for persistent storage
and things like that.
One of the ways that they pull
all this off is Docker-in-Docker
which I hadn't actually really used before,
and so my mind esploded.
When you run Docker in privileged mode,
you (can) make Docker, run Docker, .. run Docker.
It keeps going.
So you can do things like, have GitLab, that
opens up and runs everything in Docker.
So I think we're not doing that - we can run
GitLab Omnibus - which is the whole shebang,
inside one single Docker container with just 3 ports ported out,
and then you can basically just farm out all sorts of
other things. It's the same kind of Docker turtles all the way
down in Kubernetes cluster - Docker Docker Docker Docker.
Really cool stuff and it's all gated inside
the Kubernetes cluster.
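A minimal sketch of that Docker-in-Docker pattern as a GitLab CI job -- this is the standard docker:dind service setup from GitLab's docs, not our exact pipeline:

```yaml
build:
  image: docker:stable
  services:
    - docker:dind                   # a privileged Docker daemon running inside a container
  variables:
    DOCKER_HOST: tcp://docker:2375  # point the docker client at the dind service
  script:
    - docker login -u gitlab-ci-token -p "$CI_JOB_TOKEN" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
```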
I see a chat notice, but I'm not near a keyboard that can
share that. So, I don't know who put it up there
but you may have to go verbal on that
if you have a comment.
No need!
Again, they'll have this incremental, slow rollout.
You can also do staging, there's a little option
that you can click on - not on by default.
You can actually have a whole staging setup and system.
Again, it will do the automatic names and you'll be in
this little staging domain.
And you can do manual production deploys
if there was some reason you didn't want to do the
automated thing that would make things predictable.
The way a lot of people are going, you can do it both ways.
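Both of those are toggled with Auto DevOps CI variables; a sketch, with variable names taken from GitLab's Auto DevOps documentation (worth verifying against the version we deploy):

```yaml
variables:
  STAGING_ENABLED: "1"              # deploy to a staging environment before production
  INCREMENTAL_ROLLOUT_ENABLED: "1"  # make production a manual, staged rollout instead of fully automatic
```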
There's some more notes here if you wanna dive in.
Almost done - but of course there's more!
We saw the IDE - which is kind of cool.
But check this out! They've got a built-in Terminal.
So, I'm just clicking on my project and the 'Terminal'.
And it already figured out which container to go ahead and
talk to, and actually talk to it. In this case,
there's no 'bash', but it has 'sh'. So watch this!
That's the node -- that's the thing we were just looking at.
Those are my files -- I can actually 'attach' to it.
I don't have to do
anything from the command line. You can run this all
yourself using kubectl - you just need one file
on your laptop - and directly too. But it's
nice that it has this available, as well.
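For reference, a sketch of doing the same thing from a laptop with kubectl and that one kubeconfig file (the file name and pod name are made up):

```sh
# list the running pods using a single kubeconfig file
kubectl --kubeconfig ~/ia-cluster.yaml get pods

# open a shell in one of them; the Alpine-based image has no bash, so use sh
kubectl --kubeconfig ~/ia-cluster.yaml exec -it <pod-name> -- sh
```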
I could literally take out the search this way, but
obviously we don't want to do that. That's kind of a neat
little feature built-in. So on that last, what about
firewalling it from bad actors who are doing dev,
what's the options on that - for isolating the
containers from the server?
So by default it's all behind different ports and different
domains. And there's different roles and groups
so we might have it so some only get read only access,
but not write access, to the whole repo, which includes
the production pipelines and everything else.
But it's all basically hidden behind
our default firewall -
we don't have you opened up.
You're pretty much just exposing your production
ports and names, but nothing else. Right.
So these are the potential 'Buh-bye' list. We could get rid
of docker.archive.org. We can get rid of our own
little CI system. We could get rid of 'PI' and
'install.sh'. We could get rid of NFS
and www-tracey.archive.org and all those kinds of things.
We could get rid of `[dev]` docker image on laptops
because we now have full branches that we can push to,
and they'll just show up and now everyone gets to them.
So we have a lot of options at our disposal.
Evan and I have been talking about making user profiles
so you could decide that certain trusted
developers can talk to the live
search engine or talk to the live database.
But other ones can't.
Or maybe talk to those, but in readonly mode.
So we have a few different options.
That could obviate the entire need for these
last two kinds of things
which I think would be really nice. It's one of the
things I think we'd all like to avoid.
It's a bit of an artifact from how we've been doing things
for so long. Some minor issues we've had and are gonna bring up:
it's using ABAC instead of RBAC. So that's Attribute Based
Access Control instead of Role Based Access Control.
But that's coming. That should be done by
the end of the year, if not sooner.
We had a minor Prometheus error, but that's already been fixed.
I have workarounds for the stand-alones.
That will come out in v11.1 - it's already live for
folks hosted on GitLab.
There's some minor workarounds right now, but 11.1 or 11.2
should fix all that, so we should be fine there.
There's some miscellaneous stuff - I thought maybe I'd
add more stuff here.
You can do a cool thing where, someone had this idea here,
I forgot who, I apologize - either raise your hand or pop in
it was to use a prior Docker build for the next Docker build.
And you can do that.
I'm really keenly interested in how fast we can make all
these kind of things happen - that's one of them, where
it will basically just leverage and Vampire out prior layers
and just do the deltas, which would be awesome.
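Docker supports exactly that with --cache-from; a sketch, with an illustrative image name:

```sh
# pull the previous image (ignore failure on the very first build),
# then reuse its layers as the build cache so only the deltas get rebuilt
docker pull registry.example.org/ia/auto-devops:latest || true
docker build --cache-from registry.example.org/ia/auto-devops:latest \
  -t registry.example.org/ia/auto-devops:latest .
```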
That's all we got for here.
I'm sure David might have
a few comments. We were both doing this in parallel
and trying to.. David was awesome
extra set of eyes, more on the Ops side but able to
just really get through some things - the
little things are the ones that are really hard to sort out.
A lot of new stuff.
I think we only really dove into this like
2 weeks ago, and
we only dove into
GitLab v11 starting this weekend
so this kind of all came together pretty quickly.
So huge thanks to David for being an awesome
DevOps team mate here.
Oh! and one other thing, let's just go back
where is, the pipeline
and I'll show you.. OK, it looks like everything
has passed.
And the production, basically this goes out and
makes sure everything is working,
makes sure it can talk, it actually verifies that it
can talk to the node that will be the production node
and if it can't, it will actually fail.
So it did a bunch of quick checks and then
went on to the 'Performance'.
Performance went ahead and did
a lot of the sitespeed stuff
and uploaded a bunch of artifacts.
That'll be linked in directly -
it's just a file right now. It will be a direct thing,
I just don't have that quite hooked in.
And then we go back to here - and we see 'Searchy' and that was it.
Again, all we did was commit and everything else just happened.
And if I bricked it or broke the JavaScript for the linter,
none of that would have gone out - it would have aborted it all.
I think it could really help us here.
We could take a lot of
our one-offs and interesting things
and future things, future websites into this
very quickly, and maybe start to move some of
our bigger stuff in to it as well.
I love the being able to spin up quick demo
instances of things - that's pretty amazing.
Do you think it will change how we do our
production deploy - if every commit
to master kicks off a deploy, will we be more
judicious about when we merge to master?
There's probably a good chance of that.
If these sort of lighter weight branches
we're working on
are easier to get feedback on more rapidly,
there might be some other options to just do a
quick git update.
But I mean, clearly they're trying to close this loop,
so it's just tighter and tighter, faster and faster.
[inaudible]
It would be a pretty tight integration with GitLab
Corporation. Yep. I just plowed
around on their website. Would we need.. we use
GitLab Community Edition, I believe
at this stage, and that's free to us,
because we're a nonprofit.
Yep, it's free in general, and everything we saw here
today, is GitLab Community Edition.
We could kick on the Enterprise Edition stuff,
if we want. I think, David, you found it was
something like, 20 bucks a dev, per month, or something
so, hmmm, but this is where we go and talk
to people at conferences and
go "You don't want to charge us, like, $2,000
or whatever.." But that's only if we want it.
The two.. they might be focusing
a little bit more on certain things like
revenue and crap like that.
They're trying to go public.
GitLab did actually, just a few days ago now,
announce that their top tier whatever thing
is available without support for free to
educational institutions. We like educating!
I was gonna ask, we talked about
this a little bit before.
I might have missed this if you mentioned it here.
There's a lot of really cool stuff
for automatically deploying.
Are there also options for
scheduled deploy and things like that? Yah.
Maybe not just automatically? Yep, they are built in.
You can have a normal scheduled deployment, and you can
basically fire it off - not at 3 in the morning, but at some
period of time, like two times, Monday through Friday.
Kind of related to I guess what Brenton was saying,
how would this affect how we push to master -
maybe every commit to master doesn't have to kick off
a deploy. Yep.
Different repos can have different schedules. Cool.
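For reference, GitLab's pipeline schedules take a standard cron expression; a sketch of "twice a day, Monday through Friday" (the times are made up):

```
# at 09:00 and 14:00, Monday through Friday
0 9,14 * * 1-5
```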
I don't know how the magic works
at the www-tracey or www-mike.archive.org
But you just write a file
and you can see it, instantly. Yes.
Is that preserved in this?
So, in this right now, again it's just been
only a week or so of hacking..
Um.. no. Not that quick
level of instantaneous-ness.
But I'd be really surprised if it won't
be in the near future.
So it's really a path for
deploying on cluster
as opposed to rapid iteration development.
Yes, except that you can still also do this kind
of staging thing where you're staging branches.
But still it's not that absolutely blitz and see.
It's, there's some time.
What is that time?
So. There's the docker image, run it.
In practice, this one looks like it
took four minutes, thirteen seconds.
Which is a pretty good time to get the whole thing through.
It's probably a little less because
of the Performance..
But that's going through the whole test cycle
thing, which isn't, I think, what most people
want in the short term. Yah, one of the
big reasons I wanted to play around with v11
was I was curious how much they were taking the
repo and handing it around the back
and into the container. 'Cause they do that for CI.
That's one of the reasons why CI is really fast.
Some of us think CI is slow. It's not that CI is slow.
It's that we have a lot of tests, some long running.
So it fires up and gets going really quickly, because they
do this cute little thing, where they hand the prior repo
around in this side dir, and just quickly update.
They don't have to pull down 4.7GB of tree.
They're doing similar things here, but I think this can
get faster. But I also think of this as staging,
so the fastest iterations are happening on my laptop, not
happening up to staging. Yah, and it is worth pointing out
you can run GitLab on your laptop, quite easily.
Because you can just plug in Docker Omnibus
and right now do everything we just saw in
maybe 7 lines of configuration. That's it.
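A sketch of that, using the official gitlab/gitlab-ce Omnibus image -- the published ports and host paths are illustrative:

```sh
# run the whole GitLab Omnibus stack in one container,
# with just HTTP, HTTPS, and SSH ported out
docker run -d --name gitlab \
  -p 8080:80 -p 8443:443 -p 2222:22 \
  -v ~/gitlab/config:/etc/gitlab \
  -v ~/gitlab/data:/var/opt/gitlab \
  gitlab/gitlab-ce:latest
```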
And people basically..
So maybe you haven't investigated this yet
but is it possible on GitLab on your laptop,
to make it so you can bypass some of the CI
if you just want to do rapid development?
You can make a branch on git...
It sounds to me like, I know you had suggested
that we could get rid of the local Docker images,
but it sounds to me like it probably wouldn't make
sense here. That you'd probably still want to have
something that you can develop on locally,
but be able to push off the tests, kind of like
Brenton's point - you should be able to do
really quick iterations on your local version
and when you're ready to put it up for testing, that's when
you kick off this process. Not like, every time you save.
Now I mean, I don't.. obviously, we could
throw in a little bit of glue
where you basically kind of do what we're
saying, you know, either bounce it through NFS
and through to the container, and it just updates its
docroot like that. So we could do something like that
in the staging area. And then you could leverage everything
else you see and maybe not do the pipeline by default?
I definitely do think though that, even if this adds a
little bit of extra process
for testing and running CI and all of that,
It might take a little bit of time up front
but it saves us time on bugs that we might
otherwise push through to production.
Yah, and you can run the pipeline anytime
you want, too, it doesn't have to be a commit.
You just go right in, here, you can do this here,
too, today, with our..
you just say Run Pipeline and step through it.
So you can just run the tests right here.
Any other questions, remote or local?
I'd like to add - it's perfectly possible to set
up a pipeline that runs a minimal set of tests
that can do that - like if all you're doing
is changing the background color
to cornflower blue, and you know
that you don't need to run the full integration tests,
you can push that kind of fast pipeline to a
side branch, and just see what it looks like
in less than 17 minutes. Good call.
But we probably, we would never bypass that
like at deployment time, right?
Oh no, I was saying I would do that
to a branch, not to master.
I thought you had indicated that one of the
things one of the things we could do without
is the www-traceys. Right, so when we're logging
in to that little terminal thing
imagine that where that
little app repo is
it's got a side mount to your GlusterFS deployed
tree, something that's saved from your laptop.
Then you get instant feedback.
It would only be for staging, only for a branch.
Right. Then we could get rid of the www- question.
What about staging data as part of the petabox?
Oh, it'd be really great to have a giant..
Yah. I mean, you know, a lot of the
other magic part - the big things we need to sort out
to get people even quicker, stuff in Kubernetes
was the Persistent Volumes and Ingress.
We think we have pretty good beads on that.
The persistent volumes are where you can actually store
a bunch of data, so it just loads right up. So you know,
I'm making it up, but let's say /var/lib/petabox
is a persistent volume, and your "vanilla-ice-cream" branch
in user testing automatically gets it prepopulated
with a bunch of data and constants.
That's one of the things that we do want to do,
is set up a stable testing environment
that includes sample data.
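A minimal sketch of the Kubernetes piece of that idea -- the claim name, size, and mount path are made up for illustration:

```yaml
# a persistent volume claim holding the sample data...
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: petabox-testdata
spec:
  accessModes: ["ReadOnlyMany"]
  resources:
    requests:
      storage: 50Gi
# ...which a branch's test deployment would mount at /var/lib/petabox,
# so every environment comes up prepopulated with data and constants
```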
Yeah. Yeah, you know the..
If Aaron Ximm's on vacation,
and he is right now
and you know, there's some minor issue, and we're just like
"oh just look at his documentation and scripts"
As long as it's not on fire, I don't want to do it.
So, that's kind of how I think
you know, everyone, not feels about my code,
but deploy native,
let's try to get rid of as much of this crap as possible and just
go with the industry. This is all .. based, it's documented.
It's supported by thousands of users.
We don't have to learn some strange, you know
Docker build script or run script
that I came up with.
It's very attractive.
So one thing I wasn't really clear about is
you also put on the list, that we could get rid of
is "install.sh" and I don't quite understand how we do that.
It still seems like we need it to build the container.
You still have to build the container, but that's the
"provision.sh". So you still have a provision step.
But the install step, there isn't really much..
I mean the container, as long as it's actually properly set up
and the third-party and /etc/ files are already done,
then there's nothing to.. you just have to update the repo.
That's been the whole goal, the whole time.
Let's kill the "PI" install script.
We just want to have a simple, update repo. Done!
But then how would this get deployed to all the datanodes
a change that affects the datanodes?
So I like Sam's idea to kind of make the nginx datanodes
Dumb and Dumber. Make them basically just fileservers.
As much as possible.
I know we can't fully get there, but to the extent that we can,
move as much stuff to webnodes as possible.
Because the webnodes are load balanced -
they are spread. They are tested.
And then we could have that be where we do changes.
So hopefully, it's just a dumb fileserver.
But you know like the workers right now,
the workers aren't in Kubernetes.
Kubernetes and certain things like..
It doesn't have to be web apps,
but that's their specialty.
There's still an open question there about
how we would resolve the issues where
certain pieces of our policy layer
are most likely to continue to be
implemented on the datanode and nginx, probably.
Unless all access to the corpus
comes through the webnodes.
That means things like access control and
a bunch of other business logic will end up
having to live on those nodes and have
kind of first class access to the data on their filesystem.
Which seems like an option, maybe reasonable
option but we'll have to see, but yah.
So I think the question
of how deployment cycles might
work best for them
is I think a little unresolved.
The other thing that I'm curious about
is that I think a developer working
with one of these systems probably wants
an entirely local version of the universe
available on their laptop.
And then that is kind of related to this in that they
both might use Docker technology but I think that's
where their relationship sort of stops.
Do you think that that's true?
You mean for the rapid dev, kind of thing?
Yah, because someone working on their laptop
is going to want to edit a file and then run
the file again. There's that normal
normal development loop, and that's one activity.
then outside of that there's the second level up
which is where they want to be able to
share with somebody else
and I think there's two parts. Those are very true
for our team, I can say, we do that all the time.
I think there's sort of two levels for that
system that we really wanna support.
How we want to handle the 'share it with somebody
else' case is something that, you know, because of
the scale that we want to share that on and the
corpus that the feature set has access to -
those kinds of shared prototypes
are things that I'm very curious about.
In my imagination, I would very much like for it to
be possible to share, for example, a changed layout
of a web page across a small demonstration corpus
with the general public. Yah. Which means,
that it would be hosted somehow on machines
that are connected to our internet
and that's publicly available. That would be like
a merge request or branch in this world.
It will give you a name that's publicly shareable
and as you commit, it sort of updates it automatically.
That gives you the sort of slower timeline version which I think we
agree, is great, and that's what we're really trying to solve here
as well as the production, full production.
The instant dev feedback thing is interesting.
If we take a step back, here's what we're doing now.
Some of us are on 'home' with emacs
but most of us, these days, are now doing dev on our laptops,
and most of us, I don't know if all of us are doing this
but I think a lot of us are auto-syncing to 'home'
on 'save'. I don't know if that's true for all.
About a third of us or half of us.
Yah, that's certainly what I do.
But it's syncing to 'home', which isn't magic, right?
It's just an NFS thing, which then goes to effectively
an nginx container - not containers - but nginx servers
that are running on one specific server that sort of says
"Oh you can get to this thing instantly and serve it
as a web page". So it wouldn't be any different for us
to either continue with NFS, or Gluster or something
like that where we just bounce the file through and it
goes straight into the one of these little staging things.
I suspect they're gonna come up with this very quickly.
I could be wrong, but even if they don't, we can do basically
what I just said, because we're already doing it now.
It's just we'd be sliding it around the backdoor
into one of these little staging containers.
I think the difficulty right now in my personal experience
is that sometimes there are things that I need to be able to
test quickly, that the Docker image doesn't handle for me.
So I need to have it be rsync-ed up to the
www-evan and test there.
And I'm doing that before I make a commit.
Yep. So the thing I would..
I don't know if it's a necessity to fix before this happens.
In my mind, there's going to be a little bit of a gap there
in terms of things that I now would need to commit to origin
to be able to test as opposed to
make a change and just see it.
Yeah, and to be clear, I'm not saying that if we go with this,
we kill NFS home and 'www-'s, I'm saying that..
I mean, I want to kill it, I just want to be sure
One alternative by the way, would be better test data
in the Docker environment
the local environment.
And if you removed that, you reduce the number of cases
where I need to put it on the staging environment
that's pretty helpful. And then it just reduces the number of
times I need to commit to origin to be able to test something.
And then to show me the commits, we put it to a
staging server, and then we both can see. Right.
We probably want to minimize that to just the
simple sharing case, as little as possible.
The personal test case shouldn't require
that I put it up on the staging environment.
So, would it be helpful to get some feedback from other folks
about the Docker container so we can make that richer?
Yah, I think that ideally that environment is
quite flexible, so that as necessary
that purpose could be updated and modified,
because I think that very often
website features have a relationship
with data in the items and also
a relationship with data that's
in the surrounding services like the search engine,
like the database. All of those data stores,
we're going to have to be able to sort of load
them as appropriate for testing environment.
I hope not.
So, you guys are much more in actually doing this..
So the idea of creating the whole universe
the whole swaths of items, have whole
swaths of whatever, build the whole universe that
we're going to be trucking around on our laptop.
What if we could do things
a little different? To have the universe
of 40 petabytes live some place?
And what we're trying to do is increase the
number of people that can use that
by going and making a lighter
weight set of services that a
high speed development environment would use
out of the universe. And the universe lives on
40 PB well maintained by somebody
else with search engines and data
and blah blah blah. And make it so it's a lighter weight thing
that talks via APIs to the backend.
So that way, if we can increase the number of
people that would be able to improve the
system without downloading the universe
and some part of this galaxy of clusters.
I think that in
a lot of cases, that is gonna be really good,
but in other cases, a little easier
said than done because like this shared state thing.
And so what happens if I'm writing code
that is modifying that shared state.
Then we're all using that same shared state, right?
But if I'm working out of.. and Tracey's working out of.. Right.
You're changing the items? Right, they're not at odds, what you
just said was creating
a way for all developers to work is not
at odds with having testing. And it's to
get that rapid development. We need stuff
that we can mess up, bang on
as part of development. So I don't think it's not
I don't think it's like bringing in the whole
you know.. I think it's trying to bring in
search engines and databases and catalogs
and all these things to run on
your laptop strikes me as awesome!
I mean, in that sort of like
galactic, amazing, wow, why would you do this?
But what we're trying to do is get it so people can
work against, the APIs, the items.
Right now, a lot of the
breadth of work that people are doing
does go under that hood.
It's unfortunate
but, for example, the code that renders
our web pages talks directly to our database
and has a relationship with
the internals of that.
Staging that is obviously a good thing.
Score!
This has been my charge when we hired John, and
we haven't gotten there yet. And maybe it's wrong,
maybe we should just cart around everything
every time you want to make a change to anything
we have to go and understand everything.
But I'm hoping that that's, I don't know.
It doesn't strike me as the method to move forward.
We have a lot of options.
We can plug in from our laptops or from production or staging:
live database, live search engine, live items, right?
But we could also take some version of those,
right? It could be: live search engine,
readonly empty database running locally on your laptop.
And real items that you can actually get feedback on and look at.
Also, we haven't necessarily been asking,
but we've been asking for test data.
We haven't
necessarily asked to replicate the whole entire thing.
No, we just need representative data.
But OK. Well anyway if there's
a way of making it so
that we have a smaller
amount of system, a larger number
of people can interface with
and help us move forward on?
That would be great.
You know, there will be a storage backend,
that's gonna be upgraded by monks.
That's terrific. If we can make a much
more rapid interface towards making
UI changes so that they can be more appropriate
for the data that's in the Archive
that'd be fabulous. I'd love to invest in that!
It would be great if the entire front end were
only interacting with the Archive through bonafide APIs
and that there could be more.. This is a different conversation.
Mitra's been really trying to push on this, by my request.
He's making progress, and I'm hoping that that
brings along some of the other parts of the Archive.
I think that you're doing something like that with
your NodeJS. Ya ya. It's just searching against
the Search API and it's, I think there's
some more APIs used, and that's it.
It's worth adding that these can kind of be
separate things, separate options.
Right? You have
these.. if you separate the web applications that talk to APIs.
The APIs could be a shared state. They could also be
something you run locally, depending on what you need.
Then like that one is the majority case,
but sometimes we are going to need to call read/write APIs.
So that's when we might need to run something locally. Yep.
Are there any other burning questions?
Because we're at 11:30, and we had an hour.
[claps] Nicely prepared. Is there anything more
that you want to find out, that you didn't find out?
Well I mean, hearing the emphasis on the
rapid change interface was super helpful,
and I think we should kind of get on that, as we can.
Good. OK. Sweet. If people want more of the git repo, of the
presentation, we did record and I will publish the link to this
recording shortly.
Thank you everyone! Thank you, Tracey.