Notes & Learnings from Q Con London 2014 - Day 1
I was lucky enough to go to Q Con London 2014. Here is an update of my experience from day 1.
Damian Conway - Life, The Universe and Everything
Damian is an amazing speaker and made his already fun, interesting, geeky subject even more fun, interesting, and geeky with his great presentational skills. He showed us Perl code (and later Klingonscript) that not only implemented the Game of Life but also (kind of) disproved Maxwell's Demon. We were also shown a Turing-complete machine, built by Paul Rendell, which was implemented using the Game of Life; the video is also on YouTube.
This talk reminded me of another cellular automaton I encountered many years ago whilst studying Genetic Algorithms at university. I am pleased to say that the creator (David Eck) has converted the original Java Applet version of his program Eaters, which so impressed me, into JavaScript, available here.
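For anyone who hasn't played with the Game of Life since university, the whole automaton fits in a few lines. This is my own minimal sketch (not Damian's Perl), representing the board as a set of live cell coordinates:

```python
from collections import Counter

def step(live):
    """Advance one generation; `live` is a set of (x, y) live cells."""
    # Count live neighbours for every cell adjacent to a live cell.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Survive with 2 or 3 neighbours; be born with exactly 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "blinker" oscillates between a horizontal and a vertical bar.
blinker = {(0, 1), (1, 1), (2, 1)}
```

Structures like Rendell's Turing machine are just (very large) initial sets fed through exactly this rule.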
What I took away...Coding can and should be fun.
Daniel Schauenberg - Development, Deployment & Collaboration at Etsy
Etsy have around 150 engineers working on a monolithic PHP application which forms etsy.com. Although this sounds like an old-fashioned architecture, their delivery pipeline is very impressive: they manage 50 deploys a day. Other points:
- They favour small changes.
- 8 people can deploy different sets of changes in one build (that's their magic number).
- The deploy process, the "Deployinator", has two buttons only - no ambiguity, everything automated.
- They use config to switch features on after the code is safely in production (i.e. soft live releases).
- Developer VMs can run the entire stack and closely mimic production.
- They make use of Linux Containers (LXC).
- On day 1 of working for Etsy you deploy a change - this seems like a great idea to me.
- Tons of dashboards which are monitored after releases.
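The "config to switch features on" point is worth dwelling on: code ships dark, then a flag flips it live with no new deploy. A minimal sketch of the idea (the flag names and store here are hypothetical illustrations, not Etsy's actual system):

```python
# Flags live in config, separate from the code that is already deployed.
FEATURE_FLAGS = {
    "new_checkout": False,  # code is in production but dark
    "search_v2": True,      # switched on once the deploy proved safe
}

def feature_enabled(name, flags=FEATURE_FLAGS):
    # Default to off: an unknown flag must never turn a feature on.
    return flags.get(name, False)

def old_checkout(cart):
    return f"old checkout of {len(cart)} items"

def new_checkout(cart):
    return f"new checkout of {len(cart)} items"

def checkout(cart):
    # The risky new path only runs when config says so.
    if feature_enabled("new_checkout"):
        return new_checkout(cart)
    return old_checkout(cart)
```

The nice property is that "releasing" a feature becomes a config change, which is far cheaper to undo than a build.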
I asked Daniel at what point Etsy decide to roll a build back and stop debugging an issue; he replied that there is no rollback (the closest thing to a rollback is to revert commits and create a new build). This doesn't seem like a good idea to me. I think each build should be rollbackable. Debugging an issue calmly after you have restored service with a rollback is surely better than a forward-only strategy.
What I took away...Continuous delivery can be achieved without a perfect architecture if you have the right processes and culture in place. Etsy should be applauded.
Uwe Friedrichsen - Fault Tolerance & Recovery
Uwe explained a lot of common fault tolerance patterns, including...
- Timeouts.
- Circuit breaker.
- Shedding load.
Uwe also explained the Netflix library Hystrix.
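To make the circuit breaker pattern concrete, here is my own minimal sketch of one; it is in the spirit of what Hystrix does, but it is not the Hystrix API (Hystrix is a Java library), and the thresholds are illustrative:

```python
import time

class CircuitBreaker:
    """Fail fast to a fallback after repeated failures of a dependency."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds before a retry is allowed
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, fn, fallback):
        # Open circuit: return the fallback immediately until the timeout.
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()
            self.opened_at = None       # half-open: allow one trial call
            self.failures = 0
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            return fallback()
        self.failures = 0               # success closes the circuit again
        return result
```

The point Uwe stressed is the fail-fast branch: while the circuit is open, callers don't queue up behind a dead dependency, they get the fallback straight away.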
What I took away... There are a lot of good patterns out there for fault tolerance, a lot of them are discussed in Release It!
Graham Steel - How I learned to stop worrying and trust crypto again
Graham started his talk by saying that "the paranoids were right" - the NSA has been interfering with crypto standards. However, all is not lost, as properly implemented crypto systems can still be relied on.
Here's what makes a good crypto API:-
- Open and subject to review - More likely to have bug fixes and patches applied.
- Supports modern cryptographic primitives.
- Good key management - This can be the Achilles heel of a crypto API.
- Mistake resistant - Some think people should be free to make mistakes. I don't!
- Interoperable.
Examples of crypto APIs.... My notes are sparse here...
- Big success, widely used.
- Old - Current version 2.4 was released in 2004.
- Implementation is closed, but the standard is open.
- The API is closed (under Oracle's control), however some providers, like Bouncy Castle are open source.
- Has around fifty "Fix me" comments in it!
- Contains the NSA backdoor random number generator which you are free to use or not.
- Work in progress to provide encryption in clientside javascript - Lots of challenges here!
Lessons
- If you roll your own cryptography implementation, you are guaranteed to have a flaw in it.
- TLS has been around for years, and issues were found in it just 3 days prior to this talk!
- Applied Cryptography is a dangerous book as it encourages you to roll your own cryptography.
What I took away... A new understanding of how much I don't know about security and cryptography.
Dave Farley - Continuous Delivery
According to Dave... "Our highest priority is to satisfy the customer through early and continuous deliveries of valuable software."
- Finished means in production.
- Break down silos, Business, Testers, Developers, Operations, all need to collaborate more to make it work.
Feedback loops - starting with smallest and quickest:
- Unit Test - Code.
- Executable - Build.
- Idea - Release.
Principles:
- Keep everything in version control - I agree; any UI-only tool is to be avoided in my opinion.
- Automate everything.
- Build quality in.
- If something is painful, do it more often. If you release every 6 months because it's painful, the lesson is to release every 2 months, then every month, then every week, and so on. Don't start releasing every year; it will only get worse.
- Manual testers are far better at exploratory creative testing than repetitive regression.
Accidental benefits:
- When debugging a long-standing defect, if version control and deploy are fast and reliable, you can determine which build introduced it: binary chop through a list of builds until you have narrowed it down.
- Automating everything can introduce auditing for free.
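The binary chop over builds can be sketched in a few lines; in practice `git bisect` automates exactly this over commits. The build ids and the defect check below are hypothetical stand-ins for a real deploy history and test:

```python
def first_bad_build(builds, is_bad):
    """Find the first build where the defect appears.

    Assumes `builds` is ordered and the defect, once introduced,
    is present in every later build (good builds precede bad ones).
    Runs the check only O(log n) times, not once per build.
    """
    lo, hi = 0, len(builds)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid        # defect already present: look earlier
        else:
            lo = mid + 1    # still good: look later
    return builds[lo] if lo < len(builds) else None

builds = [101, 102, 103, 104, 105, 106]
# Pretend the defect shipped in build 104.
assert first_bad_build(builds, lambda b: b >= 104) == 104
```

With 50 deploys a day this matters: a thousand builds still only needs about ten checks.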
What I took away... Dave Farley knows this subject inside and out, and his book, Continuous Delivery, is highly recommended.
Paul Simmonds - Identity is the new currency
Data is going from:
- Internal.
- De-perimeterized.
- External Collaborations.
- Secured Cloud.
If you are interested in or working with security in the cloud, you must read:
Good overview. What needs to be done for us to release 50 times a day? :)
Yo - thought I'd finally comment.
I totally agree with the Etsy engineer that there shouldn't really be such a thing as a rollback. After a fair few years working with a forward-only approach, having to do a rollback would seem like defeat now ;-)
There are a few reasons for this:
Cost vs Benefit - ensuring a rollback can happen costs development time, especially when we are talking about migrating schema and data. If we don't spend time here, the rollback can cause as many, if not more, issues than the actual deploy. If the data migration was long-running, it's going to double the time your site is down.
50 deploys a day - if you are deploying 50 times a day (which, in my opinion, is a great thing), chances are you are deploying a much smaller amount of code than somebody deploying once a week. This makes solving an issue with a deploy a much less intimidating prospect, i.e. lots of deploys and a forward-only approach go hand-in-hand.
Rollbacks - if you are doing rollbacks often, there is obviously a more serious issue at play.
Aiming for a forward-only approach - if you aim for a forward-only approach, you solve a lot of other issues on the way, e.g. we need to get the dev machines closer to the production env so we can see these issues earlier.
Thanks for the comment! All good points well made.
This is a really interesting topic and I think that there are many organisation/project specific factors which can sway the balance between the two approaches.
Risk - The riskier the deployment/change is, the more attractive a roll-back becomes. Of course we all want low risk deploys, but sometimes you find yourself working with a legacy system with forgotten functionality. This point is very much linked to the confidence you have in your testing which is obviously very important.
Impact of outage - The risk of an outage could be high but perhaps the impact is low. Perhaps your application/service fails in a graceful manner, with, say, a small component being temporarily not visible to some users. Or perhaps downstream dependencies of your app/service just fall back to stale data. The opposite of this would be big embarrassing dead-end sorry pages for all users.
Effort of supporting a roll-back - If the rollback is complex to create and test then forward-only obviously looks more attractive. If it's simple then why not!
I have never worked with a forward-only approach so I suppose having a rollback on hand is a source of comfort I have grown used to. However, I'm sure given the right scenario I could be persuaded to the brave forward-only approach!
Thanks