Imagine Who’s Listening

Humour me and imagine, if you will, some years in the not too distant future: you are part of a group putting your attention to understanding something. Somewhat like you might be doing now.

[Image: Boy on tin can phone, listening to curious good news]

Not everyone in your group is human.

We are not talking about alien life here. We’re not talking about animals either. No. Imagine, for instance, that this particular fellow group member is an “intelligent agent”.

There are various definitions of an intelligent agent. For this case, let’s say that an intelligent agent is a device that perceives its environment and takes actions to maximise its chance of success.
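
To make that definition a little more concrete, here is a minimal sketch of the classic perceive-decide-act loop, written in Python. It is purely illustrative: the toy “room” environment and thermostat goal are my own placeholders, not any particular AI system.

# A minimal, self-contained sketch of an intelligent agent as a perceive-decide-act loop.
# The "environment" is a toy room whose temperature drifts; the agent's goal is to keep
# it near a target. Everything here is illustrative, not a real AI system.
import random

class Room:
    def __init__(self, temperature=15.0):
        self.temperature = temperature

    def step(self, heating_on):
        # The room warms when the heater is on, cools (with a little noise) when it is off.
        self.temperature += 0.5 if heating_on else -0.3
        self.temperature += random.uniform(-0.1, 0.1)

class ThermostatAgent:
    def __init__(self, target=21.0):
        self.target = target

    def act(self, observed_temperature):
        # Choose the action that best serves the goal: get close to the target temperature.
        return observed_temperature < self.target

room, agent = Room(), ThermostatAgent()
for _ in range(50):
    perception = room.temperature      # perceive the environment
    action = agent.act(perception)     # take the action most likely to succeed
    room.step(action)                  # the action changes the environment
print(f"Final temperature: {room.temperature:.1f}")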

An intelligent agent is a type of thing we more generally call artificial intelligence, or AI.

AI has arrived. What is AI?

AI and intelligent agents are changing the way we live and work.

Let’s think about this concept of intelligence. What is intelligence? Well, the Oxford dictionary defines it as “the ability to acquire and apply knowledge and skills”. Other definitions bring in self-awareness, emotional knowledge and the capacity for logic. For the purposes of this discussion, I’m going to define intelligence as “the ability to achieve complex goals”. This definition broadly covers all of those ideas: there are many kinds of goals, and the ability to achieve them provides a basis for intelligence.

Intelligence comes on a scale. The degree of ability to achieve a goal helps us place something on that spectrum. Take a complex goal like speaking. Can a baby speak? No. Can an adult speak? Yes, generally. What about a child? Each sits somewhere different on the spectrum of intelligence.

Intelligence is narrow or broad. IBM’s Deep Blue chess computer beat the chess grandmaster Garry Kasparov in 1997. It could beat a grandmaster at chess, but at noughts and crosses (a.k.a. tic-tac-toe) it couldn’t beat a 4-year-old. It was built with a narrow focus of ability.

The more recent Google DeepMind DQN AI system can play dozens of vintage Atari games at human level or better. This system was built to be able to apply itself to new goals. We might consider it to have a broader capacity for intelligence.

Humans, however, are so far unique in all the world with regard to intelligence. As a species we are able to master a huge range of skills: languages, sports, vocations and so much more. There’s nothing on the planet to rival us at this point in time.

General AI is coming

We are seeing accelerated breakthroughs and uses of AI in ever broader areas of our lives. Research and development in artificial intelligence has an endgame: general AI at a human level. Narrow AI will eventually evolve to become general AI. Whilst we do not know when this will happen (and some question whether it will ever happen), there is no doubt about the pace of advances and applications of AI.

DeepMind has been able to learn many different games using a technique called deep reinforcement learning. In reinforcement learning, a computer teaches itself to achieve a narrow goal by trial and error, adjusting its behaviour according to the rewards it receives.

Using this technique, Google DeepMind has been able to master, and outplay human testers on, 29 different vintage Atari games. Never sleeping, never resting and with no need to eat, computers can spend almost endless time learning how to achieve their goals in virtual environments, and then apply that knowledge when ready.
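
To give a feel for the mechanics, here is a toy tabular Q-learning loop in Python. This is not DeepMind’s DQN, which swaps the table for a deep neural network and learns from raw Atari frames; it is just the same learn-from-reward idea in miniature, on a made-up walk-down-the-corridor game.

# Toy reinforcement learning: tabular Q-learning on a tiny corridor game.
# The agent starts at position 0 and earns a reward of 1 for reaching position 5.
import random

N_STATES, ACTIONS = 6, [-1, +1]          # positions 0..5; the goal is position 5
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount factor, exploration rate

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # Explore occasionally, otherwise exploit the best-known action (ties broken randomly).
        if random.random() < epsilon:
            action = random.choice(ACTIONS)
        else:
            best = max(q[(state, a)] for a in ACTIONS)
            action = random.choice([a for a in ACTIONS if q[(state, a)] == best])
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Nudge the value estimate towards the reward received plus the best future value.
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state

# After training, the learned policy should be "move right" (+1) in every state.
print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)})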

AlphaGo demonstrated strategic creativity when it beat Lee Sedol, then considered the world’s top Go player, in 2016. It had been expected to take another decade before an AI could beat a human Go champion. AlphaGo went on to beat the world’s top 20 players in the year after it beat Lee Sedol.

[Image: AlphaGo and Lee Sedol]

To put this feat in context… there are many more possible Go positions than there are atoms in the entire universe, so players rely heavily on intuition alongside conscious analysis. AlphaGo shocked the world by defying ancient wisdom, playing on the 5th line early in its 2nd game against Lee Sedol, and it went on to win the game. This was a demonstration of intuitive, creative play from a machine-based artificial intelligence.

AI is being widely used throughout our lives today

Natural language translation was not really considered possible when I studied AI back in the 1990s. As a computer science student at university, I recall discussions about what computers could do, and back then we considered them unable to cope with the ambiguities of natural language. Now we see natural language translation all around us, and increasingly in real time. What other examples of AI do we see around us?

AI is being used in finance. Most stock market buy-sell decisions are now made automatically by computers, driven by algorithmic trading. Algorithmic trading is used to make trading more profitable, and it allows resources to be allocated efficiently across the world at the speed of light.

In healthcare we are seeing changes in multiple areas, like diagnosis and surgery:

  • In 2015, a Dutch study showed that computer diagnosis of prostate cancer from MRI scans was as good as that of human radiologists
  • In 2016, a Stanford study showed that AI could diagnose lung cancer from microscope images better than human pathologists could
  • Machine learning is now used in medical research institutes such as the Walter and Eliza Hall Institute, for instance in predictive modelling of best outcomes based on analysis of genes, diseases and treatment responses
  • According to a 2017 report, some 2 million robotic surgeries were performed smoothly in the US between 2000 and 2013

We are in the midst of the 4th industrial revolution. AI is one of the catalysts in this step change for humanity.

AI is now a permanent part of our lives, changing the way we live and work

Let’s review what we’ve covered.

Artificial Intelligence has arrived. What is AI? At this point in time (2021), it is a narrow, limited ability to achieve a goal. The endgame is a broader, more capable ability to achieve complex goals.

General AI is coming. Accelerated breakthroughs have shown advances in intuitive, creative, strategic mastery of games. Virtual environments are now commonly used for reinforcement learning when training AIs.

AI is widely used today for the benefit of human society. In finance and healthcare we are seeing improvements that benefit us all.

Imagine who’s listening

Imagine some time in the not too distant future, you’re part of a group putting your attention to understanding something. Not everyone in your group is human…

Amazon Virtual Private Cloud

Amazon Virtual Private Cloud (VPC) is an abstract network service that allows you to create a virtual network of your own. When it was first introduced in 2009, it was a revolutionary concept: the creation of a network of your very own, without you needing to own any IT hardware.

[Image: IoT smart cities]

At the time of writing, a VPC enables you to create a network address space using any IPv4 address range, including RFC 1918 private ranges or publicly routable IP ranges. The network can be between 16 and 65,536 IPv4 addresses in size (a /28 up to a /16 CIDR block). IPv6 is also supported.
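
If you like to see things in code, here is a hedged sketch using boto3, the AWS SDK for Python. The region and the 10.0.0.0/16 CIDR block are just example choices, and I am assuming credentials are already configured.

# Sketch: create a VPC with a /16 RFC 1918 range (65,536 addresses) using boto3.
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-2")

vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")
vpc_id = vpc["Vpc"]["VpcId"]
print("Created VPC:", vpc_id)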

The architecture of the AWS Global Infrastructure means that your VPC spans multiple Availability Zones; in fact, it spans all Availability Zones in the AWS Region. Unlike many technology infrastructure providers, every AWS Region has three or more Availability Zones (AZs). AZs are geographically separated locations within an AWS Region, connected by redundant, fast fibre-optic data links.

You can learn more about the AWS Global Network here: AWS re:Invent 2016: Amazon Global Network Overview with James Hamilton

Within your VPC, you define subnets, and each subnet sits in a single Availability Zone. This means that whilst your VPC spans all AZs, your subnets do not.

To manage and secure network traffic flow, you use route tables. A VPC is created with a main route table, and each subnet you create must be associated with either a custom route table or the main route table. The route table defines the routing for your subnet, indicating how network data should flow.
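
Continuing the boto3 sketch from above, creating a subnet in a single AZ and hooking it up to a custom route table looks roughly like this; the AZ name and CIDR are again illustrative.

# Sketch: one subnet in one Availability Zone, associated with a custom route table.
subnet = ec2.create_subnet(
    VpcId=vpc_id,
    CidrBlock="10.0.1.0/24",
    AvailabilityZone="ap-southeast-2a",
)
subnet_id = subnet["Subnet"]["SubnetId"]

route_table = ec2.create_route_table(VpcId=vpc_id)
ec2.associate_route_table(
    RouteTableId=route_table["RouteTable"]["RouteTableId"],
    SubnetId=subnet_id,
)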

To further secure your subnets, Network Access Control Lists (NACLs) can be defined. A NACL can be used to explicitly Allow or Deny network data to cross the boundary into or out of your subnet. Each subnet must be associated with a NACL – either the default NACL (provisioned when your VPC is first created) or a custom NACL.
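
In the same hedged boto3 style, a custom NACL with a single inbound allow rule might look like the following; the rule number, port and CIDR are purely illustrative.

# Sketch: a custom NACL that explicitly allows inbound HTTPS into the VPC's subnets.
nacl = ec2.create_network_acl(VpcId=vpc_id)

ec2.create_network_acl_entry(
    NetworkAclId=nacl["NetworkAcl"]["NetworkAclId"],
    RuleNumber=100,
    Protocol="6",                    # TCP
    RuleAction="allow",
    Egress=False,                    # an inbound (ingress) rule
    CidrBlock="0.0.0.0/0",
    PortRange={"From": 443, "To": 443},
)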

One more security feature, this one for capturing network traffic flows, is VPC Flow Logs. Flow Logs allow you to capture information about the IP traffic flowing to and from the network interfaces in your VPC or a subnet.
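
Sketching once more with boto3, turning on Flow Logs for the whole VPC looks something like this; the CloudWatch Logs group name and the IAM role ARN are placeholders you would need to create yourself.

# Sketch: capture flow information for the whole VPC into a CloudWatch Logs group.
ec2.create_flow_logs(
    ResourceIds=[vpc_id],
    ResourceType="VPC",
    TrafficType="ALL",
    LogDestinationType="cloud-watch-logs",
    LogGroupName="my-vpc-flow-logs",                                               # placeholder
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/my-flow-logs-role",   # placeholder
)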

There is much more to VPCs than this but these are the fundamentals. You can create an AWS account and create and destroy VPCs either through a management console or programmatically.

There is some further reading here exploring options to extend your data centres to include VPCs: AWS Whitepaper: Extend Your IT Infrastructure with Amazon Virtual Private Cloud

PrivateLink – It’s a Kind of Magic

AWS PrivateLink is an interesting way to create an endpoint by which you can provide services to other AWS accounts. You can do this without the need to run requests through the Internet and without peering or otherwise “connecting” VPCs. What is this particular type of AWS magic, I hear you say?

Says the Amazon web site, “AWS PrivateLink provides private connectivity between VPCs and services hosted on AWS or on-premises, securely on the Amazon network. By providing a private endpoint to access your services, AWS PrivateLink ensures your traffic is not exposed to the public internet. AWS PrivateLink makes it easy to connect services across different accounts and VPCs to significantly simplify your network architecture.”

So this is a particularly interesting AWS magic trick, in that we can provide services to other consumer VPCs through the Amazon backbone. We simply do two things to make this happen:

  1. Create an Endpoint Service, in our service provider VPC
  2. Create an Interface Endpoint (linked to our Endpoint Service), in our service consumer VPC
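
For a rough idea of what those two steps look like programmatically, here is a hedged boto3 sketch. In practice the two calls usually run in two different AWS accounts, and every ARN and ID below is a placeholder.

# Sketch of the two PrivateLink steps with boto3. All ARNs and IDs are placeholders.
import boto3

ec2 = boto3.client("ec2")

# 1. In the service provider VPC: expose a service that sits behind a Network Load Balancer.
service = ec2.create_vpc_endpoint_service_configuration(
    NetworkLoadBalancerArns=[
        "arn:aws:elasticloadbalancing:ap-southeast-2:123456789012:loadbalancer/net/my-nlb/abc123"
    ],
    AcceptanceRequired=True,
)
service_name = service["ServiceConfiguration"]["ServiceName"]

# 2. In the service consumer VPC: create an Interface Endpoint linked to that Endpoint Service.
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName=service_name,
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
)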

Where this gets really interesting is that it avoids all the unruliness of network address spaces and having to deal with Network Address Translation (NAT). Routing works through a network interface to the endpoint service, and you don’t have to worry about the network addresses. And if the endpoint service is unavailable in one Availability Zone, that’s not a problem, because your endpoint service will load balance across multiple Availability Zones.

Not to put too fine a point on it, but to get the engineering and provisioning underlying all that without lifting a finger? That’s a kind of magic.

Publishing a .NET Core App

In my last blog, I mentioned a basic Web App that I had put together using Visual Studio 2015 tooling for .NET Core. Over the last couple of days, I’ve been looking at publishing the App, and the steps involved.

.NET Core is a new beast with a lot of potential… the aim is that it will run anywhere, on anything. So to keep my costs down, I’m going to trial it on my Amazon Linux VM. Note to self: SQL Server for Linux is about a year down the track – unfortunate, as the MVC scaffold uses SQL Server LocalDB – so I’ll have to figure out what database I can use.

But, first things first… how to publish my App to Linux?

First step – publish the App using dotnet publish. I found that Bower was not referenced in my Path environment variable. Bower was installed with Visual Studio 2015 Professional, in a sub-folder of my Visual Studio 2015 folder. I’m not sure if it was installed because I had installed the tooling for .NET Core, or if it comes by default. Anyway, once that was sorted, dotnet publish worked fine and created a portable build for me to use.

I copied all of the files in the folder that dotnet publish created over to my Amazon Linux server (I used WinSCP for this). Then I found I needed to install .NET Core on Amazon Linux. Installation was easy, but when I tried to run the dotnet CLI, I received an error. Running dotnet --info from a bash shell, I saw:

dotnet: /usr/lib64/libstdc++.so.6: version `GLIBCXX_3.4.18' not found (required by dotnet)

After several hours of searching, I found this was due to the libstdc++ library version on my Amazon Linux distribution, which was libstdc++47. I had ensured my VM was up to date, so it seems the libstdc++ version was lagging (for whatever reason). After running the command below, I was able to successfully run the dotnet CLI, with a valid response from dotnet --info.

sudo yum install libstdc++48

So, I had installed .NET Core and fixed up the reference library it needed. After that, I needed a way to access my Web App. Kestrel is the web server built into .NET Core, and it can listen on any port you tell it to. However, it is better to put a web server in front of it as a proxy/reverse-proxy to relay requests and responses to Kestrel.

I already had a web server on my Amazon Linux VM, so I configured it with proxy and reverse-proxy mappings to the Kestrel server in my .NET Core App. Calls to my web server are forwarded on to the Kestrel server in the App, and the responses come back the same way. I ran the App with dotnet run and checked access via the web server. All good 🙂

Almost there.

Finally, I installed supervisor to manage start/stop/restart of my App, and set up a script to ensure supervisor is restarted whenever my Amazon Linux machine restarts.

Note, this would have taken days if not for the early groundwork of several people who blog about their efforts.

Now I have a Web App but no database. So next step is to get the database up and running.


Scrapy Spiders, Python Processing & Web APIs

Over the past couple of weeks, I’ve spent some time drafting a Web App for Touch Footy results. The App is built on .NET Core, and this gave me a great opportunity to review the new Visual Studio .NET Core tooling. But once I had my bare bones App, I needed some data to play with. Enter Scrapy and Python…

What I wanted was a data set that I could use in the Touch Footy App. I had a good data source, and I figured my best bet was a web scraper. Scrapy made it easy for me to scrape together my test data set. It’s built using Python (hence you need some understanding of Python to use it). Python is an interpreted language that’s great for list processing and easy to read and write.

Scrapy is an open source framework for writing web crawlers, or spiders. It gives you control over how and when you execute the spiders you’ve written, and it includes a great shell for testing and debugging your scraping commands. After looking at other web scraping options, I decided Scrapy was a neat way to get my data.

After a few hours of coding, I had a crawler that collected the data I wanted, i.e. groups, teams, fixtures and results for my Web App. I wanted to store the data in JSON – for easy processing – and Scrapy made that easy too. It was then simple to write some Python code to process the JSON for groups, teams, fixtures and results.
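
For flavour, here is a stripped-down sketch of that kind of spider. The URL, CSS selectors and field names are placeholders rather than the real site I scraped.

# Sketch: a minimal Scrapy spider that yields one JSON-friendly item per result row.
# The start URL and selectors are placeholders for illustration only.
import scrapy

class ResultsSpider(scrapy.Spider):
    name = "touch_footy_results"
    start_urls = ["https://example.com/competitions/results"]

    def parse(self, response):
        for row in response.css("table.results tr"):
            yield {
                "team_home": row.css("td.home::text").extract_first(),
                "team_away": row.css("td.away::text").extract_first(),
                "score": row.css("td.score::text").extract_first(),
            }

Running it with scrapy runspider results_spider.py -o results.json dumps the scraped items straight to a JSON file, which is what fed the processing step above.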

All good so far, and fun to boot. My next step: how to get the data to the Touch Footy Web App? Well, Visual Studio 2015 tooling for .NET Core makes it easy to add a Web API to an MVC Web Application… and several hours later I had a working spider populating data into my App.

Going Mobile

It’s been a long time coming, but I have a feeling that the Cloud is about to get real. For everybody.

Google had a good offering for a while, and Adobe have a small corner of the market. But Microsoft have been busy. They are on board, and they have turned a corner.

Microsoft Office 365 is now a fully operational cloud offering, and it works on all devices: iPhone, Galaxy, Nexus, iPad, MacBook, laptop and so on.

What this means is that you can use Outlook, Word, Excel and PowerPoint anywhere, anytime, on anything. Because not only do you get those applications… you also get a cloud file system, SharePoint.

To be honest, I think this has the potential to create a step change in the way people work and live. That’s a bold statement but, frankly, all previous offerings pale in comparison to what Office 365 offers.

Spinning ever faster

The world’s not spinning any faster… but working in IT these days sees my thoughts spinning faster than ever. I’m a career IT guy and I have to stay on top of this stuff. Not just for my career, but it’s just who I am. It’s in my DNA.

I’ve done this tech stuff for a long time, and I’m pretty quick on the uptake. But it’s getting crazy out there. The rate of tech advance increases day by day, creating new or deeper specializations over time. It’s a time-consuming effort to maintain the general knowledge to manage IT well.

Mind you, that general knowledge is actually highly specialized tech knowledge. And very valuable it is too. So, in an effort to make it more accessible for myself, I’m going to start writing down my thoughts. Maybe even organizing them.

And perhaps they will be useful not just for me. So here we go…