Serverless | Cloud Academy Blog — https://cloudacademy.com/blog/category/serverless/

The State of Serverless Today: Chatting with a Serverless Advocate
https://cloudacademy.com/blog/the-state-of-serverless-today-chatting-with-a-serverless-advocate/ — Thu, 02 Jun 2022



Cloud Academy’s CEO Stefano Bellasio sat down with Serverless Advocate Lee Gilmore to chat about the current and future state of the technology. Lee shares a ton of technical insights and passion for his work in this interview.

Lee, thank you for your time. Let’s start with a quick intro about you and why you have been spending so much time in the serverless world.

It’s great to chat with you! I’m Lee, a Global Serverless Architect working for City Electrical Factors in the UK, and City Electric Supply in the US, supporting the business in their global adoption of Serverless. I’ve also recently been a technical advisor for Sedai, based in California, who are doing amazing things in the AI and autonomous world, and I’m an AWS Community Builder, active blogger, mentor, and speaker on all things Serverless!

I first got into Serverless in 2014 when Lambda and FaaS were a new ‘thing’, and quickly started to use it in production solutions in enterprise organisations, starting with background processing tasks and progressing usage in line with new AWS Serverless services as they came out (Step Functions, AppSync, EventBridge, etc.). Jump forward to 2022 and Serverless, in my opinion, is both the present and the future of enterprise architectures.


The last few years have seen major focus and investment from AWS in the Serverless space, with many traditional ‘serverful’ services like RDS Aurora, Redshift and MSK now making their Serverless introduction! I only see this continuing over the coming years, as teams increase their agility, and reduce their costs, through a move to services requiring no maintenance or operation. I also see more teams integrating services like Lego building blocks, reducing the need for ‘glue code’ using Lambda, which many in the industry believe to be the future of low or no code solutions.

We do however have a way to go on the database front, with the only viable options for production workloads being DynamoDB and RDS/RDS Proxy in my opinion, due to scaling and connection management concerns with alternatives. I would love to see a Serverless version of DocumentDB with the connection management taken care of for us, and Aurora Serverless v2 is still too new to go all in on.

What’s your experience with Serverless in large organizations? Is it a normal component and framework inside their orgs or is it still “new” for most of them?

I have been lucky to work for technically minded organisations like City (CEF/CES), AO.com and Sage PLC; all of which had an event-driven and Serverless first mindset from the top down.

A great example of scale would be at Sage where I architected the Sage Online Services (Online Payslips, Online Timesheets, Online Documents etc), which had 3.5m+ employees using the system, and over 150K companies registered, each company typically with many users for managing the employees. The domain services which made up the overall solution were fully Serverless, and it was estimated to have saved the company around £80K per annum on AWS operational costs alone; whilst increasing agility to allow for deployments to other countries like France and Canada through internationalisation which was baked into the design from the start.

What do you think of some of the newer features of something like function URLs? What would be its main advantages and challenges?

This is a great question. I can see this feature being used in a limited capacity in enterprise organisations, as we already have API Gateway and HTTP APIs, which are simple to set up through IaC with no real overheads, as well as being fully featured.

An example where I may use this feature would be a very simple isolated webhook; but the limitations on request and response validation, caching, custom domain names (which you can work around using CloudFront distributions), and limited authentication options mean its use case is niche for me.

New patterns and reasons to use this feature will emerge over time though, as the Serverless community are great at finding new use cases and patterns for new features!

AWS announced that Lambda will now support up to 10 GB Ephemeral Storage. What do you think the effect of this will be for users?

This is such a great feature, and opens up so many new use cases! In the past, the teams I worked with typically used the available 512MB storage for limited use cases such as storing small dynamic config files and email templates as HTML with placeholders. This allowed us to cache a small number of email templates so we didn’t need to keep reading from S3 on every invocation.

Now we can support use cases like pulling down a huge quantity of sizeable images and large font files dynamically which are subsequently cached for generating different sized marketing ads on the fly in different languages, without needing to constantly read from S3 or EFS every time the Lambda is invoked. This is a use case I had a year ago which would become very simple now, yet back then I had to work around this limitation for speed and cost with a complex solution.
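To make that concrete, here is a minimal sketch of the caching pattern described above, assuming a hypothetical bucket and object key; the point is simply that anything written to /tmp survives across warm invocations of the same execution environment, up to the configured ephemeral storage size.

```python
import os
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket/key names for illustration only.
ASSET_BUCKET = "my-marketing-assets"
FONT_KEY = "fonts/brand-font.ttf"
LOCAL_FONT = "/tmp/brand-font.ttf"  # /tmp can now be sized up to 10 GB


def handler(event, context):
    # Download once per execution environment; warm invocations reuse the cached file.
    if not os.path.exists(LOCAL_FONT):
        s3.download_file(ASSET_BUCKET, FONT_KEY, LOCAL_FONT)

    # ... render ads using the locally cached font ...
    return {"cached": True, "path": LOCAL_FONT}
```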

I was reading about AWS Lambda Power Tuning: at what stage of development/architecture would using this become more relevant?

So for me this should ideally be done autonomously over time through AIOps based on machine learning algorithms, which is where companies such as Sedai are excelling.

This to me is not something we can do once within the SDLC; this is something which is fluid over time based on changes in environment and user behaviour, and code and dependency updates. Ideally this would run in your CI/CD pipelines for constant validation. 

That being said, historically I think this is something teams unfortunately forget about and generally just over provision memory, and then never come back to it unless they hit issues.
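As a rough illustration, the open-source AWS Lambda Power Tuning project is deployed as a Step Functions state machine, so running it from a pipeline is largely a matter of starting an execution. The ARNs below are placeholders, and the exact input keys should be checked against the project’s documentation:

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# Placeholder ARN for a deployed aws-lambda-power-tuning state machine.
STATE_MACHINE_ARN = "arn:aws:states:eu-west-1:123456789012:stateMachine:powerTuningStateMachine"

# Input shape follows the tool's documented format at the time of writing;
# treat the exact keys as an assumption and verify against the project's README.
execution_input = {
    "lambdaARN": "arn:aws:lambda:eu-west-1:123456789012:function:my-function",
    "powerValues": [128, 256, 512, 1024, 2048],  # memory sizes to benchmark
    "num": 50,                                    # invocations per memory size
    "payload": {},                                # test event sent to the function
}

response = sfn.start_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    input=json.dumps(execution_input),
)
print(response["executionArn"])
```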

Are there issues specific to security that enterprises need to be aware of?

I am a big advocate of Serverless Threat Modelling, as serverless as a paradigm typically means a greater attack surface for bad actors due to the use of many more services compared to Serverful solutions; each service with their own specific configurations and limits, extrapolated out across the full enterprise! Serverless Threat Modelling allows teams to look at the proposed architecture as a group to weed out and mitigate some of these potential threats as early as possible in the SDLC. I have written a detailed article on Medium explaining how to use this approach with your teams, and what the tangible benefits are.

Examples of common serverless threats could be denial of wallet attacks due to poor or non-existent rate limiting and authentication, lack of controls around malicious payloads or file uploads containing malware, or information disclosure due to overly open privileges (for example, publicly accessible S3 buckets or OpenSearch clusters).
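As one small, hedged example of turning a threat-model finding into an automated check, the sketch below uses boto3 to flag buckets that have no (or an incomplete) public access block configuration — one of the information disclosure risks mentioned above:

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

# Illustrative check only: flag buckets whose public access block is missing or partial.
for bucket in s3.list_buckets()["Buckets"]:
    name = bucket["Name"]
    try:
        config = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
        if not all(config.values()):
            print(f"{name}: public access block only partially enabled: {config}")
    except ClientError:
        # No configuration at all — worth reviewing as part of the threat model.
        print(f"{name}: no public access block configured")
```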


What are some disadvantages to serverless architectures for companies already using ‘serverful’ – and what would be some good ways to overcome these challenges?

Serverless is complex when done at an enterprise level, and one of the biggest issues I typically see is around education and lack of experience across an organisation. There is a famous tweet from Elon Musk stating, “Prototypes are easy, production is hard”, and this is no different in the Serverless World!

I typically see teams starting out falling into what I call the ‘Serverless Dunning-Kruger’ effect, where they quickly create a basic serverless app, push it to the cloud, and fall into the trap of thinking it’s that easy. In my experience this would be prototype quality at this stage. As teams get more experienced, they soon realise that there are many more areas to consider when productionising a serverless solution, such as choosing between services which have feature parity, huge amounts of service configuration to reason about, disaster recovery needs, compliance, caching, authorisation and authentication, load testing for downstream systems, and so on – I could go on! This can lead to cognitive load and frustration at times, and it takes time for teams to become fully comfortable with using Serverless at an enterprise scale, compared to perhaps more legacy container-based and n-tier style applications.

There are undoubtedly huge advantages to serverless, but an organisation that is currently excelling with serverful solutions, and has experts and experience in that field, should see a move to serverless as a marathon, not a sprint, in my opinion.

What have you found to be the state of serverless experts in the field, or at least practitioners that are experienced? Is there a knowledge gap?

I think there is a large gap currently when it comes to expert serverless experience at an enterprise level, as this requires more thought leadership around design patterns, reference architectures and governance at large scale. An example of somebody who is excelling in this field would be Sheen Brisals in my humble opinion. When I start working with an organisation at a global level I typically look at three key areas: 


Firstly, I use the ‘Serverless Architecture Layers’ pattern to help distributed teams focus on how to design their solutions so that they can easily be consumed and integrated with each other, business logic is reusable, we have sharable components, and teams don’t need to reinvent the wheel when it comes to cross-cutting concerns (such as logging, tracing, authentication, authorisation, and event-driven communication, to name a few). Without this governance, organisations can get into a mess when they have many autonomous teams working in their own silos, as described by Conway’s Law.

Secondly, I look to implement a Thoughtworks Tech Radar to get some governance around how teams work day to day with serverless and the technologies and frameworks they use; the main aim is to allow us to focus on reference architectures, reusable components, cross-cutting concerns, and ways of working. Again, without this governance, and with teams working in silos, reuse of both knowledge and solutions is near impossible unless you tackle this early.

Lastly, I help teams work through the ‘Serverless Dunning-Kruger’ effect we discussed earlier using a method I call TACTICAL DD(R), which prompts teams to think about non-functional requirements at the definition-of-ready and definition-of-done stages; again, a light framework covering the main areas typically missed when productionising serverless solutions. This, alongside the Well-Architected Framework, is very beneficial in my experience.

Using these three strategic approaches can help ensure we have a more productive transition to Serverless across an organisation in my experience.

Last question, what’s a dream project you’d love to work on and solve by using serverless architecture?

It’s the project I am working on currently at City, which is taking our more monolithic solutions and transitioning them to Serverless over time globally! When I was approached by our Global Director of Software Engineering, Matthew Carr, about the role of Global Serverless Architect, I was compelled to apply due to his exciting serverless architectural vision for City, and the challenges that come with this for a hugely successful global organisation. The great thing about City is the people, with the development teams I work with having a real thirst and drive for serverless knowledge and experience, whilst pushing the boundaries of serverless innovation internally through well thought out POCs.

References

Serverless Architecture Layers: https://levelup.gitconnected.com/serverless-architecture-layers-a9dc50e9b342
Serverless Threat Modelling: https://leejamesgilmore.medium.com/serverless-threat-modelling-df8e4028ef6d
Serverless TACTICAL DD(R): https://levelup.gitconnected.com/serverless-tactical-dd-r-23d18d529fa1 


AWS Lambda re:Invent Wrap-up Week 1
https://cloudacademy.com/blog/aws-lambda-reinvent-wrap-up-week-1/ — Fri, 11 Dec 2020


All the newest AWS Lambda announcements

In the first week of re:Invent, there were a few interesting announcements for Lambda that we should take a moment to look at. A few of these have the potential to save some of us quite a lot of money, and the others help to build on the service as a whole.

One cool new addition gives some users a new way to interact with AWS that they previously might have been locked out of, based on their architectural setup. And in total we have four new noteworthy additions that we can investigate, so let’s just dive in!

Lambda duration billing — actually being charged for what you use!

I know I’m not opening up with the fireworks here, but check out this headline:

AWS Lambda duration billing shortened

While this might not seem earth-shattering, it actually helps a huge number of people who have been leaving a lot of money on the table because of the billing duration minimum. Let’s say, for example, that you have a simple data lookup and calculation function that takes 10ms to run.

This function is integral to the serverless architecture and is called millions of times per day. Previously, with the 100ms billing duration, you would be charged for 10x the amount of processing time you actually used. Unless you were somehow able to batch these requests and send them out together, which might increase the complexity of the solution, you were just losing value. Losing value really hurts the soul.

Now, by having the granularity to bill by the millisecond, the company in our example can save a huge amount on its Lambda costs!
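For a rough sense of scale, here is an illustrative back-of-the-envelope calculation for the example above (a 10ms function at 1GB of memory, five million invocations a day; the GB-second rate is the published Lambda duration price at the time of writing and varies by region):

```python
# Illustrative numbers only: 10 ms function, 1 GB memory, 5M invocations/day.
PRICE_PER_GB_SECOND = 0.0000166667  # published Lambda duration price (region-dependent)
MEMORY_GB = 1.0
INVOCATIONS_PER_DAY = 5_000_000

# Before: every invocation was rounded up to a full 100 ms billing period.
old_daily = INVOCATIONS_PER_DAY * 0.100 * MEMORY_GB * PRICE_PER_GB_SECOND
# After: billed per millisecond, so a 10 ms run is charged as 10 ms.
new_daily = INVOCATIONS_PER_DAY * 0.010 * MEMORY_GB * PRICE_PER_GB_SECOND

print(f"duration cost per day at 100 ms rounding: ${old_daily:.2f}")
print(f"duration cost per day at 1 ms rounding:   ${new_daily:.2f}")
# Roughly a 10x drop in the duration portion of the bill; request charges are unchanged.
```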

Overall this is a great update to Lambda that I’m surprised didn’t come way sooner. And I think it’s pretty safe to say that most workloads will take more than 1ms to run, so I wouldn’t expect this billing duration to drop any lower. 

Amazon CloudWatch Lambda Insights (I was blind but now I can finally see!)

Launch of Amazon CloudWatch Insights

I think one of the most painful parts about using Lambda is learning how to optimize and make things more efficient.

Having the billing granularity lowered is a real boon to many people, but if you are not currently using the service the right way, you are leaving even more money on the table. 

Creating and maintaining efficient code is the best way to save money and utilize Lambda to its fullest. With that in mind, the ability to see and understand how your functions are performing was dreadfully lacking. 

Now, with CloudWatch Lambda Insights, we have the ability to keep an eye on our Lambda functions. Insights gives us the visibility to monitor and troubleshoot by providing access to automatically created dashboards that summarize the performance and health of Lambda functions. This could be used to help diagnose memory leaks or changes in performance for your code when trying out new versions.

I can stretch out my legs!

AWS Lambda enhanced horsepower

That’s a lot of room for activities. With 10GB of memory to play with, Lambda can actually start doing some real computational problem-solving. With the previous limit of 3GB, there was not a whole lot of headroom for memory-intensive workloads.

If you think about it, the original implementation of Lambda addressed small functions that could burst as needed. With the update, we can now work on larger, more substantial workloads.

This means we now have the ability to do batch processing, ETL jobs, and a number of other media-type workloads. Lambda can even be used for “large file” serverless video rendering and processing. 

These are things that, traditionally, I would have wanted to put on an EC2 instance to process.  I’m very comfortable with VMs, but managing a fleet of EC2 instances is like herding cattle.  

Early in the days of AWS, their training referenced cloud resources as “cattle, not pets.”  At the time, the corporate message was that we should not get emotionally attached to our resources. Today, the analogy has evolved, I think, to be more fitting. Now, I’m having to manage the herd, and it is a chore. 

Although, even with the advantage of having up to 6 vCPUs to play with — which AWS states is “a thread of either an Intel Xeon core or an AMD EPYC core” — I could see some issues with having enough raw compute to deal with these large memory workloads. And we are still constrained by the 50MB package size for building our Lambda solutions. Just stuff to think about, I guess.
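For reference, taking advantage of the new ceiling is just a configuration change; a minimal sketch with a hypothetical function name might look like this (CPU is allocated in proportion to memory, which is where the vCPU count comes from):

```python
import boto3

lambda_client = boto3.client("lambda")

# Hypothetical function name; 10240 MB is the new memory ceiling discussed above.
lambda_client.update_function_configuration(
    FunctionName="video-render-worker",
    MemorySize=10240,
)
```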

For that special person who married containers… and wants to use Lambda.

AWS Lambda now supports container images

Alrighty, well this one has a few angles we can take a look at. 

The first one I see is that it helps address the issues I stated above about Lambda. This functionality allows you to package up dependencies that are larger than 50MB, up to 10GB! This is a huge increase that gives Lambda some much-needed reach. The challenge is that it does require you to become familiar with container development and learn how to use Docker, but at least there is a path now!

The next avenue is for people who are so fully invested in containers, their upkeep, and their infrastructure, that Lambda was untenable before. This gives that group of customers a way to use the serverless technology in a hilariously server-filled way. 

Amazon is providing base images for each supported Lambda runtime (Python, Node.js, Java, .Net, Go, Ruby), and you can add your code and dependencies from there. They are also releasing an open-source Lambda runtime interface emulator that will allow you to run local tests of your container image to check if it will deploy correctly. Neat!

The wrap-up 

Overall these are some nice improvements to the service that I think many of us will find handy. While most of them are not super groundbreaking, they are going to allow us to start using the service in new ways. 

I think the larger memory footprint of 10GB is the most impressive update, as it will allow people to really start moving more complex workloads into the serverless cloud, as it were. 

However, a close runner-up in my book is the billing duration being lowered to 1ms. I truly think this will save some people a ton of money, or at least provide some amount of headache reduction.

Having greater logging and visibility in Lambda is a welcome update, although it is something I have wanted all along. Beggars can’t be choosers. (I suppose.) 

Finally, having the ability to use Lambda within a container is pretty nifty, but personally, it doesn’t grab my attention that much. Maybe it’s because I lack imagination in that area. I’ll keep an eye out and see what people start to do with the technology and come back to let you know.

Cheers!

-Will Meadows


How to Go Serverless Like a Pro
https://cloudacademy.com/blog/how-to-go-serverless-like-a-pro/ — Tue, 22 Oct 2019



So, no servers?

Yeah, I checked and there are definitely no servers.

Well…the cloud service providers do need servers to host and run the code, but we don’t have to worry about it. Which operating system to use, how and when to run the instances, the scalability, and all the architecture is managed by the provider. 

But that doesn’t mean there’s no management at all. It’s a common misconception that in a serverless paradigm, we don’t have to care about monitoring, testing, securing, and other details that we are used to managing in other paradigms. So let’s explore the main characteristics that we need to take into consideration when building a serverless solution.

First, why serverless?

One of the great advantages of serverless is that you only pay for what you use. This is commonly known as “zero-scale,” which means that when you don’t use it, the function can be scaled down to zero replicas so it stops consuming resources — not only network I/O, but also CPU and RAM — and then brought back to the required number of replicas when it is needed.

The trigger of a function on an AWS Lambda can be an API gateway event, a modification on a DynamoDB table or even a modification on an S3 file as defined in What Are AWS Lambda Triggers?  But to really save money on serverless, you need to take into consideration all of the services that a Lambda needs to work. Serverless architecture provides many advantages, but it also introduces new challenges. In this article, we’ll provide best practices when building a serverless solution.

To deep dive into building, deploying, and managing the serverless framework, check out Cloud Academy’s Serverless Training Library. It’s loaded with content and hands-on labs to give you the practical experience you need to integrate serverless architecture into your cloud IT environment. 

Serverless Training Library

Costs

Storage

Even though it is not a direct cost, it is a common architectural design to store some of the assets used by a Lambda in an S3 bucket, so we need to add the S3 cost to the total cost.

Network

If you’re sending or receiving large amounts of data on each request, you need to carefully review this cost, because during peak hours it can climb quickly.

API calls

This is another hidden cost, since it’s not charged to the Lambda resources. You may have a lot of API calls to consume database information or other data, so it is still an important part of the total cost.

Cold starts

A cold start is the first time the Lambda is executed after it has been scaled down to zero replicas (typically 40 to 60 minutes after the last execution). During a cold start, the Lambda takes longer to get everything ready and respond. So even though it is not an actual extra charge, you might want to avoid cold starts by increasing your memory limits or creating a script that “warms up” the Lambda by calling it every few minutes. Either of the two solutions represents an extra cost for the Lambda.
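A simple way to see cold starts for yourself is to rely on the fact that module-level code runs once per execution environment; the following illustrative handler logs whether an invocation was cold or warm:

```python
import time

# Module-level code runs once per execution environment, so a simple flag
# makes cold starts visible in the logs (illustrative sketch only).
COLD_START = True
INIT_TIME = time.time()


def handler(event, context):
    global COLD_START
    if COLD_START:
        print(f"cold start: environment initialised at {INIT_TIME}")
        COLD_START = False
    else:
        print("warm invocation: reusing the existing execution environment")
    return {"ok": True}
```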

The actual execution time

The execution time is measured in periods of 100ms. So if we have invocations that run for less than 100ms — let’s say 25ms — they end up costing the same as a full 100ms. And that’s why sometimes we spend more money than we actually should. Even if the execution time exceeds a period by only 5 milliseconds (105ms), we still have to pay for the whole additional period.

To get all of this information about how much we are really spending, we need to monitor the Lambda.

Monitoring the Lambda

A common mistake is to confuse zero administration with zero monitoring. In a serverless environment, we still need to pay attention to the metrics, and these will be a bit different from the traditional ones like CPU, memory, disk size, etc. Lambda CloudWatch Metrics provides very useful metrics for every deployed function. According to the AWS documentation, these metrics include:

  • Invocation Count: Measures the number of times a function is invoked in response to an event or invocation API call.
  • Invocation Duration: Measures the elapsed time from when the function code starts executing to when it stops executing.
  • Error Count: Measures the number of invocations that failed due to errors in the function (response code 4XX).
  • Throttled Count: Measures the number of Lambda function invocation attempts that were throttled due to invocation rates exceeding the customer’s concurrent limits (error code 429).
  • Iterator Age: Measures the age of the last record for each batch of records processed. Age is the difference between the time the Lambda received the batch, and the time the last record in the batch was written to the stream. This is present only if you use Amazon DynamoDB stream or Kinesis stream.
  • DLQ Errors: Shows all the messages that Lambda failed to handle. If the event was configured to be handled by the DLQ, it can be sent again to the Lambda function, generate a notification, or just be removed from the queue.

Besides the default metrics, there are plenty of monitoring services like Dashbird, Datadog, and Logz.io that can be integrated, so we can have additional metrics and better log visualization.
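As a hedged sketch, the same default metrics can also be pulled programmatically — for example, fetching the Duration metric for a hypothetical function with boto3 and CloudWatch:

```python
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

# Pull the average and maximum duration for a hypothetical function over the last day.
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Lambda",
    MetricName="Duration",
    Dimensions=[{"Name": "FunctionName", "Value": "my-function"}],
    StartTime=datetime.utcnow() - timedelta(days=1),
    EndTime=datetime.utcnow(),
    Period=3600,  # one data point per hour
    Statistics=["Average", "Maximum"],
    Unit="Milliseconds",
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Average"], point["Maximum"])
```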

Right now, everything seems very clear and straightforward, right? We have some new metrics and configurations to learn, but it is pretty similar to our traditional structures. 

But what about tests? Can we even make local tests for serverless?

Tests

Local testing

Since we don’t manage the infrastructure anymore, can we run it locally? If so, how can we do that?

We do have some options to simulate the serverless environment locally, like LocalStack and Docker-Lambda. They can simulate serverless functions and a few other services, such as an API Gateway. But most of these tools have some differences from the real environment, like permissions, the authentication layer, and other services.

The best way to check if everything is working as intended is writing the actual tests!

Unit testing

Unit tests are always a must — whether or not your app is serverless. They are the cheapest (fastest to write and execute). We can use mocked-up functions to test the business logic in isolation.
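For example, a unit test for a hypothetical order-total function might mock out the persistence dependency entirely and assert only on the business logic (the function and repository below are illustrative, not from a real code base):

```python
from unittest.mock import MagicMock


# Hypothetical business-logic function for illustration: in a real project it
# would live in your Lambda code base and call a real repository/DynamoDB table.
def calculate_order_total(items, repository):
    total = sum(item["price"] * item["quantity"] for item in items)
    repository.save_total(total)
    return total


def test_calculate_order_total_uses_repository():
    repository = MagicMock()  # stands in for the real persistence layer
    items = [{"price": 10.0, "quantity": 2}, {"price": 5.0, "quantity": 1}]

    total = calculate_order_total(items, repository)

    assert total == 25.0
    repository.save_total.assert_called_once_with(25.0)
```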

Integration testing

Integration testing will allow you to catch errors when your function interacts with external services. These tests become very important since serverless apps usually rely on a combination of external functionalities that communicate with each other constantly.

GUI testing

UI tests are usually expensive and slow because we have to run them in a manual, human-like environment. But serverless makes them cheaper thanks to fast, inexpensive parallelization.

To make the app easier to test, a good approach is to divide the function into many smaller functions that join together to accomplish the same task. One of the best ways to do this is by applying a Hexagonal Architecture.

Conclusion

Serverless architecture might be a big paradigm change, providing us with a whole new bag of useful tools and advantages. But it also introduces new challenges to developers, who need to make decisions about the new options they have. Learning the best practices before you start developing is always the easiest and shortest path to adopting any new paradigm. Hopefully, these tips will help you decide on the best approaches for your next project.


Google Cloud Functions vs. AWS Lambda: The Fight for Serverless Cloud Domination
https://cloudacademy.com/blog/google-cloud-functions-vs-aws-lambda-the-fight-for-serverless-cloud-domination/ — Fri, 06 Sep 2019


Serverless computing: What is it and why is it important?

A quick background

The general concept of serverless computing was introduced to the market by Amazon Web Services (AWS) around 2014 with the release of AWS Lambda. As we know, cloud computing has made it possible for users to manage virtual computers and services, but customers still have to be proficient with provisioning and managing compute resources. 

AWS decided to take another step in making cloud computing easier and more accessible by managing the underlying compute layer (or abstracting the infrastructure layer as you might hear it said). In the case of AWS, Lambda runs a code function without requiring you to provision the virtual machine and the operating system that runs that code. 

In this article, we’ll cover the basic functions of serverless computing. To deep-dive into this topic and learn how to build, deploy, and manage the Serverless framework, check out Cloud Academy’s Serverless Training Library. With Learning Paths, Courses, Quizzes, Exams, and Hands-on Labs, you’ll gain the technical knowledge and practical experience that you need to integrate serverless architecture into your cloud IT environment.

Google Cloud Functions vs. AWS Lambda

Why is serverless computing important?

Let’s clear something up first: There is still a server involved in the serverless model, but the cloud provider manages that resource, not you, so serverless computing can possibly be better described as Functions-as-a-Service. Serverless computing is a bit like a car rental service. You just want a vehicle to get you to your destination, whether that is just across town or across the country. It is expected you will drive carefully when using the vehicle, and you will report any damage. But you are not expected to pay for the car to be built or delivered to the pickup facility first before you use it, and you are not expected to contribute to the cost of buying or preparing the car. You only pay for the time that you use the service.

The second thing to bear in mind with serverless computing — and this is the main benefit — is the effect of all the managing that the cloud provider does for you, namely, you have more time to work on developing and delivering the application! 

To summarize, you get these advantages from serverless computing:

  • Less worry — you don’t have to provision or manage the server
  • Scalability — can handle any workloads so your work remains viable
  • Cost — costs can stay under control as you cannot overprovision

General info on AWS Lambda and Google Cloud Functions

Amazon was first to market with serverless functions through their Lambda offering in 2014, and as such has been at the forefront of development. Google Cloud Functions was launched to beta in 2017 and to general availability in 2018. Google’s offering was about four years behind but has managed to catch up in many ways.

Lambda can be used in conjunction with other related services on AWS such as:

  • Serverless Application Model (SAM) — an open-source framework for building serverless applications
  • Serverless Application Repository — a managed repository for serverless applications
  • Cloud9 — an integrated development environment (IDE) for writing, running, and debugging code

Google Cloud Functions is part of a larger family of serverless offerings which includes:

  • Cloud Functions — serverless code
  • App Engine — serverless application development platform
  • Cloud Run —  stateless containers

It’s good to remember that in general, both Lambda and Cloud Functions have been designed to play nicely with tons of other services, as long as they’re in the same provider ecosystem.

Practical Applications of Google Cloud Functions and AWS Lambda

Here are a couple of real-world examples of serverless computing, whether they’re implemented on Google Cloud Functions or Amazon Lambda:

  • Realtime stream processing — gather data without setting up infrastructure or logging. This can be event-driven, so the functions work whether you have a few requests per day or thousands per second (see the sketch after this list).
  • Connecting to third-party applications/hardware — you can use serverless functions as lightweight integrations to other applications and automate certain tasks within your cloud environment.
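A minimal sketch of the stream-processing case might look like the handler below, which assumes a Kinesis-triggered Lambda with JSON-encoded records, and simply decodes each record before doing whatever aggregation you need:

```python
import base64
import json


def handler(event, context):
    # Each record in a Kinesis-triggered invocation carries base64-encoded data.
    for record in event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        # ... aggregate, enrich, or forward the event without any servers to manage ...
        print(payload)
```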

You can get more detailed and concrete insight and (importantly) hands-on practice by testing out Cloud Academy’s Hands-on Labs for Google Cloud Functions and AWS Lambda.

Hands-on Lab for Google Cloud Functions Events

In the Cloud Functions Hands-on Lab, you can:

  • Learn basic tenets of serverless architecture
  • Create functions from the GCP console
  • Test your skills against a subject matter expert’s real-world tasks

You can also try out our Intro to AWS Lambda Hands-on Lab, which guides you through building functions with Node.js.

Hands-on Lab on Intro to Lambda

Google Cloud Functions and AWS Lambda Features

| Functionality | AWS Lambda | Google Cloud Functions |
| --- | --- | --- |
| Scalability & availability | Automatic scaling (transparent) | Automatic scaling |
| Max. number of functions | Unlimited functions | 1,000 functions per project |
| Concurrent executions | 1,000 parallel executions per account, per region | 1,000 parallel executions (per function, for background functions) |
| Max. execution time | 900 seconds (15 minutes) | 540 seconds (9 minutes) |
| Supported languages | Java, Go, PowerShell, Node.js, C#, Python, and Ruby, plus a Runtime API which allows you to use any additional programming language to author your functions | Node.js, Python, Go |
| Deployments | .zip or .jar file consisting of your code and any dependencies | ZIP upload, Cloud Storage, or Cloud Source Repositories |
| Versioning | Versions and aliases | Cloud Source branch/tag |
| Event-driven | Event sources (S3, SNS, SES, DynamoDB, Kinesis, CloudWatch) | Cloud Pub/Sub or Cloud Storage Object Change Notifications |
| HTTP(S) invocation | API Gateway | HTTP trigger |
| Logging | CloudWatch Logs | Stackdriver Logging |
| Monitoring | CloudWatch and X-Ray | Stackdriver Monitoring |
| In-browser code editor | Only if you don’t have dependencies | Only with Cloud Source Repositories |
| Granular IAM | IAM roles | IAM roles |
| Pricing | 1M requests for free, then $0.20/1M requests, plus $0.00001667/GB-sec | 2M requests for free, then $0.40/1M invocations, plus $0.0000025/GB-sec |

Conclusion

Serverless functions are a great way to leverage the elasticity of cloud deployments. Make sure to consider the big picture in your environment, such as your medium- to long-term plans and how you want to maintain code that you create for serverless architecture. 

You can get further detailed guidance on serverless architecture by checking out our huge catalog of Learning Paths, Courses, Quizzes, Exams, and Hands-on Labs.


The Rise of Containers & Kubernetes
https://cloudacademy.com/blog/the-rise-of-containers-kubernetes/ — Mon, 19 Aug 2019


Containers have become the standard output of the development process, and Kubernetes has emerged as the standard for container orchestration platforms. With Kubernetes, containers can be managed by clusters in public cloud, hybrid cloud, and even in a multi-cloud environment.

In this article, we’ll discuss:

  • Deploying Kubernetes
  • What is Containers-as-a-Service (CaaS)?
  • Advantages of serverless CaaS platforms
  • Serverless CaaS offerings
  • Getting the serverless experience on any cloud

If you are new to these topics, Cloud Academy offers introductory courses that explain the fundamentals: Introduction to Containers and Introduction to Kubernetes.

Cloud Academy Intro to Containers Cloud Academy Intro to Kubernetes

Deploying Kubernetes

Deploying and managing Kubernetes on your own is hard, which is why managed Kubernetes services have emerged as a streamlined way to deploy containers in the public cloud. Managed Kubernetes is evolving, and serverless containers are slowly becoming the norm.

When running Kubernetes at scale, managing, operating, and scaling its infrastructure to maximize cluster utilization — without suffering from idle resources — can be a big challenge. There are too many things your development team needs to manage and configure. This includes selecting the best instance type and size, determining when to scale up or down, and making sure all of the containers are scheduled and running on the best instances — and that is even before starting to think about cost and resource optimization.

What is Containers-as-a-Service (CaaS)?

The success of Functions-as-a-Service (FaaS) shows that developers and DevOps teams don’t want to provision the underlying virtual machines or servers. They want to deploy their applications without the operational overhead. While FaaS offerings cater to running functions based on event triggers, there is further innovation in the market where automation is extensively used to support other application architectures using containers: Containers-as-a-Service (CaaS). 

Advantages of serverless CaaS platforms

  • Improved productivity – No operational overhead associated with managing the virtual machines. This not only serves as a cost saver by reducing the operational costs, but also increases the velocity of application delivery by removing friction points through automation.
  • Simple to scale – Since management of underlying virtual machines is the responsibility of the cloud provider, scaling is even more seamless.
  • Greater efficiency – With serverless CaaS, users pay based on the container size rather than the virtual machine size. This is more cost efficient than paying for virtual machine instances.

The resource efficiencies offered by serverless CaaS make it more attractive for developers, and enterprises are increasingly adopting serverless CaaS to cut down their operational costs and streamline the DevOps pipeline. Along with FaaS, most cloud providers are also offering serverless CaaS to meet the needs of containerized applications.

Serverless CaaS offerings

Cloud providers make use of automation to provide developers with easy hooks to deploy containerized environments without any need to manually provision and manage the virtual machines. While useful, these services are far from being cost-effective and fully featured. In addition, when you are running on multiple clouds, you have to learn and manage multiple services. Examples include AWS Fargate, Azure Container Instances, and Google Cloud Run.

Getting the serverless experience on any cloud

Spotinst Ocean lets you deploy containers without worrying about infrastructure management, while gaining deep visibility and dramatically optimizing costs. By providing an abstraction on top of virtual machines, it allows you to deploy Kubernetes or any containerized cluster to any cloud without the need to manage the underlying VMs.

Spotinst Ocean provides a true serverless experience by introducing a new, three-layer approach of automating and optimizing containers workloads:

  1. The pricing model layer – Spot, On-Demand, and Reserved Instances.
  2. The instance sizing layer – Choosing the right infrastructure size and type to satisfy the actual containers requirements.
  3. The containers utilization layer – Changing the limits and resource requests of the containers in real time, based on their consumption and utilization.

Summary

Modern application architectures require a more elastic infrastructure, and CaaS makes it easy to use infrastructure resources more efficiently while providing the flexibility to run many different workloads. While every cloud provider has a CaaS offering to help its customers deploy modern applications, Spotinst Ocean lets you deploy containerized applications on any cloud provider by tapping into spot instances and reserved instances. Spotinst Ocean provides both the resource efficiencies of CaaS and additional cost efficiencies using analytics and automation, with many market-leading organizations using it with great satisfaction.

Learn more

Join our upcoming “Serverless Containers” webinar on August 28, 2019 to learn how you can run your Kubernetes cluster without managing and operating its underlying infrastructure. We’ll discuss the different services provided by cloud providers and third-party products.

Kevin McGrath, CTO at Spotinst, Tomer Hadassi, Solutions Architects Team Lead at Spotinst, and Thomas Mitchell, Azure Researcher & Trainer at Cloud Academy, will introduce a new approach to automating and optimizing container workloads. This approach will reduce cloud costs and operational overhead, while improving efficiency and performance.

By joining this webinar, you will learn more about:

– Containers-as-a-Service (CaaS)
– The rise of containers and Kubernetes
– Automating, managing, and optimizing container workloads
– CaaS solutions, including a detailed technical demo from Spotinst Ocean
– Reducing infrastructure costs and improving performance

"Serverless Containers" - Running Kubernetes Workloads Without Managing Infrastructure on Any Cloud

 


Microservices: Using Distributed Tracing for Monitoring & Troubleshooting
https://cloudacademy.com/blog/microservices-using-distributed-tracing-monitoring-troubleshooting/ — Fri, 12 Jul 2019


Modern applications can be found everywhere today. Distributed microservices, cloud-native, managed resources, and serverless are parts of this complex whole. But how can we keep track of so many elements in our production environments?

In these distributed environments, microservices communicate with each other in different ways: synchronous and asynchronous. Distributed tracing has become a crucial component of observability — both for performance monitoring and troubleshooting.

In this article, I’m going to discuss some key topics in instrumentation, distributed tracing, and modern distributed applications. To better understand these topics, watch our webinar on Distributed Tracing in Modern Applications.

What is distributed tracing?

Tracing is a way of profiling and monitoring events in applications. With the right information, a trace can reveal the performance of critical operations. How long does a customer wait for an order to be completed? It can also help break down our operations across our database, APIs, or other microservices.

Distributed tracing is a newer form of tracing that is better adapted to microservice-based applications. It allows engineers to see traces from end to end, locate failures, and improve overall performance. Instead of tracking the path within a single application domain, distributed tracing follows a request from start to end.

For example, a customer makes a request on our website and then we update the item suggestion list. As the request spans across multiple resources, distributed tracing takes into account the services, APIs, and resources it interacts with.

Microservices applications diagram

Applications become more and more distributed

Automated microservices instrumentation

Exploring distributed traces might sound simple, but collecting the right traces with the right context requires considerable time and effort. Let’s follow an example where we have an e-commerce website that updates our database with purchases:

Microservices (not distributed) diagram

In this example, which is not distributed, to create an interesting trace, we will need to collect the following information:

  1. HTTP request details:
    1. URL
    2. Headers
    3. The ID of the user
    4. Status code
  2. Spring Web:
    1. Matched route and function
    2. Request params
    3. Process duration
  3. RDS database:
    1. Table name
    2. Operation (SELECT, INSERT, …)
    3. Duration
    4. Result
To capture this information, we can either do it manually before and after every operation that we make in our code, or automatically instrument it into common libraries.

By “automated instrumentation,” we mean “hooking” into a module. For example, every time we make a GET request with “Apache HttpClient,” there will be a listener. It will extract and store this information as part of the “trace.”

Collecting this information manually using logging is not recommended, since logs are not well structured. Using a more standard way, like OpenTracing, will allow us to filter for relevant traces. We will also have the option to present them nicely in many tools. For example, in Python it might look like this:

Capturing an HTTP request in Python with OpenTracing
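The original screenshot is not reproduced here; as a rough equivalent, a manually instrumented HTTP call with the OpenTracing Python API might look like the sketch below (the endpoint URL and tags are illustrative assumptions):

```python
import requests
from opentracing import global_tracer


def fetch_user(user_id):
    tracer = global_tracer()  # whichever OpenTracing-compatible tracer is configured
    # Start a span around the outgoing HTTP call and tag it with useful context.
    with tracer.start_active_span("fetch_user") as scope:
        url = f"https://api.example.com/users/{user_id}"  # hypothetical endpoint
        scope.span.set_tag("http.method", "GET")
        scope.span.set_tag("http.url", url)
        scope.span.set_tag("user.id", user_id)

        response = requests.get(url)

        scope.span.set_tag("http.status_code", response.status_code)
        return response.json()
```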

As you can see, this kind of instrumentation requires heavy lifting. It involves integrating to our libraries, as well as constant maintenance to support our dynamic environments.

Standards and tools

OpenTracing

Luckily for us, there are already microservices standards and tools that can help us to get started with our first distributed traces. The first pioneer was OpenTracing, which is a new, open distributed tracing standard for applications and OSS packages.

Using OpenTracing, developers can collect traces into spans, and attach extra context (data) to each of them. For example:

OpenTracing Code

Spans can have a relation – `child of` or `follows from`. These relations can help us get a better understanding of performance implications.

To trace a request across distributed microservices spans, we must implement the inject/extract mechanism to inject a unique “transaction ID.” Then we would extract it on the receiving service. Note that a request can travel between microservices in HTTP requests, message queues, notifications, sockets, and more.
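A hedged sketch of that inject/extract mechanism with the OpenTracing Python API could look like this (the service name tag is an illustrative assumption):

```python
import requests
from opentracing import global_tracer
from opentracing.propagation import Format

tracer = global_tracer()  # whichever OpenTracing-compatible tracer is configured


# Sending side: inject the current span context into the outgoing HTTP headers.
def call_downstream(url):
    headers = {}
    if tracer.active_span is not None:
        tracer.inject(tracer.active_span.context, Format.HTTP_HEADERS, headers)
    return requests.get(url, headers=headers)


# Receiving side: extract the parent context and continue the same trace.
def handle_request(request_headers):
    parent_ctx = tracer.extract(Format.HTTP_HEADERS, dict(request_headers))
    with tracer.start_active_span("handle_request", child_of=parent_ctx) as scope:
        scope.span.set_tag("component", "order-service")  # hypothetical service name
        # ... process the request and return a response ...
        return "ok"
```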

Another common standard is OpenCensus, which collects application metrics and distributed traces. OpenCensus and OpenTracing recently merged into a unified standard called OpenTelemetry.

Jaeger

After the exhaustive task of collecting distributed traces comes the part of visualizing them. The most popular open source tool is Jaeger, which is also compatible with the OpenTracing format. Jaeger renders our traces in a timeline view, which helps us understand the flow of the request. It can also assist in detecting performance bottlenecks:

Jaeger for detecting performance bottlenecks

Managed solution

Ultimately, you might want to consider an automated distributed tracing solution. Epsagon, for example, uses automated instrumentation to provide microservices performance monitoring and visualization of requests and errors in an easier way:

Automated microservices diagram

A managed solution for distributed tracing provides the following benefits:

  1. Traces are being collected automatically without code changes.
  2. Visualizing traces and service maps with metrics and data.
  3. Query data and logs across all traces.

Summary

Distributed tracing is crucial for understanding complex microservices applications. Without it, teams can be blind to what is happening in their production environment when there is a performance issue or other errors.

Although there are standards for implementing, collecting, and presenting distributed traces, it is not that simple to do manually. It involves a lot of effort to get up and running. Leveraging automated tools or managed solutions can cut down the level of effort and maintenance, bringing much more value to your business.

To deep dive into microservices, check out Cloud Academy’s Microservices Applications Learning Paths, Courses, and Hands-on Labs.

Microservices: Cloud Academy Training


Top 20 Open Source Tools for DevOps Success
https://cloudacademy.com/blog/top-20-open-source-tools-for-devops-success/ — Tue, 09 Jul 2019


Open source tools perform a very specific task, and the source code is openly published for use or modification free of charge. I’ve written about DevOps multiple times on this blog. I reiterate the point that DevOps is not about specific tools. It’s a philosophy for building and improving software value streams, and there are three principles: flow, feedback, learning.

The philosophy is simple: Optimize for fast flow from development to production, integrate feedback from production into development, and continuously experiment to improve that process. These principles manifest themselves in software teams as continuous delivery (and hopefully deployment), highly integrated telemetry, and a culture driven by learning and experimentation. That said, certain tools make achieving flow, feedback, and learning easier. You don’t have to shell out big bucks to third party vendors though. You can build a DevOps value stream with established open source tools.

Let’s start with the principle of flow and what the open source community has to offer for supporting continuous delivery. In this article, we’ll cover the top 20 open source tools to achieve DevOps success. But to dive deeper into deployment pipelines and the role different tools play, check out Cloud Academy’s DevOps – Continuous Integration and Continuous Delivery (CI/CD) Tools and Services Learning Path.

DevOps Playbook

Open Source Continuous Delivery

1. GitLab is a great project for source control management, configuring continuous integration, and managing deployments. GitLab offers a unified interface for continuous integration and deployment branded as “Auto DevOps.” Team members can trigger deploys or automatically create dedicated environments for a pull request and see test results all within the same system.

2. Kubernetes and Docker, with associated tools like docker-compose, make it easy to maintain development environments and work with any language or framework. Kubernetes is the go-to container orchestration platform today, so look here first for deploying containerized applications to production (and dev, test, staging, etc.).

3. Spinnaker is designed for continuous delivery. Spinnaker removes grunt work from packaging and deploying applications. It has built-in support for continuous delivery practices like canary deployments, blue-green deploys, and even percentage based rollouts. Spinnaker abstracts away the underlying infrastructure so you can build a continuous delivery pipeline on AWS, GCP, or even on your own Kubernetes cluster.

Infrastructure-as-Code

The underlying infrastructure must be created and configured regardless of it being on a cloud provider or container orchestration. Infrastructure-as-code is the DevOps way.

4. Terraform (from Hashicorp) is the best tool for open source infrastructure-as-code. It supports AWS, GCP, Azure, DigitalOcean, and more using a declarative language. Terraform handles the underlying infrastructure such as EC2 instances, networking, and load balancers. It’s not intended to configure software running on that infrastructure. That’s where configuration management and immutable infrastructure tools have a role to play.

5. Packer (also from Hashicorp) is a tool for building immutable infrastructure. Packer can build Docker images, Amazon Machine Images, and other virtual machine formats. Its flexibility makes it an easy choice for the “package” step in cloud-based deployment processes. You can even integrate Packer and Spinnaker for golden image deployments.

6-9. Ansible, Chef, Puppet, and SaltStack are configuration management tools. Each varies slightly in design and intended use. They’re all intended to configure mutable state across your infrastructure. The odds are you’ll end up mixing Terraform, Ansible, and Packer for a complete infrastructure-as-code solution. Cloud Academy’s Cloud Configuration Management Tools Learning Path gives you an overview of configuration management, and then introduces you to three of the most common tools used today: Ansible, Puppet, and Chef. Cloud Academy’s Ansible Learning Path, developed in partnership with Ansible, teaches configuration management and application deployment. It demonstrates how Ansible’s flexibility can be used to solve common DevOps problems.

DevOps: Open Source Tools & Cloud Configuration Management Learning Path

Open Source Telemetry

The SDLC really starts when code enters production. The DevOps principle of feedback calls for using production telemetry to inform development work. Or in other words: use real time operational data such as time series data, logs, and alerts to understand the reality and act accordingly. The FOSS community supports multiple projects to bring telemetry into your day-to-day work.

10. Prometheus is a Cloud Native Computing Foundation (CNCF) project for managing time series data and alerts. It’s integrated into Kubernetes, another CNCF project, as well. In fact, many of the CNCF projects prefer Prometheus for metric data. Support is not limited to CNCF projects either. Prometheus is a strong choice for many different infrastructures because it uses an open API format, includes alert support, and integrates with many common components.

11. Statsd is a Prometheus alternative for time series data. Prometheus uses a pull approach. This is good for understanding if a monitored system is unavailable, but it requires registering new systems with Prometheus. Statsd, on the other hand, uses a push model. Any system can push data into a statsd server, but data is sent over UDP. Statsd, unlike Prometheus, only supports time series data, so you’ll need another tool to manage alerts.

12. Grafana is for data visualization. Projects like Prometheus and statsd only handle data collection. They rely on other tools for visualization. This is where Grafana comes in. Grafana is a flexible visualization system with integrations for popular data sources like Prometheus, Statsd, and AWS CloudWatch. Grafana dashboards are just text files, which makes them a natural fit for infrastructure-as-code practices.

13. The Elastic Stack is a complete solution for time series data and logs. The Elastic Stack uses Elasticsearch for time series data and log storage, paired with Kibana for visualization. Logstash connects and transforms logs from various components, like web server logs or Redis server logs, into a standard format.

14. Fluentd is another CNCF telemetry project. It acts as a unified logging layer for ingestion, transformation, and routing. Data streams may be forwarded to multiple destinations, such as statsd for real-time interactions or S3 for archiving. Fluentd supports many data sources and data outputs. Projects like Fluentd are especially useful for connecting disparate systems to a standard set of upstream tools.

15. Jaeger is a distributed request tracing project compatible with OpenTracing. Traces track individual interactions within a system across all instrumented components, along with latency and other metadata. This is a must for microservice and other distributed architectures, since engineers can pinpoint exactly where, what, and when something went wrong.
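For a sense of what instrumentation looks like, here is a hedged sketch using the Jaeger Python client’s OpenTracing API; the service name, span name, and sampler settings are illustrative and assume a local Jaeger agent is running:

from jaeger_client import Config

# Minimal tracer setup; sampler settings and service name are illustrative.
config = Config(
    config={'sampler': {'type': 'const', 'param': 1}, 'logging': True},
    service_name='checkout-service',
    validate=True,
)
tracer = config.initialize_tracer()

with tracer.start_active_span('charge-credit-card') as scope:
    scope.span.set_tag('order.id', '12345')  # metadata attached to this span
    # ... call downstream services here ...

tracer.close()  # flush buffered spans to the Jaeger agent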

Expanding Out

The third way of DevOps calls for continuous improvement through experimentation and learning. Once the continuous delivery pipeline is established, along with the telemetry to improve velocity, quality, and customer satisfaction, teams can start experimenting. Here are some projects that help teams improve different aspects of their process.

16. Chaos Monkey is a project by Netflix to introduce chaos into running systems. The idea is to introduce faults into a system to increase its reliability and durability. This is part of the principles of chaos engineering, described further in Release It! and Google’s Site Reliability Engineering book. Willingly breaking your production environment may sound foreign, but doing so reveals unknowns and trains teams to design away possible failure scenarios. You don’t have to go all in at once either. You can set rules and restrictions so you don’t destroy your production environment until you’re ready.

17. Vault by HashiCorp is a tool for securing, storing, and controlling access to tokens, passwords, certificates, encryption keys, and other sensitive data using a UI, CLI, or HTTP API. It’s great for security-minded teams looking for a better solution than text files or environment variables.
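As a quick illustration, here is a hedged sketch using hvac, the community Python client for Vault; the address, token, and secret path are placeholders, and it assumes the KV v2 secrets engine is mounted at secret/:

import hvac

# Connect to a Vault server; never hard-code real tokens like this.
client = hvac.Client(url='http://127.0.0.1:8200', token='dev-only-token')

# Store and read back a secret instead of keeping it in a text file or env var.
client.secrets.kv.v2.create_or_update_secret(
    path='myapp/db', secret={'password': 's3cr3t'}
)
read = client.secrets.kv.v2.read_secret_version(path='myapp/db')
print(read['data']['data']['password'])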

Building and deploying software

You’ll encounter some of these tools building and deploying software. This list isn’t exhaustive by any means.

18. Nomad is a lightweight Kubernetes alternative.

19. GoCD is another deployment pipeline and CI option.

20. The Serverless Framework opens the door to an entirely new architecture. Just consider the list of CNCF projects: you’re likely to uncover tools for scenarios you never considered. DevOps-focused teams will assuredly use a mix of FOSS and proprietary software when building their systems. Engineers must understand how the different projects fit into their overall architecture and leverage them to best effect.

Also keep in mind that these projects are not infrastructure specific. You can use them for your on-premises infrastructure, AWS, GCP, or Azure systems. Cloud Academy’s Terraform Learning Path teaches students to achieve DevOps success with Terraform and infrastructure-as-code, covering AWS and Azure. Engineers can learn these tools and keep their skills portable across different setups.

Don’t get lost in tooling though. You can achieve DevOps success irrespective of the underlying tools if the right culture is in place; check out the DevOps Playbook – Moving to a DevOps Culture. The secret is to build on a philosophy that values flow, feedback, and learning, and to realize those practices via tools. Learn the ideas, build a culture, and the rest will sort itself out.

The post Top 20 Open Source Tools for DevOps Success appeared first on Cloud Academy.

]]>
0
Introduction to Monitoring Serverless Applications https://cloudacademy.com/blog/introduction-to-monitoring-serverless-applications/ https://cloudacademy.com/blog/introduction-to-monitoring-serverless-applications/#respond Thu, 21 Feb 2019 18:14:34 +0000 https://cloudacademy.com/?p=30054 Serverless as an architectural pattern is now widely adopted, and has quite rightly challenged traditional approaches when it comes to design. Serverless enhances productivity through faster deployments bringing applications to life in minimal time. Time not spent provisioning and maintaining production infrastructure can be invested elsewhere to drive business value...

The post Introduction to Monitoring Serverless Applications appeared first on Cloud Academy.

]]>
Serverless as an architectural pattern is now widely adopted, and has quite rightly challenged traditional approaches when it comes to design. Serverless enhances productivity through faster deployments bringing applications to life in minimal time. Time not spent provisioning and maintaining production infrastructure can be invested elsewhere to drive business value – because at the end of the day that’s what matters!

Now, with your Serverless application shipped into production, maintaining optimal performance requires you to focus in on the operational question of “what’s going on in production?”. In other words, you’ll need to address observability for every operation that takes place within your Serverless application.

Observability

Observability binds together many aspects: monitoring, logging, tracing, and alerting. Each observation pillar provides critical insight into how your deployed Serverless application is working and, collectively, whether it is not just working but delivering real business value.

In this post, we are going to discuss each observation pillar, providing you with examples and solutions, which specifically address the Serverless ecosystem.

Monitoring

Monitoring is the main pillar which tells us, “is my system working properly?”. “Properly” can be defined by multiple parameters:

  1. Errors: every request or event that yielded an error result
  2. Latency: the amount of time it takes for a request to be processed
  3. Traffic: the total number of requests that the resource is handling

Taken together, these parameters allow us to detect services with high error rates, performance degradation across our resources, and even scaling issues when we hit higher traffic rates.

Much of our serverless deployment is undertaken within a Function as a Service (FaaS) platform. FaaS provides us with our base compute unit: the function. Popular examples of cloud-hosted, managed FaaS services include AWS Lambda, Google Cloud Functions, and Azure Functions.

Using AWS Lambda as our FaaS of choice, monitoring of Lambda functions is accomplished by using CloudWatch Metrics. With CloudWatch Metrics, every deployed function is monitored using several insightful metrics:

Serverless Dashboard

These metrics include:

  1. The number of invocations.
  2. Min/avg/max duration of invocations.
  3. Error counts, and availability (derived from errors/invocations ratio).
  4. The number of throttled requests.
  5. Iterator Age – The “age” of the oldest processed record.
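The same metrics can also be pulled programmatically. Here is a minimal, hedged boto3 sketch that fetches the error count for a function over the last hour; the function name is hypothetical:

import datetime
import boto3

cloudwatch = boto3.client('cloudwatch')
now = datetime.datetime.utcnow()

response = cloudwatch.get_metric_statistics(
    Namespace='AWS/Lambda',
    MetricName='Errors',
    Dimensions=[{'Name': 'FunctionName', 'Value': 'post-analysis'}],  # hypothetical name
    StartTime=now - datetime.timedelta(hours=1),
    EndTime=now,
    Period=300,            # 5-minute buckets
    Statistics=['Sum'],
)

for point in sorted(response['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], point['Sum'])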

For serverless, CloudWatch still misses some unique issues such as timeouts, out-of-memory errors, and even cost, which we can monitor using a dedicated serverless monitoring and troubleshooting tool such as Epsagon:

Serverless Monitoring

Logging

When a problem has been found according to our monitoring parameters, we then need to troubleshoot it. We accomplish this by consulting and analyzing all relevant logs.

Logs can be generated by print statements, custom logging, and/or exceptions. They often include very verbose information, which is why they are a necessity for debugging a problem.

When approaching our logs, we need to know what we are looking for, so searching and filtering within and across logs is essential. In AWS Lambda, all of our logs are shipped to AWS CloudWatch Logs. Each function is assigned to its own log group and one log stream for each container instance.
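Those log groups can also be searched from code. Here is a hedged boto3 sketch that filters a function’s log group for error output; the function name and filter pattern are illustrative:

import boto3

logs = boto3.client('logs')

# Each Lambda function writes to /aws/lambda/<function-name>; filter across
# all of its log streams for lines containing "ERROR".
response = logs.filter_log_events(
    logGroupName='/aws/lambda/post-analysis',  # hypothetical function name
    filterPattern='ERROR',
)

for event in response['events']:
    print(event['logStreamName'], event['message'])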

Log Archive

Once we find the correct log file, we can see the error message and gain a better understanding of what initiated the error.

There are better ways to search logs than just using CloudWatch Logs. A known pattern is to stream log data to a dedicated log aggregation service, for example, the ELK stack. With this in place, when a function fails, you can simply query and find the corresponding log for the specific function invocation.

The main problem often attributed to logs is that they have minimal or no context. When our applications become distributed, trying to correlate logs from different services can be a nightmare. For that particular issue, distributed tracing comes to the rescue.

Tracing

Tracing, or specifically distributed tracing, helps us correlate events captured across logs on different services and resources. Applied correctly, it can be used to find the root cause of our errors with minimal effort.

Let’s imagine, for example, that we’ve built a blog site with the following public endpoints:

  1. View an existing post /post
  2. Post a new blog post /new_post

And it consists of these resources and services:

Resources and Services

Now, by using the monitoring and logs methods from before we’ve noticed that there’s an error in our Post Analysis lambda. How do we progress from here and find the root cause of this issue?

Well, in microservices and serverless applications specifically, we want to be able to collect the traces from each of our services and stitch them together into a single end-to-end execution.

In order for us to analyze distributed traces, we need two main things:

  1. A distributed tracing instrumentation library
  2. A distributed tracing engine

When instrumenting our code, the most common approach is to implement OpenTracing. OpenTracing is a specification that defines the structure of traces across different programming languages for distributed tracing. Traces in OpenTracing are defined implicitly by their spans – an individual unit of work done in a distributed system. Here’s an example of constructing a trace with OpenTracing:

import opentracing

# Assumes a concrete tracer (e.g. Jaeger) has been registered; otherwise the
# default no-op tracer is returned and the snippet still runs.
tracer = opentracing.global_tracer()

span = tracer.start_span(operation_name='our operation')
scope = tracer.scope_manager.activate(span, finish_on_close=True)
try:
    value = {}['missing-key']  # do things that may trigger a KeyError
except KeyError as e:
    span.set_tag('keyerror', f'{e}')
finally:
    scope.close()  # finishes the span because finish_on_close=True

It’s advisable to use a common standard across all your services and a vendor-neutral API; this means you don’t need to make large code changes if you switch between different tracing engines. There are some downsides to this approach, though: for example, developers need to maintain span- and trace-related declarations across their codebase.

Once instrumented, we can publish those traces to a distributed tracing engine, some of which let us visually follow an end-to-end transaction within our system. A transaction is essentially the story of how data travels from one end of the system to the other.

With an engine such as Jaeger, we can view the traces organized as a timeline:

Timeline

This way we can find the exact time the error happened and, from there, the originating event that caused it.

By utilizing Epsagon, the purpose-built distributed tracing application introduced earlier, we can see that the failing Lambda in question received its input from a misbehaving Lambda (Request Processor) two hops earlier, which handled an authentication error and propagated bad input to the Post Analysis Lambda via the SNS message broker.

Transaction ID

It’s important to remember that when going serverless we have broken each of our microservices into nano-services. Each of them will have an impact on the other, and attempting to figure out the root cause can be very frustrating.

Epsagon tackles this issue by visualizing the participating elements of the system as a graph and presenting trace data directly within each node, significantly reducing the time needed to investigate the root cause.

Alerts

Last but not least come alerts. Nobody wants to sit in front of a monitor 24/7 waiting for problems to appear.

Being able to get alerts to an incident management platform is important, so relevant people will be able to get notified and take action. Popular alerting platforms are PagerDuty, OpsGenie, and even Slack!

When choosing your observability platform, you’ll need to make sure you can configure alerts based on the type of issue, the resource involved, and the destination (i.e., integrations with the platforms above). For Lambda functions, basic alerts can be configured in CloudWatch Alarms:

CloudWatch

In this example, we want to get notified when we breach a threshold of 10 or more errors within 2 consecutive 5-minute windows (a minimal sketch of such an alarm in code follows the list below). A dedicated monitoring tool can configure more specific alerts, such as:

  1. Alert if the function is timing out (rather than a general error).
  2. Alert on specific business flows KPIs.
  3. Alert regarding the performance degradation of a resource (for example Redis).
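As referenced above, here is a minimal, hedged boto3 sketch of the basic CloudWatch alarm described earlier: 10 or more errors in each of 2 consecutive 5-minute windows, notifying an SNS topic. All names and the ARN are placeholders:

import boto3

cloudwatch = boto3.client('cloudwatch')

cloudwatch.put_metric_alarm(
    AlarmName='post-analysis-errors',  # hypothetical alarm name
    Namespace='AWS/Lambda',
    MetricName='Errors',
    Dimensions=[{'Name': 'FunctionName', 'Value': 'post-analysis'}],
    Statistic='Sum',
    Period=300,                # 5-minute windows
    EvaluationPeriods=2,       # 2 consecutive windows
    Threshold=10,              # 10 or more errors
    ComparisonOperator='GreaterThanOrEqualToThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:on-call-alerts'],  # placeholder ARN
)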

Summary

Observability is a broad term that unifies many important aspects of monitoring and troubleshooting applications (in production or not).

When going serverless, observability can become a bottleneck for development velocity, so a proper tool dedicated to the distributed, event-driven nature of these applications must be in place.

Epsagon provides an automated approach to monitoring and troubleshooting distributed applications such as serverless ones, reducing the friction developers face in keeping observability over their production systems.

If you’re curious and would like to learn more about serverless applications, join our webinar on the Best Practices to Monitor and Troubleshoot Serverless Applications next March 7th at 10 am PT.

Best Practices to Monitor and Troubleshoot Serverless Applications

The post Introduction to Monitoring Serverless Applications appeared first on Cloud Academy.

]]>
0
Google Cloud Functions vs. AWS Lambda: Fight for Serverless Cloud domination Begins https://cloudacademy.com/blog/google-cloud-functions-serverless/ https://cloudacademy.com/blog/google-cloud-functions-serverless/#comments Thu, 10 Aug 2017 10:20:58 +0000 https://cloudacademy.com/blog/?p=13977 A not entirely fair comparison between alpha-release Google Cloud Functions and mature AWS Lambda: My insights into the game-changing future of serverless clouds. Serverless computing lands on Google Cloud: Welcome to Google Cloud Functions Update: The open beta of Google Cloud Functions was launched in March 2017. The comparison table has...

The post Google Cloud Functions vs. AWS Lambda: Fight for Serverless Cloud domination Begins appeared first on Cloud Academy.

]]>
A not entirely fair comparison between alpha-release Google Cloud Functions and mature AWS Lambda: My insights into the game-changing future of serverless clouds.

Serverless computing lands on Google Cloud: Welcome to Google Cloud Functions

Update: The open beta of Google Cloud Functions was launched in March 2017. The comparison table has been updated accordingly.

The alpha release of Google Cloud Functions was officially launched in February 2016 as part of the Google Cloud Platform solution. This new cloud service aims at relieving most of the pain caused by server maintenance, deployments, and scalability. It perfectly aligns with the serverless revolution started by AWS Lambda back in 2014.

“Serverless” means that you can focus on your application logic without dealing with infrastructure at all (almost). Painless development, deployment, and maintenance of a web API is still not turn-key, although modern web application frameworks have improved dramatically in the last 5 years. Serverless computing is definitely a game changer. The event-driven approach, combined with the rich cloud ecosystems offered by the main cloud vendors (AWS, Microsoft Azure, and Google Cloud Platform), opens up endless possibilities.

In this post, I would like to discuss the upcoming features of Google Cloud Functions and compare them with the current status of AWS Lambda. I’ll provide some basic examples of how you’ll be able to migrate and then test your Lambda functions on Google within minutes. I also want to explore what serverless computing may look like in just a few months.

Google Cloud Functions & AWS Lambda

First of all, I have to admit that comparing an alpha release with a two-year-old stable product is not completely fair. That said, I believe that some of the functionalities already offered by Google Cloud Functions will make a substantial positive difference, especially from a development point of view.

Here is a quick recap of the main functionalities of both products:

Functionality | AWS Lambda | Google Cloud Functions
Scalability & availability | Automatic scaling (transparent) | Automatic scaling
Max. # of functions | Unlimited functions | 1000 functions per project
Concurrent executions | 1000 parallel executions per account per region (default safety throttle) | 400 parallel executions (per function, soft limit)
Max. execution time | 300 seconds (5 minutes) | 540 seconds (9 minutes)
Supported languages | JavaScript, Java, C#, and Python | Only JavaScript
Dependencies | Deployment Packages | npm package.json
Deployments | Only ZIP upload (to Lambda or S3) | ZIP upload, Cloud Storage, or Cloud Source Repositories
Versioning | Versions and aliases | Cloud Source branch/tag
Event-driven | Event Sources (S3, SNS, SES, DynamoDB, Kinesis, CloudWatch) | Cloud Pub/Sub or Cloud Storage Object Change Notifications
HTTP(S) invocation | API Gateway | HTTP trigger
Logging | CloudWatch Logs | Stackdriver Logging
Monitoring | CloudWatch and X-Ray | Stackdriver Monitoring
In-browser code editor | Only if you don’t have dependencies | Only with Cloud Source Repositories
Granular IAM | IAM roles | IAM roles
Pricing | 1M requests for free, then $0.20/1M invocations, plus $0.00001667/GB-sec | 1M requests for free, then $0.40/1M invocations, plus $0.00000231/GB-sec
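As a rough, back-of-the-envelope illustration of the pricing rows above, the sketch below estimates a monthly bill for both services for a hypothetical workload (10M invocations, 200ms each, 256MB of memory). It deliberately ignores the free compute tiers, the per-GHz component of Google’s pricing, and billing-duration rounding, so treat it as an approximation only:

# Rates taken from the pricing row of the comparison table above.
FREE_REQUESTS = 1_000_000

def monthly_cost(invocations, avg_seconds, memory_gb,
                 price_per_million, price_per_gb_second):
    request_cost = max(0, invocations - FREE_REQUESTS) / 1_000_000 * price_per_million
    compute_cost = invocations * avg_seconds * memory_gb * price_per_gb_second
    return request_cost + compute_cost

workload = dict(invocations=10_000_000, avg_seconds=0.2, memory_gb=0.25)

aws_lambda = monthly_cost(**workload, price_per_million=0.20,
                          price_per_gb_second=0.00001667)
cloud_functions = monthly_cost(**workload, price_per_million=0.40,
                               price_per_gb_second=0.00000231)

print(f"AWS Lambda:             ~${aws_lambda:.2f}/month")
print(f"Google Cloud Functions: ~${cloud_functions:.2f}/month")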

Let’s dig deeper into each functionality.

Scalability, availability, and resource limits

Of course, this is the primary focus of both services. The key feature promises that you no longer need to worry about maintenance, downtime, or bottlenecks.

As far as AWS Lambda, scalability is completely and transparently handled by the system, which means that you don’t know how many instances or machines your functions are running on at a given time. You can monitor the usage of your Lambda functions anytime, but your visibility of the underlying architecture is limited.

On the other extreme, Google Cloud Functions explicitly creates a set of instances in the cloud. In this way, you can always check the number of machines created and monitor the load of your cluster.

In addition to scaling and monitoring, AWS Lambda has other limitations. For example, you can create an unlimited number of functions, but each execution cannot exceed five minutes (it used to be much shorter!) and you are limited to 1,000 parallel executions per account per region as a default safety throttle, which can be increased upon request. Furthermore, your zipped deployment packages can’t exceed 50MB (250MB when uncompressed). There are a few more AWS Lambda limits, but I think they actually affect only very specific scenarios, so I won’t mention them here.

On the other hand, Google Cloud Functions doesn’t seem to impose such limitations (yet?), apart from a hard limit of 20 functions per project at the time of the alpha (since raised, as the updated table above shows). I would expect this limit to eventually disappear.

Supported languages and dependency management

The first version of Lambda only supported JavaScript, and later it included Java (Jun 2015) and Python (Oct 2015). Currently, you can even write your functions in Ruby (with JRuby), or any unsupported language by running arbitrary executables (i.e. by spawning child processes).

Google Cloud Functions currently supports only JavaScript. Although there seems to be no public roadmap, I would expect Python and Java to be supported sometime later this year.

As far as dependency management and deployment, the only real weakness of AWS Lambda is its Deployment Packages. In practice, you can use external dependencies only by including them within your zipped source code, and I find this inconvenient for many reasons. First, you are forced to compile and install these external packages on the same OS used by AWS Lambda internally. After this, every time you need to change something in your own code, you have to upload it all together. Second, this is not the way modern dependency management works. Web developers are now used to declaring and versioning their code dependencies, rather than providing local compiled libraries.

Of course, the whole process can be automated, and wouldn’t a configuration file be easier and safer to maintain? Yes. 🙂
In fact, Google Cloud Functions allows you to define a simple package.json file to declare and version your npm dependencies. As soon as Python is also supported, I expect that we’ll be able to simply deploy a pip requirements.txt file. Let’s see what happens.

Deployments and versioning

As I mentioned in the previous section, I don’t like the way AWS Lambda handles deployment packages and dependencies. It forces you to re-deploy a (potentially huge) deployment package every time you change your code or update a dependency.
On the other hand, I love the possibility of having multiple versions of the same Lambda function. This makes deploying and testing a new version very easy, even from the AWS Console UI. The real trick is binding versions to aliases so that you can easily switch to new versions (or roll back to older ones) with a couple of clicks. Linking stable versions of your functions to API Gateway stages such as dev, stage, prod, etc. requires a little bit of manual configuration, but it’s totally worth it. I recommend setting up an API Gateway stage variable and using it to invoke given AWS Lambda aliases.

On this front, Google Cloud Functions chose not to reinvent the wheel and devised a developer-friendly solution that allows you to achieve versioning with git (i.e. a given branch or tag), even though you need to host your repo on Cloud Source Repositories. I’m looking forward to a more general solution that includes other mainstream git hosts such as GitHub, Bitbucket, etc.

Invocations, events, and logging

Both AWS Lambda and Google Cloud Functions support the event-driven approach. This means that you can trigger a function whenever something interesting happens within the cloud environment. They also support a simple HTTP approach.
AWS Lambda can be invoked by nearly every other AWS service including S3, SNS, SES, DynamoDB, Kinesis, Cognito, CloudWatch, etc.  You can configure API Gateway to invoke a given Lambda function and obtain a RESTful interface for free (almost), including authentication, caching, parameters mapping, etc.

Google Cloud Functions currently only supports internal events from Google Cloud Storage (i.e. Object Change Notifications) and from Google Cloud Pub/Sub topics (Google’s globally distributed message bus that automatically scales as you need it). HTTP invocations are already natively supported: you simply need to deploy your function with the --trigger-http flag. Currently, you need to explicitly configure and deploy your Google Cloud Functions for each different trigger type.

As far as logging, both services are well integrated with their corresponding log management services: Amazon CloudWatch and Google Cloud Logging. I personally find CloudWatch better integrated, better documented, and with charts that are easy (kind of) to configure.

Load testing and statistics

I took the time to perform some load tests on arbitrary JavaScript involving pure computation (i.e. generate 1,000 md5 hashes for each invocation). This gave me the opportunity to play with the two different dependency management systems because I needed to include the md5 npm module.

I configured a linearly incremental load of five minutes, up to about 70 requests per second. The two charts represent the average response time and the average number of requests per second.

Please note that these charts use the same scale for both dimensions. Also, I’ve deployed and tested both functions in the corresponding EU-west region (Ireland).

AWS Lambda Load Test

Google Cloud Functions Load Test

As you can see, there is a noticeable difference in the average response time: Google Cloud Functions consistently keeps it between 130 and 200ms, with a strange increase during the final minutes of my test (maybe due to the decreasing load?).
On the other hand, AWS Lambda’s response time is much higher and reveals an interesting rectangular pattern. The service seems to internally scale up after the load reaches 20 req/s. When the load stabilizes around 30 req/s, it seems to scale down (i.e. response time rises to 600ms) and then it scales up again with a load of 40+ req/s.

Since each function invocation returns a relatively heavy JSON response (almost 50KB), I assumed network performance had an impact on the resulting response time, independently of the actual computation. I quickly verified this assumption by modifying both functions to return a simple “OK” message. I noticed a consistent improvement in the new AWS Lambda function, whose response time dropped to between 200 and 300ms. The new Google Cloud Function was not drastically affected, but its response time dropped to only 100ms.

Given these results, I would say that the computational difference is still relevant, but the network probably has a bigger impact if your application involves heavy HTTP responses. Apparently, as we already discussed back in 2014, Google’s networking just works better, and I would assume that the native HTTP integration is faster as well compared to Amazon API Gateway. Although Google’s native HTTP integration is fast, it still lacks critical features such as authentication and caching.

Function code compatibility between AWS and Google

Unfortunately for us, AWS Lambda and Google Cloud Functions are not directly compatible with each other. Google Cloud Functions is still in alpha and things can change, but I assume that Google won’t make the effort to be compatible with AWS Lambda without a compelling reason.

If you already have a few Lambda functions, in most cases they are also interacting with at least one other AWS service. Your Lambda functions are probably using some IAM roles and plenty of AWS details, so you wouldn’t easily migrate to another cloud vendor anyway. However, in plenty of other cases your Lambda functions involve pure computation or simple input/output logic (i.e. read from a queue, write into a database, process an image, etc.). In these cases, you may be tempted to try your Lambda functions on Google Cloud Functions as well, even if just to evaluate the service or reduce your costs. You can request your account to be whitelisted here.

What about an automated conversion tool?

Luckily, I took the time to develop a simple conversion tool that will definitely speed up the porting process of your JavaScript Lambda functions. It correctly handles the event/context functionality mapping and automatically comments incompatible attributes and methods. I really hate manual refactoring or porting tasks, so I hope it will be useful for some of you.

For example, a very simple function like the following:

exports.myHandler = function(event, context) {
    console.log("input data: " + event);
    if (!event.name) {
        return context.fail("No name");
    }
    context.succeed("Hello " + event.name);
}

would become something very similar to this:

exports.myHandler = function(context, data) {
    console.log("input data: " + data);
    if (!data.name) {
        return context.failure("No name");
    }
    context.success("Hello " + data.name);
}

As you can guess, the conversion is quite intuitive and shouldn’t take you more than a couple of minutes. However, things start to get much more complicated and time-consuming if you have a very complex function (especially if you defined additional utility functions that require both the original event and context objects).
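To give a flavor of what such a conversion involves (this is not the author’s actual tool), here is a hypothetical, minimal Python sketch that applies the simple mappings shown above with regular expressions; a real converter would need a proper JavaScript parser to handle the harder cases just mentioned:

import re

# Hypothetical, minimal mapping from the Lambda handler style to the
# alpha-era Cloud Functions signature shown above. Order matters: the
# signature is rewritten first, then remaining "event" references.
MAPPINGS = [
    (re.compile(r'function\s*\(\s*event\s*,\s*context\s*\)'), 'function(context, data)'),
    (re.compile(r'\bevent\b'), 'data'),
    (re.compile(r'context\.fail\('), 'context.failure('),
    (re.compile(r'context\.succeed\('), 'context.success('),
]

def convert(source: str) -> str:
    for pattern, replacement in MAPPINGS:
        source = pattern.sub(replacement, source)
    return source

if __name__ == '__main__':
    with open('lambda_handler.js') as f:  # hypothetical input file
        print(convert(f.read()))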

Alpha testing conclusions

Google Cloud Functions looks very promising, and I am looking forward to the long list of upcoming features. I will continue to run tests and monitor trusted Cloud Functions tester groups, which already contains plenty of suggestions, improvements, and feedback. I personally hope to see many more tools that will enable cross-platform development in a serverless fashion.

If you enjoyed the article, feel free to comment and let us know what you think of the serverless revolution. We are happily using Lambda Functions in the Cloud Academy platform as well, and we can’t wait to see what will happen in the near future (and if you’re not familiar with it, take a look at the Serverless Framework).

The post Google Cloud Functions vs. AWS Lambda: Fight for Serverless Cloud domination Begins appeared first on Cloud Academy.

]]>
18