How Cloud Academy Is Using Cube to Win the Data Challenge

Discover how Cloud Academy presents data to both internal and external stakeholders through the use of a Headless BI tool.

At Cloud Academy, we manage a lot of data every day. We collect data from different sources, such as feedback, events, and platform usage, and we need to ingest it, apply transformations, and finally present it to our internal stakeholders and our customers.

Because of the variety of the data that we provide, we recently implemented Cube, a Headless BI solution. It allowed us to handle, model, and present data through our BI tools smoothly.

What is a Headless BI Tool?

A Headless BI tool is a set of components that acts as middleware between your data warehouse and your business intelligence applications. It provides four main data-related components without the need to design and implement custom solutions, and it lets us work with data without hitting the data warehouse directly, leveraging the abstraction layer the tool provides instead.

The name Headless comes from the fact that the tool lets us work with the data but deliberately delegates showing and visualizing it. That part is the responsibility of the BI tool.

A Headless BI tool offers the following four components:

  • Modeling – It allows us to leverage the data in the data warehouse and model it by defining dimensions and measures usable by the BI tool
  • Security – It allows us to declare who can access the data and to restrict the data shown if needed
  • Caching – It provides us with a caching layer to store the results of recent queries and speed up subsequent ones
  • APIs – It provides us with one or multiple APIs (such as RESTful and SQL) to query the data

Why does Cloud Academy use Cube?

Data Modeling

We have a lot of data coming from multiple sources, both internal and external to Cloud Academy, and we work with structured, semi-structured, and unstructured data. So, before querying it from our BI tool, we need an approach to prepare and model the data in an effective way.

Cube allows us to create final entities composed of dimensions (attributes) and measures (aggregations of a particular numeric column), exposed through the API.

This way, we keep all the collected data in our data warehouse, where we can query it anytime for analysis purposes, while the modeled data is exposed to the BI tool through the APIs.

Security

We handle data that can be publicly accessible, data related to specific customers, and data containing PII (Personally Identifiable Information). Because of this, data access security is one of the most important components that Cube offers us.

By using Cube, we have been able to implement the following security patterns; the sketch after the list shows how the token that drives both of them can be issued:

  • Row Level Security – Depending on the user or entity accessing the data, some rows are filtered out. Suppose you are company A and want to get data about the platform usage of your users: you should not be able to access the usage data of company B, so rows related to company B are not returned when exploring the data.
  • Data Masking – Depending on the user or entity that is accessing the data, some attributes could be masked because of permissions assigned to the user or entity. This mainly happens when the attribute contains personal information such as a name, an email, or a phone number.
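
Both patterns rely on the security context carried by the token used to call Cube's APIs: Cube can use the claims in the signed JWT (its security context) to decide which rows to return and whether to mask sensitive attributes. Below is a minimal sketch of issuing such a token in Python with PyJWT; the CUBEJS_API_SECRET variable and the company_id / can_see_pii claim names are illustrative assumptions, not our actual configuration.

# Minimal sketch: issue a short-lived Cube API token whose claims become the
# security context used for row-level security and data masking.
# The secret location and the claim names below are illustrative assumptions.
import datetime
import os

import jwt  # PyJWT


def issue_cube_token(company_id: int, can_see_pii: bool = False) -> str:
    payload = {
        "company_id": company_id,    # used to filter rows to a single company
        "can_see_pii": can_see_pii,  # used to decide whether PII gets masked
        "exp": datetime.datetime.utcnow() + datetime.timedelta(minutes=30),
    }
    return jwt.encode(payload, os.environ["CUBEJS_API_SECRET"], algorithm="HS256")

The token is then sent with every API request, and Cube can apply the corresponding filters and masking rules before the query reaches the data warehouse.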

Caching

We provide a lot of insights and answers through the data to our internal stakeholders and Cloud Academy customers.

Every hour, a lot of queries are performed on our data, and most of them require the data warehouse to process millions of rows to succeed. Because of that, having a caching layer is crucial for us to avoid overloading the warehouse with common queries.

Cube provides us with a caching layer that temporarily stores the results of recently executed queries, so the same query won't hit the data warehouse again if it is executed a short time later.

Leveraging the caching layer allows us to get the result of the query faster than hitting the data warehouse, and this translates into faster loading of the charts that our users visit. The caching layer provided us with a performance boost of about 70% when hit.

Data Access through APIs

Last but not least, we need to access our data quickly and through standard interfaces. Cube provides APIs to achieve this goal.

Depending on the tool you use, you could have multiple choices, such as RESTful or SQL.
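
As an illustration of the RESTful option, a query is just an HTTP call that carries the measures and dimensions defined in the data model. The sketch below shows what such a call can look like in Python; the base URL, the token placeholder, and the Sessions cube with its members are hypothetical names used for illustration, not our actual deployment.

# Minimal sketch of querying Cube's REST API (/cubejs-api/v1/load).
# The base URL, the JWT placeholder, and the cube/member names are assumptions.
import json

import requests

CUBE_API_URL = "https://cube.example.com/cubejs-api/v1/load"

query = {
    "measures": ["Sessions.count"],
    "dimensions": ["Sessions.companyName"],
    "timeDimensions": [
        {
            "dimension": "Sessions.startedAt",
            "dateRange": ["2022-10-01", "2022-10-31"],
            "granularity": "week",
        }
    ],
}

response = requests.get(
    CUBE_API_URL,
    params={"query": json.dumps(query)},
    headers={"Authorization": "<signed JWT>"},
    timeout=30,
)
response.raise_for_status()
print(response.json()["data"])

The response only exposes the modeled members, so the consuming application never touches the underlying tables directly.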

In our scenario, we have two access points to the data:

  • Internal BI – It’s represented by all the data that we show to our internal stakeholders by using our internal BI tool: Superset.
  • Customer Facing BI – It’s represented by the dashboards we provide to Cloud Academy enterprise customers. They are built by using the Recharts library and served through a React.js front-end application. Through these dashboards, they get insights about the Cloud Academy platform usage of their employees.

Cloud Academy is a data-driven company, and Cube has really helped us work with and manage the data that is crucial for our business.

It allowed us to model data in order to define dimensions and measures to be exposed to the BI tools. This way we've been able to hide all the underlying tables and logic.

By using Cube, we have been able to implement a strong security layer composed of data masking and row-level security, both for our internal usage (Superset) and for enterprise customer usage (React application).

Cube allowed us to significantly increase query speed, and therefore reduce the time needed to load a dashboard. It also provides pre-aggregations, a caching layer that we can manage ourselves to keep queries fast (unlike the native caching layer, which drops cached data after a certain period).

We always try to stay up to date with new approaches and tools that let us provide a better experience on the available data to our stakeholders. The ideal scenario is one where stakeholders can easily access the data they need, in the form they need, while we remain compliant with security and privacy constraints.

How Do We Transform and Model Data at Cloud Academy?

“Data is the new gold”: a common phrase over the last few years. For all organizations, data and information have become crucial to making good decisions for the future and having a clear understanding of how they’re making progress — or otherwise.

At Cloud Academy, we strive to make data-informed decisions. That’s why we decided to invest in building a stack that can help us be as data-driven as possible.

Business users constantly leverage our data to monitor platform performance, build and extract reports, and see how our customers use the Cloud Academy platform. We also provide data to our enterprise customers, which lets them keep track of their platform usage.

Where do we get data?

Before modeling and using data, we need to extract useful data in order to create valuable information. We have two primary data sources:

  • Internal: Provided by our operational systems that expose the core data of the Cloud Academy platform, like labs and course sessions.
  • External: Provided by external services and platforms that we use to collect data about events not strictly related to the Cloud Academy platform, like currency exchange rates. 

The data extraction process

The first step of modeling and using data begins with extracting it from different sources and putting it in a single place where it can be accessed: the Data Warehouse.

Once information’s stored in this analytical database, we can perform queries and retrieve helpful information for our internal users and customers alike.

The Data Warehouse is logically split into two main parts:

  • Staging Area: This is the area where raw (data extracted from sources as-is) and pre-processed (data extracted from sources that is then processed, such as applying common formats) data are stored. This data is not in the final form, so it is not used by the final users.
  • Curated Area: This is the area where the transformed and modeled data is stored. We model data by using the dimensional modeling technique following the Kimball approach. This data is pulled directly by end users through SQL queries or through Business Intelligence tools.

To extract data from sources, our team built data transformation pipelines. We use the programming language Python to create pipeline logic, then we use Prefect to orchestrate the pipelines.

In some cases, raw data is extracted and placed in the staging area; in others, a few transformations need to be performed. This process produces pre-processed data.
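
To make the shape of these pipelines concrete, here is a minimal sketch of an extraction flow written with Prefect's flow and task decorators (Prefect 2-style API; the exact Prefect version may differ). The endpoint, the staging table name, and the load_to_staging helper are placeholders for illustration; they are not our real pipelines.

# Minimal sketch of an extraction pipeline: pull records from a source,
# apply a light pre-processing step, and load the result into the staging area.
# The URL, the table name, and the load_to_staging stub are placeholders.
import requests
from prefect import flow, task


@task(retries=3, retry_delay_seconds=60)
def extract(endpoint: str) -> list[dict]:
    response = requests.get(endpoint, timeout=30)
    response.raise_for_status()
    return response.json()


@task
def preprocess(records: list[dict]) -> list[dict]:
    # Example of a common-format transformation: normalize timestamp strings.
    return [{**r, "created_at": r["created_at"].replace("Z", "+00:00")} for r in records]


@task
def load_to_staging(records: list[dict], table: str) -> None:
    ...  # write the records into the staging area of the Data Warehouse


@flow(name="events-extraction")
def events_extraction() -> None:
    raw = extract("https://api.example.com/events")
    load_to_staging(preprocess(raw), table="staging.events")


if __name__ == "__main__":
    events_extraction()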

What’s next in the data transformation process?

As soon as raw and pre-processed data are in the staging area of the Data Warehouse, we can apply data transformations and data modeling. To do so, we use dbt (data build tool), a powerful resource that allows engineers to work and model data in the Data Warehouse.

dbt lets us declare desired transformations by defining SQL scripts, known as dbt models. Each model represents a new step in the transformation of data from the staging area to the curated area.

How are dbt models organized?

While performing the transformations, we consider a few different model categories:

  • Staging Models: Initial models that represent raw and pre-processed data extracted through Prefect pipelines.
  • Intermediate Models: Models that take data from staging models or from intermediate models (if multiple levels are defined).
  • Marts Models: Models that take data from the intermediate models and represent the tables in the curated area. Marts models are usually dimensions (containing partially denormalized data about entities) and facts (containing normalized data about events that happened).

Data quality with dbt

Data quality describes how consistent data is and how well it is structured to solve problems or serve specific purposes.

dbt is a great tool for this because it also enables us to run tests on both source data and the data we produce. Several generic tests are available out of the box (you can find the available tests here), which can easily be used and integrated where you define the data structure. Of course, you can also define custom tests if none of the available ones fit your testing scenario.

Here's an illustrative example of dbt test usage; the model and column names below are placeholders rather than our actual models:
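
# schema.yml -- an illustrative set of dbt generic tests on a hypothetical
# staging model; the model, column names, and accepted values are placeholders.
version: 2

models:
  - name: stg_course_sessions
    columns:
      - name: session_id
        tests:
          - unique
          - not_null
      - name: user_id
        tests:
          - not_null
          - relationships:
              to: ref('stg_users')
              field: user_id
      - name: status
        tests:
          - accepted_values:
              values: ['started', 'completed', 'abandoned']

Running dbt test then executes every declared check and fails the run if any of them is violated.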

By leveraging dbt tests, we ensure the data we provide to our business users always has quality and consistency checks.

Finally, using data!

After the data extraction and transformation process is completed, data is available in the curated area of the Data Warehouse.

Business users can leverage this data to get reports and insights. The most common scenario involves leveraging data through a Business Intelligence tool to build dashboards. In some situations, reports need to be extracted, so we provide them by performing custom queries on the Data Warehouse.

Mocking Dependencies on Component Testing

(Figure: Mike Cohn's test automation pyramid)

When Mike Cohn came up with the concept of the "test automation pyramid" in his 2009 book "Succeeding with Agile," developers around the world began to understand the importance of separating tests by level of complexity and by cost. In addition, the idea that Cohn brought to the IT community was to test system components in isolation as much as possible, as this is the cheapest and fastest way to go.

Software architectures have evolved a lot since 2009, so the first version of the test pyramid can be considered obsolete nowadays. There are actually many versions of the same diagram in the literature, and most of them have more than the three layers suggested by Cohn, plus variations on them.

The core concept is still there though: test components in isolation as much as possible (unit testing). Then proceed with integrating external modules one at a time (integration testing) until the entire system is put under test (end-to-end testing). Thus, the general principles of Cohn's version of the Testing Pyramid can be applied to any software component (class, library, package, etc.) and even to system architectures.

For example, when we test a microservices architecture, we start by checking the behavior of each internal component, layer, or class (unit testing). Then we move on to integration with external dependencies (integration testing), and finally we test the system as a whole (E2E testing).

Component testing

Let's focus for a second on the microservices architecture scenario since it is the main topic of this post. When we need to perform integration testing, we can follow two main strategies:

  • We can test the integration of external components one at a time (Integration testing)
  • We can also test the whole service in isolation, providing all the dependencies it needs (Component testing)

We can consider Component Testing as black-box testing whose objective is to check the public interface of our service, with these two characteristics:

  • the service must be executed in an isolated environment
  • the service configuration must be as close as possible to the production environment configuration.

Component testing is often referred to as “Service Testing” or “Service Level Testing”. Since it is beyond the scope of this post to delve into this topic, I suggest you read this short presentation by Martin Fowler on this subject: https://martinfowler.com/articles/microservice-testing/#testing-component-introduction

Mocks and stubs

Before we can proceed with planning the testing of our component, we need to be aware of one issue: external dependencies are not always available and we need to find a way to provide them to our service. This is where stubs and mocks come in.

What exactly are mocks and stubs? Mocks and stubs are so-called Test Doubles. Basically, they are instances that can replace other instances during a test (think of a stunt double in a movie). A mock is an entity that replicates the behavior of a specific software component (class, object, API, system). A stub, on the other hand, is a static object that is used to match the expectations of a test but has no behavior.

You can find more on test doubles on this page from the Martin Fowler website: https://www.martinfowler.com/bliki/TestDouble.html

A practical example: the Weather Service

Suppose we are building a Spring Boot microservice called Weather Service and we want to perform some Component Testing on it. 

The Weather Service just provides weather information and forecasts through a REST interface. To do so, it retrieves meteorological data from an external provider (https://openweathermap.org) by simply calling its REST APIs.

Here's a simple implementation of the provider client in Java. As you can see, we're using OpenFeign from the Spring Cloud framework in order to make things a little bit easier.

@FeignClient(name="WeatherServiceClient", url = "http://api.openweathermap.org/data/2.5/")
public interface WeatherServiceClient {

   /**
    * call to api.openweathermap.org/data/2.5/weather?lat={lat}&lon={lon}&appid={API key}
    *
    * @param lat
    * @param lon
    * @return
    */
   @GetMapping(value = "/weather")
   WeatherData currentWeather(@RequestParam("lat") String lat, @RequestParam("lon") String lon, @RequestParam("appid") String apiKey);


   /**
    * call to api.openweathermap.org/data/2.5/onecall?lat={lat}&lon={lon}&appid={API key}
    *
    * @param lat
    * @param lon
    * @return
    */
   @GetMapping(value = "/onecall")
   ForecastData forecast(@RequestParam("lat") String lat, @RequestParam("lon") String lon, @RequestParam("appid") String apiKey);

}

Now, let’s suppose we want to test the whole Weather Service in isolation by running it into a Docker container. The only thing we have to do is create a container that hosts our application and invoke all the endpoints we expose one by one.

Here’s the Dockerfile

FROM amazoncorretto:17-alpine-jdk
WORKDIR /weatherApp
COPY . .
RUN ./mvnw -DskipTests=true package
WORKDIR /weatherApp/target
EXPOSE 8080
ENTRYPOINT ["java","-jar","openweathermap-mock-demo-0.0.1-SNAPSHOT.jar"]

If we build and run the container image above, we can reach our service by hitting http://localhost:8080/current and http://localhost:8080/forecast.

Now that we know how to get the Weather Service up and running inside a container, we can test it by simply invoking its endpoints with, for example, a Postman suite, and making assertions about the response format or payload.

There is one problem, though: we obviously don't want to use the same API key we use in production when running tests. One simple solution could be to use a free subscription but, since free plans have a limited number of calls, it is clear that this is not the way to go. A better approach is to simulate the entire weather data provider by using a live HTTP mock.

There are a number of tools and libraries that can accomplish this task, however for this example we will use MockServer (https://www.mock-server.com).

MockServer

We could have chosen another option, but we picked MockServer because:

  • It is easy to use
  • It offers endless customization options
  • It provides an official Docker image on Docker Hub

What MockServer can do for us is, given a specific configuration, simulate an API. To configure MockServer, we simply provide it with a list of Expectations. An Expectation is a rule that is used to match a request and return a specific (static) response.

Here’s an example of an Expectation in JSON format:

{
    "httpRequest": {
        "method": "POST",
        "path": "/login",
        "body": {
            "username": "foo",
            "password": "bar"
        }
    },
    "httpResponse": {
        "statusCode": 200,
        "body": {
            "sessionId": "2By8LOhBmaW5nZXJwcmludCIlMDAzMW"
        }
    }
}

You can read it as follows:

Whenever MockServer receives a POST request whose body contains "foo" as username and "bar" as password, it returns an HTTP 200 response with "2By8LOhBmaW5nZXJwcmludCIlMDAzMW" as sessionId.

Putting it all together

Now that we know what an Expectation looks like, in order to set up our mock server we need to:

  1. Pull the latest MockServer image via  “docker pull mockserver/mockserver”
  2. Configure the MockServer instance by providing an Expectation for each weather provider API route.
  3. Create a docker-compose.yaml file to make the Weather Service container and the mocked weather data provider speak to each other.

Here's an example of the Expectation that intercepts all the calls to http://api.openweathermap.org/data/2.5/weather?lat={lat}&lon={lon}&appid={API key} and returns a response stub.

{
  "httpRequest": {
    "method": "GET",
    "path": "/data/2.5/weather",
    "queryStringParameters": {
      "lon": ["[0-9\\.]+"],
      "lat": ["[0-9\\.]+"],
      "appid": ["[A-Za-z0-9\\.]+"]
    }
  },
  "httpResponse": {
    "statusCode": 200,
    "headers": {
      "Content-Type": [
        "application/json; charset=utf-8"
      ]
    },
    "body": {
      "coord": {
        "lon": 12.5113,
        "lat": 41.8919
      },
      "weather": [
        {
          "id": 802,
          "main": "Mocked-Clouds",
          "description": "Mocked-scattered clouds",
          "icon": "03d"
        }
      ],
      "base": "stations",
      "main": {
        "temp": 288.02,
        "feelsLike": null,
        "tempMin": null,
        "tempMax": null,
        "pressure": 1016,
        "humidity": 67
      },
      "visibility": 10000,
      "wind": {
        "speed": 1.54,
        "deg": 190,
        "gust": null
      },
      "clouds": {
        "all": 40
      },
      "dt": 1644065908,
      "sys": {
        "type": 2,
        "id": 2037790,
        "country": "IT",
        "sunrise": 1644041914,
        "sunset": 1644078560
      },
      "timezone": 3600,
      "id": 6545158,
      "name": "Trevi",
      "cod": 200
    }
  }
}

Here’s the docker-compose.yml

version: '3'
services:
 weather-service:
   container_name: weather-service
   image: weather-service:latest
   ports:
     - 8080:8080
   networks:
     weather-service-network:
       aliases:
         - weather-service
   depends_on:
     - mock-weather-apis

 mock-weather-apis:
   image: mockserver/mockserver
   command: -serverPort 443,80
   environment:
     MOCKSERVER_INITIALIZATION_JSON_PATH: /config/initializerJson.json
   volumes:
     - ./mock-configuration:/config
   ports:
     - "80:1080"
   networks:
     weather-service-network:
       aliases:
         - api.openweathermap.org


networks:
 weather-service-network:

By running docker compose up we spin up our Weather Service and also the weather provider mock.

Every call our service makes to the weather data provider gets intercepted by the mock server, which returns the stub response we defined in the JSON above.
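
With the stack up, the component tests themselves stay small because they only talk to the service's public interface. Alongside a Postman suite, the same checks can be written as a short pytest module; the sketch below assumes that /current accepts lat/lon query parameters and passes the provider's weather description through, which are assumptions about the contract rather than the actual implementation.

# test_weather_service_component.py -- minimal component-test sketch.
# Assumes the docker-compose stack above is running, that /current accepts
# lat/lon query parameters, and that the mocked description is passed through;
# these are illustrative assumptions, not the service's documented contract.
import requests

BASE_URL = "http://localhost:8080"


def test_current_weather_returns_ok():
    response = requests.get(
        f"{BASE_URL}/current", params={"lat": "41.89", "lon": "12.51"}, timeout=10
    )
    assert response.status_code == 200


def test_current_weather_comes_from_the_mocked_provider():
    response = requests.get(
        f"{BASE_URL}/current", params={"lat": "41.89", "lon": "12.51"}, timeout=10
    )
    # The mock server always answers with the stubbed payload defined above,
    # so the response should carry the "Mocked-" prefix somewhere in its body.
    assert "Mocked-" in response.text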

Final thoughts

Here's what we have achieved so far:

  1. We built an environment that allows us to perform black-box testing on our Weather Service.
  2. By using mocks, we were able to replace dependencies seamlessly and without the need to change the service configuration. The service was actually running in a near-production environment.
  3. We also saved money by avoiding costly calls to the weather data provider.

If you are interested in running the code above by yourself you can find it here https://github.com/cloudacademy/openweathermap-mock-demo/

If you want to learn more about Docker, Testing or Microservices, consider looking at our courses at Cloudacademy.com

 

Front End Testing 101 at Cloud Academy

First, why do we need tests?

Let’s suppose you are a developer. Now let’s suppose you are a front-end developer. Imagine that you have been recently hired to perform this role in a company that has a large, complex React web application. You see that managing such a complex application is a struggle. You don’t understand the actual flow of information, you see that there’s some inexplicable logic that you are afraid to touch, there’s a deep component nesting architecture, and so on.

If someone told you to make some changes in this codebase, this would probably be your reaction: 😱.

But you recall hearing from some wise white-haired man that there’s a way to manage such complexity. That man used a specific word to address this problem: refactoring! This is typically done when you want to rewrite a piece of code in a simpler way, breaking the logic into smaller chunks to reduce complexity.

Before you proceed, you remember that this is a very complex application. How can you be sure that you’re not breaking anything? You recall something else from the wise old man: testing!

Our choice for front end

At Cloud Academy, we have our large, complex application built on a stack composed of React/Styled Components/Redux/Redux saga. The sagas are responsible for making the REST API calls and updating the global state, while the components receive that state and update the UI accordingly.

So, given our React stack, our choice was:

  • jest as test runner, mocking and assertion library
  • @testing-library/react or enzyme for unit testing React components (depending on the test purpose, or the project)
  • cypress for end-to-end testing (which will not be covered in this article)

Testing Library vs. Enzyme

Even though both libraries are used to test React components, they differ significantly in how the tests are performed.

  • Enzyme focuses testing on implementation details, since you need to know the internal structure of the components and how they interact with each other. This gives you more control over the test execution, enabling deep testing of state and property changes. However, it makes the tests brittle, since almost every modification to the implementation needs a test update. Generally speaking, it can be useful when testing simple components in which you have some logic that you don't want to be altered, or components with less aggregation and interaction that are more focused on content rendering. We use Enzyme in our design system library, Bonsai.

  • Testing Library for React focuses on how the components behave from a user standpoint. To perform the test, you don’t need to know the implementation details, but rather how it renders and how the user should interact with it. This enables testing of very complex components. Since you don’t need to be concerned about the internals, you only need to give meaningful props, mock dependencies where needed (independent from the chosen framework), and test the output, checking what you expect to be rendered (content, labels, etc.) or interacting as a user would. We use Testing Library for React in our main application project, with which you can test whole pages without worrying too much about their complex structure.

Let's dig into how we perform tests in our main React codebase leveraging Testing Library.

What should you test?

We can see our application structured in different layers:

  • Global state and API calls located in Redux layers (actions, reducers, sagas, selectors)
  • Business logic and state management in containers (or for more recent components in hooks)
  • Presentation logic in lower level components
  • Common utilities

Each of these layers deserves a thorough tour about how they should be tested, but this article is focused on how we test React layers, so we’ve identified four main categories of tests:

  • Containers: the most complex components, where you usually test behaviors and you need heavy use of dependency stubs and fixtures
  • Components: since they should be “dumb,” testing here should be simpler
  • Hooks: think of them as "containers" in function form, so they have the same needs and follow the same approach as containers
  • Generic functions: not really bound to Testing Library, but still needed when you use them in components

Containers

This is usually the page entry point, where all the magic happens by provisioning the actual data to the underlying components and controlling their actions with callbacks.

A complete test usually needs:

  • Dependencies properly mocked (leveraging jest mocks and dependency injection)
  • Fixtures to mock data and make assertions (expectations on content can be done with them)
  • Stub actions or callbacks in order to assert the behavior of the interactive parts using spies, etc. (e.g., error notifications, browser events, API callbacks)
  • Manage or simulate asynchronicity

If the test is well prepared, assertions are very easy, as you probably just need to inject the fixture data and check if it’s rendering like you expect. This ensures fairly wide coverage of the internal components, without taking into account the implementation details.

One suggested practice is to test at least the “happy path” and the “error” situations. Testing all the other possible code branches is also highly recommended, but only after the most common paths have been covered.

beforeEach(() => {
  ...
  mockHistory = createMemoryHistory();

  initialState = {
    ....
     //some content here
     course: {
        description: '.....'
        nextStep: {
           title: '....'
        }
     }
     ...
  };
});

test('should render the course step content from the state', () => {

  connectedRender(
    <Router history={mockHistory}>
      <ContainerWithSomeContent  />
    </Router>,
    {
      initialState,
      reducer,
    },
  );

  expect(screen.getByText(initialState.course.description)).toBeInTheDocument();

  expect(screen.getByText(`Next: ${initialState.course.nextStep.title}`));
});

Connected components

Let’s take a small detour through how to test connected components. As you see above, the test is using the connectedRender API. This is a utility function to enable testing on containers that are connected to Redux store without having to set up its boilerplate code in every test.

In order to test those kinds of components, you simply need to pass these items to the utility function: the JSX, the reducer, and the initial state that will be used to construct the store, which then forwards the state through the Redux Provider.

The following is the implementation of that utility.

import React from 'react';
import { render as rtlRender } from '@testing-library/react';
import { createStore } from 'redux';
import { Provider } from 'react-redux';

function render(
  ui,
  {
    initialState,
    reducer,
    store = createStore(reducer, initialState),
    ...renderOptions
  } = {},
) {

  function Wrapper({ children }) {
    return <Provider store={store}>{children}</Provider>;
  }
  return rtlRender(ui, { wrapper: Wrapper, ...renderOptions });
}

// re-export everything
export * from '@testing-library/react';
// override render method
export { render as connectedRender };

Components

Given that the "happy path" is already tested on the container, you may not need to perform tests on the internal components, which should be as dumb as possible. If some kind of logic is present here (e.g., displaying different labels depending on prop values), it's also good to create tests that cover these specific situations, handling them here instead of in the containers.

If you have many possible combinations, a suggested practice is to write tabular tests.

test.each`
  time  | expected
  ${0}  | ${'0 minutes'}
  ${1}  | ${'1 minute'}
  ${2}  | ${'2 minutes'}
  ${10} | ${'10 minutes'}
`('should render estimated time "$time" as "$expected"', ({ time, expected }) => {
  props.estimatedTime = time;

  render(<ComponentThatRendersTime {...props} />);

  expect(screen.getByText(expected)).toBeInTheDocument();
});

Hooks

Hooks can be tested without being forced to render them inside components using renderHook from @testing-library/react-hooks library. Testing hooks is somewhat similar to testing functions and components at the same time.

The hook returns a result object on which you can make assertions by accessing its current property.

You should avoid immediate destructuring here if you are planning to call a function on the result that changes the hook state. In fact, this triggers the library to re-render the result, assigning a new reference to the result.current property.

//render the hook
const { result } = renderHook(() => useSomeSuffWithChangeableState())

//read its result value
const { changeState, value } = result.current

// if you expect to change the state here and have an updated value
act(() => {
  changeState()
})
// this assertion will fail, since the "value" will still be the old one
expect(value).toEqual(..)

// to assert the new value you need to re-read it
expect(result.current.value)

Generic functions

Generic functions usually don’t need anything from react testing library, since no jsx rendering should happen inside of them. In this case, you can simply assert using standard matchers, again with tabular testing if needed.

test.each`
  status            | expectedMessage
  ${'not_booked'}   | ${'Classroom Not Booked'}
  ${'booked'}       | ${'Classroom Booked'}
  ${'waiting_for'}  | ${'Classroom Booked'}
  ${'attending'}    | ${'Classroom Started'}
  ${'ended'}        | ${'Classroom Ended'}
  ${'not_existing'} | ${'Classroom Not Booked'}
`(
  'should display "$expectedMessage" for status "$status"',
  ({ expectedMessage, status }) => {
    const message = getMessageFromStatus(status);

    expect(message).toBe(expectedMessage);
  },
);

Managing asynchronicity

Since you’ll usually deal with asynchronous components that make either REST or GraphQL calls, you need to handle asynchronous rendering (mostly on the container components).

To test this behavior, you must use async tests and testing library async API (for example, waitFor) depending on the kind of test you need to implement.

test('should load the page properly', async () => {

  //here's an async operation since the component immediately loads the data
  render(
      <ContainerWithAsyncCall
        {...props}
      />
  );

  await waitFor(() => {
    expect(screen.getByText("Text in the document")).toBeInTheDocument();
  });

  // ...other expectations

});

Same deal, different APIs if you have asynchronous operations in hooks

test('should load the data properly', async () => {
  //render the hook
  const { result, waitForNextUpdate } = renderHook(() => useHookWithAsyncOperation())

  expect(result.current.status).toEqual('LOADING');

  // wait for the hook to finish its async work and update its state
  await waitForNextUpdate()

  // assert on the updated "current" value
  expect(result.current.status).toEqual('LOADED')
})

GraphQL testing

We introduced GraphQL recently (check out this article for more info about this journey), and we started to use Apollo Client to make queries. There isn't only one way to perform testing on this, so here are some of the more common options:

  • Using Apollo MockedProvider: this is probably the best option in terms of "close to reality" testing. It can seem very simple at first, and it is for simple tests. But when you move on to more advanced tests, with a lot of different queries and error conditions, the library seems to suffer from some kind of "bug," recycling and matching previous requests, so you'll probably waste a lot of time understanding what's wrong with the test or the mock. It also has some functional quirks (sometimes you are required to put the typename in the mock) and error messages that are not very clear.
const mocks = [{
  request: {
    query: GET_SURVEY_INFO_QUERY,
    variables: {
      ...variables
    },
  },
  result: {
    data: {
      ...mockData,
    },
  },
}]
test('should load the page without errors', async () => {

  render(
    <MockedProvider mocks={mocks} addTypename={false}>
      <SurveyLandingContainerInternal {...props} />
    </MockedProvider>
  )

  await waitFor(() => {
    // ...assertions
  })
})
  • Using the actual Apollo Provider with a mock client: this probably is the best compromise, since you are actually using the Apollo API and just stubbing the responses by query.

  • Mocking Apollo hooks (injected as props or mocked using jest.mock): it’s by far the simplest solution, since you are completely mocking the Apollo APIs, simulating them with stubs over which you have control, and without the need to wrap them in the ApolloProvider. We made a couple of utilities for creating stubs for useQuery and useMutation hooks.

export function newMockUseQuery({ error, data }) {
  return jest.fn(function () {
    const [state, setState] = useState({
      loading: true,
      error: undefined,
      data: undefined,
    });

    useEffect(() => {
      async function doFakeQuery() {
        await Promise.resolve();

        setState({
          loading: false,
          error,
          data,
        });
      }
      doFakeQuery();
    }, []);

    return state;
  });
}
Here we’re mocking the hook to be able to use a fixture value
let mockUseQuery;

jest.mock('@apollo/client', () => {
  // needed in order to avoid mocking everything if importing other stuff
  const apolloClient = jest.requireActual('@apollo/client');
  return {
    ...apolloClient,
    useQuery: (...args) => {
      return mockUseQuery(...args);
    },
  };
});

//while in the test setup just do
beforeEach(() => {
  mockData = {
    ...
  };
  mockUseQuery = newMockUseQuery({ data: mockData });
});

In conclusion

To be able to scale the development and evolution of a complex application, and to ensure the quality of the code you write, you need a way to assess that what you’re writing is actually working in the way it is intended.

Testing solves this problem by giving you confidence in your codebase, ensuring that the logic is correctly implemented without the need to actually run it in production.

When you are confident that your code is right, you can improve it. You can tackle the complexities by doing optimizations, search for more efficient algorithms or libraries, delete old unused code without concerns, upgrade libraries, and do everything you need to make your code maintainable and reliable.

Writing tests can seem like too much effort since you end up writing a lot more code. But that code will be more robust and less prone to errors, so you'll be less subject to bugs. And the next time you need to change it, you can simply update the tests (or write new ones if needed), run them, and when they're green, you're ready to go!

Data Engineering & Business Intelligence – An Exciting Quest

I can clearly remember that moment when our VP of Engineering came to me saying, “Our data and reporting are a mess.” At that moment, we admitted that we had to improve how we manage and organize data and, even more importantly, how we provide it to our customers.

You know, when a company turns the corner and in a really short amount of time goes from being a start-up to being an important player doing business with big companies, everything needs to evolve, transform, and improve. This is fair, and everyone would love to be involved in such interesting processes, including me. But (there is always a but) each new refactoring and each new improvement brings challenges, study, and risks to take into account. This was true when I started to analyze and design a new reporting system, covering both the data (in terms of movement and transformation) and the system that provides it to our customers.

Chapter 1 – What the…?

The first thing I addressed when I started the analysis was the way we stored reporting data and how we were getting the data to build dashboards and reports. Basically, all the information — even the semi-aggregated data — was stored inside the main database together with the operational data. Moreover, the software modules in charge of getting the data and building reports and dashboards were part of the main backend system.

In a scenario where there aren’t many concurrent users and the number of records is not in the hundreds of thousands, this approach — while not ideal — is not necessarily wrong.

Fortunately for us, our number of concurrent users increases every day, and together with them the amount of data we host. This means that we need to completely change our approach to data and business intelligence in general.

Honestly, since the beginning, I started designing the new architecture following a canonical approach, with these components in mind:

  1. A read-only database replica from which to fetch the relevant raw data. This avoids overwhelming the main database architecture with heavy operations like big queries and exports of numerous records.
  2. A tool to orchestrate and execute ETLs.
  3. A brand new database to host all the data about reporting and dashboards.
  4. A new microservice to provide APIs in order to get data for dashboards and to build reports from raw exports.

With that in mind, I started to outline the architecture in collaboration with my colleagues.

Chapter 2 – Each job requires its own tools

The need to set up a read replica as the database to stress with exports and long queries was quickly and easily accepted. After some budgeting considerations, our infrastructure team proceeded to implement this. 

With the read replica in my backpack, I moved to the next friend: the tool to run and orchestrate ETLs.

To find the best fit for our needs, I asked our Data Engineer, Alessandro, for help. He did a great analysis on the available alternatives, and together we went through the shortlist. Basically, we had to choose among tools addressing the problem from different perspectives and having different “souls” or core foundations (e.g., cloud-native, service or bundled application, declarative or coding-based, etc.).

In our view the best fit was, and actually still is, Prefect: an open-source workflow management system that’s Python-based. (I didn’t mention it earlier, but Python is our most-used backend programming language.)

The infrastructure and the data teams tried several different configurations to integrate Prefect into our ecosystem, ultimately landing on the following setup:

  • A Prefect server running on ECS tasks (it is made of several components)
  • A PostgreSQL database dedicated to Prefect
  • A Prefect Agent listening for jobs to run; for each job run, it spawns an ECS task executing the desired Prefect flow
  • Some Jenkins pipelines to manage the platform and deploy the flows on the server

Nice — we put in place a modern system to orchestrate and schedule flows. What came next? The ETLs, of course!

During the ideation phase, I imagined moving data from the source database to the reporting one, implementing a really old-fashioned module based on stored procedures: hard to code, a nightmare to maintain, but efficient, reliable, and data-focused by definition.

Luckily for me, Alessandro came to me with a proposal that changed the game: Why not use dbt? Honestly, I wasn’t familiar with dbt, so I had to study a bit. Doing the free course they offer at dbt, I learned about this wonderful tool that solves the data moving and transformation problem in an elegant and ridiculously simple way.

What you need to do is define your models as SQL queries and, with the help of dbt's Jinja-based templating features, wire them together so that their dependencies on each other are built implicitly.

Once the tool runs, it automatically builds the dependencies graph and processes the models one by one in the right order, so at the end, the final tables are built and filled with the desired data.

The key concept is that all the transformations are expressed using an elegant declarative approach, simply writing SQL queries.

Moreover, the tool allows you to define different kinds of "persistence" (materializations) for the models you define, as shown in the fragment after this list:

  • Table: Each run generates a brand new table that replaces the existing one.
  • Incremental: Performs inserts and updates on an existing table; if it does not exist, then the first run creates it.
  • View: The model creates a regular view in the database instead of a table.
  • Ephemeral: When executing, the tool takes the SQL code of the model and merges it with the code of the models that use it. This is useful to increase reusability and code readability.
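
These materializations are declared as model configuration. As an illustration, a dbt_project.yml fragment like the one below sets per-folder defaults (the project and folder names are placeholders, not our actual project), and a single model can still override the default with a config block such as {{ config(materialized='incremental') }} at the top of its SQL file.

# dbt_project.yml fragment -- illustrative materialization defaults;
# the project and folder names are placeholders.
models:
  analytics:
    staging:
      +materialized: view
    marts:
      +materialized: table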

I encourage you to check out this tool and the features it offers since I can’t list them all here.

So, wrapping up, we defined a read replica database, picked an orchestration system for ETLs, and found a nice tool to concisely write our data flows. At that point, I was almost ready to ask for the configuration of the database for the new business intelligence platform. But talking to other colleagues, we decided to leverage another project running in parallel with this initiative, so we chose a Redshift cluster as the destination for our data.

Chapter 3 – ETL vs ELT

You know Redshift is powerful, even if it has some limitations in terms of interoperability with other databases, but I have to say that it offers something that really simplifies our data flow: federated queries.

Using this feature, we could hook up our PostgreSQL read replica to Redshift, which then exposes it as if it were simply a local schema of the data warehouse itself.

This could seem like a small thing. After all, data has been moved between databases since the first one was implemented. But this helped us move away from an ETL approach in favor of an ELT one.

To remind you of the difference between them:

  • ETL flows first perform the data Extraction, apply the Transformation logic, and then Load the result into the destination.
  • ELT flows, instead, perform an Extraction to Load a copy of the original data into the destination, where the Transformation is then applied.

Both of them have pros and cons but, when you combine a cloud-native data warehouse like Redshift with a declarative, SQL-based tool like dbt, the ELT approach is dramatically more natural and easier to implement, especially if federated queries fit the scenario.

Working in this scenario simplifies the work a lot because the end-to-end process is clear and the responsibilities are well-defined and segregated:

  1. Prefect is in charge of flow scheduling and of the flows' internal and external orchestration (a minimal sketch of such a flow follows this list).
  2. Redshift is in charge of playing the role of a single and global place for data, abstracting the other origin data sources as local schemas.
  3. dbt flows are in charge of moving data from source tables into a staging area, getting the staged records, and applying the required transformations to build the final tables (dimensions, relations, facts, etc.). The flows do not care at all where the source tables really are because of the federated queries.
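
To tie the pieces together, the orchestration side can stay very thin: a scheduled Prefect flow simply shells out to dbt and lets it resolve the dependency graph on its own. The snippet below is a sketch under that assumption (Prefect 2-style API, with a hypothetical project directory and target name); it is an illustration, not our production flow.

# Sketch: a thin Prefect flow that triggers a dbt run against the warehouse.
# The project directory and the target name are placeholders.
import subprocess

from prefect import flow, task


@task(retries=2, retry_delay_seconds=300)
def dbt_run(project_dir: str, target: str) -> None:
    subprocess.run(
        ["dbt", "run", "--project-dir", project_dir, "--target", target],
        check=True,  # a non-zero dbt exit code fails the task and triggers a retry
    )


@flow(name="nightly-dbt-run")
def nightly_dbt_run() -> None:
    dbt_run(project_dir="/opt/analytics/dbt", target="prod")


if __name__ == "__main__":
    nightly_dbt_run()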

At this point, you might start to wonder about performance. Because the federated database is in the same VPC as the Redshift cluster, this configuration is really performant. For example, one of our flows transfers approximately eight million records, transforms them, and applies some joins and aggregations to produce a couple of fact tables of similar cardinality; it takes around five minutes end-to-end. The great thing is that refreshing data with this approach is a piece of cake, because the final users experience zero downtime; dbt seamlessly replaces the old version of a final table with the new one.

Chapter 4 – Let’s use our new wonderful data!

As I mentioned earlier in this post, the first design of the whole system architecture included a dedicated, brand new microservice to provide data inside our ecosystem, capable of efficiently building exports and performing data retrieval using the Redshift cluster. Well, as we say in Italy, “appetite comes with eating,” and after considering the many things we could do with our new data platform, we thought we could hook up our data warehouse with a modern and feature-rich BI platform.

As a software engineer, I was a bit sad realizing that the chance to build a new application from scratch was fading away. But my experience told me it was a good idea in terms of timing, features, and quality we could deliver to our customers using an existing mature product.

Next, we went through a series (luckily not too long) of PoCs, evaluating some modern BI platforms to understand whether one could fit our needs. It was an interesting phase of the project that gave us the chance to see different approaches and technologies in the BI space. After a couple of months, we settled on the platform that we ultimately picked: ThoughtSpot.

We have been impressed by their approach to BI. The main idea they put in place is to let users get what they need directly from the data through a search-based interaction. The user can literally get the data by using a search bar and, while they type, the tool generates the queries and detects the best way to represent the retrieved data, building stunning pinboards (aka dashboards).

We saw in ThoughtSpot an evolution in Business Intelligence, and planned to integrate the platform, potentially using all the possibilities it offers:

  • API calls to export data
  • Embedding of pinboards
  • Embedding the search experience in our platform

Right now we are in the first project phase, leveraging the first two options. In the next project phase, we hope to enable our users to dive freely into the data they're interested in inside their Cloud Academy space.

Mastering AWS Organizations Service Control Policies

Service Control Policies (SCPs) are IAM-like policies to manage permissions in AWS Organizations. SCPs restrict the actions allowed for accounts within the organization making each one of them compliant with your guidelines.

SCPs are not meant to grant permissions; you should consider them as advanced Deny/Allow list mechanisms that restrict the set of actions allowed within an organization. The only way to grant permissions to IAM Users or Roles is by attaching IAM permissions policies.

AWS Service Control Policies

Service Control Policies can be used in a Defense in Depth strategy, adding an additional layer of protection to mitigate unknown vulnerabilities in complex infrastructures. From one perspective, Organizations policies like SCPs could be considered unnecessary, but according to AWS's strategy, redundant security controls in different layers are the key to minimizing the impact if a vulnerability in another layer is exploited.

At the account level, IAM permissions plus IAM permission boundaries overlap with SCPs. You could even consider SCPs unnecessary because the boundaries are already defined using IAM Permission Boundaries. But what if a user is able to perform a permission escalation by exploiting a vulnerability in your policies?

Let's assume you granted specific permissions to Billy with an IAM user. Billy can now manage multiple EC2 instances in Oregon. Billy is creating a new EC2 instance and, by exploiting a vulnerability that you haven't noticed before, he is able to create a new Role and attach the PowerUser policy to it. Billy is violating several of your organization's guidelines, and he is breaking the least privilege principle and introducing a serious flaw, but your security controls are ineffective because they are applied to the user assigned to Billy, not to other IAM users/roles within the account. A Service Control Policy could prevent Billy from violating the guidelines and best practices. In effect, the policy acts as a redundant layer; with the right statements, you can prevent permission escalations and enforce best practices.

At Cloud Academy, we manage AWS Organizations with thousands of accounts, and we identified the following use cases as a good starting point for our Service Control Policies adoption.

  • Deny root user access: prevent takeover attacks using the root user account.
  • Enforce MFA: require MFA enabled for specific actions.
  • Disable expensive services: deny any action for services that won’t be part of the infrastructure.
  • Protect monitoring, auditing, and security tools: prevent users from disabling or modifying AWS CloudWatch, Config, GuardDuty, CloudTrail.
  • Restrict regions: restrict regions allowed in your Organization for geographical proximity or regulatory needs.
  • Restrict EC2 AMI sharing and visibility: prevent AMIs from being made public or shared with other AWS accounts.

Structure

The structure of Service Control Policies is similar to IAM policies: a policy is composed of multiple statements, and each statement can define Effect, Action, Resource, and Condition.

{
  "Statement": [{
    "Effect": "Deny",
    "Action": "ec2:*",
    "Resource": "*"
  }]
}

Deny any EC2 action for all resources.

Scope

A Service Control Policy can be applied to all accounts (Root of the Organization), Organization Units (OU), or single accounts. SCPs attached to the Root of the Organization are applied to every account within the organization.

Limitations: service-linked roles and the management account are not affected by SCPs.

Evaluation

An action performed by an IAM User/Role could be considered allowed if all the following conditions are satisfied:

  • The action is allowed with an IAM permission policy.
  • The action is allowed by the permission boundary attached.
  • The action is allowed by the SCPs attached.

If any of the conditions listed above are not satisfied, the action is not allowed.

Advanced Conditions

Because Service Control Policies are applied at the organization level, you are likely to use condition operators and condition keys that you may not be super familiar with from working with standard IAM policies.

These are the most frequent operators and keys that we used during our SCP adoption.

ArnEquals/ArnLike: restrict access based on comparing a key to ARNs. String operators like StringEquals don’t work!

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Action": ["iam:*"],
    "Resource": "*",
    "Condition": {
      "ArnEquals": {
        "aws:PrincipalARN": "arn:aws:iam::*:user/guest"
      }
    }
  }]
}

Deny any IAM action for all resources performed by the user called guest. 

aws:PrincipalARN: compare the ARN of the principal that made the request with ARNs specified in the policy.

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Action": ["iam:*"],
    "Resource": "*",
    "Condition": {
      "ArnNotEquals": {
        "aws:PrincipalARN": "arn:aws:iam::*:role/Admin*"
      }
    }
  }]
}

Deny any IAM action on all resources when performed by IAM roles whose names don’t start with the Admin prefix.

aws:RequestedRegion: compare the AWS region that was called in the request with regions specified in the policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "us-east-1",
            "us-east-2",
            "us-west-1",
            "us-west-2"
          ]
        }
      }
    }
  ]
}

Deny any action for all resources performed outside the U.S. regions. 

More operators and keys can be found in the official documentation: AWS Reference: Policy Elements Condition Operators and AWS Reference: Policy Condition Keys.

Deny list and Allow list approaches

Deny list: any action on every resource is allowed by default; additional policies restrict actions/resources with explicit denies.

Allow list: any action on every resource is denied by default; additional policies allow actions/resources.

AWS applies the least privilege principle to both IAM policies and SCPs. As a result, with no policies attached, no actions are allowed. Consequently, the deny list approach requires, as a foundation, an explicit SCP that allows actions by default, while the allow list approach can be implemented without such an additional policy, since no action is allowed by default.

By default, AWS Organizations creates a Service Control Policy called FullAWSAccess that allows every action, laying the foundation for a deny list approach.

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "*",
    "Resource": "*"
  }]
}

The above SCP, attached to the root of the organization, allows every action in every member account. Keeping in mind that an explicit Deny overrides an Allow, new SCPs can be introduced to restrict the set of allowed actions.
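
For example, here is a sketch of an additional SCP that layers an explicit deny on top of FullAWSAccess to protect the auditing and security tools mentioned earlier. The action list is illustrative; extend it to cover the services your organization relies on.

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "ProtectAuditingTools",
    "Effect": "Deny",
    "Action": [
      "cloudtrail:StopLogging",
      "cloudtrail:DeleteTrail",
      "config:StopConfigurationRecorder",
      "config:DeleteConfigurationRecorder",
      "guardduty:DeleteDetector"
    ],
    "Resource": "*"
  }]
}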

SCP Deny list approach

The Deny list approach is desirable in most cases, since the actions allowed within an organization are likely to outnumber the denied ones, and corner cases can be covered with conditions.

In the Cloud Academy implementation of AWS Organizations, the variety of actions performed across organizational units is too high, and the Allow list approach is not a sustainable solution for us. Moreover, introducing SCPs in an existing organization could lead to unpredictable permission issues when working with several organizational units and accounts. The Deny list approach is often the easiest solution and leaves more freedom with fewer operations.

Testing and debugging

Service Control Policies are powerful and must be properly tested before being attached to the root of the organization or to critical organizational units. If applicable, the Deny list approach can be introduced progressively without disruptions and represents the recommended option.

SCPs can be easily attached to one or a small number of member accounts to test impacts before rolling out to the entire organization. Once attached, policies are immediately applied to the accounts.
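
As a sketch of this workflow using boto3 (the policy and account ids below are placeholders, and the SCP must already exist in your organization):

import boto3

# Placeholder ids: replace with your SCP id and a non-critical member account id
SCP_ID = "p-examplepolicyid"
TEST_ACCOUNT_ID = "111122223333"

organizations = boto3.client("organizations")

# Attach the SCP to a single member account to observe its impact
organizations.attach_policy(PolicyId=SCP_ID, TargetId=TEST_ACCOUNT_ID)

# ...exercise the account, then review CloudTrail and service last accessed data...

# Detach the SCP if legitimate actions turn out to be denied
organizations.detach_policy(PolicyId=SCP_ID, TargetId=TEST_ACCOUNT_ID)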

AWS suggests using service last accessed data in IAM and AWS CloudTrail to monitor and understand usage at the API level. Either of these tools can be used to find actions that are denied by mistake and to detect potential vulnerabilities.

Conclusion

At Cloud Academy, we manage AWS Organizations with thousands of accounts that can be categorized based on the use case. Service Control Policies have been quite effective at enforcing guidelines and limiting the allowed actions once we defined the right categorization. SCPs applied to the root of the organization are likely to be too generic and ineffective, but with a solid categorization into organizational units with clear boundaries, you can introduce strong restrictions and minimize potential vulnerabilities exploited at the account level.

The post Mastering AWS Organizations Service Control Policies appeared first on Cloud Academy.

GraphQL/Graphene: How to Solve the N+1 Problem With DataLoaders https://cloudacademy.com/blog/graphql-graphene-how-to-solve-the-n1-problem-with-dataloaders/ https://cloudacademy.com/blog/graphql-graphene-how-to-solve-the-n1-problem-with-dataloaders/#respond Fri, 26 Mar 2021 13:11:42 +0000 https://cloudacademy.com/?p=45597 One of the common problems dealing with RESTful APIs is to choose where and how to aggregate resource data coming from different endpoints, even different services. Let’s say we have two endpoints on a service that manages surveys: One endpoint exposes a resource called SurveyInfo that contains the survey’s descriptive...

The post GraphQL/Graphene: How to Solve the N+1 Problem With DataLoaders appeared first on Cloud Academy.

One of the common problems dealing with RESTful APIs is to choose where and how to aggregate resource data coming from different endpoints, even different services. Let’s say we have two endpoints on a service that manages surveys:

  • One endpoint exposes a resource called SurveyInfo that contains the survey’s descriptive fields: title, description, number of questions, and so on.
  • On the other, we can get a list of SurveyResult: a resource that exposes, for a specific user and survey, the given answers plus whether the user has completed that survey or not.

Our client needs to display the name and number of survey questions, contained in the SurveyInfo, along with the user’s completion info, contained in the SurveyResult:

[Image: survey items example]

How can we compose them?

The composition can be done on the client side: the client needs to know both endpoints, call the one that retrieves the list of SurveyResult, collect the survey ids, and use them to retrieve the related SurveyInfo items. Then it performs some aggregation logic based on the survey id they have in common. It seems a simple and effective solution, but it has drawbacks:

  • This is business logic that shouldn’t stay in the presentation layer, where it’s hard to trace errors and easy to lose overall comprehension of the business domain.
  • It requires two requests over the internet for something that can be done on the back end with a simple database query or some local network call.
  • Endpoints need to be publicly available, increasing the risk of a data breach.

On the other side, making the aggregation on the back end often means inventing a new composite resource just for the sake of presenting data, thus increasing the number of endpoints to maintain for each possible permutation. In this case, we can just expose a new SurveyResultWithInfo resource, but what if we also need to show information about the user or the company that has created the survey? How many different composite resources and endpoints do we have to manage?

Looking at the problem from the opposite point of view, we could enrich the SurveyResult or the SurveyInfo resource with all the additional required data. This reduces the number of endpoints, but it increases the size of the response and complicates the required computations, even in the cases where we don’t need the extra pieces of information.

For small projects, implementing some data-shaping technique, or finding a trade-off among these approaches and using the one that best fits our needs, can be enough. But we are a growing company that manages big enterprise products, and we need to find a third way.

Our choice: the GraphQL Gateway

Implementing an API gateway is our way: a new actor is introduced between clients and services, and acts as an orchestrator that can analyze the request, call the needed endpoints and compose the requested data to be returned as a single response.

An API gateway hides the complexity of the underlying services and can offer coherent APIs that can be further customized to meet the needs of each type of client we are serving (smartphone app, web browser, tablet app). It still requires network calls to retrieve data, but those calls are local to the required services, and only the gateway exposes its endpoints publicly while the services remain hidden.

We might still worry about moving orchestration logic outside the services, but now we have a dedicated actor for that, and it is still something on the back-end side. In addition, one of the main benefits is that it can perform a protocol translation: our service endpoints are still RESTful, but we decided to adopt GraphQL as the protocol to query our gateway in a more flexible way.

It’s not the aim of this article to dive into what GraphQL is; there are lots of tutorials on the official page explaining the basics of a server-side implementation with the most popular programming languages. We have chosen to attempt this with Python and Graphene, a popular library for building GraphQL APIs.

Unfortunately, there is a lack of documentation on how to solve one of the most common GraphQL performance pitfalls: the N+1 problem. I hope this article will make life easier for those who want to solve it on the same stack.

The N+1 problem

Recall the SurveyInfo and the SurveyResult resources. This is how we define the related GraphQL object types on Graphene.

import graphene

# Object type representing a survey's descriptive info
class SurveyInfo(graphene.ObjectType):
    id = graphene.ID()
    name = graphene.String()
    title = graphene.String()
    duration = graphene.Int()
    description = graphene.String()

    steps_count = graphene.Int()

# Object type representing a survey result
class SurveyResult(graphene.ObjectType):
    survey_id = graphene.ID()
    slug = graphene.String()
    is_completed = graphene.Boolean()
    # other fields omitted for clarity

# Object type representing a list of survey results
class SurveyResultList(graphene.ObjectType):
    items = graphene.List(SurveyResult)
    count = graphene.Int()

With the above definitions, a client can ask for a list of SurveyResult but still needs to make a separate query for retrieving the related SurveyInfo. With Graphene, it is very easy to permit optionally receiving the SurveyInfo as part of a SurveyResult item. First, we need to add a SurveyInfo field as part of the SurveyResult definition:

class SurveyResult(graphene.ObjectType):
    survey_id = graphene.ID()
    slug = graphene.String()
    is_completed = graphene.Boolean()
    info = graphene.Field(type=SurveyInfo, resolver=SurveyInfoFromResultResolver.resolve) # SurveyInfo addition

Second, we need to instruct Graphene on how to resolve that field based on the retrieved SurveyResult. This is why we need to specify a resolver function (SurveyInfoFromResultResolver.resolve) that, as a first attempt, can be implemented to simply call the SurveyInfo endpoint.

class SurveyInfoFromResultResolver(AbstractResolver):
    @staticmethod
    async def resolve(survey_result, info: ResolveInfo):
        survey_id = survey_result.survey_id

        # Here I get the survey info directly from the related endpoint
        survey_info = RestHttpClient.get_survey_info(survey_id)

        return survey_info

Does it work? Sure, but it also introduces the performance issue known as the N+1 problem: this resolver function is called once for each item of the SurveyResultList we are retrieving.

This means, for instance, that if we retrieve 10 items, then 10 network calls are made. This is why we talk about the “N+1” problem: 1 call for retrieving the list of SurveyResult plus N calls to the SurveyInfo endpoint, one for each item in the list.

How can we solve it? If our REST service exposes an endpoint that can accept a list of survey ids and return a list of SurveyInfo, we can collect all the survey ids from the SurveyResult items and perform just one call with them. But the resolver function is still called N times, and there is no way to instruct Graphene to call it just once. Instead, there is an “official” solution based on an object called “DataLoader.”

Data Loaders

In every GraphQL implementation, resolver functions are allowed to return either the resolved data or a Promise (i.e., an object indicating that the data will be retrieved later). When the promise is resolved, Graphene will complete the response with the “promised” data.

This is the “trick” behind DataLoaders: they are objects that, when asked to load a resource with a particular identifier, return just a promise for that resource and, under the hood, collect all the requested identifiers to perform a single batched request.
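
To make the contract concrete, here is a tiny, self-contained sketch using the promise package’s DataLoader. The loader is a toy that squares numbers and is unrelated to our production SurveyInfoLoader.

from promise import Promise
from promise.dataloader import DataLoader


class SquareLoader(DataLoader):
    # batch_load_fn receives all the keys collected by load() and must return
    # a Promise of results in the same order as the keys
    def batch_load_fn(self, keys):
        return Promise.resolve([key * key for key in keys])


loader = SquareLoader()

# Each load() returns a Promise immediately; the batch function is invoked later
# with the collected keys
squares = Promise.all([loader.load(2), loader.load(3)])
print(squares.get())  # [4, 9]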

This is how the resolver function looks after the introduction of a DataLoader:

class SurveyInfoFromResultResolver(AbstractResolver):
    @staticmethod
    async def resolve(parent, info: ResolveInfo) -> Promise:
        survey_id = parent.survey_id

        # Here I get the DataLoader instance
        survey_info_data_loader = info.context["request"].state.survey_info_loader

        # I ask the DataLoader to batch the request for this survey_id.
        # No HTTP call is actually performed; it just returns a Promise that will be resolved later.
        survey_info_load_promise = survey_info_data_loader.load(survey_id)
        return survey_info_load_promise

No more calls to the REST endpoint are performed in the resolution function: each call to the load method of the DataLoader simply means “collect this id for later.” The DataLoader will batch all the survey ids and will call the batch resolution function that, by requirements, we have to define on the DataLoader itself with the name batch_load_fn:

class SurveyInfoLoader(DataLoader):
    def __init__(self, headers):
        super().__init__()
        self.http_client = SurveyClient()

    @rollbar_trace
    def batch_load_fn(self, survey_ids):
        try:
            # Retrieving the list of survey infos using the batched survey ids
            get_survey_info_response = self.http_client.get_survey_infos(survey_ids)

            # Reordering the retrieved infos to match the order of the survey_ids list
            retrieved_items = get_survey_info_response.items
            ordered_survey_infos = []
            for survey_id in survey_ids:
                survey_info = next((item for item in retrieved_items if item.id == survey_id), None)
                ordered_survey_infos.append(survey_info)

            return Promise.resolve(ordered_survey_infos)
        except Exception as error:
            # Reject the promise so every resolver waiting on this batch receives the error
            return Promise.reject(error)

There is no need to worry about “when” this function will be called: it’s guaranteed that it will be done after all the survey ids are collected. We just need to ensure that these two requirements are satisfied:

  1. The items retrieved in the batch function must be reordered to match the order of the related identifiers. That is, if the batch load function receives survey ids “1,” “4,” and “6,” we have to return the SurveyInfo items related to survey ids “1,” “4,” and “6” in exactly this order.
  2. The data loader should be instantiated once for each request. More complex caching and batching strategies can be implemented with a DataLoader whose lifetime spans multiple requests, but that is beyond the scope of this article.

There are different ways to satisfy the second requirement, depending on which stack of libraries is used to implement GraphQL. If possible, the recommended one is to instantiate a new DataLoader before each GraphQL query execution and place it in the execution context so that it is available to the resolver functions. On our gateway, we use FastAPI and a “ready to go” GraphQL app implementation from the Starlette library. Unfortunately, this implementation hides access to the execution context, so we had to find another way.

Starlette offers the chance to add data to an incoming request through its state property. Since the request can be retrieved in the resolver functions through the execution context, we can plug in a simple Starlette middleware that intercepts the request and adds a new DataLoader instance to its state:

# The imports below assume graphql-core 2.x, a Starlette version that still ships GraphQLApp,
# and the promise package used by graphene (import paths for the scheduler are based on that layout)
import graphene
from fastapi import FastAPI
from graphql.execution.executors.asyncio import AsyncioExecutor
from promise import set_default_scheduler
from promise.schedulers.asyncio import AsyncioScheduler
from starlette.graphql import GraphQLApp
from starlette.middleware.base import BaseHTTPMiddleware

# This instructs DataLoader Promises to work on the same event loop as the GraphQLApp
# (asyncio, since the GraphQL app is using the AsyncioExecutor)
set_default_scheduler(AsyncioScheduler())


class DataLoadersMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Every time a new request is created, a new SurveyInfoLoader is added to the request state.
        # Note that adding per-request objects to request.state is the recommended way according to
        # the Starlette documentation
        request.state.survey_info_loader = SurveyInfoLoader(request.headers)

        response = await call_next(request)
        return response


app = FastAPI(title='GraphQLTutorial', description='CloudAcademy GraphQL server')
app.add_middleware(DataLoadersMiddleware)

app.add_route(
    "/graphql/",
    GraphQLApp(
        schema=graphene.Schema(mutation=Mutation, query=Query),
        graphiql=ENABLE_GRAPHIQL,
        executor_class=AsyncioExecutor,
    ),
)

Conclusion

Although we have shown how to use DataLoaders in a Python app, they were originally developed by Facebook for the JavaScript implementation of GraphQL, and it seems they have been ported to every officially supported programming language.
They are simple but powerful objects. They bring out the best parts of GraphQL, enabling flexible composition of resources without compromising too much on the performance side. Plus, they are easy to use.

We are currently switching our gateway to Apollo/TypeScript in order to exploit GraphQL on a better-supported stack. Even if it’s not in your plans to use Graphene, understanding the concept behind DataLoaders is worth the effort!

The post GraphQL/Graphene: How to Solve the N+1 Problem With DataLoaders appeared first on Cloud Academy.

A Day in the Life of a Front-End Engineer at Cloud Academy (When There’s Not a Pandemic) #DevStories https://cloudacademy.com/blog/a-day-in-the-life-of-a-front-end-engineer-at-cloud-academy/ https://cloudacademy.com/blog/a-day-in-the-life-of-a-front-end-engineer-at-cloud-academy/#respond Fri, 19 Mar 2021 01:32:46 +0000 https://cloudacademy.com/?p=45609 It’s a typical Monday morning, and I am about to get off the TILO train that takes me to Cloud Academy’s offices in Mendrisio every morning. It’s 8:36 a.m. sharp (lovely how punctual Swiss trains are), and I am walking to the office to start a new day. 🥳 As...

The post A Day in the Life of a Front-End Engineer at Cloud Academy (When There’s Not a Pandemic) #DevStories appeared first on Cloud Academy.

Day in the life

It’s a typical Monday morning, and I am about to get off the TILO train that takes me to Cloud Academy’s offices in Mendrisio every morning.

It’s 8:36 a.m. sharp (lovely how punctual Swiss trains are), and I am walking to the office to start a new day. 🥳

As I walk towards my desk to plug in my MacBook, I greet the other early morning coworkers who are already in the office and then make my way to our awesome espresso coffee machine to get my first caffeine fix of the day.

Now that the caffeine ☕️  is starting to do its magic, I proceed to open up the WebStorm IDE to spin up my dev environment while I also take a quick look at my Google Calendar to get a feeling of what the day is gonna look like, and finally check the Jira board to see what the current progress of my tasks is.

9:45 a.m. – Time for the daily stand-up!

The engagement squad (my current team 😎) is ready to report the various updates regarding what everybody has been up to since yesterday, and we have a 10/15-minute scrum meeting to discuss our tasks for the day and whether or not we have any particular blockers for the issues we’re currently working on.

10:00 a.m. – Back to work ⚙️

Today I’m working on adding a custom functionality to a React Component that I am currently building (i.e., the Card Certification), which is related to our upcoming User Certifications project (edit: it’s now live! 🥳) that’s going to enable both enterprise customers and individual users to upload their custom certifications to CA’s platform.

[Image: the Card Certification component]

I am currently working on the “add to LinkedIn” functionality so that, when users click the corresponding card button, they are redirected to the related LinkedIn profile page with all of the main certification details already pre-filled.

Below is the actual code snippet with the main parameters that I am using to connect to LinkedIn’s add to profile endpoint so as to fill out the required fields.

[Image: code snippet with the main parameters used to call LinkedIn’s Add to Profile endpoint]

1:00 p.m. Time for some lunch! 🥪

I sit in our kitchen/lunch area and enjoy the quick lunch I brought from home while having a nice conversation with my peers.

Today is also a very special day because we’re celebrating the birthday of a CA employee, and this means only one thing….🥁 free brioches for everybody! 🥐

This is one of my favorite traditions here at Cloud Academy, as birthdays are always properly celebrated with the right level of sweetness 😉

1:30 p.m.  – Time to play table tennis! 🏓

After the lunch break, some of us generally gather and then split into small teams to play table tennis together. It’s such a nice way to recharge before going back to coding.

Today I feel quite energized (might be the caffeine + brioche combo), and I actually win the match! 🙌🏻

2:00 p.m. Back to work again 💼

Time to get back to coding. I am moving forward with the implementation of the “add to LinkedIn profile” functionality for the certifications, and I am testing whether the interaction flow works as expected and whether all the fields are correctly filled out whenever a user clicks the corresponding button to showcase a new certification on LinkedIn.

4:00 p.m. Let’s push some code! 🤓

After having properly debugged and checked that everything works as it should, I proceed to open a pull request on Bitbucket and pick two front-end colleagues as reviewers so that they can give me some feedback on my work.

Providing useful and actionable feedback is definitely one of the main key points to follow when it comes to reviewing pull requests, and we all strive to be as assertive as possible whenever we write comments on other pull requests so that we can all learn something in the process.

5:00 p.m. – PR review time 📝

I am now reviewing a pull request on our Design System’s repo (i.e., Bonsai) for a new component that is being created. I am curious to take a look at my colleague’s work, since I find it truly interesting to see how other developers tackle a certain problem from different viewpoints and to compare that with how I would have approached the same challenge.

It’s always quite interesting to analyze the pros and cons of certain types of potential solutions, and with Bonsai we are trying to strike a reasonable balance between meeting certain technical standards and remaining sufficiently close to the actual design requirements.

As an example, some of the guidelines that we are adhering to when it comes to designing new Bonsai Components are outlined in our docs page, where we have created a specific definition to differentiate between “ready-to-use” components and what we have named “stylable” components.

Creating a new “ready-to-use” component (we mostly use those) means that its corresponding TypeScript interface will have its basic style and corresponding functionalities already defined, so that a Developer can quickly and easily use it as it is.

6:00 p.m. – Wrapping up

Another workday has come to an end, and it’s now time for me to hop on the train back to Italy 🇮🇹  and relax and unwind a bit.

The post A Day in the Life of a Front-End Engineer at Cloud Academy (When There’s Not a Pandemic) #DevStories appeared first on Cloud Academy.

Our Journey to GraphQL https://cloudacademy.com/blog/our-journey-to-graphql/ https://cloudacademy.com/blog/our-journey-to-graphql/#respond Mon, 22 Feb 2021 02:16:04 +0000 https://cloudacademy.com/?p=44982 Here at Cloud Academy, we have recently been evolving our back-end architecture. We took the path toward a microservices architecture, and we’re still working on some pillars that will enable us to become faster in development while gaining overall reliability and stability.  Many projects are contributing to the core foundation...

The post Our Journey to GraphQL appeared first on Cloud Academy.

Our Journey to GraphQL

Here at Cloud Academy, we have recently been evolving our back-end architecture. We took the path toward a microservices architecture, and we’re still working on some pillars that will enable us to become faster in development while gaining overall reliability and stability. 

Many projects are contributing to the core foundation of Cloud Academy’s new architecture, since we are working on different topics such as authentication and authorization and event-driven communication between microservices, along with some improvements to the infrastructure. 

Building and developing a cloud native application

In this scenario, we have also introduced GraphQL as the abstraction layer between services and their clients. Initially developed at Facebook, GraphQL is now an open-source query language for APIs. It means that the client can define the structure of the data it requires, reducing the amount of data transferred between client and server. GraphQL provides stable interfaces for querying and mutating objects on the back-end services through the definition of a schema, which is the contract between the front end and the back end. This solution is particularly effective at hiding the complexity of the back-end systems from the front end: this is something we wanted to introduce into our architecture to ease its evolution. 
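
To give a flavor of what this looks like with Graphene, here is a minimal, self-contained sketch of a schema and a query. The field and the value are hypothetical and are not part of our production contract; Graphene exposes snake_case fields in camelCase by default.

import graphene

# Minimal illustrative schema: a single query field
class Query(graphene.ObjectType):
    learning_path_title = graphene.String()

    def resolve_learning_path_title(self, info):
        return "AWS Fundamentals"

schema = graphene.Schema(query=Query)

# The client asks only for the fields it needs
result = schema.execute("{ learningPathTitle }")
print(result.data)  # prints only the requested field, e.g. {'learningPathTitle': 'AWS Fundamentals'}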

Think big, start small

Once we decided to introduce GraphQL into our architecture, we planned how to do that safely. First of all, we conducted a spike to dive deep into GraphQL, and we evaluated the right technology for our needs.

We compared Java, NodeJS, and Python, with the latter emerging as the stack that best fit our needs, not only because it is the main language of our back-end stack but also because it satisfied our performance requirements. 

Once we agreed on the technological choices, we decided to test GraphQL in production to gain confidence with this technology and spot issues as soon as possible. Given these premises, we chose a feature that wasn’t crucial for the business but still generated a considerable amount of traffic: learning path certificate generation. If you’re not familiar with Cloud Academy’s learning paths, you should take a look at our website or start a 7-day trial to explore all our features and to test our GraphQL server as well. 

Back to our subject: We successfully rolled out a GraphQL query and a mutation, and we were pretty satisfied with them, so we confirmed the adoption of this technology in our stack. We started onboarding all the squads and spreading the usage of GraphQL by making it the default choice for all the new endpoints and enforcing its introduction on the existing ones. 

[Image: Python-based microservices]

Next steps

After a few months, we noticed that the library we relied on (graphene-python) did not offer a set of conveniences such as query cost analysis and schema-first support. In addition, the graphene library has recently been abandoned by its principal contributor. For these reasons, we decided to move toward a more stable solution, switching to the de facto standard for GraphQL: Apollo Server. This choice will allow us to get the most out of GraphQL’s features, and it also gives us the possibility to set up a federated architecture to isolate GraphQL servers and avoid having a single instance covering all our needs. We will “start small” in this case as well: we will follow the same approach we used to introduce GraphQL into our architecture. We’re currently working on the first queries in Apollo Server, and we’re going to release them in a few weeks. The journey continues… 

The post Our Journey to GraphQL appeared first on Cloud Academy.
