Tag: aws s3

  • Building Google Photos Alternative Using AWS Serverless

    Being an avid Google Photos user, I really love some of its features, such as albums, face search, and unlimited storage. However, when Google announced that unlimited storage would end on June 1st, 2021, I started thinking about how I could create a cheaper solution that would meet my photo backup requirements.

    “Taking an image, freezing a moment, reveals how rich reality truly is.”

    – Anonymous

    Google offers 100 GB of storage for 130 INR. This storage can be used across various Google applications. However, I don’t use all the space in one go. I snap photos randomly: sometimes I visit places and take snaps with my DSLR and smartphone. In general, I upload approximately 200 photos a month, ranging in size from 4 MB to 30 MB. On average, I use about 4 GB of storage a month on my external hard drive to keep raw photos, even the bad ones. Photos backed up to the cloud should be visually high-quality, and it’s good to have a raw copy available at the same time so that you can make some Lightroom edits (although I never touch them 😛). So, here are my minimal requirements:

    • Should support social authentication (Google sign-in preferred).
    • Photos should be stored securely in raw format.
    • Storage should be scaled with usage.
    • Uploading and downloading photos should be easy.
    • Web view for preview would be a plus.
    • Should have almost no operational headaches, and the solution should be as cheap as possible 😉.

    Selecting Tech Stack

    To avoid operational headaches such as servers going down, scaling, application crashes, and overall monitoring, I opted for a serverless solution on AWS. Amazon S3 provides virtually unlimited, scalable storage, and you pay only for the storage you use. On top of that, you can choose an S3 storage class that matches how often you access your data, which keeps storage cost-effective.

    – Infrastructure Stack

    1. AWS API Gateway (HTTP API)
    2. AWS Lambda (for processing images and API Gateway queries)
    3. DynamoDB (for storing image metadata)
    4. AWS Cognito (for authentication)
    5. AWS S3 Bucket (for storage and web application hosting)
    6. AWS Certificate Manager (to use SSL certificate for a custom domain with API gateway)

    – Software Stack

    1. NodeJS
    2. ReactJS and Material-UI (front-end framework and UI)
    3. AWS Amplify (for simplifying auth flow with Cognito)
    4. Sharp (high-speed Node.js library for converting images)
    5. Express and serverless-http
    6. Infinite Scroller (for gallery view)
    7. Serverless Framework (for ease of deployment and Infrastructure as Code)

    Create S3 Buckets:

    We will create three S3 buckets. The first one hosts the frontend application (refer to the architecture diagram; more on this in the build and hosting part). The second one is for temporarily uploading images. The third one is for actual backup and storage (enable server-side encryption on this bucket). Images uploaded to the temporary bucket are picked up and processed by a Lambda function.

    During pre-processing, we resize the original image into two different sizes: a thumbnail (400px wide) and a reduced-quality version for viewing (webp format). Once the images are resized, we upload all three (raw, thumbnail, and web view) to the third S3 bucket and create a record in DynamoDB. Set up an object expiry policy of 1 day on the temporary bucket so that uploaded objects are automatically deleted from it.
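    To make the thumbnail step concrete, here is a minimal Python sketch of the arithmetic behind a 400px-wide resize that preserves the aspect ratio. In the actual stack Sharp handles this for you; the function name and the choice not to enlarge small images are assumptions made only for illustration.

```python
def thumbnail_size(width, height, target_width=400):
    """Return (width, height) scaled down to target_width, keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    if width <= target_width:
        # Assumption: images already narrower than the target are not enlarged.
        return width, height
    # Scale the height by the same factor as the width.
    return target_width, max(1, round(height * target_width / width))


# A 6000x4000 DSLR frame becomes a 400x267 thumbnail.
print(thumbnail_size(6000, 4000))
```

    Keeping the ratio fixed means only the width needs to be configured; the height follows from the source image.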

    Setup trigger on the temporary bucket for uploaded images:

    We need to set up an S3 object-created event, which triggers our Lambda function to download and process images. We filter the event trigger on the suffixes .jpg and .jpeg, meaning that any file with one of these extensions uploaded to our temporary bucket automatically invokes the Lambda function with the event payload. Using that payload, the Lambda function downloads the uploaded file and processes it. Your serverless function definition would look like:

    functions:
     lambda:
       handler: index.handler
       memorySize: 512
       timeout: 60
       layers:
         - {Ref: PhotoParserLibsLambdaLayer}
       events:
         - s3:
             bucket: your-temporary-bucket-name
             event: s3:ObjectCreated:*
             rules:
               - suffix: .jpg
             existing: true
         - s3:
             bucket: your-temporary-bucket-name
             event: s3:ObjectCreated:*
             rules:
               - suffix: .jpeg
             existing: true

    Notice that in the YAML events section, we set “existing: true”. This tells the framework to attach the trigger to a bucket that already exists, so the bucket is not created during the serverless deployment. If you would rather not create your S3 bucket manually, you can omit this setting and let the framework create the bucket for you.
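    As a rough illustration of what the function receives, the sketch below (in Python rather than the Node.js used by the project) pulls the bucket name and object key out of an S3 event payload. The function name and the sample key are hypothetical; the Records/s3/bucket/object layout is the standard S3 notification structure, and object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Yield (bucket, key) for every record in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Keys are URL-encoded in the event (a space arrives as '+'), so decode them.
        key = unquote_plus(s3["object"]["key"])
        yield bucket, key


# Hypothetical, trimmed-down event for illustration only.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Post",
            "s3": {
                "bucket": {"name": "your-temporary-bucket-name"},
                "object": {"key": "some-identity-id/my+photo.jpg", "size": 4194304},
            },
        }
    ]
}

for bucket, key in uploaded_objects(sample_event):
    print(bucket, key)
```

    Decoding the key before calling S3 matters for file names that contain spaces or other special characters.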

    DynamoDB as metadatadb:

    Amazon DynamoDB is a key-value and document database that suits our use case. It helps us retrieve the list of available photos in chronological order. DynamoDB uses a primary key to uniquely identify each record. A primary key can be composed of a hash key (partition key) and an optional range key (also called a sort key). We will use the federated identity ID (discussed in the authorization setup) as the hash key, stored in an attribute named username of type string. We will use an attribute named timestamp, of type number, as the range key. The range key lets us query results in time order (Unix epoch). We could also use DynamoDB secondary indexes to sort results more specifically. However, to keep the application simple, we’re going to skip this feature for now. Your serverless resource definition would look like:

    resources:
     Resources:
       MetaDataDB:
         Type: AWS::DynamoDB::Table
         Properties:
           TableName: your-dynamodb-table-name
           AttributeDefinitions:
             - AttributeName: username
               AttributeType: S
             - AttributeName: timestamp
               AttributeType: N
           KeySchema:
             - AttributeName: username
               KeyType: HASH
             - AttributeName: timestamp
               KeyType: RANGE
           BillingMode: PAY_PER_REQUEST
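    To show how this key schema supports "latest photos first" queries, here is a hedged Python sketch that only builds the parameters for a DynamoDB Query call (low-level client format, as accepted by boto3's client.query). The function name and defaults are assumptions; the attribute names are aliased because timestamp is a DynamoDB reserved word.

```python
def build_photo_query(identity_id, before_timestamp=None, limit=10,
                      table_name="your-dynamodb-table-name"):
    """Build Query parameters for one user's photos, newest first."""
    names = {"#u": "username"}
    values = {":u": {"S": identity_id}}
    condition = "#u = :u"
    if before_timestamp is not None:
        # Only add the alias when it is used; DynamoDB rejects unused aliases.
        names["#t"] = "timestamp"
        values[":t"] = {"N": str(before_timestamp)}
        condition += " AND #t < :t"
    return {
        "TableName": table_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ScanIndexForward": False,  # descending order on the range key
        "Limit": limit,
    }


# Usage (assumes boto3 and credentials are available):
#   boto3.client("dynamodb").query(**build_photo_query("some-identity-id"))
print(build_photo_query("some-identity-id", before_timestamp=1622505600))
```

    Passing the timestamp of the last photo already shown as before_timestamp is one simple way to fetch the next batch.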

    Finally, you also need to set up the IAM role so that the image-processing Lambda function has access to the S3 buckets and DynamoDB. Here is the serverless definition for the IAM role.

    # you can add statements to the Lambda function's IAM Role here
     iam:
       role:
         statements:
         - Effect: "Allow"
           Action:
             - "s3:ListBucket"
           Resource:
             - arn:aws:s3:::your-temporary-bucket-name
             - arn:aws:s3:::your-actual-photo-bucket-name
         - Effect: "Allow"
           Action:
             - "s3:GetObject"
             - "s3:DeleteObject"
           Resource: arn:aws:s3:::your-temporary-bucket-name/*
         - Effect: "Allow"
           Action:
             - "s3:PutObject"
           Resource: arn:aws:s3:::your-actual-photo-bucket-name/*
         - Effect: "Allow"
           Action:
             - "dynamodb:PutItem"
           Resource:
             - Fn::GetAtt: [ MetaDataDB, Arn ]

    Setup Authentication:

    Okay, to set up a Cognito user pool, head to the Cognito console and create a user pool with below config:

    1. Pool Name: photobucket-users

    2. How do you want your end-users to sign in?

    • Select: Email Address or Phone Number
    • Select: Allow Email Addresses
    • Check: (Recommended) Enable case insensitivity for username input

    3. Which standard attributes are required?

    • email

    4. Keep the defaults for “Policies”

    5. MFA and Verification:

    • I opted to manually reset the password for each user (since this is an internal app)
    • Disabled user verification

    6. Keep the default for Message Customizations, tags, and devices.

    7. App Clients :

    • App client name: myappclient
    • Let the refresh token, access token, and id token be default
    • Check all “Auth flow configurations”
    • Check enable token revocation

    8. Skip Triggers

    9. Review and create the pool

    Once created, go to App integration -> Domain name. Create a Cognito subdomain of your choice and note it down. Next, I plan to use the Google sign-in feature with Cognito federated identity providers. Use this guide to set up a Google social identity with Cognito.

    Setup Authorization:

    Once the user identity is verified, we need to allow them to access the s3 bucket with limited permissions. Head to the Cognito console, select federated identities, and create a new identity pool. Follow these steps to configure:

    1. Identity pool name: photobucket_auth

    2. Keep Unauthenticated and Authentication flow settings unchecked.

    3. Authentication providers:

    • User Pool ID: Enter the user pool ID obtained during authentication setup
    • App Client ID: Enter the app client ID generated during the authentication setup. (Cognito user pool -> App Clients -> App client ID)

    4. Setup permissions:

    • Expand view details (Role Summary)
    • For authenticated identities: edit policy document and use the below JSON policy and skip unauthenticated identities with the default configuration.
    {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": [
                   "mobileanalytics:PutEvents",
                   "cognito-sync:*",
                   "cognito-identity:*"
               ],
               "Resource": [
                   "*"
               ]
           },
           {
               "Sid": "ListYourObjects",
               "Effect": "Allow",
               "Action": "s3:ListBucket",
               "Resource": [
                   "arn:aws:s3:::your-actual-photo-bucket-name"
               ],
               "Condition": {
                   "StringLike": {
                       "s3:prefix": [
                           "${cognito-identity.amazonaws.com:sub}/",
                           "${cognito-identity.amazonaws.com:sub}/*"
                       ]
                   }
               }
           },
           {
               "Sid": "ReadYourObjects",
               "Effect": "Allow",
               "Action": [
                   "s3:GetObject"
               ],
               "Resource": [
                   "arn:aws:s3:::your-actual-photo-bucket-name/${cognito-identity.amazonaws.com:sub}",
                   "arn:aws:s3:::your-actual-photo-bucket-name/${cognito-identity.amazonaws.com:sub}/*"
               ]
           }
       ]
    }

    ${cognito-identity.amazonaws.com:sub} is a special AWS variable. When a user is authenticated with a federated identity, each user is assigned a unique identity. What the above policy means is that any user who is authenticated should have access to objects prefixed by their own identity ID. This is how we intend users to gain authorization in a limited area within the S3 bucket.
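    The effect of the policy can be illustrated with a small Python sketch. It is not how IAM evaluates policies; it only mimics the prefix rule of the ReadYourObjects statement. The function names and the sample identity IDs are hypothetical.

```python
def user_prefix(identity_id):
    """Folder-like prefix that belongs to one federated identity."""
    return f"{identity_id}/"


def may_read(identity_id, object_key):
    """Mimic ReadYourObjects: a user may read only keys under their own identity ID."""
    return object_key == identity_id or object_key.startswith(user_prefix(identity_id))


# Hypothetical identity IDs, for illustration only.
alice = "us-east-1:11111111-aaaa"
bob = "us-east-1:22222222-bbbb"

print(may_read(alice, f"{alice}/thumbnail/img_0001.jpg"))  # True
print(may_read(alice, f"{bob}/thumbnail/img_0001.jpg"))    # False
```

    Because the identity ID is substituted by AWS at request time, the same policy document serves every user without listing them individually.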

    Copy the Identity Pool ID (from sample code section). You will need this in your backend to get the identity id of the authenticated user via JWT token.

    Amplify configuration for the frontend UI sign-in:

    This object helps you set up the minimal configuration for your application. This is all that we need to sign in via Cognito and access the S3 photo bucket.

    const awsconfig = {
       Auth : {
           identityPoolId: "identity pool id created during authorization setup",
           region : "your aws region",
           identityPoolRegion: "same as above if cognito is in same region",
           userPoolId : "cognito user pool id created during authentication setup",
           userPoolWebClientId : "cognito app client id",
           cookieStorage : {
               domain : "your-app-domain-name", //this is very important; a cookie domain is a host name, without the https:// scheme
               secure: true
           },
           oauth: {
               domain : "{cognito domain name}.auth.{cognito region name}.amazoncognito.com",
               scope : ["profile","email","openid"],
               redirectSignIn: 'https://your-app-domain-name',
               redirectSignOut: 'https://your-app-domain-name',
               responseType : "token"
           }
       },
       Storage: {
           AWSS3 : {
               bucket: "your-actual-bucket-name",
               region: "region-of-your-bucket"
           }
       }
    };
    export default awsconfig;

    You can then use the below code to configure and sign in via social authentication.

    import Amplify, {Auth} from 'aws-amplify';
    import awsconfig from './aws-config';
    Amplify.configure(awsconfig);
    //once the amplify is configured you can use below call with onClick event of buttons or any other visual component to sign in.
    //Example
    <Button startIcon={<img alt="Sign in with Google" src={logo} />} fullWidth variant="outlined" color="primary" onClick={() => Auth.federatedSignIn({provider: 'Google'})}>
       Sign in with Google
    </Button>

    Gallery View:

    When the application is loaded, we use the PhotoGallery component to load photos and show thumbnails on the page. The PhotoGallery component is a wrapper around the InfiniteScroller component, which keeps loading images as the user scrolls. The idea here is that we query a maximum of 10 images in one go. Our backend returns a list of 10 images (only the metadata that maps each photo to its object in the S3 bucket, not the image data). We must then load these images from the S3 bucket and show the thumbnails on-screen as a gallery view. When the user reaches the bottom of the screen or there is empty space left, the InfiniteScroller component loads 10 more images. This continues until our backend replies with a stop marker.
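    The batching and the stop marker can be sketched in a few lines of Python. This is an in-memory illustration under assumed names (next_page, nextCursor, hasMore), not the project's backend code; the real backend would read the items from DynamoDB.

```python
def next_page(photos, cursor=None, page_size=10):
    """Return one batch of photos (newest first) plus a stop marker.

    photos: list of dicts sorted by 'timestamp' in descending order.
    cursor: timestamp of the last photo the client already has, or None.
    """
    if cursor is not None:
        photos = [p for p in photos if p["timestamp"] < cursor]
    page = photos[:page_size]
    has_more = len(photos) > page_size
    return {
        "photos": page,
        # None acts as the stop marker: nothing is left to load.
        "nextCursor": page[-1]["timestamp"] if has_more else None,
        "hasMore": has_more,
    }


library = [{"timestamp": t} for t in range(25, 0, -1)]  # 25 fake photos
first = next_page(library)
second = next_page(library, cursor=first["nextCursor"])
third = next_page(library, cursor=second["nextCursor"])
print(len(first["photos"]), len(second["photos"]), len(third["photos"]), third["hasMore"])
```

    The frontend keeps calling with the last cursor it received and stops once hasMore is false.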

    The key point here is that we need to send the JWT token as a header to our backend service via an ajax call. The JWT token is obtained from the Amplify framework after sign-in. An example of obtaining a JWT token:

    let authsession = await Auth.currentSession();
    let jwtToken = authsession.getIdToken().jwtToken;
    let photoList = await axios.get(url,{
       headers : {
           Authorization: jwtToken
       },
       responseType : "json"
    });

    An example of infinite scroller component usage is given below. Note that “gallery” is an array of JSX photo thumbnails. The “loadMore” method calls our ajax function to the server-side backend, updates the “gallery” variable, and sets the “hasMore” variable to true/false so that the infinite scroller component can stop querying when there are no photos left to display on the screen.

    <InfiniteScroll
       loadMore={this.fetchPhotos}
       hasMore={this.state.hasMore}
       loader={<div style={{padding:"70px"}} key={0}><LinearProgress color="secondary" /></div>}
    >
       <div style={{ marginTop: "80px", position: "relative", textAlign: "center" }}>
           <div className="image-grid" style={{ marginTop: "30px" }}>
               {gallery}
           </div>
           {this.state.openLightBox ?
           <LightBox src={this.state.lightBoxImg} callback={this.closeLightBox} />
           : null}
       </div>
    </InfiniteScroll>

    The LightBox component gives a zoom effect to the thumbnail. When the thumbnail is clicked, a higher resolution picture (webp version) is downloaded from the S3 bucket and shown on the screen. We use the Storage object from the Amplify library. Downloaded content is a blob and must be converted into image data. To do so, we use the browser’s native URL.createObjectURL method. Below is the sample code that downloads the object from the S3 bucket and then converts it into a viewable image for the HTML IMG tag.

    thumbClick = (index) => {
       const urlCreator = window.URL || window.webkitURL;
       this.setState({
           openLightBox: true
       });
       // Storage.get returns a promise, so failures are handled in .catch
       // (a surrounding try/catch would not see asynchronous errors)
       Storage.get(this.state.photoList[index].src,{download: true}).then(data=>{
           // convert the downloaded blob into a URL usable by the IMG tag
           let image = urlCreator.createObjectURL(data.Body);
           this.setState({
               lightBoxImg : image
           });
       }).catch(error=>{
           console.log(error);
           this.setState({
               openLightBox: false,
               lightBoxImg : null
           });
       });
    };

    Uploading Photos:

    The S3 SDK lets you generate a pre-signed POST URL. Anyone who gets this URL will be able to upload objects to the S3 bucket directly without needing credentials. Of course, we can actually set up some boundaries, like a max object size, key of the uploaded object, etc. Refer to this AWS blog for more on pre-signed URLs. Here is the sample code to generate a pre-signed URL.

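    const S3 = require("aws-sdk/clients/s3"); // AWS SDK for JavaScript v2
    let s3Params = {
       Bucket: "your-temporary-bucket-name",
       Conditions : [
           ["content-length-range",1,31457280] //allowed object size in bytes (1 byte to 30 MB)
       ],
       Fields : {
           key: "path/to/your/object"
       },
       Expires: 300 //in seconds
    };
    const s3 = new S3({region : process.env.AWSREGION });
    //returns an object with the upload url and the form fields to send along with the file
    const presignedPost = s3.createPresignedPost(s3Params);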
    For a better UX, we can allow our users to upload more than one photo at a time. However, a pre-signed URL lets you upload a single object at a time. To overcome this, we generate multiple pre-signed URLs. Once the user selects photos to upload, the frontend sends a request to our backend with the expected keys. Our backend then generates one pre-signed URL per key. Our frontend React app then provides the illusion that all photos are being uploaded as a whole.
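    A hedged Python sketch of the "one pre-signed POST per photo" idea follows. The project's backend is Node.js; this version assumes a boto3 S3 client is passed in and uses its generate_presigned_post method. The function name presign_uploads and the sample keys are hypothetical.

```python
def presign_uploads(s3_client, bucket, keys, max_bytes=31457280, expires_in=300):
    """Return a {key: presigned POST data} mapping, one entry per photo."""
    return {
        key: s3_client.generate_presigned_post(
            Bucket=bucket,
            Key=key,
            # Reject empty files and anything larger than max_bytes (30 MB here).
            Conditions=[["content-length-range", 1, max_bytes]],
            ExpiresIn=expires_in,  # seconds
        )
        for key in keys
    }


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   posts = presign_uploads(boto3.client("s3"), "your-temporary-bucket-name",
#                           ["some-identity-id/a.jpg", "some-identity-id/b.jpg"])
#   Each value holds a "url" and the form "fields" the browser must send with the file.
```

    The frontend can then fire the uploads in parallel, one form POST per returned entry.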

    When the upload is successful, the S3 object-created event, which we discussed earlier, is triggered. The complete flow of the application is given in a sequence diagram. You can find the complete source code here in my GitHub repository.

    React Build Steps and Hosting:

    The usual way to build the React app is to run npm run build. However, we take a slightly different approach. We are not using S3 static website hosting to serve the frontend UI. One reason is that S3 static website endpoints do not support SSL unless we use CloudFront. Therefore, we will make the API gateway our application’s entry point. Thus, the UI will also be served from the API gateway. However, we want to reduce calls made to the API gateway. For this reason, we will only deliver the index.html file through API gateway/Lambda, and serve the rest of the static files (React supporting JS files) from the S3 bucket.

    Your index.html should have all the reference paths pointed to the S3 bucket. The build must explicitly specify that static files are located somewhere other than relative to the index.html file. Your S3 bucket needs to be public with the right bucket policy and CORS set so that the end-user can only retrieve files and not upload nasty objects. Those who are confused about how the S3 static website and S3 public bucket differ may refer to here. Below are the React build steps, bucket policy, and CORS.

    PUBLIC_URL=https://{your-static-bucket-name}.s3.{aws_region}.amazonaws.com/ npm run build
    //Bucket Policy
    {
       "Version": "2012-10-17",
       "Id": "http referer from your domain only",
       "Statement": [
           {
               "Sid": "Allow get requests originating from",
               "Effect": "Allow",
               "Principal": "*",
               "Action": "s3:GetObject",
               "Resource": "arn:aws:s3:::{your-static-bucket-name}/static/*",
               "Condition": {
                   "StringLike": {
                       "aws:Referer": [
                           "https://your-app-domain-name"
                       ]
                   }
               }
           }
       ]
    }
    //CORS
    [
       {
           "AllowedHeaders": [
               "*"
           ],
           "AllowedMethods": [
               "GET"
           ],
           "AllowedOrigins": [
               "https://your-app-domain-name"
           ],
           "ExposeHeaders": []
       }
    ]

    Once a build is complete, upload index.html to a lambda that serves your UI. Run the below shell commands to compress static contents and host them on our static S3 bucket.

    #assuming you are in your react app directory
    mkdir -p /tmp/s3uploads
    cp -ar build/static /tmp/s3uploads/
    cd /tmp/s3uploads
    #gzip every file in place (each file gets a .gz extension)
    find . -type f -exec gzip -9 {} +
    #remove .gz extension from compressed files
    find . -type f -name '*.gz' | while read -r f
    do
       mv "$f" "${f%.gz}"
    done
    #sync your files to s3 static bucket and mention that these files are compressed with gzip encoding
    #so that browser will not treat them as regular files
    aws s3 --region $AWSREGION sync . s3://${S3_STATIC_BUCKET}/static/ --content-encoding gzip --delete --sse
    cd -
    rm -rf /tmp/s3uploads

    Our backend uses the Node.js Express framework. Since this is a serverless application, we need to wrap Express with serverless-http so that it works with Lambda. Sample source code is given below, along with the serverless framework resource definition. Notice that, except for the UI home endpoint ( “/” ), the rest of the API endpoints are authenticated with Cognito on the API gateway itself.

    const path = require("path");
    const serverless = require("serverless-http");
    const express = require("express");
    const app = express();
    .
    .
    .
    .
    .
    .
    //serve the React entry page; all other static files come from the S3 bucket
    app.get("/",(req,res)=> {
     res.sendFile(path.join(__dirname, "index.html"));
    });
    module.exports.uihome = serverless(app);

    provider:
     name: aws
     runtime: nodejs12.x
     lambdaHashingVersion: 20201221
     httpApi:
       authorizers:
         cognitoJWTAuth:
           identitySource: $request.header.Authorization
           issuerUrl: https://cognito-idp.{AWS_REGION}.amazonaws.com/{COGNITO_USER_POOL_ID}
           audience:
             - COGNITO_APP_CLIENT_ID
    .
    .
    .
    .
    .
    .
    .
    functions:
     react-serve-ui:
       handler: handler.uihome
       memorySize: 256
       timeout: 29
       layers:
         - {Ref: CommonLibsLambdaLayer}
       events:
         - httpApi:
             path: /prep/photoupload
             method: post
             authorizer:
               name: cognitoJWTAuth
         - httpApi:
             path: /list/photos
             method: get
             authorizer:
               name: cognitoJWTAuth
         - httpApi:
             path: /
             method: get

    Final Steps :

    Lastly, we will set up a custom domain, along with a certificate for it, so that we don’t need to use the gibberish domain name generated by the API gateway. You don’t need to use Route 53 for this part. If you have an existing domain, you can create a subdomain and point it to the API gateway. First things first: head to the AWS ACM console and request a certificate for the domain name. Once the request is created, you need to validate your domain by creating the DNS validation record (a CNAME record) shown in the ACM console. Public certificates issued by ACM are free. Domain verification may take a few minutes to several hours. Once you have the certificate ready, head back to the API gateway console. Navigate to “custom domain names” and click create.

    1. Enter your application domain name
    2. Check TLS 1.2 as TLS version
    3. Select Endpoint type as Regional
    4. Select ACM certificate from dropdown list
    5. Create domain name

    Select the newly created custom domain. Note the API gateway domain name from Domain Details -> Configuration tab. You will need this to map a CNAME/ALIAS record with your DNS provider. Click on the API mappings tab. Click configure API mappings. From the dropdown, select your API gateway, select stage as default, and click save. You are done here.

    Future Scope and Improvements :

    To improve application latency, we can use CloudFront as a CDN. This way, our entry point could be S3, and we no longer need to use the API gateway regional endpoint. We can also add AWS WAF as an added layer of security in front of our API gateway to inspect incoming requests and payloads. We can also use DynamoDB secondary indexes so that we can efficiently search metadata in the table. We can add a lifecycle rule that transitions raw photos older than a year to the S3 Glacier storage class (lifecycle rules act on object age rather than on last access). You can further add a transition to Glacier Deep Archive to save more on storage costs.
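    As a sketch of the lifecycle idea, the Python snippet below builds a lifecycle configuration in the shape accepted by boto3's put_bucket_lifecycle_configuration. The rule ID, the prefix, and the day counts are assumptions chosen for illustration, not values from the project.

```python
def raw_photo_lifecycle(prefix="", glacier_after_days=365, deep_archive_after_days=730):
    """Build an S3 lifecycle configuration that archives ageing raw photos."""
    if deep_archive_after_days <= glacier_after_days:
        raise ValueError("the Deep Archive transition must come after the Glacier one")
    return {
        "Rules": [
            {
                "ID": "archive-raw-photos",       # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},     # "" applies to the whole bucket
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


# Usage (assumes boto3 and credentials are available):
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="your-actual-photo-bucket-name",
#       LifecycleConfiguration=raw_photo_lifecycle())
print(raw_photo_lifecycle())
```

    Thumbnails and web-view copies are better left in a standard class, since the gallery reads them often; restricting the rule with a prefix is one way to do that if raw files live under their own path.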

  • Setting up S3 & CloudFront to Deliver Static Assets Across the Web

    If you have a web application, you probably have static content. Static content might include files like images, videos, and music. One of the simpler approaches to serve your content on the internet is Amazon AWS’s “S3 Bucket.” S3 is very easy to set up and use.

    Problems with only using S3 to serve your resources

    But there are a few limitations to serving content directly from S3. Using S3 alone, you will need to:

    • Either keep the bucket public, which is not at all recommended
    • Or create pre-signed URLs to access the private resources. If your application loads tons of resources, pre-signing each and every one before serving it on the UI adds a lot of latency.

    For these reasons, we will also use AWS’s CloudFront.

    Why use CloudFront with S3?

    Amazon CloudFront (CDN) is designed to work seamlessly with S3 to serve your S3 content in a faster way. Also, using CloudFront to serve s3 content gives you a lot more flexibility and control.

    It has the following advantages:

    • CloudFront can control access to your content, so there’s no need to generate pre-signed URLs for each resource.
    • Improved latency, which results in a better end-user experience.
    • CloudFront provides caching, which can reduce running costs, as cached content is not served from S3 on every request.
    • Another case for using CloudFront over S3 is that you can use an SSL certificate for a custom domain in CloudFront.

    Setting up S3 & CloudFront

    Creating an S3 bucket

    1. Navigate to S3 from the AWS console and click on Create Bucket. Enter a unique bucket name and select the AWS Region.

    2. Make sure the Block Public Access setting for this bucket is set to “Block All Public Access,” as this is recommended and we don’t need public access to the bucket.

    3. Review other options and create a bucket. Once a bucket is created, you can see it on the S3 dashboard. Open the bucket to view its details, and next, let’s add some assets.

    4. Click on upload and add/drag all the files or folders you want to upload. 

    5. Review the settings and upload. You can see the status on successful upload. Go to bucket details, and, after opening up the uploaded asset, you can see the details of the uploaded asset.

    If you try to copy the object URL and open it in the browser, you will get the access denied error as we have blocked direct public access. 

    We will be using CloudFront to serve the S3 assets in the next step. Restricting access to your S3 bucket to CloudFront endpoints makes your content and application more secure and performant.

    Creating a CloudFront distribution

    1. Navigate to CloudFront from AWS console and click on Create Distribution. For the Origin domain, select the bucket from which we want to serve the static assets.

    2. Next, we need to use a CloudFront origin access identity (OAI) to access the S3 bucket. This will enable us to access private S3 content via CloudFront. To enable this, under S3 bucket access, select “Yes use OAI.” Select an existing origin access identity or create a new identity.
    You can also choose to update the S3 bucket policy to allow read access to the OAI if it is not already configured.

    3. Review all the settings and create distribution. You can see the domain name once it is successfully created.

    4. The basic setup is done. If you try to access the asset we uploaded via the CloudFront domain in your browser, it should serve the asset. You can access assets at {cloudfront domain name}/{s3 asset},
    e.g. https://d1g71lhh75winl.cloudfront.net/sample.jpeg

    Even though we successfully served the assets via CloudFront, one thing to note is that all the assets are publicly accessible and not secured. In the next section, we will see how you can secure your CloudFront assets.

    Restricting public access

    Previously, while configuring CloudFront, we set Restrict Viewer access to No, which enabled us to access the assets publicly.

    Let’s see how to configure CloudFront to enable signed URLs for assets that should have restricted access. We will be using Trusted key groups, which is the AWS recommended way for restricting access.

    Creating key group

    To create a key pair for a trusted key group, perform the following steps:

    1. Creating the public–private key pair.

    The below commands will generate an RSA key pair and will store the public key & private key in public_key.pem & private_key.pem files respectively.

    openssl genrsa -out private_key.pem 2048
    openssl rsa -pubout -in private_key.pem -out public_key.pem

    Note: The above steps use OpenSSL as an example to create a key pair. There are other ways to create an RSA key pair as well.

    2. Uploading the Public Key to CloudFront.

    To upload, in the AWS console, open CloudFront console and navigate to Public Key. Choose Create Public Key. Add name and copy and paste the contents of public_key.pem file under Key. Once done, click Create Public Key.

    3. Adding the public key to a Key Group.

    To do this, navigate to Key Groups. Add name and select the public key we created. Once done, click Create Key Group.

    Adding key group signer to distribution

    1. Navigate to CloudFront and choose the distribution whose files you want to protect with signed URLs or signed cookies.
    2. Navigate to the Behaviors tab. Select the cache behavior, and then choose Edit.
    3. For Restrict Viewer Access (Use Signed URLs or Signed Cookies), choose Yes and choose Trusted Key Groups.
    4. For Trusted Key Groups, select the key group, and then choose Add.
    5. Once done, review and Save Changes.

    Cheers, you have successfully restricted public access to assets. If you try to open any asset URL in the browser now, you will get an error response instead of the asset.

    You can either create signed urls or cookies using the private key to access the assets.

    Setting cookies and accessing CloudFront private urls

    You need to create and set cookies on the domain to access your content. Once cookies are set,  they will be sent along with every request by the browser.

    The cookies to be set are:

    • CloudFront-Policy: Your policy statement in JSON format, with white space removed, then base64 encoded.
    • CloudFront-Signature: A hashed, signed using the private key, and base64-encoded version of the JSON policy statement.
    • CloudFront-Key-Pair-Id: The ID for a CloudFront public key, e.g., K4EGX7PEAN4EN. The public key ID tells CloudFront which public key to use to validate the signed URL.

    Please note that the cookie names are case-sensitive. Make sure cookies are http only and secure.
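    To make the CloudFront-Policy value more tangible, here is a small standard-library sketch of how a custom policy statement can be serialized without white space and encoded. The function name is hypothetical; the character substitutions (+ to -, = to _, / to ~) are the URL-safe variant CloudFront expects. Signing is left out here.

```python
import base64
import json


def encode_custom_policy(resource, expires_epoch):
    """Return the CloudFront-safe base64 encoding of a custom policy statement."""
    policy = {
        "Statement": [
            {
                "Resource": resource,
                "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
            }
        ]
    }
    # Compact separators remove all white space from the JSON document.
    compact = json.dumps(policy, separators=(",", ":"))
    encoded = base64.b64encode(compact.encode("utf8")).decode("ascii")
    # CloudFront uses its own URL-safe alphabet for these three characters.
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


print(encode_custom_policy("https://d1g71lhh75winl.cloudfront.net/*", 1700000000))
```

    The same compact JSON string is what gets signed with the private key to produce the CloudFront-Signature value.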

    Set-Cookie: 
    CloudFront-Policy=base64 encoded version of the policy statement; 
    Domain=optional domain name; 
    Path=/optional directory path; 
    Secure; 
    HttpOnly
    
    
    Set-Cookie: 
    CloudFront-Signature=hashed and signed version of the policy statement; 
    Domain=optional domain name; 
    Path=/optional directory path; 
    Secure; 
    HttpOnly
    
    Set-Cookie: 
    CloudFront-Key-Pair-Id=public key ID for the CloudFront public key whose corresponding private key you're using to generate the signature; 
    Domain=optional domain name; 
    Path=/optional directory path; 
    Secure; 
    HttpOnly

    Cookies can be created in any language you are working in with the help of the AWS SDK. For this blog, we will create the cookies in Python using the botocore module.

    import datetime
    import functools
    
    import rsa
    from botocore.signers import CloudFrontSigner
    
    # In the format "{protocol}://{domain}/{resource}"
    CLOUDFRONT_RESOURCE = "https://d1g71lhh75winl.cloudfront.net/*"
    # The ID of the CloudFront public key (example value, replace with your own)
    CLOUDFRONT_PUBLIC_KEY_ID = "K4EGX7PEAN4EN"
    # Contents of the private_key.pem file associated with the public key
    with open("private_key.pem", "rb") as key_file:
        CLOUDFRONT_PRIVATE_KEY = key_file.read()
    # Expiry time of the cookies
    EXPIRES_AT = datetime.datetime.now() + datetime.timedelta(hours=1)
    
    # load the private key
    key = rsa.PrivateKey.load_pkcs1(CLOUDFRONT_PRIVATE_KEY)
    # create a signer function that can sign a message with the private key
    rsa_signer = functools.partial(rsa.sign, priv_key=key, hash_method="SHA-1")
    # Create a botocore CloudFrontSigner object
    signer = CloudFrontSigner(CLOUDFRONT_PUBLIC_KEY_ID, rsa_signer)
    
    # build the CloudFront Policy
    policy = signer.build_policy(CLOUDFRONT_RESOURCE, EXPIRES_AT).encode("utf8")
    CLOUDFRONT_POLICY = signer._url_b64encode(policy).decode("utf8")
    
    # create CloudFront Signature
    signature = rsa_signer(policy)
    CLOUDFRONT_SIGNATURE = signer._url_b64encode(signature).decode("utf8")
    
    # you can set these cookies on the response
    COOKIES = {
        "CloudFront-Policy": CLOUDFRONT_POLICY,
        "CloudFront-Signature": CLOUDFRONT_SIGNATURE,
        "CloudFront-Key-Pair-Id": CLOUDFRONT_PUBLIC_KEY_ID,
    }
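
    One way to turn such a dictionary into Set-Cookie header values is the standard library's http.cookies module. This is only a sketch: build_set_cookie_headers is a hypothetical helper, and the cookie values and domain below are placeholders rather than real signed values.

```python
from http.cookies import SimpleCookie


def build_set_cookie_headers(cookies, domain=None, path="/"):
    """Return one Set-Cookie header value per CloudFront cookie."""
    headers = []
    for name, value in cookies.items():
        jar = SimpleCookie()
        jar[name] = value
        jar[name]["path"] = path
        jar[name]["secure"] = True    # only send over HTTPS
        jar[name]["httponly"] = True  # hide from client-side scripts
        if domain:
            jar[name]["domain"] = domain
        headers.append(jar[name].OutputString())
    return headers


# Placeholder values; in practice these come from the signer.
example_cookies = {
    "CloudFront-Policy": "policy-value",
    "CloudFront-Signature": "signature-value",
    "CloudFront-Key-Pair-Id": "K4EGX7PEAN4EN",
}
headers = build_set_cookie_headers(example_cookies, domain="example.com")
```

    Each returned string can be sent as the value of its own Set-Cookie response header by whichever web framework or Lambda handler serves the login response.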

    For more details, you can follow the official AWS docs.

    Once you set cookies using the above guide, you should be able to access the asset.

    This is how you can effectively use CloudFront along with S3 to securely serve your content.

  • Set Up Simple S3 Deployment Workflow with Github Actions and CircleCI

    In this article, we’ll implement a continuous delivery (referred to as CD going forward) workflow for our demo React SPA, using the Serverless framework together with the Serverless Finch plugin.

    Deploying single-page applications to AWS S3 is a common use case. Manual deployment and bucket configuration can be tedious and unreliable. By using Serverless and CD platforms, we can simplify this commonly faced CD challenge.

    In almost every project we have worked on, we have built a general-purpose continuous integration (referred to as CI through the rest of this article) setup as part of our basic setups. The CI requirements might range from simple test workflows to cluster deployments.

    In this article, we’ll be focusing on a simple deployment workflow using Github Actions and CircleCI. Github Actions brought CI/CD to a wider community by simplifying the setup for CI pipelines. 

    Prerequisites

    This article assumes you have a basic understanding of CI/CD and AWS services such as IAM and S3. The sample application uses a basic Create React App project for the deployment demo, but knowing React.js is not required. You can implement the same flow for any other SPA or bare-bones application.

    Why Github Actions?

    There have always been great tools and CI platforms, such as AWS CodePipeline, Jenkins, Travis CI, CircleCI, etc. What makes Github Actions so compelling is that it’s built inside Github. Many organizations use Github for source control, and they often have to spend time configuring repositories with CI tools. On top of that, starting with Github Actions is free.

    As Github Actions is built inside the Github ecosystem, it’s a piece of cake to get CI pipelines up and running. Github Actions also allow you to build your own actions. However, there are some limitations because the CI platform is quite new compared to others.

    Why CircleCI?

    CircleCI has been in the market for almost a decade providing CI/CD solutions. One of many reasons to choose CircleCI is its pricing. CircleCI offers free credits each month without any upfront payments or payment details. It also offers a wide-ranging repository of plugins called Orbs. You can even build your own orbs, which is easy. It also offers simple and reliable workflow building tools. You can check other features as well.

    Let’s Get Started

    To introduce the application, we’ll create a simple React application with master-detail flow added to it. We’ll be using React’s official CRA tool to create our project, which creates the boilerplate for us.

    Installing Dependencies

    Let’s install create-react-app as a global package. We’ll be calling our demo project “Serverless S3”. Now, we will create our React app with the following:

    yarn global add create-react-app
    create-react-app serverless-s3

    Now that we’ve created the frontend application, we can start building something cool with it. If we run the application with yarn start, we should be able to see the default CRA welcome page:

    Source: React

    To implement our master-detail flow of GitHub repositories, we’ll need to add some navigation to our app, so let’s use react-router for that. Also, to keep it short, we’ll be using GitHub’s official SDK package, Octokit, to fetch the data.

    yarn add react-router-dom @octokit/core

    Our demo application will consist of two routes: 

    1. A list of all public repos of an organization
    2. The details of the repository after clicking a repo item from the list 

    We’ll be using the Octokit client to fetch the data from Github’s open endpoints. This won’t need any authentication with Github.

    Adding Application Components

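    Alright, now that we have our dependencies installed, we can add the routes to our App.js, which is the entry point for our React app. Note that the Switch component used below is part of the react-router-dom v5 API; v6 replaces it with Routes.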

    import { BrowserRouter as Router, Switch, Route } from 'react-router-dom';
     
    import RepoList from './RepoList';
    import RepoDetails from './RepoDetails';
     
    import './App.css';
     
    function App() {
      return (
       <Router>
         <div className="App">
           <Switch>
             <Route path="/repo/:owner/:repo" component={RepoDetails} />
             <Route path="/" component={RepoList} />
           </Switch>
         </div>
       </Router>
     );
    }
     
    export default App;

    Let’s initialize our Octokit client, which will help us make calls to Github’s open endpoints to get data.

    import { Octokit } from '@octokit/core';
     
    export const octokit = new Octokit({});

    You can even make calls to authorized resources with the Octokit client. Octokit client supports both GraphQL and REST API. You can learn more about the client through the official documentation.

    Let’s add the RepoList.js component to the application, which will fetch the list of repositories of a given organization and display hyperlinks to the details page.

    import React, { useEffect, useState } from 'react';
    import { Link } from 'react-router-dom';
    import { octokit } from './client';
     
    function RepoList() {
     const [repos, setRepos] = useState([]);
     useEffect(() => {
       octokit
         .request('GET /orgs/{org}/repos', {
           org: 'octokit',
         })
         .then((data) => setRepos(data.data));
     }, []);
     
     return (
       <div className="repo-list-container">
         <h1>Repositories</h1>
         <ul>
           {repos.map((repo) => (
             <li key={repo.id} className="repo-list-item">
               <Link to={`/repo/${repo.owner.login}/${repo.name}`}>{repo.full_name}</Link>
             </li>
           ))}
         </ul>
       </div>
     );
    }
     
    export default RepoList;

    Now that we have our list of repositories ready, we can let users see some general details of each repository. Let’s create our details component called RepoDetails:

    import { useEffect, useState } from 'react';
    import { useParams } from 'react-router-dom';
    import { octokit } from './client';
    function RepoDetails() {
      const [repo, setRepo] = useState();
      const { repo: repoName, owner } = useParams();
      useEffect(() => {
        octokit
          .request('GET /repos/{owner}/{repo}', {
            owner,
            repo: repoName,
          })
          .then((data) => setRepo(data.data));
      }, [repoName, owner]);
      if (!repo) {
        return <b>loading...</b>;
      }
      return (
        <div className="repo-container">
          <h1>{repo.full_name}</h1>
          <p>Description: {repo.description}</p>
          <ul>
            <li><b>Forks:</b> {repo.forks}</li>
            <li><b>Subscribers:</b> {repo.subscribers_count}</li>
            <li><b>Watchers:</b> {repo.watchers}</li>
            <li><b>License:</b> {repo.license ? repo.license.name : 'N/A'}</li>
          </ul>
        </div>
      );
    }
    export default RepoDetails;
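
    The RepoDetails component passes a route template, 'GET /repos/{owner}/{repo}', to octokit.request, and the placeholders are filled from the parameters object. As a rough illustration of that expansion (not Octokit's actual implementation), here is a small Python sketch; expand_route is a hypothetical helper.

```python
import re
from urllib.parse import quote

GITHUB_API = "https://api.github.com"  # public REST API base URL


def expand_route(route: str, **params: str):
    """Split 'METHOD /path/{name}' and fill each {name} from params."""
    method, template = route.split(" ", 1)
    path = re.sub(
        r"\{(\w+)\}",
        lambda match: quote(params[match.group(1)], safe=""),
        template,
    )
    return method, GITHUB_API + path


method, url = expand_route("GET /repos/{owner}/{repo}", owner="octokit", repo="core.js")
```

    The resulting method and URL are what an unauthenticated HTTP client would request to get the same repository details.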

    Setting up Serverless

    With this done, we have our repositories master-detail flow ready. Assuming we have an AWS account set up, we can start on the CD setup by adding the Serverless config to our project. As we said before, we’ll be using the Serverless framework to achieve our deployment workflow. Let’s add it.

    We’ll also install the Serverless plugin called serverless-finch, which allows us to configure and deploy to S3 buckets.

    yarn global add serverless
    yarn add serverless-finch --save-dev

    Now that we have the Serverless CLI installed, we can initialize the Serverless service in our project by running the following command, which creates a hello-world service:

    serverless create -t hello-world

    This will create a configuration yaml file and a handler lambda function. We don’t need the handler, so we can delete handler.js. Our serverless.yml should look like this:

    service: serverless-s3
    frameworkVersion: '2'
     
    # The `provider` block defines where your service will be deployed
    provider:
     name: aws
     runtime: nodejs12.x
     
    functions:
     helloWorld:
       handler: handler.helloWorld
       events:
         - http:
             path: helloWorld
             method: get
             cors: true

    The serverless.yml file contains the configuration for a lambda function called helloWorld. We can remove the functions block completely. After doing that, let’s register our Serverless Finch plugin:

    service: serverless-s3
    frameworkVersion: '2'
     
    provider:
     name: aws
     runtime: nodejs12.x
     
    plugins:
     - serverless-finch

    Alright, now that our plugin is ready to be used, we can add details about our S3 bucket so the plugin can deploy to it. Let’s add this block, which tells Serverless to use the serverles-s3-galileo bucket to deploy our code from the build directory. Make sure you use a different bucket name, as S3 bucket names are globally unique.

    custom:
     client:
       bucketName: serverles-s3-galileo
       distributionFolder: build
       indexDocument: index.html
       errorDocument: index.html
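
    Pointing errorDocument at index.html is what keeps client-side routes such as /repo/octokit/core.js working on S3 website hosting. The following is a simplified model of that behavior, not S3's actual implementation: resolve_request is a hypothetical helper, and it ignores redirects and per-folder index documents.

```python
def resolve_request(path, bucket_keys, index_document="index.html", error_document="index.html"):
    """Return (status, key) for a request, mimicking static website hosting."""
    key = path.lstrip("/") or index_document
    if key in bucket_keys:
        return 200, key
    # Unknown key: the error document is returned with a 404 status.
    return 404, error_document


# Example bucket contents after a deploy (illustrative key names).
bucket_keys = {"index.html", "static/js/main.js"}

home = resolve_request("/", bucket_keys)
deep_link = resolve_request("/repo/octokit/core.js", bucket_keys)
```

    In this model a deep link still receives the React app's HTML, so the router can render the right page in the browser, although the response carries a 404 status.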

    That is it! We’re ready to deploy our app on our bucket. Haven’t created a bucket yet? No problem: serverless-finch will automatically create it.

    Note: The indexDocument is the entry point for our web application, which is index.html in this case. We also need to set errorDocument to the same file so our React routing works well in S3 hosting.

    The last thing we need to add is a bucket policy so our app can be accessed publicly. Let’s create it at config/bucket-policy.json:

    {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Principal": {
                   "AWS": "*"
               },
               "Action": "s3:GetObject",
               "Resource": "arn:aws:s3:::serverles-s3-galileo/*"
           }
       ]
    }
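
    Because the bucket name appears both in serverless.yml and in the policy's Resource ARN, it can be handy to generate the policy from the name instead of editing it by hand. A minimal sketch, assuming a hypothetical helper public_read_policy and a placeholder bucket name:

```python
import json


def public_read_policy(bucket_name: str) -> dict:
    """Build a public read-only policy for every object in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "my-unique-bucket-name" is a placeholder; use the bucketName from serverless.yml.
policy_text = json.dumps(public_read_policy("my-unique-bucket-name"), indent=2)
```

    Writing policy_text to the policy file keeps the ARN and the configured bucket name in sync.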

    As the default access to S3 assets is private, we need to set up a bucket policy for our deployment bucket. The policy gives read-only access to the public for our app so we can browse the deployed assets in the browser. You can learn more about bucket policies. Let’s update our Serverless config to use our policy. This is how our serverless.yml should look:

    service: serverless-s3
    frameworkVersion: '2'
     
    provider:
     name: aws
     runtime: nodejs12.x
     
    plugins:
     - serverless-finch
     
    custom:
     client:
       bucketName: serverles-s3-galileo
       distributionFolder: build
       indexDocument: index.html
       errorDocument: index.html
       bucketPolicyFile: config/bucket-policy.json

    Creating Github Actions Workflow

    Assuming you’ve created your repo and pushed the code to it, we can start setting up our first workflow using GitHub Actions. As we’re using AWS for our Serverless deployments to S3, we need to provide our IAM credentials. The env block allows us to insert custom environment variables into the CI build. In this case, we need the AWS access key ID and secret access key to deploy build files to the S3 bucket.

    Github allows us to store secret values that can be used in the CI environment of Github Actions. You can easily set up these secrets for your repositories. This is how they should look when configured:

    Now, we can move ahead and add a GitHub Actions workflow. Let’s create a workflow file at .github/workflows/deploy.yml and add the following to it.

    name: Serverless S3 Deploy
    on:
     push:
       branches: [ master ]
     pull_request:
       branches: [ master ]

    Alright, so the Github Actions config above tells Github to trigger this workflow whenever someone pushes to the master branch or creates a PR against it.
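
    The trigger rules can be modeled in a few lines of Python. This is a deliberately simplified sketch: should_trigger is a hypothetical helper, it only does exact branch-name matching (no glob patterns or path filters), and for pull_request events the branch argument stands for the base branch of the PR.

```python
def should_trigger(event: str, branch: str, config: dict) -> bool:
    """branch is the pushed branch for 'push', the base branch for 'pull_request'."""
    if event not in config:
        return False
    branches = (config[event] or {}).get("branches")
    # No branches filter means every branch triggers the workflow.
    return branches is None or branch in branches


# Mirrors the `on:` block of the workflow above.
workflow_on = {
    "push": {"branches": ["master"]},
    "pull_request": {"branches": ["master"]},
}

push_to_master = should_trigger("push", "master", workflow_on)
push_to_feature = should_trigger("push", "feature/login", workflow_on)
```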

    As of now, our action config is incomplete and does nothing. Let’s add our first and only job to the workflow:

    name: Serverless S3
     
    on:
     push:
       branches: [ master ]
     pull_request:
       branches: [ master ]
     
    jobs:
     build:
       runs-on: ubuntu-latest
       strategy:
         matrix:
           node-version: [10.x]
       steps:
       - uses: actions/checkout@v2

    Let’s try to digest the config above:

    runs-on: ubuntu-latest

    The runs-on statement specifies which executor will be running the job. In this case, it’s the latest Ubuntu Linux runner provided by GitHub.

    strategy:
      matrix:
        node-version: [10.x]

    The strategy defines the environments we want to run our job on. A matrix is usually useful when we want to run tests against multiple versions or operating systems. In our case, we don’t need that, so we’ll be using a single Node.js environment with version 10.x.
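
    To see why a matrix is handy for multi-version testing, here is a small sketch of how a matrix expands into one job per combination. expand_matrix is a hypothetical helper, and the second matrix (with an os axis) is an assumed example that is not part of our workflow.

```python
from itertools import product


def expand_matrix(matrix: dict) -> list:
    """Return one dict per combination of the matrix axes."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


# Our workflow: a single axis with a single value gives a single job.
single = expand_matrix({"node-version": ["10.x"]})

# An assumed two-axis example gives 2 x 2 = 4 jobs.
multi = expand_matrix(
    {"node-version": ["10.x", "12.x"], "os": ["ubuntu-latest", "windows-latest"]}
)
```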

    steps:
    - uses: actions/checkout@v2

    In the configuration’s steps block, we can define various tasks to be sequentially performed within a job. actions/checkout@v2 does the work of checking out branches for us. This step is required so we can do further work on our source code.

    This bare minimum setup is required for running a job in our Github workflows. After this, we will need to set up the environment and deploy our application. So, let’s add the rest of the steps to it.

    name: Serverless S3
     
    on:
     push:
       branches: [ master ]
     pull_request:
       branches: [ master ]
     
    jobs:
     build:
       runs-on: ubuntu-latest
       strategy:
         matrix:
           node-version: [10.x]
       steps:
       - uses: actions/checkout@v2
       - name: Use Node.js ${{ matrix.node-version }}
         uses: actions/setup-node@v1
         with:
           node-version: ${{ matrix.node-version }}
       - run: yarn install
       - run: yarn build
       - name: serverless deploy s3
         uses: serverless/github-action@master
         with:
           args: client deploy --no-confirm
         env:
           AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
           AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

    These steps need to be executed to deploy our frontend assets to our S3 bucket. Reading through them, we’re doing the following things in sequence:

    1. Checking out the current branch code

    2. Setting up our Node.js environment

    3. Installing our dependencies with yarn install

    4. Building our production build with yarn build

    5. Deploying our build to S3 with serverless client deploy --no-confirm

    • The uses block defines which custom action we’re using
    • The args block allows us to pass arguments to the action
    • The --no-confirm flag is needed so Serverless Finch does not ask us for confirmations while deploying to S3 buckets
    • env allows us to pass custom environment variables to an action
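
    The same sequence can be rehearsed locally before relying on CI. The sketch below assumes that yarn and the serverless CLI are installed on your machine; run_pipeline and missing_credentials are hypothetical helpers, and the dry-run mode only lists the commands without executing them.

```python
import os
import subprocess

# Commands mirroring the install, build, and deploy steps of the workflow.
DEPLOY_STEPS = [
    ["yarn", "install"],
    ["yarn", "build"],
    ["serverless", "client", "deploy", "--no-confirm"],
]
REQUIRED_ENV = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


def run_pipeline(dry_run=True, env=None):
    """Run (or, in dry-run mode, just list) the deploy steps in order."""
    missing = missing_credentials(env)
    if missing and not dry_run:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    executed = []
    for command in DEPLOY_STEPS:
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failing step
        executed.append(" ".join(command))
    return executed


planned = run_pipeline(dry_run=True)
```

    Calling run_pipeline(dry_run=False) would actually run the commands, so it should only be done with credentials you intend to deploy with.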

    Alright, so now we have the CD workflow set up to deploy our app. We can make a commit and push to the master branch. This should trigger our workflow. You can see your workflow running in the Actions section of your repository like this:

    You can check the output of the serverless deploy step and browse the S3 website URL. It should now show our application running.

    Creating CircleCI Workflow

    To start building a repository, we need to authorize CircleCI with our GitHub account. You can do that by signing up for CircleCI and following the steps here.

    Just as we added the IAM credentials as secrets for our GitHub Actions workflow, we can set up environment variables for our workflows in CircleCI. This is how they should look once configured in the project settings:

    Just like the Github Actions workflow, we can create workflows in CircleCI. CircleCI also allows us to use third-party custom plugins. We can use the available plugins called Orbs in our deployment workflows in CircleCI.

    We’ll need the official CircleCI distributions of the aws-cli, serverless-framework, and node.js orbs for our deploy workflow. Let’s create our first job for our workflow:

    version: 2.1
     
    orbs:
     aws-cli: circleci/aws-cli@1.0.0
     serverless: circleci/serverless-framework@1.0.1
     node: circleci/node@4.1.0
     
    jobs:
     deploy:
       executor: serverless/default

    The executor here is a prebuilt image in which the steps of our job run.

    Just like we defined steps for our job in GitHub Actions, we can add them for CircleCI. Here we’re using commands made available by the orbs to install yarn and to set up AWS and Serverless, along with plain run steps to install dependencies, build the project, and deploy. Just like we set up the secrets for GitHub Actions, we need to define our AWS credentials under the CircleCI environment variables.

    version: 2.1
     
    orbs:
     aws-cli: circleci/aws-cli@1.0.0
     serverless: circleci/serverless-framework@1.0.1
     node: circleci/node@4.1.0
     
    jobs:
     deploy:
       executor: serverless/default
       steps:
         - checkout
         - node/install-yarn
         - run:
             name: install
             command: yarn install
         - run:
             name: build
             command: yarn build
         - aws-cli/setup
         - serverless/setup:
             app-name: serverless-s3
             org-name: velotio
         - run:
             name: deploy
             command: serverless client deploy --no-confirm
    workflows:
     deploy:
       jobs:
         - deploy:
             filters:
               branches:
                 only:
                   - master

    The workflows section in the above yml file tells CircleCI to run the deploy job, and the filters block restricts it to updates of the master branch. Just as we listed the steps for the GitHub Actions deploy job, here are the steps for the CircleCI job:

    1. Check out the code
    2. Install the yarn package manager with node/install-yarn
    3. Install dependencies with yarn install
    4. Build the project with yarn build
    5. Set up the AWS and Serverless CLIs
    6. Deploy to S3 with serverless client deploy --no-confirm
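
    The branch filter can be modeled as follows. This is a simplified sketch that covers only the only key (not ignore); matches and job_runs are hypothetical helpers, and the release pattern is an assumed example showing that a specifier wrapped in slashes is treated as a regular expression that must match the whole branch name.

```python
import re


def matches(branch: str, specifier: str) -> bool:
    """A specifier wrapped in slashes is a regex that must match the whole name."""
    if len(specifier) > 2 and specifier[0] == "/" and specifier[-1] == "/":
        return re.fullmatch(specifier[1:-1], branch) is not None
    return branch == specifier


def job_runs(branch: str, only: list) -> bool:
    """Return True when the branch matches at least one `only` specifier."""
    return any(matches(branch, specifier) for specifier in only)


runs_on_master = job_runs("master", ["master"])
runs_on_develop = job_runs("develop", ["master"])
```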

    Once we’re done with the above setup, we can make a test commit and check whether our workflow is running.

    Conclusion

    We can easily integrate build/deployment workflows with simple configurations offered through Github Actions. If we don’t primarily use GitHub as version control, we can opt for CircleCI for our workflows.

    Related Articles

    1. Automating Serverless Framework Deployment using Watchdog
    2. To Go Serverless Or Not Is The Question

    You can find the referenced code at this repo.