Build a web-scraped time-series application with AWS CDK in TypeScript — Part 3
The setup of CDK project and structure
Create The Project
Before creating the project, please go through the prerequisite steps.
If everything is ready, let’s run the following command in the terminal to create our project:
mkdir world-indices-monitor-backend
cd world-indices-monitor-backend
cdk init app --language typescript
Here I used world-indices-monitor-backend as my project name. You can use your own, but remember to substitute it wherever the name appears in this tutorial.
Project created! You should now have a project like this (my cdk CLI version is 1.68.0):
Install Dependencies
Let’s install some basic dependencies we will need in the project:
npm install --save @aws-cdk/aws-apigateway @aws-cdk/aws-dynamodb @aws-cdk/aws-events @aws-cdk/aws-events-targets @aws-cdk/aws-iam @aws-cdk/aws-lambda aws-sdk axios dotenv lodash simply-utils puppeteer@5.3.x puppeteer-core@5.3.x chrome-aws-lambda@5.3.x
npm install --save-dev @types/aws-lambda @types/lodash webpack webpack-cli webpack-bundle-analyzer @types/puppeteer file-loader@6.x
Let’s also talk briefly about some of the less common dependencies:
- dotenv: Stores environment variables (secrets) outside of the source code.
- simply-utils: A declarative utility library used in this project.
- puppeteer, puppeteer-core: A web scraping library that drives a headless browser instance. In our project we will not use it directly, as simply-utils provides an abstraction for that, but we will still use its types when writing our code. Note that puppeteer and puppeteer-core should have the same version number.
- chrome-aws-lambda: A library that enables AWS Lambda to use puppeteer with a Chromium instance.
- file-loader: A dependency that is required by chrome-aws-lambda at build time.
Ignore node_modules for Type-checking
If you now try to run npm run watch, you will probably get type errors from some libraries in the node_modules folder. To fix this, add "skipLibCheck": true under "compilerOptions" in tsconfig.json to skip type-checking of library declaration files.
Configure Browser Environment for TypeScript
Although our project mainly targets a Node environment, some of our web-scraping code later will actually run in a browser/DOM environment, so we have to add "dom" and "dom.iterable" to compilerOptions.lib in tsconfig.json like the following:
"compilerOptions": {
"lib": ["es2018", "dom", "dom.iterable"]
}
Otherwise, you might get type-checking errors later, such as HTMLTableRowElement not being defined.
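To make this concrete, here is a minimal sketch of the kind of browser-side helper a scraper might use. The function name rowToCells is hypothetical and not part of this project; it only illustrates that a DOM type such as HTMLTableRowElement resolves once "dom" is listed in compilerOptions.lib.

```typescript
// Hypothetical helper, for illustration only: it is not part of this project.
// HTMLTableRowElement is a DOM type, so this function only type-checks
// when "dom" is included in compilerOptions.lib.
function rowToCells(row: HTMLTableRowElement): string[] {
  // row.cells is an HTMLCollection; Array.from accepts it as an array-like.
  return Array.from(row.cells).map((cell) => (cell.textContent || '').trim());
}
```

Without the "dom" entry, the compiler would not know the HTMLTableRowElement type at all, even though the function body itself is plain TypeScript.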
Configure tsconfig.json for Node 12
You can also configure tsconfig.json to support Node version 12 by following https://www.npmjs.com/package/@tsconfig/node12.
Add Some Scripts
Let’s add some useful scripts:
"scripts": {
"watch": "tsc --watch --noemit",
"clearbuildcache": "rm -rf ./build && rm -rf ./bundles",
The Project Structure
I will structure the project as follows:
/bin <-- Created by CDK
/bundles <-- bundled from build folder by webpack
  /cron
    /handlers
/build <-- built by TypeScript
/src
  /helpers
  /lib
  /models
  /services
    /api
    /cron
      /handlers
        scrape.ts
        ...
      cron.ts
      index.ts
  world-indices-monitor-backend-stack.ts
- src/helpers: for common helper functions
- src/lib: for common code that can be grouped into namespaces
- src/models: for code related to the app’s data models
- src/services: for the different services, e.g. api, cron, logging etc. In this application we will have an api service and a cron service.
- build: the output directory of the TypeScript build, with exactly the same folder structure as src. It is an intermediate folder that we won’t use directly; it acts as the bundler input.
- bundles: contains the JavaScript files bundled by webpack, which include only the /handlers code.
Since CDK created a lib folder for us to put our source code in, let’s rename it to src instead. All you have to do is rename the folder and update any import statement that references the world-indices-monitor-backend-stack.ts file.
In this application, our approach is to keep source code and “CDK construct code” (code that is synthesized into an AWS CloudFormation template) close together, but keep in mind which code is going to run and which is just for building the stack. For example, in the /cron directory only cron.ts will contain such “construct code”.
Configure absolute import
Add "baseUrl": "./"
under "compilerOptions"
of tsconfig.json
to allow us to import like import something from 'src/models/something'
. The reason to keep src
in the import path is that we will need it later for our webpack configuration.
Configure Jest for absolute import
In this project we will not cover any unit-test implementation, such as with Jest. However, if you ever want to add tests, you might need the following configuration to make absolute imports work in them:
// jest.config.js
module.exports = {
  moduleNameMapper: {
    // Anchored so that only imports starting with "src/" are remapped
    '^src/(.*)$': '<rootDir>/src/$1',
  },
}
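To show what this mapping does, here is a rough model of a regex-to-path mapper written as a plain function. It is a simplified sketch and not Jest's actual resolver: the names mapModuleName and exampleMapper are made up for illustration, and it uses an anchored form of the pattern (^src/(.*)$) so that only imports starting with src/ are rewritten.

```typescript
// Simplified model of a regex-to-path mapping in the style of Jest's
// moduleNameMapper. Illustration only: this is not Jest's real resolver.
type ModuleNameMapper = Record<string, string>;

function mapModuleName(
  request: string,
  mapper: ModuleNameMapper,
  rootDir: string
): string {
  for (const pattern of Object.keys(mapper)) {
    const match = request.match(new RegExp(pattern));
    if (match) {
      // Expand $1, $2, ... with the captured groups, then fill in <rootDir>.
      return mapper[pattern]
        .replace(/\$(\d+)/g, (_, index: string) => match[Number(index)] || '')
        .replace('<rootDir>', rootDir);
    }
  }
  // No pattern matched: leave the import untouched (e.g. packages in node_modules).
  return request;
}

// Hypothetical mapper using an anchored variant of the pattern.
const exampleMapper: ModuleNameMapper = {
  '^src/(.*)$': '<rootDir>/src/$1',
};
```

With a root directory of /app, an import of 'src/models/something' would map to '/app/src/models/something', while 'lodash' is left as-is. The anchors matter in this model: without them, a path such as 'some-package/src/util' would also match and be redirected into the project's own src folder.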
Create Services’ Constructs
Under /src/services/cron
, create cron.ts
and index.ts
like the following:
// cron.ts
import * as cdk from '@aws-cdk/core'

function construct (scope: cdk.Construct): void {
  // CDK Construct of cron service goes here
}

export default { construct }

// index.ts
export { default } from './cron'
Do the same for src/services/api
and then call these child constructs in the main construct (PROJECT-NAME-stack.ts
):
// world-indices-monitor-backend-stack.ts
import * as cdk from '@aws-cdk/core';
import api from './services/api'
import cron from './services/cron'

export class WorldIndicesMonitorBackendStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Initialize cron service
    cron.construct(this)
    // Initialize api service
    api.construct(this)
  }
}
Let’s start setting up for scraping next!